WO2021169616A1

WO2021169616A1 - Method and apparatus for detecting face of non-living body, and computer device and storage medium

Info

Publication number: WO2021169616A1
Application number: PCT/CN2021/070470
Authority: WO
Inventors: 徐国诚
Original assignee: 深圳壹账通智能科技有限公司
Priority date: 2020-02-27
Filing date: 2021-01-06
Publication date: 2021-09-02
Also published as: CN111428570A

Abstract

The present application provides a method and apparatus for detecting the face of a non-living body, and a computer device and a storage medium, which belong to the field of facial detection technology. The present application is used for improving the accuracy of recognizing the face of a living body. The method for detecting the face of a non-living body comprises: acquiring a video image, and extracting, from the video image, a plurality of pictures to be detected; inputting said pictures into a pre-trained target picture detection model, to obtain a target image included in each of said pictures and a category to which the target image belongs; sequentially determining whether the target image of each of said pictures comprises a target image the category of which is a facial region, if so, further determining whether the target image of the corresponding picture comprises an environmental element the category of which is a preset abnormal category; and if the relative position relationship between the facial region of at least one of said pictures and the abnormal environment element is within a preset range, determining that the face in said picture is the face of a non-living body.

Description

Method, device, computer equipment and storage medium for detecting non-living human face

This application claims priority to Chinese patent application No. 202010122186.3, filed with the Chinese Patent Office on February 27, 2020 and entitled "Methods, devices, computer equipment and storage media for detecting non-living human faces", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the technical field of face detection, in particular to a method, device, computer equipment and storage medium for detecting a non-living human face.

Background

To protect the user's privacy or property, some scenarios require liveness detection of the user through the camera of the terminal device. Access to certain functions in the application is allowed only when a living user is identified.

Existing liveness detection solutions based on dynamic key points of the human face generally ask the user to blink, open the mouth, or raise the head during recognition. However, such solutions risk being defeated by fake animations: a face animation can be realistic enough to pass existing liveness detection.

Because animation production technology is developing rapidly, existing liveness detection is at risk of being defeated by fake animations and urgently needs improvement.

Summary of the invention

The embodiments of the present application provide a method, a device, a computer device, and a storage medium for detecting a non-living human face, which can improve the accuracy of living face recognition.

According to an aspect of the present application, a method for detecting a non-living human face is provided, and the method includes:

Acquiring a video image, and extracting multiple pictures to be detected from the video image;

Inputting the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

Determining in turn whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determining whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

If the relative positional relationship between the face area of at least one picture to be detected and the abnormal environmental element is within a preset range, determining that the human face in the picture to be detected is a non-living human face.

According to another aspect of the present application, a device for detecting a non-living human face is provided, the device comprising:

The video acquisition module is used to acquire a video image, and extract multiple pictures to be detected from the video image;

The input module is configured to input the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

The first judgment module is used to sequentially judge whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determine whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

The second judgment module is configured to determine that the face in the picture to be detected is a non-living human face if the relative positional relationship between the face area of at least one picture to be detected and the abnormal environmental element is within a preset range.

A computer device provided according to another aspect of the present application includes a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when executing the computer-readable instructions:

According to another aspect of the present application, one or more readable storage media storing computer-readable instructions are provided, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to execute the following steps:

The method, device, computer equipment, and storage medium for detecting non-living human faces provided in this application add abnormal environmental element detection and let the target picture detection model learn the abnormal environmental elements in advance. When identifying whether the face to be detected is a non-living human face, the target picture detection model can first identify whether abnormal environmental elements are present in the picture or video to be detected, and the positional relationship between the environmental element and the face area is then used to determine whether the face in the picture to be detected is a non-living face, which improves the accuracy of living face recognition.

Description of the drawings

FIG. 1 is a schematic diagram of an application environment of a method for detecting a non-living human face in an embodiment of this application;

Fig. 2 is a flowchart of a method for detecting a non-living human face according to an embodiment of the present application;

FIG. 3 is a flowchart of judging whether the target image includes environmental elements whose category is a preset abnormal category;

Figure 4 is a flowchart of training the target picture detection model;

Fig. 5 is a flowchart of a method for detecting a non-living human face according to another embodiment of the present application;

Fig. 6 is an exemplary structural block diagram of a non-living human face detection device according to an embodiment of the present application;

Fig. 7 is a schematic diagram of the internal structure of a computer device according to an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are a part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

FIG. 1 is a schematic diagram of an application environment of a method for detecting a non-living human face in an embodiment of this application. As shown in FIG. 1, the method for detecting a non-living human face provided by this application can be applied in the application environment of FIG. 1. The computer equipment used for non-living human face detection includes, but is not limited to, personal computers, notebook computers, smartphones, and tablet computers, and is equipped with a camera for acquiring video images.

Fig. 2 is a flowchart of a method for detecting a non-living human face according to an embodiment of the present application. The method is described in detail below with reference to Fig. 2. As shown in Fig. 2, the method includes the following steps S101 to S104.

S101. Obtain a video image, and extract multiple pictures to be detected from the video image.

In one of the embodiments, the video image is a video image collected by a camera of the terminal device.

In one of the embodiments, before step S101, the method for detecting a non-living human face further includes the following steps:

Output a prompt message asking the user to perform a preset facial action in front of the camera;

Obtain a video image including the preset facial action.

In this embodiment, the preset facial actions include, but are not limited to, blinking, raising the head, opening the mouth, and so on.
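
The text does not say how the pictures to be detected are chosen from the video. As a minimal sketch, assuming evenly spaced sampling (an assumption, not something the application specifies), the hypothetical helper `sample_frame_indices` below picks which frame indices to extract; decoding the frames themselves is left to whatever video library is in use.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_pictures: int) -> List[int]:
    """Pick evenly spaced frame indices from a clip of total_frames frames.

    Hypothetical helper: even spacing is an illustrative choice only.
    """
    if total_frames <= 0 or num_pictures <= 0:
        return []
    # Never ask for more pictures than the clip has frames.
    num_pictures = min(num_pictures, total_frames)
    step = total_frames / num_pictures
    # Take the frame in the middle of each of the equal-length segments.
    return [int(step * i + step / 2) for i in range(num_pictures)]


if __name__ == "__main__":
    print(sample_frame_indices(100, 5))
```

Sampling across the whole clip, rather than taking consecutive frames, makes it more likely that the extracted pictures show different stages of the prompted facial action.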

S102. Input the pictures to be detected into a pre-trained target picture detection model to obtain target images included in each picture to be detected and a category to which the target images belong.

The target picture detection model may be an SSD (Single Shot MultiBox Detector) model. The pictures to be detected may include pictures of living human faces as well as pictures of non-living human faces, and may also contain ordinary environmental elements, including but not limited to monitors, televisions, projector screens, and computers.

In one of the embodiments, the facial feature image corresponding to blinking is an image of a human eye area, the facial feature image corresponding to an open mouth is an image of a person's mouth area, and the facial feature image corresponding to a head-up is an image of a person's chin area.

S103. Determine in turn whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determine whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category.

In this embodiment, which environmental elements count as abnormal environmental elements can be set manually. The abnormal environmental elements include, but are not limited to, the frame of a display, the frame of a tablet/computer, a TV, and a projection screen.

In one of the embodiments, the abnormal environmental element is an environmental element that can be detected by the target picture detection model, specifically the border of a display or TV, the border of a projection screen, the border of a projection area projected on a wall, and so on.

Fig. 3 is a flowchart of determining whether the target image includes environmental elements whose category is a preset abnormal category. As shown in Fig. 3, the step of determining whether the target images of the picture to be detected include an environmental element whose category is a preset abnormal category includes the following steps S301 and S302.

S301: Acquire the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of a preset abnormal environmental element;

S302. Determine whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, determine that the target image includes an abnormal environmental element.
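
Steps S301 and S302 amount to a membership test between the detected categories and the preset abnormal categories. The sketch below illustrates this under assumed names: the `Detection` record, the category strings, and the `ABNORMAL_CATEGORIES` set are all hypothetical and stand in for whatever the detection model actually outputs.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Detection:
    """One target image found by the detector (hypothetical record type)."""
    category: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


# Assumed label names; the real set is configured manually per the text.
ABNORMAL_CATEGORIES: Set[str] = {"display", "tablet", "tv", "projection_screen"}


def abnormal_elements(
    detections: Iterable[Detection],
    abnormal_categories: Set[str] = ABNORMAL_CATEGORIES,
) -> List[Detection]:
    """S301: compare each detected category with the preset abnormal ones."""
    return [d for d in detections if d.category in abnormal_categories]


def includes_abnormal_element(
    detections: Iterable[Detection],
    abnormal_categories: Set[str] = ABNORMAL_CATEGORIES,
) -> bool:
    """S302: the picture includes an abnormal element if any category matches."""
    return bool(abnormal_elements(detections, abnormal_categories))


if __name__ == "__main__":
    picture = [
        Detection("face", (40, 30, 80, 90)),
        Detection("display", (10, 10, 120, 110)),
    ]
    print(includes_abnormal_element(picture))
```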

S104. If the relative positional relationship between the face area of at least one picture to be detected and the abnormal environmental element is within a preset range, determine that the face in the picture to be detected is a non-living human face.

In one of the embodiments, the relative positional relationship is within the preset range when, for example, the face area lies within the area of the abnormal environmental element, that is, the area of the abnormal environmental element completely or partially contains the face area.

In one of the embodiments, the step in S104 of determining that the face in the picture to be detected is a non-living face if the relative positional relationship between the face area of at least one picture to be detected and the abnormal environmental element is within a preset range includes:

It is determined whether the display area of the abnormal environment element and the face area overlap, and if so, it is determined that the face in the picture to be detected is a non-living face.
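
The overlap test can be sketched with axis-aligned bounding boxes. The box layout `(x_min, y_min, x_max, y_max)` and the `min_fraction` threshold are assumptions introduced for illustration; the text only requires that the two areas overlap, or that the element area wholly or partly contains the face area.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two axis-aligned boxes (0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def face_inside_fraction(face: Box, element: Box) -> float:
    """Fraction of the face area that lies inside the element area."""
    face_area = (face[2] - face[0]) * (face[3] - face[1])
    if face_area <= 0:
        return 0.0
    return intersection_area(face, element) / face_area


def is_non_living(face: Box, element: Box, min_fraction: float = 0.0) -> bool:
    """Flag the face when more than min_fraction of it overlaps the element.

    min_fraction is a hypothetical tuning knob: 0.0 means any overlap counts,
    values near 1.0 require the face to sit almost entirely inside the element.
    """
    return face_inside_fraction(face, element) > min_fraction


if __name__ == "__main__":
    screen = (0.0, 0.0, 100.0, 100.0)
    print(is_non_living((10.0, 10.0, 20.0, 20.0), screen))
```

A stricter threshold trades fewer false alarms (for example a real user standing in front of a monitor) against missing faces that are only partly framed by a screen border.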

In one of the embodiments, the target image includes facial feature images such as eyes and mouth. Before the step of determining that the face in the picture to be detected is a non-living face, and after step S102, the method further includes:

Identifying the same facial feature image corresponding to the facial action in different pictures to be detected;

It is determined whether the facial feature image has been changed correspondingly in different pictures to be detected, and if not, skip to the above step S103.

Fig. 4 is a flowchart of training the target picture detection model. According to an embodiment of the present application, as shown in Fig. 4, the steps of training the target picture detection model include the following steps S401 to S404.

S401. Receive multiple sample pictures.

In one of the embodiments, the sample pictures include, but are not limited to, pictures actually taken by photographers, pictures downloaded from the Internet, and images extracted from video picture frames.

In this embodiment, the sample picture may be a picture containing a human face area, a picture containing a display frame, a projection screen, and other sample pictures that need to be learned by a target picture detection model.

S402: Mark the image area in the sample picture and the category to which the image area belongs according to the received instruction.

In this embodiment, the marked image areas in the sample picture include the face area, the areas of normal environmental elements, and the areas of abnormal environmental elements. The abnormal environmental elements include, but are not limited to, monitors, tablets/computers, televisions, and projection screens.

Further, the category to which the face area belongs is face, and the categories of the abnormal environmental elements include display, tablet/computer, TV, and projection screen.

S403. Input the marked image area and the category to which the image area belongs into the target picture detection model.

S404: Learning the image area and the category to which the image area belongs through the target picture detection model, to obtain the trained target picture detection model.
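
Steps S401 to S403 can be pictured as building labeled records before they are handed to the detector. The sketch below is illustrative only: the `Annotation` and `SamplePicture` types, the file names, and the category strings are hypothetical, and reserving index 0 for background is a common convention in SSD-style detectors rather than something the application states. The training call itself (S404) depends on the chosen framework and is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Annotation:
    """One marked image area and its category (S402)."""
    category: str
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class SamplePicture:
    """One received sample picture with its marked areas (S401, S402)."""
    path: str
    annotations: List[Annotation] = field(default_factory=list)


def build_label_map(samples: Iterable[SamplePicture]) -> Dict[str, int]:
    """Map every category seen in the samples to an integer class index.

    Index 0 is left free for background, as many detectors expect (assumption).
    """
    categories = sorted({a.category for s in samples for a in s.annotations})
    return {name: index + 1 for index, name in enumerate(categories)}


if __name__ == "__main__":
    samples = [
        SamplePicture("a.jpg", [Annotation("face", (30, 20, 90, 110))]),
        SamplePicture("b.jpg", [Annotation("display", (0, 0, 200, 150)),
                                Annotation("face", (60, 40, 120, 120))]),
    ]
    print(build_label_map(samples))
```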

Fig. 5 is a flowchart of a method for detecting a non-living human face according to another embodiment of the present application. The method according to this embodiment is described in detail below in conjunction with Fig. 5. As shown in Fig. 5, before the step of obtaining a video image in step S101, the method further includes:

S501: Output a prompt message asking the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

S502: Output a prompt message asking the user to perform a preset facial action in front of the camera.

The above step S101 is further the following step S503:

S503: Acquire a video image in which the user's face area changes from large to small or from small to large and which contains the preset facial action, and extract multiple pictures to be detected from the video image.
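
One way to check that a clip satisfies S503 is to compare the face-area sizes across the extracted pictures. The following is a rough sketch under stated assumptions: the function name `area_trend`, the first-versus-last comparison, and the `min_ratio` threshold are all hypothetical, since the application does not describe how the size change is verified.

```python
from typing import Sequence


def area_trend(face_areas: Sequence[float], min_ratio: float = 1.2) -> str:
    """Classify how the face area changes across the sampled pictures.

    Returns "growing" (user approached the camera), "shrinking" (user moved
    away), or "none". Only the first and last areas are compared, and
    min_ratio is an assumed threshold for a meaningful change.
    """
    if len(face_areas) < 2 or min(face_areas) <= 0:
        return "none"
    first, last = face_areas[0], face_areas[-1]
    if last >= first * min_ratio:
        return "growing"
    if first >= last * min_ratio:
        return "shrinking"
    return "none"


if __name__ == "__main__":
    print(area_trend([100.0, 150.0, 220.0]))
```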

The method for detecting a non-living human face provided in this embodiment lets the target picture detection model learn human faces and abnormal environmental elements. When identifying whether the face to be detected is a non-living human face, the target picture detection model first recognizes whether the picture or video to be detected includes a human face, then determines whether abnormal environmental elements are present in it, and finally judges from the positional relationship between the environmental element and the face area whether the face in the picture to be detected is a non-living face, which improves the accuracy of living face recognition.
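
Putting steps S102 to S104 together, the overall decision can be sketched as a loop over the pictures to be detected. The detector is passed in as a plain callable so that the sketch stays independent of any particular model; the `(category, box)` output format, the category names, and the use of simple box overlap as the "preset range" are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout
Detection = Tuple[str, Box]              # (category, box), assumed detector output
Detector = Callable[[object], List[Detection]]


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes share a region of non-zero area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def is_non_living_face(
    pictures: Iterable[object],
    detect: Detector,
    abnormal_categories: Set[str],
    face_category: str = "face",
) -> bool:
    """Return True if any picture shows a face overlapping an abnormal element."""
    for picture in pictures:
        detections = detect(picture)  # S102: target images and their categories
        faces = [box for cat, box in detections if cat == face_category]
        if not faces:                 # S103: no face area, check the next picture
            continue
        elements = [box for cat, box in detections if cat in abnormal_categories]
        # S104: one overlapping face/element pair is enough to flag the clip.
        if any(boxes_overlap(f, e) for f in faces for e in elements):
            return True
    return False


if __name__ == "__main__":
    fake_outputs = {
        "frame0": [("face", (40.0, 30.0, 80.0, 90.0))],
        "frame1": [("face", (40.0, 30.0, 80.0, 90.0)),
                   ("display", (10.0, 10.0, 120.0, 110.0))],
    }
    result = is_non_living_face(fake_outputs, fake_outputs.__getitem__, {"display"})
    print(result)
```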

According to an example of this embodiment, the numbers of the above steps S101 to S503 do not limit the sequence of the steps; they only make the steps easy to refer to in the description. For example, step S501 may be performed before or after step S502, as long as the order of execution does not affect the logic of this embodiment.

Fig. 6 is an exemplary structural block diagram of a non-living human face detection device according to an embodiment of the present application. The device is described in detail below in conjunction with Fig. 6. As shown in Fig. 6, the device 100 for detecting a non-living human face includes a video acquisition module 11, an input module 12, a first judgment module 13 and a second judgment module 14.

The video acquisition module 11 is used to acquire a video image, and extract multiple pictures to be detected from the video image.

The input module 12 is configured to input the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs.

The first judgment module 13 is used to sequentially judge whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determine whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category.

The second judgment module 14 is used to determine that the human face in the picture to be detected is a non-living human face if the relative positional relationship between the face area of at least one picture to be detected and the abnormal environmental element is within a preset range.

In one of the embodiments, the first judgment module 13 further includes:

The category acquisition unit is configured to acquire the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of the preset abnormal environmental element;

The first judging unit is configured to judge whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, judging that the target image includes an abnormal environmental element.

In one of the embodiments, the second judgment module is specifically used for:

In one of the embodiments, the target image includes facial feature images such as eyes and mouth.

In one of the embodiments, the device 100 for detecting a non-living human face further includes:

The picture receiving module is used to receive multiple sample pictures. Optionally, the sample pictures include, but are not limited to, pictures actually taken by photographers, pictures downloaded from the Internet, and images extracted from video picture frames. In this embodiment, the sample picture may be a picture containing a human face area, a picture containing a display frame, a projection screen, and various sample pictures that need to be learned by a target picture detection model;

The labeling module is configured to label the image area in the sample picture and the category to which the image area belongs according to the received instruction. In this embodiment, the marked image areas in the sample picture include the face area, the areas of normal environmental elements, and the areas of abnormal environmental elements. The abnormal environmental elements include, but are not limited to, monitors, tablets/computers, televisions, and projection screens. Further, the category to which the face area belongs is face, and the categories of the abnormal environmental elements include display, tablet/computer, TV, and projection screen;

The input module is also used to input the marked image area and the category to which the image area belongs into the target picture detection model;

The learning module is used to learn the image area and the category to which the image area belongs through the target picture detection model to obtain the trained target picture detection model.

The first output module is configured to output a prompt message asking the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

The second output module is used to output a prompt message asking the user to perform a preset facial action in front of the camera;

The video acquisition module 11 is specifically configured to acquire a video image in which the user's face area changes from large to small or from small to large and which contains the preset facial action.

The terms "first" and "second" in the above first judgment module and second judgment module only distinguish two different modules; they do not indicate which module has higher priority and carry no other limiting meaning. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to the steps or modules clearly listed, and may include other steps or modules that are not clearly listed or that are inherent to the process, method, product, or device. The division of modules in this application is only a logical division; other divisions are possible in practical implementations.

Wherein, each module included in the device for detecting a non-living human face can be implemented in whole or in part by software, hardware, or a combination thereof. Further, each module in the device for detecting a non-living human face may be a program segment for realizing corresponding functions.

The device for detecting a non-living human face provided in this embodiment lets the target picture detection model learn human faces and abnormal environmental elements. When identifying whether the face to be detected is a non-living human face, the target picture detection model first recognizes whether the picture or video to be detected includes a human face, then determines whether abnormal environmental elements are present in it, and finally judges from the positional relationship between the environmental element and the face area whether the face in the picture to be detected is a non-living face, which improves the accuracy of living face recognition.

The foregoing apparatus for detecting a non-living human face may be implemented in a form of computer-readable instructions, and the computer-readable instructions may run on the computer device as shown in FIG. 7.

For the specific limitations of the device for detecting a non-living human face, refer to the limitations of the method for detecting a non-living human face above, which are not repeated here. Each module in the device can be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, the processor of the computer equipment in hardware form, or may be stored in the memory of the computer equipment in software form, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 7. The computer equipment includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system and computer readable instructions. The internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium. The network interface of the computer device is used to communicate with the controlled device through a network connection. When the computer-readable instructions are executed by the processor, a device for detecting a non-living human face is realized. In this example, the readable storage medium may be a non-volatile readable storage medium or a volatile readable storage medium.

In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor. When the processor executes the computer-readable instructions, the steps of the method for detecting a non-living human face in the above embodiments are implemented, for example steps S101 to S104 shown in FIG. 2. Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units of the non-living human face detection apparatus in the above embodiment are implemented, for example the functions of modules 11 to 14 shown in FIG. 6. To avoid repetition, details are not described here again.

The processor may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (ASIC), off-the-shelf Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor, etc. The processor is the control center of the computer device, and various interfaces and lines are used to connect various parts of the entire computer device.

The memory may be used to store the computer-readable instructions and/or modules, and the processor realizes various functions of the computer device by running or executing the computer-readable instructions and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the mobile phone (such as audio data and video data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

The memory may be integrated in the processor, or may be provided separately from the processor.

In one embodiment, one or more readable storage media storing computer-readable instructions are provided. When the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the steps of the method for detecting non-living human faces in the foregoing embodiment, for example steps S101 to S104 shown in FIG. 2. Alternatively, when the computer-readable instructions are executed by the processor, the functions of the modules/units of the non-living human face detection apparatus in the foregoing embodiments are realized, for example the functions of modules 11 to 14 shown in FIG. 6. To avoid repetition, details are not described here again.

The method, device, computer equipment, and storage medium for detecting non-living human faces provided in this embodiment add abnormal environmental element detection and let the target picture detection model learn abnormal environmental elements in advance. When identifying whether the face to be detected is a non-living human face, the target picture detection model can first identify whether abnormal environmental elements are present in the picture or video to be detected, and the positional relationship between the environmental element and the face area is then used to determine whether the face in the picture to be detected is a non-living face, which improves the accuracy of living face recognition.

Through the description of the above implementation manners, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform; they can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM) and includes several instructions that cause a terminal (which can be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the various embodiments of the present application.

The technical features of the above-mentioned embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of these technical features are described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.

The above-mentioned embodiments only express several implementation manners of the present application, and the description is relatively specific and detailed, but it should not be understood as a limitation on the scope of the invention patent. It should be pointed out that for those of ordinary skill in the art, without departing from the concept of this application, several modifications and improvements can be made, and these all fall within the protection scope of this application. Therefore, the scope of protection of the patent of this application shall be subject to the appended claims.

Claims

A method for detecting a non-living human face, wherein the method includes:

Acquiring a video image, and extracting multiple pictures to be detected from the video image;

Inputting the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

Determining in turn whether the target images of each picture to be detected include a target image whose category is a face area, and if so, further determining whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

If the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range, determining that the human face in that picture to be detected is a non-living human face.
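
As a non-limiting illustration, the steps of this claim can be sketched in Python as follows. The sketch is not the applicant's implementation. The `Detection` record, the category names, the `detect` callable standing in for the pre-trained target picture detection model, the fixed sampling step, and the gap-based reading of "within a preset range" are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass(frozen=True)
class Detection:
    """One target image returned by the (hypothetical) detection model."""
    category: str
    box: Box


FACE = "face"  # assumed label for the face-area category
# Assumed examples of preset abnormal categories; the real list is a design choice.
ABNORMAL_CATEGORIES = frozenset({"phone_bezel", "photo_edge", "screen_reflection"})


def sample_frames(frames: Sequence, step: int) -> List:
    """Extract every `step`-th frame as a picture to be detected."""
    return list(frames[::step])


def within_preset_range(face: Box, element: Box, max_gap: float) -> bool:
    """Assumed range test: the boxes are at most `max_gap` apart on each axis.

    A gap of 0 means the boxes touch or overlap on that axis.
    """
    gap_x = max(face[0] - element[2], element[0] - face[2], 0.0)
    gap_y = max(face[1] - element[3], element[1] - face[3], 0.0)
    return gap_x <= max_gap and gap_y <= max_gap


def is_non_living(frames: Sequence,
                  detect: Callable[[object], List[Detection]],
                  step: int = 5,
                  max_gap: float = 0.0) -> bool:
    """Return True if any sampled picture pairs a face with a nearby abnormal element."""
    for picture in sample_frames(frames, step):
        detections = detect(picture)
        faces = [d for d in detections if d.category == FACE]
        if not faces:
            continue  # no face area in this picture, move to the next one
        abnormal = [d for d in detections if d.category in ABNORMAL_CATEGORIES]
        if any(within_preset_range(f.box, a.box, max_gap)
               for f in faces for a in abnormal):
            return True
    return False
```

A single qualifying picture is enough to return `True`, which mirrors the "at least one" wording of the claim.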
The method for detecting a non-living human face according to claim 1, wherein the step of determining whether the target image corresponding to the picture to be detected includes an environmental element whose category is a preset abnormal category comprises:

Acquiring the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of the preset abnormal environmental element;

It is determined whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, it is determined that the target image includes an abnormal environmental element.
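
By way of example only, the category comparison of this claim reduces to a set intersection. The function names and category labels below are hypothetical and chosen for illustration.

```python
from typing import Iterable, Set


def abnormal_categories_found(target_categories: Iterable[str],
                              preset_abnormal: Iterable[str]) -> Set[str]:
    """Return the preset abnormal categories that occur among the detected ones."""
    return set(target_categories) & set(preset_abnormal)


def contains_abnormal_element(target_categories: Iterable[str],
                              preset_abnormal: Iterable[str]) -> bool:
    """True if at least one detected category is a preset abnormal category."""
    return bool(abnormal_categories_found(target_categories, preset_abnormal))
```

Returning the matched categories as well as the boolean lets a caller look up the corresponding image areas for the later positional check.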
The method for detecting a non-living human face according to claim 1, wherein the method for detecting a non-living human face further comprises:

Receiving multiple sample pictures;

Marking the image area in each sample picture and the category to which the image area belongs according to a received instruction;

Inputting the marked image area and the category to which the image area belongs into the target picture detection model;

Learning the image area and the category to which the image area belongs through the target picture detection model to obtain the trained target picture detection model.
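
The marking step of this claim can be illustrated with a minimal data-preparation sketch. The instruction format (a dict with `box` and `category` keys), the `Annotation` record, and the class-index mapping are assumptions. The learning step itself depends on the detection model chosen and is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class Annotation:
    """A marked image area in one sample picture, with its category."""
    picture_id: str
    box: Box
    category: str


def apply_marking_instruction(picture_id: str, instruction: dict) -> Annotation:
    """Turn one received instruction (hypothetical format) into an annotation."""
    x0, y0, x1, y1 = instruction["box"]
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box must have positive width and height")
    return Annotation(picture_id, (x0, y0, x1, y1), instruction["category"])


def build_training_set(instructions: Dict[str, List[dict]]):
    """Collect annotations for all sample pictures and index the categories."""
    annotations = [apply_marking_instruction(pid, item)
                   for pid, items in instructions.items()
                   for item in items]
    categories = sorted({a.category for a in annotations})
    class_index = {name: i for i, name in enumerate(categories)}
    return annotations, class_index
```

The returned annotations and class index are the kind of input an object detection training routine typically consumes.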
The method for detecting a non-living human face according to claim 1, wherein the step of determining that the face in the picture to be detected is a non-living face if the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range comprises:

It is determined whether the display area of the abnormal environment element and the face area overlap, and if so, it is determined that the face in the picture to be detected is a non-living face.
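
For illustration only, the overlap test of this claim can be sketched for axis-aligned rectangular regions. The claim does not prescribe the shape of the regions, so the `(x_min, y_min, x_max, y_max)` box representation is an assumption.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two axis-aligned boxes (0.0 if they do not intersect)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    # Clamp each side separately so two negative extents cannot multiply
    # into a spurious positive area.
    return max(width, 0.0) * max(height, 0.0)


def regions_overlap(face: Box, element: Box) -> bool:
    """True if the abnormal element's display area shares area with the face area."""
    return intersection_area(face, element) > 0.0
```

In this sketch, boxes that only touch along an edge are treated as non-overlapping. Whether such contact should count is a design choice the claim leaves open.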
The method for detecting a non-living human face according to any one of claims 1 to 4, wherein, before the step of obtaining a video image, the method further comprises:

Outputting a prompt message that prompts the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

Outputting a prompt message that prompts the user to perform preset facial actions facing the camera;

Obtaining a video image in which the user's face area changes from large to small or from small to large and which contains the preset facial actions.
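
The claim does not state how the change in face area is verified. One possible check, sketched here under that caveat, compares the face area in the first and last sampled pictures. The `min_ratio` threshold and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_area(box: Box) -> float:
    """Area of an axis-aligned box, clamped at zero for degenerate boxes."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def face_area_trend(face_boxes: Sequence[Box], min_ratio: float = 1.5) -> str:
    """Classify the face-area change over a clip as 'growing', 'shrinking', or 'none'.

    `face_boxes` holds the face area detected in each sampled picture, in time order.
    """
    if len(face_boxes) < 2:
        return "none"
    first, last = box_area(face_boxes[0]), box_area(face_boxes[-1])
    if first <= 0.0 or last <= 0.0:
        return "none"
    if last / first >= min_ratio:
        return "growing"    # the user moved from far to near
    if first / last >= min_ratio:
        return "shrinking"  # the user moved from near to far
    return "none"
```

A clip classified as `"none"` could be rejected and the prompt repeated, although that handling is not part of the claim.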
A detection device for a non-living human face, wherein the device includes:

The video acquisition module is used to acquire a video image, and extract multiple pictures to be detected from the video image;

The input module is configured to input the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

The first judgment module is used to sequentially judge whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further judge whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

The second judgment module is configured to determine that the face in the picture to be detected is a non-living human face if the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range.
The device for detecting a non-living human face according to claim 6, wherein the first judgment module further comprises:

The category acquisition unit is configured to acquire the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of the preset abnormal environmental element;

The first judging unit is configured to judge whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, to judge that the target image includes an abnormal environmental element.
The device for detecting a non-living human face according to claim 6, wherein the device for detecting a non-living human face further comprises:

The picture receiving module is used to receive multiple sample pictures;

An annotation module, configured to annotate the image area in the sample picture and the category to which the image area belongs according to the received instruction;

The input module is further configured to input the marked image area and the category to which the image area belongs into the target picture detection model;

The learning module is used to learn the image area and the category to which the image area belongs through the target picture detection model to obtain the trained target picture detection model.
The device for detecting a non-living human face according to claim 6, wherein the second judgment module is specifically configured to:

Determine whether the display area of the abnormal environmental element and the face area overlap, and if so, determine that the face in the picture to be detected is a non-living face.
The device for detecting a non-living human face according to any one of claims 6 to 9, wherein the device for detecting a non-living human face further comprises:

The first output module is configured to output a prompt message that prompts the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

The second output module is used to output a prompt message that prompts the user to perform a preset facial action facing the camera;

The video acquisition module is used to acquire a video image in which the face area of the user changes from large to small or from small to large and contains preset facial actions.
A computer device comprising a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer readable instructions:

Acquiring a video image, and extracting multiple pictures to be detected from the video image;

Inputting the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

Determining in turn whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determining whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

If the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range, determining that the human face in that picture to be detected is a non-living human face.
The computer device according to claim 11, wherein the step of determining whether the target image corresponding to the picture to be detected includes an environmental element whose category is a preset abnormal category comprises:

Acquiring the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of the preset abnormal environmental element;

It is determined whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, it is determined that the target image includes an abnormal environmental element.
The computer device according to claim 11, wherein the processor further implements the following steps when executing the computer readable instructions:

Receiving multiple sample pictures;

Marking the image area in each sample picture and the category to which the image area belongs according to a received instruction;

Inputting the marked image area and the category to which the image area belongs into the target picture detection model;

Learning the image area and the category to which the image area belongs through the target picture detection model to obtain the trained target picture detection model.
The computer device according to claim 11, wherein the step of determining that the face in the picture to be detected is a non-living face if the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range comprises:

It is determined whether the display area of the abnormal environment element and the face area overlap, and if so, it is determined that the face in the picture to be detected is a non-living face.
The computer device according to any one of claims 11 to 14, wherein, before the step of obtaining a video image, the processor further implements the following steps when executing the computer readable instructions:

Outputting a prompt message that prompts the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

Outputting a prompt message that prompts the user to perform preset facial actions facing the camera;

Obtaining a video image in which the user's face area changes from large to small or from small to large and which contains the preset facial actions.
One or more readable storage media storing computer readable instructions, wherein when the computer readable instructions are executed by one or more processors, the one or more processors execute the following steps:

Acquiring a video image, and extracting multiple pictures to be detected from the video image;

Inputting the pictures to be detected into a pre-trained target picture detection model to obtain the target image included in each picture to be detected and the category to which the target image belongs;

Determining in turn whether the target images of each of the pictures to be detected include a target image whose category is a face area, and if so, further determining whether the target images of the corresponding picture to be detected include an environmental element whose category is a preset abnormal category;

If the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range, determining that the human face in that picture to be detected is a non-living human face.
The readable storage medium according to claim 16, wherein the step of determining whether the target image corresponding to the picture to be detected includes an environmental element whose category is a preset abnormal category comprises:

Acquiring the category to which each target image obtained from each of the to-be-detected pictures belongs and the category of the preset abnormal environmental element;

It is determined whether the category to which each target image belongs includes at least one category of the abnormal environmental element, and if so, it is determined that the target image includes an abnormal environmental element.
The readable storage medium according to claim 16, wherein, when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:

Receiving multiple sample pictures;

Marking the image area in each sample picture and the category to which the image area belongs according to a received instruction;

Inputting the marked image area and the category to which the image area belongs into the target picture detection model;

Learning the image area and the category to which the image area belongs through the target picture detection model to obtain the trained target picture detection model.
The readable storage medium according to claim 16, wherein the step of determining that the face in the picture to be detected is a non-living face if the relative positional relationship between the face area of at least one of the pictures to be detected and the abnormal environmental element is within a preset range comprises:

It is determined whether the display area of the abnormal environment element and the face area overlap, and if so, it is determined that the face in the picture to be detected is a non-living face.
The readable storage medium according to any one of claims 16 to 19, wherein, before the step of obtaining a video image, when the computer readable instructions are executed by one or more processors, the one or more processors further execute the following steps:

Outputting a prompt message that prompts the user to move from far to near or from near to far relative to a camera, where the camera is the camera that collects the video image;

Outputting a prompt message that prompts the user to perform preset facial actions facing the camera;

Obtaining a video image in which the user's face area changes from large to small or from small to large and which contains the preset facial actions.