WO2022227191A1 - Inactive living body detection method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022227191A1
Application number: PCT/CN2021/097078
Authority: WO
Inventors: 刘杰; 庄伯金; 曾凡涛
Original assignee: 平安科技（深圳）有限公司
Priority date: 2021-04-28
Filing date: 2021-05-30
Publication date: 2022-11-03
Also published as: CN113255456B; CN113255456A

Abstract

An inactive living body detection method and apparatus, an electronic device, and a computer readable storage medium, relating to the biological identification technology and the blockchain technology. The method comprises: obtaining an initial reference point set in an original video screenshot, and performing position disturbance to obtain a target reference point set; according to the initial reference point set and the target reference point set, performing geometric transformation on the original video screenshot to obtain a target image; inputting the original video screenshot, the target image, and the initial reference point set into a reference point analysis network so as to obtain a predicted reference point set; and according to an obtained final detection loss value, optimizing an inactive living body detection model to obtain a trained inactive living body detection model, and identifying an image to be identified, so as to obtain an inactive living body detection result. The described target image can be stored in a node of a blockchain. The method can solve the problem that the stability of inactive living body detection is not high.

Description

Inactive living body detection method, device, electronic device and storage medium

This application claims priority to the Chinese patent application No. 202110467269.0, filed on April 28, 2021 and entitled "Inactive living detection method, device, electronic device and storage medium", the entire content of which is incorporated herein by reference.

Technical field

The present application relates to the field of biometric identification, and in particular, to a non-active living body detection method, device, electronic device, and computer-readable storage medium.

Background

At present, banks usually review applicants for business through online video calls. However, the environment in which an applicant makes a video call is usually noisy, and faces other than the applicant's may appear in the video background, for example non-active living bodies that contain a human face, such as photo frames or posters. When the applicant's face information in the video is counted, these background faces need to be excluded to prevent false positives caused by multiple faces.

The inventor realized that in the prior art, methods for detecting non-active living bodies in background images are usually based on single-frame or multi-frame image texture, or on the analysis of eye, mouth and head movements across multiple frames. The stability of the texture-based methods relies heavily on the quality of image acquisition and the completeness of the training data. However, in actual online video calls there are no restrictions on the user's mobile phone model or call scenario, so it is very difficult to collect training data that covers all scenarios, and the image quality of network calls is poor, especially under bad lighting. This type of application scenario therefore greatly reduces the stability of existing silent living body detection methods based on single-frame or multi-frame textures.

SUMMARY OF THE INVENTION

A non-active living body detection method provided by this application includes:

Obtain original video screenshots, and select reference points for the original video screenshots to obtain an initial reference point set;

Performing position disturbance processing on the initial reference point set to obtain a target reference point set;

Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

Inputting the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

Use a preset root mean square error loss function to calculate the root mean square loss value between the prediction reference points in the prediction reference point set and the preset real reference points, and perform an arithmetic operation on the root mean square loss value and the classification loss value of a preset binary classification network to obtain the final detection loss value;

Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model;

Obtaining images to be recognized with a preset number of frames, inputting the images to be recognized into the trained non-active living body detection model, and obtaining a non-active living body detection result.

The present application also provides a non-active liveness detection device, the device comprising:

The target image acquisition module is used for acquiring original video screenshots and performing reference point selection on the original video screenshots to obtain an initial reference point set; performing position disturbance processing on the initial reference point set to obtain a target reference point set; and generating a geometric transformation matrix according to the initial reference point set and the target reference point set, and using the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

A reference point analysis module, for inputting the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

The loss value calculation module is used to calculate the root mean square loss value between the prediction reference points in the prediction reference point set and the preset real reference points by using a preset root mean square error loss function, and to perform an arithmetic operation on the root mean square loss value and the classification loss value of a preset binary classification network to obtain the final detection loss value;

A model training module, configured to perform iterative optimization processing on the non-active living detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value, to obtain a trained non-active living detection model;

The image detection module is used for acquiring images to be recognized with a preset number of frames, and inputting the images to be recognized into the trained non-active living body detection model to obtain a non-active living body detection result.

The present application also provides an electronic device, the electronic device comprising:

a memory that stores at least one instruction; and

A processor that executes the instructions stored in the memory to implement the following non-active liveness detection method:

The present application also provides a computer-readable storage medium, where at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is executed by a processor in an electronic device to implement the non-active living body detection method described below:

Description of drawings

FIG. 1 is a schematic flowchart of a non-active living body detection method provided by an embodiment of the present application;

FIG. 2 is a functional block diagram of a non-active living body detection device provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an electronic device for implementing the non-active living body detection method according to an embodiment of the present application.

The realization, functional characteristics and advantages of the purpose of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed description

It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.

The embodiments of the present application provide a non-active living body detection method. The execution subject of the non-active living body detection method includes, but is not limited to, at least one electronic device that can be configured to execute the method provided by the embodiments of the present application, such as a server or a terminal. In other words, the non-active living body detection method may be executed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.

Referring to FIG. 1 , a schematic flowchart of a non-active living body detection method provided by an embodiment of the present application is shown. In this embodiment, the non-active liveness detection method includes:

S1. Obtain original video screenshots, and select reference points for the original video screenshots to obtain an initial reference point set.

In the embodiment of the present application, the original video screenshot is an image captured during an online video call, and the image may include a human face and a human face background. For example, an image captured during identity information verification through an online video call in the financial field.

Specifically, obtaining the original video screenshot and performing reference point selection on the original video screenshot to obtain an initial reference point set includes:

mapping the original video screenshot to a preset two-dimensional rectangular coordinate system;

Multiple reference points in the original video screenshot are randomly selected on the two-dimensional rectangular coordinate system to generate an initial reference point set.

Optionally, in this embodiment of the present application, the number of randomly selected reference points is four.
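The selection step can be sketched as follows. This is a minimal illustration, assuming that pixel indices serve as the two-dimensional rectangular coordinates; the function name `select_reference_points` and the `seed` parameter are hypothetical and do not appear in the application.

```python
import random


def select_reference_points(width, height, count=4, seed=None):
    """Randomly pick `count` distinct pixel coordinates in a width x height image.

    Assumption: the image origin is the top-left pixel and coordinates are
    integer pixel indices; the application does not fix these details.
    """
    if count > width * height:
        raise ValueError("more points requested than pixels available")
    rng = random.Random(seed)
    points = set()
    while len(points) < count:
        points.add((rng.randrange(width), rng.randrange(height)))
    return sorted(points)


# Four reference points, as in the optional embodiment above.
initial_points = select_reference_points(640, 480, count=4, seed=0)
```

A fixed seed is used here only so that the example is repeatable.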

S2. Perform position disturbance processing on the initial reference point set to obtain a target reference point set.

In the embodiment of the present application, performing position disturbance processing on the initial reference point set to obtain the target reference point set includes:

Obtain the coordinate value of each initial reference point in the initial reference point set, and obtain the initial coordinate point set corresponding to the initial reference point set;

Using the disturbance function to perform disturbance calculation on the initial coordinate point set to obtain the target coordinate point set;

The target coordinate point set is mapped to a two-dimensional rectangular coordinate system to obtain a target reference point set.

In detail, in the embodiment of the present application, the disturbance function is a Hash function.
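The application states only that the disturbance function is a hash function and gives no further detail. The sketch below is therefore one possible reading, not the applicant's implementation: it derives a bounded, deterministic offset for each coordinate from a SHA-256 digest. The names `hash_offset` and `perturb_points`, the `salt` string and the `max_shift` bound are all assumptions introduced for illustration.

```python
import hashlib


def hash_offset(x, y, axis, max_shift, salt="demo"):
    """Map a point to a deterministic integer offset in [-max_shift, max_shift]."""
    digest = hashlib.sha256(f"{salt}:{axis}:{x}:{y}".encode()).digest()
    value = int.from_bytes(digest[:4], "big")
    return value % (2 * max_shift + 1) - max_shift


def perturb_points(points, width, height, max_shift=32, salt="demo"):
    """Shift every point by a hash-derived offset, clamped to the image bounds."""
    target = []
    for x, y in points:
        dx = hash_offset(x, y, "x", max_shift, salt)
        dy = hash_offset(x, y, "y", max_shift, salt)
        tx = min(max(x + dx, 0), width - 1)
        ty = min(max(y + dy, 0), height - 1)
        target.append((tx, ty))
    return target


initial_points = [(100, 80), (540, 90), (520, 400), (110, 390)]
target_points = perturb_points(initial_points, 640, 480)
```

Bounding the offset keeps the target points close to the initial points, so that the transformation generated from the two sets stays moderate.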

S3. Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image.

In this embodiment of the present application, geometric transformation refers to a bijection from a set with a geometric structure to itself or to another such set, and the geometric transformation includes image affine transformation and homography transformation. The image affine transformation applies a linear transformation to a vector space followed by a translation, transforming it into another vector space, and the homography transformation maps an image in the world coordinate system to the pixel coordinate system.

Further, using the geometric transformation matrix to perform geometric transformation processing on the original video screenshot means multiplying the geometric transformation matrix by the position of each pixel in the original video screenshot to obtain the geometrically transformed pixel points, and generating a target image from these pixel points.

Preferably, the homography transformation is adopted in this solution, that is, the generated geometric transformation matrix is a homography matrix, and the geometric transformation processing is a homography transformation processing.
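As a minimal sketch of the per-pixel multiplication described above, the following function maps one pixel position through a 3x3 matrix in homogeneous coordinates. The function name `apply_homography` and the example matrix are illustrative assumptions and are not taken from the application.

```python
def apply_homography(H, point):
    """Map (x, y) through a 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the third homogeneous component to return to pixel coordinates.
    return (u / w, v / w)


# Example matrix: a pure translation by (+10, -5).
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, -5.0],
     [0.0, 0.0, 1.0]]
print(apply_homography(H, (3, 4)))  # (13.0, -1.0)
```

A complete image warp would apply such a mapping to every pixel and interpolate the result; that part is omitted here.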

Specifically, the generating a geometric transformation matrix according to the initial reference point set and the target reference point set includes:

Obtain the first arbitrary coordinates of any initial reference point in the set of initial reference points and the second arbitrary coordinates of any one of the target reference points in the set of target reference points;

Calculate the geometric transformation matrix from the first arbitrary coordinate and the second arbitrary coordinate according to the geometric transformation formula.

Further, the geometric transformation matrix is obtained from the first arbitrary coordinate and the second arbitrary coordinate by using the following geometric transformation formula:

$$b = H a^{T}$$

Wherein, H is the geometric transformation matrix, b is the second arbitrary coordinate, a is the first arbitrary coordinate, and T is a fixed parameter.
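The application does not say how H is solved for. One common way to recover a homography from four point pairs, shown here purely as a sketch, is to fix the bottom-right entry of H to 1 and solve the resulting 8x8 linear system. The names `solve_linear_system`, `estimate_homography` and `project`, and the choice of fixing that entry, are assumptions made for this illustration.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(M[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (M[row][n] - tail) / M[row][row]
    return x


def estimate_homography(src, dst):
    """Return a 3x3 matrix H (bottom-right entry fixed to 1) mapping src to dst."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four correspondences are expected")
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, one for each output axis.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, point):
    """Apply H to (x, y) in homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


initial = [(0, 0), (100, 0), (100, 100), (0, 100)]
target = [(5, 3), (110, -4), (96, 108), (-7, 95)]
H = estimate_homography(initial, target)
```

With four reference points, as in the optional embodiment, the eight equations determine the eight unknown entries exactly, provided no three points are collinear.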

S4. Input the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set.

In this embodiment of the present application, the original video screenshot, the target image and the initial reference point set are input into a preset reference point analysis network, and the reference point analysis network is used to predict the positions on the target image of the initial reference points in the original video screenshot, thereby obtaining the prediction reference point set. The coordinates of the prediction reference points included in the prediction reference point set are the coordinates, on the target image, of the initial reference points of the original video screenshot.

Wherein, the reference point analysis network may be a deep learning model.

S5. Use a preset root mean square error loss function to calculate the root mean square loss value between the prediction reference points in the prediction reference point set and the preset real reference points, and perform an arithmetic operation on the root mean square loss value and the classification loss value of a preset binary classification network to obtain the final detection loss value.

In the embodiment of the present application, the calculation of the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point by using the preset root mean square error loss function includes:

Use the following preset root mean square error loss function formula to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

Wherein, MSE is the root mean square error loss value, $y_i$ is the i-th prediction reference point, $\hat{y}_i$ is the corresponding real reference point, and n is the total number of prediction reference points in the prediction reference point set.

Further, performing the arithmetic operation on the root mean square loss value and the classification loss value of the preset binary classification network includes calculating the sum of the root mean square loss value and the classification loss value of the binary classification network to obtain the final detection loss value, wherein the classification loss value of the binary classification network can be calculated by using the cross-entropy loss function.
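The loss combination can be illustrated with a small sketch. It assumes that the reference point loss is the mean of the squared coordinate differences (the text names it a root mean square loss but denotes it MSE, so the exact form is an assumption) and that the two losses are added without weighting. The function names are hypothetical.

```python
import math


def mean_squared_error(predicted, real):
    """Average squared Euclidean distance between matched 2-D points."""
    if not predicted or len(predicted) != len(real):
        raise ValueError("point sets must be non-empty and of equal length")
    total = 0.0
    for (px, py), (rx, ry) in zip(predicted, real):
        total += (px - rx) ** 2 + (py - ry) ** 2
    return total / len(predicted)


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy for one sample; `prob` is the predicted probability of label 1."""
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def final_detection_loss(predicted, real, prob, label):
    """Sum of the reference point loss and the classification loss."""
    return mean_squared_error(predicted, real) + binary_cross_entropy(prob, label)


loss = final_detection_loss(
    predicted=[(1.0, 1.0), (2.0, 2.0)],
    real=[(1.0, 2.0), (2.0, 4.0)],
    prob=0.5,
    label=1,
)
```

In this example the reference point term is (1 + 4) / 2 = 2.5 and the classification term is ln 2, so the final loss is about 3.193.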

S6. Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model.

In the embodiment of the present application, the trained non-active living body detection model is used to detect whether a human face in a picture containing a human face is a non-active living body attack.

In the embodiment of the present application, performing iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the binary classification network according to the final detection loss value to obtain a trained non-active living body detection model includes:

comparing the final detection loss value with a preset loss threshold;

If the final detection loss value is less than the preset loss threshold, determine that the non-active living body detection model is a trained non-active living body detection model;

If the final detection loss value is greater than or equal to the preset loss threshold, adjust the internal parameters of the non-active living body detection model until the final detection loss value is less than the preset loss threshold, to obtain a trained non-active living body detection model.

Wherein, the internal parameters of the inactive living body detection model may be the gradient of the model or the internal parameters of the model.
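The threshold-based stopping rule above can be sketched with a toy example. Plain gradient descent and a one-parameter quadratic loss stand in for the real model and optimizer, which the application does not specify; `train_until_threshold`, `lr` and `max_steps` are hypothetical names, and the step limit is an added safeguard.

```python
def train_until_threshold(loss_fn, grad_fn, params, threshold, lr=0.1, max_steps=10_000):
    """Adjust `params` by gradient descent until loss_fn(params) < threshold."""
    for step in range(max_steps):
        loss = loss_fn(params)
        if loss < threshold:
            # Loss is below the preset threshold: treat the model as trained.
            return params, loss, step
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    raise RuntimeError("loss threshold not reached within max_steps")


# Toy stand-in for the detection model: loss(w) = (w - 3)^2.
def toy_loss(params):
    return (params[0] - 3.0) ** 2


def toy_grad(params):
    return [2.0 * (params[0] - 3.0)]


trained, final_loss, steps = train_until_threshold(toy_loss, toy_grad, [0.0], threshold=1e-6)
```

The loop mirrors the two branches in the text: it returns as soon as the loss falls below the threshold and otherwise keeps adjusting the parameters.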

S7. Acquire images to be recognized with a preset number of frames, and input the images to be recognized into the trained non-active living body detection model to obtain a non-active living body detection result.

In this embodiment of the present application, the preset number of frames may be at least two consecutive or discontinuous images.

In the embodiment of the present application, the images to be recognized with the preset number of frames may be multiple consecutive screenshots containing human faces taken from a chat video, or may be multiple screenshots containing human faces captured through a camera during identity verification. The multiple screenshots may be 2D images.

In the embodiment of the present application, by inputting the images to be recognized with a preset number of frames into the non-active living body detection model, it can be detected whether the images to be recognized contain non-active living bodies.

In this embodiment of the present application, a geometric transformation determined by the initial reference point set selected in the original video screenshot and the target reference point set obtained through position disturbance processing is applied to the original video screenshot to obtain a target image, and the reference point analysis network is trained on the original video screenshot, the target image and the initial reference point set. The trained reference point analysis network focuses on learning information related to the geometric transformation of the face, which reduces the dependence on a large amount of training data and therefore improves the stability of non-active living body detection. In addition, a binary classification network is added in this embodiment, which improves generalization and the accuracy of non-active living body detection. Therefore, the non-active living body detection method proposed in this application can solve the problem of low stability of non-active living body detection.

As shown in FIG. 2 , it is a functional block diagram of an inactive living body detection device provided by an embodiment of the present application.

The non-active living body detection apparatus 100 described in this application may be installed in an electronic device. According to the implemented functions, the non-active living body detection apparatus 100 may include a target image acquisition module 101, a reference point analysis module 102, a loss value calculation module 103, a model training module 104 and an image detection module 105. The modules described in this application may also be referred to as units, which refer to a series of computer program segments that can be executed by the processor of the electronic device and can perform fixed functions, and are stored in the memory of the electronic device.

In this embodiment, the functions of each module/unit are as follows:

The target image acquisition module 101 is configured to acquire original video screenshots, select reference points for the original video screenshots, and obtain an initial reference point set; perform position disturbance processing on the initial reference point set to obtain a target reference point set; Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

The reference point analysis module 102 is configured to input the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

The loss value calculation module 103 is configured to use a preset root mean square error loss function to calculate the root mean square loss value between the prediction reference points in the prediction reference point set and the preset real reference points, and to perform an arithmetic operation on the root mean square loss value and the classification loss value of a preset binary classification network to obtain the final detection loss value;

The model training module 104 is configured to iteratively optimize the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value, and obtain a trained non-active living body detection model;

The image detection module 105 is configured to acquire images to be recognized with a preset number of frames, input the images to be recognized into the trained non-active living detection model, and obtain a non-active living detection result.

In detail, when each module of the non-active body detection apparatus 100 is executed by the processor of the electronic device, a non-active body detection method including the following steps can be implemented:

Step 1: Obtain a screenshot of the original video, select a reference point for the original video screenshot, and obtain an initial set of reference points.

In the embodiment of the present application, the original video screenshot is a picture image captured during an online video call, and the image may include a human face and a human face background. For example, an image captured during identity information verification through an online video call in the financial field.

Step 2: Perform position disturbance processing on the initial reference point set to obtain a target reference point set.

Step 3: Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image.

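Specifically, the first arbitrary coordinate of any one of the initial reference points in the initial reference point set and the second arbitrary coordinate of any one of the target reference points in the target reference point set are obtained, and the geometric transformation matrix is calculated from them by using the geometric transformation formula:

$$b = H a^{T}$$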

Step 4: Input the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set.

Wherein, the reference point analysis network may be a deep learning model.

Step 5: Calculate the root mean square loss value between the prediction reference points in the prediction reference point set and the preset real reference points by using a preset root mean square error loss function, and perform an arithmetic operation on the root mean square loss value and the classification loss value of a preset binary classification network to obtain the final detection loss value.

Step 6: Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model.

Step 7: Acquire images to be recognized with a preset number of frames, input the images to be recognized into the trained non-active living body detection model, and obtain a non-active living body detection result.

In this embodiment of the present application, a geometric transformation determined by the initial reference point set selected in the original video screenshot and the target reference point set obtained through position disturbance processing is applied to the original video screenshot to obtain a target image, and the reference point analysis network is trained on the original video screenshot, the target image and the initial reference point set. The trained reference point analysis network focuses on learning information related to the geometric transformation of the face, which reduces the dependence on a large amount of training data and therefore improves the stability of non-active living body detection. In addition, a binary classification network is added in this embodiment, which improves generalization and the accuracy of non-active living body detection. Therefore, the non-active living body detection device proposed in the present application can solve the problem of low stability of non-active living body detection.

As shown in FIG. 3 , it is a schematic structural diagram of an electronic device for implementing an inactive living body detection method provided by an embodiment of the present application.

The electronic device 1 may include a processor 10, a memory 11 and a bus, and may also include a computer program stored in the memory 11 and executable on the processor 10, such as an inactive living body detection program 12.

Wherein, the memory 11 includes at least one type of readable storage medium, and the readable storage medium may be volatile or non-volatile. Specifically, the readable storage medium includes a flash memory, a mobile hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a mobile hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a pluggable mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the non-active living body detection program 12, but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control core (Control Unit) of the electronic device. It uses various interfaces and lines to connect the components of the entire electronic device, and executes the various functions of the electronic device 1 and processes data by running or executing programs or modules stored in the memory 11 (such as the non-active living body detection program) and calling data stored in the memory 11.

The bus may be a peripheral component interconnect (PCI for short) bus or an extended industry standard architecture (Extended industry standard architecture, EISA for short) bus or the like. The bus can be divided into address bus, data bus, control bus and so on. The bus is configured to implement connection communication between the memory 11 and at least one processor 10 and the like.

FIG. 3 only shows an electronic device with components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, and may include fewer or more components than those shown in the figure. components, or a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) for powering the various components, preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that the power management The device implements functions such as charge management, discharge management, and power consumption management. The power source may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components. The electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

Further, the electronic device 1 may also include a network interface, optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a Bluetooth interface, etc.), which is usually used in the electronic device 1 Establish a communication connection with other electronic devices.

Optionally, the electronic device 1 may further include a user interface, and the user interface may be a display (Display), an input unit (eg, a keyboard (Keyboard)), optionally, the user interface may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch panel, and the like. The display may also be appropriately called a display screen or a display unit, which is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.

The non-active living body detection program 12 stored in the memory 11 in the electronic device 1 is a combination of multiple instructions, and when running in the processor 10, can realize:

Specifically, for the specific implementation method of the above-mentioned instruction by the processor 10, reference may be made to the description of the relevant steps in the corresponding embodiment of FIG. 1, and details are not described herein.

Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM, Read-Only Memory).

The present application also provides a computer-readable storage medium, where the readable storage medium stores a computer program, and when executed by a processor of an electronic device, the computer program can realize:

In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative: the division of the modules is only a logical function division, and there may be other division manners in actual implementation.

The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional module in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of hardware plus software function modules.

It will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application.

Accordingly, the embodiments are to be regarded in all respects as illustrative and not restrictive, and the scope of the application is defined by the appended claims rather than the foregoing description. All changes that fall within the meaning and scope of the equivalents of the claims are therefore intended to be included in this application. Any reference signs in the claims shall not be construed as limiting the claim involved.

The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Several units or means recited in the system claims can also be realized by one unit or means through software or hardware. Terms such as "first" and "second" are used to denote names and do not denote any particular order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present application without departing from the spirit and scope of those technical solutions.

Claims

A non-active liveness detection method, wherein the method comprises:

Obtain original video screenshots, and select reference points for the original video screenshots to obtain an initial reference point set;

Performing position disturbance processing on the initial reference point set to obtain a target reference point set;

Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

Inputting the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

Use the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point, and perform an arithmetic operation on the root mean square loss value and the preset classification loss value of the binary classification network to obtain the final detection loss value;

Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model;

Obtaining images to be recognized with a preset number of frames, inputting the images to be recognized into the trained non-active living body detection model, and obtaining a non-active living body detection result.
The non-active liveness detection method according to claim 1, wherein the acquiring the original video screenshot, performing reference point selection on the original video screenshot, and obtaining an initial reference point set, comprising:

mapping the original video screenshot to a preset two-dimensional rectangular coordinate system;

Multiple reference points in the original video screenshot are randomly selected on the two-dimensional rectangular coordinate system to generate an initial reference point set.
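The random selection of reference points on a two-dimensional rectangular coordinate system can be illustrated with a minimal sketch. The function name `select_reference_points`, the default of four points, and the `seed` parameter are assumptions made for illustration only; the claim does not specify how many points are chosen or how the random selection is performed.

```python
import random

def select_reference_points(width, height, num_points=4, seed=None):
    """Randomly pick distinct pixel coordinates inside a width x height image.

    Hypothetical helper: the image is treated as a 2-D rectangular
    coordinate system with x in [0, width) and y in [0, height).
    """
    rng = random.Random(seed)
    # Sample flat pixel indices without replacement so no point repeats.
    flat_indices = rng.sample(range(width * height), num_points)
    # Convert each flat index back to an (x, y) coordinate.
    return [(idx % width, idx // width) for idx in flat_indices]

initial_points = select_reference_points(640, 480, num_points=4, seed=0)
```

Sampling without replacement is a design choice of this sketch; it avoids duplicate points, which would make a later transformation estimate degenerate.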
The non-active living body detection method according to claim 1, wherein the performing position disturbance processing on the initial reference point set to obtain the target reference point set, comprising:

Obtain the coordinate value of each initial reference point in the initial reference point set, and obtain the initial coordinate point set corresponding to the initial reference point set;

Using the scrambling function to perform scrambling calculation on the initial coordinate point set to obtain the target coordinate point set;

The target coordinate point set is mapped to a two-dimensional rectangular coordinate system to obtain a target reference point set.
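As an illustration of the position disturbance step, the sketch below shifts each initial coordinate by a bounded random offset and clamps the result to the image bounds. The claim does not define the scrambling function; the uniform integer offset, the name `perturb_points`, and the `max_shift` parameter are assumptions of this sketch.

```python
import random

def perturb_points(points, width, height, max_shift=10, seed=None):
    """Shift every (x, y) by a random offset in [-max_shift, max_shift].

    Hypothetical scrambling function: the result is clamped so that each
    target point still lies inside the width x height coordinate system.
    """
    rng = random.Random(seed)
    target_points = []
    for x, y in points:
        new_x = x + rng.randint(-max_shift, max_shift)
        new_y = y + rng.randint(-max_shift, max_shift)
        # Clamp to the valid coordinate range of the image.
        new_x = min(max(new_x, 0), width - 1)
        new_y = min(max(new_y, 0), height - 1)
        target_points.append((new_x, new_y))
    return target_points

initial_points = [(0, 0), (639, 0), (639, 479), (0, 479)]
target_points = perturb_points(initial_points, 640, 480, max_shift=10, seed=0)
```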
The non-active liveness detection method according to claim 1, wherein the generating a geometric transformation matrix according to the initial reference point set and the target reference point set comprises:

Obtain the first arbitrary coordinates of any one of the initial reference points in the set of initial reference points and the second arbitrary coordinates of any one of the target reference points in the set of target reference points;

The first arbitrary coordinates and the second arbitrary coordinates are calculated according to the geometric transformation formula to obtain the geometric transformation matrix.
The non-active living body detection method according to claim 4, wherein calculating the first arbitrary coordinates and the second arbitrary coordinates according to the geometric transformation formula to obtain the geometric transformation matrix comprises:

Use the following geometric transformation formula to calculate with the first arbitrary coordinates and the second arbitrary coordinates, and obtain the geometric transformation matrix:

$b = Ha^{T}$

Wherein, H is the geometric transformation matrix, b is the second arbitrary coordinate, a is the first arbitrary coordinate, and T is a fixed parameter.
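One possible way to recover a matrix H from point correspondences is sketched below. It assumes that H is a 3x3 projective transformation (homography) acting on homogeneous coordinates, that exactly four point pairs are used, and that the bottom-right entry of H is fixed to 1. None of this is stated in the claim, and the helper names `solve_linear`, `estimate_homography` and `apply_homography` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def estimate_homography(src, dst):
    """Return a 3x3 matrix H (with H[2][2] = 1) mapping four src points to dst."""
    A, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the
        # eight unknown entries of H.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(A, rhs) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, point):
    """Compute b = H a^T for a = (x, y, 1) and return the de-homogenized b."""
    a = (point[0], point[1], 1.0)
    bx, by, bw = (sum(H[r][c] * a[c] for c in range(3)) for r in range(3))
    return (bx / bw, by / bw)

src = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst = [(5, 3), (98, -4), (105, 110), (-2, 95)]
H = estimate_homography(src, dst)
```

With four correspondences in general position (no three points collinear), the eight equations determine H uniquely under the stated normalization, so applying H to each source point reproduces its target point.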
The non-active living body detection method according to claim 1, wherein performing iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the binary classification network according to the final detection loss value to obtain a trained non-active living body detection model comprises:

judging the size between the final detection loss value and a preset loss threshold;

If the final detection loss value is less than the preset loss threshold, determine that the non-active living body detection model is a trained non-active living body detection model;

If the final detection loss value is greater than or equal to the preset loss threshold, adjust the internal parameters of the non-active living body detection model until the final detection loss value is less than the preset loss threshold, to obtain a trained non-active living body detection model.
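The threshold-based stopping rule described above can be sketched as a simple loop. The toy one-parameter model, the gradient-descent update, the learning rate, and the `max_iterations` safety cap are all assumptions for illustration; the claim only states that internal parameters are adjusted until the loss falls below the preset threshold.

```python
def train_until_threshold(compute_loss, update_parameters, params,
                          loss_threshold, max_iterations=10_000):
    """Adjust params until compute_loss(params) < loss_threshold.

    Hypothetical helper: max_iterations is a safety cap added by this
    sketch so that the loop cannot run forever.
    """
    for _ in range(max_iterations):
        loss = compute_loss(params)
        if loss < loss_threshold:
            # Loss is below the preset threshold: treat the model as trained.
            return params, loss
        # Otherwise adjust the internal parameters and evaluate again.
        params = update_parameters(params)
    raise RuntimeError("loss did not fall below the threshold")

# Toy stand-in for the detection model: one weight w with loss (w - 3)^2.
def toy_loss(w):
    return (w - 3.0) ** 2

def toy_update(w, learning_rate=0.1):
    gradient = 2.0 * (w - 3.0)
    return w - learning_rate * gradient

trained_w, final_loss = train_until_threshold(toy_loss, toy_update, 0.0, 1e-4)
```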
The non-active living body detection method according to any one of claims 1 to 6, wherein using the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point comprises:

Use the following preset root mean square error loss function formula to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point:

Wherein, MSE is the root mean square error loss value; y_i is the i-th prediction reference point, which is compared against its corresponding real reference point; n is the total number of prediction reference points in the prediction reference point set; and i is the index of the prediction reference point.
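A minimal sketch of how such a loss value and the final detection loss could be computed is given below. It assumes the conventional mean squared error over two-dimensional points, without a square root (the text names a root mean square error but uses the symbol MSE, so the exact form is not certain), and it assumes that the "arithmetic operation" combining the two losses is a plain sum. The function names are hypothetical.

```python
def mean_squared_error(predicted_points, real_points):
    """Average squared distance between predicted and real (x, y) points.

    Assumption: no square root is applied, matching the symbol MSE.
    """
    if not predicted_points or len(predicted_points) != len(real_points):
        raise ValueError("point sets must be non-empty and of equal length")
    total = 0.0
    for (px, py), (rx, ry) in zip(predicted_points, real_points):
        total += (px - rx) ** 2 + (py - ry) ** 2
    return total / len(predicted_points)

def final_detection_loss(regression_loss, classification_loss):
    """Combine the two losses; a plain sum is an assumption of this sketch."""
    return regression_loss + classification_loss

predicted = [(1.0, 1.0), (2.0, 2.0)]
real = [(1.0, 2.0), (4.0, 2.0)]
mse = mean_squared_error(predicted, real)   # (1 + 4) / 2 = 2.5
total_loss = final_detection_loss(mse, 0.5)  # 2.5 + 0.5 = 3.0
```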
A non-active liveness detection device, wherein the device comprises:

The target image acquisition module is used for acquiring original video screenshots, performing reference point selection on the original video screenshots to obtain an initial reference point set; performing position disturbance processing on the initial reference point set to obtain a target reference point set; according to the The initial reference point set and the target reference point set generate a geometric transformation matrix, and the original video screenshot is subjected to geometric transformation processing by using the geometric transformation matrix to obtain a target image;

a reference point analysis module, configured to input the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

The loss value calculation module is used to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point by using the preset root mean square error loss function, and to perform an arithmetic operation on the root mean square loss value and the preset classification loss value of the binary classification network to obtain the final detection loss value;

A model training module, configured to perform iterative optimization processing on the non-active living detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value, to obtain a trained non-active living detection model;

The image detection module is used for acquiring images to be recognized with a preset number of frames, and inputting the images to be recognized into the trained non-active living body detection model to obtain a non-active living body detection result.
An electronic device, wherein the electronic device comprises:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform an inactive liveness detection method as described below:

Obtain original video screenshots, and select reference points for the original video screenshots to obtain an initial reference point set;

Performing position disturbance processing on the initial reference point set to obtain a target reference point set;

Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

Inputting the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

Use the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point, and perform an arithmetic operation on the root mean square loss value and the preset classification loss value of the binary classification network to obtain the final detection loss value;

Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model;

Obtaining images to be recognized with a preset number of frames, inputting the images to be recognized into the trained non-active living body detection model, and obtaining a non-active living body detection result.
The electronic device according to claim 9, wherein the performing position perturbation processing on the initial reference point set to obtain the target reference point set comprises:

Obtain the coordinate value of each initial reference point in the initial reference point set, and obtain the initial coordinate point set corresponding to the initial reference point set;

Using the scrambling function to perform scrambling calculation on the initial coordinate point set to obtain the target coordinate point set;

The target coordinate point set is mapped to a two-dimensional rectangular coordinate system to obtain a target reference point set.
The electronic device according to claim 9, wherein the generating a geometric transformation matrix according to the initial reference point set and the target reference point set comprises:

Obtain the first arbitrary coordinates of any one of the initial reference points in the set of initial reference points and the second arbitrary coordinates of any one of the target reference points in the set of target reference points;

The first arbitrary coordinates and the second arbitrary coordinates are calculated according to the geometric transformation formula to obtain the geometric transformation matrix.
The electronic device according to claim 11, wherein calculating the first arbitrary coordinates and the second arbitrary coordinates according to the geometric transformation formula to obtain the geometric transformation matrix comprises:

Use the following geometric transformation formula to calculate with the first arbitrary coordinates and the second arbitrary coordinates, and obtain the geometric transformation matrix:

$b = Ha^{T}$

Wherein, H is the geometric transformation matrix, b is the second arbitrary coordinate, a is the first arbitrary coordinate, and T is a fixed parameter.
The electronic device according to claim 9, wherein performing iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the binary classification network according to the final detection loss value to obtain a trained non-active living body detection model comprises:

judging the size between the final detection loss value and a preset loss threshold;

If the final detection loss value is less than the preset loss threshold, determine that the non-active living body detection model is a trained non-active living body detection model;

If the final detection loss value is greater than or equal to the preset loss threshold, adjust the internal parameters of the non-active living body detection model until the final detection loss value is less than the preset loss threshold, to obtain a trained non-active living body detection model.
The electronic device according to any one of claims 9 to 13, wherein using the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point comprises:

Use the following preset root mean square error loss function formula to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point:

Wherein, MSE is the root mean square error loss value; y_i is the i-th prediction reference point, which is compared against its corresponding real reference point; n is the total number of prediction reference points in the prediction reference point set; and i is the index of the prediction reference point.
A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the non-active living body detection method according to any one of claims 1 to 7 is implemented:

Obtain original video screenshots, and select reference points for the original video screenshots to obtain an initial reference point set;

Performing position disturbance processing on the initial reference point set to obtain a target reference point set;

Generate a geometric transformation matrix according to the initial reference point set and the target reference point set, and use the geometric transformation matrix to perform geometric transformation processing on the original video screenshot to obtain a target image;

Inputting the original video screenshot, the target image and the initial reference point set into a preset reference point analysis network to obtain a prediction reference point set;

Use the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point, and perform an arithmetic operation on the root mean square loss value and the preset classification loss value of the binary classification network to obtain the final detection loss value;

Perform iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the two-classification network according to the final detection loss value to obtain a trained non-active living body detection model;

Obtaining images to be recognized with a preset number of frames, inputting the images to be recognized into the trained non-active living body detection model, and obtaining a non-active living body detection result.
The computer-readable storage medium according to claim 15, wherein the performing position perturbation processing on the initial reference point set to obtain the target reference point set comprises:

Obtain the coordinate value of each initial reference point in the initial reference point set, and obtain the initial coordinate point set corresponding to the initial reference point set;

Using the scrambling function to perform scrambling calculation on the initial coordinate point set to obtain the target coordinate point set;

The target coordinate point set is mapped to a two-dimensional rectangular coordinate system to obtain a target reference point set.
The computer-readable storage medium of claim 15, wherein the generating a geometric transformation matrix according to the initial reference point set and the target reference point set comprises:

Obtain the first arbitrary coordinates of any one of the initial reference points in the set of initial reference points and the second arbitrary coordinates of any one of the target reference points in the set of target reference points;

The first arbitrary coordinates and the second arbitrary coordinates are calculated according to the geometric transformation formula to obtain the geometric transformation matrix.
The computer-readable storage medium according to claim 17, wherein calculating the first arbitrary coordinates and the second arbitrary coordinates according to the geometric transformation formula to obtain the geometric transformation matrix comprises:

Use the following geometric transformation formula to calculate with the first arbitrary coordinates and the second arbitrary coordinates, and obtain the geometric transformation matrix:

$b = Ha^{T}$

Wherein, H is the geometric transformation matrix, b is the second arbitrary coordinate, a is the first arbitrary coordinate, and T is a fixed parameter.
The computer-readable storage medium according to claim 15, wherein performing iterative optimization processing on the non-active living body detection model constructed by the reference point analysis network and the binary classification network according to the final detection loss value to obtain a trained non-active living body detection model comprises:

judging the size between the final detection loss value and a preset loss threshold;

If the final detection loss value is less than the preset loss threshold, determine that the non-active living body detection model is a trained non-active living body detection model;

If the final detection loss value is greater than or equal to the preset loss threshold, adjust the internal parameters of the non-active living body detection model until the final detection loss value is less than the preset loss threshold, to obtain a trained non-active living body detection model.
The computer-readable storage medium according to any one of claims 15 to 19, wherein using the preset root mean square error loss function to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point comprises:

Use the following preset root mean square error loss function formula to calculate the root mean square loss value between the prediction reference point in the prediction reference point set and the preset real reference point:

Wherein, MSE is the root mean square error loss value; y_i is the i-th prediction reference point, which is compared against its corresponding real reference point; n is the total number of prediction reference points in the prediction reference point set; and i is the index of the prediction reference point.