CN114758369A - Living body detection model training method, system, equipment and storage medium - Google Patents
Living body detection model training method, system, equipment and storage medium
- Publication number
- CN114758369A (application number CN202011582603.9A)
- Authority
- CN
- China
- Prior art keywords
- face frame
- training
- data
- detection model
- depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a living body detection model training method, system, device and storage medium based on training data augmentation, comprising the following steps: reading depth data and a corresponding RGB image; performing face detection on the RGB image to generate a reference face frame, and constructing random face frames according to the reference face frame; cropping the depth areas corresponding to the random face frames out of the depth data to generate a negative sample training set; and training a 3D living body detection model on the negative sample training set to generate a target 3D living body detection model. Because the random face frames are generated from the reference face frame obtained by face detection on the RGB image, and the negative samples are cropped from the corresponding depth areas of the depth data, training the 3D living body detection model with this negative sample training set reduces the instability that false face detections would otherwise bring to the model.
Description
Technical Field
The invention relates to face detection, and in particular to a living body detection model training method, system, device and storage medium based on training data augmentation.
Background
Face detection methods can be roughly divided into two categories: face detection based on 2D face images and face detection based on 3D face images. 2D face detection relies on the planar imaging of a 2D camera and cannot capture three-dimensional information from the physical world (geometric data such as size and distance). However advanced the algorithms and software, the achievable security level under such limited input is not high: the system can easily be defeated by photos, videos, makeup, masks and the like, and cannot meet the security-level requirements of smartphones.
3D face detection is realized by the three-dimensional imaging of a 3D camera, which measures the three-dimensional coordinates of every point in its field of view. The computer thus obtains 3D data of the scene, can reconstruct the complete three-dimensional world, and can realize various kinds of intelligent three-dimensional positioning. Such face detection can distinguish flat images, videos, makeup, leather masks, twins and other states, and is suitable for application scenarios with high security requirements, such as the financial field and smartphones.
Because technologies such as face detection and face alignment were developed on RGB images, a 3D living body algorithm is usually designed to perform face detection and face alignment on the RGB image, align the result to the depth map to obtain the face region, and then perform 3D face living body recognition.
Due to the complexity of the scenes in which a 3D living body algorithm is used, a large number of false detections of the face region may occur, as shown in fig. 1. Because of such false detections, regions other than the target face region, such as the background, the head region, or a partial region of the face, may enter the 3D living body judgment algorithm, as shown in fig. 2, which can produce unpredictable results in the use of the face living body judgment algorithm.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a living body detection model training method, system, device and storage medium based on training data augmentation.
The living body detection model training method based on training data augmentation provided by the invention comprises the following steps:
step S1: reading depth data and a corresponding RGB image;
step S2: performing face detection on the RGB image to generate a reference face frame, and constructing a random face frame according to the reference face frame;
step S3: cropping the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set;
step S4: training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
Preferably, the target 3D living body detection model is generated by training a neural network model;
the neural network model comprises a feature extraction layer, a fully connected layer, a random discarding (dropout) layer, a classification layer and a loss calculation layer arranged in sequence.
Preferably, the step S2 includes the steps of:
step S201: performing face detection on the RGB image to generate a reference face frame g = (x1, y1, W, H), wherein (x1, y1) are the corner coordinates of the reference face frame, W is its width, and H is its height;
step S202: taking the smaller value of the width and the height of the reference face frame as the maximum size S of the generated random face frame;
step S203: randomly taking a value nx in the range [0, W−S), randomly taking a value ny in the range [0, H−S), and generating a set of target face frames c = (nx, ny, S) in the RGB image, wherein (nx, ny) are the corner coordinates of the target face frame and S is both its width and its height;
step S204: judging whether the intersection over union (IoU) between the target face frame and the reference face frame is smaller than a preset IoU threshold, and determining the target face frame to be a random face frame when it is.
Preferably, the step S3 includes the steps of:
step S301: intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data;
step S302: acquiring preset key point information;
step S303: mapping the key point information onto the 2D depth data and performing normalization to generate the negative sample training set.
Preferably, the method further comprises the following step:
step S4: training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
Preferably, the step S4 includes the steps of:
step S401: performing key point detection on the RGB image, and determining a plurality of face key points in the reference face frame;
step S402: mapping the key points of the human face in the RGB image to depth data normalized to a preset size to generate a positive sample training set;
step S403: training the 3D living body detection model according to the negative sample training set and the positive sample training set to generate a target 3D living body detection model.
Preferably, the step S204 includes the steps of:
step S2041: acquiring a target face frame;
step S2042: acquiring a preset IoU threshold and judging whether the IoU between the target face frame and the reference face frame is smaller than or equal to the preset IoU threshold; triggering step S2043 when it is smaller than or equal to the threshold, and triggering step S2041 when it is larger than the threshold;
step S2043: and determining the target face frame as a random face frame.
Preferably, the step S301 includes the steps of:
step S3011: determining depth area data corresponding to the depth data according to the random face frame;
step S3012: judging whether zero-value data exists in the depth region data in the depth direction; when zero-value data exists and the ratio of the amount of zero-value data to the total amount of depth region data is larger than a preset proportional threshold, discarding the random face frame and returning to step S3011; otherwise, triggering step S3013;
step S3013: and intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data.
The living body detection model training system based on training data augmentation provided by the invention comprises the following modules:
the data reading module is used for reading the depth data and the corresponding RGB image;
the random face frame generating module is used for carrying out face detection on the RGB image to generate a reference face frame and constructing a random face frame according to the reference face frame;
the training set generation module is used for cropping the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set; and
the model training module is used for training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
According to the invention, the living body detection model training device based on training data augmentation comprises:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the living body detection model training method based on training data augmentation via execution of the executable instructions.
According to the present invention, there is provided a computer-readable storage medium storing a program which, when executed, realizes the steps of the living body detection model training method based on training data augmentation.
Compared with the prior art, the invention has the following beneficial effects:
according to the method, a plurality of random face frames are generated according to the standard face frame generated by face detection on the RGB image, then screenshot is performed on the depth area corresponding to the depth data according to the random face frames to generate a negative sample training set, and the 3D living body detection model is trained by adopting the negative sample training set, so that the instability of the 3D living body detection model caused by face virtual detection can be reduced.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. Other features, objects and advantages of the invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the drawings:
FIG. 1 is a schematic diagram of false detection of a face region during face detection in the prior art;
FIG. 2 is a schematic diagram illustrating the influence of false detection of a face region on 3D living body detection model training in the prior art;
FIG. 3 is a flowchart illustrating the steps of a living body detection model training method based on training data augmentation in an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the steps of constructing a random face frame according to the reference face frame in an embodiment of the present invention;
FIG. 5 is a flowchart illustrating the steps of generating a negative sample training set in an embodiment of the present invention;
FIG. 6(a) is a schematic structural diagram of the training phase of the living body detection model in an embodiment of the present invention;
FIG. 6(b) is a schematic structural diagram of the inference phase of the living body detection model in an embodiment of the present invention;
FIG. 7 is a flowchart illustrating the steps of generating a target 3D living body detection model in an embodiment of the present invention;
FIG. 8 is a flowchart illustrating the steps of determining a random face frame in an embodiment of the present invention;
FIG. 9 is a flowchart illustrating the steps of generating 2D depth data in an embodiment of the present invention;
FIG. 10 is a diagram illustrating random face frames in a depth image in an embodiment of the present invention;
FIG. 11 is a flowchart illustrating the steps of an offline augmentation method in an embodiment of the present invention;
FIG. 12 is a flowchart illustrating the steps of an online augmentation method in an embodiment of the present invention;
FIG. 13 is a block diagram of a living body detection model training system based on training data augmentation in an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a living body detection model training device based on training data augmentation in an embodiment of the present invention; and
FIG. 15 is a schematic structural diagram of a computer-readable storage medium in an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but do not limit it in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention; all such variations fall within the scope of the invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
The invention provides a living body detection model training method based on training data augmentation, and aims to solve the above problems in the prior art.
The technical solutions of the invention, and how they solve the above technical problems, are described below through specific embodiments with reference to the accompanying drawings.
Fig. 3 is a flowchart illustrating the steps of the living body detection model training method based on training data augmentation in an embodiment of the present invention. As shown in fig. 3, the method includes the following steps:
step S1: reading depth data and a corresponding RGB image;
in an embodiment of the invention, the depth data is aligned with the RGB image at the pixel level.
Step S2: performing face detection on the RGB image to generate a reference face frame, and constructing a random face frame according to the reference face frame;
fig. 4 is a flowchart of steps of constructing a random face frame according to the reference face frame in the embodiment of the present invention, and as shown in fig. 4, the step S2 includes the following steps:
step S201: performing face detection on the RGB image to generate a reference face frame g = (x1, y1, W, H), wherein (x1, y1) are the corner coordinates of the reference face frame, W is its width, and H is its height;
in the embodiment of the present invention, the (x1, y1) is the coordinates of the upper corner point of the reference face frame.
Step S202: taking the smaller value of the width and the height of the reference face frame as the maximum size S of the generated random face frame;
step S203: randomly taking a value nx in the range [0, W−S), randomly taking a value ny in the range [0, H−S), and generating a set of target face frames c = (nx, ny, S) in the RGB image, wherein (nx, ny) are the corner coordinates of the target face frame and S is both its width and its height;
step S204: judging whether the intersection over union (IoU) between the target face frame and the reference face frame is smaller than a preset IoU threshold, and determining the target face frame to be a random face frame when it is.
In the embodiment of the present invention, the Intersection over Union (IoU) is IoU = area(W1 ∩ W2) / area(W1 ∪ W2) = area(W1 ∩ W2) / (area(W1) + area(W2) − area(W1 ∩ W2)), where W1 is the target face frame and W2 is the reference face frame. The preset IoU threshold is any value between 0.2 and 0.4, preferably 0.3.
In the embodiment of the invention, about 10 random face frames are generated from the reference face frame.
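The frame-construction procedure of steps S201-S204 can be sketched as follows. The sampling ranges and the IoU filter are taken literally from the description; whether (nx, ny) are absolute image coordinates or offsets relative to the reference frame is ambiguous in the text (this sketch treats them as absolute coordinates), and the retry cap is an added safeguard, not part of the described method:

```python
import random

def iou(box_a, box_b):
    """Intersection over Union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def random_face_frames(ref_frame, n=10, iou_thresh=0.3, seed=None):
    """Steps S202-S204: square candidates of side S = min(W, H) are drawn
    with nx in [0, W-S) and ny in [0, H-S); a candidate is kept as a
    random face frame only if its IoU with the reference frame is below
    the preset threshold."""
    rng = random.Random(seed)
    x1, y1, w, h = ref_frame
    s = min(w, h)                              # maximum size S of a random frame
    frames, attempts = [], 0
    while len(frames) < n and attempts < 10000:  # cap added to avoid looping forever
        attempts += 1
        nx = rng.uniform(0.0, max(w - s, 1e-9))  # half-open range [0, W-S)
        ny = rng.uniform(0.0, max(h - s, 1e-9))  # half-open range [0, H-S)
        cand = (nx, ny, s, s)
        if iou(cand, ref_frame) < iou_thresh:    # step S204 filter
            frames.append(cand)
    return frames
```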
Step S3: cropping the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set.
Fig. 5 is a flowchart of steps of generating a negative sample training set according to an embodiment of the present invention, and as shown in fig. 5, the step S3 includes the following steps:
step S301: intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data;
step S302: acquiring preset key point information;
step S303: mapping the key point information onto the 2D depth data and performing normalization to generate the negative sample training set.
In the embodiment of the present invention, the normalized uniform size is 180 pixels wide and 220 pixels high. The preset key point information can be set according to the characteristics of the human face; for example, the four positions (70.5, 116.5), (109.5, 116.5), (90.5, 137.5) and (90, 159) are used as the preset positions of the left eye, the right eye, the nose and the mouth respectively.
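Steps S301-S303 might look like the sketch below. The 180x220 target size and the four preset key points come from the description; the function name and the nearest-neighbour resize are illustrative assumptions (a production pipeline would more likely use cv2.resize):

```python
import numpy as np

# Preset key points (left eye, right eye, nose, mouth) in the
# normalized 180-wide by 220-high frame, per the description.
PRESET_KEYPOINTS = {
    "left_eye": (70.5, 116.5),
    "right_eye": (109.5, 116.5),
    "nose": (90.5, 137.5),
    "mouth": (90.0, 159.0),
}

def make_negative_sample(depth, box, out_w=180, out_h=220):
    """Crop the depth region under a random face frame (step S301) and
    normalize it to the uniform size (step S303), attaching the preset
    key points (step S302). Nearest-neighbour resize for illustration."""
    x, y, w, h = (int(round(v)) for v in box)
    crop = depth[y:y + h, x:x + w]
    # nearest-neighbour index maps for the resize
    rows = (np.arange(out_h) * crop.shape[0] // out_h).clip(0, crop.shape[0] - 1)
    cols = (np.arange(out_w) * crop.shape[1] // out_w).clip(0, crop.shape[1] - 1)
    sample = crop[np.ix_(rows, cols)]
    return sample, PRESET_KEYPOINTS
```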
Step S4: training the 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
Fig. 6(a) is a schematic structural diagram of the training stage of the living body detection model in an embodiment of the present invention, and fig. 6(b) is a schematic structural diagram of its inference stage. As shown in fig. 6(a) and fig. 6(b), in the embodiment of the present invention, the target 3D living body detection model is generated by training a neural network model comprising a feature extraction layer, a fully connected layer, a random discarding (dropout) layer, a classification layer and a loss calculation layer arranged in sequence.
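A minimal numpy sketch of the five-layer stack named above (feature extraction, fully connected, random discard, classification, loss) is given below. The layer widths, the dropout rate, and the use of a single ReLU matrix product as "feature extraction" are illustrative assumptions, not the patented architecture; as in fig. 6(b), the inference stage simply omits the dropout and loss layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, w_feat, w_fc, training=True, drop_p=0.5, label=None):
    """Toy forward pass through the described stack. Returns class
    probabilities, plus the cross-entropy loss when labels are given."""
    feat = np.maximum(x @ w_feat, 0.0)           # feature extraction (ReLU)
    h = feat @ w_fc                              # fully connected layer
    if training:                                 # random discarding (dropout) layer
        mask = rng.random(h.shape) >= drop_p
        h = h * mask / (1.0 - drop_p)            # inverted-dropout scaling
    z = h - h.max(axis=1, keepdims=True)         # classification layer (softmax)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    if label is None:
        return probs                             # inference: no loss layer
    # loss calculation layer: mean cross-entropy over the batch
    loss = -np.log(probs[np.arange(len(label)), label] + 1e-12).mean()
    return probs, loss
```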
Fig. 7 is a flowchart of the steps of generating the target 3D living body detection model in an embodiment of the present invention. As shown in fig. 7, step S4 includes the following steps:
step S401: performing key point detection on the RGB image, and determining a plurality of face key points in the reference face frame;
step S402: mapping the key points of the human face in the RGB image to depth data normalized to a preset size to generate a positive sample training set;
step S403: training the 3D living body detection model according to the negative sample training set and the positive sample training set to generate a target 3D living body detection model.
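Step S403's assembly of the two sample sets into one labelled training set can be sketched as follows; the label convention (1 for live positives, 0 for augmented negatives) and the shuffling are assumptions for illustration:

```python
import numpy as np

def build_training_set(pos_samples, neg_samples, seed=0):
    """Combine positive (live-face) and negative (random-frame) depth
    crops into one shuffled, labelled training set (step S403)."""
    X = np.stack(list(pos_samples) + list(neg_samples))
    y = np.array([1] * len(pos_samples) + [0] * len(neg_samples))
    perm = np.random.default_rng(seed).permutation(len(y))
    return X[perm], y[perm]
```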
Fig. 8 is a flowchart of a step of determining a random face frame in the embodiment of the present invention, and as shown in fig. 8, the step S204 includes the following steps:
step S2041: acquiring a target face frame;
step S2042: acquiring a preset IoU threshold and judging whether the IoU between the target face frame and the reference face frame is smaller than or equal to the preset IoU threshold; triggering step S2043 when it is smaller than or equal to the threshold, and triggering step S2041 when it is larger than the threshold;
step S2043: and determining the target face frame as a random face frame.
Fig. 10 is a schematic diagram of a random face frame in a depth image according to an embodiment of the present invention, and as shown in fig. 10, a random face frame generated according to a reference face frame can be seen.
Fig. 9 is a flowchart of a step of generating 2D depth data according to an embodiment of the present invention, and as shown in fig. 9, the step S301 includes the following steps:
step S3011: determining depth area data corresponding to the depth data according to the random face frame;
step S3012: judging whether zero-value data exists in the depth region data in the depth direction; when zero-value data exists and the ratio of the amount of zero-value data to the total amount of depth region data is larger than a preset proportional threshold, discarding the random face frame and returning to step S3011; otherwise, triggering step S3013;
step S3013: and intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data.
In an embodiment of the present invention, the preset proportional threshold is 20% to 40%, and preferably 30%.
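The zero-value screening of steps S3011-S3013 amounts to a validity predicate over the depth region; the sketch below uses the preferred 30% threshold, and the helper name is hypothetical:

```python
import numpy as np

def depth_region_valid(depth, box, zero_ratio_thresh=0.3):
    """Step S3012: a random face frame is kept only if the fraction of
    zero-valued (missing) depth pixels in its region does not exceed
    the preset proportional threshold (preferably 30%)."""
    x, y, w, h = box
    region = depth[y:y + h, x:x + w]
    zero_ratio = np.count_nonzero(region == 0) / region.size
    return zero_ratio <= zero_ratio_thresh
```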
Fig. 11 is a flowchart of the steps of the offline augmentation method in an embodiment of the present invention. As shown in fig. 11, in the offline augmentation process, 2D depth data is generated by separately augmenting the RGB image corresponding to each depth sample in the training set; a new negative sample training set is obtained by adding the augmented negative samples to the training list and the training data set, and training on this new, larger training set yields an algorithm that is more robust against false detections.
Fig. 12 is a flowchart of the steps of the online augmentation method in an embodiment of the present invention. As shown in fig. 12, in the online augmentation mode, data augmentation is performed in real time during training, and no additional storage is required for the augmented samples. When the random switch is turned on, the first channel and the second channel are connected simultaneously, and a positive sample based on a live real person and an augmented fake (spoof) sample can be generated at the same time from the pre-stored depth data; when the random switch is turned off, only the second channel is connected, no negative sample augmentation is performed during training, and only the depth data of the live real person participates in training as a positive sample.
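The online mode can be sketched as below: for each stored depth sample, the "random switch" decides whether only the live positive (second channel) or both the positive and a freshly augmented fake negative (first and second channels) enter the batch. The switch probability and the make_negative callback are assumptions; make_negative stands for the random-frame depth-cropping routine of steps S2-S3:

```python
import random

def online_augment_batch(depth_samples, make_negative, p_switch=0.5, seed=None):
    """Online augmentation: emit (sample, label) pairs in real time.
    Label 1 marks the live positive (second channel); when the random
    switch is open, an augmented fake negative with label 0 (first
    channel) is added as well. Nothing is written to storage."""
    rng = random.Random(seed)
    batch = []
    for sample in depth_samples:
        batch.append((sample, 1))            # second channel: live positive
        if rng.random() < p_switch:          # random switch open
            batch.append((make_negative(sample), 0))  # first channel: fake
    return batch
```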
Fig. 13 is a schematic block diagram of the living body detection model training system based on training data augmentation in an embodiment of the present invention. As shown in fig. 13, the system provided by the invention includes the following modules:
the data reading module is used for reading the depth data and the corresponding RGB image;
the random face frame generating module is used for carrying out face detection on the RGB image to generate a reference face frame and constructing a random face frame according to the reference face frame;
the training set generation module is used for cropping the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set; and
the model training module is used for training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
The embodiment of the invention also provides a living body detection model training device based on training data augmentation, which comprises a processor and a memory storing executable instructions of the processor, wherein the processor is configured to perform the steps of the living body detection model training method based on training data augmentation via execution of the executable instructions.
As described above, in this embodiment a plurality of random face frames are generated from the reference face frame produced by face detection on an RGB image, the depth regions corresponding to the random face frames are cropped out of the depth data to generate a negative sample training set, and the 3D living body detection model is trained with this negative sample training set, which reduces the instability that false face detections would otherwise bring to the model.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Accordingly, various aspects of the present invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "platform."
FIG. 14 is a schematic structural diagram of the living body detection model training device based on training data augmentation according to an embodiment of the present invention. An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 14. The electronic device 600 shown in fig. 14 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 14, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different platform components (including the memory unit 620 and the processing unit 610), a display unit 640, etc.
The storage unit stores program code which may be executed by the processing unit 610, causing the processing unit 610 to perform the steps according to various exemplary embodiments of the present invention described in the section on the living body detection model training method based on training data augmentation above. For example, the processing unit 610 may perform the steps shown in fig. 3.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be understood that although not shown in FIG. 14, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.
An embodiment of the invention further provides a computer-readable storage medium storing a program; when the program is executed, the steps of the training-data-augmentation-based living body detection model training method are implemented. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code; when the program product is run on a terminal device, the program code causes the terminal device to carry out the steps of the various exemplary embodiments of the invention described in the above section of this specification on the training-data-augmentation-based living body detection model training method.
As described above, when the program on the computer-readable storage medium of this embodiment is executed, a plurality of random face frames are generated from the reference face frame produced by face detection on the RGB image, screenshots of the depth areas corresponding to the depth data are then taken according to the random face frames to generate a negative sample training set, and the 3D living body detection model is trained with that negative sample training set, which reduces the instability that false face detections introduce into the 3D living body detection model.
Fig. 15 is a schematic structural diagram of a computer-readable storage medium in an embodiment of the present invention. Referring to fig. 15, a program product 800 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this respect, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic or optical forms, or any suitable combination thereof. A readable signal medium may be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ as well as conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computing device (e.g., through the Internet using an Internet service provider).
In the embodiment of the invention, a plurality of random face frames are generated from the reference face frame produced by face detection on the RGB image, screenshots of the depth areas corresponding to the depth data are then taken according to the random face frames to generate a negative sample training set, and the 3D living body detection model is trained with the negative sample training set, which reduces the instability that false face detections introduce into the 3D living body detection model.
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same and similar parts the embodiments may be referred to one another. The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.
Claims (10)
1. A living body detection model training method based on training data augmentation is characterized by comprising the following steps:
step S1: reading depth data and a corresponding RGB image;
step S2: performing face detection on the RGB image to generate a reference face frame, and constructing a random face frame according to the reference face frame;
step S3: performing screenshot on the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set;
step S4: training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
2. The training method for the living body detection model based on training data augmentation according to claim 1, wherein the target 3D living body detection model is generated by training a neural network model;
the neural network model comprises a feature extraction layer, a fully connected layer, a random discarding (dropout) layer, a classification layer and a loss calculation layer which are arranged in sequence.
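The five-layer structure of claim 2 can be sketched in plain NumPy to make the data flow concrete. This is a minimal illustrative sketch only: the linear stand-in for the feature extraction layer, the layer sizes, the 0.5 discard rate, and the softmax/cross-entropy choice for the classification and loss layers are all assumptions; the patent fixes none of them.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_extraction(x, w):
    """Illustrative stand-in for a conv backbone: flatten + linear + ReLU."""
    return np.maximum(x.reshape(x.shape[0], -1) @ w, 0.0)

def fully_connected(h, w, b):
    return h @ w + b

def dropout(h, rate=0.5, training=True):
    """Random discarding layer: zero each activation with probability `rate`."""
    if not training:
        return h
    mask = rng.random(h.shape) >= rate
    return h * mask / (1.0 - rate)  # inverted dropout preserves the expectation

def classify(logits):
    """Classification layer: softmax over the two classes (live vs. spoof)."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Loss calculation layer: mean negative log-likelihood."""
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

# Toy forward pass on a batch of two 16x16 depth patches.
x = rng.random((2, 16, 16))
w1 = rng.standard_normal((256, 32)) * 0.1
w2 = rng.standard_normal((32, 2)) * 0.1
b2 = np.zeros(2)

h = feature_extraction(x, w1)
h = dropout(h, rate=0.5, training=True)
logits = fully_connected(h, w2, b2)
probs = classify(logits)
loss = cross_entropy(probs, np.array([1, 0]))
```

In a real implementation the feature extraction layer would be a convolutional backbone and the loss would be minimized by backpropagation; the sketch only shows how the five layers are chained in order.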
3. The training method for the living body detection model based on training data augmentation according to claim 1, wherein the step S2 comprises the steps of:
step S201: performing face detection on the RGB image to generate a reference face frame g = (x1, y1, W, H), wherein (x1, y1) is the corner coordinate of the reference face frame, W is the width of the reference face frame, and H is the height of the reference face frame;
step S202: taking the smaller value of the width and the height of the reference face frame as the maximum size S of the generated random face frame;
step S203: randomly acquiring a value nx in the range [0, W-S), randomly acquiring a value ny in the range [0, H-S), and generating a target face frame c = (nx, ny, S) in the RGB image, wherein (nx, ny) is the corner coordinate of the target face frame and S is both the width and the height of the target face frame;
step S204: and judging whether the intersection ratio between the target face frame and the reference face frame is smaller than a preset intersection ratio threshold value or not, and determining that the target face frame is a random face frame when the intersection ratio between the target face frame and the reference face frame is smaller than the preset intersection ratio threshold value.
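Steps S202 to S204 amount to rejection sampling of square frames against an intersection-ratio (IoU) test. A minimal sketch follows; the 0.3 threshold, the retry cap, and the explicit `region_w`/`region_h` parameters are assumptions — the claim's [0, W-S) x [0, H-S) sampling range is ambiguous in translation as to whether W and H denote the reference frame or the full image, so the sampling region is passed in explicitly here.

```python
import random

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def random_face_frame(ref, region_w, region_h, iou_threshold=0.3, tries=1000):
    """Steps S202-S204 sketch: sample a square frame of side S = min(W, H)
    with its corner in [0, region_w - S) x [0, region_h - S), and keep it
    only if its IoU with the reference frame is below the threshold."""
    _, _, w, h = ref
    s = min(w, h)
    for _ in range(tries):
        nx = random.uniform(0, max(region_w - s, 0))
        ny = random.uniform(0, max(region_h - s, 0))
        cand = (nx, ny, s, s)
        if iou(cand, ref) < iou_threshold:
            return cand
    return None  # no acceptable frame found within the retry budget
```

Frames that overlap the real face too much would make poor negatives, which is why candidates failing the IoU test are resampled rather than kept.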
4. The training method for the living body detection model based on training data augmentation according to claim 1, wherein the step S3 comprises the steps of:
step S301: intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data;
step S302: acquiring preset key point information;
step S303: mapping the key point information onto the 2D depth data, and then performing normalization processing to generate the negative sample training set.
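Steps S301 and S303 can be sketched as a crop followed by normalization. Min-max scaling of the non-zero depths is an assumed concrete choice for the claim's unspecified "normalization processing"; zero depths (missing sensor readings) are left at zero.

```python
import numpy as np

def crop_and_normalize(depth, frame, eps=1e-6):
    """Steps S301/S303 sketch: cut the depth region under a random face
    frame out of the full depth map (2D depth data), then min-max
    normalize the non-zero depths to [0, 1] to form one negative sample."""
    x, y, w, h = (int(v) for v in frame)
    patch = depth[y:y + h, x:x + w].astype(np.float64)
    valid = patch[patch > 0]
    if valid.size == 0:
        return np.zeros_like(patch)  # no valid depth in the region
    lo, hi = valid.min(), valid.max()
    out = np.zeros_like(patch)
    mask = patch > 0
    out[mask] = (patch[mask] - lo) / (hi - lo + eps)
    return out
```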
5. The training method for the living body detection model based on training data augmentation according to claim 1, wherein the step S4 comprises the steps of:
step S401: performing key point detection on the RGB image, and determining a plurality of face key points in the reference face frame;
step S402: mapping the key points of the human face in the RGB image to depth data normalized to a preset size to generate a positive sample training set;
step S403: training the 3D living body detection model according to the negative sample training set and the positive sample training set to generate the target 3D living body detection model.
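Step S402's mapping can be sketched as a translate-and-scale of keypoint coordinates from the RGB image into a depth patch normalized to a preset size. The pixel alignment of the depth map with the RGB image, and the example target size, are assumptions for illustration.

```python
def map_keypoints_to_depth(keypoints, frame, target_size):
    """Step S402 sketch: shift RGB-image keypoints into the face-frame
    crop, then scale them to a depth patch of `target_size` (tw, th)."""
    x, y, w, h = frame
    tw, th = target_size
    return [((kx - x) * tw / w, (ky - y) * th / h) for kx, ky in keypoints]
```

For example, a keypoint at (110, 220) inside a (100, 200, 100, 100) face frame lands at (5, 10) in a 50x50 depth patch.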
6. The training method for the living body detection model based on training data augmentation according to claim 3, wherein the step S204 comprises the steps of:
step S2041: acquiring a target face frame;
step S2042: acquiring a preset intersection ratio threshold, and judging whether the intersection ratio between the target face frame and the reference face frame is less than or equal to the preset intersection ratio threshold; when it is less than or equal to the preset intersection ratio threshold, triggering step S2043, and when it is greater than the preset intersection ratio threshold, returning to step S2041;
step S2043: and determining the target face frame as a random face frame.
7. The training method for the living body detection model based on training data augmentation according to claim 4, wherein the step S301 comprises the steps of:
step S3011: determining depth area data corresponding to the depth data according to the random face frame;
step S3012: judging whether zero-value data exist in the depth region data in the depth direction; when zero-value data exist and the ratio of the number of zero-value data to the total amount of data in the depth region is greater than a preset proportional threshold, discarding the random face frame and returning to step S3011; otherwise, triggering step S3013;
step S3013: and intercepting a depth area corresponding to the depth data according to the random face frame to generate 2D depth data.
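Step S3012's zero-value check can be sketched as a ratio test over the cropped depth region: frames dominated by missing depth measurements are rejected. The 0.5 default for the "preset proportional threshold" is an illustrative assumption.

```python
import numpy as np

def zero_ratio_ok(depth, frame, max_zero_ratio=0.5):
    """Step S3012 sketch: accept a random face frame only when the
    fraction of zero-value (no-measurement) pixels in its depth region
    does not exceed the threshold."""
    x, y, w, h = (int(v) for v in frame)
    patch = depth[y:y + h, x:x + w]
    if patch.size == 0:
        return False  # degenerate frame
    return (patch == 0).sum() / patch.size <= max_zero_ratio
```

In the pipeline of claim 7, a frame failing this test is discarded and a new random frame is drawn (the return to step S3011).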
8. A living body detection model training system based on training data augmentation is characterized by comprising the following modules:
the data reading module is used for reading the depth data and the corresponding RGB image;
the random face frame generating module is used for carrying out face detection on the RGB image to generate a reference face frame and constructing a random face frame according to the reference face frame;
the training set generation module is used for performing screenshot on the depth area corresponding to the depth data according to the random face frame to generate a negative sample training set;
and the model training module is used for training a 3D living body detection model according to the negative sample training set to generate a target 3D living body detection model.
9. A living body detection model training apparatus based on training data augmentation, characterized by comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform, via execution of the executable instructions, the steps of the training method for the living body detection model based on training data augmentation of any one of claims 1 to 7.
10. A computer-readable storage medium storing a program which, when executed, implements the steps of the training method for the living body detection model based on training data augmentation of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011582603.9A | 2020-12-28 | 2020-12-28 | Living body detection model training method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011582603.9A | 2020-12-28 | 2020-12-28 | Living body detection model training method, system, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114758369A | 2022-07-15 |
Family
ID=82324618
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011582603.9A | Living body detection model training method, system, equipment and storage medium | 2020-12-28 | 2020-12-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114758369A (en) |
- 2020-12-28: application CN202011582603.9A filed (CN), published as CN114758369A, status Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11762475B2 (en) | AR scenario-based gesture interaction method, storage medium, and communication terminal | |
US10699103B2 (en) | Living body detecting method and apparatus, device and storage medium | |
US20190392587A1 (en) | System for predicting articulated object feature location | |
CN110866977B (en) | Augmented reality processing method, device, system, storage medium and electronic equipment | |
CN113221767B (en) | Method for training living body face recognition model and recognizing living body face and related device | |
CN111611873A (en) | Face replacement detection method and device, electronic equipment and computer storage medium | |
CN113221771B (en) | Living body face recognition method, device, apparatus, storage medium and program product | |
CN111273772B (en) | Augmented reality interaction method and device based on slam mapping method | |
EP4033458A2 (en) | Method and apparatus of face anti-spoofing, device, storage medium, and computer program product | |
CN110688878B (en) | Living body identification detection method, living body identification detection device, living body identification detection medium, and electronic device | |
US20230017578A1 (en) | Image processing and model training methods, electronic device, and storage medium | |
CN109635021A (en) | A kind of data information input method, device and equipment based on human testing | |
CN111881740B (en) | Face recognition method, device, electronic equipment and medium | |
CN113642639A (en) | Living body detection method, living body detection device, living body detection apparatus, and storage medium | |
CN111783674A (en) | Face recognition method and system based on AR glasses | |
CN111274946A (en) | Face recognition method, system and equipment | |
CN112037305B (en) | Method, device and storage medium for reconstructing tree-like organization in image | |
CN113255400A (en) | Training and recognition method, system, equipment and medium of living body face recognition model | |
EP4064215A2 (en) | Method and apparatus for face anti-spoofing | |
CN114758369A (en) | Living body detection model training method, system, equipment and storage medium | |
CN114758370A (en) | Training data augmentation method, system, device, and storage medium | |
CN112862840B (en) | Image segmentation method, device, equipment and medium | |
CN115661890A (en) | Model training method, face recognition device, face recognition equipment and medium | |
CN115376198A (en) | Gaze direction estimation method, gaze direction estimation device, electronic apparatus, medium, and program product | |
CN114627521A (en) | Method, system, equipment and storage medium for judging living human face based on speckle pattern |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||