WO2021179719A1 - Face liveness detection method, apparatus, medium, and electronic device - Google Patents

Face liveness detection method, apparatus, medium, and electronic device

Info

Publication number
WO2021179719A1
WO2021179719A1 (PCT/CN2020/135548; CN2020135548W)
Authority
WO
WIPO (PCT)
Prior art keywords
face
video stream
stream data
shaking
coordinates
Prior art date
Application number
PCT/CN2020/135548
Other languages
English (en)
Chinese (zh)
Inventor
蔡中印
陆进
陈斌
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021179719A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive

Definitions

  • This application relates to the field of artificial intelligence, is applied to the field of face recognition, and in particular relates to a face liveness detection method, device, medium, and electronic device.
  • Action-based liveness detection is one of the important means of liveness detection. It randomly selects several actions from shaking the head, nodding, opening and closing the mouth, opening and closing the eyes, and so on, and sends instructions to the user; the user performs the corresponding actions in front of the camera according to the instructions; finally, the video data recorded by the camera is obtained and analyzed to produce the detection result. Shaking the head is one of the key actions in action-based detection.
  • A new attack method against liveness detection has emerged: according to the instructions, a sheet of paper or a head model containing a face is shaken to simulate head shaking. Current liveness detection methods cannot identify this means of attack, resulting in low detection accuracy and high security risks.
  • The purpose of this application is to provide a face liveness detection method, device, medium, and electronic device.
  • According to one aspect, a face liveness detection method includes: inputting the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, to obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model, where the preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer, the face key point detection model includes a convolutional layer, the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model, the face key point coordinates and the human eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data, and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process; and determining, according to the face key point coordinates and the eye sight offset vector corresponding to each face image frame, whether the face shaking video stream data passes the current stage of liveness detection.
  • According to another aspect, a face liveness detection device includes: an input module configured to input the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, to obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model, where the preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer, the face key point detection model includes a convolutional layer, the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model, the face key point coordinates and the human eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data, and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process; and a judgment module configured to determine, according to the face key point coordinates and the eye sight offset vector corresponding to each face image frame, whether the face shaking video stream data passes the current stage of liveness detection.
  • According to another aspect, a computer-readable storage medium stores computer-readable instructions that, when executed by a computer, cause the computer to execute the following method: inputting the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, to obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model, where the preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer, the face key point detection model includes a convolutional layer, the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model, the face key point coordinates and the human eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data, and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process; and determining, according to the face key point coordinates and the eye sight offset vector corresponding to each face image frame, whether the face shaking video stream data passes the current stage of liveness detection.
  • According to another aspect, an electronic device includes a processor and a memory storing computer-readable instructions which, when executed by the processor, implement the following method: inputting the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, to obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model, where the preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer, the face key point detection model includes a convolutional layer, the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model, the face key point coordinates and the human eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data, and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process; and determining, according to the face key point coordinates and the eye sight offset vector corresponding to each face image frame, whether the face shaking video stream data passes the current stage of liveness detection.
  • This application uses a face key point detection model combined with a human eye sight offset vector output layer to calculate the human eye sight offset vector corresponding to a face region picture, and uses the human eye sight offset vector to perform face liveness detection. Therefore, during liveness detection, fraud that shakes paper or head models containing human faces can be identified, which improves the accuracy of liveness detection and reduces security risks.
  • Fig. 1 is a schematic diagram of a system architecture for a face liveness detection method according to an exemplary embodiment.
  • Fig. 2 is a flowchart of a face liveness detection method according to an exemplary embodiment.
  • Fig. 3 is a schematic diagram of at least part of the structure of a preset recognition model used in a face liveness detection method according to an exemplary embodiment.
  • Fig. 4 is a flowchart of the steps performed before step 240 in the embodiment corresponding to Fig. 2.
  • Fig. 5 is a block diagram of a face liveness detection device according to an exemplary embodiment.
  • Fig. 6 is a block diagram of an example electronic device for implementing the above face liveness detection method according to an exemplary embodiment.
  • Fig. 7 shows a computer-readable storage medium for implementing the above face liveness detection method according to an exemplary embodiment.
  • The technical solution of the present application can be applied to the fields of artificial intelligence, smart city, blockchain, and/or big data technology to realize liveness detection.
  • The data involved in this application includes, for example, video stream data and/or face region pictures.
  • Face liveness detection mainly refers to the process of judging, from a recorded video containing a face, whether the face in the video is a live face.
  • Face liveness detection is one of the important technical means in the field of identity verification.
  • Action-based liveness detection is an important part of face liveness detection.
  • The user needs to perform corresponding actions according to instructions delivered, for example, by voice or text; these actions mainly include shaking the head, nodding, opening and closing the mouth, and opening and closing the eyes. Alternatively, no instructions are issued to the user, and the user's actions are observed at random instead.
  • the implementation terminal of this application can be any device with computing, processing, and storage functions.
  • the device can be connected to an external device for receiving or sending data.
  • It can be a portable mobile device, such as a smartphone, a tablet computer, a notebook computer, or a PDA (Personal Digital Assistant); it can also be a fixed device, such as computer equipment, a field terminal, a desktop computer, a server, or a workstation; or it can be a collection of multiple devices, such as the physical infrastructure of cloud computing or a server cluster.
  • the implementation terminal of this application may be a server or a physical infrastructure of cloud computing.
  • Fig. 1 is a schematic diagram showing a system architecture of a method for detecting a human face according to an exemplary embodiment.
  • the system architecture includes a server 110 and a mobile terminal 120.
  • the mobile terminal 120 may be, for example, a smart phone.
  • the mobile terminal 120 is connected to the server 110 through a communication link. Therefore, the mobile terminal 120 can send data to the server 110 or receive data from the server 110.
  • A server program and the preset recognition model are provided on the server 110, client software is installed and running on the mobile terminal 120, and the server 110 is the implementation terminal in this embodiment.
  • A specific process may be as follows: by operating the client software on the mobile terminal 120, the user records face shaking video stream data and uploads it to the server 110; after receiving the face shaking video stream data, the server 110 runs the server program to extract the face region pictures from the face shaking video stream data; the server 110 then inputs the face region pictures into the preset recognition model to obtain the face key point coordinates and the human eye sight offset vectors output by the model; finally, the server 110 runs the server program to make a judgment according to the face key point coordinates and the human eye sight offset vectors output by the model, and outputs the detection result of the current stage of liveness detection.
  • Figure 1 is only an embodiment of the present application.
  • Although the implementation terminal in this embodiment is a server and the terminal that provides the face shaking video stream data is a mobile terminal, in other embodiments or in actual applications the implementation terminal and the terminal that provides the face shaking video stream data can be any of the various terminals or devices described above; and although in this embodiment the face shaking video stream data is sent from a terminal other than the implementation terminal, the face shaking video stream data can in fact also be obtained directly by the local terminal.
  • This application does not limit this, and the protection scope of this application should not be restricted in any way.
  • Fig. 2 is a flow chart showing a method for detecting human face living according to an exemplary embodiment.
  • The face liveness detection method provided in this embodiment can be executed by a server and, as shown in FIG. 2, includes the following steps.
  • Step 240: Input the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, and obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model.
  • The preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer; the face key point detection model includes a convolutional layer; the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model; the face key point coordinates and the eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data; and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process.
  • the face key point coordinates and the eye sight offset vector correspond to each face image frame, that is, for each face image frame, there is a corresponding face key point coordinate and eye sight offset vector.
  • The human eye sight offset vector has a direction and a length. For example, the sight offset may be taken as positive when the eye looks to the left and negative when it looks to the right, and the length can be defined as the normalized relative degree to which the pupil deviates from the center of the eye socket. A minimal sketch of such a convention is given below.
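  • The following minimal sketch illustrates one way such a signed, normalized offset could be computed from a pupil position and two eye-corner landmarks; the sign convention and normalization are assumptions chosen to match the description above, not a formulation disclosed by the patent.

```python
import numpy as np

def eye_sight_offset(pupil, left_corner, right_corner):
    """Signed, normalized horizontal offset of the pupil from the eye-socket center.

    Positive when the line of sight is shifted to the left, negative to the
    right (assumed convention); the magnitude is normalized by half the eye
    width so it stays roughly within [-1, 1].
    """
    pupil, left_corner, right_corner = map(np.asarray, (pupil, left_corner, right_corner))
    eye_center = (left_corner + right_corner) / 2.0
    half_width = np.linalg.norm(right_corner - left_corner) / 2.0 + 1e-6
    dx = pupil[0] - eye_center[0]          # image x grows to the right
    return float(-dx / half_width)         # left positive, right negative
```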
  • Fig. 3 is a schematic diagram showing at least part of the structure of a preset recognition model used in a method for detecting a human face according to an exemplary embodiment.
  • the preset recognition model 300 includes at least a face key point detection model 310 and an eye sight offset vector output layer 320.
  • the part framed by the dashed line is the structural part of the face key point detection model 310, including the convolutional layer 311 and the output part 312 after the convolutional layer 311.
  • The convolutional layer 311 may be a stack of multiple neural network layers.
  • the output part 312 will finally output the coordinates of the key points of the face.
  • other structures may be included before the convolutional layer 311 and between the network structures of the convolutional layer 311.
  • the human eye sight offset vector output layer 320 receives the input of the last layer of the convolutional layer, and finally outputs the human eye sight offset vector corresponding to the face image frame.
  • the human eye sight offset vector output layer 320 is usually a fully connected layer.
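  • The sketch below shows one plausible way to attach such a fully connected sight-offset head to the last convolutional layer of a key point detector, written in PyTorch; the backbone layout, channel sizes, and the number of key points are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class KeypointWithGazeOffsetNet(nn.Module):
    """Face key point detector with an extra fully connected output layer,
    fed by the last convolutional layer, that predicts the eye sight offset
    vector (a sketch of the structure described above)."""

    def __init__(self, num_keypoints: int = 68, offset_dim: int = 2):
        super().__init__()
        # Stacked convolutional layers standing in for the (unspecified) backbone.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Original output part: key point coordinates (x, y per key point).
        self.keypoint_head = nn.Linear(128, num_keypoints * 2)
        # Added output layer: eye sight offset vector, connected to the last conv layer.
        self.gaze_offset_head = nn.Linear(128, offset_dim)

    def forward(self, x: torch.Tensor):
        feat = self.backbone(x).flatten(1)
        return self.keypoint_head(feat), self.gaze_offset_head(feat)
```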
  • FIG. 4 is a flowchart of the steps performed before step 240 in the embodiment corresponding to FIG. 2. Referring to FIG. 4, these include the following steps.
  • Step 210: Deframe the face shaking video stream data to be subjected to the living body detection, and obtain the face image frame corresponding to the face shaking video stream data.
  • Deframing the face shaking video stream data is a process of dividing the face shaking video stream data into face image frames.
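  • A minimal deframing sketch using OpenCV is shown below; the actual decoding pipeline used by the patent is not specified.

```python
import cv2

def deframe(video_path: str):
    """Split face-shaking video stream data into a list of image frames."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:            # end of stream
            break
        frames.append(frame)
    cap.release()
    return frames
```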
  • In one embodiment, before deframing the face shaking video stream data to be subjected to liveness detection to obtain the corresponding face image frames, the method further includes: acquiring, from a user terminal, the face shaking video stream data to be subjected to liveness detection.
  • In one embodiment, before acquiring the face shaking video stream data to be subjected to liveness detection from the user terminal, the method further includes: randomly selecting a preset action instruction from a plurality of preset action instructions and sending the selected preset action instruction to the user terminal, where the plurality of preset action instructions include shaking the head, and the face shaking video stream data to be subjected to liveness detection is acquired from the user terminal when the selected preset action instruction is shaking the head (a minimal selection sketch follows below).
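  • A trivial sketch of the random instruction selection is given below; the action set and names are assumptions.

```python
import random

PRESET_ACTIONS = ["shake_head", "nod", "open_close_mouth", "blink"]  # assumed set

def pick_action_instruction():
    """Randomly select a preset action instruction to send to the user terminal;
    the face-shaking video is requested only when the head-shaking action is selected."""
    return random.choice(PRESET_ACTIONS)
```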
  • Step 220: Input the face image frame into a preset face detection model, and obtain the face detection frame coordinates corresponding to the face image frame.
  • The pixel area of a face image frame may be very large, and the face may occupy only a part, or a small part, of the frame. In order to detect the face accurately, the region of the frame corresponding to the face needs to be identified in a targeted manner.
  • the face detection frame coordinates are the position coordinates of the area corresponding to the face in the face image frame.
  • the preset face detection model can output the corresponding face detection frame coordinates according to the input of the face image frame.
  • The preset face detection model can be implemented based on various algorithms or principles, for example general machine learning algorithms or deep learning algorithms. A hedged stand-in example is sketched below.
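  • Purely as a stand-in for the unspecified preset face detection model, the sketch below uses OpenCV's bundled Haar cascade to produce face detection frame coordinates per frame.

```python
import cv2

# Stand-in "preset face detection model"; the patent does not fix a detector.
_face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_box(frame):
    """Return (x1, y1, x2, y2) face detection frame coordinates for one face
    image frame, or None if no face is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = _face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])   # keep the largest face
    return int(x), int(y), int(x + w), int(y + h)
```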
  • Step 230: Extract a face area picture from the face image frame according to the face detection frame coordinates.
  • In one embodiment, extracting a face region picture from the face image frame according to the face detection frame coordinates includes: determining, in the face image frame, a first face detection frame area corresponding to the face detection frame coordinates; expanding the first face detection frame area according to a predetermined expansion ratio to obtain a second face detection frame area; and extracting the face region picture based on the range defined by the second face detection frame area.
  • The first face detection frame area can be a rectangle, and the face detection frame coordinates are coordinates that uniquely determine the extent of that rectangle. For example, the face detection frame coordinates can be the coordinates of the four vertices of the rectangle, which determine its extent directly; alternatively, they can be the coordinates of the intersection of the rectangle's two diagonals, which, together with a preset length and width, also determine the extent of the corresponding rectangle.
  • the predetermined frame expansion ratio is the ratio of further expanding the coverage area on the basis of the original area.
  • the predetermined expansion ratio can be various predetermined ratios, such as 20%.
  • The expansion of the first face detection frame area can be performed in a variety of ways or directions, such as expanding from the center outwards in all directions, to the left and right, up and down, towards the upper right or lower left, and so on. In this way, after the expansion operation, the resulting second face detection frame area is larger than the first face detection frame area.
  • In this embodiment, the face region picture is not extracted directly according to the range defined by the first face detection frame area; instead, the first face detection frame area is first expanded to obtain the second face detection frame area, and the face region picture is then extracted based on the range defined by the second face detection frame area (a sketch follows below). This makes the extracted face region picture large enough to retain more information about the face, which improves the liveness detection effect to a certain extent.
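  • A sketch of the frame expansion and extraction, assuming a rectangular box expanded outwards from its center by a 20% ratio and clipped to the image, is given below.

```python
def expand_box(x1, y1, x2, y2, img_w, img_h, ratio=0.2):
    """Expand the first face detection frame by a predetermined ratio (center
    outwards) to obtain the second face detection frame, clipped to the image."""
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * (1 + ratio) / 2.0
    half_h = (y2 - y1) * (1 + ratio) / 2.0
    nx1, ny1 = max(0, int(cx - half_w)), max(0, int(cy - half_h))
    nx2, ny2 = min(img_w, int(cx + half_w)), min(img_h, int(cy + half_h))
    return nx1, ny1, nx2, ny2

# face_region_picture = frame[ny1:ny2, nx1:nx2]   # extraction by cropping
```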
  • In one embodiment, inputting the face image frames into a preset face detection model to obtain the face detection frame coordinates corresponding to the face image frames includes: inputting each face image frame into the preset face detection model to obtain the face detection frame coordinates corresponding to each face image frame; and extracting the face region pictures from the face image frames according to the face detection frame coordinates includes: extracting a face region picture from each face image frame according to the corresponding face detection frame coordinates.
  • Extracting the face region picture from the face image frame is the process of matting in the face image frame.
  • In this embodiment, the face detection frame coordinates are first determined by the preset face detection model, and the face region pictures are then extracted according to those coordinates.
  • In one embodiment, inputting the face image frames into a preset face detection model to obtain the face detection frame coordinates includes: inputting at least one leading face image frame into the preset face detection model to obtain the first face detection frame coordinates corresponding to each of these face image frames; and extracting the face region pictures according to the face detection frame coordinates includes: extracting, according to each set of first face detection frame coordinates, a corresponding first face region picture from the corresponding face image frame; inputting each first face region picture into the preset recognition model to obtain the face key point coordinates and the eye sight offset vector corresponding to it; determining the circumscribed rectangle of the face corresponding to the face key point coordinates of each first face region picture; determining, according to the circumscribed rectangles of the face and a preset estimation algorithm, the second face detection frame coordinates corresponding to the face image frames that follow the at least one leading face image frame; and extracting the corresponding face region pictures from those face image frames according to the determined second face detection frame coordinates.
  • the circumscribed rectangle of the face is a rectangle that can just cover the face area, and at least a part of the points on the edge of the face area are located on the rectangle.
  • the preset estimation algorithm may be various algorithms capable of estimating or calculating the motion state of the face, for example, it may be a Kalman filter.
  • A Kalman filter, also known as the Kalman filter equations or Kalman equations of motion, is an algorithm that uses linear system state equations to optimally estimate the system state from observed input and output data. Specifically, by feeding the circumscribed rectangles of the face corresponding to at least one preceding face image frame into the Kalman equations of motion, the second face detection frame coordinates corresponding to the current or future face image frames can be determined; that is, the face detection frame coordinates are predicted with the Kalman equations of motion.
  • In this embodiment, two methods are used to determine the face detection frame coordinates corresponding to a face image frame. For at least one leading face image frame, the frame is input into the preset face detection model to obtain the face detection frame coordinates, and the corresponding face region picture is then extracted from that frame according to those coordinates. For the current or subsequent face image frames, the coordinates are instead determined based on the previously extracted face region pictures: each previously extracted face region picture is input into the preset recognition model to obtain the face key point coordinates, the corresponding circumscribed rectangle of the face is determined from those key point coordinates, and finally the circumscribed rectangle is fed into the preset estimation algorithm to determine the second face detection frame coordinates for the current and subsequent face image frames (a Kalman-filter sketch follows below). Compared with running the face detection model on every frame, this method consumes fewer computing resources and is more efficient.
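  • One hedged way to realize the preset estimation algorithm is a constant-velocity Kalman filter over the center of the circumscribed rectangle, as sketched below with OpenCV; the state layout and noise settings are assumptions.

```python
import numpy as np
import cv2

def make_box_center_tracker():
    """Kalman filter with state [cx, cy, vx, vy] and measurement [cx, cy]."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

# Leading frames: run the face detector, then feed the circumscribed-rectangle
# center into the filter:   kf.correct(np.array([[cx], [cy]], np.float32))
# Later frames: predict the second face detection frame center instead of
# re-running the detector:  predicted = kf.predict()   # predicted[0], predicted[1]
```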
  • Step 250: Determine whether the face shaking video stream data passes the current stage of living body detection according to the face key point coordinates corresponding to each face image frame and the eye sight offset vector.
  • In one embodiment, the method further includes: when the current stage of liveness detection is passed, obtaining face video stream data recorded after the face shaking video stream data, and performing silent liveness detection on that face video stream data.
  • Various algorithms or models can be used to perform silent live detection on face video stream data.
  • the person does not need to shake his head, and the position and angle of the face are in a relatively unchanged state.
  • Because the subsequent silent detection is only performed when the current stage of liveness detection is passed, the number of users who reach silent detection is far smaller than the number who would reach it if silent detection were performed on its own.
  • In other words, the current stage of liveness detection filters out a large number of users, which reduces resource consumption to a certain extent.
  • In one embodiment, the part of the preset recognition model related to the human eye sight offset vector output layer is trained in the following way: obtain, from a sample data set, the normal face region pictures corresponding to normal face shaking video stream data and the face-paper region pictures corresponding to face-paper shaking video stream data, where the sample data set includes multiple normal face shaking video stream data and multiple face-paper shaking video stream data; input the normal face region pictures and the face-paper region pictures into the preset recognition model to obtain the face key point coordinates and human eye sight offset vectors output by the preset recognition model for the normal face region pictures and the face-paper region pictures respectively; determine, from the face key point coordinate sequences corresponding to the normal face shaking video stream data and to the face-paper shaking video stream data, the face shaking degree sequences corresponding to the normal face shaking video stream data and to the face-paper shaking video stream data; for each normal face shaking video stream data and each face-paper shaking video stream data, determine the face key point coordinates corresponding to the face shaking degrees that fall within a predetermined face shaking degree range as first target face key point coordinates, and determine, from the first target face key point coordinates and the corresponding human eye sight offset vectors, a score for that video stream data; determine the score threshold using each of the scores; and train the preset recognition model based on the score threshold.
  • In one embodiment, determining whether the face shaking video stream data passes the current stage of liveness detection according to the face key point coordinates and the human eye sight offset vector corresponding to each face image frame includes: determining, from the face key point coordinates corresponding to the face image frames, the face key point coordinates whose face shaking degrees fall within the predetermined face shaking degree range, as second target face key point coordinates; determining, from the second target face key point coordinates and the human eye sight offset vectors, the score corresponding to the face shaking video stream data to be subjected to liveness detection; and if the score reaches the score threshold, determining that the current stage of liveness detection is passed, otherwise determining that it is not passed (a hypothetical scoring sketch follows below).
  • the score of normal face shaking video stream data is generally greater than the score of face paper shaking video stream data.
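  • The patent excerpt does not disclose the exact scoring function; the sketch below is one hypothetical choice consistent with the idea that a live person's gaze keeps tracking the camera while the head turns, whereas a shaken photo or head model shows an almost constant sight offset.

```python
import numpy as np

def liveness_score(yaw_degrees, gaze_offsets, yaw_range=15.0):
    """Hypothetical score for the current stage of liveness detection.

    Frames whose face-shaking degree lies within the predetermined range
    (+/- yaw_range degrees) provide the target key points / sight offsets;
    the score is how strongly the sight offset co-varies with the yaw."""
    yaw_degrees = np.asarray(yaw_degrees, dtype=float)
    gaze_offsets = np.asarray(gaze_offsets, dtype=float)
    mask = np.abs(yaw_degrees) <= yaw_range
    if mask.sum() < 2:
        return 0.0
    yaw, gaze = yaw_degrees[mask], gaze_offsets[mask]
    if yaw.std() < 1e-6 or gaze.std() < 1e-6:   # no variation => paper-like
        return 0.0
    return float(abs(np.corrcoef(yaw, gaze)[0, 1]))

# passed = liveness_score(yaws, gazes) >= score_threshold
```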
  • In one embodiment, determining the score threshold using each of the scores includes: determining the score threshold according to the scores corresponding to the normal face shaking video stream data, such that exactly a predetermined proportion of those scores reaches the score threshold. Training the preset recognition model based on the score threshold then includes: determining the ratio of the number of face-paper shaking video stream data whose scores are below the score threshold to the number of all face-paper shaking video stream data, and training the preset recognition model according to this ratio.
  • This ratio measures the proportion of all face-paper shaking video stream data that is correctly identified as face-paper shaking video stream data, i.e. the correct rejection rate; training therefore aims to increase this ratio.
  • In other embodiments, the scores can also be used in other ways to determine the score threshold. For example, the smallest value among a predetermined proportion of the normal-face scores, taken from smallest to largest, can be used as the score threshold; or a score threshold can be determined such that a predetermined proportion of the scores corresponding to the face-paper shaking video stream data do not reach the score threshold; or, for example, the score ranked at the 99th position from largest to smallest among the normal-face scores can be used as the score threshold (a threshold-selection sketch follows below).
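  • A sketch of the threshold selection and of the correct rejection rate used during training is given below, assuming a 99% predetermined proportion; both follow directly from the description above.

```python
import numpy as np

def choose_score_threshold(normal_scores, proportion=0.99):
    """Pick the threshold so that the given proportion of normal face-shaking
    videos score at or above it (e.g. the score ranked at the 99% position
    from largest to smallest)."""
    scores = np.sort(np.asarray(normal_scores, dtype=float))[::-1]   # descending
    idx = max(int(np.ceil(proportion * len(scores))) - 1, 0)
    return float(scores[idx])

def correct_rejection_rate(paper_scores, threshold):
    """Proportion of face-paper shaking videos whose score stays below the
    threshold, i.e. the rate at which paper attacks are correctly rejected."""
    paper_scores = np.asarray(paper_scores, dtype=float)
    return float((paper_scores < threshold).mean())
```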
  • The face shaking degree is an angle-based quantity that measures the size of the head-shaking angle. Changes in the face key point coordinates reflect the degree of face shaking, so the corresponding face shaking degree sequence can be determined from the face key point coordinate sequence; this can be implemented with various algorithms or models (a rough sketch follows below).
  • The predetermined face shaking degree range may be, for example, within 15 degrees.
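  • A rough per-frame sketch of estimating the shaking (yaw) degree from 2D key points is given below; the landmark indices and the geometric approximation are assumptions, not the patent's algorithm.

```python
import numpy as np

def face_shaking_degree(keypoints, left_eye_idx, right_eye_idx, nose_idx):
    """Approximate head yaw in degrees from the horizontal asymmetry of the
    nose tip between the two outer eye corners."""
    left_eye, right_eye, nose = (np.asarray(keypoints[i], dtype=float)
                                 for i in (left_eye_idx, right_eye_idx, nose_idx))
    eye_center_x = (left_eye[0] + right_eye[0]) / 2.0
    half_eye_width = abs(right_eye[0] - left_eye[0]) / 2.0 + 1e-6
    ratio = np.clip((nose[0] - eye_center_x) / half_eye_width, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))

# Applying this to every frame of a video yields the face shaking degree sequence.
```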
  • Each picture of the normal face area or the picture of the face paper area corresponds to a degree of shaking the head of the face.
  • all the face shaking degrees corresponding to the normal face region pictures constitute a face shaking degree sequence.
  • the human face shaking degree corresponding to the picture of the paper area of the human face in the face paper shaking head video stream data can also form a face shaking degree sequence.
  • The normal face shaking video stream data and the face-paper shaking video stream data are each a set of face image frames ordered in time, so the normal face region pictures and the face-paper region pictures corresponding to them are each in the form of a picture sequence.
  • Accordingly, the face key point coordinates corresponding to the normal face shaking video stream data and to the face-paper shaking video stream data can also exist in the form of sequences.
  • As described above, the face key point detection model combined with the human eye sight offset vector output layer is used to calculate the eye sight offset vector corresponding to each face region picture, and the eye sight offset vector is used for face liveness detection. Therefore, during liveness detection, fraud that shakes paper or head models containing human faces can be identified, which improves the accuracy of liveness detection and reduces security risks.
  • Based on the above, the present application also provides a face liveness detection device; the following are the device embodiments of the present application.
  • Fig. 5 is a block diagram of a face liveness detection device according to an exemplary embodiment.
  • The device 500 includes: an input module 510 configured to input the face region pictures corresponding to the face shaking video stream data to be subjected to liveness detection into a preset recognition model, and to obtain the face key point coordinates and the human eye sight offset vectors output by the preset recognition model, where the preset recognition model is a face key point detection model combined with a human eye sight offset vector output layer, the face key point detection model includes a convolutional layer, the human eye sight offset vector output layer is connected to the last convolutional layer of the face key point detection model, the face key point coordinates and the human eye sight offset vectors each correspond to one of the face image frames included in the face shaking video stream data, and the human eye sight offset vector is used to measure the degree of deviation of the human eye line of sight during the head-shaking process; and a judgment module 520 configured to determine, according to the face key point coordinates and the eye sight offset vector corresponding to each face image frame, whether the face shaking video stream data passes the current stage of liveness detection.
  • an electronic device capable of implementing the above method.
  • the electronic device 600 according to this embodiment of the present application will be described below with reference to FIG. 6.
  • the electronic device 600 shown in FIG. 6 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present application.
  • the electronic device 600 is represented in the form of a general-purpose computing device.
  • the components of the electronic device 600 may include, but are not limited to: the aforementioned at least one processing unit 610, the aforementioned at least one storage unit 620, and a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610).
  • The storage unit 620 stores program code, and the program code can be executed by the processing unit 610, so that the processing unit 610 executes the steps of the various exemplary embodiments described in the "Exemplary Method" section of this specification.
  • the storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 621 and/or a cache storage unit 622, and may further include a read-only storage unit (ROM) 623.
  • the storage unit 620 may also include a program/utility tool 624 having a set of (at least one) program module 625.
  • Such program modules 625 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
  • The electronic device 600 may also communicate with one or more external devices 800 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (such as a router or modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may be performed through an input/output (I/O) interface 650, for example communication with the display unit 640.
  • the electronic device 600 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 660.
  • the network adapter 660 communicates with other modules of the electronic device 600 through the bus 630.
  • Other hardware and/or software modules can be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • The example embodiments described here can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which can be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present application.
  • The computer-readable storage medium stores computer-readable instructions that, when executed by a computer, cause the computer to execute the method described above in this specification.
  • the storage medium involved in this application such as a computer-readable storage medium, may be non-volatile or volatile.
  • each aspect of the present application can also be implemented in the form of a program product, which includes program code.
  • When the program product runs on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present application described in the "Exemplary Method" section of this specification.
  • A program product 700 for implementing the above method according to an embodiment of the present application is described. It can take the form of a portable compact disc read-only memory (CD-ROM) including program code, and can be run on a terminal device, for example a personal computer.
  • the program product of this application is not limited to this.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, device, or device.
  • the program product can use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein.
  • This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
  • the program code used to perform the operations of the present application can be written in any combination of one or more programming languages.
  • The programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages.
  • The program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • The remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, through the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

A face liveness detection method, apparatus, medium, and electronic device are provided. The method comprises: inputting, into a preset recognition model, a face region picture corresponding to head-shaking video stream data to be subjected to liveness detection, so as to obtain face key point coordinates and an eye sight offset vector output by the preset recognition model (240), the preset recognition model being a face key point detection model combined with an output layer for the eye sight offset vector, and the eye sight offset vector being used to measure the degree of deviation of the line of sight during the head-shaking process; and determining, according to the face key point coordinates corresponding to each face image frame and the eye sight offset vector, whether the head-shaking video stream data passes the current stage of liveness detection. With this method, during liveness detection, fraud using shaken paper documents or head models containing human faces can be identified, thereby improving the accuracy of liveness detection and reducing security risks.
PCT/CN2020/135548 2020-10-12 2020-12-11 Face liveness detection method, apparatus, medium, and electronic device WO2021179719A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011086784.6 2020-10-12
CN202011086784.6A CN112149615B (zh) 2020-10-12 2020-10-12 Face liveness detection method, device, medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2021179719A1 true WO2021179719A1 (fr) 2021-09-16

Family

ID=73953002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/135548 WO2021179719A1 (fr) 2020-10-12 2020-12-11 Face liveness detection method, apparatus, medium, and electronic device

Country Status (2)

Country Link
CN (1) CN112149615B (fr)
WO (1) WO2021179719A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114140355A (zh) * 2021-11-29 2022-03-04 北京比特易湃信息技术有限公司 Detection box de-jittering method based on an optimization algorithm
CN115802101A (zh) * 2022-11-25 2023-03-14 深圳创维-Rgb电子有限公司 Short video generation method and apparatus, electronic device, and storage medium
CN116110111A (zh) * 2023-03-23 2023-05-12 平安银行股份有限公司 Face recognition method, electronic device, and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668553B (zh) * 2021-01-18 2022-05-13 东莞先知大数据有限公司 Method, apparatus, medium, and device for detecting a driver's intermittent lookout behavior
CN113392810A (zh) * 2021-07-08 2021-09-14 北京百度网讯科技有限公司 Method, apparatus, device, medium, and product for liveness detection
CN113642428B (zh) * 2021-07-29 2022-09-27 北京百度网讯科技有限公司 Face liveness detection method and apparatus, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170169304A1 (en) * 2015-12-09 2017-06-15 Beijing Kuangshi Technology Co., Ltd. Method and apparatus for liveness detection
CN109886087A (zh) * 2019-01-04 2019-06-14 平安科技(深圳)有限公司 Neural-network-based liveness detection method and terminal device
CN109977771A (zh) * 2019-02-22 2019-07-05 杭州飞步科技有限公司 Driver identity verification method, apparatus, device, and computer-readable storage medium
US20190377963A1 (en) * 2018-06-11 2019-12-12 Laurence Hamid Liveness detection
CN111160251A (zh) * 2019-12-30 2020-05-15 支付宝实验室(新加坡)有限公司 Liveness recognition method and apparatus
CN111401127A (zh) * 2020-01-16 2020-07-10 创意信息技术股份有限公司 Joint judgment method for face liveness detection

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4976156B2 (ja) * 2007-02-08 2012-07-18 富山県 Image identification method
CN108875524B (zh) * 2018-01-02 2021-03-02 北京旷视科技有限公司 Gaze estimation method, apparatus, system, and storage medium
CN109522798A (zh) * 2018-10-16 2019-03-26 平安科技(深圳)有限公司 Liveness-recognition-based video anti-counterfeiting method, system, apparatus, and storage medium
CN111539249A (zh) * 2020-03-11 2020-08-14 西安电子科技大学 Multi-factor face liveness detection system and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170169304A1 (en) * 2015-12-09 2017-06-15 Beijing Kuangshi Technology Co., Ltd. Method and apparatus for liveness detection
US20190377963A1 (en) * 2018-06-11 2019-12-12 Laurence Hamid Liveness detection
CN109886087A (zh) * 2019-01-04 2019-06-14 平安科技(深圳)有限公司 Neural-network-based liveness detection method and terminal device
CN109977771A (zh) * 2019-02-22 2019-07-05 杭州飞步科技有限公司 Driver identity verification method, apparatus, device, and computer-readable storage medium
CN111160251A (zh) * 2019-12-30 2020-05-15 支付宝实验室(新加坡)有限公司 Liveness recognition method and apparatus
CN111401127A (zh) * 2020-01-16 2020-07-10 创意信息技术股份有限公司 Joint judgment method for face liveness detection

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114140355A (zh) * 2021-11-29 2022-03-04 北京比特易湃信息技术有限公司 Detection box de-jittering method based on an optimization algorithm
CN115802101A (zh) * 2022-11-25 2023-03-14 深圳创维-Rgb电子有限公司 Short video generation method and apparatus, electronic device, and storage medium
CN116110111A (zh) * 2023-03-23 2023-05-12 平安银行股份有限公司 Face recognition method, electronic device, and storage medium
CN116110111B (zh) * 2023-03-23 2023-09-08 平安银行股份有限公司 Face recognition method, electronic device, and storage medium

Also Published As

Publication number Publication date
CN112149615B (zh) 2024-06-28
CN112149615A (zh) 2020-12-29

Similar Documents

Publication Publication Date Title
WO2021179719A1 (fr) Face liveness detection method, apparatus, medium, and electronic device
KR102063037B1 (ko) Identity authentication method, terminal device, and computer-readable storage medium
CN108875833B (zh) Neural network training method, face recognition method, and apparatus
CN111767900B (zh) Face liveness detection method and apparatus, computer device, and storage medium
WO2018177379A1 (fr) Gesture recognition, gesture control, and neural network training methods and apparatuses, and electronic device
WO2018028546A1 (fr) Key point positioning method, terminal, and computer storage medium
WO2020024484A1 (fr) Method and device for generating data
WO2022105118A1 (fr) Image-based health status identification method and apparatus, device, and storage medium
US20190362171A1 (en) Living body detection method, electronic device and computer readable medium
WO2022100337A1 (fr) Face image quality assessment method and apparatus, computer device, and storage medium
WO2022188697A1 (fr) Biometric feature extraction method and apparatus, device, medium, and program product
WO2023035531A1 (fr) Super-resolution reconstruction method for text image and related device
WO2020124993A1 (fr) Liveness detection method and apparatus, electronic device, and storage medium
WO2020006964A1 (fr) Image detection method and device
WO2020238321A1 (fr) Age identification method and device
WO2021169616A1 (fr) Method and apparatus for detecting the face of a non-living body, computer device, and storage medium
CN111353336B (zh) Image processing method, apparatus, and device
WO2020124994A1 (fr) Liveness detection method and apparatus, electronic device, and storage medium
WO2023173646A1 (fr) Expression recognition method and apparatus
CN113194281B (zh) Video analysis method and apparatus, computer device, and storage medium
WO2021159669A1 (fr) Method and apparatus for securely logging into a system, computer device, and storage medium
WO2020052062A1 (fr) Detection method and device
WO2023124040A1 (fr) Facial recognition method and apparatus
US11741986B2 (en) System and method for passive subject specific monitoring
CN106778574A (zh) Detection method and apparatus for face images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20924217

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20924217

Country of ref document: EP

Kind code of ref document: A1