WO2021051539A1 - Face recognition method, apparatus and terminal device - Google Patents

Face recognition method, apparatus and terminal device

Info

Publication number
WO2021051539A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
point cloud
face
cloud data
human body
Prior art date
Application number
PCT/CN2019/117184
Other languages
English (en)
French (fr)
Inventor
张国辉
李佼
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021051539A1 publication Critical patent/WO2021051539A1/zh

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive

Definitions

  • This application belongs to the technical field of face recognition, and in particular relates to a face recognition method, apparatus, and terminal device.
  • Face recognition is a biometric technology that identifies people based on facial feature information. What is commonly called face recognition is actually the general term for a series of related techniques that capture images or video streams containing faces with a camera, automatically detect and track the faces in the images, and then perform facial recognition on the detected faces. Face recognition technology has been widely applied in many fields such as finance, justice, public security, border inspection, education, and medical care.
  • The embodiments of the present application provide a face recognition method, apparatus, and terminal device to solve the low efficiency of the overall face recognition process in the prior art, which is caused by requiring the person being identified to make expressions in front of the camera for liveness detection.
  • A face recognition method is provided, including: collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value; extracting face point cloud data from the human body point cloud data; obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data; extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points; and identifying whether the current user is a target user based on the distances between the feature points.
  • A face recognition apparatus is provided, including:
  • a collection module for collecting human body point cloud data of the current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
  • an extraction module for extracting face point cloud data from the human body point cloud data;
  • an obtaining module for obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
  • a calculation module for extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
  • a recognition module for identifying whether the current user is the target user based on the distances between the feature points.
  • A terminal device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the steps of the above face recognition method are implemented.
  • A computer non-volatile readable storage medium is provided, storing computer-readable instructions that, when executed by a processor, implement the steps of the above face recognition method.
  • The beneficial effects of the face recognition method, apparatus, and terminal device provided by the embodiments of the present application are as follows: by collecting 3D face point cloud data, whether the object to be recognized is a living body can be judged automatically from the depth information in the point cloud data, without relying on user behavior. This solves the prior-art problem that the user must make facial expressions or other actions in front of the camera before it can be determined whether the face is a living body, reduces the possibility of faking a face by holding up a photo of the person, and improves the efficiency of face recognition.
  • FIG. 1 is a schematic flowchart of the steps of a face recognition method according to an embodiment of the present invention;
  • FIG. 2 is a schematic flowchart of the steps of another face recognition method according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of a face recognition apparatus according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of a terminal device according to an embodiment of the present invention.
  • Referring to FIG. 1, there is shown a schematic flowchart of the steps of a face recognition method according to an embodiment of the present invention, which may specifically include the following steps:
  • S101. Collect human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value.
  • It should be noted that this method can be applied to a terminal device. By collecting the human body point cloud data of the current user, the terminal device can identify whether the face point cloud data included in the point cloud data belongs to a target user.
  • In general, human body point cloud data refers to 3D human body point cloud data, which records the structure of the human body in the form of data points, each data point containing three-dimensional coordinates, for example coordinate values on the x, y, and z axes. Each data point may also contain other information, such as gray level, which is not limited in this embodiment.
  • In a specific implementation, the depth information of the various parts of the human body can be obtained through a dedicated detection or collection device, and the device can then automatically output the 3D human body point cloud data based on the obtained depth information. Typically, the device may be a depth video camera, a depth still camera, a depth sensor, or a lidar.
  • Take the depth camera as an example. A depth camera usually consists of an infrared projector and an infrared depth camera. The infrared projector emits uniform infrared light and forms an infrared speckle image on the target body; the speckle image information reflected by the target body is received by the infrared depth camera; finally, after the depth information of the target body is formed, the infrared depth camera analyzes and processes that depth information and outputs the human body point cloud data of the target body.
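As a concrete illustration of this step, the sketch below converts a single depth frame into an (N, 3) point cloud. It assumes the open3d library is available; the file name and the default PrimeSense intrinsics are placeholders for illustration, not values taken from the patent:

```python
import numpy as np
import open3d as o3d

depth = o3d.io.read_image("depth_frame.png")   # 16-bit depth map from the sensor (placeholder file)
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)  # illustrative default intrinsics
cloud = o3d.geometry.PointCloud.create_from_depth_image(depth, intrinsic)
points = np.asarray(cloud.points)              # (N, 3) array of x, y, z coordinate values
```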
  • S102. Extract the face point cloud data from the human body point cloud data. In this embodiment of the present invention, the collected human body point cloud data may include a whole-body point cloud, a half-body point cloud, and so on. Since face recognition only needs to process the point cloud data of the user's face, in order to reduce the amount of computation in subsequent recognition, the face point cloud data, i.e., the human body point cloud data of the current user's face, can first be extracted from the collected human body point cloud data.
  • Usually, the human nose is roughly at the center of the face. Therefore, to extract the face point cloud data from the collected human body point cloud data, the nose tip position of the face in the current user's human body point cloud data can be identified according to the coordinate values of the data points, and the face point cloud data can then be cropped from the human body point cloud data based on that nose tip position.
  • Since the human body point cloud data is stereoscopic three-dimensional data, the position corresponding to the maximum value on the horizontal or vertical axis of the three-dimensional data can be taken as the nose tip position of the face.
  • For example, if the direction perpendicular to the face is the x-axis direction, the position corresponding to the maximum value on the horizontal axis can be taken as the nose tip position; if the direction perpendicular to the face is the y-axis direction, the position corresponding to the maximum value on the vertical axis can be taken instead. This embodiment does not limit this.
  • After the nose tip position is determined, a coordinate system can be constructed with the nose tip position as the origin, and the face point cloud data can be obtained by extracting the data points within a preset length in each direction of the coordinate system.
  • For example, a three-dimensional coordinate system can be constructed with the identified nose tip position as the origin; then, starting from the origin, the data points within a certain length range in each direction of the coordinate axes are extracted, "cutting the face out" of the human body point cloud data to obtain the face point cloud data.
  • The above preset length can be determined by those skilled in the art from empirical values, which is not limited in this embodiment.
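A minimal numpy sketch of the cropping just described, assuming the axis perpendicular to the face is y and taking 90 mm as the illustrative preset length (the patent leaves the actual length to empirical choice):

```python
import numpy as np

def crop_face_by_nose_tip(body_points: np.ndarray, crop_len: float = 90.0) -> np.ndarray:
    """Crop the face from an (N, 3) body point cloud.

    Assumes the y axis points from the sensor toward the face, so the nose
    tip is the point with the largest y value; crop_len (90 mm here) is an
    illustrative stand-in for the empirically chosen preset length."""
    nose_tip = body_points[np.argmax(body_points[:, 1])]   # nose tip position
    offsets = np.abs(body_points - nose_tip)               # per-axis distance to the nose tip
    mask = np.all(offsets <= crop_len, axis=1)             # keep points inside the box
    return body_points[mask] - nose_tip                    # face cloud with nose tip at the origin
```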
  • Of course, according to actual needs, those skilled in the art can also extract the face point cloud data in other ways. For example, the sparse relationships of the various parts of the human body point cloud data can be computed and compared with the sparse relationships of face point cloud sample data, so as to identify the part most similar to the sample sparse relationships as the face part; this embodiment does not limit this either.
  • S103. Obtain the voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data.
  • A voxel (short for volume pixel) is the smallest unit of digital data in a partition of three-dimensional space; voxels are used in fields such as 3D imaging, scientific data, and medical imaging. A volume containing voxels can be presented by volume rendering or by extracting polygonal isosurfaces with a given threshold contour.
  • In this embodiment, after the face position is determined, what is actually obtained is a set of point cloud coordinates. From this set, a cube that contains the entire face point cloud can be found, and marking the positions of the data points within that cube yields the voxel data.
  • S104. Extract a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculate the distances between the feature points. In this embodiment of the present invention, the preset model may be a VoxelNet model.
  • VoxelNet is a network that learns three-dimensional spatial information from point clouds level by level. It divides the three-dimensional point cloud into a certain number of voxels and, after random sampling and normalization of the points, performs local feature extraction on each non-empty voxel, thereby enabling object recognition.
  • In a specific implementation, the VoxelNet model can automatically extract feature points from the input voxel data; these feature points are the feature points of the face to be recognized.
  • To use these feature points for subsequent face recognition, the distances between them can first be calculated.
  • In this embodiment of the present invention, the distance between the feature points may be the Euclidean distance (Euclidean metric), a commonly used distance definition referring to the true distance between two points in m-dimensional space, or the natural length of a vector (i.e., the distance from the point to the origin). In two- and three-dimensional space, the Euclidean distance is the actual distance between two points.
  • S105. Identify whether the current user is the target user based on the distances between the feature points. In this embodiment, the target user is a user whose face information has been collected in advance. For example, before using the face recognition function of a mobile terminal such as a mobile phone, the user needs to first enter his or her face information into the phone; only then can functions such as unlocking and payment be performed through face recognition.
  • In a specific implementation, after collecting the target user's face information, the terminal device can extract multiple feature points from it, calculate the distances between the feature points, and store them. When a face recognition instruction is received, the distances between the current user's facial feature points, calculated in real time, can be compared with the pre-stored distances; if the two are highly similar, the current user can be identified as the target user.
  • In this embodiment of the present invention, after the current user's human body point cloud data is collected and the face point cloud data is extracted from it, the voxel data in the face point cloud data can be obtained according to the coordinate values of its data points; the preset hierarchical learning network model for three-dimensional spatial information can then be used to extract multiple feature points from the voxel data and calculate the distances between them, so that whether the current user is the target user can be identified based on those distances.
  • By collecting 3D face point cloud data, whether the object to be recognized is a living body can thus be judged automatically from the depth information in the point cloud data, without relying on user behavior. This solves the prior-art problem that the user must make facial expressions or other actions in front of the camera before it can be determined whether the face is a living body, reduces the possibility of faking a face by holding up a photo of the person, and improves the efficiency of face recognition.
  • Referring to FIG. 2, there is shown a schematic flowchart of the steps of another face recognition method according to an embodiment of the present invention, which may specifically include the following steps:
  • S201. Collect human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value.
  • In a specific implementation, the user's human body point cloud data can be collected through a device such as a depth video camera, a depth still camera, a depth sensor, or a lidar. The collected human body point cloud data may include a whole-body point cloud or a half-body point cloud; in either case it consists of multiple data points containing coordinate values in a three-dimensional coordinate system, and the information embodied by these data points can characterize the specific human body structure.
  • In this embodiment of the present invention, in order to reduce the amount of data processed in subsequent recognition and reduce recognition error, the human body point cloud data can also be preprocessed after collection. The preprocessing of the human body point cloud data may include denoising.
  • Usually, the collected human body point cloud data contains some noise, such as outlier points. The data can be denoised to filter these outliers out and remove the influence of the noise on subsequent recognition.
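The patent does not name a specific denoising algorithm; as one common choice, the sketch below applies statistical outlier removal, dropping points whose neighborhoods are unusually sparse (the k and std_ratio values are illustrative assumptions):

```python
import numpy as np

def remove_outliers(points: np.ndarray, k: int = 8, std_ratio: float = 2.0) -> np.ndarray:
    """Statistical outlier removal: drop points whose mean distance to
    their k nearest neighbours is more than std_ratio standard deviations
    above the average. Brute-force O(N^2) distances, fine for a sketch;
    a KD-tree would be preferable for large clouds."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    mean_knn = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self-distance at column 0
    keep = mean_knn <= mean_knn.mean() + std_ratio * mean_knn.std()
    return points[keep]
```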
  • S202. Use a preset three-dimensional point cloud network model to identify the sparse relationships between the data points in the human body point cloud data, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data.
  • In this embodiment of the present invention, the preset three-dimensional point cloud network model may be a PointNet++ model.
  • The PointNet++ model is a deep-learning multi-class classification framework designed for 3D point clouds; it can be used to classify the objects presented in 3D point cloud data.
  • After training on multiple pieces of face point cloud sample data and modifying the model output to two classes, a PointNet++ model for detecting whether point cloud data is a face point cloud can be obtained.
  • In a specific implementation, the fully connected layer of the PointNet++ model can be configured to output two classes, and training on a pre-collected sample set then achieves the classification of faces versus non-faces.
  • For the preprocessed human body point cloud data, the PointNet++ model can thus be used to identify the sparse relationships between the data points, from which the face point cloud data can be extracted.
  • S203. Calculate the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data.
  • S204. Extract the data points whose similarity exceeds a preset threshold as the face point cloud data.
  • In this embodiment, the face point cloud sample data may be multiple pieces of face point cloud data collected in advance; the PointNet++ model can be trained on these samples to obtain generic data characterizing the sparse relationships of the data points in a face point cloud.
  • After the sparse relationships of the various parts of the current user's human body point cloud are identified, they can be compared with the sparse relationships of the sample data, and the parts whose similarity exceeds a certain threshold are extracted as the region where the face is located; all the data points in that region then constitute the face point cloud data of the current user.
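In the patent this comparison is carried out with learned PointNet++ features; the model-free sketch below only illustrates the shape of the idea, using mean k-nearest-neighbour distance as a crude stand-in for the "sparse relationship" descriptor and per-point deviation from the face-sample statistics as the similarity test:

```python
import numpy as np

def local_sparsity(points: np.ndarray, k: int = 8) -> np.ndarray:
    """Mean distance from each point to its k nearest neighbours: a crude
    per-point stand-in for the 'sparse relationship' descriptor that the
    patent obtains from a trained PointNet++ model."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)    # skip column 0 (self-distance)

def extract_face_region(points: np.ndarray, sample_mean: float,
                        sample_std: float, thresh: float = 2.0) -> np.ndarray:
    """Keep the points whose sparsity deviates from the face-sample
    statistics by less than `thresh` standard deviations; the surviving
    points approximate the face region (all values are illustrative)."""
    dev = np.abs(local_sparsity(points) - sample_mean) / sample_std
    return points[dev <= thresh]   # low deviation = high similarity to the face samples
```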
  • S205. Determine the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system.
  • After the face position is determined, what is actually obtained is a set of point cloud coordinates. Since point cloud data contains only coordinate information, it cannot be used directly for face recognition; the point cloud data must first be voxelized into voxel data, from which the feature points of the face can be extracted and used as the input data of the model, completing the whole recognition process.
  • Voxelization converts the geometric representation of an object into the voxel representation closest to that object, producing a voxel data set. Voxels contain not only the surface information of the object but can also describe its internal properties. Voxels representing the spatial information of an object are similar to the two-dimensional pixels of an image, except that they extend from two-dimensional points to three-dimensional cube cells.
  • When voxelizing the point cloud data, a cube that can contain the entire face point cloud can first be found from the point cloud coordinate set; it may be the smallest cube containing all the data points.
  • In a specific implementation, since each data point corresponds to coordinate values on the x, y, and z axes, the smallest cube containing all the data points can be determined by first finding the maximum and minimum coordinate values of the data points on each axis, namely xmin, xmax, ymin, ymax, zmin, and zmax.
  • S206. Generate, from the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data. By combining these maxima and minima, the 8 vertices of the cube are obtained, so that a cube containing the entire face point cloud can be generated.
  • For example, if the point corresponding to (xmin, ymin, zmin) is taken as the coordinate origin, the coordinates of the 8 vertices of the cube in the current coordinate system are: (xmin, ymin, zmin), (xmax, ymin, zmin), (xmax, ymax, zmin), (xmin, ymax, zmin), (xmin, ymin, zmax), (xmax, ymin, zmax), (xmax, ymax, zmax), and (xmin, ymax, zmax).
  • S207. Mark all the data points in the smallest cube to obtain the voxel data of the face point cloud data: within the cube, marking the position of each data point yields the voxel data.
  • S208. Map the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information. To facilitate subsequent recognition, the obtained voxel data can be normalized and mapped to a space of a specific size, for example a 200*200*200 space.
  • The mapping process can be completed according to the proportional relationship between the current cube containing all the points of the face point cloud and the normalized three-dimensional space. Since the normalized space is usually smaller than that cube, the mapping can be performed by scaling the current cube down proportionally.
  • For example, if the smallest cube containing all the points of the face point cloud is a 500*500*500 cube and the required normalized space is a 200*200*200 space, the proportional relationship between the two is 5:2. The coordinate values of the marked data points in the 500*500*500 cube can therefore be scaled down proportionally by a factor of 2.5, and the data points of the face point cloud marked in the 200*200*200 space according to the scaled coordinate values, giving the normalized voxel data.
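Putting S205 to S208 together, the sketch below computes the bounding extents, scales them into a 200*200*200 space, and marks the occupied cells, under the assumption that "marking" the data points means building a binary occupancy volume:

```python
import numpy as np

def voxelize_face(face_points: np.ndarray, grid: int = 200) -> np.ndarray:
    """Voxelize an (N, 3) face point cloud into a grid**3 binary occupancy
    volume: find the min/max bounding cube (S205-S206), scale it into the
    normalized space (S208), and mark the occupied cells (S207)."""
    mins = face_points.min(axis=0)            # (xmin, ymin, zmin)
    maxs = face_points.max(axis=0)            # (xmax, ymax, zmax)
    side = (maxs - mins).max()                # edge length of the smallest enclosing cube
    scale = (grid - 1) / side                 # e.g. a 500-wide cube shrinks 2.5x into a 200^3 grid
    idx = np.clip(np.floor((face_points - mins) * scale).astype(int), 0, grid - 1)
    volume = np.zeros((grid, grid, grid), dtype=np.uint8)
    volume[idx[:, 0], idx[:, 1], idx[:, 2]] = 1   # mark each cell that contains a data point
    return volume
```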
  • Then, the normalized voxel data can be input into the preset VoxelNet model for feature point extraction and recognition.
  • S209. Extract a plurality of feature points from the voxel data using the preset hierarchical learning network model for three-dimensional spatial information, and calculate the distances between the feature points.
  • In this embodiment of the present invention, the normalized voxel data that is input represents the face to be recognized; the feature points the VoxelNet model extracts from the input voxel data are the feature points of that face. When performing face recognition, the Euclidean distances between the feature points can first be calculated.
  • S210. Identify whether the current user is the target user based on the distances between the feature points.
  • In this embodiment, when performing face recognition on the current user, multiple target feature points can first be extracted from the face of the target user entered in advance, and the Euclidean distances between the target feature points calculated.
  • The pre-entered target feature points may be feature points extracted from the pre-entered face of the user, for example feature points at positions such as the eyebrows, eyes, and nose.
  • Then, the correspondence between the current feature points and the target feature points can be determined, and it can be judged whether the error between the Euclidean distances of the current user's feature points and the Euclidean distances of the corresponding target feature points is less than a preset value; if so, the current user can be identified as the target user; otherwise, it can be determined that the current user is not the target user.
  • For example, the feature points such as the eyebrows, eyes, and nose recognized in the current face can correspond to the target feature points such as the eyebrows, eyes, and nose in the pre-entered face: eyebrow corresponds to eyebrow, eye to eye, and nose to nose.
  • When calculating the Euclidean distances among the feature points and among the target feature points, the Euclidean distance E11 between the currently recognized eyebrow and eye feature points and the Euclidean distance E12 between the eye and nose feature points can be calculated; E11 and E12 are then compared with the Euclidean distances between the pre-entered target feature points, i.e., E11 with the pre-entered eyebrow-eye distance E01, and E12 with the pre-entered eye-nose distance E02.
  • In a specific implementation, a threshold can be set: when comparing the Euclidean distances between the feature points extracted from two faces, if the mutual errors are less than the threshold, the two faces can be considered to belong to the same person; otherwise, they belong to different people.
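A minimal sketch of this final comparison, with eyebrow, eye, and nose points ordered consistently for both faces so that the consecutive distances line up as E11/E01 and E12/E02; the tolerance value is an assumption, not a figure from the patent:

```python
import numpy as np

def chain_distances(feature_points: np.ndarray) -> np.ndarray:
    """Euclidean distances between consecutive feature points, e.g.
    eyebrow->eye (E11) and eye->nose (E12) for a 3-point chain."""
    return np.linalg.norm(np.diff(feature_points, axis=0), axis=1)

def is_target_user(probe_points: np.ndarray, target_points: np.ndarray,
                   tol: float = 3.0) -> bool:
    """Match the current user against the pre-entered target user: the two
    faces are considered the same person only if every corresponding
    distance error stays below the preset threshold (tol is assumed)."""
    e_probe = chain_distances(probe_points)    # e.g. [E11, E12]
    e_target = chain_distances(target_points)  # e.g. [E01, E02]
    return bool(np.all(np.abs(e_probe - e_target) < tol))
```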
  • In this embodiment of the present invention, after the current user's human body point cloud data is collected, the PointNet++ model can be used to extract the face point cloud data from it; the voxel data in the face point cloud data is then obtained and normalized, and the normalized voxel data can be input into the VoxelNet model for feature point extraction and recognition.
  • When performing face recognition in this way, liveness can be judged without relying on user behavior, which solves the problem of a user faking a face by holding up a photo of the person, improves the efficiency of face recognition, and ensures the security of face recognition.
  • Referring to FIG. 3, there is shown a schematic diagram of a face recognition apparatus according to an embodiment of the present invention, which may specifically include the following modules:
  • a collection module 301 for collecting human body point cloud data of the current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
  • an extraction module 302 for extracting the face point cloud data from the human body point cloud data;
  • an obtaining module 303 for obtaining the voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
  • a calculation module 304 for extracting a plurality of feature points from the voxel data using the preset hierarchical learning network model for three-dimensional spatial information, and calculating the distances between the feature points;
  • a recognition module 305 for identifying whether the current user is the target user based on the distances between the feature points.
  • In this embodiment of the present invention, the extraction module 302 may specifically include the following sub-modules:
  • a sparse relationship identification sub-module for identifying the sparse relationships between the data points in the human body point cloud data using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
  • a similarity calculation sub-module for calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
  • a face point cloud data extraction sub-module for extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
  • The extraction module 302 may also include the following sub-modules:
  • a nose tip position identification sub-module for identifying the nose tip position of the face in the current user's human body point cloud data according to the coordinate value of each data point in the human body point cloud data;
  • a face point cloud data cropping sub-module for cropping the face point cloud data from the human body point cloud data based on the nose tip position of the face.
  • The face point cloud data cropping sub-module may specifically include the following unit:
  • a face point cloud data cropping unit for constructing a coordinate system with the nose tip position of the face as the origin, and obtaining the face point cloud data by extracting a plurality of data points within a preset length in each direction of the coordinate system.
  • The obtaining module 303 may specifically include the following sub-modules:
  • a coordinate value determination sub-module for respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of the preset three-dimensional coordinate system;
  • a cube generation sub-module for generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
  • a voxel data obtaining sub-module for marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
  • The obtaining module 303 may also include the following sub-module:
  • a voxel data mapping sub-module for mapping the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
  • In this embodiment, the distance between the feature points is the Euclidean distance between the feature points, and the recognition module 305 may specifically include the following sub-modules:
  • a target feature point extraction sub-module for extracting a plurality of target feature points from the face of the target user entered in advance;
  • a Euclidean distance calculation sub-module for calculating the Euclidean distances between the target feature points;
  • a correspondence determination sub-module for determining the correspondence between the feature points and the target feature points;
  • a Euclidean distance judgment sub-module for judging whether the error between the Euclidean distances between the feature points and the Euclidean distances between the corresponding target feature points is less than a preset value;
  • an identification sub-module for, if so, identifying the current user as the target user, and if not, determining that the current user is not the target user.
  • As the device embodiment is basically similar to the method embodiment, its description is relatively simple; for the relevant parts, refer to the description of the method embodiment.
  • Referring to FIG. 4, the terminal device 400 of this embodiment includes a processor 410, a memory 420, and computer-readable instructions 421 stored in the memory 420 and executable on the processor 410.
  • When executing the computer-readable instructions 421, the processor 410 implements the steps in the above embodiments of the face recognition method, for example steps S101 to S105 shown in FIG. 1; alternatively, when executing the computer-readable instructions 421, the processor 410 implements the functions of the modules/units in the above device embodiments, for example the functions of modules 301 to 305 shown in FIG. 3.
  • Exemplarily, the computer-readable instructions 421 may be divided into one or more modules/units, which are stored in the memory 420 and executed by the processor 410. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the segments are used to describe the execution process of the computer-readable instructions 421 in the terminal device 400.
  • For example, the computer-readable instructions 421 may be divided into a collection module, an extraction module, an obtaining module, a calculation module, and a recognition module, whose specific functions are as follows:
  • a collection module for collecting human body point cloud data of the current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
  • an extraction module for extracting the face point cloud data from the human body point cloud data;
  • an obtaining module for obtaining the voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
  • a calculation module for extracting a plurality of feature points from the voxel data using the preset hierarchical learning network model for three-dimensional spatial information, and calculating the distances between the feature points;
  • a recognition module for identifying whether the current user is the target user based on the distances between the feature points.
  • The terminal device 400 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device 400 may include, but is not limited to, the processor 410 and the memory 420.
  • Those skilled in the art can understand that FIG. 4 is only an example of the terminal device 400 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input/output devices, network access devices, buses, and so on.
  • The processor 410 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • The memory 420 may be an internal storage unit of the terminal device 400, such as a hard disk or memory of the terminal device 400. The memory 420 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 400. Further, the memory 420 may include both an internal storage unit and an external storage device of the terminal device 400. The memory 420 is used to store the computer-readable instructions 421 and the other instructions and data required by the terminal device 400, and may also be used to temporarily store data that has been output or is to be output.
  • Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A face recognition method, apparatus, and terminal device, applicable to the technical field of face recognition. The method includes: collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value (S101); extracting face point cloud data from the human body point cloud data (S102); obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data (S103); extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distances between the feature points (S104); and identifying whether the current user is a target user based on the distances between the feature points (S105). Whether the object to be recognized is a living body is automatically judged from the depth information in the point cloud data, so liveness can be determined without relying on user behavior, which improves the efficiency of face recognition.

Description

Face recognition method, apparatus and terminal device
This application claims priority to Chinese patent application No. 201910882001.6, filed on September 18, 2019 and entitled "Face recognition method, apparatus and terminal device", the entire content of which is incorporated into this application by reference.
Technical Field
This application belongs to the technical field of face recognition, and in particular relates to a face recognition method, apparatus, and terminal device.
Background Art
Face recognition is a biometric technology that identifies people based on facial feature information. What is commonly called face recognition is actually the general term for a series of related techniques that capture images or video streams containing faces with a camera, automatically detect and track the faces in the images, and then perform facial recognition on the detected faces. Face recognition technology has been widely applied in many fields such as finance, justice, public security, border inspection, education, and medical care.
Most face recognition in the prior art performs detection and recognition based on 2D planar images. However, this cannot properly raise an alarm when someone who is not the person holds up the person's photo in place of their own face. In other words, detection and recognition through 2D planar images has a major loophole: holding a photo of the person in front of one's own face is enough to be recognized as that person. To solve this problem, it is necessary to verify whether the captured face image is the face of a living person. The current practice is usually to require the person being identified to make expressions in front of the camera, confirm whether the face is a living body by detecting the person's motions, and then perform recognition. Although this method can reduce the possibility of impersonating the person with an image, requiring the person to make expressions in front of the camera before recognition makes the whole face recognition process long and inefficient; recognition without user awareness is impossible, and the user experience is poor.
Summary of the Invention
Technical Problem
In view of this, the embodiments of the present application provide a face recognition method, apparatus, and terminal device to solve the low efficiency of the overall face recognition process in the prior art caused by requiring the person being identified to make expressions in front of the camera for liveness detection.
Solution to the Problem
Technical Solution
To solve the above technical problem, the technical solutions adopted by the embodiments of this application are as follows:
In a first aspect, a face recognition method is provided, including:
collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
extracting face point cloud data from the human body point cloud data;
obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
identifying whether the current user is a target user based on the distances between the feature points.
In a second aspect, a face recognition apparatus is provided, including:
a collection module for collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
an extraction module for extracting face point cloud data from the human body point cloud data;
an obtaining module for obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
a calculation module for extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
a recognition module for identifying whether the current user is a target user based on the distances between the feature points.
In a third aspect, a terminal device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when executing the computer-readable instructions, the processor implements the following steps of the above face recognition method:
collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
extracting face point cloud data from the human body point cloud data;
obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
identifying whether the current user is a target user based on the distances between the feature points.
In a fourth aspect, a computer non-volatile readable storage medium is provided, storing computer-readable instructions which, when executed by a processor, implement the following steps of the above face recognition method:
collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
extracting face point cloud data from the human body point cloud data;
obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
identifying whether the current user is a target user based on the distances between the feature points.
The beneficial effects of the face recognition method, apparatus, and terminal device provided by the embodiments of this application are as follows: by collecting 3D face point cloud data, whether the object to be recognized is a living body can be judged automatically from the depth information in the point cloud data, without relying on user behavior. This solves the prior-art problem that the user must make facial expressions or other actions in front of the camera before it can be determined whether the face is a living body, reduces the possibility of faking a face by holding up a photo of the person, and improves the efficiency of face recognition.
Beneficial Effects of the Invention
Brief Description of the Drawings
Description of the Drawings
FIG. 1 is a schematic flowchart of the steps of a face recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of the steps of another face recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a face recognition apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Embodiments of the Invention
Detailed Description of the Embodiments
Referring to FIG. 1, there is shown a schematic flowchart of the steps of a face recognition method according to an embodiment of the present invention, which may specifically include the following steps:
S101. Collect human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value.
It should be noted that this method can be applied to a terminal device. By collecting the human body point cloud data of the current user, the terminal device can identify whether the face point cloud data included in the point cloud data belongs to a target user.
In general, human body point cloud data refers to 3D human body point cloud data, which records the structure of the human body in the form of data points, each data point containing three-dimensional coordinates, for example coordinate values on the x, y, and z axes. Of course, each data point may also contain other information, such as gray level, which is not limited in this embodiment.
In a specific implementation, the depth information of the various parts of the human body can be obtained through a dedicated detection or collection device, and the device can then automatically output the 3D human body point cloud data based on the obtained depth information. Typically, the device may be a depth video camera, a depth still camera, a depth sensor, or a lidar.
Take the depth camera as an example. A depth camera usually consists of an infrared projector and an infrared depth camera. The infrared projector emits uniform infrared light outward and forms an infrared speckle image on the target body; the speckle image information reflected by the target body is received by the infrared depth camera; finally, after the depth information of the target body is formed, the infrared depth camera analyzes and processes that depth information and can output the human body point cloud data of the target body.
S102. Extract face point cloud data from the human body point cloud data.
In this embodiment of the present invention, the collected human body point cloud data may include a whole-body point cloud, a half-body point cloud, and so on. Since face recognition only needs to process the point cloud data of the user's face, in order to reduce the amount of computation in subsequent recognition, the face point cloud data, i.e., the human body point cloud data of the current user's face, can first be extracted from the human body point cloud data after collection.
Usually, the human nose is roughly at the center of the face. Therefore, to extract the face point cloud data from the collected human body point cloud data, the nose tip position of the face in the current user's human body point cloud data can be identified according to the coordinate values of the data points, and the face point cloud data can then be cropped from the human body point cloud data based on the nose tip position.
Since the human body point cloud data is stereoscopic three-dimensional data, the position corresponding to the maximum value on the horizontal or vertical axis of the three-dimensional data can be taken as the nose tip position of the face.
For example, if, among the coordinate axes, the direction perpendicular to the face is the x-axis direction, the position corresponding to the maximum value on the horizontal axis of the three-dimensional data can be taken as the nose tip position; if the direction perpendicular to the face is the y-axis direction, the position corresponding to the maximum value on the vertical axis can be taken as the nose tip position. This embodiment does not limit this.
After the nose tip position is determined, a coordinate system can be constructed with the nose tip position as the origin, and the face point cloud data can be obtained by extracting the data points within a preset length in each direction of the coordinate system.
For example, a three-dimensional coordinate system can be constructed with the identified nose tip position as the origin; then, starting from the origin, the data points within a certain length range in each direction of the coordinate axes are extracted, "cutting the face out" of the human body point cloud data to obtain the face point cloud data. The above length can be determined by those skilled in the art from empirical values, which is not limited in this embodiment.
Of course, according to actual needs, those skilled in the art can also choose other ways to extract the face point cloud data from the collected human body point cloud data. For example, the sparse relationships of the various parts of the human body point cloud data can be computed and compared with the sparse relationships of face point cloud sample data, so as to identify the part whose sparse relationships are more similar to those of the sample data as the face part; this embodiment does not limit this either.
S103. Obtain voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data.
A voxel (short for volume pixel) is the smallest unit of digital data in a partition of three-dimensional space; a volume containing voxels can be presented by volume rendering or by extracting polygonal isosurfaces with a given threshold contour. Voxels are used in fields such as 3D imaging, scientific data, and medical imaging.
In this embodiment of the present invention, after the face position is determined, what is actually obtained is a set of point cloud coordinates. From this set, a cube that can contain the entire face point cloud can be found.
Within this cube, marking the positions of the data points yields the voxel data.
S104. Extract a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculate the distances between the feature points.
In this embodiment of the present invention, the preset hierarchical learning network model for three-dimensional spatial information may be a VoxelNet model.
VoxelNet is a network that learns three-dimensional spatial information from point clouds level by level. It divides the three-dimensional point cloud into a certain number of voxels and, after random sampling and normalization of the points, performs local feature extraction on each non-empty voxel, thereby enabling object recognition.
In a specific implementation, the VoxelNet model can automatically extract feature points from the input voxel data; these feature points are the feature points of the face to be recognized.
To use these feature points for subsequent face recognition, the distances between them can first be calculated.
In this embodiment of the present invention, the distance between the feature points may be the Euclidean distance (Euclidean metric). The Euclidean distance, also known as the Euclidean metric, is a commonly used distance definition referring to the true distance between two points in m-dimensional space, or the natural length of a vector (i.e., the distance from the point to the origin). In two- and three-dimensional space, the Euclidean distance is the actual distance between two points.
Of course, according to actual needs, those skilled in the art can also use other measures to calculate the distance between feature points, such as the Manhattan distance or the Mahalanobis distance, which is not limited in this embodiment.
S105. Identify whether the current user is a target user based on the distances between the feature points.
In this embodiment of the present invention, the target user is a user whose face information has been collected in advance. For example, before using the face recognition function of a mobile terminal such as a mobile phone, a user needs to first enter his or her face information into the phone; only then can functions such as unlocking and payment be performed through face recognition.
In a specific implementation, after collecting the target user's face information, the terminal device can extract multiple feature points from it, calculate the distances between the feature points, and store them. When a face recognition instruction is received, the distances between the current user's facial feature points, calculated in real time, can be compared with the pre-stored distances; if the two are highly similar, the current user can be identified as the target user.
In this embodiment of the present invention, after the current user's human body point cloud data is collected and the face point cloud data is extracted from it, the voxel data in the face point cloud data can be obtained according to the coordinate values of its data points; a preset hierarchical learning network model for three-dimensional spatial information can then be used to extract multiple feature points from the voxel data and calculate the distances between them, and based on those distances it can be identified whether the current user is the target user. By collecting 3D face point cloud data, this embodiment can automatically judge from the depth information in the point cloud data whether the object to be recognized is a living body, without relying on user behavior. This solves the prior-art problem that the user must make facial expressions or other actions in front of the camera before it can be determined whether the face is a living body, reduces the possibility of faking a face by holding up a photo of the person, and improves the efficiency of face recognition.
Referring to FIG. 2, there is shown a schematic flowchart of the steps of another face recognition method according to an embodiment of the present invention, which may specifically include the following steps:
S201. Collect human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value.
In a specific implementation, the user's human body point cloud data can be collected through a device such as a depth video camera, a depth still camera, a depth sensor, or a lidar. The collected human body point cloud data may include a whole-body point cloud or a half-body point cloud. In either case, it includes multiple data points containing coordinate values in a three-dimensional coordinate system, and the information embodied by these data points can characterize the specific human body structure.
In this embodiment of the present invention, in order to reduce the amount of data processed in subsequent recognition and reduce recognition error, the human body point cloud data can also be preprocessed after collection. The preprocessing of the human body point cloud data may include denoising.
Usually, the collected human body point cloud data contains some noise, such as outlier points. The human body point cloud data can be denoised to filter out these outlier points and remove the influence of the noise on subsequent recognition.
S202. Use a preset three-dimensional point cloud network model to identify the sparse relationships between the data points in the human body point cloud data, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data.
In this embodiment of the present invention, the preset three-dimensional point cloud network model may be a PointNet++ model.
The PointNet++ model is a deep-learning multi-class classification framework designed for 3D point clouds; it can be used to classify the objects presented in 3D point cloud data.
In this embodiment of the present invention, after training on multiple pieces of face point cloud sample data and modifying the model output to two classes, a PointNet++ model for detecting whether point cloud data is a face point cloud can be obtained.
In a specific implementation, the fully connected layer of the PointNet++ model can be configured to output two classes, and training on a pre-collected sample set then achieves the classification of faces and non-faces.
In this embodiment of the present invention, for the preprocessed human body point cloud data, the PointNet++ model can be used to identify the sparse relationships between the data points, from which the face point cloud data can then be extracted.
S203. Calculate the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data.
S204. Extract the data points whose similarity exceeds a preset threshold as the face point cloud data.
In this embodiment of the present invention, the face point cloud sample data may be multiple pieces of face point cloud data collected in advance; the PointNet++ model can be trained on these samples to obtain generic data characterizing the sparse relationships of the data points in a face point cloud.
After the sparse relationships of the various parts of the current user's human body point cloud are identified, they can be compared with the sparse relationships of the sample data, and the parts whose similarity exceeds a certain threshold are extracted as the region where the face is located; all the data points in that region then constitute the face point cloud data of the current user.
S205. Determine the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system.
In this embodiment of the present invention, after the face position is determined, what is actually obtained is a set of point cloud coordinates. Since point cloud data contains only coordinate information, it cannot be used directly for face recognition; the point cloud data must be voxelized into voxel data before the feature points of the face can be extracted from the voxel data and used as the input data of the model to complete the whole recognition process.
Voxelization converts the geometric representation of an object into the voxel representation closest to that object, producing a voxel data set. Voxels contain not only the surface information of the object but can also describe its internal properties. Voxels representing the spatial information of an object are quite similar to the two-dimensional pixels that represent an image, except that they extend from two-dimensional points to three-dimensional cube cells.
When voxelizing the point cloud data, a cube that can contain the entire face point cloud can first be found from the point cloud coordinate set. The cube may be the smallest cube containing all the data points.
In a specific implementation, since the point cloud coordinate set contains the three-dimensional coordinates of each data point, i.e., the coordinate values of each data point on the x, y, and z axes, when determining the smallest cube containing all the data points, the maximum and minimum coordinate values of the data points on the x, y, and z axes, namely xmin, xmax, ymin, ymax, zmin, and zmax, can first be found.
S206. Generate, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data.
By combining the above maximum and minimum coordinate values, the 8 vertices of the cube are obtained, so that a cube containing the entire face point cloud can be generated.
For example, if the point corresponding to (xmin, ymin, zmin) is taken as the coordinate origin, the coordinates of the 8 vertices of the cube in the current coordinate system are: (xmin, ymin, zmin), (xmax, ymin, zmin), (xmax, ymax, zmin), (xmin, ymax, zmin), (xmin, ymin, zmax), (xmax, ymin, zmax), (xmax, ymax, zmax), and (xmin, ymax, zmax).
S207. Mark all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
Within the above cube, marking the position of each data point yields the voxel data.
S208. Map the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
In this embodiment of the present invention, to facilitate subsequent recognition, the obtained voxel data can be normalized and mapped to a three-dimensional space of a specific size, for example a 200*200*200 space.
In a specific implementation, the mapping process can be completed according to the proportional relationship between the current cube containing all the points of the face point cloud and the normalized three-dimensional space.
It should be noted that, since the required normalized space is usually smaller than the cube containing all the points of the face point cloud, the mapping can be performed by scaling the current cube down proportionally.
For example, if the smallest cube currently containing all the points of the face point cloud is a 500*500*500 cube, and the required normalized space is a 200*200*200 space, the proportional relationship between the two is 5:2. Therefore, the coordinate values of the marked data points in the 500*500*500 cube can be scaled down proportionally by a factor of 2.5, and the data points of the face point cloud marked in the 200*200*200 space according to the scaled coordinate values, giving the normalized voxel data.
Then, the normalized voxel data can be input into the preset VoxelNet model for feature point extraction and recognition.
S209. Extract a plurality of feature points from the voxel data using the preset hierarchical learning network model for three-dimensional spatial information, and calculate the distances between the feature points.
In this embodiment of the present invention, the normalized voxel data that is input represents the face to be recognized; the feature points the VoxelNet model extracts from the input voxel data are the feature points of that face.
When performing face recognition, the Euclidean distances between the feature points can first be calculated.
S210. Identify whether the current user is a target user based on the distances between the feature points.
In this embodiment of the present invention, when performing face recognition on the current user, multiple target feature points in the face of the target user entered in advance can first be extracted, and the Euclidean distances between the target feature points calculated.
The pre-entered target feature points may be feature points extracted from the pre-entered face of the user, for example feature points at positions such as the eyebrows, eyes, and nose.
Then, the correspondence between the feature points and the target feature points can be determined. By calculating the Euclidean distances between the target feature points, it can be judged whether the error between the Euclidean distances of the current user's feature points and the Euclidean distances of the corresponding target feature points is less than a preset value; if so, the current user can be identified as the target user; otherwise, it can be determined that the current user is not the target user.
For example, the feature points such as the eyebrows, eyes, and nose recognized in the current face can correspond to the target feature points such as the eyebrows, eyes, and nose in the pre-entered face: eyebrow corresponds to eyebrow, eye to eye, and nose to nose.
When calculating the Euclidean distances among the feature points and among the target feature points, the Euclidean distance E11 between the currently recognized eyebrow and eye feature points and the Euclidean distance E12 between the eye and nose feature points can be calculated; E11 and E12 are then compared with the Euclidean distances between the pre-entered target feature points, i.e., E11 is compared with the pre-entered Euclidean distance E01 between the eyebrow and eye target feature points, and E12 with the pre-entered Euclidean distance E02 between the eye and nose target feature points.
In a specific implementation, a threshold can be set. When comparing the Euclidean distances between the feature points extracted from two faces, if the mutual errors are less than the threshold, the two faces can be considered to belong to the same person; otherwise, they belong to different people.
In this embodiment of the present invention, after the current user's human body point cloud data is collected, the PointNet++ model can be used to extract the face point cloud data from it, after which the voxel data in the face point cloud data is obtained and normalized; the normalized voxel data can then be input into the VoxelNet model for feature point extraction and recognition. When performing face recognition, this embodiment can judge liveness without relying on user behavior, which solves the problem of a user faking a face by holding up a photo of the person, improves the efficiency of face recognition, and ensures the security of face recognition.
It should be noted that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Referring to FIG. 3, there is shown a schematic diagram of a face recognition apparatus according to an embodiment of the present invention, which may specifically include the following modules:
a collection module 301 for collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
an extraction module 302 for extracting face point cloud data from the human body point cloud data;
an obtaining module 303 for obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
a calculation module 304 for extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
a recognition module 305 for identifying whether the current user is a target user based on the distances between the feature points.
In this embodiment of the present invention, the extraction module 302 may specifically include the following sub-modules:
a sparse relationship identification sub-module for identifying the sparse relationships between the data points in the human body point cloud data using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
a similarity calculation sub-module for calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
a face point cloud data extraction sub-module for extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
In this embodiment of the present invention, the extraction module 302 may also include the following sub-modules:
a nose tip position identification sub-module for identifying the nose tip position of the face in the current user's human body point cloud data according to the coordinate value of each data point in the human body point cloud data;
a face point cloud data cropping sub-module for cropping the face point cloud data from the human body point cloud data based on the nose tip position of the face.
In this embodiment of the present invention, the face point cloud data cropping sub-module may specifically include the following unit:
a face point cloud data cropping unit for constructing a coordinate system with the nose tip position of the face as the origin, and obtaining the face point cloud data by extracting a plurality of data points within a preset length in each direction of the coordinate system.
In this embodiment of the present invention, the obtaining module 303 may specifically include the following sub-modules:
a coordinate value determination sub-module for respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system;
a cube generation sub-module for generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
a voxel data obtaining sub-module for marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
In this embodiment of the present invention, the obtaining module 303 may also include the following sub-module:
a voxel data mapping sub-module for mapping the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
In this embodiment of the present invention, the distance between the feature points is the Euclidean distance between the feature points, and the recognition module 305 may specifically include the following sub-modules:
a target feature point extraction sub-module for extracting a plurality of target feature points from the face of the target user entered in advance;
a Euclidean distance calculation sub-module for calculating the Euclidean distances between the target feature points;
a correspondence determination sub-module for determining the correspondence between the feature points and the target feature points;
a Euclidean distance judgment sub-module for judging whether the error between the Euclidean distances between the feature points and the Euclidean distances between the corresponding target feature points is less than a preset value;
an identification sub-module for, if so, identifying the current user as the target user, and if not, determining that the current user is not the target user.
As the device embodiment is basically similar to the method embodiment, its description is relatively simple; for the relevant parts, refer to the description of the method embodiment.
Referring to FIG. 4, there is shown a schematic diagram of a terminal device according to an embodiment of the present application. As shown in FIG. 4, the terminal device 400 of this embodiment includes: a processor 410, a memory 420, and computer-readable instructions 421 stored in the memory 420 and executable on the processor 410. When executing the computer-readable instructions 421, the processor 410 implements the steps in the above embodiments of the face recognition method, for example steps S101 to S105 shown in FIG. 1; alternatively, when executing the computer-readable instructions 421, the processor 410 implements the functions of the modules/units in the above device embodiments, for example the functions of modules 301 to 305 shown in FIG. 3.
Exemplarily, the computer-readable instructions 421 may be divided into one or more modules/units, which are stored in the memory 420 and executed by the processor 410 to complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the segments are used to describe the execution process of the computer-readable instructions 421 in the terminal device 400. For example, the computer-readable instructions 421 may be divided into a collection module, an extraction module, an obtaining module, a calculation module, and a recognition module, whose specific functions are as follows:
a collection module for collecting human body point cloud data of a current user, the human body point cloud data including a plurality of data points, each data point having a corresponding coordinate value;
an extraction module for extracting face point cloud data from the human body point cloud data;
an obtaining module for obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
a calculation module for extracting a plurality of feature points from the voxel data using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
a recognition module for identifying whether the current user is a target user based on the distances between the feature points.
The terminal device 400 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device 400 may include, but is not limited to, the processor 410 and the memory 420. Those skilled in the art can understand that FIG. 4 is only an example of the terminal device 400 and does not constitute a limitation on the terminal device 400; it may include more or fewer components than shown, combine certain components, or use different components; for example, the terminal device 400 may also include input/output devices, network access devices, buses, and so on.
The processor 410 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 420 may be an internal storage unit of the terminal device 400, such as a hard disk or memory of the terminal device 400. The memory 420 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 400. Further, the memory 420 may include both an internal storage unit and an external storage device of the terminal device 400. The memory 420 is used to store the computer-readable instructions 421 and the other instructions and data required by the terminal device 400. The memory 420 may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware; the computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or equivalently replace some of the technical features therein; and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the protection scope of this application.

Claims (20)

  1. A face recognition method, comprising:
    collecting human body point cloud data of a current user, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    extracting face point cloud data from the human body point cloud data;
    obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
    extracting a plurality of feature points from the voxel data by using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
    identifying whether the current user is a target user based on the distances between the feature points.
  2. The method according to claim 1, wherein the step of extracting the face point cloud data from the human body point cloud data comprises:
    identifying sparse relationships between the data points in the human body point cloud data by using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
    calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
    extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
  3. The method according to claim 1, wherein the step of extracting the face point cloud data from the human body point cloud data comprises:
    identifying the nose tip position of the face in the human body point cloud data of the current user according to the coordinate value of each data point in the human body point cloud data;
    cropping the face point cloud data from the human body point cloud data based on the nose tip position of the face.
  4. The method according to claim 3, wherein the step of cropping the face point cloud data from the human body point cloud data based on the nose tip position of the face comprises:
    constructing a coordinate system with the nose tip position of the face as the origin, and obtaining the face point cloud data by extracting a plurality of data points within a preset length in each direction of the coordinate system.
  5. The method according to claim 1, wherein the step of obtaining the voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data comprises:
    respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system;
    generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
    marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
  6. The method according to claim 5, further comprising:
    mapping the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
  7. The method according to claim 5, wherein the distance between the feature points is the Euclidean distance between the feature points, and the step of identifying whether the current user is a target user based on the distances between the feature points comprises:
    extracting a plurality of target feature points from the face of the target user entered in advance, and calculating the Euclidean distances between the target feature points;
    determining the correspondence between the feature points and the target feature points;
    judging whether the error between the Euclidean distances between the feature points and the Euclidean distances between the corresponding target feature points is less than a preset value;
    if so, identifying the current user as the target user;
    if not, determining that the current user is not the target user.
  8. A face recognition apparatus, comprising:
    a collection module for collecting human body point cloud data of a current user, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    an extraction module for extracting face point cloud data from the human body point cloud data;
    an obtaining module for obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
    a calculation module for extracting a plurality of feature points from the voxel data by using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
    a recognition module for identifying whether the current user is a target user based on the distances between the feature points.
  9. The apparatus according to claim 8, wherein the extraction module comprises:
    a sparse relationship identification sub-module for identifying sparse relationships between the data points in the human body point cloud data by using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
    a similarity calculation sub-module for calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
    a face point cloud data extraction sub-module for extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
  10. The apparatus according to claim 8, wherein the obtaining module comprises:
    a coordinate value determination sub-module for respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system;
    a cube generation sub-module for generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
    a voxel data obtaining sub-module for marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
  11. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    collecting human body point cloud data of a current user, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    extracting face point cloud data from the human body point cloud data;
    obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
    extracting a plurality of feature points from the voxel data by using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
    identifying whether the current user is a target user based on the distances between the feature points.
  12. The terminal device according to claim 11, wherein the processor further implements the following steps when executing the computer-readable instructions:
    identifying sparse relationships between the data points in the human body point cloud data by using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
    calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
    extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
  13. The terminal device according to claim 11, wherein the processor further implements the following steps when executing the computer-readable instructions:
    respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system;
    generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
    marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
  14. The terminal device according to claim 13, wherein the processor further implements the following step when executing the computer-readable instructions:
    mapping the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
  15. The terminal device according to claim 13, wherein the distance between the feature points is the Euclidean distance between the feature points, and the processor further implements the following steps when executing the computer-readable instructions:
    extracting a plurality of target feature points from the face of the target user entered in advance, and calculating the Euclidean distances between the target feature points;
    determining the correspondence between the feature points and the target feature points;
    judging whether the error between the Euclidean distances between the feature points and the Euclidean distances between the corresponding target feature points is less than a preset value;
    if so, identifying the current user as the target user;
    if not, determining that the current user is not the target user.
  16. A computer non-volatile readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    collecting human body point cloud data of a current user, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    extracting face point cloud data from the human body point cloud data;
    obtaining voxel data in the face point cloud data according to the coordinate value of each data point in the face point cloud data;
    extracting a plurality of feature points from the voxel data by using a preset hierarchical learning network model for three-dimensional spatial information, and calculating the distance between the feature points;
    identifying whether the current user is a target user based on the distances between the feature points.
  17. The computer non-volatile readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by a processor, further implement the following steps:
    identifying sparse relationships between the data points in the human body point cloud data by using a preset three-dimensional point cloud network model, the three-dimensional point cloud network model being obtained by training on multiple pieces of face point cloud sample data;
    calculating the similarity between the sparse relationships between the data points in the human body point cloud data and the sparse relationships between the data points in the face point cloud sample data;
    extracting the data points whose similarity exceeds a preset threshold as the face point cloud data.
  18. The computer non-volatile readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by a processor, further implement the following steps:
    respectively determining the maximum and minimum coordinate values of the data points of the face point cloud data on the x, y, and z axes of a preset three-dimensional coordinate system;
    generating, according to the maximum and minimum coordinate values, the smallest cube containing all the data points of the face point cloud data;
    marking all the data points in the smallest cube to obtain the voxel data of the face point cloud data.
  19. The computer non-volatile readable storage medium according to claim 18, wherein the computer-readable instructions, when executed by a processor, further implement the following step:
    mapping the voxel data to a three-dimensional space of a specific size as the input data of the hierarchical learning network model for three-dimensional spatial information.
  20. The computer non-volatile readable storage medium according to claim 18, wherein the distance between the feature points is the Euclidean distance between the feature points, and the computer-readable instructions, when executed by a processor, further implement the following steps:
    extracting a plurality of target feature points from the face of the target user entered in advance, and calculating the Euclidean distances between the target feature points;
    determining the correspondence between the feature points and the target feature points;
    judging whether the error between the Euclidean distances between the feature points and the Euclidean distances between the corresponding target feature points is less than a preset value;
    if so, identifying the current user as the target user;
    if not, determining that the current user is not the target user.
PCT/CN2019/117184 2019-09-18 2019-11-11 Face recognition method, apparatus and terminal device WO2021051539A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910882001.6 2019-09-18
CN201910882001.6A CN110728196B (zh) 2019-09-18 2019-09-18 Face recognition method, apparatus and terminal device

Publications (1)

Publication Number Publication Date
WO2021051539A1 (zh) 2021-03-25

Family

ID=69219179

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117184 WO2021051539A1 (zh) 2019-09-18 2019-11-11 Face recognition method, apparatus and terminal device

Country Status (2)

Country Link
CN (1) CN110728196B (zh)
WO (1) WO2021051539A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344029A (zh) * 2021-05-10 2021-09-03 深圳瀚维智能医疗科技有限公司 Human body point cloud extraction method, electronic device, and readable storage medium
CN113506227A (zh) * 2021-07-08 2021-10-15 江苏省地质测绘院 Method and system for avoiding invalid collection of correction points in vehicle-mounted point cloud data
CN113657903A (zh) * 2021-08-16 2021-11-16 支付宝(杭州)信息技术有限公司 Face-scanning payment method and apparatus, electronic device, and storage medium
CN114842543A (zh) * 2022-06-01 2022-08-02 华南师范大学 Three-dimensional face recognition method and apparatus, electronic device, and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652086B (zh) * 2020-05-15 2022-12-30 汉王科技股份有限公司 Face liveness detection method and apparatus, electronic device, and storage medium
CN112000940B (zh) * 2020-09-11 2022-07-12 支付宝(杭州)信息技术有限公司 User identification method, apparatus, and device under privacy protection
CN112200056B (zh) * 2020-09-30 2023-04-18 汉王科技股份有限公司 Face liveness detection method and apparatus, electronic device, and storage medium
CN113920282B (zh) * 2021-11-15 2022-11-04 广州博冠信息科技有限公司 Image processing method and apparatus, computer-readable storage medium, and electronic device
CN114155557B (zh) * 2021-12-07 2022-12-23 美的集团(上海)有限公司 Positioning method and apparatus, robot, and computer-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8526677B1 (en) * 2012-07-16 2013-09-03 Google Inc. Stereoscopic camera with haptic feedback for object and location detection
CN104091162A (zh) * 2014-07-17 2014-10-08 东南大学 Three-dimensional face recognition method based on feature points
CN105956582A (zh) * 2016-06-24 2016-09-21 深圳市唯特视科技有限公司 Face recognition system based on three-dimensional data
CN108549873A (zh) * 2018-04-19 2018-09-18 北京华捷艾米科技有限公司 Three-dimensional face recognition method and three-dimensional face recognition system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100828412B1 (ko) * 2006-11-06 2008-05-09 연세대학교 산학협력단 Three-dimensional face recognition method using multi-point signals
CN106127250A (zh) * 2016-06-24 2016-11-16 深圳市唯特视科技有限公司 Face quality assessment method based on three-dimensional point cloud data
CN109670487A (zh) * 2019-01-30 2019-04-23 汉王科技股份有限公司 Face recognition method, apparatus, and electronic device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344029A (zh) * 2021-05-10 2021-09-03 深圳瀚维智能医疗科技有限公司 Human body point cloud extraction method, electronic device, and readable storage medium
CN113344029B (zh) 2021-05-10 2024-04-05 深圳瀚维智能医疗科技有限公司 Human body point cloud extraction method, electronic device, and readable storage medium
CN113506227A (zh) 2021-07-08 2021-10-15 江苏省地质测绘院 Method and system for avoiding invalid collection of correction points in vehicle-mounted point cloud data
CN113657903A (zh) 2021-08-16 2021-11-16 支付宝(杭州)信息技术有限公司 Face-scanning payment method and apparatus, electronic device, and storage medium
CN114842543A (zh) 2022-06-01 2022-08-02 华南师范大学 Three-dimensional face recognition method and apparatus, electronic device, and storage medium
CN114842543B (zh) 2022-06-01 2024-05-28 华南师范大学 Three-dimensional face recognition method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN110728196B (zh) 2024-04-05
CN110728196A (zh) 2020-01-24

Similar Documents

Publication Publication Date Title
WO2021051539A1 (zh) 2021-03-25 Face recognition method, apparatus and terminal device
US10699103B2 (en) Living body detecting method and apparatus, device and storage medium
WO2020000908A1 (zh) Face liveness detection method and apparatus
WO2021139324A1 (zh) Image recognition method and apparatus, computer-readable storage medium, and electronic device
CN110569756B (zh) Face recognition model construction method, recognition method, device, and storage medium
WO2017219391A1 (zh) Face recognition system based on three-dimensional data
TW202006602A (zh) Three-dimensional face liveness detection method, face authentication and recognition method, and apparatus
US10650260B2 (en) Perspective distortion characteristic based facial image authentication method and storage and processing device thereof
WO2015149534A1 (zh) Face recognition method and apparatus based on Gabor binary patterns
JP2017506379A (ja) System and method for identifying faces in unconstrained media
WO2020258120A1 (zh) Face recognition method and apparatus, and electronic device
WO2019200702A1 (zh) Descreening system training method, descreening method, apparatus, device, and medium
Islam et al. A review of recent advances in 3D ear-and expression-invariant face biometrics
WO2021196721A1 (zh) Cabin interior environment adjustment method and apparatus
CN111091075A (zh) Face recognition method and apparatus, electronic device, and storage medium
CN112101208A (zh) Gesture recognition method and apparatus for the elderly based on serial feature fusion
CN108875549B (zh) Image recognition method, apparatus, system, and computer storage medium
CN111783629A (zh) Face liveness detection method and apparatus against adversarial example attacks
CN113298158A (zh) Data detection method, apparatus, device, and storage medium
CN112686191A (zh) Liveness anti-spoofing method, system, terminal, and medium based on three-dimensional face information
Manh et al. Small object segmentation based on visual saliency in natural images
Mangla et al. Sketch-based facial recognition: a weighted component-based approach (WCBA)
JP6003367B2 (ja) Image recognition device, image recognition method, and image recognition program
WO2021051538A1 (zh) Face detection method, apparatus and terminal device
Mr et al. Developing a novel technique to match composite sketches with images captured by unmanned aerial vehicle

Legal Events

Date Code Title Description

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19945693; Country of ref document: EP; Kind code of ref document: A1)

NENP Non-entry into the national phase (Ref country code: DE)

122 Ep: pct application non-entry in european phase (Ref document number: 19945693; Country of ref document: EP; Kind code of ref document: A1)