CN112651380A - Face recognition method, face recognition device, terminal equipment and storage medium - Google Patents
- Publication number
- CN112651380A (application number CN202110042337.9A)
- Authority
- CN
- China
- Prior art keywords
- face
- point cloud
- target
- cloud data
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
Abstract
The application is applicable to the technical field of image processing, and provides a face recognition method, a face recognition device, a terminal device and a storage medium, wherein the face recognition method comprises the following steps: acquiring an infrared image and three-dimensional point cloud data of a face to be recognized; determining the target infrared characteristics of the face to be recognized according to the infrared image; determining target three-dimensional point cloud characteristics of the face to be recognized according to the three-dimensional point cloud data; fusing the target infrared features and the target three-dimensional point cloud features to obtain fused features; and carrying out face recognition according to the fusion characteristics to obtain a face recognition result. The method and the device can reduce the influence of the external environment on the face recognition and improve the accuracy of the face recognition.
Description
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to a face recognition method, a face recognition apparatus, a terminal device, and a storage medium.
Background
With the development of social informatization, biometric-based individual identification technology has been increasingly widely applied. Among such technologies, face recognition offers clear advantages in simplicity, contactless operation and privacy, and therefore occupies an important position in biometric-based individual identification. However, traditional face recognition is easily affected by changes in the external environment (such as illumination, pose and expression), which greatly reduces its recognition accuracy. How to improve the recognition accuracy of face recognition under the influence of the external environment has therefore become an important problem to be solved urgently.
Disclosure of Invention
The embodiment of the application provides a face recognition method, a face recognition device, a terminal device and a storage medium, which can reduce the influence of an external environment on face recognition and improve the accuracy of face recognition.
In a first aspect, an embodiment of the present application provides a face recognition method, where the face recognition method includes:
acquiring an infrared image and three-dimensional point cloud data of a face to be recognized;
determining the target infrared characteristics of the face to be recognized according to the infrared image;
determining target three-dimensional point cloud characteristics of the face to be recognized according to the three-dimensional point cloud data;
fusing the target infrared features and the target three-dimensional point cloud features to obtain fused features;
and carrying out face recognition according to the fusion characteristics to obtain a face recognition result.
In a second aspect, an embodiment of the present application provides a face recognition apparatus, where the face recognition apparatus includes:
the acquisition module is used for acquiring an infrared image and three-dimensional point cloud data of a face to be recognized;
the first determining module is used for determining the target infrared characteristics of the face to be recognized according to the infrared image;
the second determination module is used for determining the target three-dimensional point cloud characteristics of the face to be recognized according to the three-dimensional point cloud data;
the fusion module is used for fusing the target infrared characteristic and the target three-dimensional point cloud characteristic to obtain a fusion characteristic;
and the recognition module is used for carrying out face recognition according to the fusion characteristics to obtain a face recognition result.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the face recognition method according to the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the face recognition method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product, which, when running on a terminal device, causes the terminal device to execute the steps of the face recognition method according to any one of the above first aspects.
Compared with the prior art, the embodiment of the application has the advantages that: the method comprises the steps of determining target infrared characteristics of a face to be recognized by acquiring an infrared image of the face to be recognized, wherein the target infrared characteristics of the face can be extracted from a complex background and interference according to the acquired face infrared image; and determining the target three-dimensional point cloud characteristics of the face to be recognized according to the acquired three-dimensional point cloud data, fusing the target infrared characteristics and the target three-dimensional point cloud characteristics to obtain fused characteristics as the three-dimensional point cloud characteristics fully contain the shape and texture characteristics of the face, and recognizing the face according to the fused characteristics, so that the influence of the external environment on the face recognition can be reduced, and the accuracy of the face recognition is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a face recognition method according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a face recognition method according to a second embodiment of the present application;
fig. 3 is a schematic structural diagram of a face recognition apparatus according to a third embodiment of the present application;
fig. 4 is a schematic structural diagram of a terminal device according to a fourth embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The face recognition method provided by the embodiment of the application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, notebook computers, ultra-mobile personal computers (UMPCs), netbooks and Personal Digital Assistants (PDAs), and the specific type of the terminal device is not limited at all.
By way of example and not limitation, when the terminal device is a wearable device, the wearable device may be a general term for devices designed for daily wear and developed by applying wearable technology. A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not only a piece of hardware; it also realizes powerful functions through software support, data interaction and cloud interaction. In a broad sense, wearable intelligent devices include full-featured, larger-sized devices that can realize complete or partial functions without relying on a smartphone, such as smart watches or smart glasses.
It should be understood that, the sequence numbers of the steps in this embodiment do not mean the execution sequence, and the execution sequence of each process should be determined by the function and the inherent logic of the process, and should not constitute any limitation to the implementation process of the embodiment of the present application.
In order to explain the technical solution of the present application, the following description is given by way of specific examples.
Referring to fig. 1, a schematic flow chart of a face recognition method provided in an embodiment of the present application is shown, where the face recognition method is applied to a terminal device, and as shown in the drawing, the face recognition method may include the following steps:
Step 101, acquiring an infrared image and three-dimensional point cloud data of a face to be recognized.

In the embodiment of the application, the infrared image and the three-dimensional point cloud data of the face to be recognized can be obtained by the same 3D camera, and the time interval between acquiring the infrared image and acquiring the three-dimensional point cloud data is less than a preset duration, so that the acquired infrared image and three-dimensional point cloud data are guaranteed to belong to the same face.
Specifically, the 3D camera may be a camera composed of an infrared camera, a general camera (for example, a camera with basic functions such as video recording and still image capture), an infrared laser transmitter and an image processing chip. The infrared camera can be used to acquire the infrared image of the face to be recognized, which contains the two-dimensional coordinate information of the face to be recognized. The infrared laser transmitter then acquires the depth information, namely the depth coordinate information, of the face to be recognized, and the two-dimensional coordinate information and the depth coordinate information of the face to be recognized are sent to the image processing chip to obtain the three-dimensional data information, namely the three-dimensional point cloud data, of the face to be recognized.
In specific implementation, the infrared image and the three-dimensional point cloud data of the face to be recognized can be acquired from face information acquired by terminal equipment with a 3D camera, and can also be acquired from face information sent by other equipment. The face information includes, but is not limited to, captured or received images and video streams.
Illustratively, when the infrared image of the face to be recognized is acquired by a terminal device with a 3D camera, the image processing chip may perform face detection on the acquired image by using a face detection algorithm. When the image is detected to contain a face, the image is output and determined as the infrared image of the face to be recognized; the infrared laser transmitter then acquires the depth information, namely the depth coordinate information, of the point cloud data in the infrared image, and the two-dimensional coordinate information and the depth coordinate information extracted from the infrared image are sent to the image processing chip to obtain the point cloud data information (namely the three-dimensional point cloud data) of the face to be recognized. In addition, the infrared image and the three-dimensional point cloud data of the face to be recognized can be obtained from a video stream collected by a terminal device with a 3D camera: the face detection algorithm is used to perform face detection on the collected video stream; when a video frame in the video stream is detected to contain a face, the video frame is output and determined as the infrared image of the face to be recognized; after the face is detected in the video frame, the same operation as the above acquisition of the point cloud depth coordinate information is performed on the video frame, thereby obtaining the three-dimensional point cloud data of the face to be recognized.
Illustratively, when the infrared image and the three-dimensional point cloud data of the face to be recognized are acquired from face information sent by other devices, the image sent by the other device is first received, and the image processing chip in the terminal device of the present application performs face detection on the image (a face detection algorithm may be used). When the image contains a face, the image is output and determined as the infrared image of the face to be recognized; the depth information, namely the depth coordinate information, of the point cloud data in the infrared image is then acquired from the face information sent by the other device, and the two-dimensional coordinate information and the depth coordinate information extracted from the infrared image are sent to the image processing chip in the terminal device of the present application to obtain the point cloud data information, namely the three-dimensional point cloud data, of the face to be recognized. In addition, the infrared image and the three-dimensional point cloud data of the face to be recognized can also be obtained from a video stream sent by another device: the video stream is first received, then the image processing chip in the terminal device of the present application performs face detection on the video stream; when a video frame in the video stream is detected to contain a face, the video frame is output and determined as the infrared image of the face to be recognized; after the face is detected in the video frame, the same operation as the above acquisition of the point cloud depth coordinate information is performed on the video frame, thereby obtaining the three-dimensional point cloud data of the face to be recognized.
It should be noted that the format of the obtained video stream may be the Audio Video Interleave format (AVI), a video player format, the Matroska multimedia container format (MKV), or the like; the format of the obtained video stream is not limited in this application.
It should be further noted that, the above-mentioned obtaining the depth information of the point cloud data in the infrared image from the face information sent by other devices means that an infrared laser transmitter provided in other devices is used to obtain the depth information of the point cloud data in the infrared image, and the depth information is sent to the terminal device of the present application.
In this embodiment of the present application, the face detection algorithm may be a feature-based face detection algorithm, which may specifically be: extracting features from the face information acquired by the mobile terminal, matching the extracted features with a template image of a face, and judging whether a face exists by using a classifier to obtain a judgment result. The features extracted from the face information are predefined features (for example, deep learning features), and the matching is performed on these extracted features.
It should be noted that the face detection algorithm adopted in the foregoing may be a feature-based face detection algorithm, and may also be an image-based face detection algorithm, and the face detection algorithm in the present application includes, but is not limited to, the above detection algorithms.
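As an illustration of the feature-based detection step described above, the following sketch uses an OpenCV Haar-cascade detector as a stand-in; the patent does not name a concrete detector, so the cascade file and the detection parameters are illustrative assumptions.

```python
import cv2

# Illustrative sketch only: a Haar-cascade classifier stands in for the
# feature-based face detection algorithm described above.
def detect_face(infrared_image_gray):
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(infrared_image_gray,
                                     scaleFactor=1.1, minNeighbors=5)
    # Return the image only when at least one face is detected, mirroring
    # "output the image when it is detected to contain a face".
    return infrared_image_gray if len(faces) > 0 else None
```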
And step 102, determining target infrared characteristics of the face to be recognized according to the infrared image.
The infrared image is an infrared image containing a face to be recognized, and the target infrared features of the face to be recognized can be face feature points obtained by using a face feature point positioning algorithm. The present application is not limited to the type and number of face feature points, and may be, for example, 5 face feature points such as the left eye, the right eye, the tip of the nose, the right corner of the mouth, and the left corner of the mouth of a human face.
Optionally, determining the target infrared features of the face to be recognized according to the infrared image includes:
determining a face image in the infrared image, wherein the face image refers to an image of a face area in the infrared image;
acquiring human face characteristic points in a human face image;
and determining the infrared characteristics of the target according to the face characteristic points.
In the embodiment of the present application, determining a face image in an infrared image means determining a face region in the infrared image, and then determining an image in the face region. The human face area can be framed in the infrared image in a human face frame mode. The face characteristic points in the face image can be obtained by using a face characteristic point positioning algorithm, and the target infrared characteristics of the face image are determined through the face characteristic points.
In a specific implementation, determining the face image in the infrared image may specifically be: determining the range of the face region in the infrared image through an image enhancement algorithm, so that the face image can be selected within that range, thereby determining the face image in the infrared image. The image enhancement algorithm may be: first converting the acquired infrared image containing the face to be recognized into a gray-scale image, that is, converting the gray value of each pixel in the infrared image into the range [0, 255]. Then, the pixels with the largest gray values in the infrared image are extracted (for example, the brightest 5% of pixels in the infrared image may be extracted) and linearly scaled so that their average gray value reaches 255, thereby obtaining a gray-scale image of the face region in the infrared image. The gray-scale image of the face region can be eroded first, and the eroded gray-scale image is then dilated, so that noise in the gray-scale image can be removed; the denoised face image is then framed, thereby determining the face image in the infrared image.
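A minimal sketch of this image-enhancement step, assuming OpenCV/NumPy; the 95th-percentile cut-off, the 3 × 3 morphological kernel and the mask threshold are illustrative assumptions, since the description does not fix these parameters.

```python
import cv2
import numpy as np

def enhance_and_locate_face(ir_image):
    # Convert to an 8-bit gray image so every pixel lies in [0, 255].
    gray = cv2.normalize(ir_image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Take the brightest 5% of pixels and linearly scale the image so that
    # their mean gray value reaches 255 (the face is the brightest region).
    cutoff = np.percentile(gray, 95)
    bright_mean = gray[gray >= cutoff].mean()
    scaled = np.clip(gray.astype(np.float32) * (255.0 / max(bright_mean, 1.0)),
                     0, 255).astype(np.uint8)

    # Erode and then dilate (morphological opening) to remove small noise.
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.dilate(cv2.erode(scaled, kernel), kernel)

    # Frame-select the face region with a bounding box over the bright mask
    # (assumes at least some pixels exceed the illustrative threshold of 200).
    ys, xs = np.where(mask > 200)
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    return scaled[y0:y1 + 1, x0:x1 + 1]
```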
In a specific implementation, the feature point location algorithm may be a face feature point location algorithm based on an Active Shape Model (ASM). The ASM needs to be trained before use. Training the ASM may specifically be: first, an ASM of the human face is constructed, N face sample images are selected as a training sample set, and M feature points are manually marked in each face sample image, where the M feature points include features on the eyebrows, eyes, nose, mouth and face contour boundary. The coordinates of the feature points marked in each face sample image of the training sample set are concatenated into a feature vector, so that a feature vector group of the marked feature points over the whole training sample set is obtained, and the face sample images in the training sample set are normalized and aligned. Finally, principal component analysis is performed on the aligned features to obtain the trained ASM network, and the face image in the infrared image is input into this ASM network to obtain the face feature points in the face image. In order to reduce the amount of image calculation, 5 feature points are selected as the manually marked feature points in the sample images in the present application, namely the left eye, the right eye, the nose tip, the right mouth corner and the left mouth corner of the human face.
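A simplified sketch of building the ASM shape model from the manually marked sample landmarks; the alignment here only removes translation and scale (no Procrustes rotation), and the number of retained variation modes is an assumption.

```python
import numpy as np

def train_asm_shape_model(landmark_sets, num_modes=4):
    """landmark_sets: array of shape (N, M, 2) -- N training faces,
    M manually marked feature points per face (M = 5 in this application)."""
    pts = np.asarray(landmark_sets, dtype=np.float64)
    pts -= pts.mean(axis=1, keepdims=True)                    # remove translation
    pts /= np.linalg.norm(pts, axis=(1, 2), keepdims=True)    # remove scale

    # Concatenate each sample's marked coordinates into one feature vector.
    shapes = pts.reshape(len(pts), -1)

    # Principal component analysis on the aligned shape vectors.
    mean_shape = shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(shapes - mean_shape, full_matrices=False)
    return mean_shape, vt[:num_modes]   # mean shape plus the main variation modes
```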
It should be noted that, in the present application, the feature point location algorithm may also adopt a face feature point location algorithm based on a moving appearance model or based on deep learning, and the present application does not limit the face feature point location method.
Optionally, determining the target infrared features according to the face feature points comprises: and determining the face characteristic points as target infrared characteristics.
In the embodiment of the application, the face characteristic points are determined to be target infrared characteristics, wherein the face characteristic points are obtained according to a characteristic point positioning algorithm, and the target infrared characteristics are infrared characteristics of a face to be recognized obtained according to an infrared image. Because the human face has texture features and non-rigid shape features, different human faces have different position information contained in human face feature points determined according to a feature point positioning algorithm, for example, a human face image determined according to an infrared image is a two-dimensional image, and the position information contained in the human face feature points is a two-dimensional coordinate data point. And the face characteristic points obtained according to the characteristic point positioning algorithm are characteristic points with face characteristic information, and the face characteristic points can be directly used as infrared characteristics of the face to be recognized, namely target infrared characteristics.
And 103, determining target three-dimensional point cloud characteristics of the face to be recognized according to the three-dimensional point cloud data.
In the embodiment of the application, the three-dimensional point cloud data can be three-dimensional coordinates containing face feature information, the extraction of the face feature points cannot be directly performed due to the disorder of the three-dimensional point cloud data in the space, the normalized three-dimensional point cloud data can be obtained by preprocessing the three-dimensional point cloud data, and the target three-dimensional point cloud feature of the face to be recognized is determined according to the normalized three-dimensional point cloud data. The three-dimensional point cloud data refers to 3D point cloud data. The standard three-dimensional point cloud data is obtained by mapping unordered point cloud data to a reference face model to obtain three-dimensional point cloud data corresponding to the reference face model.
In a specific implementation, the three-dimensional point cloud data is preprocessed through denoising, reconstructing and registering. Denoising can be to remove some outlier point clouds which obviously do not belong to human face features; the reconstruction can be that the denoised three-dimensional point cloud data is input into a trained three-dimensional face reconstruction network to obtain reconstructed three-dimensional point cloud data; the registration may be performed on the reconstructed three-dimensional Point cloud data by using an Iterative Closest Point algorithm (ICP) to obtain the registered three-dimensional Point cloud data. And performing template face comparison on the registered three-dimensional point cloud data to obtain a compared face characteristic, and determining the obtained compared face characteristic as a target three-dimensional point cloud characteristic of the face to be recognized.
And 104, fusing the target infrared characteristic and the target three-dimensional point cloud characteristic to obtain a fusion characteristic.
In the embodiment of the application, the target infrared feature and the three-dimensional point cloud feature are feature vectors, and the fusion of the target infrared feature and the target three-dimensional point cloud feature may be a fusion of feature vectors of an infrared image and three-dimensional point cloud data, so as to obtain a fusion feature.
Optionally, fusing the target infrared feature and the target three-dimensional point cloud feature to obtain a fused feature, including: and inputting the target infrared characteristic and the target three-dimensional point cloud characteristic into the trained fusion network to obtain a fusion characteristic.
In the embodiment of the application, the target infrared characteristic and the target three-dimensional point cloud characteristic are both expressed in the form of a characteristic vector. Because three-dimensional point cloud data has disorder and sparsity, the three-dimensional point cloud data cannot be effectively processed by a traditional convolutional neural network, and due to the fact that a plurality of Multilayer perceptrons (MLPs) are used in a PointNet network structure, N-dimensional features of each point cloud data can be extracted. Therefore, the method and the device adopt the PointNet network to process the three-dimensional point cloud data, determine the three-dimensional point cloud data output by the PointNet network as target three-dimensional point cloud characteristics, input the target infrared characteristics and the target three-dimensional point cloud characteristics into a fusion network of a Convolutional Neural Network (CNN) and the PointNet network, and output the fusion characteristics as target infrared characteristics and target three-dimensional point cloud characteristics. Wherein, the fusion network adopts a loss function to carry out supervision training.
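The following PyTorch sketch illustrates one possible shape of such a CNN plus PointNet-style fusion network; the layer sizes, the 128-dimensional embedding and the single fusion layer are illustrative assumptions, not the network actually used in the application.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Minimal sketch: CNN branch for the infrared face image plus a
    PointNet-style shared-MLP branch for the 3-D point cloud, concatenated
    into one fusion feature."""
    def __init__(self, embed_dim=128):
        super().__init__()
        # CNN branch for the infrared face image (1 x 224 x 224).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim))
        # PointNet-style branch: shared MLP over each 3-D point, then max-pool.
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, embed_dim, 1), nn.ReLU())
        # Fuse the two embeddings into one fusion feature.
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, ir_image, point_cloud):
        f_ir = self.cnn(ir_image)                              # (B, embed_dim)
        f_pc = self.point_mlp(point_cloud).max(dim=2).values   # (B, embed_dim)
        return self.fuse(torch.cat([f_ir, f_pc], dim=1))       # fusion feature
```

As described above, such a network would be trained under the supervision of a loss function; the specific loss is not fixed here.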
Optionally, fusing the target infrared feature and the target three-dimensional point cloud feature, and obtaining a fusion feature further includes: and carrying out linear weighting on the target infrared characteristic and the target three-dimensional point cloud characteristic to obtain a fusion characteristic.
In the embodiment of the application, when the influence of the external environment on the face recognition is small, the proportion of the target infrared features in the fusion feature calculation is set to be high and the proportion of the target three-dimensional point cloud features is set to be low, which reduces the amount of three-dimensional image calculation, and the target infrared features then determine the final face recognition result. When the influence of the external environment is large, for example when the face to be recognized is viewed from a non-frontal angle, the face information contained in the acquired target infrared features is incomplete, whereas the acquired three-dimensional point cloud features contain more complete face information and are not affected by the viewing angle; the proportion of the target three-dimensional point cloud features in the fusion feature calculation is therefore set to be high, which reduces the influence of the external environment on the face recognition and improves the accuracy of the face recognition. Therefore, when the features are fused, different weights are given to the target infrared features and the target three-dimensional point cloud features, which ensures better adaptability of the recognition features.
In a specific implementation, the target infrared feature is given a weight α₁ and the target three-dimensional point cloud feature is given a weight α₂, with α₁ + α₂ = 1.
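A one-line sketch of the linear weighting, assuming the fusion feature is the weighted sum α₁·f_ir + α₂·f_pc of two equally sized feature vectors; the default weight value is purely illustrative.

```python
import numpy as np

def weighted_fusion(f_ir, f_pc, alpha1=0.7):
    """Linearly weight the target infrared feature f_ir and the target
    three-dimensional point cloud feature f_pc; alpha1 + alpha2 = 1.
    The value 0.7 is only an illustrative default for a well-lit frontal face."""
    alpha2 = 1.0 - alpha1
    return alpha1 * np.asarray(f_ir) + alpha2 * np.asarray(f_pc)
```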
And 105, carrying out face recognition according to the fusion features to obtain a face recognition result.
In the embodiment of the application, the face recognition according to the fusion features may refer to inputting the fusion feature vector of the face to be recognized into a face database, comparing the fusion feature vector with the face features in the face database to obtain the similarity between the face to be recognized and each face in the face database, and returning a face recognition result if the similarity is greater than a preset threshold; and if the similarity is smaller than the preset threshold value, the face recognition fails. The human face features in the human face database are fusion features of infrared features and three-dimensional point cloud features.
Specifically, the face recognition result may refer to that the face matching is successful or the face recognition is successful.
Illustratively, when a mobile phone is unlocked by face, the fusion features of the face to be recognized are acquired and compared with the fusion features of the preset face stored in the mobile phone to obtain the similarity between the face to be recognized and the preset face. If the similarity is greater than the preset threshold, the face recognition is successful, a command indicating successful face recognition is returned, and the mobile phone is unlocked.
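A sketch of the comparison against the face database, assuming cosine similarity and an illustrative threshold of 0.8; the description only requires some similarity measure and a preset threshold.

```python
import numpy as np

def recognize(fused_feature, face_database, threshold=0.8):
    """face_database: dict mapping name -> stored fusion feature vector."""
    query = fused_feature / np.linalg.norm(fused_feature)
    best_name, best_sim = None, -1.0
    for name, feat in face_database.items():
        sim = float(query @ (feat / np.linalg.norm(feat)))
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim > threshold:
        return best_name          # face recognition succeeded
    return None                   # face recognition failed
```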
In this embodiment, the target infrared features of the face to be recognized are determined from the acquired infrared image, and the target three-dimensional point cloud features of the face to be recognized are determined from the acquired three-dimensional point cloud data. Because the three-dimensional point cloud features fully contain the shape and texture characteristics of the face and are not affected by the external environment, the target infrared features and the target three-dimensional point cloud features are fused to obtain the fusion features, and face recognition is performed according to the fusion features, so that the influence of the external environment on face recognition can be reduced and the accuracy of face recognition improved.
Referring to fig. 2, a schematic flow chart of a face recognition method provided in the second embodiment of the present application is shown, where the face recognition method is applied to a terminal device, and as shown in the drawing, the face recognition method may include the following steps:
Step 201 of this embodiment is similar to step 101 of the previous embodiment, and reference may be made to this embodiment, which is not described herein again.
In the embodiment of the application, in order to improve the accuracy of the target infrared features, a face image including face feature points may be input into a trained convolutional neural network, and when the convolutional neural network converges or the positions of the face feature points do not change any more, the face feature points at this time are acquired as the target infrared features.
Optionally, determining the target infrared characteristics according to the face characteristic points further comprises: and inputting the face image into the trained convolutional neural network to obtain the target infrared features.
In the embodiment of the present application, the convolutional neural network may be a residual network (ResNet), obtained by adding residual learning to a conventional convolutional neural network. Inputting the face image into the convolutional neural network may refer to inputting the face image into a ResNet structure, which processes the face feature points in the face image and outputs precisely positioned face feature points, thereby obtaining the target infrared features. The ResNet structure may adopt a ResNet-101 structure, where 101 refers to the depth of the network, that is, the number of layers whose parameters need to be updated through training.
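A sketch of using a ResNet-101 backbone for this refinement step, based on the torchvision implementation; adapting the first convolution to a single-channel infrared input and regressing 2 × 5 landmark coordinates are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet101()                    # ResNet-101: 101 trainable layers
# Assumed adaptation: single-channel infrared input instead of RGB.
backbone.conv1 = nn.Conv2d(1, 64, 7, stride=2, padding=3, bias=False)
# Assumed head: regress (x, y) for the 5 feature points used in this application.
backbone.fc = nn.Linear(backbone.fc.in_features, 2 * 5)

face_image = torch.randn(1, 1, 224, 224)         # corrected 224 x 224 face image
landmarks = backbone(face_image).view(-1, 5, 2)  # refined feature point coordinates
```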
It should be noted that the trained convolutional neural network may use an npair loss function to perform supervised training, or may use other loss functions to perform supervised training, and the loss function performing the supervised training is not limited in the present application.
Optionally, before inputting the face image into the trained convolutional neural network, the method further includes:
carrying out face correction on the face image according to the face characteristic points to obtain a corrected face image;
inputting the face image into a trained convolutional neural network to obtain target infrared features, wherein the target infrared features comprise:
and inputting the corrected face image into the trained convolutional neural network to obtain the target infrared features.
In the embodiment of the application, because the face image in the acquired infrared image usually contains a non-positive face image, the face image is subjected to face correction before being input into the trained convolutional neural network, so that the accuracy of face feature point positioning can be guaranteed.
Specifically, the face correction of the face image according to the face feature points may be performed by cropping, according to the positions of the face feature points, the minimum-size face image that contains all the face feature points, and slightly rotating and aligning this minimum-size face image to obtain the corrected face image. The size of the corrected face image may be 224 × 224; the corrected face image contains the face feature points acquired by the feature point positioning described above, and its size is smaller than that of the face image obtained from the infrared image. Because the corrected face image is the minimum-size face image containing the face feature points, the amount of image calculation can be reduced.
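A sketch of this correction step, assuming the slight rotation and alignment is implemented as an eye-alignment rotation followed by a minimum crop around the feature points and a resize to 224 × 224; the specific alignment method is not prescribed by the description.

```python
import cv2
import numpy as np

def correct_face(face_image, landmarks):
    """landmarks: 5 x 2 array ordered (left eye, right eye, nose tip,
    right mouth corner, left mouth corner) -- an assumed ordering."""
    (lx, ly), (rx, ry) = landmarks[0], landmarks[1]
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))     # make the eyes horizontal
    cx, cy = landmarks.mean(axis=0)
    rot = cv2.getRotationMatrix2D((float(cx), float(cy)), float(angle), 1.0)
    rotated = cv2.warpAffine(face_image, rot, face_image.shape[1::-1])

    # Crop the minimum-size region that still contains all feature points.
    pts = cv2.transform(landmarks[None].astype(np.float32), rot)[0]
    x0, y0 = pts.min(axis=0).astype(int)
    x1, y1 = pts.max(axis=0).astype(int)
    crop = rotated[max(y0, 0):y1 + 1, max(x0, 0):x1 + 1]
    return cv2.resize(crop, (224, 224))                  # corrected face image
```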
Step 203, determining a face point cloud region in the three-dimensional point cloud data according to the face feature points.

In the embodiment of the application, the face feature points are obtained from the infrared image by using the feature point positioning algorithm, and the face point cloud region in the three-dimensional point cloud data is determined by taking the nose tip among the face feature points as the center and r as the radius. After the face point cloud region is obtained, the three-dimensional point cloud data in the face point cloud region is determined and preprocessed. Here, r may be the longest distance from the nose tip to the edge of the face region in the infrared image, or the longest distance from the nose tip to the face frame in the infrared image.
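A short sketch of selecting the face point cloud region around the nose tip, assuming the point cloud is stored as an (N, 3) array:

```python
import numpy as np

def crop_face_region(point_cloud, nose_tip, r):
    """point_cloud: (N, 3) array; nose_tip: (3,) coordinate of the nose tip;
    r: radius, e.g. the longest distance from the nose tip to the face-frame edge."""
    dist = np.linalg.norm(point_cloud - nose_tip, axis=1)
    return point_cloud[dist <= r]          # face point cloud region
```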
And 204, denoising the face point cloud data in the face point cloud area to obtain denoised face point cloud data.
In this embodiment of the application, the denoising processing performed on the face point cloud data in the face point cloud region may refer to performing bilateral filtering processing on the face point cloud data in the face point cloud region, so as to smooth noise in the face point cloud data while retaining image edge features.
Optionally, the bilateral filtering processing on the face point cloud data in the face point cloud region includes: acquiring a filtering center of the face point cloud data, wherein the filtering center is any point in the face point cloud data in the face point cloud area;
acquiring a neighborhood range of a filtering center, acquiring face point cloud data in the neighborhood range according to the neighborhood range of the filtering center, and calculating a numerical value after filtering the face point cloud data in the neighborhood range of the filtering center;
acquiring a pixel value of the face point cloud data of a filtering center and a pixel value of the face point cloud data in a neighborhood range of the filtering center, and calculating a value obtained after filtering the face point cloud data in the neighborhood range of the filtering center according to the acquired pixel values;
acquiring filtered face point cloud data according to the filtered numerical value of the face point cloud data in the neighborhood range of the filtering center and the pixel filtered numerical value of the face point cloud data in the neighborhood range of the filtering center, and determining the filtered face point cloud data as de-noised face point cloud data;
in the embodiment of the application, the neighborhood range of the filtering center is calculatedThe filtered values of the pixels of the face point cloud data in the filtering center may be distances between the pixel values of the face point cloud data in the neighborhood of the filtering center and the pixel values of the face point cloud data in the neighborhood of the filtering center, and luminance differences between the pixels. Wherein, the pixel value of the face point cloud data of the filtering center can be assumed as IpThe pixel value of the face point cloud data in the neighborhood range is IqAnd q points are within the neighborhood.
In a specific implementation, bilateral filtering is performed on the face point cloud data in the face point cloud region, and the filtered face point cloud data can be expressed as follows:

J_p = (1 / K_p) · Σ_{q∈Ω} f(‖p − q‖) · g(‖I_p − I_q‖) · I_q

where f(·) is a Gaussian filter centered on the point cloud point p (p is any point in the face point cloud data in the face point cloud region); g(·) is a filter centered on I_p; Ω is the size of the filter kernel, that is, the size of the neighborhood around the filtering center; K_p is the sum of the f(·)·g(·) filter weights, Σ_{q∈Ω} f(‖p − q‖) · g(‖I_p − I_q‖), which is used to normalize the filtered point cloud data; and J_p is the filtered point cloud. Bilateral filtering denoises the face point cloud data in the spatial domain and the pixel domain simultaneously, so that the edge features of the face point cloud region are preserved while the noise in the non-edge features is removed, thereby smoothing the non-edge features.
Step 205, inputting the denoised face point cloud data into a trained three-dimensional face reconstruction network to obtain reconstructed face point cloud data.

In the embodiment of the application, the three-dimensional face reconstruction network is obtained through training: three-dimensional point clouds at different face angles are collected on a large scale by a plurality of cameras, the point cloud images collected by the same camera from different viewing angles are fused by a dynamic fusion algorithm to obtain real three-dimensional face models under different cameras, and the large number of resulting three-dimensional face models are then used for supervised training to obtain the reconstruction network.
In the specific implementation, the denoised face point cloud data is input into a trained three-dimensional face reconstruction network, when the reconstruction network is converged, a three-dimensional face model output by the three-dimensional face reconstruction network is obtained, and the point cloud data extracted from the three-dimensional face model is determined as the reconstructed face point cloud data.
And step 206, registering the reconstructed face point cloud data to obtain registered face point cloud data.
In this embodiment of the present application, registering the reconstructed face point cloud data may refer to performing ICP registration on the reconstructed face point cloud data. In the application, 5 feature points of a human face are mainly used for ICP registration, a rotation parameter and a translation parameter of the human face are calculated, and then the reconstructed point cloud data are registered according to the rotation parameter and the translation parameter to obtain the registered human face point cloud data.
Optionally, performing ICP registration using 5 feature points of the face, and calculating a rotation parameter and a translation parameter of the face includes:
acquiring face point cloud data to be registered and reference face point cloud data, wherein the face point cloud data to be registered is reconstructed face point cloud data, the face point cloud data to be registered is located in a first coordinate system, and the reference face point cloud data is located in a second coordinate system;
acquiring reference face point cloud data corresponding to the face point cloud data to be registered in a first coordinate system in a second coordinate system;
and calculating a rotation parameter and a translation parameter between the face point cloud data to be registered and the reference face point cloud data according to the face point cloud data to be registered and the reference face point cloud data corresponding to the face point cloud data.
In the embodiment of the application, when the rotation parameter and the translation parameter between the face point cloud data to be registered and the reference face point cloud data are calculated, the distance between the face point cloud data to be registered and the reference face point cloud data can be set as a target function, and the variable value when the target function is the minimum is determined as the rotation parameter and the translation parameter between the face point cloud data to be registered and the reference face point cloud data.
In a specific implementation, the calculation of the rotation parameter and the translation parameter can be expressed as follows:

(R, t) = argmin_{R, t} Σ_i ‖ R · p_s^i + t − p_t^i ‖²

where p_s is the face point cloud data to be registered, p_t is the reference face point cloud data, p_s^i is the i-th point of the face point cloud data to be registered, p_t^i is the reference face point cloud point corresponding to the i-th point to be registered, R is the rotation parameter, t is the translation parameter, and argmin denotes taking the values of R and t that minimize the objective function. The rotation parameter and the translation parameter obtained in this way are then used to normalize the reconstructed face point cloud data.
In the present application, the number of feature points may not be limited to 5, and the feature points may not be limited to the left eye, the right eye, the tip of the nose, the right mouth corner, and the left mouth corner of the human face when the ICP alignment is performed, that is, the present application does not limit the types and the number of the feature points used in the ICP alignment.
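A sketch of solving the objective above in closed form with the SVD (Kabsch) method from the corresponding feature points (for example the 5 face landmarks); the patent does not prescribe a particular solver, so this is one standard choice.

```python
import numpy as np

def estimate_rigid_transform(src_pts, ref_pts):
    """src_pts, ref_pts: (N, 3) arrays of corresponding points.
    Returns the rotation R and translation t minimizing sum ||R*src + t - ref||^2."""
    src_c, ref_c = src_pts.mean(axis=0), ref_pts.mean(axis=0)
    h = (src_pts - src_c).T @ (ref_pts - ref_c)       # cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    r = vt.T @ u.T
    if np.linalg.det(r) < 0:                          # guard against a reflection
        vt[-1] *= -1
        r = vt.T @ u.T
    t = ref_c - r @ src_c
    return r, t                                       # rotation R and translation t
```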
In this embodiment of the application, normalizing the reconstructed face point cloud data by using the rotation parameter and the translation parameter may be:
registering the face point cloud data to be registered according to the rotation parameters and the translation parameters to obtain the face point cloud data in a third coordinate system;
if the average distance between the face point cloud data in the third coordinate system and the reference face point cloud data in the second coordinate system is greater than the preset distance, taking the face point cloud data in the third coordinate system as new face point cloud data to be registered, and continuing iterative computation;
and if the average distance between the face point cloud data in the third coordinate system and the reference face point cloud data in the second coordinate system is smaller than or equal to the preset distance, determining the face point cloud data in the third coordinate system as the registered face point cloud data, thereby realizing the normalization of the reconstructed face point cloud data.
It should be understood that the third coordinate system may be a world coordinate system, and the average distance between the face point cloud data in the third coordinate system and the reference point cloud data in the second coordinate system refers to an average value of distances between the face point cloud data in the third coordinate system and the reference point cloud data in the second coordinate system corresponding to the face point cloud data in the third coordinate system.
It should also be understood that the iterative computation refers to returning and executing registration of the face point cloud data to be registered according to the rotation parameter and the translation parameter, and acquiring the face point cloud data in the third coordinate system.
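A sketch of the iteration described above, reusing the estimate_rigid_transform sketch from the previous step; the preset distance, the iteration cap and the index-aligned correspondences are assumptions made for illustration.

```python
import numpy as np

def icp_register(src_pts, ref_pts, preset_distance=0.5, max_iter=50):
    """Iteratively register the face point cloud data to be registered (src_pts)
    against the reference face point cloud data (ref_pts)."""
    current = src_pts.copy()
    for _ in range(max_iter):
        r, t = estimate_rigid_transform(current, ref_pts)  # see the previous sketch
        current = current @ r.T + t      # face point cloud data in the third coordinate system
        mean_dist = np.linalg.norm(current - ref_pts, axis=1).mean()
        if mean_dist <= preset_distance:
            break                        # registration finished
    return current                       # registered face point cloud data
```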
And step 207, determining target three-dimensional point cloud characteristics according to the registered human face point cloud data.
In the embodiment of the application, in order to improve the accuracy of the target three-dimensional point cloud feature, the registered human face point cloud data is input into a trained PointNet network, and when the trained PointNet network converges, the human face point cloud data in the network convergence is determined as the target three-dimensional point cloud feature.
Optionally, before determining that the face point cloud data when the network converges is the target three-dimensional point cloud feature, the method further includes:
and carrying out normalization processing on the human face point cloud data during network convergence, and determining the human face point cloud data after the normalization processing as a target three-dimensional point cloud characteristic.
Specifically, the normalization processing of the face point cloud data during network convergence may be represented as follows:
p'_x = (2·p_x − minp_x − maxp_x) / (maxp_x − minp_x)

p'_y = (2·p_y − minp_y − maxp_y) / (maxp_y − minp_y)

p'_z = (2·p_z − minp_z − maxp_z) / (maxp_z − minp_z)

where (p_x, p_y, p_z) are the three-dimensional coordinates of the face point cloud data extracted at network convergence; minp_x, minp_y and minp_z are the minimum values of the x-, y- and z-axis coordinates of the face point cloud data; maxp_x, maxp_y and maxp_z are the corresponding maximum values; and (p'_x, p'_y, p'_z) are the normalized face point cloud data.
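A NumPy sketch of these normalization formulas, which scale each coordinate axis of the face point cloud into [−1, 1]:

```python
import numpy as np

def normalize_point_cloud(points):
    """points: (N, 3) array of face point cloud coordinates at network convergence."""
    p_min = points.min(axis=0)       # (minp_x, minp_y, minp_z)
    p_max = points.max(axis=0)       # (maxp_x, maxp_y, maxp_z)
    return (2 * points - p_min - p_max) / (p_max - p_min)
```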
It should be understood that, because the point cloud data of the human face has disorder and sparsity, the traditional convolutional neural network cannot effectively process the shape data of the point cloud, and the PointNet network structure uses a plurality of MLPs, which can realize feature extraction of each point cloud data by sharing weight convolution.
Optionally, determining the target three-dimensional point cloud feature according to the registered face point cloud data includes:
and determining the registered human face point cloud data as a target three-dimensional point cloud characteristic.
In the embodiment of the application, since the registered face point cloud data is the standard face three-dimensional point cloud data, in order to reduce the calculation amount of feature extraction, the registered face point cloud data can be directly determined as the target three-dimensional point cloud feature.
And 208, fusing the target infrared characteristic and the target three-dimensional point cloud characteristic to obtain a fused characteristic.
Step 208 of this embodiment is similar to step 104 of the previous embodiment, and reference may be made to this embodiment, which is not described herein again.
And step 209, performing face recognition according to the fusion features to obtain a face recognition result.
Step 209 of this embodiment is similar to step 105 of the previous embodiment, and may refer to each other, which is not described herein again.
In order to improve the accuracy of the acquired target infrared features and target three-dimensional point cloud features, the image containing the human face feature points is input into a trained convolutional neural network, and the human face feature points output by the convolutional neural network are determined to be the target infrared features; and meanwhile, inputting the face point cloud data obtained after registration into a trained PointNet network, and determining the face point cloud data output by the PointNet network as a target three-dimensional point cloud feature. The target infrared features and the target three-dimensional point cloud features are input into the trained fusion network, high-precision fusion features can be extracted for face recognition, and the technical scheme can improve the accuracy of the face recognition technology.
Referring to fig. 3, a schematic structural diagram of a face recognition apparatus provided in the third embodiment of the present application is shown, and for convenience of description, only parts related to the third embodiment of the present application are shown, where the face recognition apparatus may specifically include the following modules:
the acquisition module 301 is configured to acquire an infrared image and three-dimensional point cloud data of a face to be recognized;
a first determining module 302, configured to determine, according to the infrared image, a target infrared feature of the face to be recognized;
the second determining module 303 is configured to determine a target three-dimensional point cloud feature of the face to be recognized according to the three-dimensional point cloud data;
the fusion module 304 is configured to fuse the target infrared feature and the target three-dimensional point cloud feature to obtain a fusion feature;
and the recognition module 305 is configured to perform face recognition according to the fusion features to obtain a face recognition result.
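To make the data flow between these modules concrete, the following sketch wires five stand-in callables together in the order described above; all names and signatures are illustrative assumptions and do not reproduce the apparatus itself.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FaceRecognizer:
    """Illustrative wiring of the five modules described above; each callable
    stands in for the corresponding module (names are assumptions)."""
    acquire: callable           # -> (infrared_image, point_cloud)
    infrared_feature: callable  # infrared_image -> target infrared feature
    cloud_feature: callable     # point_cloud -> target 3D point cloud feature
    fuse: callable              # (ir_feat, pc_feat) -> fusion feature
    recognize: callable         # fusion feature -> recognition result

    def run(self):
        infrared_image, point_cloud = self.acquire()
        ir_feat = self.infrared_feature(infrared_image)
        pc_feat = self.cloud_feature(point_cloud)
        return self.recognize(self.fuse(ir_feat, pc_feat))

# Minimal usage with dummy stand-in callables.
recognizer = FaceRecognizer(
    acquire=lambda: (np.zeros((480, 640)), np.zeros((1024, 3))),
    infrared_feature=lambda img: np.zeros(128),
    cloud_feature=lambda pc: np.zeros(256),
    fuse=lambda a, b: np.concatenate([a, b]),
    recognize=lambda f: "unknown",
)
print(recognizer.run())
```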
In this embodiment of the application, the first determining module 302 may specifically include the following sub-modules:
the image determining submodule is used for determining a face image in the infrared image, wherein the face image refers to an image of the face region in the infrared image;
the feature acquisition submodule is used for acquiring face feature points in the face image;
and the feature determination submodule is used for determining the target infrared feature according to the face feature points.
Optionally, the feature determination sub-module may specifically include the following units:
the infrared determining unit is used for determining the face feature points as the target infrared feature;
and the image input unit is used for inputting the face image into the trained convolutional neural network to obtain the target infrared features.
Optionally, the feature determination sub-module further includes the following units:
the face correction unit is used for performing face correction on the face image according to the face feature points to obtain a corrected face image;
the image input unit is specifically configured to input the corrected face image into a trained convolutional neural network to obtain a target infrared feature.
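One common way to realize such face correction is to warp the face image so that the detected feature points land on canonical positions via a similarity transform. The sketch below assumes OpenCV is available and uses three illustrative landmarks (left eye, right eye, nose tip) and reference coordinates that are assumptions, not values taken from the embodiment.

```python
import cv2
import numpy as np

# Canonical positions (in a 112x112 crop) for left eye, right eye and nose tip;
# these reference coordinates are an assumption for illustration.
REFERENCE = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7]])

def correct_face(ir_image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """Warp the face image so that the detected feature points land on the
    canonical positions, i.e. face correction by a similarity transform."""
    matrix, _ = cv2.estimateAffinePartial2D(np.float32(landmarks), REFERENCE)
    return cv2.warpAffine(ir_image, matrix, (112, 112))

# Usage with a dummy infrared image and hand-picked landmark coordinates.
dummy = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
landmarks = np.float32([[250.0, 200.0], [330.0, 205.0], [290.0, 260.0]])
aligned = correct_face(dummy, landmarks)
print(aligned.shape)  # (112, 112)
```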
In this embodiment of the application, the second determining module 303 may specifically include the following sub-modules:
the region determining submodule is used for determining a face point cloud region in the three-dimensional point cloud data according to the face feature points;
the denoising submodule is used for denoising the face point cloud data in the face point cloud region to obtain denoised face point cloud data;
the reconstruction submodule is used for inputting the denoised face point cloud data into a trained three-dimensional face reconstruction network to obtain reconstructed face point cloud data;
the registration submodule is used for registering the reconstructed face point cloud data to obtain registered face point cloud data;
and the point cloud determining submodule is used for determining the target three-dimensional point cloud characteristics according to the registered face point cloud data.
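For illustration, the sketch below shows one plausible realization of the denoising and registration steps, using a simple statistical outlier filter and a Kabsch rigid alignment between corresponding points; the embodiment does not fix these particular algorithms, so the functions and parameters are assumptions.

```python
import numpy as np

def remove_outliers(points: np.ndarray, k: int = 8, std_ratio: float = 2.0) -> np.ndarray:
    """Statistical outlier removal: drop points whose mean distance to their
    k nearest neighbours is far above the average (a simple stand-in for the
    denoising step)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn_mean = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self-distance
    keep = knn_mean < knn_mean.mean() + std_ratio * knn_mean.std()
    return points[keep]

def rigid_align(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Kabsch alignment of `source` onto `target` given corresponding points
    (e.g. facial keypoints), returning the transformed source cloud."""
    src_c, tgt_c = source.mean(axis=0), target.mean(axis=0)
    u, _, vt = np.linalg.svd((source - src_c).T @ (target - tgt_c))
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (source - src_c) @ rotation.T + tgt_c

cloud = np.random.rand(500, 3)
cloud_denoised = remove_outliers(cloud)
aligned = rigid_align(cloud_denoised, cloud_denoised + np.array([0.1, 0.0, 0.0]))
```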
Optionally, the point cloud determining sub-module may specifically include the following units:
the feature determining unit is used for determining the registered face point cloud data as the target three-dimensional point cloud feature;
and the data input unit is used for inputting the registered face point cloud data into the trained PointNet network to obtain the target three-dimensional point cloud characteristics.
In this embodiment, the fusion module 304 may specifically include the following sub-modules:
the characteristic input sub-module is used for inputting the target infrared characteristic and the target three-dimensional point cloud characteristic into the trained fusion network to obtain a fusion characteristic;
and the linear weighting submodule is used for performing linear weighting on the target infrared feature and the target three-dimensional point cloud feature to obtain the fusion feature.
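A short sketch of the linear-weighting alternative is given below, assuming the two feature vectors have the same length and are first normalized to a common scale; the weight alpha is an assumed parameter, not a value specified by the embodiment.

```python
import numpy as np

def linear_weight_fusion(ir_feat: np.ndarray, pc_feat: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Linearly weight two same-length feature vectors into one fusion feature."""
    ir_feat = ir_feat / (np.linalg.norm(ir_feat) + 1e-12)  # put both on one scale
    pc_feat = pc_feat / (np.linalg.norm(pc_feat) + 1e-12)
    return alpha * ir_feat + (1.0 - alpha) * pc_feat

fused = linear_weight_fusion(np.random.rand(128), np.random.rand(128), alpha=0.6)
print(fused.shape)  # (128,)
```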
The face recognition apparatus provided in this embodiment of the present application can be applied to the foregoing method embodiments; for details, reference is made to the description of those method embodiments, which is not repeated here.
Fig. 4 is a schematic structural diagram of a terminal device according to a fourth embodiment of the present application. As shown in fig. 4, the terminal device 400 of this embodiment includes: at least one processor 410 (only one is shown in fig. 4), a memory 420, and a computer program 421 stored in the memory 420 and executable on the at least one processor 410; the processor 410 implements the steps of any of the above face recognition method embodiments when executing the computer program 421.
The terminal device 400 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or another computing device. The terminal device may include, but is not limited to, the processor 410 and the memory 420. Those skilled in the art will appreciate that fig. 4 is merely an example of the terminal device 400 and does not constitute a limitation of the terminal device 400, which may include more or fewer components than those shown, a combination of some components, or different components, such as an input-output device or a network access device.
The processor 410 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 420 may, in some embodiments, be an internal storage unit of the terminal device 400, such as a hard disk or memory of the terminal device 400. In other embodiments, the memory 420 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card provided on the terminal device 400. Further, the memory 420 may include both an internal storage unit and an external storage device of the terminal device 400. The memory 420 is used to store an operating system, application programs, a boot loader, data and other programs, such as the program code of the computer program. The memory 420 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods in the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
When the computer program product runs on a terminal device, the terminal device is enabled to implement the steps in the above method embodiments.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.
Claims (10)
1. A face recognition method is characterized by comprising the following steps:
acquiring an infrared image and three-dimensional point cloud data of a face to be recognized;
determining target infrared features of the face to be recognized according to the infrared image;
determining target three-dimensional point cloud features of the face to be recognized according to the three-dimensional point cloud data;
fusing the target infrared features and the target three-dimensional point cloud features to obtain fusion features;
and performing face recognition according to the fusion features to obtain a face recognition result.
2. The face recognition method of claim 1, wherein determining the target infrared features of the face to be recognized according to the infrared image comprises:
determining a face image in the infrared image, wherein the face image is an image of the face region in the infrared image;
acquiring face feature points in the face image;
and determining the target infrared features according to the face feature points.
3. The face recognition method of claim 2, wherein determining the target infrared features according to the face feature points comprises:
determining the face feature points as the target infrared features;
or inputting the face image into a trained convolutional neural network to obtain the target infrared features.
4. The face recognition method of claim 3, further comprising, before inputting the face image into the trained convolutional neural network:
performing face correction on the face image according to the face feature points to obtain a corrected face image;
wherein inputting the face image into the trained convolutional neural network to obtain the target infrared features comprises:
inputting the corrected face image into the trained convolutional neural network to obtain the target infrared features.
5. The face recognition method of claim 2, wherein determining the target three-dimensional point cloud features of the face to be recognized according to the three-dimensional point cloud data comprises:
determining a face point cloud region in the three-dimensional point cloud data according to the face feature points;
denoising the face point cloud data in the face point cloud region to obtain denoised face point cloud data;
inputting the denoised face point cloud data into a trained three-dimensional face reconstruction network to obtain reconstructed face point cloud data;
registering the reconstructed face point cloud data to obtain registered face point cloud data;
and determining the target three-dimensional point cloud features according to the registered face point cloud data.
6. The face recognition method of claim 5, wherein determining the target three-dimensional point cloud features according to the registered face point cloud data comprises:
determining the registered face point cloud data as the target three-dimensional point cloud features;
or inputting the registered face point cloud data into a trained PointNet network to obtain the target three-dimensional point cloud features.
7. The face recognition method according to any one of claims 1 to 6, wherein fusing the target infrared features and the target three-dimensional point cloud features to obtain the fusion features comprises:
inputting the target infrared features and the target three-dimensional point cloud features into a trained fusion network to obtain the fusion features;
or performing linear weighting on the target infrared features and the target three-dimensional point cloud features to obtain the fusion features.
8. A face recognition apparatus, characterized in that the face recognition apparatus comprises:
the acquisition module is used for acquiring an infrared image and three-dimensional point cloud data of a face to be recognized;
the first determining module is used for determining target infrared features of the face to be recognized according to the infrared image;
the second determining module is used for determining target three-dimensional point cloud features of the face to be recognized according to the three-dimensional point cloud data;
the fusion module is used for fusing the target infrared features and the target three-dimensional point cloud features to obtain fusion features;
and the recognition module is used for performing face recognition according to the fusion features to obtain a face recognition result.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the face recognition method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the face recognition method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110042337.9A CN112651380A (en) | 2021-01-13 | 2021-01-13 | Face recognition method, face recognition device, terminal equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112651380A true CN112651380A (en) | 2021-04-13 |
Family
ID=75368077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110042337.9A | Face recognition method, face recognition device, terminal equipment and storage medium | 2021-01-13 | 2021-01-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112651380A (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599837A (en) * | 2016-12-13 | 2017-04-26 | 北京智慧眼科技股份有限公司 | Face identification method and device based on multi-image input |
CN108470373A (en) * | 2018-02-14 | 2018-08-31 | 天目爱视(北京)科技有限公司 | It is a kind of based on infrared 3D 4 D datas acquisition method and device |
CN108564018A (en) * | 2018-04-04 | 2018-09-21 | 北京天目智联科技有限公司 | A kind of biological characteristic 3D 4 D datas recognition methods and system based on infrared photography |
CN111047678A (en) * | 2018-10-12 | 2020-04-21 | 杭州海康威视数字技术股份有限公司 | Three-dimensional face acquisition device and method |
CN111095297A (en) * | 2019-06-06 | 2020-05-01 | 深圳市汇顶科技股份有限公司 | Face recognition device and method and electronic equipment |
WO2020243968A1 (en) * | 2019-06-06 | 2020-12-10 | 深圳市汇顶科技股份有限公司 | Facial recognition apparatus and method, and electronic device |
CN110456363A (en) * | 2019-06-17 | 2019-11-15 | 北京理工大学 | The target detection and localization method of three-dimensional laser radar point cloud and infrared image fusion |
CN111487643A (en) * | 2020-04-13 | 2020-08-04 | 中国科学院空天信息创新研究院 | Building detection method based on laser radar point cloud and near-infrared image |
CN112163557A (en) * | 2020-10-19 | 2021-01-01 | 南宁职业技术学院 | Face recognition method and device based on 3D structured light |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113190701A (en) * | 2021-05-07 | 2021-07-30 | 北京百度网讯科技有限公司 | Image retrieval method, device, equipment, storage medium and computer program product |
CN113674161A (en) * | 2021-07-01 | 2021-11-19 | 清华大学 | Face deformity scanning completion method and device based on deep learning |
CN113808274A (en) * | 2021-09-24 | 2021-12-17 | 福建平潭瑞谦智能科技有限公司 | Face recognition model construction method and system and recognition method |
CN114155557A (en) * | 2021-12-07 | 2022-03-08 | 美的集团(上海)有限公司 | Positioning method, positioning device, robot and computer-readable storage medium |
CN114155557B (en) * | 2021-12-07 | 2022-12-23 | 美的集团(上海)有限公司 | Positioning method, positioning device, robot and computer-readable storage medium |
CN115830424A (en) * | 2023-02-09 | 2023-03-21 | 深圳酷源数联科技有限公司 | Mining waste identification method, device and equipment based on fusion image and storage medium |
CN115830424B (en) * | 2023-02-09 | 2023-04-28 | 深圳酷源数联科技有限公司 | Mining waste identification method, device, equipment and storage medium based on fusion image |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112651380A (en) | Face recognition method, face recognition device, terminal equipment and storage medium | |
CN107330439B (en) | Method for determining posture of object in image, client and server | |
CN111274916B (en) | Face recognition method and face recognition device | |
WO2021139324A1 (en) | Image recognition method and apparatus, computer-readable storage medium and electronic device | |
CN109829448B (en) | Face recognition method, face recognition device and storage medium | |
WO2020103700A1 (en) | Image recognition method based on micro facial expressions, apparatus and related device | |
CN109948397A (en) | A kind of face image correcting method, system and terminal device | |
Lu et al. | Finger vein identification using polydirectional local line binary pattern | |
CN110569756A (en) | face recognition model construction method, recognition method, device and storage medium | |
CN111507908B (en) | Image correction processing method, device, storage medium and computer equipment | |
CN111008935B (en) | Face image enhancement method, device, system and storage medium | |
CN112614110B (en) | Method and device for evaluating image quality and terminal equipment | |
CN110163111A (en) | Method, apparatus of calling out the numbers, electronic equipment and storage medium based on recognition of face | |
CN110852311A (en) | Three-dimensional human hand key point positioning method and device | |
CN111783629A (en) | Human face in-vivo detection method and device for resisting sample attack | |
CN111091075A (en) | Face recognition method and device, electronic equipment and storage medium | |
CN106650568B (en) | Face recognition method and device | |
CN110147708B (en) | Image data processing method and related device | |
CN113298158B (en) | Data detection method, device, equipment and storage medium | |
CN112507897A (en) | Cross-modal face recognition method, device, equipment and storage medium | |
CN112528866A (en) | Cross-modal face recognition method, device, equipment and storage medium | |
CN111695462A (en) | Face recognition method, face recognition device, storage medium and server | |
CN113011253B (en) | Facial expression recognition method, device, equipment and storage medium based on ResNeXt network | |
CN111191582A (en) | Three-dimensional target detection method, detection device, terminal device and computer-readable storage medium | |
CN112784712B (en) | Missing child early warning implementation method and device based on real-time monitoring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |