WO2021051538A1 - Face detection method and apparatus, and terminal device - Google Patents


Info

Publication number
WO2021051538A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
cloud data
face
data
human body
Prior art date
Application number
PCT/CN2019/117181
Other languages
French (fr)
Chinese (zh)
Inventor
张国辉
李佼
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021051538A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation

Definitions

  • This application belongs to the technical field of face detection, and in particular relates to a face detection method, apparatus, and terminal device.
  • Face detection refers to searching any given image according to a certain strategy to determine whether it contains a face and, if so, returning information such as the position, size, and pose of the face.
  • Face Detection has important application value in content-based retrieval, digital video processing, and video detection.
  • Face detection mainly takes two forms: 2D face detection and 3D face detection.
  • 2D face detection can be used to detect faces that appear in a planar image.
  • 3D face detection can use a 3D camera to perform stereo imaging and identify the three-dimensional coordinates of each point in the field of view.
  • The accuracy of 3D face detection analysis and judgment is therefore greatly improved compared with 2D face detection.
  • Most 3D face detection in the prior art is implemented based on the projection of a 3D point cloud onto a 2D image.
  • That is, face detection on the 3D point cloud is completed through face detection on an RGB 2D image. This approach is relatively easy to crack.
  • For example, when a photo containing a face image is used to cover one's face during face detection, a face detection algorithm based on the RGB 2D image will still treat the photo as a face.
  • The detected face bounding box is then mapped into the 3D point cloud coordinates, so that a face bounding box result is output in the 3D point cloud.
  • However, the photo containing the face image is only a plane in the point cloud captured by the 3D structured-light camera and carries no face information, so the detection result output by this algorithm is actually wrong.
  • In view of this, the embodiments of the present application provide a face detection method, apparatus, and terminal device, so as to solve the problem that 3D face detection algorithms implemented in the prior art based on the projection of 3D point clouds onto 2D images are easily cracked and therefore less secure.
  • In a first aspect, a face detection method is provided, including:
  • collecting human body point cloud data of multiple sample users, and, according to the coordinate value of each data point, respectively identifying the nose tip position of the face in the human body point cloud data of each sample user;
  • cropping face point cloud data from the human body point cloud data based on the nose tip position, generating a face detection model by performing model training on the face point cloud data, and, when point cloud data of an object to be detected is received, using the face detection model to detect the point cloud data of the object to be detected and recognize whether it includes a human face.
  • a face detection device including:
  • the collection module is used to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
  • the recognition module is used to recognize the nose tip position of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
  • a cropping module configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
  • the detection module is configured to, when receiving the point cloud data of the object to be detected, use the face detection model to detect the point cloud data of the object to be detected, and identify whether the point cloud data of the object to be detected includes a human face.
  • a terminal device including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • When the processor executes the computer-readable instructions, the following steps of the face detection method are implemented:
  • according to the coordinate value of each data point, respectively identifying the nose tip position of the face in the human body point cloud data of each sample user;
  • using the face detection model to detect the point cloud data of the object to be detected, and recognizing whether the point cloud data of the object to be detected includes a human face.
  • In a fourth aspect, a computer non-volatile readable storage medium stores computer-readable instructions that, when executed by a processor, implement the steps of the face detection method, including:
  • according to the coordinate value of each data point, respectively identifying the nose tip position of the face in the human body point cloud data of each sample user;
  • using the face detection model to detect the point cloud data of the object to be detected, and recognizing whether the point cloud data of the object to be detected includes a human face.
  • The beneficial effects of the face detection method, apparatus, and terminal device provided by the embodiments of the present application are as follows: by collecting the human body point cloud data of multiple sample users, the nose tip position of the face in each sample user's human body point cloud data can be identified according to the coordinate value of each data point, and the face point cloud data can then be cropped out as sample data for model training.
  • The face detection model obtained by the above training can be used to detect the point cloud data of an object to be detected, so as to identify whether that point cloud data includes a human face.
  • By performing model training on existing open-source data sets, a batch of 3D point cloud data sets containing only face information can be obtained. These 3D point cloud data sets can be used directly for subsequent face detection without resorting to RGB 2D images, which solves the problem that 3D face detection algorithms implemented in the prior art based on the projection of 3D point clouds onto 2D images are easily cracked, and improves the security of face detection.
  • FIG. 1 is a schematic flowchart of the steps of a face detection method according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of the steps of another face detection method according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a face detection apparatus according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • In the embodiments of the present application, the terminal device can recognize whether a human face is included in point cloud data by collecting the point cloud data of the detected object.
  • Specifically, the human body point cloud data of multiple sample users can be collected for model training to construct the corresponding face detection model, and the face detection model then completes the subsequent detection process.
  • human body point cloud data refers to 3D human body point cloud data.
  • 3D human body point cloud data is a kind of data that records the structure of the human body in the form of data points, and each data point contains three-dimensional coordinates. For example, it can be the coordinate values on the x, y, and z axes.
  • each data point may also contain other information such as gray scale, which is not limited in this embodiment.
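To make the data layout described above concrete, the following minimal Python sketch (illustrative only; the patent prescribes no particular representation, and the names here are hypothetical) models a point cloud as a list of data points, each carrying x/y/z coordinates plus an optional grayscale channel:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataPoint:
    """One data point of a human body point cloud: three-dimensional
    coordinates plus an optional extra channel such as grayscale."""
    x: float
    y: float
    z: float
    gray: Optional[float] = None  # other per-point information is possible too

def make_cloud(triples):
    """Build a point cloud (a plain list of DataPoint) from (x, y, z) triples."""
    return [DataPoint(*t) for t in triples]

cloud = make_cloud([(0.0, 0.0, 0.0), (0.1, 0.2, 1.5), (-0.3, 0.4, 1.2)])
```

A depth camera would output many thousands of such points; three suffice to show the shape of the data.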
  • the depth information of various parts of the human body can be obtained through a specific detection device or collection device. Then, these devices can automatically output 3D human body point cloud data based on the obtained depth information.
  • The above-mentioned equipment can be a depth camera, a depth sensor, or a lidar.
  • A depth camera is usually composed of an infrared projector and an infrared depth camera.
  • The infrared projector is mainly used to emit uniform infrared rays and form an infrared speckle image on the target object (such as a human body or another object).
  • The speckle image information is received by the infrared depth camera; once the depth information of the target object is formed, the infrared depth camera can output the point cloud data of the target object by analyzing and processing that depth information.
  • the human body point cloud data of the sample user is the sample data that needs to be collected in advance for subsequent model training, and the sample data can be obtained by 3D shooting of multiple different users through devices such as a depth camera.
  • the human body point cloud data of multiple sample users can also be directly extracted from some databases storing human body point cloud data.
  • This embodiment does not limit the collection method of the human body point cloud data used as the sample data.
  • S102 According to the coordinate value of each data point, respectively identify the nose tip position of the face in the human body point cloud data of each sample user;
  • The collected human body point cloud data may include a full-body point cloud, a half-body point cloud, and so on.
  • The detection goal of the model trained on these human body point cloud data is fixed, namely detecting whether point cloud data includes a human face. Therefore, after obtaining the full-body or half-body point cloud data of the sample users, the face point cloud data can be cropped from the human body point cloud data in order to reduce the amount of data processed in subsequent model training.
  • the human nose is basically in the center of the human face. Therefore, in order to cut out the face point cloud data from the collected human body point cloud data, it is possible to first identify the approximate position of the nose tip of each sample user in the respective human body point cloud data.
  • After the nose tip position of the human face is determined, that position can be used as the origin, and data within a certain length in each direction of the coordinate axes can be cropped out to obtain the face point cloud data.
  • Alternatively, the nose tip position of the face can be used as the center of a sphere with a specific value as its radius; the sphere is cut out of the human body point cloud data, and the data points contained in the sphere are taken as the face point cloud data.
  • the above length and value can be determined by those skilled in the art based on empirical values, which is not limited in this embodiment.
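The two cropping strategies just described can be sketched as follows. This is a hedged Python illustration; the radius and half-length are left as parameters, since the patent only says these values are chosen empirically:

```python
import math

def crop_face_sphere(points, nose_tip, radius):
    """Sphere-based crop: keep the data points inside a sphere centred
    on the nose-tip position, with an empirically chosen radius."""
    return [p for p in points if math.dist(p, nose_tip) <= radius]

def crop_face_cube(points, nose_tip, half_len):
    """Axis-aligned crop: keep the data points within a preset length of
    the nose tip in each direction of the coordinate axes."""
    cx, cy, cz = nose_tip
    return [(x, y, z) for (x, y, z) in points
            if abs(x - cx) <= half_len
            and abs(y - cy) <= half_len
            and abs(z - cz) <= half_len]
```

Both functions take the point cloud as a list of (x, y, z) tuples and return the cropped face point cloud.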
  • S104 Generate a face detection model by performing model training on the face point cloud data of the multiple sample users;
  • After the face point cloud data of each sample user is obtained, it can be used as sample data for model training to obtain a face detection model.
  • the face detection model can be obtained by inputting the face point cloud data of each sample user mentioned above into a preset three-dimensional point cloud network model for training.
  • the aforementioned three-dimensional point cloud network model may be a PointNet++ model.
  • the PointNet++ model is a deep learning multi-classification framework model based on 3D point cloud design. This model can be used to classify objects presented in the 3D point cloud.
  • the generated face detection model can obtain the correlation information between the face point cloud data of each sample user.
  • the associated information can be used for subsequent face detection.
  • the object to be detected may be a user or other object to be detected.
  • the point cloud data of the object to be detected can also be collected by depth cameras and other equipment.
  • The collected point cloud data of the object to be detected can be input into the face detection model generated in step S104, and the model recognizes the point cloud data to be detected according to the correlation information between the face point cloud data obtained through training, so as to output whether a face is included in the point cloud data.
  • For example, the face detection model can judge whether the point cloud data to be detected includes face point cloud data by comparing the correlation information within the current point cloud data with the correlation information between the face point cloud data obtained through model training.
  • In the embodiments of the present application, by collecting the human body point cloud data of multiple sample users, the nose tip position of the face in each sample user's human body point cloud data can be identified according to the coordinate value of each data point.
  • Based on the nose tip position of the face, the face point cloud data can then be cropped from the human body point cloud data as sample data for model training to generate a face detection model, so that when the point cloud data of the object to be detected is received, the face detection model obtained by the above training can detect it and identify whether it includes a face.
  • By performing model training on existing open-source data sets, a batch of 3D point cloud data sets containing only face information can be obtained. These 3D point cloud data sets can be used directly for subsequent face detection without resorting to RGB 2D images, which solves the problem that 3D face detection algorithms implemented in the prior art based on the projection of 3D point clouds onto 2D images are easily cracked, and improves the security of face detection.
  • FIG. 2 there is shown a schematic flow chart of the steps of another face detection method according to an embodiment of the present application, which may specifically include the following steps:
  • the human body point cloud data of multiple sample users is the sample data for subsequent model training, and the corresponding face detection model can be generated by performing model training on the sample data.
  • the human body point cloud data of the sample user can be collected through equipment such as a depth camera, a depth camera, a depth sensor, or a lidar.
  • the collected human body point cloud data may include a whole body point cloud or a half body point cloud.
  • These data points include coordinate values in a three-dimensional coordinate system, and the information they embody can characterize the specific human body structure.
  • S202 Preprocessing the human body point cloud data of the multiple sample users
  • In order to reduce the training error, the human body point cloud data can also be preprocessed.
  • the preprocessing of human point cloud data can include denoising processing and normalization processing.
  • The collected human body point cloud data usually contains some noise, such as outliers. The human body point cloud data of the multiple sample users can be denoised to filter out these outliers and remove the influence of noise on subsequent recognition.
  • the human body point cloud data can be normalized to the human body point cloud data with preset specifications by performing scale transformation on the coordinate values of each data point after denoising.
  • The sizes of different human body point cloud data may differ.
  • For example, the region covered by some point cloud data is 3*3*3, while the region covered by other point cloud data is 6*6*6. Therefore, all human body point cloud data can be normalized to obtain processed human body point cloud data with the same specifications.
  • Since the human body point cloud data includes the coordinate value of each data point, a cube containing all the data points of each human body point cloud can be generated from those coordinate values, and the coordinate value of each data point can then be scale-transformed so that every data point is normalized into a cube of the same specification.
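As a sketch of this preprocessing step, the Python code below first filters outliers by distance from the centroid (one plausible denoising heuristic; the patent does not fix a particular method, and the cutoff factor is an assumption) and then scale-transforms every cloud into a cube of the same edge length:

```python
import math
from statistics import mean

def denoise(points, factor=2.0):
    """Drop points whose distance from the centroid exceeds `factor` times
    the mean distance. This is only one plausible outlier filter; the
    factor of 2.0 is an illustrative assumption."""
    centroid = (mean(p[0] for p in points),
                mean(p[1] for p in points),
                mean(p[2] for p in points))
    dists = [math.dist(p, centroid) for p in points]
    limit = factor * mean(dists)
    return [p for p, d in zip(points, dists) if d <= limit]

def normalize_to_cube(points, edge=1.0):
    """Scale-transform coordinates so that the cloud fits a cube of the
    same edge length, giving all clouds the same specification."""
    xs, ys, zs = zip(*points)
    extent = max(max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs)) or 1.0
    scale = edge / extent
    return [((x - min(xs)) * scale, (y - min(ys)) * scale, (z - min(zs)) * scale)
            for (x, y, z) in points]
```

Applied in sequence, a 3*3*3 cloud and a 6*6*6 cloud both come out covering the same unit cube.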
  • S203 According to the origin and direction of the preset coordinate system, identify the position of the data point corresponding to the maximum value of the coordinate value on the horizontal axis or the vertical axis of the coordinate system in the human body point cloud data as the nose tip position of the human face;
  • The face point cloud data can be cropped from the human body point cloud data and used as the positive sample data for subsequent training.
  • the human nose is basically in the center of the human face. Therefore, in order to cut out the face point cloud data from the collected human body point cloud data, it is possible to first identify the approximate position of the nose tip of each sample user in the respective human body point cloud data.
  • the position corresponding to the maximum value on the horizontal axis or the vertical axis can be selected as the nose tip position in the constructed coordinate system. Whether the position of the data point corresponding to the maximum value of the horizontal axis or the maximum value of the vertical axis is used as the position of the nose tip of the human face can be specifically determined according to the directions of the horizontal axis and the vertical axis of the coordinate system.
  • the shape of the human body is approximately symmetrical.
  • a plane (second plane) can be determined first, by which the human body point cloud can be divided into two parts, and the number of point clouds in the left and right parts is approximately equal.
  • the center point of the human body point cloud data can be determined, and the center point can be used as the origin of the coordinate system to be constructed.
  • The horizontal axis and the vertical axis of the coordinate system can then be constructed from the origin, so that the plane they form (the first plane) is parallel to the horizontal plane and perpendicular to the second plane.
  • In this way, either the horizontal axis or the vertical axis of the coordinate system is parallel to the second plane.
  • If the vertical axis of the above coordinate system is perpendicular to the second plane, the position corresponding to the maximum value on the horizontal axis can be used as the nose tip position of the face; if the vertical axis is parallel to the second plane and the horizontal axis is perpendicular to it, the position corresponding to the maximum value on the vertical axis can be used as the nose tip position. It should be noted that the maximum value may be the maximum absolute value of the coordinate.
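Once the coordinate system has been constructed as described, with one axis of the first plane parallel to the symmetry plane (the forward direction in which the nose protrudes), picking the nose tip reduces to a maximum search. The sketch below assumes that forward axis is the x axis (index 0) and uses the maximum absolute coordinate, per the note above:

```python
def find_nose_tip(points, forward_axis=0):
    """Return the data point whose coordinate along the forward axis
    (assumed here to be x = index 0) has the largest absolute value:
    the nose tip protrudes furthest from the face, so it maximises
    that coordinate."""
    return max(points, key=lambda p: abs(p[forward_axis]))
```

For a cloud where the face points forward along +x, the point with the largest x coordinate is returned as the nose tip.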
  • S204 Constructing a three-dimensional coordinate system with the position of the nose tip of the face as the origin, and obtaining face point cloud data by extracting multiple data points within a preset length in each direction of the three-dimensional coordinate system;
  • After the nose tip position of the human face is determined, that position can be used as the origin, and data within a certain length in each direction of the coordinate axes can be cropped out to obtain the face point cloud data.
  • Specifically, the nose tip position of the human face can be taken as the origin of a newly constructed three-dimensional coordinate system; then, starting from the origin, data points within a certain length range in each direction of the coordinate axes are extracted, so that the face point cloud data is cropped out of the human body point cloud data.
  • S205 Generate a face detection model by performing model training on the face point cloud data of the multiple sample users, where the face detection model stores the sparsity data between the data points in the face point cloud data obtained after training;
  • After the face point cloud data of each sample user is obtained, it can be used as positive sample data for model training to obtain a face detection model.
  • First, the face point cloud data of the multiple sample users can be input into the preset 3D point cloud network model PointNet++ for model training. Then, by configuring the fully connected layer of the PointNet++ model as two layers, a two-class face detection model is generated.
  • The PointNet++ model is a deep-learning multi-classification framework designed for 3D point clouds and can be used to classify the objects presented in 3D point cloud data. Therefore, in the embodiment of the present application, by modifying the output of the PointNet++ model to a two-class classification, it is possible to classify whether the detected object is a human face: the detected object is recognized through the PointNet++ model, and the corresponding output is either a human face or not a human face.
  • the fully connected layer of the PointNet++ model can be configured to output two types of results, and the pre-collected sample set can be trained to realize the classification of faces and non-faces.
  • The collected face point cloud data is input into the PointNet++ model as positive sample data for training.
  • Some non-face point cloud data can also be collected and used as negative sample data for training.
  • the PointNet++ model can obtain the sparseness between face point cloud data and non-face point cloud data by training the sample set data.
  • the sparsity data of the face point cloud data can be used to indicate the location of each data point in the face point cloud data and the relative positional relationship between the various data points.
  • Since the output of the two-class PointNet++ model includes only two cases, a human face or not a human face, after the point cloud data of the object to be detected is collected, it can be input into the PointNet++ model to directly identify whether the point cloud data is a face.
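The two-layer, two-class fully connected head described above can be pictured with the following pure-Python illustration of the classification arithmetic only. The PointNet++ feature extractor is omitted and all weights here are hypothetical; the point is how two stacked fully connected layers end in two outputs, one for "face" and one for "not a face":

```python
import math

def fc_layer(vec, weights, biases):
    """One fully connected layer: out[j] = sum_i vec[i] * weights[j][i] + biases[j]."""
    return [sum(v * w for v, w in zip(vec, row)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn the logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature_vec, w1, b1, w2, b2):
    """Two stacked fully connected layers ending in 2 outputs:
    index 0 = 'face', index 1 = 'not a face'."""
    hidden = [max(0.0, h) for h in fc_layer(feature_vec, w1, b1)]  # ReLU
    probs = softmax(fc_layer(hidden, w2, b2))
    return "face" if probs[0] >= probs[1] else "not a face"
```

In a real model the weights would come from training on the positive and negative point cloud samples; identity-like weights are used below only to make the behaviour easy to trace.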
  • The above pre-generated two-class face detection model can be used to obtain the sparsity between the data points in the point cloud data of the object to be detected; by calculating the similarity between this sparsity and the sparsity data between the data points stored in the face detection model, it can be identified whether the point cloud data of the object to be detected includes a face.
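The patent does not define the sparsity measure precisely. As one plausible reading, the sketch below takes sparsity to be the mean nearest-neighbour distance of a cloud, compares it with the value stored by the model as a ratio-based similarity, and declares a face when the similarity exceeds a preset threshold (the threshold of 0.8 is an illustrative assumption):

```python
import math

def sparsity(points):
    """Mean nearest-neighbour distance of a cloud: a hypothetical stand-in
    for the 'sparsity between data points' stored by the model."""
    nn = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
          for i, p in enumerate(points)]
    return sum(nn) / len(nn)

def similarity(a, b):
    """Ratio-based similarity in (0, 1] between two sparsity values."""
    return min(a, b) / max(a, b)

def contains_face(candidate_points, stored_face_sparsity, threshold=0.8):
    """Declare a face when the candidate cloud's sparsity is close enough
    to the sparsity learned from face point clouds."""
    return similarity(sparsity(candidate_points), stored_face_sparsity) >= threshold
```

A candidate cloud whose points are spaced like the training faces passes the check; a much sparser cloud, such as a flat photo plane sampled coarsely, does not.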
  • the face detection model can output corresponding detection results in real time.
  • a batch of 3D point cloud data sets with only face information can be obtained.
  • the trained 3D point cloud data can be directly used for face detection without the need to resort to RGB2D images, which improves the security of face detection.
  • FIG. 3 a schematic diagram of a face detection apparatus according to an embodiment of the present application is shown, which may specifically include the following modules:
  • the collection module 301 is configured to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
  • the recognition module 302 is configured to recognize the position of the nose tip of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
  • the cropping module 303 is configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
  • the generating module 304 is configured to generate a face detection model by performing model training on the face point cloud data of the multiple sample users;
  • the detection module 305 is configured to, when receiving the point cloud data of the object to be detected, use the face detection model to detect the point cloud data of the object to be detected, and identify whether the point cloud data of the object to be detected includes a human face.
  • the device may further include the following modules:
  • a denoising module configured to perform denoising processing on the human body point cloud data of the multiple sample users
  • the normalization module is used to normalize the human body point cloud data into target point cloud data with preset specifications by performing proportional transformation on the coordinate values of the respective data points after denoising.
  • the identification module 302 may specifically include the following sub-modules:
  • the nose tip position recognition sub-module is used to identify, according to the origin and direction of a preset coordinate system, the position of the data point corresponding to the maximum coordinate value on the horizontal axis or the vertical axis of the coordinate system in the human body point cloud data as the nose tip position of the human face; wherein the origin of the coordinate system is the center point of the human body point cloud data, the first plane formed by the horizontal axis and the vertical axis of the coordinate system is parallel to the horizontal plane and perpendicular to the second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal axis or the vertical axis of the coordinate system is parallel to the second plane.
  • the cropping module 303 may specifically include the following sub-modules:
  • the face point cloud data extraction sub-module is used to construct a three-dimensional coordinate system with the nose tip position of the face as the origin, and obtain a face by extracting multiple data points within a preset length in each direction of the three-dimensional coordinate system Point cloud data.
  • the generating module 304 may specifically include the following sub-modules:
  • the model training sub-module is used to input the face point cloud data of the multiple sample users into a preset three-dimensional point cloud network model for model training;
  • the model configuration sub-module is used to generate a two-class face detection model by configuring the fully connected layer of the three-dimensional point cloud network model into two layers, and the face detection model stores the face obtained after training The sparsity data between each data point in the point cloud data.
  • the detection module 305 may specifically include the following sub-modules:
  • a sparsity acquisition sub-module for acquiring the sparsity between various data points in the point cloud data of the object to be detected by using the two-class face detection model
  • a similarity calculation sub-module for calculating the similarity between the sparsity between the various data points and the sparsity data between the various data points stored in the face detection model;
  • the face detection sub-module is configured to recognize that the point cloud data of the object to be detected includes a face when the similarity exceeds a preset threshold.
  • the description is relatively simple, and for related parts, please refer to the description of the method embodiment part.
  • the terminal device 400 of this embodiment includes a processor 410, a memory 420, and computer-readable instructions 421 stored in the memory 420 and running on the processor 410.
  • the processor 410 executes the computer-readable instructions 421
  • the steps in the various embodiments of the above-mentioned face detection method are implemented, for example, steps S101 to S105 shown in FIG. 1.
  • the processor 410 executes the computer-readable instructions 421
  • the functions of the modules/units in the foregoing device embodiments, such as the functions of the modules 301 to 305 shown in FIG. 3, are implemented.
  • the computer-readable instructions 421 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 420 and executed by the processor 410.
  • the one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments may be used to describe the execution process of the computer-readable instructions 421 in the terminal device 400.
  • the computer-readable instructions 421 can be divided into a collection module, an identification module, a cropping module, a generation module, and a detection module. The specific functions of each module are as follows:
  • the collection module is used to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
  • the recognition module is used to recognize the nose tip position of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
  • a cropping module configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
  • the detection module is configured to, when receiving the point cloud data of the object to be detected, use the face detection model to detect the point cloud data of the object to be detected, and identify whether the point cloud data of the object to be detected includes a human face.
  • the terminal device 400 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the terminal device 400 may include, but is not limited to, a processor 410 and a memory 420.
  • FIG. 4 is only an example of the terminal device 400 and does not constitute a limitation on the terminal device 400, which may include more or fewer components than shown in the figure, combine certain components, or use different components.
  • the terminal device 400 may also include input and output devices, network access devices, buses, and so on.
  • the processor 410 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 420 may be an internal storage unit of the terminal device 400, such as a hard disk or a memory of the terminal device 400.
  • the memory 420 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 400.
  • the memory 420 may also include both an internal storage unit of the terminal device 400 and an external storage device.
  • the memory 420 is used to store the computer-readable instructions 421 and other instructions and data required by the terminal device 400.
  • the memory 420 may also be used to temporarily store data that has been output or will be output.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A face detection method and apparatus, and a terminal device, all falling within the technical field of face detection. The method comprises: collecting body point cloud data of a plurality of sample users, wherein the body point cloud data comprises a plurality of data points, each having corresponding coordinate values (S101); identifying, according to the coordinate values of the data points, the nose tip position of the face in the body point cloud data of each sample user (S102); cropping face point cloud data out of the body point cloud data on the basis of the nose tip positions (S103); generating a face detection model by performing model training on the face point cloud data of the plurality of sample users (S104); and, when point cloud data of an object to be detected is received, detecting said point cloud data by means of the face detection model and identifying whether it comprises a face (S105). The method addresses the ease with which 3D face detection algorithms based on 2D projection can be defeated, thereby improving the security of face detection.

Description

Method, Apparatus and Terminal Device for Face Detection
This application claims priority to Chinese patent application No. 201910882002.0, filed on September 18, 2019 and entitled "A Method, Apparatus and Terminal Device for Face Detection", the entire content of which is incorporated herein by reference.
Technical Field
This application belongs to the technical field of face detection, and particularly relates to a face detection method and apparatus, and a terminal device.
Background
Face detection refers to searching any given image according to a certain strategy to determine whether it contains a human face and, if so, returning information such as the position, size and posture of the face. With the development of the technology, face detection has important application value in content-based retrieval, digital video processing, video detection and other areas.
At present, face detection mainly takes two forms: 2D face detection and 3D face detection. 2D face detection detects faces appearing in a planar image, whereas 3D face detection uses a 3D camera for stereo imaging to recover the three-dimensional coordinates of every point in the field of view. Because the machine obtains more information, the accuracy of 3D face detection is greatly improved over that of 2D face detection. However, most 3D face detection in the prior art is implemented based on the projection of a 3D point cloud onto a 2D image: face detection on the 3D point cloud is completed by running face detection on an RGB 2D image. This approach is relatively easy to defeat.
For example, when a photograph containing a face image is held in front of a 3D structured-light camera to block the holder's own face, a face detection algorithm based on the RGB 2D image will also treat the photograph as a face. The detected face bounding box is then mapped onto the 3D point cloud coordinates, and a face bounding box in the 3D point cloud is output. However, in the point cloud captured by the 3D structured-light camera, the photograph is merely a plane containing no face information whatsoever, so the detection result output by such an algorithm is actually wrong.
Summary of the Invention
Technical Problem
In view of this, the embodiments of the present application provide a face detection method and apparatus, and a terminal device, to solve the prior-art problem that a 3D face detection algorithm implemented based on the projection of a 3D point cloud onto a 2D image is easily defeated and therefore offers low security.
Solution to the Problem
Technical Solution
To solve the above technical problem, the technical solution adopted in the embodiments of this application is as follows.
In a first aspect, a face detection method is provided, comprising:
collecting human body point cloud data of multiple sample users, wherein the human body point cloud data comprises multiple data points, each data point having corresponding coordinate values;
identifying, according to the coordinate values of the data points, the nose tip position of the face in the human body point cloud data of each sample user;
cropping face point cloud data out of the human body point cloud data based on the nose tip position of the face;
generating a face detection model by performing model training on the face point cloud data of the multiple sample users; and
when point cloud data of an object to be detected is received, detecting the point cloud data of the object to be detected with the face detection model, and identifying whether the point cloud data of the object to be detected includes a human face.
In a second aspect, a face detection apparatus is provided, comprising:
a collection module, configured to collect human body point cloud data of multiple sample users, wherein the human body point cloud data comprises multiple data points, each data point having corresponding coordinate values;
a recognition module, configured to identify, according to the coordinate values of the data points, the nose tip position of the face in the human body point cloud data of each sample user;
a cropping module, configured to crop face point cloud data out of the human body point cloud data based on the nose tip position of the face;
a generation module, configured to generate a face detection model by performing model training on the face point cloud data of the multiple sample users; and
a detection module, configured to, when point cloud data of an object to be detected is received, detect the point cloud data of the object to be detected with the face detection model and identify whether it includes a human face.
In a third aspect, a terminal device is provided, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps of the above face detection method:
collecting human body point cloud data of multiple sample users, wherein the human body point cloud data comprises multiple data points, each data point having corresponding coordinate values;
identifying, according to the coordinate values of the data points, the nose tip position of the face in the human body point cloud data of each sample user;
cropping face point cloud data out of the human body point cloud data based on the nose tip position of the face;
generating a face detection model by performing model training on the face point cloud data of the multiple sample users; and
when point cloud data of an object to be detected is received, detecting the point cloud data of the object to be detected with the face detection model, and identifying whether the point cloud data of the object to be detected includes a human face.
In a fourth aspect, a computer non-volatile readable storage medium is provided, storing computer-readable instructions which, when executed by a processor, implement the following steps of the above face detection method:
collecting human body point cloud data of multiple sample users, wherein the human body point cloud data comprises multiple data points, each data point having corresponding coordinate values;
identifying, according to the coordinate values of the data points, the nose tip position of the face in the human body point cloud data of each sample user;
cropping face point cloud data out of the human body point cloud data based on the nose tip position of the face;
generating a face detection model by performing model training on the face point cloud data of the multiple sample users; and
when point cloud data of an object to be detected is received, detecting the point cloud data of the object to be detected with the face detection model, and identifying whether the point cloud data of the object to be detected includes a human face.
The beneficial effects of the face detection method and apparatus and the terminal device provided by the embodiments of this application are as follows. By collecting the human body point cloud data of multiple sample users, the nose tip position of the face can be identified in each sample user's human body point cloud data according to the coordinate values of the data points; based on the nose tip position, face point cloud data can then be cropped out of the human body point cloud data and used as sample data for model training to generate a face detection model. When point cloud data of an object to be detected is received, the trained face detection model can detect that point cloud data and identify whether it includes a human face. By performing model training on existing open-source data sets, this embodiment obtains a batch of 3D point cloud data sets containing only face information, which can be used directly for subsequent face detection without resorting to RGB 2D images. This solves the prior-art problem that a 3D face detection algorithm implemented based on the projection of a 3D point cloud onto a 2D image is easily defeated, and improves the security of face detection.
Beneficial Effects of the Invention
Brief Description of the Drawings
Description of the Drawings
FIG. 1 is a schematic flowchart of the steps of a face detection method according to an embodiment of this application;
FIG. 2 is a schematic flowchart of the steps of another face detection method according to an embodiment of this application;
FIG. 3 is a schematic diagram of a face detection apparatus according to an embodiment of this application;
FIG. 4 is a schematic diagram of a terminal device according to an embodiment of this application.
Embodiments of the Invention
Embodiments of the Present Invention
Referring to FIG. 1, a schematic flowchart of the steps of a face detection method according to an embodiment of this application is shown, which may specifically include the following steps.
S101. Collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points and each data point has corresponding coordinate values.
It should be noted that this method can be applied to a terminal device. By collecting point cloud data of a detected object, the terminal device can recognize whether the point cloud data includes a corresponding human face.
In this embodiment of the application, in order to detect whether the point cloud data of a detected object includes a face, human body point cloud data of multiple sample users may first be collected for model training to construct a corresponding face detection model, which then completes the subsequent detection process.
Generally, human body point cloud data refers to 3D human body point cloud data, which records the structure of the human body in the form of data points, each containing three-dimensional coordinates, for example coordinate values on the x, y and z axes. Of course, each data point may also contain other information such as gray scale, which is not limited in this embodiment.
In a specific implementation, the depth information of various parts of the human body can be obtained through a specific detection or collection device, which can then automatically output 3D human body point cloud data based on the obtained depth information. Typically, such a device may be a depth camera, a depth sensor, a lidar, or similar equipment.
Take a depth camera as an example. A depth camera usually consists of an infrared projector and an infrared depth camera. The infrared projector emits uniform infrared light and forms an infrared speckle image on the target object (such as a human body or another object); the speckle image reflected by the target object is received by the infrared depth camera, which forms the depth information of the target object and, by analyzing and processing it, can output the point cloud data of the target object.
In this embodiment of the application, the human body point cloud data of the sample users is the sample data that needs to be collected in advance for subsequent model training; it can be obtained by capturing multiple different users in 3D with a device such as a depth camera.
Of course, the human body point cloud data of multiple sample users can also be extracted directly from databases storing human body point cloud data; this embodiment does not limit the way in which the human body point cloud data used as sample data is collected.
S102. Identify, according to the coordinate values of the data points, the nose tip position of the face in the human body point cloud data of each sample user.
In this embodiment of the application, the collected human body point cloud data may include a full-body point cloud, a half-body point cloud, and so on. The detection target of the model trained on these data is fixed: it detects whether point cloud data includes a face. Therefore, after the full-body or half-body point cloud data of the sample users is obtained, the face point cloud data can be cropped out of it to reduce the amount of data processed in subsequent model training.
Generally, a person's nose is located roughly at the center of the face. Therefore, in order to crop the face point cloud data out of the collected human body point cloud data, the approximate position of each sample user's nose tip in the respective human body point cloud data can first be identified.
S103. Crop the face point cloud data out of the human body point cloud data based on the nose tip position of the face.
In this embodiment of the application, after the nose tip position is determined, it can be taken as the origin, and data within a certain length in each direction of the coordinate axes can be cropped out to obtain the face point cloud data.
Alternatively, the nose tip position can be taken as the center of a sphere: a sphere with that center and a specific radius is cut out of the human body point cloud data, and the data contained in the sphere is taken as the face point cloud data. The above length and radius can be determined by those skilled in the art based on empirical values, and are not limited in this embodiment.
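The sphere-based cropping just described can be sketched in a few lines. This is only an illustration, not the patent's implementation: the point representation (tuples of x, y, z in meters) and the 9 cm radius are assumptions, since the radius is left to empirical choice.

```python
import math

def crop_face_sphere(points, nose_tip, radius=0.09):
    """Keep only the data points lying within `radius` of the nose tip.

    `points` is a list of (x, y, z) tuples; `radius` (9 cm here) is an
    illustrative empirical value, not one fixed by the patent.
    """
    return [p for p in points if math.dist(p, nose_tip) <= radius]
```

For example, with the nose tip at the origin, a point 5 cm away is kept while a point 1 m away is discarded.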
S104. Generate a face detection model by performing model training on the face point cloud data of the multiple sample users.
After the face point cloud data of each sample user is obtained, it can be used as sample data for model training to obtain the face detection model.
In this embodiment of the application, the face detection model can be obtained by feeding the face point cloud data of each sample user into a preset three-dimensional point cloud network model for training. The three-dimensional point cloud network model may be a PointNet++ model.
The PointNet++ model is a deep-learning multi-class classification framework designed for 3D point clouds, and it can be used to classify objects represented as 3D point clouds.
In a specific implementation, after the PointNet++ model is trained on the face point cloud data of the sample users, the generated face detection model captures the association information among the face point cloud data of the sample users, and this association information can be used for subsequent face detection.
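Training PointNet++ itself requires a deep-learning framework and is beyond a short sketch, but the sampling operation at the heart of its set-abstraction layers, farthest point sampling, can be illustrated compactly. The pure-Python version below is a hedged sketch: the input format and the choice of the first point are assumptions, and the real network applies grouping and shared MLPs on top of this step.

```python
import math

def farthest_point_sample(points, k):
    """Greedily pick k well-spread points: start from the first point,
    then repeatedly add the point farthest from the already-chosen set.
    PointNet++ uses this to select centroids for its set-abstraction
    layers (starting from index 0 is an arbitrary choice here)."""
    chosen = [0]
    # dist[i] = distance from point i to the nearest chosen point
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[nxt]))
    return [points[i] for i in chosen]
```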
S105. When point cloud data of an object to be detected is received, detect the point cloud data of the object to be detected with the face detection model, and identify whether the point cloud data of the object to be detected includes a human face.
In this embodiment of the application, the object to be detected may be a user or another object. The point cloud data of the object to be detected can likewise be collected by a device such as a depth camera.
The collected point cloud data of the object to be detected can be input into the face detection model generated in step S104; the model recognizes the point cloud data currently to be detected according to the association information among the face point cloud data obtained by training, and outputs whether the point cloud data includes a face.
In a specific implementation, the face detection model can judge whether the point cloud data to be detected includes face point cloud data by comparing the association information within that data with the association information among the face point cloud data obtained through model training.
In this embodiment of the application, by collecting the human body point cloud data of multiple sample users, the nose tip position of the face in each sample user's human body point cloud data can be identified according to the coordinate values of the data points; based on the nose tip position, face point cloud data can further be cropped out of the human body point cloud data as sample data for model training to generate a face detection model. When point cloud data of an object to be detected is received, the trained face detection model can detect that point cloud data to identify whether it includes a face. By performing model training on existing open-source data sets, this embodiment obtains a batch of 3D point cloud data sets containing only face information, which can be used directly for subsequent face detection without resorting to RGB 2D images. This solves the prior-art problem that a 3D face detection algorithm implemented based on the projection of a 3D point cloud onto a 2D image is easily defeated, and improves the security of face detection.
Referring to FIG. 2, a schematic flowchart of the steps of another face detection method according to an embodiment of this application is shown, which may specifically include the following steps.
S201. Collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points and each data point has corresponding coordinate values.
In this embodiment of the application, the human body point cloud data of the multiple sample users is the sample data subsequently used for model training; by performing model training on the sample data, a corresponding face detection model can be generated.
In a specific implementation, the human body point cloud data of the sample users can be collected by a depth camera, a depth sensor, a lidar, or similar equipment. The collected data may include a full-body point cloud or a half-body point cloud. In either case, it includes multiple data points with coordinate values in a three-dimensional coordinate system, and the information embodied by these data points can characterize the specific human body structure.
S202. Preprocess the human body point cloud data of the multiple sample users.
In this embodiment of the application, in order to reduce the amount of data processed in subsequent model training and to reduce training error, the collected human body point cloud data can also be preprocessed. The preprocessing of the human body point cloud data may include denoising and normalization.
Generally, collected human body point cloud data contains some noise, such as outlying points. These outliers can be filtered out by denoising the human body point cloud data of the sample users, removing the influence of noise on subsequent recognition.
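One common way to realize the denoising step is statistical outlier removal: a point whose mean distance to its nearest neighbours is unusually large is treated as noise. The patent does not prescribe a particular denoising method, so the sketch below, including the neighbour count and the threshold, is only one illustrative choice (and O(n²), suitable only for small clouds).

```python
import math
import statistics

def remove_outliers(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbours
    exceeds (global mean + std_ratio * standard deviation). Both
    parameters are illustrative, not fixed by the patent."""
    mean_knn = []
    for i, p in enumerate(points):
        ds = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(ds[:k]) / k)  # mean distance to k nearest
    cutoff = statistics.mean(mean_knn) + std_ratio * statistics.pstdev(mean_knn)
    return [p for p, d in zip(points, mean_knn) if d <= cutoff]
```

With a tight cluster of points plus one point far away, the far point's mean neighbour distance lies well above the cutoff, so it is removed.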
Then, during normalization, the human body point cloud data can be normalized to human body point cloud data with a preset specification by proportionally transforming the coordinate values of the denoised data points.
Generally, different human body point cloud data may differ in extent. For example, the region covered by some point cloud data is 3*3*3, while the region covered by other point cloud data is 6*6*6. Therefore, all the human body point cloud data can be normalized to obtain processed human body point cloud data of the same specification.
In a specific implementation, since the human body point cloud data includes the coordinate values of the data points, a cube containing all the data points of the cloud can be generated from those coordinate values, and each data point can then be normalized into a cube of the same specification by proportionally transforming its coordinate values.
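The proportional transformation described above can be sketched as a uniform rescale into a cube centred at the origin. The choice of a unit cube and of preserving the aspect ratio are assumptions made for illustration; the patent only requires that all clouds end up with the same specification.

```python
def normalize_to_cube(points, size=1.0):
    """Uniformly scale and translate a point cloud so that its bounding
    box fits inside a cube of edge `size` centred at the origin."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0  # avoid /0
    scale = size / extent
    center = [(mins[i] + maxs[i]) / 2.0 for i in range(3)]
    return [tuple((p[i] - center[i]) * scale for i in range(3))
            for p in points]
```

Under this sketch, a cloud spanning 6*6*6 and a cloud spanning 3*3*3 both come out spanning the same unit cube.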
S203、根据预设的坐标系的原点和方向,识别所述人体点云数据中在所述坐标系的横轴或纵轴上的坐标值最大值对应的数据点位置为人脸鼻尖位置;S203: According to the origin and direction of the preset coordinate system, identify the position of the data point corresponding to the maximum value of the coordinate value on the horizontal axis or the vertical axis of the coordinate system in the human body point cloud data as the nose tip position of the human face;
在本申请实施例中,在获得样本用户的人体点云数据后,为了减少后续模型训练的数据处理量,可以从这些人体点云数据中裁剪出人脸点云数据,以人脸点云数据作为后续训练的正样本数据。In this embodiment of the application, after obtaining the human body point cloud data of the sample user, in order to reduce the amount of data processing for subsequent model training, the face point cloud data can be cropped from the human body point cloud data, and the face point cloud data As the positive sample data for subsequent training.
通常,人的鼻子基本上处于人脸的居中位置。因此,为了从采集得到的人体点云数据中裁剪出人脸点云数据,可以首先识别出各个样本用户的鼻尖在各自的人体点云数据中的大致位置。Generally, the human nose is basically in the center of the human face. Therefore, in order to cut out the face point cloud data from the collected human body point cloud data, it is possible to first identify the approximate position of the nose tip of each sample user in the respective human body point cloud data.
由于人体点云数据是一种立体的三维数据,可以在构建出的坐标系中选择横轴或纵轴上的最大值所对应的位置作为人脸鼻尖位置。对于究竟是横轴最大值还是纵轴最大值对应的数据点位置作为人脸鼻尖位置,可以根据该坐标系横轴和纵轴的方向具体确定。Since the human body point cloud data is a three-dimensional three-dimensional data, the position corresponding to the maximum value on the horizontal axis or the vertical axis can be selected as the nose tip position in the constructed coordinate system. Whether the position of the data point corresponding to the maximum value of the horizontal axis or the maximum value of the vertical axis is used as the position of the nose tip of the human face can be specifically determined according to the directions of the horizontal axis and the vertical axis of the coordinate system.
一般而言,人体形状是近似于左右对称的。在采集得到人体点云数据后,可以首先确定出一个平面(第二平面),通过该第二平面可以将人体点云划分为左右两部分,并使得左右两部分的点云数量大致相等。然后,可以根据各个数据点的坐标值,确定出人体点云数据的中心点,并以该中心点作为待构建的坐标系的原点。在确定出坐标系的原点后,可以基于该原点构建出坐标系的横轴和纵轴,使得由上述横轴和纵轴构成的另一平面(第一平面)平行于水平面并且与上述第二平面垂直。这样,坐标系的横轴或纵轴便与第二平面平行。Generally speaking, the shape of the human body is approximately symmetrical. After the human body point cloud data is collected, a plane (second plane) can be determined first, by which the human body point cloud can be divided into two parts, and the number of point clouds in the left and right parts is approximately equal. Then, according to the coordinate value of each data point, the center point of the human body point cloud data can be determined, and the center point can be used as the origin of the coordinate system to be constructed. After the origin of the coordinate system is determined, the horizontal axis and the vertical axis of the coordinate system can be constructed based on the origin, so that another plane (the first plane) formed by the horizontal axis and the vertical axis is parallel to the horizontal plane and is parallel to the second plane. The plane is vertical. In this way, the horizontal or vertical axis of the coordinate system is parallel to the second plane.
如果上述坐标系的横轴与第二平面平行，则该坐标系的纵轴垂直于第二平面，此时可以将横轴上的最大值所对应的位置作为人脸鼻尖位置；如果上述坐标系的纵轴与第二平面平行，则该坐标系的横轴垂直于第二平面，此时可以将纵轴上的最大值所对应的位置作为人脸鼻尖位置。需要说明的是，该最大值可以是坐标值绝对值的最大值。If the horizontal axis of the coordinate system is parallel to the second plane, its vertical axis is perpendicular to the second plane, and the position corresponding to the maximum value on the horizontal axis can be taken as the face nose tip position; if the vertical axis is parallel to the second plane, the horizontal axis is perpendicular to the second plane, and the position corresponding to the maximum value on the vertical axis can be taken as the face nose tip position. It should be noted that this maximum value may be the maximum of the absolute coordinate values.
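The nose tip selection described above can be sketched as follows. This is a minimal NumPy illustration, not the application's implementation: the function name, the choice of axis index, and the in-function centering step are all illustrative assumptions.

```python
import numpy as np

def find_nose_tip(points, forward_axis=0):
    """Locate the face nose tip in a human body point cloud.

    Assumes `points` is an (N, 3) array and that `forward_axis` is the
    horizontal axis parallel to the body's left-right symmetry plane.
    The cloud is centered at its mean (the center point serving as the
    coordinate origin), and the nose tip is taken as the data point
    with the largest absolute coordinate along that axis.
    """
    centered = points - points.mean(axis=0)   # center point becomes the origin
    idx = np.argmax(np.abs(centered[:, forward_axis]))
    return points[idx]
```

Under these assumptions, the point protruding furthest along the front-back axis of the centered cloud is reported as the nose tip.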
S204、以所述人脸鼻尖位置为原点构建三维坐标系,通过提取在所述三维坐标系的各个方向上预设长度内的多个数据点,获得人脸点云数据;S204: Constructing a three-dimensional coordinate system with the position of the nose tip of the face as the origin, and obtaining face point cloud data by extracting multiple data points within a preset length in each direction of the three-dimensional coordinate system;
在本申请实施例中,在确定人脸鼻尖的位置后,可以以该位置为原点,通过在坐标轴的各个方向上剪裁出一定长度的数据,从而可以得到人脸点云数据。In the embodiment of the present application, after determining the position of the nose tip of the human face, the position can be used as the origin, and data of a certain length can be cropped in various directions of the coordinate axis, so as to obtain the face point cloud data.
例如，可以以确定出的人脸鼻尖位置为原点构建三维坐标系，然后从原点出发，分别提取出坐标轴各个方向上一定长度范围内的数据点，进行人体点云数据的“抠脸”，得到人脸点云数据。For example, a three-dimensional coordinate system can be constructed with the identified face nose tip position as the origin; then, starting from the origin, data points within a certain length range in each direction of the coordinate axes can be extracted, thereby "cutting the face out" of the human body point cloud data to obtain the face point cloud data.
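The "face cut-out" step can be sketched as an axis-aligned crop around the nose tip. This is a minimal sketch; the half-length value and function name are illustrative assumptions, not values from the application.

```python
import numpy as np

def crop_face(points, nose_tip, half_length=0.1):
    """Cut face point cloud data out of a human body point cloud.

    Shifts the cloud so the nose tip becomes the origin of a new
    three-dimensional coordinate system, then keeps only the data
    points within the preset length `half_length` in every axis
    direction (an axis-aligned cube around the nose tip).
    """
    shifted = points - nose_tip
    mask = np.all(np.abs(shifted) <= half_length, axis=1)
    return shifted[mask]
```

Points outside the cube (shoulders, torso) are discarded, leaving only the face region expressed in nose-tip-centered coordinates.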
S205、通过对所述多个样本用户的人脸点云数据进行模型训练，生成人脸检测模型，所述人脸检测模型存储有训练后获得的所述人脸点云数据中各个数据点之间的稀疏度数据；S205: Generate a face detection model by performing model training on the face point cloud data of the multiple sample users, where the face detection model stores the sparsity data, obtained after training, between the data points in the face point cloud data;
在得到各个样本用户的人脸点云数据后,便可以以这些人脸点云数据作为正样本数据进行模型训练,获得人脸检测模型。After obtaining the face point cloud data of each sample user, the face point cloud data can be used as positive sample data for model training to obtain a face detection model.
在具体实现中，可以首先将多个样本用户的人脸点云数据输入预置的三维点云网络模型PointNet++中进行模型训练。然后通过将PointNet++模型的全连接层配置为两层，生成二分类的人脸检测模型。In a specific implementation, the face point cloud data of the multiple sample users can first be input into the preset three-dimensional point cloud network model PointNet++ for model training. Then, by configuring the fully connected layers of the PointNet++ model as two layers, a two-class face detection model is generated.
由于PointNet++模型是基于3D点云设计的深度学习多分类框架模型,利用该模型可以实现对3D点云呈现的数据进行物体分类。因此,在本申请实施例中,通过将上述PointNet++模型的输出结果修改为二分类,可以实现对被检测对象是否为人脸进行分类。即,通过PointNet++模型对被检测对象进行识别,相应的输出结果为人脸或不是人脸。Since the PointNet++ model is a deep learning multi-classification framework model based on 3D point cloud design, this model can be used to classify objects in the data presented by the 3D point cloud. Therefore, in the embodiment of the present application, by modifying the output result of the aforementioned PointNet++ model to a two-class classification, it is possible to classify whether the detected object is a human face. That is, the detected object is recognized through the PointNet++ model, and the corresponding output result is a human face or not a human face.
在具体实现中，可以通过将PointNet++模型的全连接层配置为输出结果为两类，并对预先采集的样本集进行训练即可实现对人脸和非人脸的分类。In a specific implementation, the fully connected layers of the PointNet++ model can be configured so that the output falls into two classes, and the model can be trained on a pre-collected sample set to realize the classification of faces and non-faces.
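PointNet++ itself is a full deep-learning architecture and is not reproduced here; the following sketch only illustrates the idea of the two-layer, two-way fully connected head described above, assuming a global feature vector has already been extracted from the point cloud. All dimensions, weights, and names are illustrative assumptions.

```python
import numpy as np

def binary_face_head(global_feature, w1, b1, w2, b2):
    """Two fully connected layers mapping a global point-cloud feature
    to softmax probabilities over two classes: face / not a face."""
    hidden = np.maximum(global_feature @ w1 + b1, 0.0)   # first FC layer + ReLU
    logits = hidden @ w2 + b2                            # second FC layer, 2 outputs
    exp = np.exp(logits - logits.max())                  # numerically stable softmax
    return exp / exp.sum()

# Illustrative shapes: 1024-d global feature -> 256 hidden units -> 2 classes.
rng = np.random.default_rng(0)
feature = rng.standard_normal(1024)
probs = binary_face_head(
    feature,
    rng.standard_normal((1024, 256)) * 0.01, np.zeros(256),
    rng.standard_normal((256, 2)) * 0.01, np.zeros(2),
)
```

Restricting the head to two outputs is what turns the multi-class PointNet++ framework into the binary face / non-face classifier described in the text.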
当然，为了提高模型后续识别的准确性，在将采集得到的人脸点云数据作为正样本数据输入PointNet++模型中进行训练，生成人脸检测模型的过程中，还可以采集一些非人脸点云数据作为负样本数据进行训练。Of course, in order to improve the accuracy of subsequent recognition, when the collected face point cloud data is input into the PointNet++ model as positive sample data to train and generate the face detection model, some non-face point cloud data can also be collected and used as negative sample data for training.
由于进行模型训练的上述样本集包括有人脸点云数据和非人脸点云数据，PointNet++模型可以通过对样本集进行训练获得人脸点云数据和非人脸点云数据各自之间的稀疏度数据。其中，人脸点云数据的稀疏度数据可以用于表示人脸点云数据中各个数据点所在的位置，以及各个数据点之间的相对位置关系。在后续的识别过程中，可以通过比较待检测的点云数据与人脸点云数据的稀疏度数据之间的相似度，判断出待检测的点云数据是否为人脸点云数据。Since the above sample set used for model training includes both face point cloud data and non-face point cloud data, the PointNet++ model can obtain, by training on this sample set, the respective sparsity data of the face point cloud data and the non-face point cloud data. The sparsity data of the face point cloud data can be used to indicate the location of each data point in the face point cloud data and the relative positional relationships between the data points. In the subsequent recognition process, whether point cloud data to be detected is face point cloud data can be determined by comparing the similarity between that point cloud data and the sparsity data of the face point cloud data.
S206、当接收到待检测对象的点云数据时,采用所述人脸检测模型对所述待检测对象的点云数据进行检测,识别所述待检测对象的点云数据中是否包括人脸。S206: When the point cloud data of the object to be detected is received, use the face detection model to detect the point cloud data of the object to be detected, and identify whether the point cloud data of the object to be detected includes a human face.
在本申请实施例中，由于二分类的PointNet++模型的输出结果只包括两种情况，一种是人脸，另一种不是人脸，因此，在采集到待检测对象的点云数据后，通过将点云数据输入至PointNet++模型中，可以直接识别出该点云数据是否为人脸。In the embodiment of this application, since the output of the two-class PointNet++ model covers only two cases, namely a human face and not a human face, after the point cloud data of the object to be detected is collected, it can be input into the PointNet++ model to directly identify whether the point cloud data represents a human face.
在具体实现中，在采集到待检测对象的点云数据时，可以采用上述预先生成的二分类的人脸检测模型获取待检测对象的点云数据中各个数据点之间的稀疏度；通过计算各个数据点之间的稀疏度与人脸检测模型中存储的各个数据点之间的稀疏度数据的相似度，可以识别待检测对象的点云数据中是否包括人脸。In a specific implementation, when the point cloud data of the object to be detected is collected, the pre-generated two-class face detection model described above can be used to obtain the sparsity between the data points in that point cloud data; by calculating the similarity between this sparsity and the sparsity data between the data points stored in the face detection model, whether the point cloud data of the object to be detected includes a human face can be identified.
一般地,若上述相似度超过一定阈值,则可以判定待检测对象的点云数据中包括人脸,否则,则不包括人脸。人脸检测模型可以实时地输出相应的检测结果。Generally, if the aforementioned similarity exceeds a certain threshold, it can be determined that the point cloud data of the object to be detected includes a human face, otherwise, it does not include a human face. The face detection model can output corresponding detection results in real time.
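The threshold decision above can be sketched as follows. The application does not specify the similarity measure, so cosine similarity is used here purely as an illustrative assumption, as is the threshold value.

```python
import numpy as np

def includes_face(sparsity, stored_sparsity, threshold=0.9):
    """Compare the sparsity descriptor of the cloud under test with the
    sparsity data stored in the trained model; report a face when the
    similarity reaches the preset threshold."""
    similarity = np.dot(sparsity, stored_sparsity) / (
        np.linalg.norm(sparsity) * np.linalg.norm(stored_sparsity))
    return similarity >= threshold
```

A higher threshold trades missed detections for fewer false positives; the exact value would be tuned on held-out samples.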
在本申请实施例中，通过将深度学习多分类框架模型PointNet++修改为二分类，并对现有的开源数据集进行预处理，可以得到一批仅有人脸信息的3D点云数据集，在采用上述3D点云数据集进行模型训练后，可以直接利用训练后的3D点云数据来进行人脸检测，无需借助RGB2D图像，提高了人脸检测的安全性。In the embodiment of this application, by modifying the deep-learning multi-classification framework model PointNet++ into a two-class model and preprocessing existing open-source data sets, a batch of 3D point cloud data sets containing only face information can be obtained. After model training with these 3D point cloud data sets, face detection can be performed directly on 3D point cloud data, without resorting to 2D RGB images, which improves the security of face detection.
需要说明的是，上述实施例中各步骤的序号的大小并不意味着执行顺序的先后，各过程的执行顺序应以其功能和内在逻辑确定，而不应对本申请实施例的实施过程构成任何限定。It should be noted that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
参照图3,示出了本申请一个实施例的一种人脸检测的装置的示意图,具体可以包括如下模块:Referring to FIG. 3, a schematic diagram of a face detection apparatus according to an embodiment of the present application is shown, which may specifically include the following modules:
采集模块301,用于采集多个样本用户的人体点云数据,所述人体点云数据包括多个数据点,各个数据点分别具有相应的坐标值;The collection module 301 is configured to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
识别模块302,用于根据所述各个数据点的坐标值,分别识别各个样本用户的人体点云数据中的人脸鼻尖位置;The recognition module 302 is configured to recognize the position of the nose tip of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
裁剪模块303,用于基于所述人脸鼻尖位置,从所述人体点云数据中裁剪出人脸点云数据;The cropping module 303 is configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
生成模块304,用于通过对所述多个样本用户的人脸点云数据进行模型训练,生成人脸检测模型;The generating module 304 is configured to generate a face detection model by performing model training on the face point cloud data of the multiple sample users;
检测模块305，用于在接收到待检测对象的点云数据时，采用所述人脸检测模型对所述待检测对象的点云数据进行检测，识别所述待检测对象的点云数据中是否包括人脸。The detection module 305 is configured to, upon receiving the point cloud data of the object to be detected, use the face detection model to detect that point cloud data and identify whether the point cloud data of the object to be detected includes a human face.
在本申请实施例中,所述装置还可以包括如下模块:In the embodiment of the present application, the device may further include the following modules:
去噪模块,用于对所述多个样本用户的人体点云数据进行去噪处理;A denoising module, configured to perform denoising processing on the human body point cloud data of the multiple sample users;
归一化模块,用于通过对去噪后的所述各个数据点的坐标值作比例变换,将所述人体点云数据归一化为具有预设规格的目标点云数据。The normalization module is used to normalize the human body point cloud data into target point cloud data with preset specifications by performing proportional transformation on the coordinate values of the respective data points after denoising.
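The normalization module's proportional transformation can be sketched as below. This is a minimal sketch in which the target specification (unit extent) and function name are illustrative assumptions.

```python
import numpy as np

def normalize_cloud(points, target_extent=1.0):
    """Proportionally rescale a (denoised) human body point cloud so
    that its largest coordinate extent matches a preset specification."""
    centered = points - points.mean(axis=0)
    scale = target_extent / np.abs(centered).max()
    return centered * scale
```

Scaling every coordinate by the same factor preserves the cloud's shape while bringing clouds captured at different distances onto a common specification.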
在本申请实施例中,所述识别模块302具体可以包括如下子模块:In the embodiment of the present application, the identification module 302 may specifically include the following sub-modules:
人脸鼻尖位置识别子模块，用于根据预设的坐标系的原点和方向，识别所述人体点云数据中在所述坐标系的横轴或纵轴上的坐标值最大值对应的数据点位置为人脸鼻尖位置；其中，所述坐标系的原点为所述人体点云数据的中心点，所述坐标系的横轴和纵轴构成的第一平面平行于水平面且垂直于第二平面，所述第二平面用于将所述人体点云数据划分为两部分，所述坐标系的横轴或纵轴与所述第二平面平行。The face nose tip position recognition sub-module is configured to identify, according to the origin and directions of the preset coordinate system, the position of the data point corresponding to the maximum coordinate value on the horizontal or vertical axis of the coordinate system in the human body point cloud data as the face nose tip position; wherein the origin of the coordinate system is the center point of the human body point cloud data, the first plane formed by the horizontal and vertical axes of the coordinate system is parallel to the horizontal plane and perpendicular to the second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal or vertical axis of the coordinate system is parallel to the second plane.
在本申请实施例中,所述裁剪模块303具体可以包括如下子模块:In the embodiment of the present application, the cropping module 303 may specifically include the following sub-modules:
人脸点云数据提取子模块，用于以所述人脸鼻尖位置为原点构建三维坐标系，通过提取在所述三维坐标系的各个方向上预设长度内的多个数据点，获得人脸点云数据。The face point cloud data extraction sub-module is configured to construct a three-dimensional coordinate system with the face nose tip position as the origin, and to obtain the face point cloud data by extracting multiple data points within a preset length in each direction of the three-dimensional coordinate system.
在本申请实施例中,所述生成模块304具体可以包括如下子模块:In the embodiment of the present application, the generating module 304 may specifically include the following sub-modules:
模型训练子模块,用于将所述多个样本用户的人脸点云数据输入预置的三维点云网络模型中进行模型训练;The model training sub-module is used to input the face point cloud data of the multiple sample users into a preset three-dimensional point cloud network model for model training;
模型配置子模块，用于通过将所述三维点云网络模型的全连接层配置为两层，生成二分类的人脸检测模型，所述人脸检测模型存储有训练后获得的所述人脸点云数据中各个数据点之间的稀疏度数据。The model configuration sub-module is configured to generate a two-class face detection model by configuring the fully connected layers of the three-dimensional point cloud network model as two layers, where the face detection model stores the sparsity data, obtained after training, between the data points in the face point cloud data.
在本申请实施例中,所述检测模块305具体可以包括如下子模块:In the embodiment of the present application, the detection module 305 may specifically include the following sub-modules:
稀疏度获取子模块,用于采用所述二分类的人脸检测模型获取所述待检测对象的点云数据中各个数据点之间的稀疏度;A sparsity acquisition sub-module for acquiring the sparsity between various data points in the point cloud data of the object to be detected by using the two-class face detection model;
相似度计算子模块,用于计算所述各个数据点之间的稀疏度与所述人脸检测模型中存储的各个数据点之间的稀疏度数据的相似度;A similarity calculation sub-module for calculating the similarity between the sparsity between the various data points and the sparsity data between the various data points stored in the face detection model;
人脸检测子模块,用于在所述相似度超过预设阈值时,识别所述待检测对象的点云数据中包括人脸。The face detection sub-module is configured to recognize that the point cloud data of the object to be detected includes a face when the similarity exceeds a preset threshold.
对于装置实施例而言,由于其与方法实施例基本相似,所以描述得比较简单,相关之处参见方法实施例部分的说明即可。As for the device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to the description of the method embodiment part.
参照图4,示出了本申请一个实施例的一种终端设备的示意图。如图4所示,本实施例的终端设备400包括:处理器410、存储器420以及存储在所述存储器420中并可在所述处理器410上运行的计算机可读指令421。所述处理器410执行所述计算机可读指令421时实现上述人脸检测的方法各个实施例中的步骤,例如图1所示的步骤S101至S105。或者,所述处理器410执行所述计算机可读指令421时实现上述各装置实施例中各模块/单元的功能,例如图3所示模块301至305的功能。Referring to FIG. 4, a schematic diagram of a terminal device according to an embodiment of the present application is shown. As shown in FIG. 4, the terminal device 400 of this embodiment includes a processor 410, a memory 420, and computer-readable instructions 421 stored in the memory 420 and running on the processor 410. When the processor 410 executes the computer-readable instructions 421, the steps in the various embodiments of the above-mentioned face detection method are implemented, for example, steps S101 to S105 shown in FIG. 1. Alternatively, when the processor 410 executes the computer-readable instructions 421, the functions of the modules/units in the foregoing device embodiments, such as the functions of the modules 301 to 305 shown in FIG. 3, are implemented.
示例性的，所述计算机可读指令421可以被分割成一个或多个模块/单元，所述一个或者多个模块/单元被存储在所述存储器420中，并由所述处理器410执行，以完成本申请。所述一个或多个模块/单元可以是能够完成特定功能的一系列计算机可读指令段，该指令段可以用于描述所述计算机可读指令421在所述终端设备400中的执行过程。例如，所述计算机可读指令421可以被分割成采集模块、识别模块、裁剪模块、生成模块和检测模块，各模块具体功能如下：Exemplarily, the computer-readable instructions 421 may be divided into one or more modules/units, which are stored in the memory 420 and executed by the processor 410 to complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and these instruction segments may be used to describe the execution process of the computer-readable instructions 421 in the terminal device 400. For example, the computer-readable instructions 421 can be divided into a collection module, a recognition module, a cropping module, a generation module, and a detection module, whose specific functions are as follows:
采集模块,用于采集多个样本用户的人体点云数据,所述人体点云数据包括多个数据点,各个数据点分别具有相应的坐标值;The collection module is used to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
识别模块,用于根据所述各个数据点的坐标值,分别识别各个样本用户的人体点云数据中的人脸鼻尖位置;The recognition module is used to recognize the nose tip position of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
裁剪模块,用于基于所述人脸鼻尖位置,从所述人体点云数据中裁剪出人脸点云数据;A cropping module, configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
生成模块,用于通过对所述多个样本用户的人脸点云数据进行模型训练,生成人脸检测模型;A generating module for generating a face detection model by performing model training on the face point cloud data of the multiple sample users;
检测模块，用于在接收到待检测对象的点云数据时，采用所述人脸检测模型对所述待检测对象的点云数据进行检测，识别所述待检测对象的点云数据中是否包括人脸。The detection module is configured to, upon receiving the point cloud data of the object to be detected, use the face detection model to detect that point cloud data and identify whether the point cloud data of the object to be detected includes a human face.
所述终端设备400可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。所述终端设备400可包括,但不仅限于,处理器410、存储器420。本领域技术人员可以理解,图4仅仅是终端设备400的一种示例,并不构成对终端设备400的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如所述终端设备400还可以包括输入输出设备、网络接入设备、总线等。The terminal device 400 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. The terminal device 400 may include, but is not limited to, a processor 410 and a memory 420. Those skilled in the art can understand that FIG. 4 is only an example of the terminal device 400, and does not constitute a limitation on the terminal device 400. It may include more or less components than shown in the figure, or combine certain components, or different components. For example, the terminal device 400 may also include input and output devices, network access devices, buses, and so on.
所述处理器410可以是中央处理单元(Central Processing Unit，CPU)，还可以是其他通用处理器、数字信号处理器(Digital Signal Processor，DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现成可编程门阵列(Field-Programmable Gate Array，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor 410 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor, or it may be any conventional processor.
所述存储器420可以是所述终端设备400的内部存储单元，例如终端设备400的硬盘或内存。所述存储器420也可以是所述终端设备400的外部存储设备，例如所述终端设备400上配备的插接式硬盘，智能存储卡(Smart Media Card,SMC)，安全数字(Secure Digital,SD)卡，闪存卡(Flash Card)等等。进一步地，所述存储器420还可以既包括所述终端设备400的内部存储单元也包括外部存储设备。所述存储器420用于存储所述计算机可读指令421以及所述终端设备400所需的其他指令和数据。所述存储器420还可以用于暂时地存储已经输出或者将要输出的数据。The memory 420 may be an internal storage unit of the terminal device 400, such as a hard disk or memory of the terminal device 400. The memory 420 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the terminal device 400. Further, the memory 420 may include both an internal storage unit and an external storage device of the terminal device 400. The memory 420 is used to store the computer-readable instructions 421 and other instructions and data required by the terminal device 400, and may also be used to temporarily store data that has been output or is about to be output.
本领域普通技术人员可以理解，实现上述实施例方法中的全部或部分流程，是可以通过计算机可读指令来指令相关的硬件来完成，所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中，该计算机可读指令在执行时，可包括如上述各方法的实施例的流程。其中，本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用，均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限，RAM以多种形式可得，诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a non-volatile computer-readable storage medium; when executed, the computer-readable instructions may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all be included within the protection scope of this application.

Claims (20)

  1. 一种人脸检测的方法,其特征在于,包括:A method for face detection, which is characterized in that it includes:
    采集多个样本用户的人体点云数据,所述人体点云数据包括多个数据点,各个数据点分别具有相应的坐标值;Collecting human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
    根据所述各个数据点的坐标值,分别识别各个样本用户的人体点云数据中的人脸鼻尖位置;According to the coordinate value of each data point, respectively identify the position of the nose tip of the face in the human body point cloud data of each sample user;
    基于所述人脸鼻尖位置,从所述人体点云数据中裁剪出人脸点云数据;Cropping out face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
    通过对所述多个样本用户的人脸点云数据进行模型训练,生成人脸检测模型;Generating a face detection model by performing model training on the face point cloud data of the multiple sample users;
    当接收到待检测对象的点云数据时,采用所述人脸检测模型对所述待检测对象的点云数据进行检测,识别所述待检测对象的点云数据中是否包括人脸。When the point cloud data of the object to be detected is received, the face detection model is used to detect the point cloud data of the object to be detected, and it is recognized whether the point cloud data of the object to be detected includes a human face.
  2. 根据权利要求1所述的方法,其特征在于,在所述采集多个样本用户的人体点云数据的步骤后,还包括:The method according to claim 1, characterized in that, after the step of collecting human body point cloud data of a plurality of sample users, the method further comprises:
    对所述多个样本用户的人体点云数据进行去噪处理;Denoising processing on the human body point cloud data of the multiple sample users;
    通过对去噪后的所述各个数据点的坐标值作比例变换,将所述人体点云数据归一化为具有预设规格的人体点云数据。The human body point cloud data is normalized to human body point cloud data with preset specifications by performing proportional transformation on the coordinate values of the respective data points after denoising.
  3. 根据权利要求1所述的方法,其特征在于,所述根据所述各个数据点的坐标值,分别识别各个样本用户的人体点云数据中的人脸鼻尖位置的步骤包括:The method according to claim 1, wherein the step of respectively identifying the position of the nose tip of the face in the human body point cloud data of each sample user according to the coordinate value of each data point comprises:
根据预设的坐标系的原点和方向，识别所述人体点云数据中在所述坐标系的横轴或纵轴上的坐标值最大值对应的数据点位置为人脸鼻尖位置；其中，所述坐标系的原点为所述人体点云数据的中心点，所述坐标系的横轴和纵轴构成的第一平面平行于水平面且垂直于第二平面，所述第二平面用于将所述人体点云数据划分为两部分，所述坐标系的横轴或纵轴与所述第二平面平行。According to the origin and directions of a preset coordinate system, identifying the position of the data point corresponding to the maximum coordinate value on the horizontal or vertical axis of the coordinate system in the human body point cloud data as the face nose tip position; wherein the origin of the coordinate system is the center point of the human body point cloud data, the first plane formed by the horizontal and vertical axes of the coordinate system is parallel to the horizontal plane and perpendicular to the second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal or vertical axis of the coordinate system is parallel to the second plane.
  4. 根据权利要求3所述的方法,其特征在于,所述基于所述人脸鼻尖 位置,从所述人体点云数据中裁剪出人脸点云数据的步骤包括:The method according to claim 3, wherein the step of cropping face point cloud data from the human body point cloud data based on the position of the nose tip of the face comprises:
    以所述人脸鼻尖位置为原点构建三维坐标系,通过提取在所述三维坐标系的各个方向上预设长度内的多个数据点,获得人脸点云数据。A three-dimensional coordinate system is constructed with the position of the nose tip of the human face as the origin, and a plurality of data points within a preset length in each direction of the three-dimensional coordinate system are extracted to obtain the face point cloud data.
  5. 根据权利要求1所述的方法,其特征在于,所述通过对所述多个样本用户的人脸点云数据进行模型训练,生成人脸检测模型的步骤包括:The method according to claim 1, wherein the step of generating a face detection model by performing model training on the face point cloud data of the multiple sample users comprises:
    将所述多个样本用户的人脸点云数据输入预置的三维点云网络模型中进行模型训练;Input the face point cloud data of the multiple sample users into a preset three-dimensional point cloud network model for model training;
通过将所述三维点云网络模型的全连接层配置为两层，生成二分类的人脸检测模型，所述人脸检测模型存储有训练后获得的所述人脸点云数据中各个数据点之间的稀疏度数据。By configuring the fully connected layers of the three-dimensional point cloud network model as two layers, generating a two-class face detection model, where the face detection model stores the sparsity data, obtained after training, between the data points in the face point cloud data.
  6. 根据权利要求5所述的方法，其特征在于，所述采用所述人脸检测模型对所述待检测对象的点云数据进行检测，识别所述待检测对象的点云数据中是否包括人脸的步骤包括：The method according to claim 5, wherein the step of using the face detection model to detect the point cloud data of the object to be detected and identifying whether the point cloud data of the object to be detected includes a human face comprises:
    采用所述二分类的人脸检测模型获取所述待检测对象的点云数据中各个数据点之间的稀疏度;Acquiring the sparsity between each data point in the point cloud data of the object to be detected by using the two-class face detection model;
    计算所述各个数据点之间的稀疏度与所述人脸检测模型中存储的各个数据点之间的稀疏度数据的相似度;Calculating the similarity between the sparsity between the various data points and the sparsity data between the various data points stored in the face detection model;
    若所述相似度超过预设阈值，则识别所述待检测对象的点云数据中包括人脸。If the similarity exceeds a preset threshold, it is recognized that the point cloud data of the object to be detected includes a human face.
  7. 一种人脸检测的装置,其特征在于,包括:A face detection device, which is characterized in that it comprises:
    采集模块,用于采集多个样本用户的人体点云数据,所述人体点云数据包括多个数据点,各个数据点分别具有相应的坐标值;The collection module is used to collect human body point cloud data of multiple sample users, where the human body point cloud data includes multiple data points, and each data point has a corresponding coordinate value;
    识别模块,用于根据所述各个数据点的坐标值,分别识别各个样本用户的人体点云数据中的人脸鼻尖位置;The recognition module is used to recognize the nose tip position of the face in the human body point cloud data of each sample user according to the coordinate value of each data point;
    裁剪模块,用于基于所述人脸鼻尖位置,从所述人体点云数据中裁剪出人脸点云数据;A cropping module, configured to crop the face point cloud data from the human body point cloud data based on the position of the nose tip of the face;
    生成模块,用于通过对所述多个样本用户的人脸点云数据进行模型训练,生成人脸检测模型;A generating module for generating a face detection model by performing model training on the face point cloud data of the multiple sample users;
    检测模块，用于在接收到待检测对象的点云数据时，采用所述人脸检测模型对所述待检测对象的点云数据进行检测，识别所述待检测对象的点云数据中是否包括人脸。The detection module is configured to, upon receiving the point cloud data of the object to be detected, use the face detection model to detect that point cloud data and identify whether the point cloud data of the object to be detected includes a human face.
  8. 根据权利要求7所述的装置,其特征在于,所述识别模块包括:The device according to claim 7, wherein the identification module comprises:
    人脸鼻尖位置识别子模块，用于根据预设的坐标系的原点和方向，识别所述人体点云数据中在所述坐标系的横轴或纵轴上的坐标值最大值对应的数据点位置为人脸鼻尖位置；其中，所述坐标系的原点为所述人体点云数据的中心点，所述坐标系的横轴和纵轴构成的第一平面平行于水平面且垂直于第二平面，所述第二平面用于将所述人体点云数据划分为两部分，所述坐标系的横轴或纵轴与所述第二平面平行。The face nose tip position recognition sub-module is configured to identify, according to the origin and directions of the preset coordinate system, the position of the data point corresponding to the maximum coordinate value on the horizontal or vertical axis of the coordinate system in the human body point cloud data as the face nose tip position; wherein the origin of the coordinate system is the center point of the human body point cloud data, the first plane formed by the horizontal and vertical axes of the coordinate system is parallel to the horizontal plane and perpendicular to the second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal or vertical axis of the coordinate system is parallel to the second plane.
  9. 根据权利要求8所述的装置,其特征在于,所述裁剪模块包括:The device according to claim 8, wherein the cutting module comprises:
    人脸点云数据提取子模块，用于以所述人脸鼻尖位置为原点构建三维坐标系，通过提取在所述三维坐标系的各个方向上预设长度内的多个数据点，获得人脸点云数据。The face point cloud data extraction sub-module is configured to construct a three-dimensional coordinate system with the face nose tip position as the origin, and to obtain the face point cloud data by extracting multiple data points within a preset length in each direction of the three-dimensional coordinate system.
  10. 根据权利要求7所述的装置,其特征在于,所述生成模块包括:The device according to claim 7, wherein the generating module comprises:
    模型训练子模块,用于将所述多个样本用户的人脸点云数据输入预置的三维点云网络模型中进行模型训练;The model training sub-module is used to input the face point cloud data of the multiple sample users into a preset three-dimensional point cloud network model for model training;
    模型配置子模块，用于通过将所述三维点云网络模型的全连接层配置为两层，生成二分类的人脸检测模型，所述人脸检测模型存储有训练后获得的所述人脸点云数据中各个数据点之间的稀疏度数据。The model configuration sub-module is configured to generate a two-class face detection model by configuring the fully connected layers of the three-dimensional point cloud network model as two layers, where the face detection model stores the sparsity data, obtained after training, between the data points in the face point cloud data.
  11. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor, when executing the computer-readable instructions, implements the following steps:
    collecting human body point cloud data of a plurality of sample users, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    identifying, according to the coordinate values of the data points, the nose-tip position of the face in the human body point cloud data of each sample user;
    cropping face point cloud data out of the human body point cloud data based on the nose-tip position of the face;
    generating a face detection model by performing model training on the face point cloud data of the plurality of sample users;
    when point cloud data of an object to be detected is received, detecting the point cloud data of the object to be detected by using the face detection model, to identify whether the point cloud data of the object to be detected includes a human face.
  12. The terminal device according to claim 11, characterized in that the processor further implements the following steps when executing the computer-readable instructions:
    identifying, according to the origin and directions of a preset coordinate system, the position of the data point having the maximum coordinate value on the horizontal axis or the vertical axis of the coordinate system in the human body point cloud data as the nose-tip position of the human face; wherein the origin of the coordinate system is the center point of the human body point cloud data, a first plane formed by the horizontal axis and the vertical axis of the coordinate system is parallel to the horizontal plane and perpendicular to a second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal axis or the vertical axis of the coordinate system is parallel to the second plane.
  13. The terminal device according to claim 12, characterized in that the processor further implements the following steps when executing the computer-readable instructions:
    constructing a three-dimensional coordinate system with the nose-tip position of the face as the origin, and obtaining face point cloud data by extracting a plurality of data points within a preset length in each direction of the three-dimensional coordinate system.
  14. The terminal device according to claim 11, characterized in that the processor further implements the following steps when executing the computer-readable instructions:
    inputting the face point cloud data of the plurality of sample users into a preset three-dimensional point cloud network model for model training;
    generating a two-class face detection model by configuring the fully connected layers of the three-dimensional point cloud network model as two layers, the face detection model storing the sparsity data between the data points of the face point cloud data obtained after training.
  15. The terminal device according to claim 14, characterized in that the processor further implements the following steps when executing the computer-readable instructions:
    obtaining the sparsity between the data points of the point cloud data of the object to be detected by using the two-class face detection model;
    calculating the similarity between the sparsity between the data points and the sparsity data between the data points stored in the face detection model;
    if the similarity exceeds a preset threshold, identifying that the point cloud data of the object to be detected includes a human face.
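The decision rule of claim 15 can be sketched as follows, under stated assumptions: "sparsity" is approximated here as the mean nearest-neighbour distance of the cloud, and similarity as one minus the relative difference against the stored value. Both the sparsity proxy and the 0.8 threshold are illustrative assumptions; the claims do not fix a particular measure.

```python
import math

# Hypothetical sketch of claim 15's decision rule: compare a sparsity
# statistic of the candidate cloud against the value stored by the
# trained model, and report a face when the similarity clears a
# preset threshold. Measure and threshold are assumptions.

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour
    (a simple proxy for the sparsity between data points)."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def is_face(candidate, stored_sparsity, threshold=0.8):
    """True when the candidate's sparsity is similar enough to the
    sparsity data stored in the trained model."""
    s = mean_nn_distance(candidate)
    similarity = 1.0 - abs(s - stored_sparsity) / max(s, stored_sparsity)
    return similarity > threshold
```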
  16. A non-volatile computer-readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the following steps:
    collecting human body point cloud data of a plurality of sample users, the human body point cloud data comprising a plurality of data points, each data point having a corresponding coordinate value;
    identifying, according to the coordinate values of the data points, the nose-tip position of the face in the human body point cloud data of each sample user;
    cropping face point cloud data out of the human body point cloud data based on the nose-tip position of the face;
    generating a face detection model by performing model training on the face point cloud data of the plurality of sample users;
    when point cloud data of an object to be detected is received, detecting the point cloud data of the object to be detected by using the face detection model, to identify whether the point cloud data of the object to be detected includes a human face.
  17. The non-volatile computer-readable storage medium according to claim 16, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    identifying, according to the origin and directions of a preset coordinate system, the position of the data point having the maximum coordinate value on the horizontal axis or the vertical axis of the coordinate system in the human body point cloud data as the nose-tip position of the human face; wherein the origin of the coordinate system is the center point of the human body point cloud data, a first plane formed by the horizontal axis and the vertical axis of the coordinate system is parallel to the horizontal plane and perpendicular to a second plane, the second plane is used to divide the human body point cloud data into two parts, and the horizontal axis or the vertical axis of the coordinate system is parallel to the second plane.
  18. The non-volatile computer-readable storage medium according to claim 17, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    constructing a three-dimensional coordinate system with the nose-tip position of the face as the origin, and obtaining face point cloud data by extracting a plurality of data points within a preset length in each direction of the three-dimensional coordinate system.
  19. The non-volatile computer-readable storage medium according to claim 16, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    inputting the face point cloud data of the plurality of sample users into a preset three-dimensional point cloud network model for model training;
    generating a two-class face detection model by configuring the fully connected layers of the three-dimensional point cloud network model as two layers, the face detection model storing the sparsity data between the data points of the face point cloud data obtained after training.
  20. The non-volatile computer-readable storage medium according to claim 19, characterized in that the computer-readable instructions, when executed by a processor, further implement the following steps:
    obtaining the sparsity between the data points of the point cloud data of the object to be detected by using the two-class face detection model;
    calculating the similarity between the sparsity between the data points and the sparsity data between the data points stored in the face detection model;
    if the similarity exceeds a preset threshold, identifying that the point cloud data of the object to be detected includes a human face.
PCT/CN2019/117181 2019-09-18 2019-11-11 Face detection method and apparatus, and terminal device WO2021051538A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910882002.0A CN110717406B (en) 2019-09-18 2019-09-18 Face detection method and device and terminal equipment
CN201910882002.0 2019-09-18

Publications (1)

Publication Number Publication Date
WO2021051538A1 (en) 2021-03-25

Family

ID=69210571

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117181 WO2021051538A1 (en) 2019-09-18 2019-11-11 Face detection method and apparatus, and terminal device

Country Status (2)

Country Link
CN (1) CN110717406B (en)
WO (1) WO2021051538A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7380633B2 (en) * 2021-04-01 2023-11-15 トヨタ自動車株式会社 Monitoring device, monitoring method and monitoring system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654035A (en) * 2015-12-21 2016-06-08 湖南拓视觉信息技术有限公司 Three-dimensional face recognition method and data processing device applying three-dimensional face recognition method
CN105809113A (en) * 2016-03-01 2016-07-27 湖南拓视觉信息技术有限公司 Three-dimensional human face identification method and data processing apparatus using the same
CN106096555A (en) * 2016-06-15 2016-11-09 湖南拓视觉信息技术有限公司 The method and apparatus of three dimensional face detection
CN109670487A (en) * 2019-01-30 2019-04-23 汉王科技股份有限公司 A kind of face identification method, device and electronic equipment
CN110175529A (en) * 2019-04-30 2019-08-27 东南大学 A kind of three-dimensional face features' independent positioning method based on noise reduction autoencoder network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701448B (en) * 2015-12-31 2019-08-09 湖南拓视觉信息技术有限公司 Three-dimensional face point cloud nose detection method and the data processing equipment for applying it
CN110020578A (en) * 2018-01-10 2019-07-16 广东欧珀移动通信有限公司 Image processing method, device, storage medium and electronic equipment


Also Published As

Publication number Publication date
CN110717406B (en) 2024-04-09
CN110717406A (en) 2020-01-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (ref document number: 19945753; country of ref document: EP; kind code of ref document: A1)
NENP Non-entry into the national phase (ref country code: DE)
122 Ep: pct application non-entry in european phase (ref document number: 19945753; country of ref document: EP; kind code of ref document: A1)