CN114677735A - Face key point and three-dimensional angle detection method and device and terminal equipment - Google Patents

Face key point and three-dimensional angle detection method and device and terminal equipment

Info

Publication number
CN114677735A
Authority
CN
China
Prior art keywords
face
image data
dimensional angle
detection
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210301641.5A
Other languages
Chinese (zh)
Inventor
刘明
魏玉蓉
苏云强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jingyang Information Technology Co ltd
Original Assignee
Shenzhen Jingyang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jingyang Information Technology Co ltd filed Critical Shenzhen Jingyang Information Technology Co ltd
Priority to CN202210301641.5A priority Critical patent/CN114677735A/en
Publication of CN114677735A publication Critical patent/CN114677735A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The present application is applicable to the technical field of face detection, and provides a face key point and three-dimensional angle detection method, apparatus, and terminal device. The method comprises: acquiring image data to be detected, the image data to be detected being original image data containing at least one face; and inputting the image data to be detected into a pre-trained face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected, the detection result being image data comprising the position information of each face, the key point position information of each face, the three-dimensional angle data of each face, and a proportion score. By processing the image data to be detected in multiple ways through the pre-trained face key point and three-dimensional angle detection model, the image data of the face detection results corresponding to several different face detection requirements is obtained in a single pass, which reduces device load and detection time and improves the detection efficiency and precision of multi-task face detection.

Description

Face key point and three-dimensional angle detection method and device and terminal equipment
Technical Field
The application belongs to the technical field of face detection, and particularly relates to a face key point and three-dimensional angle detection method and device, a terminal device and a readable storage medium.
Background
When performing face detection on image data, different kinds of face detection processing must be carried out according to the user's different requirements so as to obtain different detection results (for example, detecting and determining the position information of a face, the position information of the face's key points, or the three-dimensional angle of the face).
Related face detection methods can generally satisfy only one face detection requirement at a time. When a user has multiple detection requirements, the data must be input repeatedly to obtain the corresponding face detection results one by one, which consumes substantial detection time and device performance and lowers detection efficiency and detection precision.
Disclosure of Invention
The embodiments of the present application provide a face key point and three-dimensional angle detection method, apparatus, terminal device, and readable storage medium, which can solve the problem of low detection efficiency and detection precision in related face detection methods.
In a first aspect, an embodiment of the present application provides a method for detecting key points and three-dimensional angles of a human face, including:
acquiring image data to be detected; the image data to be detected is original image data containing at least one human face;
inputting the image data to be detected into a pre-trained face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score.
In one embodiment, the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model;
before the image data to be detected is obtained, the method comprises the following steps:
establishing the newly-built Retinaface face detection network model; each branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
In one embodiment, the three-dimensional angle calculation model is composed of a 3 × 3 convolution layer, a normalization layer and an activation layer which are sequentially connected;
the three-dimensional angle calculation model is used for processing the image data to be detected to obtain the three-dimensional angle data of each face in the image data to be detected.
In an embodiment, after the creating of the new Retinaface face detection network model, the method further includes:
acquiring a plurality of training image data containing human faces;
determining position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score of each training image data in each training image data, and adding corresponding labels to obtain a training data set;
and inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model.
In an embodiment, the inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model includes:
inputting the training data set into the newly-built Retinaface face detection network model, and calculating three-dimensional angle data of each face in the training image data through a newly-added angle loss function to obtain an error update value;
and performing back propagation based on the error update value to obtain the pre-trained face key points and the three-dimensional angle detection model.
In one embodiment, the angle loss function is expressed as:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
In a second aspect, an embodiment of the present application provides a device for detecting key points and three-dimensional angles of a human face, including:
the data acquisition module is used for acquiring image data to be detected; the image data to be detected is original image data containing at least one human face;
the detection module is used for inputting the image data to be detected into a pre-trained human face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score.
In one embodiment, the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model;
the device further comprises:
the model establishing module is used for establishing the newly-built Retinaface face detection network model; each branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
In one embodiment, the three-dimensional angle calculation model is composed of a 3 × 3 convolution layer, a normalization layer and an activation layer which are sequentially connected;
the three-dimensional angle calculation model is used for processing the image data to be detected to obtain the three-dimensional angle data of each face in the image data to be detected.
In one embodiment, the apparatus further comprises:
the training data acquisition module is used for acquiring a plurality of training image data containing human faces;
the label module is used for determining the position information of each face, the key point position information of each face, the three-dimensional angle data of each face and the proportional score of each training image data in each training image data, and adding a corresponding label to obtain a training data set;
and the pre-training module is used for inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model.
In one embodiment, the pre-training module comprises:
an error determining unit, configured to input the training data set into the newly-built Retinaface face detection network model, and calculate three-dimensional angle data of each face in the training image data by adding an angle loss function, so as to obtain an error update value;
and the training unit is used for carrying out back propagation on the basis of the error updating value to obtain the pre-trained human face key points and the three-dimensional angle detection model.
In one embodiment, the angle loss function is expressed as:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for detecting a key point of a human face and a three-dimensional angle according to any one of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for detecting key points and three-dimensional angles of a human face according to any one of the foregoing first aspects is implemented.
In a fifth aspect, an embodiment of the present application provides a computer program product, which when running on a terminal device, causes the terminal device to execute the method for detecting a key point of a human face and a three-dimensional angle according to any one of the above first aspects.
Compared with the prior art, the embodiments of the present application have the following beneficial effect: the image data to be detected containing faces is processed in multiple ways by the pre-trained face key point and three-dimensional angle detection model, and the image data of the face detection results corresponding to several different face detection requirements is obtained in a single pass, thereby reducing device load and detection time and improving the detection efficiency and precision of multi-task face detection.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flow chart of a face key point and three-dimensional angle detection method provided in an embodiment of the present application;
fig. 2 is another schematic flow chart of a face key point and three-dimensional angle detection method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of another method for detecting key points and three-dimensional angles of a human face according to an embodiment of the present application;
fig. 4 is a schematic flowchart of step S1003 of a method for detecting key points and three-dimensional angles of a human face according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a face key point and three-dimensional angle detection apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The face key point and three-dimensional angle detection method provided by the embodiment of the application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, Augmented Reality (AR)/Virtual Reality (VR) devices, notebook computers, Personal Digital Assistants (PDAs) and the like, and the embodiment of the application does not limit the specific types of the terminal devices.
Fig. 1 shows a schematic flowchart of a face key point and three-dimensional angle detection method provided in the present application, which may be applied to the above-mentioned notebook computer by way of example and not limitation.
S101, acquiring image data to be detected; the image data to be detected is original image data containing at least one human face.
Specifically, the image data to be detected is acquired, which may be captured by an external camera, transmitted by an external device, or input by a user; the image data to be detected is original image data that contains at least one face and has not undergone any image processing.
S102, inputting the image data to be detected into a pre-trained face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score.
Specifically, the image data to be detected is input into a pre-established, pre-trained face key point and three-dimensional angle detection model to obtain the detection result output by the model. The face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model, and the detection result it outputs is image data containing the results corresponding to several different face detection requirements: the position information of each face in the image data to be detected, the key point position information of each face (in the general case, 5 face key points are detected: the left eye, right eye, nose, left mouth corner, and right mouth corner), the three-dimensional angle data of each face, and the proportion score of the image data to be detected.
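As an illustration of this step, a minimal PyTorch-style inference sketch follows; the checkpoint name, the callable model object, and the output unpacking are assumptions for illustration only, not the patent's actual interface:

```python
import torch

# Hypothetical pre-trained model checkpoint; the file name and the fact that
# it loads as a callable module are assumptions for illustration.
model = torch.load("face_kpt_3d_angle_model.pt", map_location="cpu")
model.eval()

# Image data to be detected: raw image containing at least one face,
# shaped (batch, 3, H, W) and normalized as during training.
image_to_detect = torch.rand(1, 3, 640, 640)

with torch.no_grad():
    boxes, landmarks, angles, scores = model(image_to_detect)

# boxes:     (numAnchors, 4)  position information of each face
# landmarks: (numAnchors, 10) the 5 key points (left eye, right eye, nose,
#                             left/right mouth corner), x and y per point
# angles:    (numAnchors, 3)  pitch, yaw, roll per face
# scores:    (numAnchors, 2)  background-vs-face proportion score
```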
In one embodiment, the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model.
As shown in fig. 2, in an embodiment, before the step S101 of acquiring image data to be detected, the method includes:
S100, establishing the newly-built Retinaface face detection network model; each branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
Specifically, the original Retinaface face detection network model can detect and determine the key point position information of each face, the proportion score of the image data (when detecting faces, the Retinaface model classifies each anchor box into two classes, background and face; the proportion score is the probability that the image data contains a face), and the position information of each face. The Retinaface model therefore needs to be improved in advance to establish the face key point and three-dimensional angle detection model. Compared with the original Retinaface face detection network model, the newly-built model differs mainly in the output head and the loss function. Each branch of the output head is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion (permute) layer, and a remodeling (reshape) layer, so that the output of the newly-built model has size (Batchsize, numAnchors, dimension), where Batchsize is the number of input images to be processed, numAnchors is the number of anchor boxes generated for the size of the image data to be processed, and the dimension differs per face detection task: 2 for plain face/background detection, 4 for the face detection box (box), and 10 for the face key point positions (landmark).
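A minimal sketch of one such output-head branch, assuming PyTorch with batch normalization and ReLU (the patent names only "a normalization layer and an activation layer"); the channel counts are likewise assumptions:

```python
import torch
import torch.nn as nn

class HeadBranch(nn.Module):
    """One output-head branch: 3D-angle block -> conv -> permute -> reshape.

    `dim` is the per-anchor output dimension: 2 for face/background
    classification, 4 for the detection box, 10 for the five key points.
    """
    def __init__(self, in_channels: int, anchors_per_cell: int, dim: int):
        super().__init__()
        # Three-dimensional angle calculation model: 3x3 conv + norm + activation.
        self.angle_block = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
        )
        self.conv = nn.Conv2d(in_channels, anchors_per_cell * dim, kernel_size=1)
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(self.angle_block(x))        # (B, A*dim, H, W)
        out = out.permute(0, 2, 3, 1).contiguous()  # conversion (permute) layer
        return out.view(out.size(0), -1, self.dim)  # remodeling layer: (Batchsize, numAnchors, dim)
```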
In one embodiment, the three-dimensional angle calculation model is composed of a 3 × 3 convolution layer, a normalization layer and an activation layer which are sequentially connected;
the three-dimensional angle calculation model is used for processing the image data to be detected to obtain the three-dimensional angle data of each face in the image data to be detected.
Specifically, the output head of the newly-built Retinaface face detection network model differs from that of the original Retinaface face detection network model mainly in the three-dimensional angle calculation model, which is formed by sequentially connecting a 3 × 3 convolution layer, a normalization layer, and an activation layer, and which processes the image data to be detected to obtain the three-dimensional angle data of each face in it.
The three-dimensional angle data of a face refers mainly to its Euler angles. During face detection, a spatial coordinate system is established with the center of the user's head as the origin; the corresponding Euler angles are pitch (the angle of rotation of the face about the X axis), yaw (the angle of rotation about the Y axis), and roll (the angle of rotation about the Z axis).
As shown in fig. 3, in an embodiment, after the creating of the new Retinaface face detection network model, the method further includes:
s1001, acquiring a plurality of training image data containing human faces;
s1002, determining position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score of each training image data in each training image data, and adding corresponding labels to obtain a training data set;
s1003, inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model.
Specifically, a large amount of training image data, each containing at least one face, is acquired. In each training image, the position information of each face, the key point position information of each face, the three-dimensional angle data of each face, and the proportion score of the training image are identified and determined, and corresponding labels are added one by one to obtain a training data set. The training data set is then input into the newly-built Retinaface face detection network model for optimization training, yielding the pre-trained face key point and three-dimensional angle detection model.
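For illustration, the annotations of a single training image might be organized as below; all field names are hypothetical, chosen only to mirror the four label types described above:

```python
# Hypothetical annotation for one training image with a single face; all
# field names are assumptions chosen to mirror the label types above.
training_sample = {
    "image_path": "train/000001.jpg",
    "faces": [
        {
            "box": [112.0, 80.0, 210.0, 195.0],           # face position (x1, y1, x2, y2)
            "landmarks": [130, 110, 180, 108,             # left eye, right eye,
                          155, 140, 138, 170, 175, 168],  # nose, left/right mouth corner
            "euler_angles": [5.2, -12.8, 1.4],            # pitch, yaw, roll (degrees)
        }
    ],
    "proportion_score": 1.0,  # probability that the labeled region is a face
}
```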
As shown in fig. 4, in an embodiment, the step S1003 of inputting the training data set into the newly-created Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model includes:
s10031, inputting the training data set into the newly-built Retinaface face detection network model, and calculating three-dimensional angle data of each face in the training image data by adding an angle loss function to obtain an error update value;
and S10032, performing back propagation based on the error update value to obtain the pre-trained face key points and the three-dimensional angle detection model.
Specifically, an angle loss function is newly added to the original Retinaface face detection network model. The training data set is input into the newly-built Retinaface face detection network model, and the three-dimensional angle data of each face in the training image data is evaluated through the newly added angle loss function (comprising a classification loss over angle intervals and a regression loss) to obtain an error update value. Back propagation is then performed with the error update value to reduce the error of the three-dimensional angle detection data, yielding the pre-trained face key point and three-dimensional angle detection model.
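A skeletal training step under these assumptions might look as follows; `model`, `train_loader`, and the `angle_loss` helper (sketched after the loss expression below) are placeholders, and the dictionary keys are hypothetical:

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

for images, targets in train_loader:              # labeled training data set
    preds = model(images)                         # includes angle-bin logits
    # Newly added angle loss: classification over angle intervals plus a
    # weighted regression term on the recovered angles (expression below).
    loss = angle_loss(preds["angle_logits"], targets["euler_angles"])
    optimizer.zero_grad()
    loss.backward()                               # back-propagate the error update value
    optimizer.step()
```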
In one embodiment, the angle loss function is expressed as:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
Specifically, when calculating the error of the three-dimensional angle data, the Euler angles are first divided into angle intervals (for example, with the interval set to 3 degrees and the yaw angle ranging over [-99, 99], the yaw angle is divided into 66 classes). The image data is classified over these intervals to obtain an interval classification result, which is then converted back into continuous three-dimensional angle data. The classification loss is determined from this classification, and regression over the converted three-dimensional angle data yields the regression loss. The regression loss (weighted by the coefficient β) is combined with the classification loss to obtain the final angle loss, whose expression is:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
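A minimal sketch of this computation for a single Euler angle, assuming PyTorch, 3-degree intervals over [-99, 99] (66 classes), and expected-value decoding of the bin distribution back to a continuous angle (the patent does not spell out the decoding step, so that choice is an assumption):

```python
import torch
import torch.nn.functional as F

def angle_loss(logits: torch.Tensor, gt_degrees: torch.Tensor,
               beta: float = 0.001) -> torch.Tensor:
    """Lpose = Lce + beta * Lmse for one Euler angle.

    logits:     (N, 66) per-face scores over 3-degree bins covering [-99, 99].
    gt_degrees: (N,) ground-truth angle in degrees.
    """
    # Classification target: index of the 3-degree interval containing the angle.
    gt_bins = torch.div(gt_degrees + 99, 3, rounding_mode="floor").long().clamp(0, 65)
    l_ce = F.cross_entropy(logits, gt_bins)  # cross-entropy loss Lce

    # Convert the interval classification back to a continuous angle by taking
    # the expected value of the bin centers under the softmax distribution.
    bin_centers = torch.arange(66, dtype=logits.dtype) * 3 - 99 + 1.5
    pred_degrees = (F.softmax(logits, dim=1) * bin_centers).sum(dim=1)
    l_mse = F.mse_loss(pred_degrees, gt_degrees)  # least-squares error loss Lmse

    return l_ce + beta * l_mse
```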
Specifically, based on the newly added angle loss function, the final loss of the pre-trained face key point and three-dimensional angle detection model can be expressed as: L = 2·Lcls + Lciou + Lwing + 0.25·Lpose, where Lcls represents the classification loss, Lwing represents the wing loss, and Lciou represents the complete-IoU (overlap) loss.
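In the same sketch style, with `l_cls`, `l_ciou`, `l_wing`, and `l_pose` standing for the per-task losses computed elsewhere:

```python
# Weighted combination of the task losses, per the expression above.
total_loss = 2 * l_cls + l_ciou + l_wing + 0.25 * l_pose
```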
In this embodiment, the pre-trained face key point and three-dimensional angle detection model performs multiple kinds of processing on the image data to be detected containing faces, obtaining in a single pass the image data of the face detection results corresponding to several different face detection requirements, thereby reducing device load and detection time and improving the detection efficiency and precision of multi-task face detection.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Corresponding to the method for detecting key points of a human face and three-dimensional angles described in the foregoing embodiments, fig. 5 shows a block diagram of a device for detecting key points of a human face and three-dimensional angles provided in the embodiments of the present application.
Referring to fig. 5, the face keypoint and three-dimensional angle detection apparatus 100 includes:
the data acquisition module 101 is used for acquiring image data to be detected; the image data to be detected is original image data containing at least one face;
the detection module 102 is configured to input the image data to be detected into a pre-trained face key point and three-dimensional angle detection model, so as to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score.
In one embodiment, the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model;
the device further comprises:
the model establishing module is used for establishing the newly-built Retinaface face detection network model; each branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
In one embodiment, the three-dimensional angle calculation model is composed of a 3 × 3 convolution layer, a normalization layer and an activation layer which are sequentially connected;
the three-dimensional angle calculation model is used for processing the image data to be detected to obtain the three-dimensional angle data of each face in the image data to be detected.
In one embodiment, the apparatus further comprises:
the training data acquisition module is used for acquiring a plurality of training image data containing human faces;
the label module is used for determining the position information of each face, the key point position information of each face, the three-dimensional angle data of each face and the proportional score of each training image data in each training image data, and adding a corresponding label to obtain a training data set;
and the pre-training module is used for inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model.
In one embodiment, the pre-training module comprises:
an error determining unit, configured to input the training data set into the newly-built Retinaface face detection network model, and calculate three-dimensional angle data of each face in the training image data by adding an angle loss function, so as to obtain an error update value;
and the training unit is used for carrying out back propagation on the basis of the error update value to obtain the pre-trained human face key points and the three-dimensional angle detection model.
In one embodiment, the angle loss function is expressed as:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
In this embodiment, the pre-trained face key point and three-dimensional angle detection model performs multiple kinds of processing on the image data to be detected containing faces, obtaining in a single pass the image data of the face detection results corresponding to several different face detection requirements, thereby reducing device load and detection time and improving the detection efficiency and precision of multi-task face detection.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
Fig. 6 is a schematic structural diagram of the terminal device provided in this embodiment. As shown in fig. 6, the terminal device 6 of this embodiment includes: at least one processor 60 (only one is shown in fig. 6), a memory 61, and a computer program 62 stored in the memory 61 and operable on the at least one processor 60, wherein the processor 60 executes the computer program 62 to implement the steps in any of the above embodiments of the face keypoint and three-dimensional angle detection method.
The terminal device 6 may be a computing device such as a desktop computer, a notebook, a palm computer, and a cloud server. The terminal device may include, but is not limited to, a processor 60, a memory 61. Those skilled in the art will appreciate that fig. 6 is only an example of the terminal device 6, and does not constitute a limitation to the terminal device 6, and may include more or less components than those shown, or combine some components, or different components, such as an input/output device, a network access device, and the like.
The Processor 60 may be a Central Processing Unit (CPU), and the Processor 60 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 61 may in some embodiments be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. In other embodiments, the memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital Card (SD), a Flash memory Card (Flash Card), and the like, which are equipped on the terminal device 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the terminal device 6. The memory 61 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of the computer program. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides a terminal device, where the terminal device includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the foregoing method embodiments.
The embodiments of the present application provide a computer program product, which when running on a mobile terminal, enables the mobile terminal to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal device, a recording medium, computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB disk, a removable hard disk, or a magnetic or optical disk. In certain jurisdictions, computer-readable media may not include electrical carrier signals or telecommunication signals, in accordance with legislation and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the above-described apparatus/network device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present application, and they should be construed as being included in the present application.

Claims (10)

1. A method for detecting key points and three-dimensional angles of a human face is characterized by comprising the following steps:
acquiring image data to be detected; the image data to be detected is original image data containing at least one human face;
inputting the image data to be detected into a pre-trained face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and proportion scores.
2. The method according to claim 1, wherein the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model;
before the image data to be detected is obtained, the method comprises the following steps:
establishing the newly-established Retinaface face detection network model; every branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
3. The method of claim 2, wherein the three-dimensional angle calculation model comprises a 3 x 3 convolution layer, a normalization layer and an activation layer sequentially connected together;
the three-dimensional angle calculation model is used for processing the image data to be detected to obtain the three-dimensional angle data of each face in the image data to be detected.
4. The method for detecting key points and three-dimensional angles of a human face according to claim 2, wherein after the establishing of the new Retinaface human face detection network model, the method further comprises:
acquiring a plurality of training image data containing human faces;
determining position information of each face, key point position information of each face, three-dimensional angle data of each face and a proportion score of each training image data in each training image data, and adding corresponding labels to obtain a training data set;
and inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model.
5. The method for detecting face key points and three-dimensional angles according to claim 4, wherein the step of inputting the training data set into the newly-built Retinaface face detection network model for training to obtain a pre-trained face key point and three-dimensional angle detection model comprises the steps of:
inputting the training data set into the newly-built Retinaface face detection network model, and calculating the three-dimensional angle data of each face in the training image data through a newly-added angle loss function to obtain an error update value;
and performing back propagation based on the error update value to obtain the pre-trained face key points and the three-dimensional angle detection model.
6. The method for detecting key points and three-dimensional angles of a human face according to claim 5, wherein the angle loss function expression is:
Lpose = Lce + β·Lmse
wherein Lce represents the cross-entropy loss, Lmse represents the least-squares error loss, and β is 0.001.
7. A human face key point and three-dimensional angle detection device is characterized by comprising:
the data acquisition module is used for acquiring image data to be detected; the image data to be detected is original image data containing at least one human face;
the detection module is used for inputting the image data to be detected into a pre-trained human face key point and three-dimensional angle detection model to obtain a detection result of the image data to be detected; and the detection result is image data comprising position information of each face, key point position information of each face, three-dimensional angle data of each face and proportion scores.
8. The apparatus according to claim 7, wherein the face key point and three-dimensional angle detection model is a newly-built Retinaface face detection network model;
the device further comprises:
the model establishing module is used for establishing the newly-established Retinaface face detection network model; every branch of the output head of the newly-built Retinaface face detection network model is formed by sequentially connecting a three-dimensional angle calculation model, a convolution layer, a conversion layer and a remodeling layer.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
CN202210301641.5A 2022-03-25 2022-03-25 Face key point and three-dimensional angle detection method and device and terminal equipment Pending CN114677735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210301641.5A CN114677735A (en) 2022-03-25 2022-03-25 Face key point and three-dimensional angle detection method and device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210301641.5A CN114677735A (en) 2022-03-25 2022-03-25 Face key point and three-dimensional angle detection method and device and terminal equipment

Publications (1)

Publication Number Publication Date
CN114677735A true CN114677735A (en) 2022-06-28

Family

ID=82075860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210301641.5A Pending CN114677735A (en) 2022-03-25 2022-03-25 Face key point and three-dimensional angle detection method and device and terminal equipment

Country Status (1)

Country Link
CN (1) CN114677735A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination