CN113642354A - Face pose determination method, computer device and computer readable storage medium

Publication number: CN113642354A
Authority: CN (China)
Prior art keywords: dimension, target, face, vector, lag
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202010344039.0A
Other languages: Chinese (zh)
Inventors: 李渊, 熊宇龙
Current and original assignee: Wuhan TCL Group Industrial Research Institute Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202010344039.0A

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to a method for determining a face pose, a computer device, and a computer-readable storage medium. The method includes: acquiring an image to be processed, wherein the image to be processed comprises a target face; determining position information corresponding to each key point in the target face; and determining the face pose corresponding to the target face according to the position information corresponding to each key point. Because the relative position relations of the key points differ under different poses, the face pose can be accurately determined from the position information of only a few key points. This reduces computation and saves computing time, so that the face pose corresponding to the target face can be determined quickly and accurately.

Description

Face pose determination method, computer device and computer readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method for determining a face pose, a computer device, and a computer-readable storage medium.
Background
With the development of artificial intelligence, artificial intelligence technologies have come into widespread use. In particular, face recognition is applied ever more broadly; for example, parcels can be picked up by face scanning: a face recognition device mounted on an express delivery locker identifies the person picking up a parcel and opens the corresponding compartment so that the parcel can be taken out. Face scanning is likewise used for entering residential communities, boarding public transport, making payments, and similar scenarios.
At present, the existing face recognition technology mainly adopts the following procedure: key point positioning (5 points, 68 points, or more) is performed on a detected face, and the located key points are used to judge the face pose. Specifically, the face pose is determined by three-dimensional face mapping, that is, by calculating the face angles in three-dimensional space, which include the left-right yaw angle, the up-down pitch angle, and the in-plane rotation angle of the face. However, determining the face pose by three-dimensional face mapping has the following problems. With fewer key points (for example, 5), calculation is fast, but because the number of key points is small, the calculated angles differ little between different poses, which can cause misjudgment, so the pose estimated by mapping is inaccurate. With more key points (for example, 68), the face pose can be estimated well and obtained accurately, but because every key point participates in the calculation, the computation required for the three-dimensional mapping increases greatly and the three-dimensional reconstruction estimation is time-consuming. That is, the prior art cannot achieve efficiency and accuracy at the same time.
Therefore, the prior art is in need of improvement.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method for determining a face pose, a computer device, and a computer-readable storage medium, so as to determine the face pose quickly and accurately.
In one aspect, an embodiment of the present invention provides a method for determining a face pose, including:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
In a second aspect, an embodiment of the present invention provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the following steps when executing the computer program:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
Compared with the prior art, the embodiment of the invention has the following advantages:
Because the relative position relations of the key points differ under different poses, the position information corresponding to the key points also differs, so the face pose corresponding to the target face can be accurately determined from the position information corresponding to each key point. Moreover, when determining the face pose, the position information of only a small number of key points suffices; a large number of key points is not needed. This reduces the amount of computation and saves computing time, so that the face pose of the target face can be determined quickly and accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is an application environment diagram of a method for determining a face pose according to an embodiment of the present invention;
FIG. 2 is a first flowchart of a method for determining a face pose according to an embodiment of the present invention;
FIG. 3 is a second flowchart of a method for determining a face pose according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a front key point in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a side key point in an embodiment of the present invention;
FIG. 6 is a visualization schematic of coef_xm, coef_ym, coef_xym and cpose for the frontal face in an embodiment of the present invention;
FIG. 7 is a visualization schematic of coef_xm, coef_ym, coef_xym and cpose for the side face in an embodiment of the present invention;
fig. 8 is an internal structural diagram of a computer device in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, terms related to embodiments of the present invention are described:
face pose: pose generally refers to the relative orientation and position of an object, and in particular a human face, relative to a camera. In general, when a head is rotated in a horizontal or vertical direction with respect to a front face, the front face of a person is at a certain angle of rotation or pitch with respect to a fixed camera. The human face pose comprises a front face, a side face, an upward view, an overlook view and the like, wherein the side face is divided into a left side face and a right side face.
Key points of the face: key points of the face are points that reflect the parts of the face; specifically, in a face image, they are feature points reflecting each part of the face. For example, facial-feature key points: the position of each facial feature is a key point. Face key points also include key points at the eyeball centers, eye corners, nose tip, mouth corners, face contour, eyebrows, and other parts.
Location information of key points: the location information of the key points refers to information representing locations of the key points, and specifically, in the case of the face image, refers to information representing locations of the key points in the face image. In the embodiment of the invention, the position of the key point in the face image is represented by adopting the coordinate information, and the coordinate information refers to the coordinate of the key point in a coordinate system.
Face pose score: the face pose score is a score obtained by quantitative analysis of the face pose. It is obtained from the correlation coefficient of the target vector and a preset correlation coefficient central value, where the correlation coefficient of the target vector is obtained under the face pose corresponding to the target face, and the preset correlation coefficient central value is obtained under a preset face pose. The face pose score reflects the difference between the correlation coefficient of the target vector and the preset correlation coefficient central value, and therefore actually reflects the difference between the face pose corresponding to the target face and the preset face pose; the face pose is determined from this difference. The preset face pose is a face pose set in advance as a reference for judging the face pose corresponding to the target face. For example, if the frontal face is the preset face pose, a larger face pose score indicates a larger difference between the face pose corresponding to the target face and the frontal face, so the face pose corresponding to the target face is not the frontal face; conversely, a smaller face pose score indicates a smaller difference, so the face pose corresponding to the target face is the frontal face. Different preset face poses yield different face pose scores, but since the score always reflects the difference between the pose of the target face and the preset pose, the choice of preset face pose does not affect the determination of the face pose.
Target vector: the target vector is a vector obtained by recombining the coordinate information of the key points. For example, if the coordinate information corresponding to the key points is (x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_t, y_t), ..., (x_n, y_n), the target vectors obtained after recombination are: the target vector corresponding to the x dimension, v_x = (x_1, x_2, x_3, ..., x_t, ..., x_n), and the target vector corresponding to the y dimension, v_y = (y_1, y_2, y_3, ..., y_t, ..., y_n), where x_t is the coordinate information corresponding to the x dimension of the t-th key point and y_t is the coordinate information corresponding to the y dimension of the t-th key point.
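A minimal sketch of this recombination, assuming the key points are given as an n × 2 NumPy array of (x, y) pairs (the array layout, function name, and example coordinates are illustrative, not taken from the patent):

```python
import numpy as np

def to_target_vectors(keypoints):
    """Recombine per-key-point (x, y) coordinate information into one
    target vector per dimension: v_x = (x_1, ..., x_n), v_y = (y_1, ..., y_n)."""
    keypoints = np.asarray(keypoints, dtype=float)  # shape (n, 2)
    v_x = keypoints[:, 0]  # coordinate information of all key points in the x dimension
    v_y = keypoints[:, 1]  # coordinate information of all key points in the y dimension
    return v_x, v_y

# Example with 5 key points (coordinates are made up for illustration):
pts = [(87, 140), (152, 142), (120, 100), (95, 60), (148, 62)]
v_x, v_y = to_target_vectors(pts)
```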
Lag vector: the lag vector is a vector obtained by lag processing of a target vector, all elements of the lag vector are derived from the target vector, and at least one element in one lag vector is different from the elements of the other lag vector.
Dimension: dimension refers to the number of coordinates of a key point in a coordinate system. For example, if the coordinate information corresponding to a key point is (x_1, y_1), the number of coordinates of the key point in the coordinate system is 2 and the coordinate system is a two-dimensional coordinate system; the dimension is 2, denoted as the x dimension and the y dimension. For another example, if the coordinate information corresponding to a key point is (x_1, y_1, z_1), the number of coordinates is 3 and the coordinate system is a three-dimensional coordinate system; the dimension is 3, denoted as the x dimension, the y dimension, and the z dimension.
Correlation coefficient: the correlation coefficient is a statistical index used for reflecting the degree of closeness of correlation between target vectors. The correlation coefficient comprises an autocorrelation coefficient and a cross correlation coefficient, the autocorrelation coefficient is a statistical index used for reflecting the degree of closeness of the correlation between the target vector and the autocorrelation coefficient, and the cross correlation coefficient is a statistical index used for reflecting the degree of closeness of the correlation between two different target vectors.
Variance value: a variance value is a value that measures the degree of dispersion of a target vector (or a lag vector). The variance values in the present invention include the mean square error and the covariance. The covariance is a value measuring the joint dispersion of two target vectors (or two lag vectors); the mean square error, also called the standard deviation, is the square root of the mean of the squared deviations of the data from their mean, and measures the degree of dispersion of a single target vector (or lag vector).
Preset correlation coefficient central value: the preset correlation coefficient central value is a preset reference value for measuring the magnitude of the correlation coefficients of different target vectors. It can be obtained from the position information of the key points under a preset face pose (such as the frontal face pose). Its size reflects the degree of dispersion of the position information of the key points under the preset face pose: the larger the preset correlation coefficient central value, the more dispersed the position information of the key points under the preset face pose; the smaller the value, the closer together the position information. For example, for the same key points, the preset correlation coefficient central value under a side-face pose is smaller than that under the frontal-face pose.
Correlation coefficient de-centering value: the correlation coefficient de-centering value is a value measuring the degree of dispersion between the correlation coefficient of the target vector and the preset correlation coefficient central value. Its magnitude reflects the size of the difference between the two: the larger the de-centering value, the larger the difference between the correlation coefficient of the target vector and the preset correlation coefficient central value, and hence the larger the difference between the face pose corresponding to the target vector (i.e., the face pose corresponding to the target face) and the preset face pose corresponding to that central value; that is, the face pose corresponding to the target face cannot be that preset face pose.
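As an illustration of the de-centering idea, one simple realization is the absolute deviation of a computed correlation coefficient from the preset central value; the sketch below assumes that realization, which this section of the patent does not spell out:

```python
def decentered_value(coef, center):
    """De-centering value of a correlation coefficient: its deviation from
    the preset correlation coefficient central value. A larger value means
    the pose of the target face differs more from the preset face pose.
    (Absolute difference is an assumption, not the patent's exact formula.)"""
    return abs(coef - center)
```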
The inventor has found that in face recognition, a number of key points are usually located and the face pose is determined from them, the number of key points usually being 5, 68, or more. With fewer key points, the calculation is easier and faster but less accurate; with more key points, the result is more accurate but the calculation is harder and more time-consuming. The prior art therefore struggles to balance time cost and accuracy, and cannot determine the face pose both quickly and accurately.
To solve the above problem, in the embodiment of the present invention, after the image to be processed is acquired, the position information corresponding to each key point in the target face is determined, and the face pose corresponding to the target face is then determined from that position information. Because the position information of the key points differs under different poses, the face pose can be determined from it; moreover, even a few key points suffice to reflect how the position information of the key points in the target face differs across poses. That is, the method places a low requirement on the number of key points and does not need a large number of them, which reduces the amount of computation, saves computing time, and allows the face pose corresponding to the target face to be determined quickly and accurately.
In addition, the procedure of this method for determining the face pose is simple, and the formula for calculating the correlation coefficient is also simple; the computational difficulty is far lower than that of three-dimensional reconstruction estimation, less time is consumed, and the computing-power requirement on hardware is lower.
The embodiment of the present invention may be applied to a scenario where, as shown in fig. 1, first, the terminal device 10 may acquire an image to be processed and transmit the image to be processed to the server 20. The server 20 may obtain the position information corresponding to each key point in the target face according to the image to be processed, and determine the face pose score corresponding to the target face according to the position information corresponding to each key point; and determining the face pose corresponding to the target face according to the face pose score. Then, the server 20 sends the face pose corresponding to the target face to the terminal device 10.
It is to be understood that, in the application scenario described above, the actions of the embodiment of the present invention are described as being performed in part by the terminal device 10 and in part by the server 20, as shown in fig. 1. However, such actions may be performed entirely by the server 20 or entirely by the terminal device 10. The invention is not limited in its implementation to the details of execution, provided that the acts disclosed in the embodiments of the invention are performed. The terminal device 10 includes a desktop terminal or a mobile terminal, such as a desktop computer, a tablet computer, a notebook computer, a smart phone, and the like. The servers 20 comprise individual physical servers, clusters of physical servers, or virtual servers.
After the method for determining the face pose is obtained, it may be used to process pictures taken by a terminal device having a camera. For example, when a photo shot by such a device is processed, the photo is obtained and the position information corresponding to each key point in the target face is determined, so that the face pose corresponding to the target face can be determined quickly and accurately. In practical applications, a face pose determination module may be built that implements this method and is configured in the terminal device with the camera; when the device takes a picture, the module is started and processes the picture, so that the device outputs the face pose corresponding to the target face in the picture.
It should be noted that the above application scenarios are only presented to facilitate understanding of the present invention, and the embodiments of the present invention are not limited in any way in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
Various non-limiting embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to fig. 2 and 3, a method for determining a face pose in an embodiment of the present invention is shown. In this embodiment, the method may include, for example, the steps of:
s1, acquiring a to-be-processed image, wherein the to-be-processed image comprises a target human face.
Specifically, the types of the image to be processed include two-dimensional images and three-dimensional images. The image to be processed may be a single image or an image synthesized from a plurality of images; for example, a three-dimensional image may be synthesized from a plurality of two-dimensional images. The image to be processed may be captured by an imaging system (such as a camera), sent by an external device (such as a smartphone), or obtained over a network (such as from Baidu). When a camera is used, a video frame may be extracted from the shot video to obtain the image to be processed.
Specifically, the image to be processed includes a target face; there may be one or more target faces in it. When determining the face pose, the pose is determined for a single target face at a time, and the position information corresponding to the key points of that target face is used only to determine its face pose, not the face poses of other target faces.
And S2, determining the position information corresponding to each key point in the target face respectively.
Specifically, the key points of the target face may be determined by manual labeling or by automatic identification. For example, with automatic identification, the image to be processed is input into a trained face key point recognition model, which outputs the key points of the face; the target face is thus recognized from the image to be processed and the position information corresponding to its key points is determined.
Step S2 includes:
and S21, inputting the image to be processed into the trained face key point recognition model to obtain each key point in the target face output by the preset face key point recognition model.
Specifically, the trained face key point recognition model is a network model for recognizing face key points; each key point of the target face in the image to be processed can be recognized by it. The model is obtained by training on a face image training sample set, in which each training sample includes a face image and the key points corresponding to that image. Face key point recognition methods include the MTCNN (Multi-task Cascaded Convolutional Networks) method, the RetinaFace method, and the like. In one implementation of this embodiment, the trained face key point recognition model is an MTCNN neural network and the training sample set is a face data set, preferably the 300W data set. The 300W data set comprises 300 indoor images and 300 outdoor images that differ in expression, illumination, pose, occlusion, and face size, and each face image is labeled with 68 face key points, so that inputting a face image into the trained model recognizes the 68 key points of the face. Of course, in other implementations of this embodiment, the face data set may also be the Columbia University public face database, and each face image may be labeled with a different number of key points, for example 5.
Specifically, the MTCNN comprises a P-Net (proposal network) layer, an R-Net (refine network) layer, and an O-Net (output network) layer. The P-Net layer extracts image pyramid features and calibrates candidate frames: after the features pass through three convolutional layers, the P-Net layer judges by a face classifier whether a region is a face and, at the same time, preliminarily determines candidate frames for face regions using bounding-box regression and a locator for the key points of the target face. The P-Net layer finally outputs a number of candidate frames of face regions where faces may exist, and these regions are input into the R-Net layer for further processing. The R-Net layer filters out candidate frames with few facial features: because the output of the P-Net layer consists only of candidate frames where faces may possibly exist, the R-Net layer refines and adjusts the input, eliminates most erroneous input, again performs bounding-box regression and key point positioning on the face regions, and finally outputs credible candidate frames of face regions for the O-Net layer. The O-Net layer identifies the facial region; it takes more input features, its last layer is a larger 256-unit fully connected layer that retains more image features, and it then performs face judgment, face-region bounding-box regression, and facial key point positioning. It finally outputs the upper-left and lower-right corner coordinates of the face region together with five key points of the face region. The output of the O-Net layer is taken as the output of the MTCNN neural network.
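For illustration, the sketch below obtains the five key points with the open-source `mtcnn` Python package, one common MTCNN implementation; the package, the image file name, and the return format are assumptions of this sketch, not requirements of the patent:

```python
import cv2                 # pip install opencv-python
from mtcnn import MTCNN    # pip install mtcnn

detector = MTCNN()
# The mtcnn package expects an RGB image; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)

faces = detector.detect_faces(image)  # one dict per detected face
if faces:
    box = faces[0]["box"]             # candidate frame: [x, y, width, height]
    kp = faces[0]["keypoints"]        # dict of five named key points
    keypoints = [kp["left_eye"], kp["right_eye"], kp["nose"],
                 kp["mouth_left"], kp["mouth_right"]]  # five (x, y) pairs
```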
Specifically, in one implementation of the embodiment of the present invention, the output result of the trained face key point recognition model is shown in fig. 4 and fig. 5, where fig. 4 is the frontal result and fig. 5 is the side result. The key points are determined according to the target face; for example, the key points of the target face cover the eyes, ears, nose, eyebrows, mouth, and other facial features, that is, they include key points corresponding to the eyes, ears, nose, eyebrows, and mouth. The number of key points can be set to trade off accuracy against efficiency: the more key points, the higher the accuracy and the lower the efficiency; the fewer key points, the lower the accuracy and the higher the efficiency. In the embodiment of the invention, the number of key points is 5-10.
S22, determining the position information corresponding to each key point in the target face according to the image to be processed and each key point in the target face, wherein the position information corresponding to a key point is the information of the position of that key point in the image to be processed.
Since the key points of the target face are feature points reflecting the parts of the face in the image to be processed, for example the facial features, the key points of the target face may also be called the facial-feature key points of the target face. Among these, several key points are placed along the edge of an eye to reflect the shape of the eye; key points are placed on the eyes and the mouth to reflect the relative position of, and the distance between, the eyes and the mouth. That is, the key points of the target face can reflect both the shapes of the facial features and the geometric relations between them in the image to be processed. The position information corresponding to each key point in the target face is determined from the image to be processed and the key points of the target face.
For example, as shown in fig. 4, the left eye, right eye, nose tip, left mouth corner, and right mouth corner of a target face are key points. A feature point reflecting the face in the image to be processed is in essence a group of pixel points in that image, and the position of a key point may be represented by the position of any one of those pixel points, or of course by the position of their central pixel point. For example, if the resolution of the image to be processed is 300 × 300 and the position of a certain pixel point in it is (i, j), say i = 100 and j = 180, then the position information corresponding to the key point is (100, 180).
The position information corresponding to the key points in the target face can be represented by coordinate information. The coordinate information corresponding to the key points is obtained as follows: a coordinate system is established for the image to be processed, and the coordinate information corresponding to each key point in the target face is read off in it. The position information corresponding to a key point comprises coordinate information in at least two dimensions that determines the position of the key point in the image to be processed.
In a two-dimensional image, the position of a key point can be determined only with coordinate information in at least two dimensions; in a three-dimensional image, coordinate information in at least three dimensions is required to determine the position of a key point. A two-dimensional coordinate system is established for a two-dimensional image and a three-dimensional coordinate system for a three-dimensional image; taking the two-dimensional coordinate system as an example:
as shown in fig. 4, after the candidate frame is determined, a two-dimensional cartesian coordinate system is established according to the candidate frame, and coordinate information is determined. The lower left corner of the candidate frame is taken as the origin O, the bottom side of the candidate frame is taken as the x axis, and the left side of the candidate frame is taken as the y axis. As can be seen from fig. 4, the number of the key points is 5, and the coordinates corresponding to each of the key points are (x)1,y1),(x2,y2),(x3,y3),(x4,y4),(x5,y5). The image to be processed of the target face comprises a two-dimensional image to be processed and a three-dimensional image to be processed, and accordingly, in terms of dimensionality, the coordinate system comprises a two-dimensional coordinate system and a three-dimensional coordinate system. When the coordinate system is determined, the image to be processed of the target human face can be usedAny one of the points is used as an origin, and each coordinate axis may be set according to actual conditions. In type, the coordinate system is: cartesian coordinate systems (including two-dimensional cartesian coordinate systems and three-dimensional cartesian coordinate systems), planar polar coordinate systems, cylindrical coordinate systems (or cylindrical coordinate systems), spherical coordinate systems (or spherical coordinate systems), and the like. The coordinate information is different when different coordinate systems are adopted, but the calculation of the correlation coefficient is not influenced, and the determination of the posture is not influenced. The following description will be given by taking a cartesian coordinate system as an example. That is, the two-dimensional coordinate system includes an origin, an x-axis, and a y-axis, and the three-dimensional coordinate system includes an origin, an x-axis, a y-axis, and a z-axis.
Of course, a coordinate system can also be established with a certain key point as the origin, which reduces the amount of calculation, or with the edges of the picture as the coordinate axes.
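As a sketch of this coordinate-system step: image pixel positions are usually indexed with the row number growing downward, so placing the origin at the lower-left corner of the candidate frame amounts to shifting by the frame corner and flipping the y direction. The box format (x_min, y_min, width, height) and the function name below are assumptions for illustration:

```python
def to_frame_coordinates(keypoints, box):
    """Convert key point pixel positions (px, py), with py growing downward,
    into the candidate-frame coordinate system described above: origin O at
    the lower-left corner of the frame, x axis along the bottom side,
    y axis up the left side."""
    x_min, y_min, w, h = box  # assumed format: top-left corner plus size
    return [(px - x_min, (y_min + h) - py) for (px, py) in keypoints]
```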
After the coordinate system is determined, the coordinate information corresponding to the key points is determined according to their positions in the image to be processed. Since the coordinate system may be two-dimensional or three-dimensional, the coordinate information correspondingly includes two-dimensional and three-dimensional coordinate information. For example, in a two-dimensional coordinate system, when the number of key points is 5, the coordinate information corresponding to the key points is (x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4), (x_5, y_5); when the number of key points is n, it is (x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_t, y_t), ..., (x_n, y_n), where x_t is the coordinate of the t-th key point on the x axis, i.e., the coordinate information in the x dimension corresponding to the t-th key point, and y_t is the coordinate of the t-th key point on the y axis, i.e., the coordinate information in the y dimension corresponding to the t-th key point. For another example, in a three-dimensional coordinate system, when the number of key points is 5, the coordinate information is (x_1, y_1, z_1), (x_2, y_2, z_2), (x_3, y_3, z_3), (x_4, y_4, z_4), (x_5, y_5, z_5); when the number of key points is n, it is (x_1, y_1, z_1), (x_2, y_2, z_2), (x_3, y_3, z_3), ..., (x_t, y_t, z_t), ..., (x_n, y_n, z_n), where x_t, y_t, and z_t are the coordinates of the t-th key point on the x, y, and z axes, respectively.
And S3, determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
The position information corresponding to the key points of the target face differs under different face poses. Specifically, under the frontal pose the position information of the key points is more dispersed, while under a side pose it is closer together; in particular, the position information of the key points of the left half of the face is closer to that of the key points of the right half. Therefore, the face pose corresponding to the target face can be determined from the position information corresponding to the key points.
Step S3 includes:
and S31, determining the face pose score corresponding to the target face according to the position information respectively corresponding to each key point.
Under different face poses of the target face, the face pose score corresponding to the target face differs, and the position information corresponding to each key point also differs; the face pose score is therefore determined from the position information corresponding to the key points, from which the face pose of the target face can be determined. In the embodiment of the invention, the position information of a key point is represented by its coordinate information in the image to be processed, and the face pose score corresponding to the target face is obtained by processing the coordinate information corresponding to the key points. Specifically, in one implementation of the embodiment of the present invention, the face pose score corresponding to the target face is determined according to the degree of correlation between the coordinate information corresponding to the key points.
Step S31 includes:
step S311, for each dimension, extracting the coordinate information corresponding to the dimension in each key point to obtain a target vector corresponding to the dimension.
The number of key points is at least 2; if there is only 1 key point, the face pose cannot be determined. In a two-dimensional plane, 2 distinct points determine exactly 1 straight line, whereas infinitely many straight lines pass through a single point; similarly, if there is only 1 key point, infinitely many face poses correspond to it, so the face pose cannot be determined. Therefore, in the embodiment of the present invention, the number of key points is set to be at least 2.
After the coordinate information corresponding to each key point is obtained, the target vector corresponding to each dimension is determined from it. If a two-dimensional coordinate system is adopted, i.e., there are 2 dimensions, each key point has 2 pieces of coordinate information; if a three-dimensional coordinate system is adopted, i.e., there are 3 dimensions, each key point has 3 pieces of coordinate information. That is, the number of pieces of coordinate information per key point equals the number of dimensions. The coordinate information is then split by dimension (or by coordinate axis; in a Cartesian coordinate system each coordinate axis represents a dimension) to form the target vector corresponding to each dimension: two-dimensional coordinate information is split into two groups according to the x dimension and the y dimension, namely the coordinate information corresponding to the x dimension and that corresponding to the y dimension; the coordinate information corresponding to the x dimension of all key points is recombined into the target vector v_x corresponding to the x dimension, and the coordinate information corresponding to the y dimension of all key points is recombined into the target vector v_y corresponding to the y dimension. The dimension (length) of a target vector equals the number of key points.
For example, in a two-dimensional coordinate system, if the coordinate information corresponding to the key points is (x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4), (x_5, y_5), then the target vector corresponding to the x dimension is v_x = (x_1, x_2, x_3, x_4, x_5) and the target vector corresponding to the y dimension is v_y = (y_1, y_2, y_3, y_4, y_5). When the number of key points is n, the target vector corresponding to the x dimension is v_x = (x_1, x_2, x_3, ..., x_t, ..., x_n) and the target vector corresponding to the y dimension is v_y = (y_1, y_2, y_3, ..., y_t, ..., y_n), where x_t is the coordinate information corresponding to the x dimension of the t-th key point and y_t is the coordinate information corresponding to the y dimension of the t-th key point.
In a three-dimensional coordinate system, if the coordinate information corresponding to the key points is (x_1, y_1, z_1), (x_2, y_2, z_2), (x_3, y_3, z_3), (x_4, y_4, z_4), (x_5, y_5, z_5), then the target vectors are v_x = (x_1, x_2, x_3, x_4, x_5), v_y = (y_1, y_2, y_3, y_4, y_5), and v_z = (z_1, z_2, z_3, z_4, z_5). When the number of key points is n, the target vectors are v_x = (x_1, x_2, x_3, ..., x_t, ..., x_n), v_y = (y_1, y_2, y_3, ..., y_t, ..., y_n), and v_z = (z_1, z_2, z_3, ..., z_t, ..., z_n), where x_t, y_t, and z_t are the coordinate information corresponding to the x, y, and z dimensions of the t-th key point, respectively.
And S312, determining the correlation coefficient of the target vector corresponding to each dimension.
The larger the absolute value of the correlation coefficient, the larger the degree of correlation; the smaller the absolute value, the smaller the degree of correlation. The correlation coefficients comprise the autocorrelation coefficient of the target vector corresponding to a single dimension and the cross-correlation coefficient between the target vectors corresponding to any two different dimensions. For example, in a two-dimensional image the two dimensions are the x dimension and the y dimension; the autocorrelation coefficients of the target vectors corresponding to single dimensions are the autocorrelation coefficient of the target vector corresponding to the x dimension and that of the target vector corresponding to the y dimension, and the cross-correlation coefficient between target vectors of two different dimensions is the cross-correlation coefficient between the target vector corresponding to the x dimension and the target vector corresponding to the y dimension. For another example, in a three-dimensional image the three dimensions are the x, y, and z dimensions; the autocorrelation coefficients are those of the target vectors corresponding to the x, y, and z dimensions, and the cross-correlation coefficients are those between the x- and y-dimension target vectors, between the x- and z-dimension target vectors, and between the z- and y-dimension target vectors. In the embodiment of the invention, an autocorrelation coefficient specifically reflects the degree of correlation of the coordinate information in the same dimension among different key points; the larger its absolute value, the larger the degree of correlation, and vice versa. A cross-correlation coefficient specifically reflects the degree of correlation between two target vectors, i.e., between coordinate information in different dimensions; the larger its absolute value, the larger the degree of correlation, and vice versa. The face pose score corresponding to the target face is determined from the correlation coefficients of the target vectors.
Specifically, when the correlation coefficient adopts the autocorrelation coefficient of the target vector corresponding to a single dimension, step S312 includes:
s3121, for each dimension, performing hysteresis processing on the target vector corresponding to the dimension to obtain a first hysteresis vector corresponding to the dimension and a second hysteresis vector corresponding to the dimension.
To calculate the autocorrelation coefficient of a target vector, the target vector needs to be lag-processed to obtain two lag vectors (i.e., a first lag vector and a second lag vector), and the autocorrelation coefficient of the target vector is then determined from these two lag vectors. The lag here refers to a lag of the coordinate information in the key point sequence numbers. For example, in the target vector corresponding to the x dimension, v_x = (x_1, x_2, x_3, ..., x_n), the key point sequence number of the coordinate information x_n is n; that is, x_n is the coordinate information corresponding to the x dimension of the n-th key point. If the first lag vector contains the coordinate information x_2, whose key point sequence number is 2, and the second lag vector contains the coordinate information x_3, whose key point sequence number is 3, then x_3 in the second lag vector lags x_2 in the first lag vector.
Specifically, for each dimension, extracting k pieces of coordinate information in the target vector corresponding to the dimension to obtain a first lag vector corresponding to the dimension; wherein k is a positive integer and is less than the number of the key points corresponding to the target face; extracting k pieces of coordinate information in the target vector corresponding to the dimension to obtain a second lag vector corresponding to the dimension; wherein there is at least one of the coordinate information in the second lag vector that is different from the coordinate information in the first lag vector.
For example, given the target vector corresponding to the x dimension, v_x = (x_1, x_2, x_3, ..., x_n), k pieces of coordinate information are extracted from v_x to obtain the first lag vector v_x1 corresponding to the x dimension, e.g., v_x1 = (x_1, x_2, x_3, ..., x_{k-1}, x_k), where k is less than the number n of key points in the target face. Then k pieces of coordinate information are extracted from v_x to obtain the second lag vector v_x2 corresponding to the x dimension, where at least one piece of coordinate information in v_x2 differs from the coordinate information in v_x1, e.g., v_x2 = (x_1, x_2, x_3, ..., x_{k-1}, x_{k+1}): the coordinate information x_{k+1} in the second lag vector differs from the coordinate information in the first lag vector v_x1.
Of course, the first lag vector and the second lag vector may also be extracted in a fixed order, for example so that the key point sequence numbers of corresponding pieces of coordinate information in the first and second lag vectors differ by a fixed value m. The autocorrelation coefficients then include m-th-order autocorrelation coefficients, where m is a positive integer whose value is the number of lags. The value of m can be set as needed; it is usually set to 1, that is, the 1st-order autocorrelation coefficient of the target vector is calculated. Higher-order autocorrelation coefficients may be calculated when there are more key points: the more key points there are, the larger n in the x-dimension target vector v_x = (x_1, x_2, x_3, ..., x_t, ..., x_n), and since 0 < m < n, the larger n is, the larger the value range of m, so the larger m may be.
For example, given the target vector corresponding to the x dimension, v_x = (x_1, x_2, x_3, ..., x_n), extract k pieces of coordinate information to obtain the first lag vector, e.g., v_x1 = (x_2, x_4, x_5, ..., x_{k+6}, x_{k+8}); then the second lag vector is v_x2 = (x_{2+m}, x_{4+m}, x_{5+m}, ..., x_{k+6+m}, x_{k+8+m}). That is, the subscript of each piece of coordinate information in the second lag vector is the subscript of the corresponding piece in the first lag vector increased by m. Here the subscript of the coordinate information is in essence the key point sequence number.
More specifically, given the x-dimension target vector v_x = (x_1, x_2, x_3, ..., x_n), pairing every two adjacent pieces of x-axis coordinate information, i.e., using lag number 1, yields n-1 pairs of data: (x_1, x_2), (x_2, x_3), ..., (x_t, x_{t+1}), ..., (x_{n-1}, x_n). Of course, the n-1 pairs can also be viewed as two lag vectors v_x1 and v_x2: v_x1 = (x_1, x_2, x_3, ..., x_{n-1}) and v_x2 = (x_2, x_3, x_4, ..., x_n). That is, for the 1st-order autocorrelation coefficient, the number of lags between the two lag vectors is 1. Clearly, the coordinate information in the first lag vector v_x1 is that of the first key point, the second key point, the third key point, and so on up to the (n-1)-th key point, while the coordinate information in the second lag vector v_x2 is that of the second key point, the third key point, the fourth key point, and so on up to the n-th key point. The difference between them is that the key point sequence numbers differ by 1: the key point sequence number of the first piece of coordinate information in v_x1 is 1 while that in v_x2 is 2, so the first pieces differ by 1, the second pieces also differ by 1, and so on; the key point sequence numbers of corresponding pieces of coordinate information in the two lag vectors differ by 1.
For the 2nd-order autocorrelation coefficient, the number of lags between the two lag vectors is 2. That is, pieces of x-axis coordinate information one apart are paired, yielding n-2 pairs of data: (x_1, x_3), (x_2, x_4), ..., (x_t, x_{t+2}), ..., (x_{n-2}, x_n); the two lag vectors v_x1 and v_x2 are specifically v_x1 = (x_1, x_2, x_3, ..., x_{n-2}) and v_x2 = (x_3, x_4, x_5, ..., x_n). With 2nd-order autocorrelation coefficients, the key point sequence numbers of the two lag vectors differ by 2.
For the k-th-order autocorrelation coefficient, the number of lags between the two lag vectors is k. That is, pieces of x-axis coordinate information k apart are paired, yielding n-k pairs of data: (x_1, x_{1+k}), (x_2, x_{2+k}), ..., (x_t, x_{t+k}), ..., (x_{n-k}, x_n); the two lag vectors v_x1 and v_x2 are specifically v_x1 = (x_1, x_2, x_3, ..., x_{n-k}) and v_x2 = (x_{1+k}, x_{2+k}, x_{3+k}, ..., x_n). With k-th-order autocorrelation coefficients, the key point sequence numbers of the two lag vectors differ by k.
Besides the above manner of obtaining n-k pairs of data with k-th-order autocorrelation coefficients, in one implementation of the embodiment of the present invention, pairing by parity may be used to obtain n/2 pairs (when n is even) or (n-1)/2 pairs (when n is odd) of data: (x_1, x_2), (x_3, x_4), ..., (x_t, x_{t+1}), ..., (x_{n-1}, x_n) or (x_1, x_2), (x_3, x_4), ..., (x_t, x_{t+1}), ..., (x_{n-2}, x_{n-1}).
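A sketch of the lag-m construction described above (the function name, array handling, and example values are illustrative):

```python
import numpy as np

def lag_vectors(v, m=1):
    """Split a target vector v of length n into two lag vectors whose key
    point sequence numbers differ by m: v1 = (v_1, ..., v_{n-m}) and
    v2 = (v_{1+m}, ..., v_n), with 0 < m < n."""
    v = np.asarray(v, dtype=float)
    n = v.size
    assert 0 < m < n, "the lag number m must satisfy 0 < m < n"
    return v[: n - m], v[m:]

# Lag 1 on an x-dimension target vector of 5 key points:
v_x1, v_x2 = lag_vectors([87, 152, 120, 95, 148], m=1)
# v_x1 = (x_1, ..., x_4), v_x2 = (x_2, ..., x_5)
```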
The method for calculating the autocorrelation coefficient follows the method for calculating the cross-correlation coefficient: the cross-correlation coefficient is calculated between two target vectors, while the autocorrelation coefficient is calculated for a single target vector. To calculate the autocorrelation coefficient of a target vector by the cross-correlation method, the target vector is lag-processed to form its two lag vectors, the cross-correlation coefficient between these two lag vectors is calculated, and that cross-correlation coefficient is taken as the autocorrelation coefficient of the target vector.
S3122, for each dimension, obtaining an autocorrelation coefficient of the target vector corresponding to the dimension according to the first lag vector corresponding to the dimension and the second lag vector corresponding to the dimension, so as to determine a correlation coefficient of the target vector corresponding to each dimension.
After the two lag vectors are determined, the autocorrelation coefficient of the target vector is obtained by calculating on them. Regardless of how the two lag vectors are obtained, the method for calculating the autocorrelation coefficient of the target vector does not change: specifically, for each dimension, the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to that dimension is calculated from those two lag vectors, and this cross-correlation coefficient is taken as the autocorrelation coefficient of the target vector in that dimension.
In one implementation, step S3122 may include:
S3122a, for each dimension, determining a variance value between the first lag vector and the second lag vector corresponding to the dimension.
Specifically, the variance value between the first lag vector and the second lag vector corresponding to each dimension includes: the covariance between the first lag vector and the second lag vector corresponding to that dimension, the mean square error of the first lag vector corresponding to that dimension, and the mean square error of the second lag vector corresponding to that dimension (the mean square error here is the standard deviation of the lag vector). Thus, for each dimension, the covariance between the first lag vector and the second lag vector corresponding to that dimension is determined, as well as the mean square error of the first lag vector and the mean square error of the second lag vector corresponding to that dimension.
In an implementation manner of the embodiment of the present invention, using 1st order autocorrelation coefficients, the two lag vectors v_x1 and v_x2 corresponding to the x dimension are respectively: v_x1 = (x_1, x_2, x_3, ···, x_t, ···, x_{n−1}), v_x2 = (x_2, x_3, x_4, ···, x_{t+1}, ···, x_n).
When the autocorrelation coefficient is calculated, the mean square error corresponding to each lag vector and the covariance between the two lag vectors are calculated first, and the autocorrelation coefficient is then obtained from the covariance and the mean square errors. Specifically, the covariance between the two lag vectors is:
$$\mathrm{cov}_x=\frac{1}{n-1}\sum_{t=1}^{n-1}\left(x_t-\bar v_{x1}\right)\left(x_{t+1}-\bar v_{x2}\right)$$

where cov_x represents the covariance between the two lag vectors corresponding to the x dimension, Σ represents the summation sign, x_t represents the coordinate information corresponding to the x dimension of the t-th key point, n represents the number of the key points, and $\bar v_{x1}$ and $\bar v_{x2}$ represent the means of v_x1 and v_x2 respectively, that is,

$$\bar v_{x1}=\frac{1}{n-1}\sum_{t=1}^{n-1}x_t,\qquad \bar v_{x2}=\frac{1}{n-1}\sum_{t=2}^{n}x_t$$
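For illustration, a minimal Python sketch of this covariance (the helper names mean and covariance are hypothetical and are reused in the later sketches; the exact normalization constant is an assumption reconstructed above, and it cancels in the final coefficient):

```python
# Sample mean and covariance of two equally long lag vectors.
def mean(v):
    return sum(v) / len(v)

def covariance(v1, v2):
    m1, m2 = mean(v1), mean(v2)
    return sum((a - m1) * (b - m2) for a, b in zip(v1, v2)) / len(v1)
```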
The mean square errors corresponding to the two lag vectors corresponding to the x dimension are respectively:

$$\mathrm{std}_{x1}=\sqrt{\frac{1}{n-1}\sum_{t=1}^{n-1}\left(x_t-\bar v_{x1}\right)^2},\qquad \mathrm{std}_{x2}=\sqrt{\frac{1}{n-1}\sum_{t=2}^{n}\left(x_t-\bar v_{x2}\right)^2}$$

where std_x1 represents the mean square error of the first lag vector v_x1 corresponding to the x dimension, std_x2 represents the mean square error of the second lag vector v_x2 corresponding to the x dimension, and the remaining symbols are as defined above.
S3122b, determining the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension according to the variance value between the first lag vector and the second lag vector corresponding to the dimension.
Specifically, the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension is calculated according to the covariance between the first lag vector and the second lag vector corresponding to the dimension, the mean square error of the first lag vector corresponding to the dimension, and the mean square error of the second lag vector corresponding to the dimension. Finally, the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension is taken as the autocorrelation coefficient of the target vector corresponding to the dimension.
For example, the autocorrelation coefficients of the target vector (i.e., the cross correlation coefficients between the first lag vector and the second lag vector) corresponding to the x dimension are:
Figure BDA0002469489620000176
wherein coefxRepresenting the autocorrelation coefficients of said target vector for the x dimension, covx representing the covariance between said two lag vectors for the x dimension, stdx1Representing a first lag vector v corresponding to the x dimensionx1Mean square error of (stdx)2Representing a second lag vector v corresponding to the x dimensionx2Mean square error of (d), sigma denotes the sum sign, xtCoordinate information corresponding to the x dimension of the t-th key point is shown, n represents the number of the key points,
Figure BDA0002469489620000177
denotes vx1The average value of (a) of (b),
Figure BDA0002469489620000178
denotes vx2The average value of (a), that is,
Figure BDA0002469489620000179
Figure BDA0002469489620000181
In one implementation of the embodiment of the present invention, 2nd order autocorrelation coefficients are used, and the two lag vectors v_x1 and v_x2 corresponding to the x dimension are respectively: v_x1 = (x_1, x_2, x_3, ···, x_t, ···, x_{n−2}), v_x2 = (x_3, x_4, x_5, ···, x_{t+2}, ···, x_n). The autocorrelation coefficient of the target vector corresponding to the x dimension is then:

$$\mathrm{coef}_x=\frac{\sum_{t=1}^{n-2}\left(x_t-\bar v_{x1}\right)\left(x_{t+2}-\bar v_{x2}\right)}{\sqrt{\sum_{t=1}^{n-2}\left(x_t-\bar v_{x1}\right)^2}\sqrt{\sum_{t=3}^{n}\left(x_t-\bar v_{x2}\right)^2}}$$

Similarly, using k-order autocorrelation coefficients, the two lag vectors v_x1 and v_x2 corresponding to the x dimension are respectively: v_x1 = (x_1, x_2, x_3, ···, x_t, ···, x_{n−k}), v_x2 = (x_{1+k}, x_{2+k}, x_{3+k}, ···, x_{t+k}, ···, x_n). The autocorrelation coefficient of the target vector corresponding to the x dimension is then:

$$\mathrm{coef}_x=\frac{\sum_{t=1}^{n-k}\left(x_t-\bar v_{x1}\right)\left(x_{t+k}-\bar v_{x2}\right)}{\sqrt{\sum_{t=1}^{n-k}\left(x_t-\bar v_{x1}\right)^2}\sqrt{\sum_{t=1+k}^{n}\left(x_t-\bar v_{x2}\right)^2}}$$

where $\bar v_{x1}$ and $\bar v_{x2}$ denote the means of the corresponding lag vectors.
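Putting the pieces together, a minimal sketch of the k-order autocorrelation coefficient, reusing the make_lag_vectors, mean and covariance helpers from the sketches above (the zero-variance guard is an added assumption, not in the patent):

```python
import math

def std(v):
    """Mean square error (standard deviation) of a lag vector."""
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))

def autocorrelation(coords, k=1):
    """coef = cov / (std1 * std2) between the two k-order lag vectors."""
    v1, v2 = make_lag_vectors(coords, k)
    denom = std(v1) * std(v2)
    return covariance(v1, v2) / denom if denom else 0.0
```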
the autocorrelation coefficients of the target vectors corresponding to the x dimension and the autocorrelation coefficients coef of the target vectors corresponding to the y dimension can be obtained in the same wayy
Under a three-dimensional coordinate system, the autocorrelation coefficient coef of the target vector corresponding to the z dimension can be obtained in the same wayz
When the correlation coefficient adopts a cross correlation coefficient between the target vectors respectively corresponding to any two different dimensions, step S312 further includes:
S3123, forming a dimension group from every two dimensions in all the dimensions; for each dimension group, determining a variance value between the target vectors respectively corresponding to the two dimensions in the dimension group according to the target vectors respectively corresponding to the two dimensions in the dimension group; and obtaining the cross-correlation coefficient between the target vectors respectively corresponding to the two dimensions in the dimension group according to the variance value between the target vectors respectively corresponding to the two dimensions in the dimension group.
In a two-dimensional coordinate system, only two target vectors corresponding to different dimensions are provided, and only one dimension group can be formed, namely, a target vector corresponding to an x dimension and a target vector corresponding to a y dimension, so that only the cross-correlation coefficient between the target vector corresponding to the x dimension and the target vector corresponding to the y dimension needs to be calculated. In a three-dimensional coordinate system, there are three target vectors corresponding to different dimensions, that is, a target vector corresponding to an x dimension, a target vector corresponding to a y dimension, and a target vector corresponding to a z dimension, which can form 3 dimensional groups. The 3 dimensional groups include: the target vector corresponding to the x dimension and the target vector corresponding to the y dimension form a first dimension group, the target vector corresponding to the x dimension and the target vector corresponding to the z dimension form a second dimension group, and the target vector corresponding to the z dimension and the target vector corresponding to the y dimension form a third dimension group. Two target vectors corresponding to different dimensions exist in each dimension group, and for each dimension group, cross-correlation coefficients of the target vectors corresponding to the two different dimensions in the dimension group are calculated, namely for the first dimension group, the cross-correlation coefficients between the target vectors corresponding to the x dimension and the target vectors corresponding to the y dimension in the first dimension group are calculated; aiming at the second dimension group, calculating a cross correlation coefficient between a target vector corresponding to the x dimension and a target vector corresponding to the z dimension in the second dimension group; and aiming at the third dimension group, calculating the cross correlation coefficient between the target vector corresponding to the z dimension and the target vector corresponding to the y dimension in the third dimension group.
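The grouping itself is mechanical; a small sketch using itertools.combinations (dimension_groups is a hypothetical name) reproduces the one group of a two-dimensional system and the three groups of a three-dimensional system:

```python
from itertools import combinations

def dimension_groups(dims):
    """Every two dimensions among all dimensions form a dimension group."""
    return list(combinations(dims, 2))

print(dimension_groups(["x", "y"]))       # [('x', 'y')]
print(dimension_groups(["x", "y", "z"]))  # [('x', 'y'), ('x', 'z'), ('y', 'z')]
```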
The cross-correlation coefficient between the target vectors corresponding to any two dimensions is calculated in the same way as the autocorrelation coefficient. Specifically, step S3123 includes:
S3123a, for each dimension group, determining covariance between the target vectors respectively corresponding to the two dimensions in the dimension group according to the target vectors respectively corresponding to the two dimensions in the dimension group, and determining the mean square error of the target vector corresponding to the first dimension in the dimension group and the mean square error of the target vector corresponding to the second dimension in the dimension group.
Under a two-dimensional coordinate system, one dimension group is formed, and the target vectors respectively corresponding to the two dimensions in the dimension group are respectively: v_x = (x_1, x_2, x_3, ···, x_t, ···, x_n), v_y = (y_1, y_2, y_3, ···, y_t, ···, y_n), where n is the number of the key points, x_t is the coordinate information corresponding to the x dimension of the t-th key point, and y_t is the coordinate information corresponding to the y dimension of the t-th key point.
The covariance between the target vectors respectively corresponding to the two dimensions is:

$$\mathrm{cov}_{xy}=\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\bar v_x\right)\left(y_t-\bar v_y\right)$$

where cov_xy represents the covariance between the target vector corresponding to the x dimension and the target vector corresponding to the y dimension, Σ represents the summation sign, x_t represents the coordinate information corresponding to the x dimension of the t-th key point, y_t represents the coordinate information corresponding to the y dimension of the t-th key point, n represents the number of the key points, and $\bar v_x$ and $\bar v_y$ represent the means of v_x and v_y respectively, that is,

$$\bar v_x=\frac{1}{n}\sum_{t=1}^{n}x_t,\qquad \bar v_y=\frac{1}{n}\sum_{t=1}^{n}y_t$$
The mean square errors of the target vectors respectively corresponding to the two dimensions are respectively:

$$\mathrm{std}_x=\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\bar v_x\right)^2},\qquad \mathrm{std}_y=\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t-\bar v_y\right)^2}$$

where std_x represents the mean square error of the target vector corresponding to the x dimension, and std_y represents the mean square error of the target vector corresponding to the y dimension.
In a three-dimensional coordinate system, since there are three coordinate axes, in addition to cov_xy, the covariance between the target vector corresponding to the x dimension and the target vector corresponding to the z dimension and the covariance between the target vector corresponding to the y dimension and the target vector corresponding to the z dimension also need to be calculated:

$$\mathrm{cov}_{xz}=\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\bar v_x\right)\left(z_t-\bar v_z\right),\qquad \mathrm{cov}_{yz}=\frac{1}{n}\sum_{t=1}^{n}\left(y_t-\bar v_y\right)\left(z_t-\bar v_z\right)$$

The mean square error of the target vector corresponding to the z dimension is likewise required:

$$\mathrm{std}_z=\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(z_t-\bar v_z\right)^2}$$

where z_t represents the coordinate information corresponding to the z dimension of the t-th key point, and $\bar v_z$ represents the mean of v_z, that is,

$$\bar v_z=\frac{1}{n}\sum_{t=1}^{n}z_t$$
S3123b, obtaining cross-correlation coefficients between the target vectors corresponding to the two dimensions in the dimension group according to the covariance between the target vectors corresponding to the two dimensions in the dimension group, the mean square error of the target vector corresponding to the first dimension in the dimension group, and the mean square error of the target vector corresponding to the second dimension in the dimension group.
Under a two-dimensional coordinate system, the cross-correlation coefficient between the target vectors respectively corresponding to the x dimension and the y dimension is:

$$\mathrm{coef}_{xy}=\frac{\mathrm{cov}_{xy}}{\mathrm{std}_x\cdot\mathrm{std}_y}$$

where cov_xy represents the covariance between the target vector corresponding to the x dimension and the target vector corresponding to the y dimension, std_x represents the mean square error of the target vector corresponding to the x dimension, and std_y represents the mean square error of the target vector corresponding to the y dimension. Here coef_xy represents the cross-correlation coefficient between the target vector corresponding to the x dimension and the target vector corresponding to the y dimension.
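A minimal sketch of this cross-correlation coefficient for one dimension group, reusing the mean, covariance and std helpers sketched earlier (the zero-variance guard is an added assumption):

```python
def cross_correlation(va, vb):
    """coef_ab = cov_ab / (std_a * std_b) for two target vectors."""
    denom = std(va) * std(vb)
    return covariance(va, vb) / denom if denom else 0.0

# e.g. coef_xy = cross_correlation(vx, vy); in 3-D also coef_xz, coef_yz.
```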
In a three-dimensional coordinate system, since there are three coordinate axes, in addition to the cross-correlation coefficient coef_xy, the cross-correlation coefficients coef_xz and coef_yz also need to be calculated:

$$\mathrm{coef}_{xz}=\frac{\mathrm{cov}_{xz}}{\mathrm{std}_x\cdot\mathrm{std}_z},\qquad \mathrm{coef}_{yz}=\frac{\mathrm{cov}_{yz}}{\mathrm{std}_y\cdot\mathrm{std}_z}$$

where std_z represents the mean square error of the target vector corresponding to the z dimension, z_t represents the coordinate information corresponding to the z dimension of the t-th key point, and $\bar v_z$ represents the mean of v_z as defined above. Here coef_xz represents the cross-correlation coefficient between the target vector corresponding to the x dimension and the target vector corresponding to the z dimension, and coef_yz represents the cross-correlation coefficient between the target vector corresponding to the y dimension and the target vector corresponding to the z dimension.
S313, determining the face pose score corresponding to the target face according to the correlation coefficient of the target vector corresponding to each dimension.
Because different face poses of the target face yield different correlation coefficients of the target vectors respectively corresponding to each dimension, the face pose score corresponding to the target face can be obtained according to the magnitude of the correlation coefficients. After determining the correlation coefficient of the target vector corresponding to each dimension, the face pose score corresponding to the target face is determined according to those correlation coefficients, and the face pose corresponding to the target face is then determined according to the face pose score.
Specifically, a preset correlation coefficient center value needs to be obtained first, then a face pose score corresponding to the target face is obtained according to the correlation coefficient of the target vector and the preset correlation coefficient center value, and finally the face pose corresponding to the target face is determined according to the face pose score. The preset correlation coefficient center value may be obtained as follows: the correlation coefficients of a plurality of target faces in a preset pose are detected, these correlation coefficients are averaged to obtain an average correlation coefficient, and the average correlation coefficient is used as the preset correlation coefficient center value; the face pose score corresponding to the target face is then determined according to the preset correlation coefficient center value and the correlation coefficient of the target face.
Specifically, step S313 includes:
S3131, for each dimension, obtaining a correlation coefficient decentralized value corresponding to the dimension according to the correlation coefficient of the target vector corresponding to the dimension and a preset correlation coefficient center value corresponding to the dimension, so as to determine the correlation coefficient decentralized values respectively corresponding to the dimensions.
The preset correlation coefficient center value (divided into a preset cross-correlation coefficient center value and a preset autocorrelation coefficient center value) refers to the correlation coefficient of each target vector formed from the key points of a face in a certain preset face pose. That is, if the face pose of a certain person needs to be determined, the correlation coefficients of another person's face in a certain preset pose (usually the front face) can be calculated in advance to obtain the preset correlation coefficient center value. The preset correlation coefficient center value may be obtained from the correlation coefficients of the target face itself, or by averaging the correlation coefficients of a plurality of target faces of the same class. That is, the preset correlation coefficient center value is obtained by processing the target face, or target faces of the same class as the target face, through steps S1-S32. Here, the class of the target face may be divided according to gender, age, race, and the like.
Specifically, the same preset correlation coefficient center value may be used for target faces of the same class; for example, when determining the face pose, one preset correlation coefficient center value may be suitable for determining all face poses. A suitable preset correlation coefficient center value may also be formulated as required: for example, since the faces of Europeans and Asians differ slightly, as do the faces of northerners and southerners, the population may be divided by geographic area, with each geographic area having its own preset correlation coefficient center value; when determining the pose, the corresponding preset correlation coefficient center value is adopted according to the geographic area to which the target face belongs. The preset correlation coefficient center values may likewise be divided according to factors such as gender and age, so as to improve the accuracy of pose determination.
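Purely as an illustration of such per-class center values, a small registry sketch (all names and numeric values here are invented placeholders, not values from the patent):

```python
# Registering preset correlation coefficient center values per face class
# (region, gender, age ...). Centers would in practice be obtained by
# averaging the coefficients of many frontal faces of the matching class.
CENTER_VALUES = {
    "default": {"coef_x": 0.0, "coef_y": 0.0, "coef_xy": 0.0},  # placeholders
}

def center_values_for(face_class: str) -> dict:
    """Fall back to the generic centers when no class-specific set exists."""
    return CENTER_VALUES.get(face_class, CENTER_VALUES["default"])
```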
Specifically, the preset correlation coefficient center values include the autocorrelation coefficient center value of each target vector and the cross-correlation coefficient center value of each target vector. The autocorrelation coefficient center values are coef_xs, coef_ys and coef_zs, which respectively represent the autocorrelation coefficient center value of the target vector corresponding to the x dimension, the y dimension and the z dimension. The cross-correlation coefficient center values are coef_xys, coef_xzs and coef_yzs, which respectively represent the cross-correlation coefficient center value between the target vectors corresponding to the x and y dimensions, between the target vectors corresponding to the x and z dimensions, and between the target vectors corresponding to the y and z dimensions.
Specifically, the correlation coefficients are subjected to decentralization processing through the preset correlation coefficient center values to obtain decentralized values (divided into cross-correlation coefficient decentralized values and autocorrelation coefficient decentralized values). Whichever correlation coefficient is adopted when determining the face pose, the corresponding correlation coefficient center value is adopted during the decentralization processing, so the decentralized values likewise include the autocorrelation coefficient decentralized value of each target vector and the cross-correlation coefficient decentralized value of each target vector. The decentralized value is obtained from the correlation coefficient and the preset correlation coefficient center value; specifically, decentralized value = correlation coefficient − preset correlation coefficient center value. More specifically, the autocorrelation coefficient decentralized values are: coef_xm = coef_x − coef_xs for the x dimension, coef_ym = coef_y − coef_ys for the y dimension, and coef_zm = coef_z − coef_zs for the z dimension. By analogy, the cross-correlation coefficient decentralized values are: coef_xym = coef_xy − coef_xys between the target vectors corresponding to the x and y dimensions, coef_yzm = coef_yz − coef_yzs between the target vectors corresponding to the y and z dimensions, and coef_xzm = coef_xz − coef_xzs between the target vectors corresponding to the x and z dimensions.
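A minimal sketch of this decentralization step (decenter is a hypothetical name; the coefficient dictionaries follow the naming of the earlier sketches):

```python
# decentralized value = correlation coefficient - preset center value,
# applied per coefficient name.
def decenter(coefs: dict, centers: dict) -> dict:
    return {name: value - centers[name] for name, value in coefs.items()}

# decenter({"coef_x": 0.93}, {"coef_x": 0.95}) -> {"coef_x": -0.02}
```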
S3132, obtaining a face pose score corresponding to the target face according to the correlation coefficient decentralized value respectively corresponding to each dimension.
The magnitude of the decentralized value indicates the difference between each correlation coefficient in the face pose corresponding to the target face and the corresponding correlation coefficient center value in the preset face pose. Since the correlation coefficient corresponds to the face pose of the target face and the preset correlation coefficient center value corresponds to the preset face pose, the difference between the two poses can be represented by the magnitude of the decentralized value, so the face pose of the target face can be obtained according to the magnitude of the decentralized value; that is, the preset face pose is used as a reference for determining the face pose corresponding to the target face. The smaller the decentralized value, the closer the face pose corresponding to the target face is to the preset face pose; the larger the decentralized value, the larger the difference between the face pose corresponding to the target face and the preset face pose. For example, if the preset face pose is a front face and the decentralized value is smaller than the decentralization threshold, the face pose corresponding to the target face is consistent with the preset face pose and is a front face; if the decentralized value is larger than or equal to the decentralization threshold, the face pose corresponding to the target face is not consistent with the preset face pose and is a side face. Because there are a plurality of correlation coefficient decentralized values respectively corresponding to each dimension, the face pose can only be determined by considering them comprehensively.
In order to further improve the accuracy of pose determination, weighted average processing is performed on the correlation coefficient decentralized values respectively corresponding to each dimension; specifically, the cross-correlation coefficient decentralized values and/or the autocorrelation coefficient decentralized values respectively corresponding to each dimension are weighted to obtain the face pose score. Under a two-dimensional coordinate system, the face pose score is cpose = w1·coef_xm + w2·coef_ym + w3·coef_xym, where w1, w2 and w3 are all weights. Under a three-dimensional coordinate system, the face pose score is cpose = w4·coef_xm + w5·coef_ym + w6·coef_zm + w7·coef_xym + w8·coef_yzm + w9·coef_xzm, where w4, w5, w6, w7, w8 and w9 are all weights. The weights corresponding to the decentralized values of the different correlation coefficients may be the same or different, and are set as required to meet different requirements. The value range of each weight is the real numbers; a weight may be zero, in which case the corresponding correlation coefficient decentralized value does not affect the final face pose score.
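A sketch of this weighted score (pose_score is a hypothetical name; the use of the absolute value is an assumption added here so that deviations in either direction raise the score, since the text leaves the sign handling open):

```python
# cpose = sum of w_i * |coef_im| over the decentralized coefficients.
def pose_score(decentralized: dict, weights: dict) -> float:
    return sum(weights.get(name, 0.0) * abs(value)
               for name, value in decentralized.items())
```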
S32, determining the face pose corresponding to the target face according to the face pose score.
The size of the face pose score indicates the difference between the face pose corresponding to the target face and a certain preset face pose: the smaller the score, the closer the face pose corresponding to the target face is to the preset face pose; the larger the score, the more the face pose corresponding to the target face differs from the preset face pose. The face pose corresponding to the target face can therefore be obtained according to the size of the face pose score.
Specifically, the face pose corresponding to the target face is obtained according to a preset corresponding relation between the face pose score and the face pose.
Specifically, different face pose scores have corresponding face poses, and the face pose is determined according to the face pose score. Since the face pose score is obtained by weighting on the basis of the decentralized values, the smaller the face pose score, the closer the face pose corresponding to the target face is to the preset face pose; the larger the face pose score, the larger the difference between the face pose corresponding to the target face and the preset face pose, so the face pose can be determined by calculating the face pose score. A preset threshold may be set to determine the face pose corresponding to the target face. For example, with a preset threshold of 0.05, as shown in fig. 6, when the face pose score of the target face is less than 0.05, the face pose of the target face is a front face (i.e., the preset face pose); at this time the correlation coefficient corresponding to each dimension is also less than 0.05, that is, the pose may also be determined to be a front face according to the magnitude of the correlation coefficient corresponding to each dimension. As shown in fig. 7, when the face pose score of the target face is greater than 0.05, the face pose of the target face is a side face, and the correlation coefficient corresponding to each dimension is likewise greater than 0.05, that is, the pose can also be determined to be a side face according to the magnitude of the correlation coefficient corresponding to each dimension.
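A sketch of the final thresholding decision, using the example threshold 0.05 from the text (the label strings are illustrative):

```python
# Scores below the threshold are treated as the preset front-face pose.
def classify_pose(score: float, threshold: float = 0.05) -> str:
    return "front face" if score < threshold else "side face"
```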
During face recognition, when the face pose is a side face, the face pose does not meet the requirement, and a signal indicating that the face pose does not meet the requirement can be output as a prompt, so that the person can adjust their pose after receiving the prompt and complete the face recognition. When the face pose is a front face, the face pose meets the requirement, and face recognition can be carried out directly.
The method determines the face pose score corresponding to the target face according to the position information corresponding to the key points in the target face, and thereby determines the face pose corresponding to the target face quickly and accurately; the algorithm is simple and the computing power requirements on hardware are low. In particular, the invention has low requirements on the number of key points, and calculating the face pose score makes the estimation process simpler and easier than the three-dimensional reconstruction estimation of the prior art, while the accuracy of determining the pose of the target face according to the face pose score remains high.
In one embodiment, the present invention provides a computer device, which may be a terminal, having an internal structure as shown in fig. 8. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of determining a face pose. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that fig. 8 is a block diagram of only a portion of the structure associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, there is provided a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
In one embodiment, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
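The steps listed above can be tied together in a hypothetical end-to-end sketch for a two-dimensional coordinate system, reusing the helpers sketched earlier in this description; detect_keypoints stands in for any trained face key point model and is not an API from the patent:

```python
def determine_face_pose(image, detect_keypoints, k=1, threshold=0.05):
    points = detect_keypoints(image)        # [(x_t, y_t), ...] key points
    vx = [p[0] for p in points]             # target vector, x dimension
    vy = [p[1] for p in points]             # target vector, y dimension
    coefs = {
        "coef_x": autocorrelation(vx, k),   # k-order autocorrelation
        "coef_y": autocorrelation(vy, k),
        "coef_xy": cross_correlation(vx, vy),
    }
    decentralized = decenter(coefs, center_values_for("default"))
    weights = {"coef_x": 1.0, "coef_y": 1.0, "coef_xy": 1.0}  # illustrative
    return classify_pose(pose_score(decentralized, weights), threshold)
```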
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
In addition, the procedure of the method for determining the face pose is simple, the calculation formula for the correlation coefficient is also simple, the calculation difficulty is far lower than that of three-dimensional reconstruction estimation, less time is consumed, and the computing power requirements on hardware are lower.

Claims (15)

1. A method for determining a face pose, the method comprising:
acquiring an image to be processed, wherein the image to be processed comprises a target face;
determining position information respectively corresponding to each key point in the target face;
and determining the face pose corresponding to the target face according to the position information respectively corresponding to each key point.
2. The method for determining the face pose according to claim 1, wherein the determining the face pose corresponding to the target face according to the position information corresponding to each of the key points comprises:
determining a face pose score corresponding to the target face according to the position information respectively corresponding to each key point;
and determining the face pose corresponding to the target face according to the face pose score.
3. The method according to claim 2, wherein the position information corresponding to the keypoint comprises coordinate information in at least two dimensions for determining the position of the keypoint in the image to be processed; determining a face pose score corresponding to the target face according to the position information respectively corresponding to the key points, including:
for each dimension, extracting the coordinate information corresponding to the dimension in each key point to obtain a target vector corresponding to the dimension;
determining the correlation coefficient of the target vector corresponding to each dimension;
and determining the face pose score corresponding to the target face according to the correlation coefficient of the target vector corresponding to each dimension.
4. The method of claim 3, wherein the correlation coefficient of the target vector comprises: the autocorrelation coefficients of the target vector corresponding to a single dimension;
the determining the correlation coefficient of the target vector corresponding to each dimension includes:
for each dimension, performing hysteresis processing on the target vector corresponding to the dimension to obtain a first hysteresis vector corresponding to the dimension and a second hysteresis vector corresponding to the dimension;
and aiming at each dimension, obtaining the autocorrelation coefficient of the target vector corresponding to the dimension according to the first hysteresis vector corresponding to the dimension and the second hysteresis vector corresponding to the dimension so as to determine the autocorrelation coefficient of the target vector corresponding to each dimension.
5. The method of claim 4, wherein the target face comprises at least 2 key points; for each dimension, performing hysteresis processing on the target vector corresponding to the dimension to obtain a first hysteresis vector corresponding to the dimension and a second hysteresis vector corresponding to the dimension, including:
for each dimension, extracting k pieces of coordinate information in the target vector corresponding to the dimension to obtain a first lag vector corresponding to the dimension; wherein k is a positive integer and is less than the number of the key points corresponding to the target face; extracting k pieces of coordinate information in the target vector corresponding to the dimension to obtain a second lag vector corresponding to the dimension; wherein there is at least one of the coordinate information in the second lag vector that is different from the coordinate information in the first lag vector.
6. The method of claim 5, wherein for each dimension, obtaining the autocorrelation coefficient of the target vector for the dimension according to the first lag vector corresponding to the dimension and the second lag vector corresponding to the dimension comprises:
and for each dimension, according to the first lag vector and the second lag vector corresponding to the dimension, determining a cross correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension, and taking the cross correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension as an autocorrelation coefficient of the target vector corresponding to the dimension.
7. The method of claim 6, wherein the determining, for each dimension, a cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension according to the first lag vector and the second lag vector corresponding to the dimension comprises:
for each dimension, determining a variance value between a first lag vector and a second lag vector corresponding to the dimension;
and determining the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension according to the variance value between the first lag vector and the second lag vector corresponding to the dimension.
8. The method of claim 7, wherein the variance value between the first lag vector and the second lag vector for each dimension comprises: the covariance between the first lag vector and the second lag vector corresponding to the dimension, the mean square error of the first lag vector corresponding to the dimension, and the mean square error of the second lag vector corresponding to the dimension;
for each dimension, determining a variance value between a first lag vector and a second lag vector corresponding to the dimension; determining a cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension according to a variance value between the first lag vector and the second lag vector corresponding to the dimension, including:
for each dimension, determining covariance between a first lag vector and a second lag vector corresponding to the dimension, and determining mean square error of the first lag vector corresponding to the dimension and mean square error of the second lag vector corresponding to the dimension;
and determining the cross-correlation coefficient between the first lag vector and the second lag vector corresponding to the dimension according to the covariance between the first lag vector and the second lag vector corresponding to the dimension, the mean square error of the first lag vector corresponding to the dimension and the mean square error of the second lag vector corresponding to the dimension.
9. The method of claim 3, wherein the correlation coefficient of the target vector further comprises: cross correlation coefficients between the target vectors respectively corresponding to any two different dimensions;
the determining the correlation coefficient of the target vector corresponding to each dimension includes:
every two dimensions in all dimensions form a dimension group;
for each dimension group, determining a variance value between target vectors respectively corresponding to two dimensions in the dimension group according to the target vectors respectively corresponding to the two dimensions in the dimension group; and obtaining the cross-correlation coefficient between the target vectors respectively corresponding to the two dimensions in the dimension group according to the variance value between the target vectors respectively corresponding to the two dimensions in the dimension group.
10. The method of claim 9, wherein the variance value between the target vectors corresponding to the two dimensions in each dimension group comprises: covariance between target vectors corresponding to two dimensions in the dimension group, mean square error of the target vector corresponding to a first dimension in the dimension group, and mean square error of the target vector corresponding to a second dimension in the dimension group;
for each dimension group, determining a variance value between target vectors respectively corresponding to two dimensions in the dimension group according to the target vectors respectively corresponding to the two dimensions in the dimension group; obtaining cross-correlation coefficients between the target vectors respectively corresponding to the two dimensions in the dimension group according to variance values between the target vectors respectively corresponding to the two dimensions in the dimension group, including:
for each dimension group, determining covariance between target vectors respectively corresponding to two dimensions in the dimension group according to the target vectors respectively corresponding to the two dimensions in the dimension group, and determining mean square error of the target vector corresponding to a first dimension in the dimension group and mean square error of the target vector corresponding to a second dimension in the dimension group;
and obtaining the cross-correlation coefficient between the target vectors respectively corresponding to the two dimensions in the dimension group according to the covariance between the target vectors respectively corresponding to the two dimensions in the dimension group, the mean square error of the target vector corresponding to the first dimension in the dimension group and the mean square error of the target vector corresponding to the second dimension in the dimension group.
11. The method for determining a face pose according to claim 3, wherein the determining a face pose score corresponding to the target face according to the correlation coefficient of the target vector corresponding to each dimension comprises:
aiming at each dimension, obtaining a correlation coefficient decentralization value corresponding to the dimension according to the correlation coefficient of the target vector corresponding to the dimension and a preset correlation coefficient central value corresponding to the dimension;
and obtaining the face pose score corresponding to the target face according to the correlation coefficient decentralized value respectively corresponding to each dimension.
12. The method for determining a face pose according to any one of claims 1 to 11, wherein the determining a face pose corresponding to the target face according to the face pose score comprises:
and obtaining the face pose corresponding to the target face according to a preset corresponding relation between the face pose score and the face pose.
13. The method for determining the face pose according to any one of claims 1 to 11, wherein the determining the position information corresponding to each key point in the target face comprises:
inputting the image to be processed into a trained face key point recognition model to obtain each key point in the target face output by the trained face key point recognition model;
determining the position information respectively corresponding to each key point in the target face according to the image to be processed and each key point in the target face; and the position information corresponding to the key points is the information of the positions of the key points in the image to be processed.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, carries out the steps of the method for determining a face pose according to any one of claims 1 to 13.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for determining a face pose according to any one of claims 1 to 13.
CN202010344039.0A 2020-04-27 2020-04-27 Face pose determination method, computer device and computer readable storage medium Pending CN113642354A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010344039.0A CN113642354A (en) 2020-04-27 2020-04-27 Face pose determination method, computer device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010344039.0A CN113642354A (en) 2020-04-27 2020-04-27 Face pose determination method, computer device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113642354A true CN113642354A (en) 2021-11-12

Family

ID=78415017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010344039.0A Pending CN113642354A (en) 2020-04-27 2020-04-27 Face pose determination method, computer device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113642354A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100828412B1 (en) * 2006-11-06 2008-05-09 연세대학교 산학협력단 3d face recognition method using multiple point signature
CN103824089A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascade regression-based face 3D pose recognition method
KR20170024303A (en) * 2015-08-25 2017-03-07 영남대학교 산학협력단 System and method for detecting feature points of face
CN106503671A (en) * 2016-11-03 2017-03-15 厦门中控生物识别信息技术有限公司 The method and apparatus for determining human face posture
US20170300744A1 (en) * 2015-01-04 2017-10-19 Huawei Technologies Co., Ltd. Method and apparatus for determining identity identifier of face in face image, and terminal
CN109871818A (en) * 2019-02-27 2019-06-11 东南大学 Face identification method based on normal vector distribution histogram and covariance description
CN109935338A (en) * 2019-03-07 2019-06-25 平安科技(深圳)有限公司 Data prediction processing method, device and computer equipment based on machine learning
CN110032941A (en) * 2019-03-15 2019-07-19 深圳英飞拓科技股份有限公司 Facial image detection method, facial image detection device and terminal device
CN110175558A (en) * 2019-05-24 2019-08-27 北京达佳互联信息技术有限公司 A kind of detection method of face key point, calculates equipment and storage medium at device

Similar Documents

Publication Publication Date Title
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
US20210209851A1 (en) Face model creation
US11915514B2 (en) Method and apparatus for detecting facial key points, computer device, and storage medium
US10832039B2 (en) Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
US20210074003A1 (en) Image processing method and apparatus, image device, and storage medium
US20220101654A1 (en) Method for recognizing actions, device and storage medium
CN111091075B (en) Face recognition method and device, electronic equipment and storage medium
WO2020042542A1 (en) Method and apparatus for acquiring eye movement control calibration data
JP2022550948A (en) 3D face model generation method, device, computer device and computer program
US20200327726A1 (en) Method of Generating 3D Facial Model for an Avatar and Related Device
CN112085835B (en) Three-dimensional cartoon face generation method and device, electronic equipment and storage medium
CN111815768B (en) Three-dimensional face reconstruction method and device
CN113642393A (en) Attention mechanism-based multi-feature fusion sight line estimation method
Chang et al. Salgaze: Personalizing gaze estimation using visual saliency
Lee et al. From human pose similarity metric to 3D human pose estimator: Temporal propagating LSTM networks
CN112017212A (en) Training and tracking method and system of face key point tracking model
CN111476151A (en) Eyeball detection method, device, equipment and storage medium
Yu et al. 3D facial motion tracking by combining online appearance model and cylinder head model in particle filtering
CN111915676B (en) Image generation method, device, computer equipment and storage medium
CN109460690A (en) A kind of method and apparatus for pattern-recognition
CN113642354A (en) Face pose determination method, computer device and computer readable storage medium
CN113673494B (en) Human body posture standard motion behavior matching method and system
CN115471863A (en) Three-dimensional posture acquisition method, model training method and related equipment
CN111222448B (en) Image conversion method and related product
CN115344113A (en) Multi-view human motion capture method, device, system, medium and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination