CN114627345A - Face attribute detection method and device, storage medium and terminal


Info

Publication number
CN114627345A
Authority
CN
China
Prior art keywords
face
current frame
face attribute
result
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210254925.3A
Other languages
Chinese (zh)
Inventor
王琼瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
RDA Microelectronics Beijing Co Ltd
Original Assignee
RDA Microelectronics Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by RDA Microelectronics Beijing Co Ltd filed Critical RDA Microelectronics Beijing Co Ltd
Priority to CN202210254925.3A priority Critical patent/CN114627345A/en
Publication of CN114627345A publication Critical patent/CN114627345A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A face attribute detection method and device, a storage medium, and a terminal are provided. The method comprises the following steps: acquiring a face image of a current frame; performing face attribute detection on the face image of the current frame to obtain face attribute information of the current frame; and judging whether the face image of the current frame and the face image of the previous frame belong to the same person, and if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame and determining the face attribute result of the current frame according to the processing result. With the scheme provided by the invention, the face attribute detection result can be made more stable.

Description

Face attribute detection method and device, storage medium and terminal
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for detecting human face attributes, a storage medium and a terminal.
Background
With the development of image processing technology, technologies for detecting face attributes from face images have emerged. For example, the age, gender, and the like of a person can be identified from a face image. When the prior art is used to detect face attributes in a video stream, the detection result is often unstable. Therefore, a face attribute detection method is needed to improve the stability of the face attribute detection result.
Disclosure of Invention
The technical problem solved by the invention is how to improve the stability of the human face attribute detection result.
In order to solve the above technical problem, an embodiment of the present invention provides a method for detecting a face attribute, where the method includes: acquiring a face image of a current frame; performing face attribute detection on the face image of the current frame to obtain face attribute information of the current frame; and judging whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame, and determining the face attribute result of the current frame according to the processing result.
Optionally, the determining whether the face image of the current frame and the face image of the previous frame belong to the same person includes: performing face recognition on the face image of the current frame to obtain an identity feature vector of the current frame; acquiring an identity feature vector of the previous frame; calculating the feature distance between the identity feature vector of the current frame and the identity feature vector of the previous frame; and judging whether the feature distance is smaller than a preset distance threshold, if so, determining that the face image of the current frame and the face image of the previous frame belong to the same person, and if not, determining that they do not belong to the same person.
Optionally, the face attribute information is a face attribute feature or a face attribute result.
Optionally, the type of the face attribute result is a continuous numerical type, and smoothing the face attribute information of the current frame according to the face attribute information of the previous frame includes: performing weight-based calculation on the face attribute result of the previous frame and the face attribute result of the current frame to obtain the processing result.
Optionally, before the weight-based calculation is performed, the method further includes: calculating an error between the face attribute result of the current frame and the face attribute result of the previous frame, and recording it as a current error; judging whether the current error is greater than or equal to a first preset error threshold, and if so, judging whether the face image of the next frame and the face image of the current frame belong to the same person; and if the face image of the next frame and the face image of the current frame belong to the same person, performing weight-based calculation on the face attribute result of the next frame and the face attribute result of the current frame to obtain the processing result.
Optionally, the type of the face attribute result is a discrete numerical value type, and performing smoothing processing on the face attribute information of the current frame according to the face attribute information of the previous frame includes: calculating the error between the face attribute result of the current frame and the face attribute result of the previous frame, and recording as the current error; calculating the sum of the accumulated error and the current error, and taking the sum as the accumulated error; and judging whether the accumulated error is smaller than a third preset error threshold value, if so, taking the face attribute result of the previous frame as the processing result, otherwise, taking the face attribute result of the current frame as the processing result and resetting the accumulated error.
Optionally, before calculating the sum of the accumulated error and the current error, the method further includes: judging whether the current error is greater than or equal to a fourth preset error threshold value, if so, judging whether the face image of the next frame and the face image of the current frame belong to the same person; and if the face image of the next frame and the face image of the current frame belong to the same person and the error between the face attribute result of the next frame and the face attribute result of the current frame is smaller than a fourth preset error threshold, taking the face attribute result of the current frame as the processing result.
Optionally, the smoothing processing on the face attribute information of the current frame according to the face attribute information of the previous frame includes: and carrying out fusion processing on the face attribute characteristics of the previous frame and the face attribute characteristics of the current frame to obtain the processing result.
In order to solve the above technical problem, an embodiment of the present invention further provides a device for detecting a face attribute, where the device includes: the acquisition module is used for acquiring a face image of a current frame; the detection module is used for carrying out face attribute detection on the face image of the current frame so as to obtain face attribute information of the current frame; and the post-processing module is used for judging whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame, and determining the face attribute result of the current frame according to the processing result.
The embodiment of the present invention further provides a storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above-mentioned method for detecting a face attribute are executed.
The embodiment of the present invention further provides a terminal, which includes a memory and a processor, where the memory stores a computer program that can be run on the processor, and the processor executes the steps of the above-mentioned method for detecting a face attribute when running the computer program.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
in the scheme of the embodiment of the invention, after the face attribute information of the current frame is obtained by calculation, whether the face image of the current frame and the face image of the previous frame belong to the same person is judged. Because the face attribute refers to the physiological attribute of the photographed person and has the characteristics of self stability and individual difference, if the face image of the current frame and the face image of the previous frame belong to the same person, the face attribute information of the current frame is smoothed, and the face attribute result of the current frame is determined according to the processing result. By adopting the scheme, the condition that the human face attribute result of the same person has mutation can be reduced, and the stability of human face attribute detection is favorably improved.
Further, in the scheme of this embodiment, when the type of the face attribute result is a continuous numerical type, before weight-based calculation is performed on the face attribute result of the previous frame and the face attribute result of the current frame, an error between the two results is calculated and recorded as a current error; if the current error is greater than or equal to a first preset error threshold, whether the face image of the next frame and the face image of the current frame belong to the same person is judged; and if they belong to the same person, weight-based calculation is performed on the face attribute result of the next frame and the face attribute result of the current frame to obtain the processing result. With this scheme, the face attribute result of the same photographed person can be made more stable, the situation in which the face attribute results of different photographed persons are mistakenly treated as those of the same person due to an identity recognition error can be reduced, and both the stability and the accuracy of the detection result of the numerical attribute can be taken into account.
Further, in the scheme of this embodiment, when the type of the face attribute result is a discrete numerical type, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame includes: calculating the error between the face attribute result of the previous frame and the face attribute result of the current frame, and recording it as the current error; adding the current error to the accumulated error; and judging whether the accumulated error is smaller than a third preset error threshold, and if so, taking the face attribute result of the previous frame as the processing result, otherwise taking the face attribute result of the current frame as the processing result and resetting the accumulated error. With this scheme, the face attribute result of the same photographed person can be made more stable, the situation in which the face attribute results of different photographed persons are mistakenly treated as those of the same person due to an identity recognition error can be reduced, and both the stability and the accuracy of the detection result of the categorical attribute can be taken into account.
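The accumulated-error logic for categorical attributes can be sketched as follows (a minimal sketch: the threshold value of 3 and the numeric encoding of categories are illustrative assumptions, not values from the patent):

```python
class DiscreteSmoother:
    """Accumulated-error smoothing for discrete (categorical) attribute results."""

    def __init__(self, error_threshold=3):
        self.error_threshold = error_threshold  # "third preset error threshold" (assumed value)
        self.accumulated_error = 0

    def smooth(self, prev_result, curr_result):
        # Current error between the two frames' results.
        current_error = abs(curr_result - prev_result)
        # Add it to the running accumulated error.
        self.accumulated_error += current_error
        if self.accumulated_error < self.error_threshold:
            return prev_result          # keep the stable historical result
        self.accumulated_error = 0      # reset after accepting the change
        return curr_result
```

With gender encoded as 0/1, for example, three consecutive disagreeing frames are needed before the reported result flips, so a single misdetection does not change the output.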
Drawings
Fig. 1 is a schematic flow chart of a method for detecting human face attributes in an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a device for detecting human face attributes in an embodiment of the present invention.
Detailed Description
As described in the background art, a method for detecting human face attributes is needed to improve the stability of the detection result of human face attributes.
Research shows that, when the prior art is used to detect face attributes, the detection is easily affected by the face pose, the illumination environment, and the like, so that the detection result is unstable. For example, when detecting age, changes in the subject's pose (such as lowering or raising the head) tend to cause large fluctuations in the detected age.
In order to solve the above technical problem, an embodiment of the present invention provides a method for detecting a face attribute, in a scheme of the embodiment of the present invention, after face attribute information of a current frame is obtained by calculation, it is determined whether a face image of the current frame and a face image of a previous frame belong to the same person. Because the face attribute refers to the physiological attribute of the photographed person and has the characteristics of self stability and individual difference, if the face image of the current frame and the face image of the previous frame belong to the same person, the face attribute information of the current frame is smoothed, and the face attribute result of the current frame is determined according to the processing result. By adopting the scheme, the condition that the human face attribute result of the same person has mutation can be reduced, and the stability of human face attribute detection is favorably improved.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying figures are described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a method for detecting a face attribute according to an embodiment of the present invention. The method may be performed by a terminal, which may be any existing terminal device with data receiving and processing capabilities, such as, but not limited to, a mobile phone, a computer, an internet of things device, a server, and the like. By executing the scheme of the embodiment of the invention, the situation that the attribute detection result of the same person changes in a short time can be reduced, so that the detection result of the face attribute in the video stream can be more stable, and the user experience can be improved.
The scheme in this embodiment is used to detect a face attribute in a video stream, where the video stream includes a face image of a subject, and the face attribute refers to a physiological attribute of the subject and is not limited to an attribute of the face of the subject. Specifically, the face attributes in the present embodiment have characteristics of self-stability and individual variability, and more specifically, the face attributes of the same person do not undergo mutation in a short time, and for example, the face attributes may be age, skin color, gender, and the like, but are not limited thereto. In other words, the attributes of the face, which may change abruptly in a short time, are not the attributes of the face to be detected in the embodiment of the present invention, for example, the pose, the expression, whether the face has an occlusion, whether glasses are worn, and the like.
The method for detecting the attributes of the human face shown in fig. 1 may include:
step S101: acquiring a face image of a current frame;
step S102: performing face attribute detection on the face image of the current frame to obtain face attribute information of the current frame;
step S103: and judging whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame, and determining the face attribute result of the current frame according to the processing result.
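Steps S101 to S103 can be sketched as follows (a minimal sketch; the function names detect_attributes, same_person, and smooth stand in for the models and comparisons described below and are illustrative assumptions, not from the patent):

```python
def process_frame(curr_face, prev_face, prev_attr,
                  detect_attributes, same_person, smooth):
    """Per-frame face attribute detection with identity-gated smoothing."""
    curr_attr = detect_attributes(curr_face)              # step S102
    if prev_face is not None and same_person(curr_face, prev_face):
        # Same person as the previous frame: smooth before reporting (step S103).
        return smooth(prev_attr, curr_attr)
    return curr_attr                                      # new person: raw result
```

For instance, with smooth as a 0.5/0.5 weighted average, an age result jumping from 30 to 34 between frames of the same person would be reported as 32.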
It is understood that in a specific implementation, the method may be implemented by a software program running in a processor integrated within a chip or a chip module; alternatively, the method can be implemented in hardware or a combination of hardware and software.
In a specific implementation of step S101, a face image of a current frame may be acquired from a video stream. The video stream may be a video being recorded, or may be a video stream that is recorded, which is not limited in this embodiment.
In a specific implementation, images may be extracted from the video stream at preset time intervals; the image extracted at the current time may be denoted as the current frame, the image extracted at the previous time as the previous frame, and the image extracted at the next time as the next frame.
Further, the video stream may include an image of the person to be photographed, and after the image is extracted each time, it may be determined whether the image includes a face region, and if so, the face region in the image may be extracted to obtain a face image. Specifically, the extracted image may be subjected to face detection to obtain a face detection result, and the face detection result may be used to indicate whether the image contains a face. It should be noted that, in the scheme of this embodiment, it is not limited that each acquired frame image includes a human face.
In a specific implementation of step S102, the face attribute detection may be performed on the face image of the current frame to obtain the face attribute information of the current frame. The face attribute information may be a face attribute feature or a face attribute result. The face attribute features refer to feature parameters for describing the face features, and may be, for example, feature vectors identified by a face attribute detection algorithm; the face attribute result refers to a recognition result obtained by performing calculation based on the face attribute features, and the recognition result may be a value describing each dimension of the face, such as a value of a dimension of age, gender, and the like.
In a specific example, the face image of the current frame may be input to an attribute detection model obtained by pre-training to obtain the face attribute information of the current frame.
More specifically, the attribute detection model may be obtained by training a first preset model in advance by using first training data, where the first training data may include a plurality of face images, each face image having an attribute tag, and the attribute tag may be used to indicate a face attribute in the face image. The method for training the first preset model by using the first training data may be any suitable training method, and the embodiment does not limit this. When the preset training end condition is met, the attribute detection model can be obtained. The preset training end condition may also be various existing suitable training end conditions, for example, the batch of training data reaches a first preset value, the error rate is less than a second preset value, and the like, but is not limited thereto.
Further, the trained attribute detection model may include: the device comprises a first feature extraction module and a prediction module, wherein the first feature extraction module can be used for extracting feature vectors of the face image. In other words, the input of the first feature extraction module may be a face image, the output may be a feature vector, and the output of the first feature extraction module may be recorded as a face attribute feature. The prediction module can be used for calculating a detection result of the face attribute according to the face attribute feature output by the first feature extraction module and recording the detection result as a face attribute result. In other words, the input to the prediction module may be a face attribute feature and the output may be a face attribute result. In a specific embodiment, the face attribute information in step S102 may refer to: and (5) obtaining a face attribute result.
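As a toy illustration of this two-module structure (the linear layers, dimensions, and random weights are assumptions made for the sketch, not the patent's architecture):

```python
import numpy as np

class AttributeDetectionModel:
    """Sketch: a feature extraction module followed by a prediction module."""

    def __init__(self, in_dim=128, feat_dim=64, n_attrs=2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_feat = rng.standard_normal((in_dim, feat_dim))   # first feature extraction module
        self.W_pred = rng.standard_normal((feat_dim, n_attrs))  # prediction module

    def extract_features(self, face_image_vec):
        # Output of the first module: the "face attribute feature" (a feature vector).
        return np.tanh(face_image_vec @ self.W_feat)

    def predict(self, attr_feature):
        # Output of the prediction module: the "face attribute result".
        return attr_feature @ self.W_pred

    def __call__(self, face_image_vec):
        return self.predict(self.extract_features(face_image_vec))
```

The split matters for step S102: the pipeline can stop after extract_features (face attribute feature) or run both modules (face attribute result), matching the two meanings of "face attribute information" above.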
In other embodiments, the face attribute information in step S102 may also be: the face attribute feature, in other words, the face attribute information, may also refer to a feature vector used for calculating a face attribute result, that is, the face attribute information may also be the face attribute feature.
In the specific implementation of step S103, it may be determined whether the face image of the current frame and the face image of the previous frame belong to the same person.
In a specific implementation, before step S103 is executed, a face detection result of a previous frame and a face detection result of a current frame may be obtained, where the face detection result of the previous frame may be used to indicate whether the previous frame includes a face image, and the face detection result of the current frame may be used to indicate whether the current frame includes a face image. If the previous frame does not contain the face image and the current frame contains the face image, the face attribute result of the current frame can be determined only according to the face attribute information in step S102. By adopting the scheme, if the previous frame does not contain the face image and the current frame contains the face image, the fact that the shot person in the current frame is not the same person as the shot person in the previous frame can be determined, and the situation that the shot person changes can be accurately identified.
It should be noted that, in the scheme of this embodiment, the determining the face attribute of the current frame according to the face attribute information obtained in step S102 may include: if the face attribute information is the face attribute result, the face attribute information obtained in step S102 may be used as the face attribute result of the current frame image. If the face attribute information is the face attribute feature, the face attribute result of the current frame can be obtained by calculation only according to the face attribute feature of the current frame. For example, the face attribute feature of the current frame may be input to the prediction module to obtain a face attribute result of the current frame.
Further, if the previous frame includes a face image and the current frame includes a face image, on one hand, the identity information of the previous frame may be obtained, and on the other hand, face recognition may be performed on the face image of the current frame to obtain the identity information of the current frame. And comparing the identity information of the previous frame with the identity information of the current frame to judge whether the face image of the previous frame and the face image of the current frame belong to the same person.
In specific implementation, the identity information may be an identity feature vector, and the face image of the current frame may be input to a face recognition model obtained through pre-training to obtain the identity feature vector of the current frame. Furthermore, the identity feature vector of the current frame can be stored, so that whether the face image of the next frame and the face image of the current frame belong to the same person or not can be judged in the following.
Specifically, the face recognition model may be obtained by training a second preset model by using second training data in advance, where the second training data may include a plurality of face images, each face image has an identity tag, and the identity tag may be used to indicate an identity of a person to be photographed in the face image. The method for training the second preset model by using the second training data may be any suitable training method, and the embodiment does not limit this. When the preset training end condition is met, the face recognition model can be obtained.
Further, the trained face recognition model may include a second feature extraction module; the input of the second feature extraction module may be a face image, and the output may be a face identity feature. In other words, the second feature extraction module may be configured to extract a feature vector of the face image; the feature vector extracted by the second feature extraction module may be recorded as an identity feature vector, and the identity feature vector may be used to determine the identity of the photographed person corresponding to the face image.
Further, an identity feature vector of a previous frame may be obtained, and a feature distance between the identity feature vector of the current frame and the identity feature vector of the previous frame may be calculated and may be recorded as the first feature distance. The first characteristic distance may be a euclidean distance, a cosine distance, etc., which is not limited in this embodiment.
Further, whether the first characteristic distance is smaller than a preset distance threshold value or not can be judged, and if yes, the face image of the current frame and the face image of the previous frame can be determined to belong to the same person; if the characteristic distance is greater than or equal to the preset distance threshold, it can be determined that the face image of the current frame and the face image of the previous frame do not belong to the same person.
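The identity comparison just described can be sketched as follows (cosine distance is one of the options mentioned above; the 0.6 threshold is an illustrative assumption, not a value from the patent):

```python
import numpy as np

def same_person(id_vec_curr, id_vec_prev, dist_threshold=0.6):
    """Return True if the feature distance between two identity vectors is below the threshold."""
    a = np.asarray(id_vec_curr, dtype=float)
    b = np.asarray(id_vec_prev, dtype=float)
    # Cosine distance: 0 for identical directions, 1 for orthogonal vectors.
    cos_sim = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    feature_distance = 1.0 - cos_sim
    return feature_distance < dist_threshold
```

Swapping in a Euclidean distance (np.linalg.norm(a - b)) would follow the same pattern with a correspondingly rescaled threshold.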
Further, if the face image of the current frame and the face image of the previous frame do not belong to the same person, the face attribute result of the current frame may be determined only according to the face attribute information obtained in step S102. If the face image of the current frame and the face image of the previous frame belong to the same person, smoothing processing can be performed on the face attribute information of the current frame, and a face attribute result of the current frame is determined according to the processing result.
Specifically, if the face attribute information is a face attribute result, the processing result can be directly used as the face attribute result of the current frame and output; if the face attribute information is the face attribute feature, the processing result is also the face attribute feature, and the processing result can be further calculated to obtain and output the face attribute result of the current frame.
In the solution of this embodiment, if the face attribute information is a face attribute result, different smoothing processing manners may be adopted for different types of attributes. In particular implementations, the face attribute results are typically represented in numerical form. For example, age may be expressed as a numerical value; for another example, gender may be represented by a numerical value, with a face attribute result of 0 indicating gender as male and a face attribute result of 1 indicating gender as female.
Furthermore, the types of attributes are different, and the values of the face attribute result have different characteristics. Specifically, the face attribute result may be a continuous numerical type or a discrete numerical type. The face attribute result of the continuous numerical type may correspond to the numerical type attribute, and the face attribute result of the continuous numerical type may be used to indicate the size of the numerical type attribute. The discrete-numeric face attribute result may correspond to a categorical attribute, and the discrete-numeric face attribute result may indicate a category of the categorical attribute. For example, the numerical attribute may be age, body temperature, and the like. As another example, the categorical attribute may be gender, skin color, and the like.
On one hand, if the type of the face attribute result is a continuous numerical type, weight-based calculation may be performed on the face attribute result of the previous frame and the face attribute result of the current frame to obtain a processing result, and then the processing result may be used as the face attribute result of the current frame.
In one non-limiting example, the face attribute result of the previous frame and the face attribute result of the current frame may be weighted and summed to obtain the processing result. For example, the weight of the face attribute result of the previous frame and the weight of the face attribute result of the current frame may both be 0.5, but are not limited thereto. In other embodiments, the sum of the weight of the face attribute result of the previous frame and the weight of the face attribute result of the current frame is 1 and the weight of the face attribute result of the previous frame may be greater than the weight of the face attribute result of the current frame. By adopting the scheme, the condition that the human face attribute result is mutated can be reduced, and the detection result of the numerical attribute is more stable.
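A minimal sketch of this weight-based calculation, with the two weights summing to 1 and 0.5/0.5 as in the example above:

```python
def smooth_continuous(prev_result, curr_result, prev_weight=0.5):
    """Weighted sum of the previous and current attribute results.

    Raising prev_weight above 0.5 biases the output toward the
    historical value, as described in the text.
    """
    return prev_weight * prev_result + (1.0 - prev_weight) * curr_result
```

For example, smooth_continuous(30, 34) yields 32.0, while smooth_continuous(30, 34, prev_weight=0.8) yields 30.8, damping the jump more strongly.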
In another non-limiting example, before the weight-based calculation is performed on the face attribute result of the previous frame and the face attribute result of the current frame, an error between the face attribute result of the current frame and the face attribute result of the previous frame may be calculated and recorded as a current error.
Further, it may be determined whether the current error is greater than or equal to a first preset error threshold, and if the current error is less than the first preset error threshold, weight-based calculation may be performed on the face attribute result of the previous frame and the face attribute result of the current frame. If the current error is greater than or equal to the first preset error threshold, it may be determined that there may be an error in the identity comparison, that is, the face image of the current frame and the face image of the previous frame may not belong to the same person.
Further, if the current error is greater than or equal to the first preset error threshold, the face image of the next frame may be obtained, along with the face attribute information and the identity feature vector of the next frame. For more details on obtaining the face attribute information and the identity feature vector of the next frame, reference may be made to the above description of obtaining those of the current frame, and details are not repeated here.
Further, it is determined whether the face image of the next frame and the face image of the current frame belong to the same person. Specifically, a feature distance between the identity feature vector of the next frame and the identity feature vector of the current frame may be calculated, referred to as a second feature distance. If the second feature distance is smaller than the preset distance threshold, it may be determined that the face image of the next frame and the face image of the current frame belong to the same person. For more details about the second feature distance, reference may be made to the above description of the first feature distance, which is not repeated here.
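The same-person determination by feature distance may be sketched as follows. The Euclidean metric and all names used here are illustrative assumptions; this embodiment does not fix a particular distance metric:

```python
import math


def feature_distance(vec_a, vec_b) -> float:
    """Euclidean distance between two identity feature vectors
    (one common choice of metric; others, e.g. cosine, are possible)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))


def same_person(vec_a, vec_b, distance_threshold: float) -> bool:
    """Two face images are taken to belong to the same person when the
    feature distance is smaller than the preset distance threshold."""
    return feature_distance(vec_a, vec_b) < distance_threshold


v_next = [0.1, 0.2, 0.3]
v_curr = [0.1, 0.25, 0.3]
matched = same_person(v_next, v_curr, 0.5)  # small distance -> same person
```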
Further, if the face image of the next frame and the face image of the current frame belong to the same person, a weight-based calculation may be performed on the face attribute result of the next frame and the face attribute result of the current frame to obtain the processing result.
In a specific implementation, it may be determined whether the error between the face attribute result of the next frame and the face attribute result of the current frame is smaller than a second preset error threshold. If so, the weight-based calculation may be performed on the face attribute result of the current frame and the face attribute result of the next frame to obtain the processing result. If the error is greater than or equal to the second preset error threshold, the weight-based calculation may still be performed on the face attribute result of the previous frame and the face attribute result of the current frame. The second preset error threshold is smaller than or equal to the first preset error threshold. In other words, in the solution of this embodiment, before the face attribute result of the next frame is used to smooth the face attribute result of the current frame, a double determination is needed: whether the second feature distance is smaller than the preset distance threshold, and whether the error between the face attribute results of the next frame and the current frame is smaller than the second preset error threshold.
Further, if the face image of the next frame and the face image of the current frame do not belong to the same person, the weight-based calculation may be performed on the face attribute result of the previous frame and the face attribute result of the current frame to obtain the processing result of the smoothing of the current frame.
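Putting the above steps together, the double determination for continuous-type results may be sketched as follows (all names and the 0.5 weight are illustrative assumptions; the same-person check on the next frame is taken as a precomputed boolean):

```python
def smooth_with_double_check(prev_res: float, curr_res: float,
                             next_res: float, next_same_person: bool,
                             first_thr: float, second_thr: float,
                             weight: float = 0.5) -> float:
    """Continuous-type smoothing with the double determination described
    above. second_thr is assumed to be <= first_thr."""
    def wsum(a: float, b: float) -> float:
        return weight * a + (1.0 - weight) * b

    current_error = abs(curr_res - prev_res)
    if current_error < first_thr:
        # identity comparison trusted: normal smoothing with previous frame
        return wsum(prev_res, curr_res)
    # identity comparison may be erroneous: verify with the next frame
    if next_same_person and abs(next_res - curr_res) < second_thr:
        return wsum(next_res, curr_res)   # smooth with the next frame
    return wsum(prev_res, curr_res)       # fall back to the previous frame


# large jump (30 -> 40) while the next frame agrees with the current one:
result = smooth_with_double_check(30.0, 40.0, 41.0, True, 5.0, 3.0)  # 40.5
```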
By adopting this scheme, the face attribute result of the same photographed person can be made more stable, and the chance of mistaking the face attribute results of different photographed persons for those of the same person when identity recognition is erroneous can be reduced, so that both the stability and the accuracy of the detection result of the numerical attribute can be taken into account.
On the other hand, if the type of the face attribute result is a discrete numerical type, an error between the face attribute result of the previous frame and the face attribute result of the current frame may be calculated and recorded as the current error. Further, the sum of the accumulated error and the current error may be calculated and used as the accumulated error; in other words, the accumulated error is updated to the sum of its value before the update and the current error.
Further, it may be determined whether the updated accumulated error is smaller than a third preset error threshold. If the updated accumulated error is smaller than the third preset error threshold, the face attribute result of the previous frame may be used as the processing result. If the updated accumulated error is greater than or equal to the third preset error threshold, the face attribute result of the current frame is used as the processing result and the accumulated error is cleared.
In other words, when the updated accumulated error is smaller than the third preset error threshold, it may be determined that the error between the face attribute result of the current frame and that of the previous frame is likely caused by changes in other factors (e.g., the illumination environment or the face pose), and the face attribute result of the previous frame is still used as the reference. When the updated accumulated error is greater than or equal to the third preset error threshold, errors have persisted between the face attribute results of multiple frames and those of their respective previous frames, so it may be determined that the error between the face attribute results of the current frame and the previous frame is caused by a change of the photographed person, and the face attribute result of the current frame may be used as the reference.
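The accumulated-error scheme for discrete-type results described above may be sketched as follows (the class name and the threshold value are illustrative assumptions):

```python
class DiscreteSmoother:
    """Accumulated-error smoothing for discrete (categorical) attribute
    results, as described in the embodiment above."""

    def __init__(self, third_threshold: float):
        self.third_threshold = third_threshold
        self.accumulated_error = 0.0

    def smooth(self, prev_result: int, curr_result: int) -> int:
        current_error = abs(curr_result - prev_result)
        self.accumulated_error += current_error
        if self.accumulated_error < self.third_threshold:
            return prev_result            # transient change: keep previous
        self.accumulated_error = 0.0      # persistent change: accept current
        return curr_result


s = DiscreteSmoother(third_threshold=2.0)
first = s.smooth(1, 2)    # single flicker ignored -> 1
second = s.smooth(1, 2)   # repeated disagreement accepted -> 2
```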
By adopting this scheme, the face attribute result of the same photographed person can be made more stable, and the chance of mistaking the face attribute results of different photographed persons for those of the same person when identity recognition is erroneous can be reduced, so that both the stability and the accuracy of the detection result of the categorical attribute can be taken into account.
In a non-limiting example, after calculating the error (i.e., the current error) between the face attribute result of the previous frame and the face attribute result of the current frame, and before calculating the sum of the accumulated error and the current error, it may be determined whether the current error is greater than or equal to a fourth preset error threshold.
If the current error is smaller than the fourth preset error threshold, it can be determined that the result of the identity comparison between the previous frame and the current frame is correct, the sum of the accumulated error and the current error can be further calculated, and the above steps are continuously performed.
If the current error is greater than or equal to the fourth preset error threshold, it can be determined that the identity comparison may have an error. Further, it can be determined whether the face image of the next frame and the face image of the current frame belong to the same person.
If the face image of the next frame and the face image of the current frame do not belong to the same person, the possibility that the identity comparison result of the previous frame and the current frame is wrong can be eliminated, so that the sum of the accumulated error and the current error can be further calculated, and the steps are continuously executed.
If the face image of the next frame and the face image of the current frame belong to the same person, and the error between the face attribute result of the next frame and the face attribute result of the current frame is smaller than a fourth preset error threshold, it can be determined that the result of identity comparison between the previous frame and the current frame is wrong, so that the face attribute result of the current frame can be used as the processing result, and the current accumulated error can be cleared.
If the face image of the next frame and the face image of the current frame belong to the same person, but the error between their face attribute results is greater than or equal to the fourth preset error threshold, it cannot be accurately verified from the next frame alone whether the identity comparison between the previous frame and the current frame was erroneous. In this case, considering processing efficiency and stability, the sum of the accumulated error and the current error may be calculated, and the above steps continued.
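The verification steps above may be sketched as follows (all names are illustrative; a returned result of None stands for "continue with the accumulated-error steps"):

```python
def verify_and_update(prev_res: int, curr_res: int, next_res: int,
                      next_same_person: bool, fourth_thr: float,
                      accumulated_error: float):
    """Returns (result_or_None, new_accumulated_error).

    None means the identity comparison was not judged erroneous and the
    accumulated-error scheme should continue as described above."""
    current_error = abs(curr_res - prev_res)
    if current_error < fourth_thr:
        # identity comparison judged correct: keep accumulating
        return None, accumulated_error + current_error
    if next_same_person and abs(next_res - curr_res) < fourth_thr:
        # identity comparison judged erroneous: adopt the current result
        # and clear the accumulated error
        return curr_res, 0.0
    # cannot verify with the next frame alone: keep accumulating
    return None, accumulated_error + current_error


res, acc = verify_and_update(1, 5, 5, True, 3.0, 0.0)  # -> (5, 0.0)
```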
It should be noted that, in this embodiment, the magnitude relationship between the third preset error threshold and the first and second preset error thresholds is not limited, nor is the magnitude relationship between the fourth preset error threshold and the first and second preset error thresholds.
In a non-limiting example, the face attribute information in step S102 may be a face attribute feature; in this case, the face attribute feature of the previous frame and the face attribute feature of the current frame may be fused to obtain the processing result. In other words, the processing result may be the fused face attribute feature. In a specific implementation, the fusion may be a weighted summation of the face attribute features of the previous frame and the current frame, where the sum of the two weights may be 1.
More specifically, the weight of the face attribute feature of the previous frame may be determined according to the first feature distance. The smaller the first feature distance, the higher the possibility that the face images of the previous frame and the current frame belong to the same person, and the greater the weight of the previous frame's face attribute feature. Conversely, the larger the first feature distance, the lower that possibility, and the smaller the weight of the previous frame's face attribute feature.
Further, a face attribute result can be determined according to the fused face attribute features. Specifically, the processing result may be input to a prediction module of the attribute detection model to obtain a face attribute result output by the prediction module. By adopting the scheme, the stability of the detection result of the face attribute can be kept, and the smoothed detection result is more accurate.
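The fusion by weighted summation, with the previous frame's weight shrinking as the first feature distance grows, may be sketched as follows. The linear weighting rule and all names are illustrative assumptions; this embodiment does not fix a particular weighting rule:

```python
def fuse_features(prev_feat, curr_feat, first_distance: float,
                  max_distance: float):
    """Fuse the previous and current frames' attribute feature vectors.

    The previous frame's weight decreases linearly from 0.5 (identical
    identity vectors) to 0.0 (distance >= max_distance), one possible way
    to realize the distance-dependent weighting described above."""
    prev_weight = max(0.0, 1.0 - first_distance / max_distance) * 0.5
    curr_weight = 1.0 - prev_weight
    return [prev_weight * p + curr_weight * c
            for p, c in zip(prev_feat, curr_feat)]


# identical identity vectors -> equal blend of the two feature vectors
fused = fuse_features([1.0, 0.0], [0.0, 1.0], first_distance=0.0,
                      max_distance=1.0)   # -> [0.5, 0.5]
```

The fused feature vector would then be fed to the prediction module of the attribute detection model to obtain the face attribute result.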
Referring to Fig. 2, which is a schematic structural diagram of an apparatus for detecting a face attribute according to an embodiment of the present invention, the apparatus shown in Fig. 2 may include:
an obtaining module 21, configured to obtain a face image of a current frame;
the detection module 22 is configured to perform face attribute detection on the face image of the current frame to obtain face attribute information of the current frame;
and the post-processing module 23 is configured to determine whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, perform smoothing processing on the face attribute information of the current frame according to the face attribute information of the previous frame, and determine a face attribute result of the current frame according to a processing result.
For more contents such as the working principle, the working method, and the beneficial effects of the device for detecting a face attribute in the embodiment of the present invention, reference may be made to the above description related to the method for detecting a face attribute, and details are not described herein again.
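The cooperation of the three modules may be sketched as follows (a minimal, non-limiting illustration; the callables standing in for the detection, identity-comparison, and smoothing logic are placeholders, not part of this disclosure):

```python
class FaceAttributeDetector:
    """Minimal sketch of the apparatus of Fig. 2: an obtaining/detection
    path (modules 21, 22) and a post-processing module (23)."""

    def __init__(self, detect_fn, same_person_fn, smooth_fn):
        self.detect_fn = detect_fn            # detection module 22
        self.same_person_fn = same_person_fn  # identity check in module 23
        self.smooth_fn = smooth_fn            # smoothing in module 23
        self.prev_info = None                 # previous frame's attribute info

    def process_frame(self, face_image):
        info = self.detect_fn(face_image)     # obtain + detect (modules 21, 22)
        if self.prev_info is not None and self.same_person_fn(face_image):
            info = self.smooth_fn(self.prev_info, info)  # post-process (23)
        self.prev_info = info
        return info


# toy stand-ins: the "image" is just a number, identity always matches,
# smoothing is an equal-weight average
det = FaceAttributeDetector(lambda img: float(img),
                            lambda img: True,
                            lambda p, c: 0.5 * (p + c))
r1 = det.process_frame(30.0)   # first frame: no previous result -> 30.0
r2 = det.process_frame(34.0)   # smoothed with previous -> 32.0
```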
The embodiment of the present invention further provides a storage medium having a computer program stored thereon; when the computer program is executed by a processor, the steps of the above-mentioned method for detecting a face attribute are performed. The storage medium may include a ROM, a RAM, a magnetic disk, an optical disk, or the like, and may further include a non-volatile memory or a non-transitory memory.
The embodiment of the present invention further provides a terminal, which includes a memory and a processor, where the memory stores a computer program that can be run on the processor, and the processor executes the steps of the above-mentioned method for detecting a face attribute when running the computer program. The terminal includes, but is not limited to, a mobile phone, a computer, a tablet computer and other terminal devices.
It should be understood that, in the embodiment of the present application, the processor may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It will also be appreciated that the memory in the embodiments of the present application can be volatile memory, non-volatile memory, or both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above-described embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer instructions or the computer program are loaded or executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer program may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer program may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly.
In the several embodiments provided in the present application, it should be understood that the disclosed method, apparatus and system may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; for example, the division of the unit is only a logic function division, and there may be another division manner in actual implementation; for example, various elements or components may be combined or may be integrated in another system or some features may be omitted, or not implemented. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit. For each device or product applied to or integrated into a chip, each module/unit it contains may be implemented by hardware such as a circuit, or at least part of the modules/units may be implemented by a software program running on a processor integrated within the chip, with the remaining (if any) modules/units implemented by hardware such as a circuit. For each device or product applied to or integrated into a chip module, each module/unit it contains may be implemented by hardware such as a circuit, and different modules/units may be located in the same component (e.g., a chip or a circuit module) or in different components of the chip module; alternatively, at least part of the modules/units may be implemented by a software program running on a processor integrated within the chip module, with the remaining (if any) modules/units implemented by hardware such as a circuit. For each device or product applied to or integrated into a terminal, each module/unit it contains may be implemented by hardware such as a circuit, and different modules/units may be located in the same component (e.g., a chip or a circuit module) or in different components within the terminal; alternatively, at least part of the modules/units may be implemented by a software program running on a processor integrated within the terminal, with the remaining (if any) modules/units implemented by hardware such as a circuit.
It should be understood that the term "and/or" herein describes only an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" herein indicates that the former and latter associated objects are in an "or" relationship.
The "plurality" appearing in the embodiments of the present application means two or more. The descriptions of "first", "second", and the like appearing in the embodiments of the present application are only for illustrating and distinguishing objects; they do not represent an order or a particular limitation on the number of devices, and do not constitute any limitation on the embodiments of the present application.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (11)

1. A method for detecting human face attributes is characterized by comprising the following steps:
acquiring a face image of a current frame;
performing face attribute detection on the face image of the current frame to obtain face attribute information of the current frame;
and judging whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame, and determining the face attribute result of the current frame according to the processing result.
2. The method of claim 1, wherein determining whether the face image of the current frame and the face image of the previous frame belong to the same person comprises:
carrying out face recognition on the face image of the current frame to obtain an identity characteristic vector of the current frame;
acquiring an identity feature vector of a previous frame;
calculating the characteristic distance between the identity characteristic vector of the current frame and the identity characteristic vector of the previous frame;
and judging whether the characteristic distance is smaller than a preset distance threshold; if so, determining that the face image of the current frame and the face image of the previous frame belong to the same person, and if not, determining that the face image of the current frame and the face image of the previous frame do not belong to the same person.
3. The method for detecting human face attribute according to claim 1, wherein the human face attribute information is a human face attribute feature or a human face attribute result.
4. The method of claim 3, wherein the type of the face attribute result is a continuous numerical type, and performing smoothing processing on the face attribute information of the previous frame and the face attribute information of the current frame comprises:
and performing weight-based calculation on the face attribute result of the previous frame and the face attribute result of the current frame to obtain the processing result.
5. The method of claim 4, wherein before performing weight-based calculation on the face attribute result of the previous frame and the face attribute result of the current frame, the method further comprises:
calculating the error between the face attribute result of the current frame and the face attribute result of the previous frame, and recording as the current error;
judging whether the current error is greater than or equal to a first preset error threshold value, if so, judging whether the face image of the next frame and the face image of the current frame belong to the same person;
and if the face image of the next frame and the face image of the current frame belong to the same person, performing weight-based calculation on the face attribute result of the next frame and the face attribute result of the current frame to obtain the processing result.
6. The method according to claim 3, wherein the type of the face attribute result is a discrete numerical type, and smoothing the face attribute information of the current frame according to the face attribute information of the previous frame comprises:
calculating the error between the face attribute result of the current frame and the face attribute result of the previous frame, and recording as the current error;
calculating the sum of the accumulated error and the current error, and taking the sum as the accumulated error;
and judging whether the accumulated error is smaller than a third preset error threshold value, if so, taking the face attribute result of the previous frame as the processing result, otherwise, taking the face attribute result of the current frame as the processing result and resetting the accumulated error.
7. The method of claim 6, wherein before calculating the sum of the accumulated error and the current error, the method further comprises:
judging whether the current error is greater than or equal to a fourth preset error threshold value or not, if so, judging whether the face image of the next frame and the face image of the current frame belong to the same person or not;
and if the face image of the next frame and the face image of the current frame belong to the same person and the error between the face attribute result of the next frame and the face attribute result of the current frame is smaller than a fourth preset error threshold, taking the face attribute result of the current frame as the processing result.
8. The method according to claim 3, wherein the face attribute information is the face attribute feature, and smoothing the face attribute information of the current frame according to the face attribute information of the previous frame includes:
and carrying out fusion processing on the face attribute characteristics of the previous frame and the face attribute characteristics of the current frame to obtain the processing result.
9. An apparatus for detecting attributes of a human face, the apparatus comprising:
the acquisition module is used for acquiring a face image of a current frame;
the detection module is used for carrying out face attribute detection on the face image of the current frame so as to obtain face attribute information of the current frame;
and the post-processing module is used for judging whether the face image of the current frame and the face image of the previous frame belong to the same person, if so, smoothing the face attribute information of the current frame according to the face attribute information of the previous frame, and determining the face attribute result of the current frame according to the processing result.
10. A storage medium having stored thereon a computer program, characterized in that the computer program, when being executed by a processor, performs the steps of the method for detecting a human face attribute of any one of claims 1 to 8.
11. A terminal comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, wherein the processor executes the computer program to perform the steps of the method for detecting attributes of a human face as claimed in any one of claims 1 to 8.
CN202210254925.3A 2022-03-15 2022-03-15 Face attribute detection method and device, storage medium and terminal Pending CN114627345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210254925.3A CN114627345A (en) 2022-03-15 2022-03-15 Face attribute detection method and device, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210254925.3A CN114627345A (en) 2022-03-15 2022-03-15 Face attribute detection method and device, storage medium and terminal

Publications (1)

Publication Number Publication Date
CN114627345A true CN114627345A (en) 2022-06-14

Family

ID=81901547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210254925.3A Pending CN114627345A (en) 2022-03-15 2022-03-15 Face attribute detection method and device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN114627345A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661701A (en) * 2022-10-09 2023-01-31 中国科学院半导体研究所 Real-time image processing method and device, electronic equipment and readable storage medium
CN116110100A (en) * 2023-01-14 2023-05-12 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium
CN116110100B (en) * 2023-01-14 2023-11-14 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US11497267B2 (en) Systems and methods for full body measurements extraction
CN107784282B (en) Object attribute identification method, device and system
US10962404B2 (en) Systems and methods for weight measurement from user photos using deep learning networks
US10395103B2 (en) Object detection method, object detection apparatus, and program
US20180157899A1 (en) Method and apparatus detecting a target
JP6332937B2 (en) Image processing apparatus, image processing method, and program
CN114627345A (en) Face attribute detection method and device, storage medium and terminal
US20170140210A1 (en) Image processing apparatus and image processing method
CN108197592B (en) Information acquisition method and device
US10146992B2 (en) Image processing apparatus, image processing method, and storage medium that recognize an image based on a designated object type
US20140086495A1 (en) Determining the estimated clutter of digital images
CN108198172B (en) Image significance detection method and device
WO2021143865A1 (en) Positioning method and apparatus, electronic device, and computer readable storage medium
US11126822B2 (en) Method and apparatus for obtaining painting
US20220237907A1 (en) Method, apparatus, device, medium and program for image detection and related model training
JP2013016170A (en) Method, device, and program for recognizing human behavior
US9940718B2 (en) Apparatus and method for extracting peak image from continuously photographed images
CN112818946A (en) Training of age identification model, age identification method and device and electronic equipment
CN111814653A (en) Method, device, equipment and storage medium for detecting abnormal behaviors in video
CN111079560A (en) Tumble monitoring method and device and terminal equipment
CN113221662B (en) Training method and device of face recognition model, storage medium and terminal
AU2021464323A1 (en) Electronic device and method for determining human height using neural networks
KR20220098314A (en) Training method and apparatus for neural network and related object detection method and apparatus
CN114332990A (en) Emotion recognition method, device, equipment and medium
CN109389089B (en) Artificial intelligence algorithm-based multi-person behavior identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination