CN112200109A - Face attribute recognition method, electronic device, and computer-readable storage medium - Google Patents

Face attribute recognition method, electronic device, and computer-readable storage medium

Info

Publication number
CN112200109A
CN112200109A (application CN202011112251.0A)
Authority
CN
China
Prior art keywords
model
region
attribute
facial
image
Legal status
Pending
Application number
CN202011112251.0A
Other languages
Chinese (zh)
Inventor
柳天驰
申省梅
马原
Current Assignee
Beijing Pengsi Technology Co., Ltd.
Original Assignee
Beijing Pengsi Technology Co., Ltd.
Application filed by Beijing Pengsi Technology Co., Ltd.
Priority to CN202011112251.0A
Publication of CN112200109A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Occluding parts, e.g. glasses; Geometrical relationships
    • G06V 40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a facial attribute recognition method, an electronic device, and a computer-readable storage medium. The method comprises: acquiring a facial image obtained by photographing a target object; extracting an occlusion feature value and an attribute feature value for each divided region in the facial image, where the occlusion feature value characterizes the degree to which the facial features of the target object in that region are occluded, and the attribute feature value characterizes the facial attribute information contained in that region; for each region in the facial image, determining an adjustment weight for the region's attribute feature value based on the region's occlusion feature value, where each region's adjustment weight is inversely proportional to its occlusion feature value; and determining a facial attribute of the target object based on each region's adjustment weight and corresponding attribute feature value. Embodiments of the application can improve the accuracy of the recognition result.

Description

Face attribute recognition method, electronic device, and computer-readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a facial attribute recognition method, an electronic device, and a computer-readable storage medium.
Background
Techniques for recognizing facial attributes from facial images have received wide attention. Facial attribute recognition depends on preconditions such as face detection, face alignment, and face quality assessment. When a face is occluded, the accuracy of facial attribute recognition suffers severely: occlusion destroys the inherent structure and geometric features of the face and greatly reduces the richness of the facial feature information.
When an ordinary model performs facial attribute recognition, it extracts information directly from the whole image, which can introduce a large amount of erroneous feature information; the larger the occluded proportion of the facial region, the more the accuracy of the facial attribute recognition algorithm degrades.
Disclosure of Invention
In view of the above, embodiments of the present application provide a facial attribute recognition method, an electronic device, and a computer-readable storage medium that improve the accuracy of recognition results.
In a first aspect, an embodiment of the present application provides a facial attribute identification method, where the method includes:
acquiring a face image obtained by shooting a target object;
extracting an occlusion feature value and an attribute feature value of each region in the facial image, wherein the occlusion feature value characterizes the degree to which the facial features of the target object in each region are occluded, and the attribute feature value characterizes the facial attribute information included in each region;
for each region in the facial image, determining an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region, wherein the adjustment weight for each region is inversely proportional to the occlusion feature value of the region;
determining a facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region.
In one embodiment, for each region in the face image, extracting an occlusion feature value and an attribute feature value of the region includes:
inputting the facial image into an occlusion feature extraction model in a pre-trained facial attribute recognition model to obtain an occlusion feature value of each region in the facial image;
and inputting the facial image into a facial attribute feature extraction model in the facial attribute recognition model to obtain attribute feature values of each region in the facial image.
In one embodiment, determining the facial attributes of the target object based on the adjustment weights and the corresponding attribute feature values of the respective regions comprises:
calculating the product value of the adjustment weight of each region and the corresponding attribute characteristic value for each region;
and inputting the product value corresponding to each region into a face attribute classification model in a face attribute identification model to obtain the face attribute of the target object.
In one embodiment, for each region in the face image, determining an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region comprises:
and inputting the occlusion feature value of each region into the attention model of the facial attribute recognition model to obtain the adjustment weight for the attribute feature value of the corresponding region.
In one embodiment, the facial attribute recognition model is trained according to the following steps:
acquiring a first training sample set, wherein the first training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises images in which a reference object is occluded and images in which the reference object is not occluded;
inputting a sample image into an occlusion feature extraction model in a face attribute identification model to obtain a first prediction feature value of each divided region in the sample image;
inputting the first prediction characteristic value of each region in the sample image into an attention model in the facial attribute recognition model to obtain a prediction adjustment weight of each region in the sample image;
inputting the sample image into a facial attribute feature extraction model of the facial attribute identification model to obtain a second prediction feature value of each region in the sample image;
determining a predicted facial attribute of the reference object included in the sample image based on the prediction adjustment weight and the corresponding second prediction feature value of each region in the sample image;
and, based on the predicted facial attributes and the actual facial attributes, holding the model parameters of the occlusion feature extraction model unchanged while adjusting the model parameters of the facial attribute feature extraction model and the attention model in the facial attribute recognition model, to obtain the trained facial attribute recognition model.
In one embodiment, the occlusion feature extraction model is trained according to the following steps:
acquiring a second training sample set, wherein the second training sample set comprises an occlusion image and an actual occlusion area corresponding to the occlusion image, and the occlusion image is an image in which a reference object is occluded;
inputting the occlusion image into a first model comprising an occlusion feature extraction model to obtain a predicted occlusion region;
adjusting model parameters of the first model according to the actual occlusion area and the predicted occlusion area to obtain a trained first model;
and determining the occlusion feature extraction model from the trained first model.
In one embodiment, the model parameters of the first model are adjusted according to the principle that the distance between the predicted occlusion region and the actual occlusion region is minimized.
In one embodiment, the facial attribute feature extraction model is trained according to the following steps:
acquiring a third training sample set, wherein the third training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises images in which the reference object is not occluded;
inputting the sample image into a second model comprising a facial attribute feature extraction model to obtain a predicted facial attribute;
adjusting model parameters of the second model according to the actual facial attributes and the predicted facial attributes to obtain a trained second model;
and determining the facial attribute feature extraction model from the trained second model.
In a second aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the processor executing the machine-readable instructions when the electronic device runs so as to perform the steps of the method of the first aspect described above.
In a third aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the method in the first aspect.
According to the facial attribute recognition method provided by the embodiments of the present application, the extracted occlusion feature information of each divided region in the facial image is used to generate an adjustment weight for the attribute feature value of the corresponding region: the more severe the occlusion, the smaller the adjustment weight, and the lighter the occlusion, the larger the adjustment weight. Adjusting the facial attribute features in the facial image with these weights reduces the response of occlusion feature information during facial attribute recognition, directs more attention to the unoccluded regions of the facial image, and improves the accuracy of the recognition result.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be regarded as limiting its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a schematic flow chart illustrating a facial attribute recognition method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating a facial attribute recognition process provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram illustrating a facial attribute recognition apparatus according to an embodiment of the present application;
fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below with reference to the accompanying drawings. It should be understood that the drawings in the present application are for illustration and description only and are not used to limit the scope of protection of the present application, and that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application; the operations of the flowcharts may be performed out of order, and steps without logical dependency may be performed in reverse order or simultaneously. Moreover, under the guidance of this application, one skilled in the art may add one or more other operations to a flowchart, or remove one or more operations from it.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
The accuracy of facial attribute recognition depends on preconditions such as face detection, face alignment, and face quality assessment, and is seriously affected when the face is occluded: occlusion destroys the inherent structure and geometric features of the face and greatly reduces the richness of facial feature information. By retaining the unoccluded parts of the facial region for facial attribute recognition, targeted processing can be performed according to the occlusion condition of the face.
For a locally occluded face, fully utilizing the facial feature information of the unoccluded part reduces the influence of the occluding object on attribute recognition of the whole face. On the premise of not reducing the accuracy of facial attribute recognition for unoccluded faces, the facial attribute recognition model can then adapt to various local occlusion conditions, significantly improving recognition accuracy in those cases and improving the robustness of the model.
In view of the above, the present application provides a facial attribute recognition model that, during facial attribute recognition, reduces the response of occlusion feature information to the recognition process, pays more attention to the regions of the facial image that contain facial attribute information (the unoccluded regions), and improves the accuracy of the recognition result.
The facial attribute recognition model comprises an occlusion feature extraction model, a facial attribute feature extraction model, an attention model, and a facial attribute classification model, and can be obtained through pre-training on training sample sets. The training process comprises training the facial attribute recognition model, the occlusion feature extraction model, and the facial attribute feature extraction model. The facial attribute recognition model is trained according to the following steps:
A11, acquiring a first training sample set, wherein the first training sample set comprises sample images and the actual facial attributes corresponding to the sample images, and the sample images comprise images in which a reference object is occluded and images in which the reference object is not occluded;
A12, inputting a sample image into the occlusion feature extraction model in the facial attribute recognition model to obtain a first prediction feature value for each divided region in the sample image;
A13, inputting the first prediction feature value of each region in the sample image into the attention model in the facial attribute recognition model to obtain a prediction adjustment weight for each region in the sample image;
A14, inputting the sample image into the facial attribute feature extraction model of the facial attribute recognition model to obtain a second prediction feature value for each region in the sample image;
A15, determining a predicted facial attribute of the reference object included in the sample image based on the prediction adjustment weight and the corresponding second prediction feature value of each region in the sample image;
A16, based on the predicted facial attributes and the actual facial attributes, holding the model parameters of the occlusion feature extraction model unchanged while adjusting the model parameters of the facial attribute feature extraction model and the attention model in the facial attribute recognition model, to obtain the trained facial attribute recognition model.
Here, the sample images include images in which the reference object is occluded (occluded images) and images in which it is not (unoccluded images). The reference object may be a human face, the size of the occluded region in an occluded image need not be fixed, and the ratio of occluded to unoccluded images among the samples may be set according to actual conditions. The actual facial attributes are those of the reference object, for example its gender, age, and expression.
In a specific implementation, after the first training sample set containing sample images is obtained, each sample image is divided into regions, and the region-divided sample image is input into the occlusion feature extraction model of the facial attribute recognition model to obtain a first prediction feature value for each region. That is, both the occluded images and the unoccluded images are input into the occlusion feature extraction model, yielding a first prediction feature value for each region of each image.
The first prediction feature values of the regions of the occluded image are input into the attention model of the facial attribute recognition model to obtain prediction adjustment weights for each region of the occluded image; likewise, the first prediction feature values of the regions of the unoccluded image are input into the attention model to obtain prediction adjustment weights for each region of the unoccluded image.
The occluded and unoccluded images are input into the facial attribute feature extraction model of the facial attribute recognition model, yielding a second prediction feature value for each region of the occluded image and of the unoccluded image, respectively. The occlusion feature extraction model and the facial attribute feature extraction model may be, but are not limited to, convolutional neural network models; the attention model may be, but is not limited to, a sigmoid activation function, a convolutional neural network, or a convolutional neural network combined with a softmax function, and its convolution kernel may be chosen according to actual conditions.
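For illustration only, the composition described above can be sketched in code. The following is a minimal PyTorch sketch, not the implementation of this application: the backbone layers, channel counts, and the choice of a 1x1 convolution with a sigmoid for the attention model are assumptions picked from the options listed above.

```python
import torch
import torch.nn as nn

class FaceAttributeRecognitionModel(nn.Module):
    """Minimal sketch of the four-part facial attribute recognition model."""

    def __init__(self, num_attributes: int = 8, channels: int = 64):
        super().__init__()
        # Occlusion feature extraction model: produces one occlusion
        # feature vector per spatial region of the downsampled map.
        self.occlusion_extractor = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Facial attribute feature extraction model (same region grid).
        self.attribute_extractor = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Attention model: 1x1 convolution + sigmoid, so each region's
        # adjustment weight depends only on that region's occlusion features.
        self.attention = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1), nn.Sigmoid(),
        )
        # Facial attribute classification model.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, num_attributes),
        )

    def forward(self, face_image: torch.Tensor) -> torch.Tensor:
        occ = self.occlusion_extractor(face_image)   # occlusion feature values
        attr = self.attribute_extractor(face_image)  # attribute feature values
        weights = self.attention(occ)                # adjustment weights per region
        return self.classifier(weights * attr)       # weighted features -> attributes

# Example: logits = FaceAttributeRecognitionModel()(torch.randn(1, 3, 112, 112))
```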
For each region in the occluded image, the product value of the prediction adjustment weight and the second prediction feature value of the region is calculated, and the product values of all regions are input into the facial attribute classification model of the facial attribute recognition model to obtain the predicted facial attribute of the occluded image.
Likewise, for each region in the unoccluded image, the product value of the prediction adjustment weight and the second prediction feature value of the region is calculated, and the product values of all regions are input into the facial attribute classification model to obtain the predicted facial attribute of the unoccluded image.
Then, following the principle of simultaneously minimizing the distance between the predicted and actual facial attributes of the occluded image and the distance between the predicted and actual facial attributes of the unoccluded image, the model parameters of the occlusion feature extraction model are held fixed while the model parameters of the facial attribute feature extraction model and the attention model are adjusted, yielding the trained facial attribute recognition model.
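The "parameters held fixed" step can likewise be sketched. In the PyTorch sketch below, the optimizer, loss function, checkpoint path, and `train_loader` (an assumed data loader mixing occluded and unoccluded samples) are all illustrative assumptions; only the freezing pattern reflects the step described above.

```python
import torch

model = FaceAttributeRecognitionModel()  # sketch class from above
# Load the separately pre-trained occlusion feature extraction model
# (hypothetical checkpoint path) and freeze it, so that only the
# attribute extractor, attention model, and classifier are updated.
model.occlusion_extractor.load_state_dict(torch.load("occlusion_extractor.pt"))
for p in model.occlusion_extractor.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()  # "distance" between predicted and actual

for images, labels in train_loader:  # occluded and unoccluded sample images
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()   # gradients flow only into the unfrozen sub-models
    optimizer.step()
```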
In addition, if the occlusion region detection task and the facial attribute recognition task were trained simultaneously, the model parameters of the occlusion feature extraction model and of the facial attribute feature extraction model could fit to each other; that is, when training on a facial image, adjusting the parameters of the occlusion feature extraction model would cause the parameters of the facial attribute feature extraction model to be adjusted as well, making the training result unstable. The model parameters of the occlusion feature extraction model are therefore adjusted separately, which also reduces the amount of computation during training of the facial attribute recognition model.
Regarding the order in which the constituent models are trained: if the facial attribute feature extraction model were trained first, the occlusion feature extraction model would then have to be spliced to the already-trained facial attribute feature extraction model and trained in that spliced form, and finally the trained facial attribute feature extraction model and the trained occlusion feature extraction model would be spliced to obtain the facial attribute recognition model. Splicing and jointly training the facial attribute feature extraction model, occlusion feature extraction model, attention model, and facial attribute classification model in this way increases the number of training steps and the amount of computation; the occlusion feature extraction model is therefore trained first.
In the embodiment of the application, the occlusion feature extraction model is trained according to the following steps:
B11, acquiring a second training sample set, wherein the second training sample set comprises occlusion images and the actual occlusion region corresponding to each occlusion image, and an occlusion image is an image in which a reference object is occluded;
B12, inputting the occlusion image into a first model comprising the occlusion feature extraction model and an occlusion region output model to obtain a predicted occlusion region;
B13, adjusting the model parameters of the first model according to the actual occlusion region and the predicted occlusion region to obtain a trained first model;
B14, cutting the trained occlusion feature extraction model out of the trained first model.
Here, the first model includes the occlusion feature extraction model and an occlusion region output model, and performs recognition on occlusion images in which the reference object is occluded. An occlusion image is an image in which facial features of a human face are occluded; the occluded area may be a fixed region, such as a region occluded by a mask or by glasses, or a non-fixed region, such as a region occluded by an irregular object like a book, a mobile phone, or a sheet of paper. The actual occlusion region is the position information of the occluded part within the occlusion image.
In a specific implementation, after the second training sample set is obtained, the region-divided occlusion image is input into the first model: the occlusion feature extraction model in the first model performs feature extraction on the occlusion image to obtain a predicted feature value for each region, and these predicted feature values are input into the occlusion region output model to obtain the predicted occlusion region.
The model parameters of the occlusion feature extraction model and of the occlusion region output model in the first model are then adjusted following the principle of minimizing the distance between the predicted and actual occlusion regions, yielding the trained first model.
After the trained first model is obtained, the trained occlusion feature extraction model is cut out of it; that is, the model structure information and model parameter information of the occlusion feature extraction model are extracted from the model structure file of the trained first model. The model structure information comprises the structure of the model, the execution order, and the input and output formats between layers; the model parameter information comprises the model parameters of each layer.
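One way to realize this cutting-out step is sketched below under the same illustrative assumptions as above; the occlusion region output head, the MSE loss, and `occlusion_loader` (an assumed loader yielding occlusion images and their actual occlusion regions) are assumptions, since the patent only requires minimizing the distance between predicted and actual occlusion regions. The facial attribute feature extraction model of steps B21-B24 below is obtained from its second model in the same way.

```python
import torch
import torch.nn as nn

class FirstModel(nn.Module):
    """Occlusion feature extraction model plus occlusion region output model."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.occlusion_extractor = nn.Sequential(   # the part to be kept
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Occlusion region output model: one occlusion score per region.
        self.region_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.region_head(self.occlusion_extractor(x))

first_model = FirstModel()
criterion = nn.MSELoss()  # distance between predicted and actual regions
optimizer = torch.optim.SGD(first_model.parameters(), lr=0.01)

for images, occlusion_masks in occlusion_loader:  # second training sample set
    optimizer.zero_grad()
    loss = criterion(first_model(images), occlusion_masks)
    loss.backward()
    optimizer.step()

# "Cut out" the trained extractor: keep only its structure and parameters.
# This produces the checkpoint assumed in the earlier training sketch.
torch.save(first_model.occlusion_extractor.state_dict(), "occlusion_extractor.pt")
```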
In the embodiment of the present application, the facial attribute feature extraction model is trained according to the following steps:
B21, acquiring a third training sample set, wherein the third training sample set comprises sample images and the actual facial attributes corresponding to the sample images, and the sample images comprise images in which the reference object is not occluded;
B22, inputting the sample image into a second model comprising the facial attribute feature extraction model to obtain a predicted facial attribute;
B23, adjusting the model parameters of the second model according to the actual facial attributes and the predicted facial attributes to obtain a trained second model;
B24, cutting the trained facial attribute feature extraction model out of the trained second model.
Here, the second model performs recognition on images in which the reference object is not occluded.
Through the above training processes, the occlusion feature extraction model and the facial attribute feature extraction model are obtained, and with them a facial attribute recognition model comprising the occlusion feature extraction model, the facial attribute feature extraction model, the attention model, and the facial attribute classification model. This model can be used in the following facial attribute recognition method. Specifically, an embodiment of the present application provides a facial attribute recognition method which, as shown in Fig. 1, comprises the following steps:
s101, a face image obtained by shooting a target object is acquired.
Here, the facial image is a facial image of the target object. The image acquisition device that captures it may be, for example, a camera in an access control system or in a security system; the access control system may be the system of a residential community or a check-in system used by an enterprise, and the security system may be one installed at a concert. The target object may be a human body.
Since the image capturing apparatus captures an image containing the target object, the face of the target object in that image is not necessarily a standard facial image (for example, one in which the face directly faces the capturing apparatus), so facial keypoint information needs to be detected from the image. The facial keypoint information includes the position of each facial keypoint in the image; the facial keypoints may include the eyes, the nose, and the mouth. Using this keypoint information, the face contained in the image is aligned to a standard face, and the resulting standard facial image is taken as the facial image.
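As one possible realization of this alignment step (illustrative only: the patent names no library, and the OpenCV similarity-transform approach and the five-point template coordinates below are assumptions):

```python
import cv2
import numpy as np

# Illustrative 5-point template (eyes, nose tip, mouth corners) of a
# 112x112 standard face; the exact coordinates are an assumption.
TEMPLATE = np.float32([
    [38.3, 51.7], [73.5, 51.5],   # left eye, right eye
    [56.0, 71.7],                 # nose tip
    [41.5, 92.4], [70.7, 92.2],   # left and right mouth corners
])

def align_face(image: np.ndarray, keypoints: np.ndarray) -> np.ndarray:
    """Warp the detected face so its keypoints match the standard template."""
    matrix, _ = cv2.estimateAffinePartial2D(np.float32(keypoints), TEMPLATE)
    return cv2.warpAffine(image, matrix, (112, 112))
```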
S102, dividing the facial image into regions and extracting an occlusion feature value and an attribute feature value for each region in the facial image, wherein the occlusion feature value characterizes the degree to which the facial features of the target object in each region are occluded, and the attribute feature value characterizes the facial attribute information included in each region.
When the facial image is divided into regions, the extents of the different regions may be the same or different, and may be determined according to actual conditions.
The occlusion condition of a facial feature can be represented by an occlusion extent or an occlusion probability: the larger the occlusion feature value, the higher the probability that the facial features of the target object in that region are occluded; the smaller the value, the lower that probability. In one embodiment, the occlusion feature value may be represented by 0 or 1, where 0 indicates that no facial feature of the target object in the region is occluded and 1 indicates that a facial feature of the target object in the region is occluded.
the attribute feature value may also characterize the expression of the face attribute information included in each region in a high-dimensional space.
In step S102, the facial image may be input to an occlusion feature extraction model in a pre-trained facial attribute recognition model, so as to obtain occlusion feature values of each region in the facial image; and inputting the facial image into a facial attribute feature extraction model in the facial attribute recognition model to obtain attribute feature values of each region in the facial image.
In one embodiment, as shown in FIG. 2, the facial property identification model includes an occlusion feature extraction model, a facial property feature extraction model, an attention model, and a facial property classification model.
S103, for each area in the face image, determining an adjustment weight of the attribute characteristic value of the area based on the occlusion characteristic value of the area, wherein the adjustment weight of each area is inversely proportional to the occlusion characteristic value of the area.
Here, the larger the occlusion feature value, the smaller the adjustment weight; the smaller the occlusion feature value, the larger the adjustment weight.
When executing S103, the occlusion feature value of each region may be input into the attention model of the facial attribute recognition model to obtain the adjustment weight for the attribute feature value of the corresponding region.
In a specific implementation, a sigmoid function alone could be used to compute the adjustment weight of each region in the facial image, but a sigmoid function is not learnable: it is a fixed formula whose output is fully determined by its input, and the resulting adjustment weights perform poorly in practical applications. Adjustment weights are therefore generated for each region with an attention model (such as a convolutional neural network).
When the attention model is a convolutional neural network, its parameters are iteratively updated according to the task being performed, which increases the expressive capacity of the network. The convolution kernel size may be 1, 3, 5, and so on. When the kernel size is larger than 1, several occlusion feature values are convolved together to produce the weight for one attribute feature value, so the regions corresponding to those occlusion feature values do not coincide with the region corresponding to the attribute feature value, and the accuracy of the resulting weight is poor. In practice, a convolutional neural network with a kernel size of 1 is therefore generally selected; then, when an occlusion feature value is used to determine the weight for an attribute feature value, the region corresponding to the occlusion feature value and the region corresponding to the attribute feature value are the same region of the facial image, which ensures the accuracy of the obtained weight.
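The kernel-size argument can be checked directly: with a 1x1 kernel, perturbing one region's occlusion features changes only that region's weight, while a 3x3 kernel also changes the weights of neighbouring regions. A small PyTorch sketch (the 7x7 region grid and channel count are illustrative):

```python
import torch
import torch.nn as nn

occ = torch.randn(1, 64, 7, 7)      # occlusion features over a 7x7 region grid
perturbed = occ.clone()
perturbed[0, :, 3, 3] += 10.0       # alter a single region's occlusion features

for k in (1, 3):
    attention = nn.Sequential(
        nn.Conv2d(64, 64, kernel_size=k, padding=k // 2), nn.Sigmoid())
    with torch.no_grad():
        changed = (attention(occ) != attention(perturbed)).any(dim=1)[0]
    # k=1: only region (3, 3) is affected; k=3: its neighbours change too.
    print(f"kernel={k}: {int(changed.sum())} region weight(s) affected")
```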
And S104, determining the facial attribute of the target object based on the adjustment weight of each region and the corresponding attribute characteristic value.
Attributes of the facial organs, such as nose size, eye size, and mouth size, are also facial attributes; however, when a facial organ (such as the eyes, nose, or mouth) is occluded, its attributes cannot be recognized. The facial attributes in the present application are therefore attributes that can be recognized even when facial organs are occluded, such as the gender, age, and expression of the target object.
When executing S104, for each region, the product value of the region's adjustment weight and its corresponding attribute feature value may be calculated, and the product values of all regions are then input into the facial attribute classification model of the facial attribute recognition model to obtain the facial attribute of the target object.
In a specific implementation, the adjustment weights of the regions of the facial image may form a weight matrix and the attribute feature values a feature value matrix; the product of the weight matrix and the feature value matrix is computed and input into the facial attribute classification model to obtain the facial attribute of the target object.
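A toy numeric example of this weighting (all values illustrative): heavily occluded regions receive small adjustment weights, so their attribute feature values contribute little to the input of the classification model.

```python
import numpy as np

# Adjustment weights for a 2x2 region grid: the two bottom regions are
# heavily occluded, so the attention model assigns them small weights.
weights = np.array([[0.9, 0.8],
                    [0.1, 0.2]])
# Attribute feature values for the same grid (one scalar per region for
# readability; in practice each region carries a feature vector).
features = np.array([[1.5, 2.0],
                     [1.8, 1.2]])

adjusted = weights * features  # element-wise product per region
print(adjusted)                # [[1.35 1.6 ], [0.18 0.24]]
# `adjusted` is what the facial attribute classification model consumes.
```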
In one embodiment, referring to Fig. 2, after a facial picture is obtained it is input separately into the occlusion feature extraction model and the facial attribute feature extraction model, yielding a facial occlusion feature map and a facial attribute feature map, respectively. The facial occlusion feature map is input into the attention model to obtain a weight matrix for adjusting the facial attribute feature map; the product of the weight matrix and the facial attribute feature map is computed, and the result is input into the facial attribute classification model to obtain the facial attribute of the object contained in the picture.
When the facial attributes of the target object are recognized, the facial image is divided into a plurality of regions and the attribute feature value of each region is adjusted by that region's adjustment weight: when a region is likely to be occluded, its attribute feature values are scaled down, and when it is unlikely to be occluded, they are scaled up, reducing the response of occluded regions to facial attribute recognition. The facial attribute recognition model thus pays more attention to the unoccluded regions and increases their response during recognition. In a facial image recognized by this method, the occluded area of the target object may be fixed or random.
Referring to fig. 3, a schematic diagram of a facial attribute recognition apparatus according to an embodiment of the present application is shown, where the apparatus includes:
an acquisition module 31 configured to acquire a face image captured for a target object;
an extracting module 32, configured to extract, for each region in the facial image, an occlusion feature value and an attribute feature value of the region, where the occlusion feature value characterizes the degree to which the facial features of the target object in each region are occluded, and the attribute feature value characterizes the facial attribute information included in each region;
a first determining module 33, configured to determine, for each region in the face image, an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region, where the adjustment weight for each region is inversely proportional to the occlusion feature value of the region;
a second determining module 34, configured to determine a facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region.
In one embodiment, the extraction module 32 is configured to extract, for each region in the face image, an occlusion feature value and an attribute feature value of the region according to the following steps:
inputting the facial image into an occlusion feature extraction model in a pre-trained facial attribute recognition model to obtain an occlusion feature value of each region in the facial image;
and inputting the facial image into a facial attribute feature extraction model in the facial attribute recognition model to obtain attribute feature values of each region in the facial image.
In one embodiment, the apparatus further comprises: a training module 35, the training module 35 being configured to train the occlusion feature extraction model according to the following steps:
acquiring a first training sample set, wherein the first training sample set comprises an occlusion image and a corresponding actual occlusion area; the occlusion image is an image in which the reference object is occluded;
inputting the occlusion image into a first model comprising an occlusion feature extraction model to obtain a predicted occlusion region;
adjusting model parameters of the first model according to the predicted occlusion area and the actual occlusion area to obtain a trained first model;
and cutting out the trained occlusion feature extraction model from the trained first model.
In one embodiment, the training module 35 is further configured to train the facial attribute recognition model according to the following steps:
acquiring a second training sample set, wherein the second training sample set comprises sample images and corresponding actual facial attributes, and the sample images comprise images in which a reference object is occluded and images in which the reference object is not occluded;
inputting a sample image into an occlusion feature extraction model in the facial attribute identification model to obtain a first prediction feature value of each region in the sample image;
inputting the first prediction characteristic value of each region in the sample image into an attention model in the facial attribute recognition model to obtain a prediction adjustment weight of each region in the sample image;
inputting the sample image into a facial attribute feature extraction model of the facial attribute identification model to obtain a second prediction feature value of each region in the sample image;
determining a predicted facial attribute of the reference object included in the sample image based on the prediction adjustment weight and the corresponding second prediction feature value of each region in the sample image;
and, based on the predicted facial attributes and the actual facial attributes, holding the model parameters of the occlusion feature extraction model unchanged while adjusting the model parameters of the facial attribute feature extraction model and the attention model in the facial attribute recognition model, to obtain the trained facial attribute recognition model.
In one embodiment, the second determination module 34 is configured to determine the facial attributes of the target object according to the following steps:
calculating the product value of the adjustment weight of each region and the corresponding attribute characteristic value for each region;
and inputting the product value corresponding to each region into a face attribute classification model in a face attribute identification model to obtain the face attribute of the target object.
In one embodiment, the first determining module 33 is configured to determine the adjustment weight for the attribute feature value of the region according to the following steps:
and inputting the occlusion feature value of each region into the attention model of the facial attribute recognition model to obtain the adjustment weight for the attribute feature value of the corresponding region.
In some embodiments, the apparatus shown in fig. 3 may be, or may be part of, a computer device.
An embodiment of the present application further provides an electronic device 40, as shown in fig. 4, which is a schematic structural diagram of the electronic device 40 provided in the embodiment of the present application, and includes:
a processor 41, a memory 42, and a bus 43. The memory 42 stores machine-readable instructions executable by the processor 41 and comprises an internal memory 421 and an external storage 422. The internal memory 421 temporarily stores operation data of the processor 41 and data exchanged with external storage 422 such as a hard disk; the processor 41 exchanges data with the external storage 422 through the internal memory 421. When the electronic device 40 operates, the processor 41 communicates with the memory 42 through the bus 43, so that the processor 41 executes the following instructions in user mode:
acquiring a face image obtained by shooting a target object;
extracting an occlusion feature value and an attribute feature value of each region in the facial image, wherein the occlusion feature value characterizes the degree to which the facial features of the target object in each region are occluded, and the attribute feature value characterizes the facial attribute information included in each region;
for each region in the facial image, determining an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region, wherein the adjustment weight for each region is inversely proportional to the occlusion feature value of the region;
determining a facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region.
In a possible embodiment, the instructions executed by processor 41 for extracting, for each region in the face image, an occlusion feature value and an attribute feature value of the region include:
inputting the facial image into an occlusion feature extraction model in a pre-trained facial attribute recognition model to obtain an occlusion feature value of each region in the facial image;
and inputting the facial image into a facial attribute feature extraction model in the facial attribute recognition model to obtain attribute feature values of each region in the facial image.
In one possible embodiment, the processor 41 executes instructions for determining the facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region, including:
calculating the product value of the adjustment weight of each region and the corresponding attribute characteristic value for each region;
and inputting the product value corresponding to each region into a face attribute classification model in a face attribute identification model to obtain the face attribute of the target object.
In a possible embodiment, the instructions executed by the processor 41 for determining, for each region in the face image, an adjustment weight for the attribute feature value of the region based on the occlusion feature value of the region include:
and inputting the occlusion feature value of each region into the attention model of the facial attribute recognition model to obtain the adjustment weight for the attribute feature value of the corresponding region.
In one possible implementation, processor 41 executes instructions that train the facial attribute recognition model according to the following steps:
acquiring a first training sample set, wherein the first training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises images in which a reference object is occluded and images in which the reference object is not occluded;
inputting a sample image into an occlusion feature extraction model in a face attribute identification model to obtain a first prediction feature value of each divided region in the sample image;
inputting the first prediction characteristic value of each region in the sample image into an attention model in the facial attribute recognition model to obtain a prediction adjustment weight of each region in the sample image;
inputting the sample image into a facial attribute feature extraction model of the facial attribute identification model to obtain a second prediction feature value of each region in the sample image;
determining a predicted facial attribute of the reference object included in the sample image based on the prediction adjustment weight and the corresponding second prediction feature value of each region in the sample image;
and, based on the predicted facial attributes and the actual facial attributes, holding the model parameters of the occlusion feature extraction model unchanged while adjusting the model parameters of the facial attribute feature extraction model and the attention model in the facial attribute recognition model, to obtain the trained facial attribute recognition model.
In one possible embodiment, processor 41 executes instructions that train the occlusion feature extraction model according to the following steps:
acquiring a second training sample set, wherein the second training sample set comprises an occlusion image and an actual occlusion area corresponding to the occlusion image, and the occlusion image is an image in which a reference object is occluded;
inputting the occlusion image into a first model comprising an occlusion feature extraction model to obtain a predicted occlusion region;
adjusting model parameters of the first model according to the actual occlusion area and the predicted occlusion area to obtain a trained first model;
and determining the occlusion feature extraction model from the trained first model.
In a possible embodiment, the processor 41 executes instructions for adjusting the model parameters of the first model according to a principle that a distance between the predicted occlusion region and the actual occlusion region is minimal.
In one possible embodiment, processor 41 executes instructions that train the facial attribute feature extraction model according to the following steps:
acquiring a third training sample set, wherein the third training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises images in which the reference object is not occluded;
inputting the sample image into a second model comprising a facial attribute feature extraction model to obtain a predicted facial attribute;
adjusting model parameters of the second model according to the actual facial attributes and the predicted facial attributes to obtain a trained second model;
and determining the facial attribute feature extraction model from the trained second model.
As those skilled in the art know, the specific implementation and nomenclature of the bus may change as computer hardware evolves; the bus referred to herein conceptually encompasses any information transfer line capable of serving the components within an electronic device, including but not limited to FSB, HT, QPI, Infinity Fabric, etc.
In the embodiment of the present application, the processor may be a general-purpose processor including a Central Processing Unit (CPU), and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a neural Network Processor (NPU), a Tensor Processor (TPU), or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
In some embodiments, processor 41 may include an apparatus as shown in FIG. 3.
It will be appreciated that the apparatus of fig. 3 and the electronic device of fig. 4 may be used to perform the methods described above in connection with fig. 1 and 2.
An embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and the computer program is executed by a processor to perform the steps of the above-mentioned facial attribute recognition method.
Specifically, the storage medium may be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the above facial attribute recognition method can be performed, improving the accuracy of the recognition result when the face is partially occluded.
The present application provides a facial attribute recognition method, an electronic device, and a computer-readable storage medium in which an adjustment weight for the attribute feature value of each region is generated from the extracted occlusion feature information of that region of the facial image: the more severe the occlusion, the smaller the adjustment weight, and the lighter the occlusion, the larger the adjustment weight. Adjusting the facial attribute information in the facial image with these weights reduces the response of occlusion feature information during facial attribute recognition, directs more attention to the regions of the facial image that contain facial attribute information (the unoccluded regions), and improves the accuracy of the recognition result.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing an electronic device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A facial attribute recognition method, the method comprising:
acquiring a face image obtained by shooting a target object;
extracting an occlusion feature value and an attribute feature value of each divided region in the facial image, wherein the occlusion feature value characterizes the degree to which the facial features of the target object in each region are occluded, and the attribute feature value characterizes the facial attribute information included in each region;
for each region in the facial image, determining an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region, wherein the adjustment weight for each region is inversely proportional to the occlusion feature value of the region;
determining a facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region.
2. The method according to claim 1, wherein extracting, for each region divided in the facial image, the occlusion feature value and the attribute feature value of the region comprises:
inputting the facial image into an occlusion feature extraction model in a pre-trained facial attribute recognition model to obtain an occlusion feature value of each region in the facial image;
and inputting the facial image into a facial attribute feature extraction model in the facial attribute recognition model to obtain attribute feature values of each region in the facial image.
3. The method of claim 1, wherein determining the facial attribute of the target object based on the adjustment weight and the corresponding attribute feature value of each region comprises:
calculating, for each region, the product of the adjustment weight of the region and the corresponding attribute feature value;
and inputting the product value corresponding to each region into a facial attribute classification model in the facial attribute recognition model to obtain the facial attribute of the target object.
4. The method of any one of claims 1-3, wherein determining, for each region in the facial image, an adjustment weight for an attribute feature value of the region based on an occlusion feature value of the region comprises:
and inputting the occlusion feature value of each region into an attention model of the facial attribute recognition model to obtain the adjustment weight for the attribute feature value of the corresponding region.
5. The method of claim 2, wherein the facial attribute recognition model is trained according to the following steps:
acquiring a first training sample set, wherein the first training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises an image in which a reference object is occluded and an image in which the reference object is not occluded;
inputting the sample image into an occlusion feature extraction model in the facial attribute recognition model to obtain a first predicted feature value of each divided region in the sample image;
inputting the first predicted feature value of each region in the sample image into an attention model in the facial attribute recognition model to obtain a predicted adjustment weight of each region in the sample image;
inputting the sample image into a facial attribute feature extraction model of the facial attribute recognition model to obtain a second predicted feature value of each region in the sample image;
determining a predicted facial attribute of the reference object included in the sample image based on the predicted adjustment weight and the corresponding second predicted feature value of each region in the sample image;
and keeping the model parameters of the occlusion feature extraction model unchanged while adjusting, based on the predicted facial attribute and the actual facial attribute, the model parameters of the facial attribute feature extraction model and the attention model in the facial attribute recognition model, to obtain the trained facial attribute recognition model.
6. The method of claim 2, wherein the occlusion feature extraction model is trained according to the following steps:
acquiring a second training sample set, wherein the second training sample set comprises an occlusion image and an actual occlusion region corresponding to the occlusion image, and the occlusion image is an image in which a reference object is occluded;
inputting the occlusion image into a first model comprising the occlusion feature extraction model to obtain a predicted occlusion region;
adjusting the model parameters of the first model according to the actual occlusion region and the predicted occlusion region to obtain a trained first model;
and determining the occlusion feature extraction model from the trained first model.
7. The method of claim 6, wherein the model parameters of the first model are adjusted so as to minimize the distance between the predicted occlusion region and the actual occlusion region.
8. The method of claim 2, wherein the facial attribute feature extraction model is trained according to the following steps:
acquiring a third training sample set, wherein the third training sample set comprises a sample image and an actual facial attribute corresponding to the sample image, and the sample image comprises an image in which the reference object is not occluded;
inputting the sample image into a second model comprising a facial attribute feature extraction model to obtain a predicted facial attribute;
adjusting model parameters of the second model according to the actual facial attributes and the predicted facial attributes to obtain a trained second model;
and determining the facial attribute feature extraction model from the trained second model.
9. An electronic device, comprising: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the processor executing the machine-readable instructions to perform the steps of the method of any one of claims 1 to 8 when the electronic device is running.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
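As a non-authoritative illustration of the staged training described in claims 5 to 8 above, the sketch below reuses the hypothetical OcclusionAwareAttributeNet from the earlier example. The optimizer, learning rates and loss functions (an L2/MSE distance for the occlusion region, consistent with the distance minimization of claim 7, and a multi-label binary cross-entropy for the attributes) are assumptions rather than details disclosed in the claims.

import torch
import torch.nn as nn

def pretrain_occlusion_extractor(occlusion_branch, loader, epochs=10):
    # Claims 6-7: fit the occlusion feature extractor on occlusion images,
    # minimizing a distance between predicted and actual occlusion regions.
    # Target masks are assumed pre-resized to the model's region grid.
    opt = torch.optim.Adam(occlusion_branch.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()  # one assumed choice of distance
    for _ in range(epochs):
        for image, occlusion_mask in loader:
            opt.zero_grad()
            loss_fn(occlusion_branch(image), occlusion_mask).backward()
            opt.step()

def train_attribute_recognition(net, loader, epochs=10):
    # Claim 5: keep the occlusion feature extractor's parameters unchanged
    # while adjusting the attention model and the attribute feature
    # extractor (the attribute branch may itself be pretrained first on
    # unoccluded images, per claim 8, with an analogous loop).
    for p in net.occlusion_branch.parameters():
        p.requires_grad = False
    trainable = [p for p in net.parameters() if p.requires_grad]
    opt = torch.optim.Adam(trainable, lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()  # multi-label attribute targets assumed
    for _ in range(epochs):
        for image, attributes in loader:
            opt.zero_grad()
            loss_fn(net(image), attributes).backward()
            opt.step()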
CN202011112251.0A 2020-10-16 2020-10-16 Face attribute recognition method, electronic device, and computer-readable storage medium Pending CN112200109A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011112251.0A CN112200109A (en) 2020-10-16 2020-10-16 Face attribute recognition method, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011112251.0A CN112200109A (en) 2020-10-16 2020-10-16 Face attribute recognition method, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN112200109A (en) 2021-01-08

Family

ID=74009269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011112251.0A Pending CN112200109A (en) 2020-10-16 2020-10-16 Face attribute recognition method, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN112200109A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292287A (en) * 2017-07-14 2017-10-24 深圳云天励飞技术有限公司 Face identification method, device, electronic equipment and storage medium
EP3428843A1 (en) * 2017-07-14 2019-01-16 GB Group plc Improvements relating to face recognition
CN110032912A (en) * 2018-01-11 2019-07-19 富士通株式会社 Face verification method and apparatus and computer storage medium
CN110688874A (en) * 2018-07-04 2020-01-14 杭州海康威视数字技术股份有限公司 Facial expression recognition method and device, readable storage medium and electronic equipment
CN111191569A (en) * 2019-12-26 2020-05-22 深圳市优必选科技股份有限公司 Face attribute recognition method and related device thereof
CN111666826A (en) * 2020-05-15 2020-09-15 北京百度网讯科技有限公司 Method, apparatus, electronic device and computer-readable storage medium for processing image

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469216A (en) * 2021-05-31 2021-10-01 浙江中烟工业有限责任公司 Retail terminal poster identification and integrity judgment method, system and storage medium
CN113469216B (en) * 2021-05-31 2024-02-23 浙江中烟工业有限责任公司 Retail terminal poster identification and integrity judgment method, system and storage medium
CN114549921A (en) * 2021-12-30 2022-05-27 浙江大华技术股份有限公司 Object recognition method, electronic device, and computer-readable storage medium

Similar Documents

Publication Title
US10635890B2 (en) Facial recognition method and apparatus, electronic device, and storage medium
US20200160040A1 (en) Three-dimensional living-body face detection method, face authentication recognition method, and apparatuses
US20180285630A1 (en) Face verifying method and apparatus
CN108932456B (en) Face recognition method, device and system and storage medium
EP4099217A1 (en) Image processing model training method and apparatus, device, and storage medium
CN110826519A (en) Face occlusion detection method and device, computer equipment and storage medium
CN111476306A (en) Object detection method, device, equipment and storage medium based on artificial intelligence
US10832032B2 (en) Facial recognition method, facial recognition system, and non-transitory recording medium
CN107316029B (en) A kind of living body verification method and equipment
CN111597884A (en) Facial action unit identification method and device, electronic equipment and storage medium
CN111310705A (en) Image recognition method and device, computer equipment and storage medium
CN111914748B (en) Face recognition method, device, electronic equipment and computer readable storage medium
CN113642639B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
CN112200109A (en) Face attribute recognition method, electronic device, and computer-readable storage medium
CN115620384B (en) Model training method, fundus image prediction method and fundus image prediction device
CN112149601A (en) Occlusion-compatible face attribute identification method and device and electronic equipment
CN112634246A (en) Oral cavity image identification method and related equipment
CN113536965B (en) Method and related device for training face shielding recognition model
CN112861743A (en) Palm vein image anti-counterfeiting method, device and equipment
JP6911995B2 (en) Feature extraction methods, matching systems, and programs
CN115223022B (en) Image processing method, device, storage medium and equipment
CN115147705B (en) Face copying detection method and device, electronic equipment and storage medium
CN110633647A (en) Living body detection method and device
US20220139113A1 (en) Method and device for detecting object in image
CN113705366A (en) Personnel management system identity identification method and device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination