CN117611516A - Image quality evaluation, face recognition, label generation and determination methods and devices - Google Patents

Image quality evaluation, face recognition, label generation and determination methods and devices

Info

Publication number: CN117611516A
Application number: CN202311133425.5A
Authority: CN (China)
Prior art keywords: face, face image, evaluation data, image, evaluated
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 霍磊, 聂玉虎, 崔文朋, 郑哲, 龚向锋, 刘彬, 张永波, 李明月
Current assignee: Beijing Smartchip Microelectronics Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Beijing Smartchip Microelectronics Technology Co Ltd
Application filed by: Beijing Smartchip Microelectronics Technology Co Ltd
Priority application: CN202311133425.5A
Publication: CN117611516A (pending)

Classifications

    • G06T 7/0012 — Physics; Computing; Image data processing or generation, in general; Image analysis; Inspection of images, e.g. flaw detection; Biomedical image inspection
    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning; Classification, e.g. of video objects
    • G06V 10/774 — Processing image or video features in feature spaces; Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 40/168 — Recognition of human faces, e.g. facial parts, sketches or expressions; Feature extraction; Face representation
    • G06V 40/172 — Recognition of human faces; Classification, e.g. identification
    • G06T 2207/30168 — Indexing scheme for image analysis or image enhancement; Subject of image: Image quality inspection
    • G06T 2207/30201 — Indexing scheme for image analysis or image enhancement; Subject of image: Human being, face

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses image quality evaluation, face recognition, and label generation and determination methods and devices. The image quality evaluation method comprises the following steps: acquiring a face image set obtained by shooting the same target object, the face image set comprising a plurality of face images to be evaluated; determining visual feature evaluation data and face feature evaluation data of each face image to be evaluated; and performing face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data to obtain face image evaluation data of the face image to be evaluated. By fusing the visual feature evaluation data and the face feature evaluation data, the quality of the face image to be evaluated is described accurately and comprehensively, effectively improving the quality evaluation of face images.

Description

Image quality evaluation, face recognition, label generation and determination methods and devices
Technical Field
The present invention relates to the field of image processing technologies, and in particular to image quality evaluation, face recognition, and label generation and determination methods and apparatuses.
Background
Face recognition is a biometric technology that performs identity recognition based on facial feature information. In unconstrained scenes, most face images acquired by image acquisition equipment are of low quality, resulting in poor face recognition performance. Evaluating the quality of the face images to be recognized therefore plays an important role in improving the efficiency and accuracy of face recognition.
In the related art, unsupervised image quality evaluation methods are used to automatically generate quality scores for face images. However, these methods do not consider the quality of the face image itself, and their quality evaluation effect leaves room for improvement.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the related art to some extent. Therefore, a first object of the present invention is to provide an image quality evaluation method, which can effectively improve the accuracy of the evaluation of the quality of the face image itself.
A second object of the present invention is to propose a tag generation method.
A third object of the present invention is to propose a face quality assessment model training method.
A fourth object of the present invention is to propose another image quality assessment method.
A fifth object of the present invention is to provide a face recognition method.
A sixth object of the present invention is to propose a tag determination method.
A seventh object of the present invention is to propose an image quality evaluation device.
An eighth object of the present invention is to provide a tag generating apparatus.
A ninth object of the present invention is to provide a training device for a face quality assessment model.
A tenth object of the present invention is to propose another image quality assessment device.
An eleventh object of the present invention is to provide a face recognition apparatus.
A twelfth object of the present invention is to provide a tag determination device.
A thirteenth object of the invention is to propose a computer device.
A fourteenth object of the present invention is to provide a chip.
A fifteenth object of the present invention is to propose a computer readable storage medium.
To achieve the above object, an embodiment of a first aspect of the present invention provides an image quality evaluation method, including: acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be evaluated; determining visual feature evaluation data and face feature evaluation data of the face image to be evaluated; wherein the visual feature evaluation data is determined based on low-level features extracted from the face image to be evaluated, the low-level features being used to describe visual layer semantics of the face image to be evaluated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be evaluated; and carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain face image evaluation data of the face image to be evaluated.
According to one embodiment of the present invention, the determining manner of the visual characteristic evaluation data includes: obtaining the representative low-level features of the target object according to the low-level features of the plurality of face images to be evaluated; and performing similarity calculation based on the low-level features of the face image to be evaluated and the representative low-level features to obtain the visual feature evaluation data of the face image to be evaluated.
According to an embodiment of the present invention, the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used for describing concept layer semantics of the face image to be evaluated; the determining manner of the low-level features and the high-level features comprises the following steps: and inputting the face image to be evaluated into a trained face recognition model to perform feature extraction, so as to obtain the low-level features and the high-level features.
According to an embodiment of the present invention, the face feature evaluation data includes first feature evaluation data; the high-level features comprise a plurality of first high-level features extracted by the face recognition model under different random dropout configurations; the manner of determining the face feature evaluation data comprises the following steps: performing similarity calculation on the plurality of first high-level features to obtain the first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness of the high-level features of the face image to be evaluated.
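As an illustration of this robustness measure, the following sketch (an assumption for illustration, not the patent's reference implementation) extracts several embeddings of the same image with dropout kept active and scores the image by the mean pairwise cosine similarity of those embeddings; the model interface is hypothetical.

```python
import torch
import torch.nn.functional as F

def robustness_score(model: torch.nn.Module, image: torch.Tensor, n_passes: int = 8) -> float:
    """First feature evaluation data: mean pairwise cosine similarity of
    high-level features extracted under different random dropout masks.
    Assumes `model` contains dropout layers and maps a batch of images
    to embedding vectors (hypothetical interface)."""
    model.train()  # keep dropout active so each forward pass uses a different mask
    with torch.no_grad():
        feats = torch.stack([model(image.unsqueeze(0)).squeeze(0) for _ in range(n_passes)])
    feats = F.normalize(feats, dim=1)               # (n_passes, d), unit length
    sims = feats @ feats.T                          # pairwise cosine similarities
    off_diag = sims[~torch.eye(n_passes, dtype=torch.bool)]
    return off_diag.mean().item()                   # higher => more robust features
```

A higher score indicates that the high-level features change little across dropout masks, i.e. the image yields robust features.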
According to one embodiment of the present invention, the face image to be evaluated is a specified face image among the plurality of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data comprises second feature evaluation data; the manner of determining the face feature evaluation data comprises the following steps: performing similarity calculation between the second high-level features of the specified face image and the second high-level features of other face images to obtain similarity data between the specified face image and the other face images, where the other face images are the face images to be evaluated other than the specified face image among the plurality of face images to be evaluated; and obtaining the second feature evaluation data of the specified face image according to the similarity data; the second feature evaluation data is used for describing the recognizability of the specified face image.
According to an embodiment of the present invention, the face feature evaluation data includes first feature evaluation data and second feature evaluation data; the step of carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain the face image evaluation data of the face image to be evaluated, comprises the following steps: and carrying out face image quality evaluation according to the visual feature evaluation data, the first feature evaluation data and the second feature evaluation data to obtain the face image evaluation data.
To achieve the above object, an embodiment of a second aspect of the present invention provides a tag generation method, including: acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be marked; determining visual feature evaluation data and face feature evaluation data of the face image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be marked; performing face image quality evaluation based on the visual feature evaluation data and the face feature evaluation data to obtain face image evaluation data of the face image to be marked; and taking the face image evaluation data as a label of the face image to be marked.
In order to achieve the above object, an embodiment of a third aspect of the present invention provides a face quality assessment model training method, which includes: constructing a first training sample set; the first training sample set comprises a plurality of first face image samples, and the first face image samples are provided with labels obtained based on the label generating method in claim 7; and training the initial evaluation model by using the first training sample set to obtain the face quality evaluation model.
According to one embodiment of the invention, the initial evaluation model is obtained by migration on the basis of a trained face recognition student model; the face recognition student model corresponds to a face recognition teacher model; the face quality evaluation model training method further comprises the following steps: constructing a second training sample set, wherein the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels; inputting the second face image sample into the face recognition teacher model for face recognition to obtain a first face recognition feature and a first prediction category; inputting the second face image sample into the face recognition student model for face recognition to obtain a second face recognition feature and a second prediction category; determining loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category label; and updating parameters of the face recognition student model based on the loss data until a model training stop condition is met.
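A minimal sketch of one plausible form of the student loss described above. The patent only names its five inputs; the specific terms below (feature MSE, temperature-scaled KL distillation, cross-entropy) and the weighting coefficients are assumptions.

```python
import torch.nn.functional as F

def student_loss(t_feat, t_logits, s_feat, s_logits, labels,
                 alpha=1.0, beta=1.0, gamma=1.0, temperature=4.0):
    """Hypothetical distillation loss over the five quantities named in the
    text: teacher feature/prediction, student feature/prediction, and the
    ground-truth object category label."""
    feat_loss = F.mse_loss(s_feat, t_feat.detach())            # match teacher features
    soft_t = F.softmax(t_logits.detach() / temperature, dim=1)
    log_soft_s = F.log_softmax(s_logits / temperature, dim=1)
    distill_loss = F.kl_div(log_soft_s, soft_t, reduction="batchmean") * temperature ** 2
    ce_loss = F.cross_entropy(s_logits, labels)                # supervised term
    return alpha * feat_loss + beta * distill_loss + gamma * ce_loss
```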
To achieve the above object, a fourth aspect of the present invention provides another image quality evaluation method, including: acquiring a face image to be evaluated; inputting the face image to be evaluated into a face quality evaluation model obtained through training by the face quality evaluation model training method according to any one of the previous embodiments, so as to obtain quality evaluation data of the face image to be evaluated.
To achieve the above object, according to a fifth aspect of the present invention, there is provided a face recognition method, the method including: acquiring a face image set to be recognized, which is obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified; inputting the face image to be recognized into a face quality evaluation model obtained through training by the face quality evaluation model training method according to any one of the previous embodiments, so as to obtain quality evaluation data of the face image to be recognized; determining a target face image with quality meeting requirements in the face image set to be identified according to the quality evaluation data of the face image to be identified; and carrying out face recognition on the target object based on the target face image to obtain a face recognition result.
To achieve the above object, an embodiment of a sixth aspect of the present invention provides a tag determination method, including: acquiring an object image set obtained by shooting a specified object; the object image set comprises a plurality of object images to be annotated; determining visual characteristic evaluation data and object characteristic evaluation data of the object image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the object image to be annotated, wherein the low-level characteristics are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated; and performing image quality evaluation according to the visual characteristic evaluation data and the object characteristic evaluation data of the object image to be marked to obtain evaluation data of the object image to be marked, wherein the evaluation data is used as a label of the object image to be marked.
To achieve the above object, an embodiment of a seventh aspect of the present invention provides an image quality evaluation apparatus, comprising: the first face image set acquisition module is used for acquiring a face image set obtained by shooting aiming at the same target object; the face image set comprises a plurality of face images to be evaluated; the first feature evaluation data determining module is used for determining visual feature evaluation data and face feature evaluation data of the face image to be evaluated; wherein the visual feature evaluation data is determined based on low-level features extracted from the face image to be evaluated, the low-level features being used to describe visual layer semantics of the face image to be evaluated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be evaluated; the first image evaluation data acquisition module is used for carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain the face image evaluation data of the face image to be evaluated.
According to one embodiment of the present invention, the first feature evaluation data determining module is further configured to obtain a low-level feature representing the target object according to the low-level features of the plurality of face images to be evaluated; and performing similarity calculation based on the low-level features of the face image to be evaluated and the representative low-level features to obtain the visual feature evaluation data of the face image to be evaluated.
According to an embodiment of the present invention, the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used for describing concept layer semantics of the face image to be evaluated; the first feature evaluation data determining module is further configured to input the face image to be evaluated to a trained face recognition model for feature extraction, so as to obtain the low-level features and the high-level features.
According to an embodiment of the present invention, the face feature evaluation data includes first feature evaluation data; the high-level features comprise a plurality of first high-level features extracted by the face recognition model under different random dropout configurations; the first feature evaluation data determining module is further configured to perform similarity calculation on the plurality of first high-level features to obtain the first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness of the high-level features of the face image to be evaluated.
According to one embodiment of the present invention, the face image to be evaluated is a specified face image among the plurality of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data comprises second feature evaluation data; the first feature evaluation data determining module is further configured to perform similarity calculation between the second high-level features of the specified face image and the second high-level features of other face images to obtain similarity data between the specified face image and the other face images, where the other face images are the face images to be evaluated other than the specified face image among the plurality of face images to be evaluated; and to obtain the second feature evaluation data of the specified face image according to the similarity data; the second feature evaluation data is used for describing the recognizability of the specified face image.
To achieve the above object, an eighth aspect of the present invention provides a label producing apparatus, comprising: the second face image set acquisition module is used for acquiring a face image set obtained by shooting aiming at the same target object; the face image set comprises a plurality of face images to be marked; the second feature evaluation data determining module is used for determining visual feature evaluation data and face feature evaluation data of the face image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be marked; the second image evaluation data acquisition module is used for carrying out face image quality evaluation based on the visual characteristic evaluation data and the face characteristic evaluation data to obtain face image evaluation data of the face image to be marked; and the first face image label acquisition module is used for taking the face image evaluation data as the label of the face image to be marked.
In order to achieve the above object, according to a ninth aspect of the present invention, there is provided a training device for a face quality assessment model, the device comprising: a first sample set construction module for constructing a first training sample set; the first training sample set comprises a plurality of first face image samples, and the first face image samples are provided with labels obtained based on the label generating method in the embodiment; and the initial evaluation model training module is used for training the initial evaluation model by using the first training sample set to obtain the face quality evaluation model.
According to one embodiment of the invention, the initial evaluation model is obtained by migration on the basis of a trained face recognition student model; the face recognition student model corresponds to a face recognition teacher model; the face quality evaluation model training device further comprises: a second sample set construction module for constructing a second training sample set, wherein the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels; a first face recognition module for inputting the second face image sample into the face recognition teacher model for face recognition to obtain a first face recognition feature and a first prediction category; a second face recognition module for inputting the second face image sample into the face recognition student model for face recognition to obtain a second face recognition feature and a second prediction category; a loss data determining module configured to determine loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category label; and a model parameter updating module for updating parameters of the face recognition student model based on the loss data until the model training stop condition is met.
To achieve the above object, an embodiment of a tenth aspect of the present invention provides another image quality evaluation apparatus, including: the face image acquisition module to be evaluated is used for acquiring face images to be evaluated; the first quality evaluation data acquisition module is configured to input the face image to be evaluated into a face quality evaluation model obtained by training the face quality evaluation model training method according to any one of the foregoing embodiments, so as to obtain quality evaluation data of the face image to be evaluated.
To achieve the above object, an eleventh aspect of the present invention provides a face recognition apparatus, including: the third face image set acquisition module is used for acquiring a face image set to be identified, which is obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified; the second quality evaluation data acquisition module is used for inputting the face image to be recognized into a face quality evaluation model obtained through training by the face quality evaluation model training method according to any one of the previous embodiments, so as to obtain quality evaluation data of the face image to be recognized; the target face image determining module is used for determining a target face image with quality meeting the requirement in the face image set to be identified according to the quality evaluation data of the face image to be identified; and the face recognition result acquisition module is used for carrying out face recognition on the target object based on the target face image to obtain a face recognition result.
To achieve the above object, a twelfth aspect of the present invention provides a tag determining apparatus, comprising: the object image set acquisition module is used for acquiring an object image set obtained by shooting a specified object; the object image set comprises a plurality of object images to be annotated; the third feature evaluation data determining module is used for determining visual feature evaluation data and object feature evaluation data of the object image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the object image to be annotated, wherein the low-level characteristics are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated; and the object image label determining module is used for carrying out image quality evaluation according to the visual characteristic evaluation data and the object characteristic evaluation data of the object image to be annotated to obtain the evaluation data of the object image to be annotated, and the evaluation data is used as a label of the object image to be annotated.
To achieve the above object, an embodiment of a thirteenth aspect of the present invention proposes a computer device comprising a memory storing a first computer program and a processor implementing the steps of the method according to any of the previous embodiments when the processor executes the first computer program.
In order to achieve the above object, an embodiment of a fourteenth aspect of the present invention provides a chip, including a storage unit and a processing unit, where the storage unit stores a second computer program, and the processing unit, when executing the second computer program, implements the steps of the image quality evaluation method according to any one of the foregoing embodiments, and/or the face recognition method, and/or the tag determination method.
To achieve the above object, an embodiment of the fifteenth aspect of the present invention proposes a computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of the method according to any of the previous embodiments.
According to the embodiments provided by the invention, the visual characteristic evaluation data of the face image to be evaluated is determined based on the low-level characteristics extracted from the face image to be evaluated, and the face characteristic evaluation data of the face image to be evaluated is determined based on the high-level characteristics extracted from the face image to be evaluated, so that the quality of the face image to be evaluated is accurately and comprehensively described by fusing the visual characteristic evaluation data and the face characteristic evaluation data, and the evaluation effect of the quality of the face image per se is improved.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a flowchart of an image quality evaluation method according to an embodiment of the present disclosure.
Fig. 2 is a flow chart illustrating a method for determining visual characteristic evaluation data according to an embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating a method for determining face feature evaluation data according to an embodiment of the present disclosure.
Fig. 4 is a flowchart of a label generating method according to an embodiment of the present disclosure.
Fig. 5 is a flowchart of a face quality assessment model training method according to an embodiment of the present disclosure.
Fig. 6 is a flowchart of a face quality assessment model training method according to an embodiment of the present disclosure.
Fig. 7 is a flowchart of another image quality evaluation method according to an embodiment of the present disclosure.
Fig. 8 is a schematic flow chart of a face recognition method according to an embodiment of the present disclosure.
Fig. 9 is a flowchart of a tag determination method according to an embodiment of the present disclosure.
Fig. 10 is a block diagram showing the structure of an image quality evaluation apparatus according to an embodiment of the present specification.
Fig. 11 is a block diagram showing a configuration of a tag generating apparatus according to an embodiment of the present specification.
Fig. 12a is a block diagram of a face quality assessment model training apparatus according to one embodiment of the present disclosure.
Fig. 12b is a block diagram of a face quality assessment model training apparatus according to one embodiment of the present disclosure.
Fig. 13 is a block diagram showing the structure of another image quality evaluation apparatus according to an embodiment of the present specification.
Fig. 14 is a block diagram of a face recognition device according to an embodiment of the present disclosure.
Fig. 15 is a block diagram showing the configuration of a tag determining apparatus according to an embodiment of the present specification.
Fig. 16 is a block diagram of a computer device provided according to one embodiment of the present disclosure.
Fig. 17 is a block diagram of a chip provided according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Face recognition is a biometric technology that performs identity recognition based on facial feature information; similar biometric technologies include fingerprint recognition, iris recognition, and the like. Face recognition is more readily accepted by users because it is non-compulsory, contactless, and supports concurrent use. However, in unconstrained scenes where people move frequently and the surrounding environment is complex, such as stations and airports, a large proportion of the face images collected by image acquisition devices such as mobile terminals or monitoring systems are of low quality. Recognizing such low-quality face images often yields invalid recognition results and wastes computing resources. Therefore, evaluating the quality of the face images to be recognized and screening out high-quality face images plays an important role in improving the recognition efficiency and accuracy of a face recognition system.
In the related art, face image quality evaluation methods fall into two main categories: image quality evaluation methods based on image analysis and image quality evaluation methods based on deep learning. Methods based on image analysis mainly use manually defined features to assess image quality. For example, Nasrollahi et al. score the quality of a face image separately for pose, brightness, resolution, and so on, and then perform a weighted fusion of these individual quality scores to obtain an overall quality score; Gao et al. proposed a quality assessment method based on the asymmetry of the human face; and P. J. Phillips et al. proposed quality assessment methods based on 12 kinds of features classified into three types (image features, sensor-related features, and classifier-related features). However, such manually designed features can hardly describe the quality of a face image accurately and comprehensively, and manual labeling requires a great deal of labor and material cost.
Image quality evaluation methods based on deep learning mainly use a neural network to predict an image quality score, and include supervised and unsupervised image quality evaluation methods. Supervised methods require training a neural network on a large number of face image data sets with quality score labels; such data sets are very difficult to acquire, and manually labeled quality scores can hardly describe the quality of face images accurately or guarantee the consistency of the scores. Unsupervised image quality evaluation methods avoid this problem. Many studies have proposed automatically generating face image quality scores with auxiliary models and training a face image quality evaluation model on the automatically generated scores. For example, Hernandez-Ortega et al. proposed computing a face image quality score from the Euclidean distance between the face recognition features of a face image and those of images of the same class (intra-class images); Terhorst et al. proposed computing the quality score of a face image from the differences between the output features of different face recognition models.
Unsupervised image quality evaluation methods can effectively reduce the considerable labor and material cost required to label face image quality scores manually. However, the unsupervised methods in the related art consider only the quality of the face image from the perspective of recognition and do not consider the quality of the face image itself. In addition, the related art uses a face quality evaluation model of the same scale as the face recognition model to evaluate the face quality of a face image. Because the structure of a face recognition model is generally complex, the face quality evaluation model requires a large amount of computation during face quality evaluation, which makes it poorly suited to practical use.
To improve the accuracy of the evaluation of the quality of the face image itself, this specification proposes image quality evaluation, face recognition, and label generation and determination methods and devices that address the drawbacks of the related art. The image quality evaluation method determines visual feature evaluation data of the face image to be evaluated based on low-level features extracted from the image, and determines face feature evaluation data based on high-level features extracted from the image; by fusing the visual feature evaluation data and the face feature evaluation data, the quality of the face image to be evaluated is described accurately and comprehensively, improving the evaluation of the quality of the face image itself.
According to the label generation method provided in this specification, the image quality evaluation method is used to evaluate the quality of the face images to be labeled, so that the face image evaluation data of each face image to be labeled is obtained and used as its label. In this way, quality evaluation data of face images can be generated automatically, without manually designed features or manual labeling, which ensures the consistency of quality scores across different face images while avoiding a great deal of labor and material cost.
According to the face quality evaluation model training method provided in this specification, the label corresponding to a face image sample is obtained through the label generation method, and the initial evaluation model is trained with the labeled face image samples to obtain the trained face quality evaluation model. Using the face quality evaluation model to evaluate the quality of a plurality of face images to be recognized that were shot for the same object, so as to select the face images whose quality meets the requirements for face recognition, can effectively improve face recognition efficiency and accuracy. Further, the face quality evaluation model is obtained by training a lightweight initial evaluation model; the lightweight initial evaluation model is obtained by migration from a trained face recognition student model with a smaller structure, and the face recognition student model is obtained by transferring knowledge from a face recognition teacher model with a more complex structure through knowledge distillation. This simplifies the structure of the face quality evaluation model and reduces its computation while preserving evaluation accuracy, improving its practicability so that it can be deployed in scenarios with lightweight model requirements, such as embedded devices.
Referring to Fig. 1, the present embodiment provides an image quality evaluation method, which may include the following steps.
S110, acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be evaluated.
S120, determining visual feature evaluation data and face feature evaluation data of a face image to be evaluated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be evaluated, wherein the low-level characteristics are used for describing visual layer semantics of the face image to be evaluated; the face feature evaluation data is used for describing the quality condition of concept layer semantics of the face image to be evaluated.
S130, carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain face image evaluation data of the face image to be evaluated.
The target object has identity information, for example, an identity ID, and the same target object is an object with the same identity information, that is, the same person, so that the face image to be evaluated corresponds to the identity information. The low-level features may include basic visual features such as corner points, edge lines, boundaries, colors and the like extracted from the face image to be evaluated after processing such as image filtering, image enhancement, edge detection and the like. Visual layer semantics are low-level feature semantics of an image, i.e., contours, edges, colors, textures, shapes, etc. in the image; concept layer semantics are high-level feature semantics of an image, i.e., a face in an image, etc. The visual characteristic evaluation data can be used for describing quality conditions such as contour definition, color brightness and the like of the face image to be evaluated, the face characteristic evaluation data can be used for describing quality conditions such as characteristic robustness, easy identification and the like of the face image to be evaluated, and the face image evaluation data can be used for describing the overall face quality condition of the face image to be evaluated more comprehensively.
It can be understood that the semantics of an image are divided into visual layer semantics, object layer semantics, and concept layer semantics. The visual layer, i.e., the generally understood lower layer, contains color, texture, shape, etc. features that describe visual layer semantics; the object layer, namely the middle layer, generally comprises characteristics related to attributes such as shape characteristics, structure characteristics and the like, namely the state of a certain object at a certain moment, and the characteristics are used for describing the semantics of the object layer; the conceptual layer is a high level that contains what the image expresses that is closest to human understanding. For example, if a beach image contains sand, blue sky, sea water, etc., then the visual layer semantics contain edge lines, boundaries, colors, etc. in the image; the object layer semantics comprise sand, blue sky, sea water and the like; the concept layer semantics contain the beach, i.e. the content that the image shows closest to human understanding.
In some cases, the image quality generally refers to the visual quality of the image, such as brightness, resolution, sharpness, etc., however, these indexes cannot fully express the quality of the face image for the face recognition system, that is, cannot express the quality of the recognition effect of face recognition on the face image. In order to accurately and comprehensively describe the quality of the face image, the face image quality evaluation can be performed on the face image based on visual feature evaluation data for describing the quality condition of the visual layer semantics of the face image to be evaluated and face feature evaluation data for describing the quality condition of the conceptual layer semantics of the face image to be evaluated.
Specifically, according to a plurality of face images to be evaluated obtained by shooting aiming at the same target object, a face image set can be obtained. For any face image to be evaluated in the face image set, extracting low-level features of the face image to be evaluated to obtain low-level features, and extracting high-level features of the face image to be evaluated to obtain high-level features. Determining visual feature evaluation data of the face image to be evaluated based on the low-level features of the face image to be evaluated, for describing quality conditions of visual layer semantics of the face image to be evaluated, such as contour sharpness, color brightness and the like; face feature evaluation data of the face image to be evaluated is determined based on high-level features of the face image to be evaluated, so as to be used for describing quality conditions of concept layer semantics of the face image to be evaluated, such as feature robustness, face easy recognition and the like. And carrying out face image quality evaluation on any face image to be evaluated according to the determined visual characteristic evaluation data and the face characteristic evaluation data so as to obtain face image evaluation data of any face image to be evaluated.
In some embodiments, the visual feature evaluation data may be determined according to similarity data between low-level features of the face image to be evaluated and low-level features of other face images to be evaluated of the same target object, and the face feature evaluation data may be determined according to similarity data between high-level features of the face image to be evaluated and high-level features of other face images to be evaluated of the same target object. The face image evaluation data may be calculated by performing weighted summation based on the visual feature evaluation data and the face feature evaluation data.
For each face image to be evaluated in the face image set of the same target object, low-level feature extraction and high-level feature extraction are performed to obtain the low-level features and high-level features of each face image to be evaluated. For a face image Pi in the face image set, similarity calculation can be performed between the low-level feature Fv_i of Pi and the low-level feature Fv_j of any other face image Pj to be evaluated in the set, to determine the similarity data between Fv_i and Fv_j. Based on the similarity data between the low-level features of Pi and the low-level features of the other face images to be evaluated, the visual feature evaluation data S_vi of Pi can be determined. Likewise, similarity calculation between the high-level feature Fd_i of Pi and the high-level feature Fd_j of any other face image Pj to be evaluated determines the similarity data between the high-level features of Pi and Pj; from the similarity data between the high-level features of Pi and those of the other face images to be evaluated, the face feature evaluation data S_di of Pi can be determined. The face image evaluation data S_i of Pi is then obtained by a weighted summation of the visual feature evaluation data S_vi and the face feature evaluation data S_di with preset weights, namely: S_i = w_1 * S_vi + w_2 * S_di, where w_1 and w_2 are weight coefficients that can be set according to actual requirements.
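A minimal numeric sketch of the weighted fusion above, assuming both scores are already normalized to [0, 1]; the weight values are illustrative assumptions, not from the patent.

```python
def fuse_scores(s_vi: float, s_di: float, w1: float = 0.4, w2: float = 0.6) -> float:
    """Face image evaluation data S_i = w1 * S_vi + w2 * S_di."""
    return w1 * s_vi + w2 * s_di

# A sharp but hard-to-recognize face: fuse_scores(0.9, 0.3) == 0.54
```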
In other embodiments, the visual feature assessment data may be determined from similarity data between low-level features of the face image to be assessed and low-level features of all face images to be assessed of the same target object.
Taking the above face image set as an example, the representative low-level feature Fv_m of the target object m may also be obtained from the low-level features of all the face images to be evaluated in the set. Similarity calculation between the low-level feature Fv_i of the face image Pi and the representative low-level feature Fv_m yields similarity data between the low-level features of Pi and the low-level features of all the face images to be evaluated in the set. From this similarity data, the visual feature evaluation data S_vi of Pi can be determined.
In still other embodiments, the face feature evaluation data may also be determined according to similarity data between high-level features of the face image to be evaluated and high-level features of other face images to be evaluated of the same target object, and according to similarity data between high-level features of the face image to be evaluated and high-level features of face images to be evaluated of different target objects.
Illustratively, taking the face image Pi as an example, similarity calculation is performed between the high-level feature Fd_i of Pi and the high-level feature Fd_j of any other face image Pj to be evaluated of the same target object, determining the similarity data between the high-level features of Pi and Pj. Similarity calculation between Fd_i and the high-level feature Fd_j' of any face image Pj' to be evaluated in the face image set of another target object likewise determines the similarity data between the high-level features of Pi and those of face images of different target objects. From the similarity data between the high-level features of Pi and those of the other face images to be evaluated of the same target object, together with the similarity data between the high-level features of Pi and those of the face images to be evaluated of different target objects, the face feature evaluation data S_di of Pi can be determined.
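One plausible way to turn these intra-class and inter-class similarities into a recognizability score is the margin between the mean similarity to images of the same target object and the highest similarity to images of other target objects; the exact formula below is an assumption, not given in the patent.

```python
import numpy as np

def recognizability_score(fd_i: np.ndarray,
                          same_class_feats: np.ndarray,
                          other_class_feats: np.ndarray) -> float:
    """Second feature evaluation data sketch. `fd_i` is the high-level
    feature of Pi; the two matrices hold L2-normalized high-level features
    of the other images of the same target object and of different target
    objects (one row per image)."""
    intra = (same_class_feats @ fd_i).mean()   # high when the identity is consistent
    inter = (other_class_feats @ fd_i).max()   # high when confusable with other people
    return float(intra - inter)                # larger margin => more recognizable
```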
In still other embodiments, the face image evaluation data may be determined from a result of a face image quality evaluation, which may be a quality evaluation level.
Illustratively, taking the face image Pi as an example, face image quality evaluation can be performed on Pi according to its visual feature evaluation data S_vi and face feature evaluation data S_di to obtain a quality evaluation grade. A corresponding quality evaluation score may be preset for each quality evaluation grade, and the face image evaluation data of Pi is obtained from the quality evaluation score corresponding to its grade.
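For instance, the grade-to-score mapping might be a simple preset table; the grade names and score values below are illustrative assumptions.

```python
# Hypothetical preset scores for each quality evaluation grade.
GRADE_TO_SCORE = {"high": 1.0, "medium": 0.6, "low": 0.2}

def grade_to_evaluation_data(grade: str) -> float:
    return GRADE_TO_SCORE[grade]
```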
The low-level features may also be referred to as visual features, and the high-level features may also be referred to as face recognition features. The visual feature evaluation data may be a face image visual quality score and the face feature evaluation data may be a face recognition feature quality score.
In the above embodiment, the visual feature evaluation data of the face image to be evaluated is determined based on the low-level features extracted from the face image to be evaluated, and the face feature evaluation data of the face image to be evaluated is determined based on the high-level features extracted from the face image to be evaluated. Thus, by evaluating the visual quality of the face image using the visual feature evaluation data determined based on the low-level features, instead of the manner of evaluating the visual quality of the image in general (i.e., the manner of evaluating the brightness, resolution, sharpness, etc. of the image), the accuracy of the evaluation of the visual quality of the face image itself can be improved. Meanwhile, the quality of the face image to be evaluated is accurately and comprehensively described by fusing the visual characteristic evaluation data and the face characteristic evaluation data, so that the effect of evaluating the quality of the face image per se can be improved.
In some embodiments, referring to FIG. 2, the manner in which visual characteristic assessment data is determined may include the following steps.
S210, obtaining the representative low-level features of the target object according to the low-level features of the face images to be evaluated.
S220, similarity calculation is carried out based on the low-level features and the representative low-level features of the face image to be evaluated, and visual feature evaluation data of the face image to be evaluated are obtained.
The representative low-level feature may be a low-level feature center point of all face images to be evaluated in a face image set obtained by shooting the same target object.
Specifically, according to a face image set obtained by shooting the same target object, respectively extracting low-level features of a plurality of face images to be evaluated in the face image set, so as to obtain the low-level features of the plurality of face images to be evaluated. According to the low-level features of the plurality of face images to be evaluated, the center points of the low-level features of the plurality of face images to be evaluated can be determined and used as the representative low-level features of the target object. And aiming at any face image to be evaluated in the plurality of face images to be evaluated, performing similarity calculation based on the low-level features of the face image to be evaluated and the representative low-level features of the target object, so as to obtain visual feature evaluation data of the face image to be evaluated.
In some embodiments, the similarity calculation may be performed by computing the Euclidean distance between the low-level features of the face image to be evaluated and the representative low-level features; the face image to be evaluated here is an image having a classification category.
It can be understood that when there are multiple target objects, the face images to be evaluated corresponding to the target objects can be classified according to the identity information of each target object, so as to divide a plurality of face images to be evaluated obtained by shooting the same target object into the same category, and one category can be used for representing one identity information (i.e. can be used for representing the same person). Thus, a target object may correspond to a category.
Illustratively, if the classification category corresponding to the target object is denoted as category m, the face image set of the target object belongs to category m. From the low-level features Fv of the plurality of face images to be evaluated in category m, the center point of these low-level features can be determined and denoted as Fv_m, calculated as:

Fv_m = (1/N) * Σ_{i=1}^{N} Fv_i

where N denotes the number of face images to be evaluated in category m; i denotes the index of a face image to be evaluated in category m; and Fv_i denotes the low-level feature of the i-th face image to be evaluated in category m, i ∈ {1, 2, …, N}.
The i-th face image to be evaluated in category m is referred to as image i. For image i, the Euclidean distance between its low-level feature and the low-level feature center point Fv_m of all face images to be evaluated in category m can be calculated, and the visual feature evaluation data S_vi of image i can be obtained from that distance. S_vi can be calculated with reference to the following formula:

S_vi = (d_max − d_i) / (d_max − d_min)

where d_i denotes the Euclidean distance between the low-level feature Fv_i of image i and the low-level feature center point Fv_m of all face images to be evaluated in the category m to which image i belongs; d_max denotes the maximum of the Euclidean distances between the low-level features of the face images to be evaluated in category m and the center point Fv_m; and d_min denotes the minimum of those Euclidean distances.
Therefore, the larger the Euclidean distance between the low-level feature of image i and the low-level feature center point of all face images to be evaluated in its category m, the smaller the visual feature evaluation data of image i, indicating lower visual quality. Conversely, the smaller that Euclidean distance, the larger the visual feature evaluation data of image i, indicating higher visual quality.
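By way of illustration, a minimal Python sketch of this calculation follows, assuming the min-max normalization given above; the feature dimension, random data, and function names are illustrative assumptions, not part of this embodiment.

```python
import numpy as np

def visual_feature_scores(low_level_feats):
    """Score each face image of one class (one target object): the closer an
    image's low-level feature is to the class center point Fv_m, the higher
    its visual feature evaluation data S_vi (min-max normalized)."""
    feats = np.asarray(low_level_feats, dtype=np.float64)  # shape (N, D)
    center = feats.mean(axis=0)                            # Fv_m
    dists = np.linalg.norm(feats - center, axis=1)         # d_i
    d_min, d_max = dists.min(), dists.max()
    if d_max == d_min:                                     # all images equally close
        return np.ones(len(feats))
    return (d_max - dists) / (d_max - d_min)               # S_vi in [0, 1]

# Example: 5 face images of the same person with 128-dim low-level features.
rng = np.random.default_rng(0)
print(visual_feature_scores(rng.normal(size=(5, 128))))
```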
It should be noted that the face image to be evaluated may also be an image sample having a classification category.
In some implementations, the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used to describe the conceptual layer semantics of the face image to be evaluated. The determining manner of the low-level features and the high-level features can comprise: and inputting the face image to be evaluated into a trained face recognition model to perform feature extraction, so as to obtain low-level features and high-level features.
Specifically, the face image to be evaluated can be input into a trained face recognition model, so that the model performs low-level feature extraction on it to obtain its low-level features, and performs high-level feature extraction on it to obtain its high-level features.
In some embodiments, the trained face recognition model may be a deep neural network model that processes the raw image through multiple layers of neurons to obtain the final desired features. In general, features produced by the higher-layer neurons of a deep neural network mainly describe the conceptual-layer semantic information of an image, while features produced by the lower-layer neurons mainly describe its visual-layer semantic information. Therefore, low-level feature extraction can be performed through the lower-layer neurons of the face recognition model to obtain the low-level features of the face image to be evaluated, and high-level feature extraction can be performed through the higher-layer neurons to obtain its high-level features.
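As an illustration of extracting both feature levels from one model, the sketch below uses PyTorch forward hooks to capture the output of a lower block (visual-layer semantics) and a higher block (concept-layer semantics); the toy network is a hypothetical stand-in for a trained face recognition model, not the structure used in this embodiment.

```python
import torch
import torch.nn as nn

class TinyFaceNet(nn.Module):
    """Hypothetical stand-in for a trained face recognition model."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.low = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                 nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU())
        self.high = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, embed_dim))

    def forward(self, x):
        return self.high(self.low(x))

model = TinyFaceNet().eval()
captured = {}
# Forward hooks record intermediate activations without modifying the model.
model.low.register_forward_hook(lambda m, inp, out: captured.update(low=out))
model.high.register_forward_hook(lambda m, inp, out: captured.update(high=out))

with torch.no_grad():
    model(torch.randn(1, 3, 112, 112))    # one aligned face crop (dummy data)
low_feature = captured["low"].flatten(1)  # lower layers: visual-layer semantics
high_feature = captured["high"]           # higher layers: concept-layer semantics
print(low_feature.shape, high_feature.shape)
```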
Further, after the low-level features and the high-level features of the face image to be evaluated are obtained, the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated can be obtained through face recognition model calculation.
In the above embodiment, the face recognition model extracts both the low-level and high-level features of the face image, and the visual feature evaluation data and face feature evaluation data can be computed from the model, so both the low-level visual features and the high-level face recognition features of the face image to be evaluated are considered when evaluating face image quality. This realizes a face image quality evaluation method optimized on the basis of a face recognition method, by which face image evaluation data can be obtained automatically.
In some implementations, the face feature evaluation data includes first feature evaluation data; the high-level features include a plurality of first high-level features extracted by the face recognition model in different random discard modes. The determining manner of the face feature evaluation data may include: performing similarity calculation according to the plurality of first high-level features to obtain first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness condition of the high-level features of the face image to be evaluated.
The network layer of the face recognition model is provided with a Dropout layer (random discarding layer), and the random discarding mode is a Dropout mode. The first feature evaluation data is face feature evaluation data of a single image.
Specifically, a trained network layer of a face recognition model with a Dropout layer can be used for high-level feature extraction of a face image to be evaluated. By using different random discarding modes, a plurality of random subnetworks of the face recognition model can be obtained, and the random subnetworks can be used for extracting high-level feature vectors of the face images to be evaluated. The high-level feature vector extracted by one random sub-network is called a first high-level feature, and thus a plurality of first high-level features of the face image to be evaluated can be generated through the plurality of random sub-networks. And performing similarity calculation according to the plurality of first high-level features to obtain first feature evaluation data of the face image to be evaluated, and evaluating the robustness of the high-level features of the face image to be evaluated according to the first feature evaluation data.
In some embodiments, the similarity calculation over the plurality of first high-level features may be performed by calculating the Euclidean distance. The first feature evaluation data of the face image to be evaluated may be calculated using the unsupervised SER-FIQ (Unsupervised Estimation of Face Image Quality Based on Stochastic Embedding Robustness) face image quality evaluation method, in which the plurality of first high-level features are also referred to as a plurality of stochastic embeddings generated by a plurality of random subnetworks.
Illustratively, a trained face recognition model with a Dropout layer is used for high-level feature extraction of the face image to be evaluated. By applying different Dropout patterns, the embeddings x_s of n random subnetworks (i.e., the first high-level features) can be generated. For one face image to be evaluated, its n stochastic embeddings form a set, denoted X(I):

X(I) = {x_s}, s ∈ {1, 2, …, n}

where s denotes the index of a stochastic embedding, i.e., the index of a random subnetwork.
For the same face image to be evaluated, similarity calculation over its n stochastic embeddings yields the first feature evaluation data S_ri of that image. S_ri can be calculated with reference to the following formula:

S_ri = 2 * σ( −(2 / (n(n−1))) * Σ_{p<q} d(x_p, x_q) )

where σ(·) denotes the sigmoid activation function; d(x_p, x_q) denotes the Euclidean distance between stochastic embeddings x_p and x_q, with p ∈ {1, 2, …, n} and q ∈ {1, 2, …, n}.
From this formula, for the same face image to be evaluated, the larger the Euclidean distances between its stochastic embeddings, the greater the variation between them, indicating lower robustness of its high-level features and lower face feature quality. Conversely, the smaller those Euclidean distances, the smaller the variation between the stochastic embeddings, indicating higher robustness of its high-level features and higher face feature quality.
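A minimal sketch of this SER-FIQ-style score follows, assuming the published SER-FIQ form (twice the sigmoid of the negative mean pairwise Euclidean distance); the embedding dimension and noise level are illustrative only.

```python
import numpy as np

def robustness_score(stochastic_embeddings):
    """First feature evaluation data S_ri from n stochastic embeddings of one
    face image: 2 * sigmoid(-mean pairwise Euclidean distance). Small variation
    between embeddings -> score near 1 (robust high-level features)."""
    x = np.asarray(stochastic_embeddings, dtype=np.float64)  # shape (n, D)
    n = len(x)
    dists = [np.linalg.norm(x[p] - x[q])
             for p in range(n) for q in range(p + 1, n)]
    return 2.0 / (1.0 + np.exp(np.mean(dists)))              # 2 * sigmoid(-mean)

# Example: n = 10 embeddings of one image, e.g. from 10 forward passes with
# Dropout kept active at inference time (illustrative random data).
rng = np.random.default_rng(1)
base = rng.normal(size=512)
embeddings = base + 0.05 * rng.normal(size=(10, 512))
print(robustness_score(embeddings))
```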
In some embodiments, the face image to be evaluated is a designated face image of a number of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data includes second feature evaluation data. Referring to fig. 3, the determination method of the face feature evaluation data may include the following steps.
S310, performing similarity calculation according to the second high-level features of the appointed face image and the second high-level features of other face images to obtain similarity data between the appointed face image and the other face images; the other face images are other face images to be evaluated except the appointed face image in the plurality of face images to be evaluated.
S320, obtaining second characteristic evaluation data of the appointed face image according to the similarity data; the second feature evaluation data is used for describing the identification condition of the appointed face image.
The similarity data is high-level feature similarity data and is used for describing high-level feature similarity between the appointed face image and other face images. The second feature evaluation data is high-level feature similarity evaluation data of the same object (or the same identity, the same ID), or in-class high-level feature similarity evaluation data.
In some cases, the higher the similarity between the high-level features of different face images to be evaluated of the same target object, the more easily the face in such an image is correctly identified by the face recognition model, and the higher the face feature quality of the image. Therefore, for a specified face image of a target object, the second feature evaluation data of the specified face image can be obtained from the high-level feature similarity between the specified face image and the other face images of that target object, and used to describe the recognition condition of the specified face image, i.e., how accurately it can be recognized.
Specifically, according to a face image set obtained by shooting the same target object, for a plurality of face images to be evaluated in the face image set, high-level feature extraction can be performed on each face image to be evaluated through a face recognition model, so as to obtain second high-level features of each face image to be evaluated. And aiming at the appointed face image in the plurality of face images to be evaluated, carrying out similarity calculation according to the second high-level features of the appointed face image and the second high-level features of other face images in the plurality of face images to be evaluated to obtain similarity data between the appointed face image and the other face images. According to the similarity data between the appointed face image and other face images, second characteristic evaluation data of the appointed face image can be further obtained, so that the identification condition of the appointed face image can be evaluated according to the second characteristic evaluation data.
In some embodiments, the similarity calculation may be performed according to the second high-level features of the specified face image and the second high-level features of the other face images by calculating cosine similarity, where the similarity data is cosine similarity data. The second feature evaluation data may be average cosine similarity data calculated from cosine similarity data between the specified face image and a different other face image.
Illustratively, the specified face image among a plurality of face images to be evaluated of the same ID is denoted as image i, and any other face image among them is denoted as image j. The second high-level feature Fc_i of image i and the second high-level feature Fc_j of image j can be obtained through the face recognition model. Cosine similarity is calculated between the second high-level features of image i and image j to obtain the similarity data between them, and likewise between image i and each of the other face images. Averaging the similarity data between image i and the different other face images yields the average cosine similarity data, which serves as the second feature evaluation data S_ci of image i. S_ci can be calculated with reference to the following formula:

S_ci = (1/(N−1)) * Σ_{j≠i} cos(Fc_i, Fc_j)

where cos(·,·) denotes the cosine similarity calculation and N denotes the number of face images to be evaluated of the same ID. It will be appreciated that image i and image j have the same ID, i.e., ID_i = ID_j.
From this formula, the larger the similarity data between image i and the different other face images of the same ID, the more easily the face in image i is identified as the correct object, and the higher the face feature quality of image i. Conversely, the smaller that similarity data, the less easily the face in image i is identified as the correct object, and the lower its face feature quality.
In this embodiment, the high-level feature extraction is performed on the face image to be evaluated through the network layer of the face recognition model without the Dropout layer.
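For illustration, the sketch below computes the second feature evaluation data as the average intra-class cosine similarity described above; the array shapes and random data are assumptions for demonstration.

```python
import numpy as np

def second_feature_scores(high_level_feats):
    """For each image i of one ID, S_ci = average cosine similarity between its
    second high-level feature Fc_i and those of all other images of the same ID."""
    f = np.asarray(high_level_feats, dtype=np.float64)  # shape (N, D), one ID
    f = f / np.linalg.norm(f, axis=1, keepdims=True)    # unit-normalize rows
    sim = f @ f.T                                       # sim[i, j] = cos(Fc_i, Fc_j)
    n = len(f)
    return (sim.sum(axis=1) - 1.0) / (n - 1)            # drop the j == i term

rng = np.random.default_rng(2)
feats = rng.normal(size=(6, 512))    # 6 face images of the same target object
print(second_feature_scores(feats))  # S_ci for each image
```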
In some implementations, the face feature evaluation data includes first feature evaluation data and second feature evaluation data. Performing face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain the face image evaluation data of the face image to be evaluated, which may include: and carrying out face image quality evaluation according to the visual characteristic evaluation data, the first characteristic evaluation data and the second characteristic evaluation data to obtain face image evaluation data.
Specifically, the visual quality of the face image to be evaluated (such as contour definition and color brightness) and its quality in terms of feature robustness and recognizability can be evaluated more accurately and comprehensively from the visual feature evaluation data, the first feature evaluation data, and the second feature evaluation data, thereby obtaining the face image evaluation data of the face image to be evaluated.
In some embodiments, the face image evaluation data may be calculated from a weighted sum of the visual feature evaluation data, the first feature evaluation data, and the second feature evaluation data.
Illustratively, the face image to be evaluated is denoted as image i. For image i, the visual feature evaluation data S_vi, the first feature evaluation data S_ri, and the second feature evaluation data S_ci can be obtained by the foregoing methods. Using weighted fusion, these three are weighted and summed according to preset weights to generate the final face image evaluation data S_i of image i:

S_i = α*S_vi + β*S_ri + γ*S_ci

where α, β, and γ are weight coefficients that can be set according to actual requirements.
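The weighted fusion itself reduces to a few lines; the weight values in the sketch below are illustrative placeholders, not values prescribed by this embodiment.

```python
def fuse_scores(s_v, s_r, s_c, alpha=0.4, beta=0.3, gamma=0.3):
    """Face image evaluation data S_i = alpha*S_vi + beta*S_ri + gamma*S_ci.
    The weights are set according to actual requirements (placeholders here)."""
    return alpha * s_v + beta * s_r + gamma * s_c

print(fuse_scores(0.8, 0.9, 0.7))  # final evaluation data for one face image
```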
The embodiment of the present specification provides a tag generation method, which may include the following steps with reference to fig. 4.
S410, acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be marked.
S420, determining visual feature evaluation data and face feature evaluation data of a face image to be marked; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be marked.
S430, performing face image quality evaluation based on the visual characteristic evaluation data and the face characteristic evaluation data to obtain face image evaluation data of the face image to be marked.
S440, taking the face image evaluation data as a label of the face image to be marked.
The face image to be marked is an image sample with classification categories, and the classification categories correspond to the identity information of the target object.
Specifically, for any face image to be annotated in a plurality of face images to be annotated, face image evaluation data of the face image to be annotated can be obtained through the image quality evaluation method in any one of the foregoing embodiments, so that the face image evaluation data is used as a label of the face image to be annotated.
Further, according to the face image to be marked and the label of the face image evaluation data corresponding to the face image to be marked, a face image sample can be obtained and used for training based on the face image sample to obtain a face quality evaluation model.
It will be appreciated that the face image to be annotated may be the same as the face image to be assessed in any of the preceding embodiments.
It should be noted that, for the description of the face image evaluation data obtained from the face image to be labeled in the above embodiment, please refer to the description of the image quality evaluation method in the present specification, and detailed descriptions thereof are omitted here.
The embodiment of the present disclosure provides a training method for a face quality assessment model, and referring to fig. 5, the training method for a face quality assessment model may include the following steps.
S510, constructing a first training sample set; the first training sample set includes a plurality of first face image samples, where the first face image samples have labels obtained based on the label generating method in the foregoing embodiment.
S520, training the initial evaluation model by using the first training sample set to obtain a face quality evaluation model.
Specifically, a plurality of face images shot for the same target object can be obtained first. Next, any one of these face images is taken as the face image to be annotated, and its label is obtained by the label generation method of the foregoing embodiment; a first face image sample is then constructed from that face image and its label. In this way, a plurality of first face image samples can be constructed, forming the first training sample set. The initial evaluation model is trained with the first face image samples in the first training sample set to obtain the trained face quality evaluation model.
For example, for the target object M, a plurality of face images shot for M may be acquired first, and any one of them denoted as image i. Next, for image i, the face image evaluation data S_i of image i can be obtained by the label generation method, and S_i is taken as the label of image i. A first face image sample is then constructed from image i and its label S_i; in this way, a plurality of first face image samples can be constructed. The face image of any first face image sample is used as the input of the initial evaluation model, which outputs estimated quality evaluation data for that sample. Based on the label of the first face image sample and the estimated quality evaluation data, the loss data of the initial evaluation model can be determined, and the initial evaluation model is updated according to the loss data until the stopping condition of model training is satisfied, yielding the trained face quality evaluation model.
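A minimal training sketch under stated assumptions follows: the initial evaluation model is reduced to a small regressor over precomputed face features, and mean squared error against the generated labels S_i is assumed as the loss, which this embodiment does not prescribe.

```python
import torch
import torch.nn as nn

# Hypothetical initial evaluation model: face features -> one quality score.
quality_model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1))
optimizer = torch.optim.Adam(quality_model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # regression against the generated labels (an assumption)

def train_step(face_feats, labels):
    """One update on a batch of first face image samples.
    face_feats: (B, 512) features; labels: (B,) quality labels S_i."""
    optimizer.zero_grad()
    pred = quality_model(face_feats).squeeze(1)
    loss = loss_fn(pred, labels)
    loss.backward()
    optimizer.step()
    return loss.item()

batch = torch.randn(16, 512)  # dummy batch standing in for face image features
labels = torch.rand(16)       # labels produced by the label generation method
print(train_step(batch, labels))
```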
In some embodiments, the face quality assessment model may be obtained by performing knowledge migration based on a trained face recognition model, where the network structure of the face quality assessment model is substantially the same as the network structure of the trained face recognition model.
It will be appreciated that the first training sample set may comprise several first face image samples of different target objects.
In some embodiments, the initial assessment model is migrated based on a face recognition student model that has been trained; the face recognition student model corresponds to a face recognition teacher model. Referring to fig. 6, the face quality assessment model training method may further include the following steps.
S610, constructing a second training sample set; the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels.
S620, inputting the second face image sample into a face recognition teacher model to carry out face recognition, and obtaining a first face recognition feature and a first prediction category.
And S630, inputting the second face image sample into a face recognition student model to carry out face recognition, and obtaining second face recognition characteristics and second prediction categories.
And S640, determining loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category and the object category label.
And S650, updating parameters of the face recognition student model based on the loss data of the face recognition student model until the model training stopping condition is met.
The face recognition Teacher model is a trained Teacher model with a large and complex network structure for face recognition; the face recognition Student model is a Student model with a simpler network structure for face recognition, obtained by knowledge migration from the face recognition Teacher model using a knowledge distillation technique. The object category label indicates the object category to which the second face image sample actually belongs, and the object category corresponds to the identity information of the target object.
The first face recognition features are face recognition features in a second face image sample extracted by a face recognition teacher model, and the first prediction category is an object category of the second face image sample predicted by the face recognition teacher model; the second face recognition feature is a face recognition feature in a second face image sample extracted by the face recognition student model, and the second prediction category is an object category to which the second face image sample predicted by the face recognition student model belongs.
In some cases, most face quality evaluation models perform knowledge migration from a face recognition model, using substantially the same network structure as that model. However, to ensure face recognition accuracy, the face recognition models generally used are large and complex, so their computation cost is relatively high. Adding a face quality evaluation model with the same network structure as the face recognition model to a face recognition system is therefore unfriendly to embedded devices, and the face quality evaluation model needs to be made lightweight.
Knowledge distillation is a model lightweighting technique. It adopts a Teacher-Student mode: a larger, more complex model serves as the Teacher model, and a smaller, simpler model serves as the Student model. The Teacher model has strong learning ability and can assist the training of the Student model, transferring the knowledge it has learned to the Student model with relatively weak learning ability, thereby enhancing the generalization ability of the Student model and yielding a trained lightweight Student model. Therefore, a lightweight face recognition model can be obtained by knowledge distillation, and a lightweight face quality evaluation model can then be obtained by migration from that lightweight face recognition model.
Specifically, a plurality of face images respectively shot for different target objects can be obtained, and the face images are respectively divided into corresponding object categories according to identity information of the different target objects. And taking the object class corresponding to each face image as the object class label of each face image, and constructing a second face image sample according to each face image and the corresponding object class label. Therefore, a plurality of second face image samples can be constructed, and a second training sample set can be constructed.
The trained face recognition model with the larger, more complex structure is used as the face recognition teacher model, and a face recognition model with a smaller, simpler structure is additionally defined as the face recognition student model. A second face image sample from the second training sample set is input to the face recognition teacher model for face recognition, yielding the first face recognition feature extracted by the teacher model and its predicted first prediction category; the same sample is input to the face recognition student model for face recognition, yielding the second face recognition feature extracted by the student model and its predicted second prediction category. From the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category label, the loss data of the face recognition student model can be determined, and the parameters of the student model are updated according to the loss data until the stopping condition of model training is met, yielding the fully trained face recognition student model.
Further, after the training-completed face recognition student model is obtained, an initial evaluation model for face image quality evaluation can be defined by using the same network structure as that of the training-completed face recognition student model, so that knowledge migration is performed on the basis of the training-completed face recognition student model, and the final trained face quality evaluation model is obtained by training the initial evaluation model.
In some embodiments, the object class may be an identity ID class (or a face ID class) classified by identity ID, then the object class label may be a face ID classification label, the first predicted class may be a first predicted face ID class, and the second predicted class may be a second predicted face ID class.
Illustratively, the face image of target object 1 corresponds to face ID category 1, the face image of target object 2 corresponds to face ID category 2, and the face image of target object 3 corresponds to face ID category 3. From the face image of target object 1 and its face ID category 1, the face image of target object 2 and its face ID category 2, and the face image of target object 3 and its face ID category 3, a plurality of second face image samples can be constructed. The object category label of a second face image sample is denoted as C_l; C_l is any one of face ID category 1, face ID category 2, and face ID category 3.
For any second face image sample, the sample is input into the face recognition teacher model for face recognition; the first face recognition feature output by the teacher model is denoted as F_t and the first prediction category it outputs as C_t. The sample is also input into the face recognition student model for face recognition; the second face recognition feature output by the student model is denoted as F_s and the second prediction category it outputs as C_s. Based on the first face recognition feature F_t, the first prediction category C_t, the second face recognition feature F_s, the second prediction category C_s, and the object category label C_l, the loss data Loss of the face recognition student model is determined. Loss can be calculated with reference to the following formula:

Loss = η*MSE(F_s, F_t) + λ*CE(C_s, C_t) + ζ*CE(C_s, C_l)

where MSE(·) denotes the mean square error loss function; CE(·) denotes the cross entropy loss function; and η, λ, and ζ are weight coefficients that can be set according to actual requirements.
The parameters of the face recognition student model are updated according to the loss data Loss until the stopping condition of model training is met, yielding the trained face recognition student model.
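The loss above can be sketched as follows. Here CE(C_s, C_t) is computed against the teacher's predicted (argmax) category, which is one reading of the formula; distillation against the teacher's soft class distribution is a common alternative. The dimensions and random data are illustrative.

```python
import torch
import torch.nn.functional as F

def student_loss(f_s, f_t, logits_s, logits_t, labels,
                 eta=1.0, lam=1.0, zeta=1.0):
    """Loss = eta*MSE(F_s, F_t) + lambda*CE(C_s, C_t) + zeta*CE(C_s, C_l)."""
    feat_loss = F.mse_loss(f_s, f_t)                         # feature imitation
    teacher_ce = F.cross_entropy(logits_s, logits_t.argmax(dim=1))
    label_ce = F.cross_entropy(logits_s, labels)             # ground-truth IDs
    return eta * feat_loss + lam * teacher_ce + zeta * label_ce

f_t = torch.randn(8, 512)                        # teacher features F_t
f_s = torch.randn(8, 512, requires_grad=True)    # student features F_s
logits_t = torch.randn(8, 100)                   # teacher class logits -> C_t
logits_s = torch.randn(8, 100, requires_grad=True)
labels = torch.randint(0, 100, (8,))             # object category labels C_l
print(student_loss(f_s, f_t, logits_s, logits_t, labels))
```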
Further, after the trained face recognition student model is obtained, an initial evaluation model may be defined using the same backbone network structure as that of the trained face recognition student model, and knowledge migration may be performed through this backbone network. The initial evaluation model is then trained to obtain the trained face quality evaluation model.
It is to be appreciated that the second training sample set may be the same as the first training sample set.
In the above embodiment, the knowledge distillation technology is used to obtain the light-weight unsupervised face quality assessment model, so that the calculation amount of the face quality assessment model can be reduced, the calculation efficiency and the practicability of the face quality assessment model are improved, and the face quality assessment model can be well applied to scenes with light-weight model deployment requirements, such as embedded equipment.
The present embodiment provides an image quality evaluation method, which may include the following steps with reference to fig. 7.
S710, acquiring a face image to be evaluated.
S720, inputting the face image to be evaluated into a face quality evaluation model obtained through training by the face quality evaluation model training method in any one of the previous embodiments, so as to obtain quality evaluation data of the face image to be evaluated.
The quality evaluation data are face image evaluation data obtained by the evaluation of the face quality evaluation model, and are face image quality scores.
Specifically, a face image to be evaluated that requires face image quality evaluation is obtained and used as the input of a face quality evaluation model trained by the foregoing face quality evaluation model training method; the model then performs face image quality evaluation on it and outputs the quality evaluation data of the face image to be evaluated.
The embodiment of the present disclosure provides a face recognition method, and referring to fig. 8, the face recognition method may include the following steps.
S810, acquiring a face image set to be recognized, which is obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified.
S820, inputting the face image to be recognized into a face quality evaluation model obtained through training by the face quality evaluation model training method in any one of the previous embodiments, so as to obtain quality evaluation data of the face image to be recognized.
S830, determining a target face image with quality meeting requirements in a face image set to be identified according to quality evaluation data of the face image to be identified.
And S840, performing face recognition on the target object based on the target face image to obtain a face recognition result.
The face recognition result can be used for determining identity information of the target object.
Specifically, a plurality of face images to be recognized can be shot for a target object to obtain the face image set to be recognized. Each face image to be recognized in the set is used in turn as the input of a face quality evaluation model trained by the foregoing training method, and the model outputs the quality evaluation data of each image. Larger quality evaluation data indicates better face quality, so a target face image whose quality meets the requirement can be determined in the set according to the quality evaluation data of each image. Face recognition is then performed on the target object based on the target face image to obtain a face recognition result, from which the identity information of the target object can be determined.
In some embodiments, the target face image may be the best quality face image of the set of face images to be identified.
Illustratively, the face image set to be recognized includes face images to be recognized 1, 2, 3, 4, and 5. The trained face quality evaluation model yields their quality evaluation data S_1, S_2, S_3, S_4, and S_5, respectively. Suppose S_1 < S_2 < S_3 < S_4 < S_5; this indicates that face image 5 has the best face quality, so face image 5 may be determined as the target face image.
In other embodiments, the target face image may be a face image to be identified whose quality assessment data is not less than a preset assessment data threshold.
Taking the above face image set to be recognized as an example, denote the preset evaluation data threshold as S_f. Suppose S_1 < S_2 < S_3 < S_f < S_4 < S_5; then face image 4 and face image 5 have quality meeting the requirement and may both be determined as target face images.
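Both selection strategies can be expressed compactly; the helper below is an illustrative sketch, with the scores taken from the example above.

```python
def select_targets(quality_scores, threshold=None):
    """Pick target face images from the set to be recognized: with no
    threshold, return the single best-quality image; otherwise return every
    image whose score reaches the preset evaluation data threshold S_f."""
    if threshold is None:
        best = max(range(len(quality_scores)), key=lambda i: quality_scores[i])
        return [best]
    return [i for i, s in enumerate(quality_scores) if s >= threshold]

scores = [0.31, 0.42, 0.55, 0.78, 0.91]       # S_1 .. S_5 (illustrative values)
print(select_targets(scores))                 # [4] -> face image 5 is best
print(select_targets(scores, threshold=0.6))  # [3, 4] -> face images 4 and 5
```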
The present embodiment provides a tag determination method, which may include the following steps with reference to fig. 9.
S910, acquiring an object image set obtained by shooting a specified object; the object image set comprises a plurality of object images to be annotated.
S920, determining visual characteristic evaluation data and object characteristic evaluation data of the object image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the object image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated.
And S930, performing image quality evaluation according to the visual characteristic evaluation data and the object characteristic evaluation data of the object image to be marked to obtain evaluation data of the object image to be marked, wherein the evaluation data is used as a label of the object image to be marked.
The specified object can be a human, an animal, or the like. If the specified object is a person, the object image to be annotated contains at least a face; if the specified object is an animal, the object image to be annotated may contain at least the animal's face or a representative part for distinguishing the individual animal, such as the trunk, wings, mouth, beak, or horns. The evaluation data is evaluation data of the image quality of the object image to be annotated.
For the description of the tag determination method in the above embodiment, please refer to the description of the tag generation method in the present specification, and detailed description thereof is omitted here.
The present embodiment provides an image quality evaluation apparatus, referring to fig. 10, an image quality evaluation apparatus 1000 may include: a first face image set acquisition module 1010, a first feature evaluation data determination module 1020, and a first image evaluation data acquisition module 1030.
A first face image set obtaining module 1010, configured to obtain a face image set obtained by shooting for the same target object; the face image set comprises a plurality of face images to be evaluated.
A first feature evaluation data determination module 1020 for determining visual feature evaluation data and face feature evaluation data of the face image to be evaluated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be evaluated, wherein the low-level characteristics are used for describing visual layer semantics of the face image to be evaluated; the face feature evaluation data is used for describing the quality condition of concept layer semantics of the face image to be evaluated.
The first image evaluation data acquisition module 1030 is configured to perform face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated, so as to obtain face image evaluation data of the face image to be evaluated.
In some embodiments, the first feature evaluation data determining module 1020 is further configured to obtain, according to the low-level features of the plurality of face images to be evaluated, a representative low-level feature of the target object; and performing similarity calculation based on the low-level features and the representative low-level features of the face image to be evaluated to obtain visual feature evaluation data of the face image to be evaluated.
In some implementations, the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used to describe the conceptual layer semantics of the face image to be evaluated. The first feature evaluation data determining module 1020 is further configured to input the face image to be evaluated to a trained face recognition model for feature extraction, so as to obtain low-level features and high-level features.
In some implementations, the face feature evaluation data includes first feature evaluation data; the high-level features include a plurality of first high-level features extracted by the face recognition model in different random discard modes. The first feature evaluation data determining module 1020 is further configured to perform similarity calculation according to a plurality of first high-level features to obtain first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness condition of the high-level features of the face image to be evaluated.
In some embodiments, the face image to be evaluated is a designated face image of a number of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data includes second feature evaluation data. The first feature evaluation data determining module 1020 is further configured to perform similarity calculation according to the second high-level feature of the specified face image and the second high-level feature of the other face image, so as to obtain similarity data between the specified face image and the other face image; the other face images are other face images to be evaluated except the appointed face image in the plurality of face images to be evaluated; obtaining second characteristic evaluation data of the appointed face image according to the similarity data; the second feature evaluation data is used for describing the identification condition of the appointed face image.
For specific limitations of the image quality evaluation device, reference may be made to the above limitations of the image quality evaluation method, and no further description is given here.
The present embodiment provides a tag generating apparatus, referring to fig. 11, a tag generating apparatus 1100 may include: a second face image set acquisition module 1110, a second feature evaluation data determination module 1120, a second image evaluation data acquisition module 1130, and a first face image tag acquisition module 1140.
A second face image set obtaining module 1110, configured to obtain a face image set obtained by shooting for the same target object; the face image set comprises a plurality of face images to be marked;
the second feature evaluation data determining module 1120 is configured to determine visual feature evaluation data and face feature evaluation data of a face image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the face image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be marked;
the second image evaluation data acquisition module 1130 is configured to perform face image quality evaluation based on the visual feature evaluation data and the face feature evaluation data, to obtain face image evaluation data of a face image to be labeled;
the first face image tag obtaining module 1140 is configured to use the face image evaluation data as a tag of a face image to be labeled.
For specific limitations of the label generating apparatus, reference may be made to the above limitations of the label generating method, and no further description is given here.
The embodiment of the present disclosure provides a training device for a face quality assessment model, referring to fig. 12a, the training device 1200 for a face quality assessment model may include: a first sample set construction module 1210, an initial assessment model training module 1220.
A first sample set construction module 1210 for constructing a first training sample set; the first training sample set comprises a plurality of first face image samples, and the first face image samples are provided with labels obtained based on the label generation method in the foregoing embodiments;
the initial evaluation model training module 1220 is configured to train the initial evaluation model by using the first training sample set to obtain a face quality evaluation model.
In some embodiments, the initial assessment model is migrated based on a face recognition student model that has been trained; the face recognition student model corresponds to a face recognition teacher model. Referring to fig. 12b, the face quality assessment model training apparatus 1200 may further include: the second sample set construction module 1230, the first face recognition module 1240, the second face recognition module 1250, the loss data determination module 1260, and the model parameter update module 1270.
A second sample set construction module 1230 for constructing a second training sample set; the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels;
the first face recognition module 1240 is configured to input the second face image sample to a face recognition teacher model for face recognition, so as to obtain a first face recognition feature and a first prediction category;
The second face recognition module 1250 is configured to input a second face image sample to the face recognition student model for face recognition, so as to obtain a second face recognition feature and a second prediction category;
a loss data determination module 1260 for determining loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category label;
the model parameter updating module 1270 is configured to update parameters of the face recognition student model based on the loss data of the face recognition student model until a model training stop condition is satisfied.
For specific limitations on the training device of the face quality assessment model, reference may be made to the above limitations on the training method of the face quality assessment model, and no further description is given here.
The present embodiment also provides an image quality evaluation apparatus, referring to fig. 13, an image quality evaluation apparatus 1300 may include: a face image to be evaluated acquisition module 1310, a first quality evaluation data acquisition module 1320.
The face image to be evaluated acquisition module 1310 is configured to acquire a face image to be evaluated.
The first quality evaluation data acquisition module 1320 is configured to input the face image to be evaluated into the face quality evaluation model trained by the face quality evaluation model training method in any one of the foregoing embodiments, so as to obtain quality evaluation data of the face image to be evaluated.
For specific limitations of the image quality evaluation device, reference may be made to the above limitations of the image quality evaluation method, and no further description is given here.
The present embodiment also provides a face recognition apparatus, referring to fig. 14, the face recognition apparatus 1400 may include: a third face image set acquisition module 1410, a second quality assessment data acquisition module 1420, a target face image determination module 1430, and a face recognition result acquisition module 1440.
A third face image set obtaining module 1410, configured to obtain a face image set to be identified obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified.
The second quality evaluation data acquisition module 1420 is configured to input the face image to be recognized into a face quality evaluation model trained by the face quality evaluation model training method in any one of the foregoing embodiments, to obtain quality evaluation data of the face image to be recognized.
The target face image determining module 1430 is configured to determine a target face image with quality meeting the requirement in the face image set to be identified according to the quality evaluation data of the face image to be identified;
The face recognition result obtaining module 1440 is configured to perform face recognition on the target object based on the target face image, so as to obtain a face recognition result.
For specific limitations of the face recognition apparatus, reference may be made to the above limitations of the face recognition method, and no further description is given here.
The present embodiment also provides a tag determining apparatus, referring to fig. 15, a tag determining apparatus 1500 may include: an object image set acquisition module 1510, a third feature evaluation data determination module 1520, and an object image tag to be annotated determination module 1530.
An object image set acquisition module 1510, configured to acquire an object image set obtained by shooting for a specified object; the object image set comprises a plurality of object images to be annotated.
A third feature evaluation data determination module 1520, configured to determine visual feature evaluation data and object feature evaluation data of the object image to be annotated; the visual characteristic evaluation data are determined based on low-level characteristics extracted from the object image to be annotated, and the low-level characteristics are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated.
The to-be-annotated object image tag determination module 1530 is configured to perform image quality evaluation according to the visual feature evaluation data and the object feature evaluation data of the to-be-annotated object image, so as to obtain evaluation data of the to-be-annotated object image, and use the evaluation data as a tag of the to-be-annotated object image.
For specific limitations of the tag determination apparatus, reference may be made to the above limitations of the tag determination method, and no further description is given here.
The above-mentioned image quality evaluation device, label generation device, face quality evaluation model training device, face recognition device, and label determination device may be all or partially implemented by software, hardware, and combinations thereof. The above modules may be embedded in hardware or independent of a processor in the electronic device, or may be stored in software in a memory in the electronic device, so that the processor may call and execute operations corresponding to the above modules.
The present disclosure further provides a computer device, as shown with reference to fig. 16, where the computer device 1600 includes a memory 1610 and a processor 1620, where the memory 1610 stores a first computer program 1630, and where the processor 1620 executes the first computer program 1630 to implement the image quality assessment method according to any one of the foregoing embodiments, and/or the label generation method, and/or the face quality assessment model training method, and/or the face recognition method, and/or the steps of the label determination method.
The present embodiment further provides a chip, referring to fig. 17, where the chip 1700 includes a storage unit 1710 and a processing unit 1720, the storage unit 1710 stores a second computer program 1730, and the processing unit 1720 implements the image quality assessment method of any one of the foregoing embodiments and/or the face recognition method and/or the steps of the tag determination method when executing the second computer program 1730.
The present specification further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image quality assessment method of any of the preceding embodiments, and/or the label generation method, and/or the face quality assessment model training method, and/or the face recognition method, and/or the steps of the label determination method.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein may be considered, for example, as an ordered listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program can be electronically captured via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example, two or three, unless specifically defined otherwise.
In the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly: a connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediary, internal to two elements, or an interaction between two elements. The specific meaning of these terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that changes, modifications, substitutions, and variations may be made to the above embodiments by those of ordinary skill in the art within the scope of the invention.

Claims (26)

1. An image quality assessment method, the method comprising:
acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be evaluated;
determining visual feature evaluation data and face feature evaluation data of the face image to be evaluated; wherein the visual feature evaluation data is determined based on low-level features extracted from the face image to be evaluated, the low-level features being used to describe visual layer semantics of the face image to be evaluated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be evaluated;
and carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain face image evaluation data of the face image to be evaluated.
2. The method of claim 1, wherein the manner of determining the visual feature evaluation data comprises:
obtaining the representative low-level features of the target object according to the low-level features of the plurality of face images to be evaluated;
and performing similarity calculation based on the low-level features of the face image to be evaluated and the representative low-level features to obtain the visual feature evaluation data of the face image to be evaluated.
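By way of illustration only (the application contains no code, and claim 2 fixes neither the pooling nor the metric), the following Python sketch reads claim 2 with two assumptions: the representative low-level feature is the mean of the per-image low-level features, and the similarity calculation is cosine similarity.

```python
import numpy as np

def visual_feature_scores(low_level_feats: np.ndarray) -> np.ndarray:
    """low_level_feats: (N, D) array holding one low-level feature vector
    per face image to be evaluated, all shot from the same target object.

    Returns one visual feature evaluation value per image: the cosine
    similarity between that image's low-level feature and the
    representative low-level feature of the target object."""
    rep = low_level_feats.mean(axis=0)       # assumed: mean as the representative feature
    rep /= np.linalg.norm(rep) + 1e-12
    norms = np.linalg.norm(low_level_feats, axis=1, keepdims=True) + 1e-12
    return (low_level_feats / norms) @ rep   # cosine similarity per image

# e.g. scores = visual_feature_scores(np.random.rand(8, 256))
```

Under this reading, an image whose low-level appearance deviates from the set consensus (blur, occlusion, unusual exposure) scores low before any identity information is consulted.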
3. The method according to claim 1, wherein the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used for describing concept layer semantics of the face image to be evaluated; the determining manner of the low-level features and the high-level features comprises the following steps:
and inputting the face image to be evaluated into a trained face recognition model to perform feature extraction, so as to obtain the low-level features and the high-level features.
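As a hedged sketch of claim 3 (the claim fixes neither the architecture nor which layer yields which feature level), the toy model below taps an early convolution for the low-level features via a forward hook and uses the final embedding as the high-level features; both layer choices are assumptions.

```python
import torch
import torch.nn as nn

class TinyFaceNet(nn.Module):
    """Stand-in for a trained face recognition model (illustrative only)."""
    def __init__(self, emb_dim: int = 128):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, stride=2, padding=1)  # early layer: visual-layer semantics
        self.conv2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)
        self.head = nn.Linear(32, emb_dim)                     # embedding: concept-layer semantics

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        return self.head(x.mean(dim=(2, 3)))                   # global average pool, then embed

def extract_features(model: TinyFaceNet, image: torch.Tensor):
    """One forward pass yields both feature levels: a forward hook grabs
    the pooled early-convolution activations as the low-level features,
    and the model output serves as the high-level features."""
    grabbed = {}
    handle = model.conv1.register_forward_hook(
        lambda mod, inp, out: grabbed.update(low=out.mean(dim=(2, 3))))
    with torch.no_grad():
        high = model(image)
    handle.remove()
    return grabbed["low"], high

# e.g. low, high = extract_features(TinyFaceNet().eval(), torch.rand(1, 3, 112, 112))
```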
4. The method according to claim 3, wherein the face feature evaluation data comprises first feature evaluation data; the high-level features comprise a plurality of first high-level features extracted by the face recognition model in different random discarding modes; the determining manner of the face feature evaluation data comprises the following steps:
performing similarity calculation according to the plurality of first high-level features to obtain first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness condition of the high-level features of the face image to be evaluated.
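One plausible concrete form of claim 4, sketched below with the assumptions made explicit: the "different random discarding modes" are realized as repeated stochastic passes through a dropout layer, and the first feature evaluation data is the mean pairwise cosine similarity of the resulting embeddings, so a stable embedding under random discard indicates robust high-level features.

```python
import torch
import torch.nn as nn

class DropoutEmbedder(nn.Module):
    """Toy embedding network with an internal dropout layer, standing in
    for a face recognition model that supports a random-discard mode."""
    def __init__(self, emb_dim: int = 128, p: float = 0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.drop = nn.Dropout(p)          # the random discard (assumed placement)
        self.head = nn.Linear(16, emb_dim)

    def forward(self, x):
        return self.head(self.drop(self.features(x)))

def first_feature_evaluation(model: nn.Module, image: torch.Tensor,
                             n_passes: int = 10) -> float:
    """Runs the model several times with dropout active, so each pass
    discards a different random subset of activations, then scores how
    consistent the resulting first high-level features are."""
    model.train()                          # keep dropout stochastic per pass
    with torch.no_grad():
        embs = torch.stack([model(image).squeeze(0) for _ in range(n_passes)])
    embs = nn.functional.normalize(embs, dim=1)
    sims = embs @ embs.T                   # pairwise cosine similarities
    mask = ~torch.eye(n_passes, dtype=torch.bool)
    return sims[mask].mean().item()        # high value = robust high-level features
```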
5. The method according to claim 3 or 4, wherein the face image to be evaluated is a specified face image of the plurality of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data comprises second feature evaluation data; the determining manner of the face feature evaluation data comprises the following steps:
performing similarity calculation according to the second high-level features of the specified face image and the second high-level features of other face images to obtain similarity data between the specified face image and the other face images; the other face images are the face images to be evaluated other than the specified face image;
obtaining the second feature evaluation data of the specified face image according to the similarity data; the second feature evaluation data is used for describing the identification condition of the specified face image.
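Claim 5 names the inputs but not the aggregation; the sketch below assumes cosine similarity and a mean over the other images of the same object, so a higher value means the specified face image is more consistently identifiable as that object.

```python
import numpy as np

def second_feature_evaluation(high_feats: np.ndarray, idx: int) -> float:
    """high_feats: (N, D) second high-level features of the N face images
    of one target object; idx selects the specified face image."""
    f = high_feats / (np.linalg.norm(high_feats, axis=1, keepdims=True) + 1e-12)
    sims = f @ f[idx]                 # similarity data w.r.t. the specified image
    others = np.delete(sims, idx)     # drop the self-similarity entry
    return float(others.mean())       # assumed aggregation: mean similarity
```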
6. The method of claim 1, wherein the face feature evaluation data comprises first feature evaluation data and second feature evaluation data; the step of carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain the face image evaluation data of the face image to be evaluated comprises the following steps:
and carrying out face image quality evaluation according to the visual feature evaluation data, the first feature evaluation data and the second feature evaluation data to obtain the face image evaluation data.
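The fusion rule in claim 6 is left open; a weighted average is one plausible choice, sketched here with placeholder equal weights (the weights are pure assumptions).

```python
def face_image_evaluation(visual: float, first: float, second: float,
                          weights: tuple = (1.0, 1.0, 1.0)) -> float:
    """Fuses the visual, first, and second feature evaluation values
    into the face image evaluation data of one image."""
    w_v, w_1, w_2 = weights
    return (w_v * visual + w_1 * first + w_2 * second) / (w_v + w_1 + w_2)
```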
7. A label generation method, the method comprising:
acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be annotated;
determining visual feature evaluation data and face feature evaluation data of the face image to be annotated; the visual feature evaluation data are determined based on low-level features extracted from the face image to be annotated, and the low-level features are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be annotated;
performing face image quality evaluation based on the visual feature evaluation data and the face feature evaluation data to obtain face image evaluation data of the face image to be annotated;
and taking the face image evaluation data as a label of the face image to be annotated.
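Putting claims 1 to 6 together for the label generation flow, the sketch below is illustrative only: the equal-weight fusion repeats the assumption made after claim 6, and the per-image evaluation data is returned directly as the training label of each face image to be annotated.

```python
import numpy as np

def generate_quality_labels(low_feats: np.ndarray,
                            face_eval: np.ndarray) -> np.ndarray:
    """low_feats: (N, D) low-level features of the N images to annotate;
    face_eval: (N,) face feature evaluation data (as in claims 4-5).

    Computes the visual feature evaluation data as in claim 2 and fuses
    it with the face feature evaluation data; the result is used as the
    label of each image."""
    rep = low_feats.mean(axis=0)
    rep /= np.linalg.norm(rep) + 1e-12
    norms = np.linalg.norm(low_feats, axis=1, keepdims=True) + 1e-12
    visual_eval = (low_feats / norms) @ rep
    return 0.5 * visual_eval + 0.5 * face_eval   # assumed equal-weight fusion
```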
8. A face quality assessment model training method, the method comprising:
constructing a first training sample set; the first training sample set comprises a plurality of first face image samples, and the first face image samples are provided with labels obtained based on the label generation method of claim 7;
and training an initial evaluation model by using the first training sample set to obtain the face quality evaluation model.
9. The method of claim 8, wherein the initial evaluation model is obtained by transfer from a trained face recognition student model; the face recognition student model corresponds to a face recognition teacher model; the method further comprises the steps of:
constructing a second training sample set; the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels;
inputting the second face image sample into the face recognition teacher model to carry out face recognition to obtain a first face recognition feature and a first prediction category;
inputting the second face image sample into the face recognition student model to carry out face recognition to obtain a second face recognition feature and a second prediction category;
determining loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category labels;
and updating parameters of the face recognition student model based on the loss data of the face recognition student model until a model training stopping condition is met.
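Claim 9 names the five inputs to the loss but not its form. A common teacher-student reading, sketched below under explicit assumptions, combines hard cross-entropy against the object category labels, a temperature-softened KL term between the two prediction categories, and an MSE term between the two recognition features; T, alpha, and beta are placeholder hyperparameters, not values from the application.

```python
import torch
import torch.nn.functional as F

def student_distillation_loss(t_feat, t_logits, s_feat, s_logits, labels,
                              T: float = 4.0, alpha: float = 0.5,
                              beta: float = 0.5) -> torch.Tensor:
    """t_*: teacher recognition feature / category logits,
    s_*: student recognition feature / category logits,
    labels: object category labels of the second face image samples."""
    hard = F.cross_entropy(s_logits, labels)              # student vs. ground-truth categories
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * (T * T)      # student vs. teacher categories
    feat = F.mse_loss(s_feat, t_feat.detach())            # student vs. teacher features
    return hard + alpha * soft + beta * feat
```

The backward pass on this scalar then drives the parameter update of the student model only; the teacher stays frozen via `detach()`.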
10. An image quality assessment method, the method comprising:
acquiring a face image to be evaluated;
inputting the face image to be evaluated into a face quality evaluation model trained by the method of claim 8 or 9, to obtain quality evaluation data of the face image to be evaluated.
11. A face recognition method, the method comprising:
acquiring a face image set to be identified, which is obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified;
inputting the face image to be identified into a face quality evaluation model trained by the method of claim 8 or 9 to obtain quality evaluation data of the face image to be identified;
determining a target face image with quality meeting requirements in the face image set to be identified according to the quality evaluation data of the face image to be identified;
and carrying out face recognition on the target object based on the target face image to obtain a face recognition result.
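Claim 11 amounts to quality-gated recognition. The sketch below assumes a scalar-scoring quality model, a fixed threshold, and a best-single-frame fallback; none of these choices are fixed by the claim.

```python
import numpy as np

def select_and_recognize(images: list, quality_model, recognizer,
                         threshold: float = 0.6) -> list:
    """Scores every captured frame with the face quality evaluation
    model, keeps the frames whose quality meets the requirement, and
    runs face recognition only on those target face images."""
    scores = np.array([quality_model(img) for img in images])
    keep = np.flatnonzero(scores >= threshold)
    if keep.size == 0:
        keep = np.array([scores.argmax()])   # assumed fallback: best single frame
    return [recognizer(images[i]) for i in keep]
```

Gating on quality first keeps blurred or occluded frames from ever reaching the recognizer, which is the practical payoff of the evaluation pipeline.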
12. A label determination method, the method comprising:
acquiring an object image set obtained by shooting a specified object; the object image set comprises a plurality of object images to be annotated;
determining visual feature evaluation data and object feature evaluation data of the object image to be annotated; the visual feature evaluation data are determined based on low-level features extracted from the object image to be annotated, wherein the low-level features are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated;
and performing image quality evaluation according to the visual feature evaluation data and the object feature evaluation data of the object image to be annotated to obtain evaluation data of the object image to be annotated, the evaluation data being used as a label of the object image to be annotated.
13. An image quality evaluation apparatus, characterized in that the apparatus comprises:
the first face image set acquisition module is used for acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be evaluated;
the first feature evaluation data determining module is used for determining visual feature evaluation data and face feature evaluation data of the face image to be evaluated; wherein the visual feature evaluation data is determined based on low-level features extracted from the face image to be evaluated, the low-level features being used to describe visual layer semantics of the face image to be evaluated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be evaluated;
the first image evaluation data acquisition module is used for carrying out face image quality evaluation according to the visual feature evaluation data and the face feature evaluation data of the face image to be evaluated to obtain the face image evaluation data of the face image to be evaluated.
14. The apparatus according to claim 13, wherein the first feature evaluation data determining module is further configured to obtain a representative low-level feature of the target object according to the low-level features of the plurality of face images to be evaluated; and performing similarity calculation based on the low-level features of the face image to be evaluated and the representative low-level features to obtain the visual feature evaluation data of the face image to be evaluated.
15. The apparatus according to claim 13, wherein the face feature evaluation data is determined based on high-level features extracted from the face image to be evaluated; the high-level features are used for describing concept layer semantics of the face image to be evaluated;
the first feature evaluation data determining module is further configured to input the face image to be evaluated to a trained face recognition model for feature extraction, so as to obtain the low-level features and the high-level features.
16. The apparatus of claim 15, wherein the face feature assessment data comprises first feature assessment data; the high-level features comprise a plurality of first high-level features extracted by the face recognition model in different random discarding modes;
the first feature evaluation data determining module is further configured to perform similarity calculation according to the plurality of first high-level features to obtain the first feature evaluation data of the face image to be evaluated; the first feature evaluation data is used for describing the robustness condition of the high-level features of the face image to be evaluated.
17. The apparatus according to claim 15 or 16, wherein the face image to be evaluated is a specified face image of the plurality of face images to be evaluated; the high-level features comprise second high-level features extracted by the face recognition model; the face feature evaluation data comprises second feature evaluation data;
the first feature evaluation data determining module is further configured to perform similarity calculation according to the second high-level features of the specified face image and the second high-level features of other face images, so as to obtain similarity data between the specified face image and the other face images; the other face images are the face images to be evaluated other than the specified face image; and obtain the second feature evaluation data of the specified face image according to the similarity data; the second feature evaluation data is used for describing the identification condition of the specified face image.
18. A label generation apparatus, the apparatus comprising:
the second face image set acquisition module is used for acquiring a face image set obtained by shooting the same target object; the face image set comprises a plurality of face images to be annotated;
the second feature evaluation data determining module is used for determining visual feature evaluation data and face feature evaluation data of the face image to be annotated; the visual feature evaluation data are determined based on low-level features extracted from the face image to be annotated, and the low-level features are used for describing visual layer semantics of the face image to be annotated; the face feature evaluation data are used for describing the quality condition of concept layer semantics of the face image to be annotated;
the second image evaluation data acquisition module is used for carrying out face image quality evaluation based on the visual feature evaluation data and the face feature evaluation data to obtain face image evaluation data of the face image to be annotated;
and the first face image label acquisition module is used for taking the face image evaluation data as the label of the face image to be annotated.
19. A face quality assessment model training apparatus, the apparatus comprising:
a first sample set construction module for constructing a first training sample set; the first training sample set comprises a plurality of first face image samples, and the first face image samples are provided with labels obtained based on the label generation method of claim 7;
and the initial evaluation model training module is used for training the initial evaluation model by using the first training sample set to obtain the face quality evaluation model.
20. The apparatus of claim 19, wherein the initial evaluation model is obtained by transfer from a trained face recognition student model; the face recognition student model corresponds to a face recognition teacher model; the apparatus further comprises:
a second sample set construction module for constructing a second training sample set; the second training sample set comprises a plurality of second face image samples, and the second face image samples are provided with object category labels;
the first face recognition module is used for inputting the second face image sample into the face recognition teacher model to carry out face recognition to obtain a first face recognition feature and a first prediction category;
the second face recognition module is used for inputting the second face image sample into the face recognition student model to carry out face recognition to obtain a second face recognition feature and a second prediction category;
a loss data determining module configured to determine loss data of the face recognition student model based on the first face recognition feature, the first prediction category, the second face recognition feature, the second prediction category, and the object category label;
and the model parameter updating module is used for updating parameters of the face recognition student model based on the loss data of the face recognition student model until the model training stopping condition is met.
21. An image quality evaluation apparatus, characterized in that the apparatus comprises:
the face image acquisition module is used for acquiring a face image to be evaluated;
the first quality evaluation data acquisition module is used for inputting the face image to be evaluated into a face quality evaluation model trained by the method of claim 8 or 9 to obtain quality evaluation data of the face image to be evaluated.
22. A face recognition apparatus, the apparatus comprising:
the third face image set acquisition module is used for acquiring a face image set to be identified, which is obtained by shooting a target object; the face image set to be identified comprises a plurality of face images to be identified;
the second quality evaluation data acquisition module is used for inputting the face image to be identified into a face quality evaluation model trained by the method of claim 8 or 9 to obtain quality evaluation data of the face image to be identified;
the target face image determining module is used for determining a target face image with quality meeting the requirement in the face image set to be identified according to the quality evaluation data of the face image to be identified;
and the face recognition result acquisition module is used for carrying out face recognition on the target object based on the target face image to obtain a face recognition result.
23. A label determination apparatus, the apparatus comprising:
the object image set acquisition module is used for acquiring an object image set obtained by shooting a specified object; the object image set comprises a plurality of object images to be annotated;
the third feature evaluation data determining module is used for determining visual feature evaluation data and object feature evaluation data of the object image to be annotated; the visual feature evaluation data are determined based on low-level features extracted from the object image to be annotated, wherein the low-level features are used for describing visual layer semantics of the object image to be annotated; the object feature evaluation data are used for describing the quality condition of concept layer semantics of the object image to be annotated;
and the object image label determining module is used for carrying out image quality evaluation according to the visual feature evaluation data and the object feature evaluation data of the object image to be annotated to obtain the evaluation data of the object image to be annotated, the evaluation data being used as a label of the object image to be annotated.
24. A computer device comprising a memory and a processor, the memory storing a first computer program, characterized in that the processor, when executing the first computer program, implements the steps of the method of any one of claims 1 to 12.
25. A chip comprising a storage unit and a processing unit, the storage unit storing a second computer program, characterized in that the processing unit, when executing the second computer program, implements the steps of the method of any one of claims 1 to 6 and 10 to 12.
26. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 12.
CN202311133425.5A 2023-09-04 2023-09-04 Image quality evaluation, face recognition, label generation and determination methods and devices Pending CN117611516A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311133425.5A CN117611516A (en) 2023-09-04 2023-09-04 Image quality evaluation, face recognition, label generation and determination methods and devices

Publications (1)

Publication Number Publication Date
CN117611516A (en) 2024-02-27

Family

ID=89954869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311133425.5A Pending CN117611516A (en) 2023-09-04 2023-09-04 Image quality evaluation, face recognition, label generation and determination methods and devices

Country Status (1)

Country Link
CN (1) CN117611516A (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013069604A1 (en) * 2011-11-11 2013-05-16 Sony Corporation Image data transmission device, image data transmission method, and image data receiving device
CN109447976A (en) * 2018-11-01 2019-03-08 University of Electronic Science and Technology of China Medical image segmentation method and system based on artificial intelligence
WO2020155627A1 (en) * 2019-01-31 2020-08-06 Beijing Sensetime Technology Development Co., Ltd. Facial image recognition method and apparatus, electronic device, and storage medium
WO2021129642A1 (en) * 2019-12-23 2021-07-01 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method, apparatus, computer device, and storage medium
CN111325733A (en) * 2020-02-24 2020-06-23 Tsinghua Shenzhen International Graduate School Image quality evaluation method combining low-level vision and high-level vision statistical characteristics
CN111738193A (en) * 2020-06-29 2020-10-02 Hunan Goke Microelectronics Co., Ltd. Face snapshot method and face snapshot system
WO2022252929A1 (en) * 2021-05-31 2022-12-08 Tencent Technology (Shenzhen) Co., Ltd. Hierarchical segmentation method and apparatus for tissue structure in medical image, device, and medium
WO2023040156A1 (en) * 2021-09-17 2023-03-23 Ping An Technology (Shenzhen) Co., Ltd. Face image-based face quality assessment method and apparatus, device, and medium
CN115830002A (en) * 2022-12-23 2023-03-21 Bestechnic (Shanghai) Co., Ltd. Infrared image quality evaluation method and device
CN116310375A (en) * 2023-01-03 2023-06-23 Harbin University of Science and Technology Blind image quality assessment method based on visual attention mechanism
CN116095339A (en) * 2023-01-16 2023-05-09 Beijing Smartchip Microelectronics Technology Co., Ltd. Image transmission method, training method, electronic device, and readable storage medium
CN116386104A (en) * 2023-03-03 2023-07-04 Nanjing University of Science and Technology Self-supervised facial expression recognition method combining contrastive learning and masked image modeling
CN116229556A (en) * 2023-03-13 2023-06-06 Beijing Smartchip Microelectronics Technology Co., Ltd. Face recognition method and device, embedded device, and computer-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄法秀; 高翔; 吴志红; 陈虎: "CNN-based quality evaluation of face image brightness and sharpness" (基于CNN的人脸图像亮度和清晰度质量评价), Computer Engineering and Design (计算机工程与设计), no. 07, 16 July 2020 (2020-07-16) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination