CN113420699A - Face matching method and device and electronic equipment - Google Patents

Face matching method and device and electronic equipment

Info

Publication number
CN113420699A
Authority
CN
China
Prior art keywords
face
model
precision
features
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110751346.5A
Other languages
Chinese (zh)
Inventor
殷鹏伟
陈玉辉
徐斌
王春茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110751346.5A
Publication of CN113420699A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a face matching method and device and electronic equipment. The method comprises the following steps: acquiring low-precision face features and high-precision face features of a target face image, wherein the precision of the low-precision face features is lower than that of the high-precision face features; determining, for each candidate face image, whether the low-precision face features of the candidate face image are matched with the low-precision face features of the target face image, and if so, taking the candidate face image as a roughly screened face image; and determining, for each roughly screened face image, whether the high-precision face features of the roughly screened face image are matched with the high-precision face features of the target face image, and if so, determining that the roughly screened face image is matched with the target face image. The amount of computation required for face matching can thereby be reduced.

Description

Face matching method and device and electronic equipment
Technical Field
The present application relates to the field of machine vision technologies, and in particular, to a face matching method, an apparatus, and an electronic device.
Background
In order to determine the identity of a target person, the face image of the target person may be matched with the face images of a plurality of known persons, and the identity of the target person determined according to the matching result. However, the identity of the target person can be determined only if the target person is among the known persons. Therefore, to improve the possibility of determining the identity of the target person, the target person needs to be matched against the face images of as many known persons as possible.
When determining whether face images match, the feature distance between face floating-point features is calculated. Floating-point numbers have a large number of bits, so the computational complexity is high; in particular, when the target person needs to be matched against the face images of many known persons, the amount of computation required for face matching is enormous.
Therefore, how to effectively reduce the amount of computation required by face matching becomes an urgent technical problem to be solved.
Disclosure of Invention
An object of the embodiments of the present application is to provide a face matching method, a face matching device, and an electronic device, so as to reduce the amount of computation required by face matching. The specific technical scheme is as follows:
in a first aspect of an embodiment of the present application, a face matching method is provided, where the method includes:
acquiring low-precision face features and high-precision face features of a target face image, wherein the precision of the low-precision face features is lower than that of the high-precision face features;
determining, for each candidate face image, whether the low-precision face features of the candidate face image are matched with the low-precision face features of the target face image, and if so, taking the candidate face image as a roughly screened face image;
and determining, for each roughly screened face image, whether the high-precision face features of the roughly screened face image are matched with the high-precision face features of the target face image, and if so, determining that the roughly screened face image is matched with the target face image.
In a possible embodiment, the acquiring low-precision face features and high-precision face features of a target face image includes:
inputting a target face image into a face feature extraction model which is trained in advance to obtain low-precision face features and high-precision face features of the target face image output by the face feature extraction model;
the face feature extraction model is obtained by training by utilizing a sample image pair marked with a matching result in advance, and the matching result is used for indicating whether the sample images in the sample image pair are matched or not.
In a possible embodiment, the face feature extraction model is obtained by training in advance in the following way:
respectively inputting two sample images in a sample image pair marked with a matching result into an initial model, wherein the initial model comprises a main branch sub-model, a first branch sub-model and a second branch sub-model, the main branch sub-model is used for extracting image characteristics of the image input into the initial model and inputting the extracted image characteristics into the first branch sub-model and the second branch sub-model, the first branch sub-model is used for outputting high-precision face characteristics according to the input image characteristics, and the second branch sub-model is used for outputting low-precision face characteristics according to the input image characteristics;
acquiring high-precision face features of two sample images output by the first branch sub-model and low-precision face features of the two sample images output by the second branch sub-model;
determining whether the two sample images are matched or not based on the high-precision face features of the two sample images to obtain a first prediction result; determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a second prediction result;
and adjusting model parameters of the main branch sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result and the matching result, to obtain the face feature extraction model.
In a possible embodiment, the adjusting model parameters of the stem sub-model and the second branch sub-model according to the second prediction result and the matching result includes:
constructing a second loss function according to the discrete degree of the characteristic values in the low-precision face characteristics of the two sample images;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the second loss function.
In a possible embodiment, the adjusting model parameters of the stem sub-model and the second branch sub-model according to the second prediction result and the matching result includes:
inputting the low-precision face features of the two sample images into a classification network trained in advance to obtain the prediction classification results of the two sample images;
constructing a third loss function according to the prediction classification results of the two sample images and the labeling classification results of the two sample images;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the third loss function.
In a possible embodiment, the adjusting the model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result, the matching result, and the third loss function includes:
constructing a fourth loss function according to the network parameters of the classification network, wherein the fourth loss function is used for representing the regularity of the network parameters of the classification network;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result, the third loss function and the fourth loss function.
In one possible embodiment, the low-precision face features are face hash features, and the high-precision face features are face floating-point features.
In a second aspect of the embodiments of the present application, there is provided a face matching apparatus, including:
the apparatus comprises a feature acquisition module, a first screening module and a second screening module, wherein the feature acquisition module is used for acquiring low-precision face features and high-precision face features of a target face image, and the precision of the low-precision face features is lower than that of the high-precision face features;
the first screening module is used for determining, for each candidate face image, whether the low-precision face features of the candidate face image are matched with the low-precision face features of the target face image, and if so, taking the candidate face image as a roughly screened face image;
and the second screening module is used for determining, for each roughly screened face image, whether the high-precision face features of the roughly screened face image are matched with the high-precision face features of the target face image, and if so, determining that the roughly screened face image is matched with the target face image.
In a third aspect of embodiments of the present application, there is provided an electronic device, including: a processor and a memory;
the memory has stored thereon a computer program which, when executed by the processor, performs the method according to any of the above first aspects.
In a fourth aspect of an embodiment of the present application, a face matching system includes: a camera and a readable storage medium,
the camera is used for acquiring a face image;
the readable storage medium is for storing computer software instructions for use in a method as described in any one of the above.
In a fifth aspect of embodiments of the present application, there is provided a computer storage medium for storing computer software instructions for use in the method of any one of the first aspect.
The face matching method, the face matching device and the electronic equipment provided by the embodiments of the application screen the candidate face images a first time using the low-precision face features, and then screen the roughly screened images obtained by the first screening a second time using the high-precision face features, so that the face images in the candidate face images that match the target face image are accurately determined. Because the precision of the low-precision face features is lower than that of the high-precision face features, screening with the low-precision face features requires less computation, so the amount of computation required for face matching can be effectively reduced.
Of course, not all advantages described above need to be achieved at the same time in the practice of any one product or method of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a face matching method according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a training method for a face feature extraction model according to an embodiment of the present application;
fig. 3 is another schematic flow chart of a training method for a face feature extraction model according to an embodiment of the present application;
fig. 4 is another schematic flow chart of a training method for a face feature extraction model according to an embodiment of the present application;
fig. 5 is another schematic flow chart of a training method for a face feature extraction model according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a face matching apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the description herein are intended to be within the scope of the present disclosure.
Referring to fig. 1, fig. 1 is a schematic flow chart of a face matching method provided in the embodiment of the present application, and the method may include:
s101, acquiring low-precision face features and high-precision face features of the target face image.
S102, determining whether the low-precision face features of the candidate face image are matched with the low-precision face features of the target face image or not according to each candidate face image, and if so, taking the candidate face image as a rough screening face image.
S103, for each roughly screened face image, determining whether the high-precision face features of the roughly screened face image are matched with the high-precision face features of the target face image, and if so, determining that the roughly screened face image is matched with the target face image.
By adopting this embodiment, the candidate face images can be screened a first time using the low-precision face features, and the roughly screened images obtained by the first screening can then be screened a second time using the high-precision face features, so that the face images in the candidate face images that match the target face image are accurately determined. Because the precision of the low-precision face features is lower than that of the high-precision face features, screening with the low-precision face features requires less computation, so the amount of computation required for face matching can be effectively reduced.
In S101, the precision of the low-precision face features is lower than that of the high-precision face features. The precision of face features may be measured in different manners depending on the application scenario; for example, it may be measured by the number of significant digits (the more significant digits, the higher the precision of the face features) or by the number of non-zero digits (the more non-zero digits, the higher the precision).
The low-precision face features and the high-precision face features may also be different types of features depending on the application scenario. For example, in one possible embodiment, the low-precision face features may be composed of integer-type data and the high-precision face features may be composed of floating-point data; in particular, the low-precision face features may be composed of hash values (hereinafter referred to as face hash features), and the high-precision face features may be composed of floating-point data in any format (hereinafter referred to as face floating-point features).
When the low-precision face features are face hash features, whether two face hash features match can be determined through an XOR operation. Compared with checking bit by bit whether the values at each position of the two features are equal, the XOR operation requires far less computation. Therefore, when the low-precision face features are face hash features, the amount of computation needed to screen the roughly screened face images out of the candidate face images can be reduced, which further reduces the amount of computation required for face matching.
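As an illustration only (not part of the original disclosure), the following minimal Python sketch shows how such an XOR-based comparison could be computed on hash features packed into bytes; the 256-bit code length is an assumption.

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """XOR marks the bits that differ; counting them gives the Hamming distance."""
    return int(np.unpackbits(np.bitwise_xor(code_a, code_b)).sum())

# Two hypothetical 256-bit face hash features, each packed into 32 uint8 bytes.
rng = np.random.default_rng(0)
code_a = rng.integers(0, 256, size=32, dtype=np.uint8)
code_b = rng.integers(0, 256, size=32, dtype=np.uint8)
print(hamming_distance(code_a, code_b))  # a small distance indicates similar faces
```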
The low-precision face features and the high-precision face features can be obtained by different model extractions or by the same model extraction. For example, in one possible embodiment, the target face image and the plurality of candidate face images may be input to a face feature extraction model trained in advance, and low-precision face features and high-precision face features of the target face image and the plurality of candidate face images output by the face feature extraction model are obtained. The face feature extraction model is obtained by utilizing a sample image pair marked with a matching result in advance for training.
The face feature extraction model may be a neural network model obtained based on deep learning training, or may be an algorithm model obtained based on traditional machine learning training, which is not limited in this embodiment. How to train the face feature extraction model will be described in detail below, and will not be described herein again.
In S102, the low-precision face features of each candidate face image and the low-precision face features of the target face image are obtained in the same manner, and the high-precision face features of each candidate face image and the high-precision face features of the target face image are obtained in the same manner. The low-precision face features and the high-precision face features of the candidate face image may be obtained in advance, or may be obtained in the process of executing the face matching method provided by the embodiment of the present application.
For example, in a possible embodiment, assuming that the low-precision face features and the high-precision face features of the target face image are extracted by an algorithm model trained in advance, the low-precision face features and high-precision face features of each candidate face image may be extracted in advance using the same algorithm model, and the obtained features stored in a preset database. When face matching needs to be performed, the low-precision face features and high-precision face features of each candidate face image are read from the preset database.
The matching standard of the low-precision face features of the candidate face image and the low-precision face features of the target face image may be different according to different application scenarios, and for example, in one possible embodiment, the similarity between the low-precision face features of the candidate face image and the low-precision face features of the target face image may be calculated, and if the similarity is higher than a preset similarity threshold, the low-precision face features of the candidate face image and the low-precision face features of the target face image are determined to be matched.
In another possible embodiment, the similarity between the low-precision face features of each candidate face image and the low-precision face features of the target face image may be calculated as the similarity corresponding to that candidate face image. All candidate face images are then sorted in descending order of their corresponding similarity to obtain a candidate face image sequence, and the low-precision face features of the candidate face images ranked before a preset position in the sequence are determined to match the low-precision face features of the target face image.
For example, assuming that there are 100,000 candidate face images in total, the similarity between the low-precision face features of each of the 100,000 candidate face images and the low-precision face features of the target face image may be calculated to obtain the similarity corresponding to each candidate face image, and the 1000 candidate face images with the highest corresponding similarity are taken as the roughly screened face images.
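Purely for illustration (and not taken from the patent), a top-K coarse screening of this kind might be sketched as follows, reusing the packed hash codes and Hamming distance assumed above:

```python
import numpy as np

def coarse_screen(target_code: np.ndarray, candidate_codes: np.ndarray, top_k: int = 1000) -> np.ndarray:
    """Return the indices of the top_k candidates whose hash codes are closest to the target."""
    # XOR every packed candidate code against the target, then count the differing bits per row.
    diff_bits = np.unpackbits(np.bitwise_xor(candidate_codes, target_code), axis=1)
    distances = diff_bits.sum(axis=1)                 # Hamming distance per candidate
    # The smallest Hamming distances correspond to the highest similarities.
    return np.argpartition(distances, top_k)[:top_k]

# e.g. 100,000 candidates with 256-bit codes -> indices of the 1000 roughly screened images
rng = np.random.default_rng(0)
candidate_codes = rng.integers(0, 256, size=(100_000, 32), dtype=np.uint8)
target_code = rng.integers(0, 256, size=32, dtype=np.uint8)
rough_idx = coarse_screen(target_code, candidate_codes, top_k=1000)
```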
It can be understood that, if the low-precision face features of one candidate face image are matched with the low-precision face features of the target face image, the candidate face image and the target face image can be considered to have a certain probability of being the face image of the same person, that is, the candidate face image may be matched with the target face image.
On the contrary, if the low-precision face features of a candidate face image are not matched with the low-precision face features of the target face image, the candidate face image and the target face image can be regarded as face images of different persons, that is, the candidate face image is not matched with the target face image.
The similarity referred to herein is a numerical measure, and its numerical value may be positively or negatively correlated with the degree of similarity it indicates. For example, in one possible embodiment, the similarity between two low-precision face features may be the ratio of the number of dimensions with identical feature values to the total number of dimensions of the low-precision face features; in this case, the value of the similarity is positively correlated with the degree of similarity it indicates. In another possible embodiment, the similarity between two low-precision face features may be represented by the distance between them; in this case, the value of the similarity is negatively correlated with the degree of similarity it indicates, that is, the lower the value of the similarity, the higher the degree of similarity it indicates.
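As an illustrative sketch only (not from the disclosure), the ratio-of-equal-dimensions similarity mentioned above could be computed as:

```python
import numpy as np

def fraction_equal(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Ratio of dimensions with identical feature values to the total number of dimensions."""
    return float(np.mean(feat_a == feat_b))

# For these hypothetical 8-dimensional binary features, 6 of 8 dimensions agree -> 0.75.
print(fraction_equal(np.array([1, 0, 1, 1, 0, 0, 1, 0]),
                     np.array([1, 0, 1, 0, 0, 1, 1, 0])))
```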
In S103, as described above, among the candidate face images only the roughly screened face images may match the target face image, so the face images matching the target face image only need to be determined from among the roughly screened face images, and the candidate face images not selected by the coarse screening need not be considered.
Assuming that the amount of computation required to determine whether a pair of high-precision face features match is O_h and the amount required for a pair of low-precision face features is O_l, and assuming that there are N candidate face images and M roughly screened face images, the amount of computation required by the face matching method provided by the embodiment of the present application is N·O_l + M·O_h. If the face image matching the target face image were determined from the candidate face images directly using the high-precision face features, the required amount of computation would be N·O_h.
Because O_l is often much smaller than O_h, the amount of computation required by the face matching method provided by the embodiment of the present application is close to M·O_h; and because the roughly screened face images are only a part of the candidate face images, M is smaller than N. It can therefore be seen that the face matching method provided by the embodiment of the present application can effectively reduce the amount of computation required for face matching.
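Putting the two passes together, the second-stage comparison might look like the following sketch (illustrative only; the cosine-similarity metric and the sim_threshold parameter are assumptions, since the patent does not fix how the high-precision features are compared):

```python
import numpy as np

def fine_match(target_feat: np.ndarray, rough_feats: np.ndarray,
               rough_idx: np.ndarray, sim_threshold: float = 0.6) -> np.ndarray:
    """Second pass: compare floating-point features only for the roughly screened images."""
    # Cosine similarity between L2-normalised high-precision features.
    t = target_feat / np.linalg.norm(target_feat)
    r = rough_feats / np.linalg.norm(rough_feats, axis=1, keepdims=True)
    sims = r @ t
    return rough_idx[sims > sim_threshold]   # indices of candidates judged to match

# rough_idx comes from the hash-based coarse screening; all_float_feats holds the
# high-precision features of all N candidates, but only the M screened rows are compared.
# matched = fine_match(target_float_feat, all_float_feats[rough_idx], rough_idx)
```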
In order to more clearly describe the face matching method provided in the embodiment of the present application, how to train the face feature extraction model is described below, and fig. 2 may be referred to, where fig. 2 is a schematic flow diagram of the face feature extraction model training method provided in the embodiment of the present application.
S201, respectively inputting two sample images in the sample image pair marked with the matching result into the initial model.
The initial model comprises a main branch sub-model, a first branch sub-model and a second branch sub-model, wherein the main branch sub-model is used for extracting image characteristics of an image input to the initial model and respectively inputting the extracted image characteristics to the first branch sub-model and the second branch sub-model, the first branch sub-model is used for outputting high-precision face characteristics according to the input image characteristics, and the second branch sub-model is used for outputting low-precision face characteristics according to the input image characteristics.
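For illustration, a trunk with two branch heads of the kind described above might be organised as in the following PyTorch-style sketch (the layer sizes and the tanh activation are assumptions, not taken from the patent):

```python
import torch
import torch.nn as nn

class FaceFeatureModel(nn.Module):
    """Trunk sub-model plus two branch sub-models: a float head and a near-binary hash head."""
    def __init__(self, feat_dim: int = 512, hash_bits: int = 256):
        super().__init__()
        # Stand-in trunk; the patent does not fix its architecture.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.float_branch = nn.Linear(64, feat_dim)   # first branch: high-precision features
        self.hash_branch = nn.Linear(64, hash_bits)   # second branch: low-precision features

    def forward(self, x: torch.Tensor):
        shared = self.trunk(x)                        # shared image features from the trunk
        f = self.float_branch(shared)                 # floating-point face features
        h = torch.tanh(self.hash_branch(shared))      # kept near +/-1 so they binarise cleanly
        return f, h

# f, h = FaceFeatureModel()(torch.randn(2, 3, 112, 112))
```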
The model parameters of the initial model may be preset by the user according to actual experience or requirements, or may be obtained by pre-training, which is not limited in this embodiment.
The matching result is used to indicate whether the two sample images in the sample image pair match. For example, the matching result may be represented by a binarized parameter whose value is either a first value or a second value, where the first value indicates that the two sample images match and the second value indicates that they do not. For convenience of description, this parameter is denoted as s_o hereinafter, and it is assumed that the first value is 1 and the second value is 0.
S202, obtaining high-precision face features of two sample images output by the first branch sub-model and low-precision face features of two sample images output by the second branch sub-model.
S203, determining whether the two sample images are matched or not based on the high-precision face features of the two sample images to obtain a first prediction result, and determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a second prediction result.
The representation of the first prediction result and the second prediction result may differ according to the application scenario. For example, in one possible embodiment, the first prediction result may be represented as the distance between the high-precision face features of the two sample images, and the second prediction result as the distance between the low-precision face features of the two sample images. For convenience of description, the distance between the low-precision face features of the two sample images is denoted as s_h hereinafter.
And S204, adjusting model parameters of the trunk sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result and the matching result to obtain the human face feature extraction model.
The model parameters of the trunk sub-model and the first branch sub-model may be adjusted first according to the first prediction result and the matching result, and the model parameters of the trunk sub-model and the second branch sub-model then adjusted according to the second prediction result and the matching result; alternatively, the second adjustment may be performed first and the first adjustment afterwards. The two adjustments may also be performed in parallel or alternately.
The manner of adjusting the model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result and the matching result may differ according to the application scenario. For example, a loss function representing the distance or cross entropy between the second prediction result and the matching result may be constructed, and the model parameters of the trunk sub-model and the second branch sub-model adjusted along the direction in which the gradient of the loss function decreases, until the number of adjustments reaches a preset threshold or the convergence of the trunk sub-model and the second branch sub-model reaches a preset convergence threshold, at which point the adjustment of the model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result and the matching result is regarded as complete.
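As a purely illustrative sketch of such a gradient-based update when both branches are adjusted in parallel (the use of squared-distance scores and the same pairwise loss form for both branches are assumptions, not details from the patent):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, img_a, img_b, s_o):
    """One parallel update of the trunk and both branch sub-models on a labelled image pair."""
    f_a, h_a = model(img_a)                  # high/low-precision features of sample image A
    f_b, h_b = model(img_b)                  # high/low-precision features of sample image B
    s_f = (f_a - f_b).pow(2).sum(dim=1)      # distance between high-precision features
    s_h = (h_a - h_b).pow(2).sum(dim=1)      # distance between low-precision features
    # Pairwise loss of the form log(1 + e^s) + s_o * s applied to each branch's distance.
    loss = (F.softplus(s_f) + s_o * s_f).mean() + (F.softplus(s_h) + s_o * s_h).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```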
Next, how to adjust the model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result and the matching result will be described.
In a possible embodiment, a first loss function may be constructed according to the degree of difference between the second prediction result and the matching result, and the trunk sub-model and the second branch sub-model may be adjusted along the gradient-descent direction of the first loss function until a preset training end condition is met, so as to obtain the face feature extraction model. The preset training end condition may be that the convergence of the model parameters reaches a preset threshold, or that the number of adjustments reaches a preset number.
Wherein the first loss function is used to represent the difference between the second prediction result and the annotated matching result. In one possible embodiment, for a single sample image pair the first loss function may be as follows:
log(1 + e^(s_h)) + s_o · s_h
wherein s_o is the annotated matching result of the sample image pair and s_h is the distance between the low-precision face features of its two sample images.
If there are multiple sample image pairs, the first loss function may be as follows:
L1 = Σ_{i=1}^{n} [ log(1 + e^(s_h^i)) + s_o^i · s_h^i ]
wherein i is a positive integer ranging from 1 to n, n is the number of sample image pairs, s_o^i is the annotated matching result of the i-th sample image pair, and s_h^i is the distance between the low-precision face features of the two sample images in the i-th sample image pair. For convenience of description, this first loss function is denoted as L1 hereinafter.
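A minimal sketch of this first loss function, written exactly as reconstructed above (illustrative only, not from the original disclosure):

```python
import torch
import torch.nn.functional as F

def first_loss(s_h: torch.Tensor, s_o: torch.Tensor) -> torch.Tensor:
    """L1 = sum_i [ log(1 + e^(s_h_i)) + s_o_i * s_h_i ], summed over sample image pairs."""
    # softplus(x) equals log(1 + e^x) but remains numerically stable for large s_h.
    return (F.softplus(s_h) + s_o * s_h).sum()

# s_h: distances between low-precision features of n pairs; s_o: 1 for matching pairs, else 0.
# loss1 = first_loss(torch.tensor([0.3, 2.1]), torch.tensor([1.0, 0.0]))
```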
It can be understood that the feature values in the low-precision face features are binarized or approximately binarized, while the feature values of real face features are theoretically not binarized; the feature values in the low-precision face features can therefore be regarded as quantized. A feature value loses part of its information during quantization, so it is not accurate enough.
In a possible embodiment, in order to improve the accuracy of the trained face feature extraction model, the loss caused by quantization may also be taken into account when training the face feature extraction model. For example, refer to fig. 3, which is another schematic flow chart of the training method for the face feature extraction model provided in the embodiment of the present application; the method may include:
and S301, respectively inputting two sample images in the sample image pair marked with the matching result into the initial model.
The step is the same as the step S201, and reference may be made to the related description of the step S201, which is not described herein again.
S302, obtaining high-precision face features of two sample images output by the first branch sub-model and low-precision face features of two sample images output by the second branch sub-model.
The step is the same as the step S202, and reference may be made to the related description of the step S202, which is not described herein again.
S303, determining whether the two sample images are matched or not based on the high-precision face features of the two sample images to obtain a first prediction result, and determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a second prediction result.
The step is the same as the step S203, and reference may be made to the related description of the step S203, which is not described herein again.
S304, constructing a second loss function according to the discrete degree of the characteristic values in the low-precision face characteristics of the two sample images.
It can be understood that the higher the degree of dispersion of the low-precision face features, the greater the difference between the low-precision face features of different face images is considered to be; the low-precision face features therefore contain more information for distinguishing different faces, which means less information is considered to be lost in the quantization of the low-precision face features.
In one possible embodiment, the second loss function may be represented in the following form:
L2 = Σ_{i=1}^{n} | h_i − sgn(h_i) |_1
wherein h_i is the low-precision face feature of the i-th sample image, n is the number of all sample images contained in all sample image pairs, sgn(h_i) represents the sign of h_i (when an element of h_i is greater than 0, sgn takes the value 1; when it is less than 0, sgn takes the value 0), and |h_i − sgn(h_i)|_1 represents the 1-norm of h_i − sgn(h_i). For convenience of description, this second loss function is denoted as L2 hereinafter.
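Illustratively (not from the original text), this quantization penalty could be computed as:

```python
import torch

def second_loss(h: torch.Tensor) -> torch.Tensor:
    """L2 = sum_i |h_i - sgn(h_i)|_1: penalises features that drift away from binary values."""
    # As in the text above, sgn maps positive entries to 1 and the remaining entries to 0.
    sgn = (h > 0).float()
    return (h - sgn).abs().sum()

# h: low-precision face features of all sample images, shape (n, hash_bits).
# loss2 = second_loss(torch.tensor([[0.9, -0.1], [1.1, 0.2]]))
```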
S305, adjusting model parameters of the trunk sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result, the matching result and the second loss function, to obtain the face feature extraction model.
A total loss function may be constructed from the first loss function and the second loss function, and the model parameters of the initial model adjusted along the gradient-descent direction of the total loss function until a preset training end condition is reached, so as to obtain the face feature extraction model.
The total loss function may be obtained by linearly combining the first loss function and the second loss function. Exemplarily, the total loss function L may be represented by the following formula:
L = λ1 · L1 + λ2 · L2
wherein λ1 and λ2 are two preset coefficients, and the values of λ1 and λ2 may be the same or different.
In yet another possible embodiment, the training method of the face feature extraction model may also be as shown in fig. 4, and includes:
s401, respectively inputting two sample images in the sample image pair marked with the matching result into the initial model.
The step is the same as the step S201, and reference may be made to the related description of the step S201, which is not described herein again.
S402, obtaining high-precision face features of two sample images output by the first branch sub-model and low-precision face features of two sample images output by the second branch sub-model.
The step is the same as the step S202, and reference may be made to the related description of the step S202, which is not described herein again.
S403, determining whether the two sample images are matched or not based on the high-precision face features of the two sample images to obtain a first prediction result, and determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a second prediction result.
The step is the same as the step S203, and reference may be made to the related description of the step S203, which is not described herein again.
S404, inputting the low-precision face features of the two sample images into a classification network which is trained in advance to obtain a prediction classification result of the two sample images.
S405, constructing a third loss function according to the prediction classification results of the two sample images and the labeling classification results labeled by the two sample images.
It can be understood that, in the case of no consideration of the error of the classification network, the reason why the predicted classification result and the labeled classification result output by the classification network are different is as follows: low precision face features will lose information in the quantization process. Theoretically, if the lost information is less, the predicted classification result will be closer to the labeled classification result, and if the lost information is more, the predicted classification result will be more deviated from the labeled classification result.
Therefore, the third loss function constructed based on the difference between the prediction classification result and the labeling classification result can be used for carrying out supervised training on the second branch sub-model.
In one possible embodiment, the third loss function may be constructed from the difference between the predicted classification result and the annotated classification result of each sample image, wherein y_i denotes the annotated classification result of the i-th sample image, h_i denotes its low-precision face feature, and W^T denotes the matrix representing the classification network, so that W^T·h_i represents the predicted classification result of the i-th sample image. For convenience of description, this third loss function is denoted as L3 hereinafter.
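The exact formula of the third loss function is not reproduced here; purely as an assumed illustration, a cross-entropy form over W^T·h_i could look like this:

```python
import torch
import torch.nn.functional as F

def third_loss(h: torch.Tensor, y: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    """One plausible form: cross-entropy between the predicted result W^T h_i and the label y_i."""
    logits = h @ W          # (batch, num_classes); W has shape (hash_bits, num_classes)
    return F.cross_entropy(logits, y)

# h: low-precision features (batch, hash_bits); y: integer class labels of the sample images.
# loss3 = third_loss(h, y, W)
```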
And S406, adjusting model parameters of the trunk sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result, the matching result and the third loss function to obtain the face feature extraction model.
A total loss function may be constructed from the first loss function and the third loss function, and the model parameters of the initial model adjusted along the gradient-descent direction of the total loss function until a preset training end condition is reached, so as to obtain the face feature extraction model.
The total loss function may be obtained by linearly combining the first loss function and the third loss function. Exemplarily, the total loss function L may be represented by the following formula:
L = λ1 · L1 + λ3 · L3
wherein λ1 and λ3 are two preset coefficients, and the values of λ1 and λ3 may be the same or different.
In yet another possible embodiment, the training method of the face feature extraction model may also be as shown in fig. 5, and includes:
s501, inputting two sample images in the sample image pair marked with the matching result into the initial model respectively.
The step is the same as the step S201, and reference may be made to the related description of the step S201, which is not described herein again.
S502, obtaining high-precision face features of two sample images output by the first branch sub-model and low-precision face features of two sample images output by the second branch sub-model.
The step is the same as the step S202, and reference may be made to the related description of the step S202, which is not described herein again.
S503, determining whether the two sample images are matched or not based on the high-precision face features of the two sample images to obtain a first prediction result, and determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a second prediction result.
The step is the same as the step S203, and reference may be made to the related description of the step S203, which is not described herein again.
S504, inputting the low-precision face features of the two sample images into a classification network which is trained in advance, and obtaining the prediction classification results of the two sample images.
And S505, constructing a third loss function according to the prediction classification results of the two sample images and the labeling classification results labeled by the two sample images.
S506, a fourth loss function is constructed according to the network parameters of the classification network.
It can be understood that the classification network itself tends to have a certain error, and therefore, the error will affect the third loss function, so that the fourth loss function can be constructed based on the classification network parameters to compensate the third loss function, thereby accelerating the convergence of the second branch sub-model.
In one possible embodiment, the fourth loss function may be represented in the following form:
|W|_F
wherein W is the matrix representing the classification network and |W|_F represents its Frobenius norm, which reflects the regularity of the network parameters of the classification network. For convenience of description, this fourth loss function is denoted as L4 hereinafter.
And S507, adjusting model parameters of the trunk sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the trunk sub-model and the second branch sub-model according to the second prediction result, the matching result, the third loss function and the fourth loss function to obtain the face feature extraction model.
A total loss function may be constructed from the first loss function, the third loss function and the fourth loss function, and the model parameters of the initial model adjusted along the gradient-descent direction of the total loss function until a preset training end condition is reached, so as to obtain the face feature extraction model.
The total loss function may be obtained by linearly combining the first loss function, the third loss function and the fourth loss function. Exemplarily, the total loss function L may be represented by the following formula:
L = λ1 · L1 + λ3 · L3 + λ4 · L4
wherein λ1, λ3 and λ4 are three preset coefficients, and their values may be the same or different.
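For illustration only (the lambda values are placeholders, not values given by the patent), combining the losses in this way could be sketched as:

```python
import torch

def total_loss(loss1: torch.Tensor, loss3: torch.Tensor, W: torch.Tensor,
               lam1: float = 1.0, lam3: float = 1.0, lam4: float = 1e-4) -> torch.Tensor:
    """L = lam1*L1 + lam3*L3 + lam4*|W|_F, with the lambda coefficients as free hyper-parameters."""
    loss4 = torch.linalg.norm(W, ord='fro')   # fourth loss: Frobenius norm of the classifier matrix
    return lam1 * loss1 + lam3 * loss3 + lam4 * loss4
```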
Referring to fig. 6, fig. 6 is a schematic structural diagram of a face matching device according to an embodiment of the present application, and the face matching device may include:
a feature obtaining module 601, configured to obtain low-precision face features and high-precision face features of a target face image and multiple candidate face images;
a first screening module 602, configured to determine, for each candidate face image, whether a low-precision face feature of the candidate face image matches a low-precision face feature of the target face image, and if so, take the candidate face image as a coarse screening face image;
a second screening module 603, configured to determine, for each of the coarsely screened face images, whether the high-precision face features of the coarsely screened face image match the high-precision face features of the target face image, and if so, determine that the coarsely screened face image matches the target face image.
In a possible embodiment, the feature obtaining module 601 obtaining the low-precision face features and high-precision face features of the target face image and the plurality of candidate face images includes:
inputting a target face image and a plurality of candidate face images into a face feature extraction model which is trained in advance, and obtaining low-precision face features and high-precision face features of the target face image and the candidate face images which are output by the face feature extraction model;
the face feature extraction model is obtained by training by utilizing a sample image pair marked with a matching result in advance, and the matching result is used for indicating whether the sample images in the sample image pair are matched or not.
In a possible embodiment, the apparatus further includes a model training module, configured to train a face feature extraction model in advance by:
respectively inputting two sample images in a sample image pair marked with a first matching result into a first model to obtain low-precision face features of the two sample images output by the first model;
determining whether the two sample images are matched or not based on the low-precision face features of the two sample images to obtain a first prediction result;
constructing a first loss function according to the first prediction result and the first annotation result, wherein the first loss function is used for representing the difference between the first prediction result and the first annotation result;
and adjusting the model parameters of the first model according to the first loss function to obtain a face feature extraction model.
In a possible embodiment, the adjusting, by the model training module, the model parameters of the first model according to the first loss function to obtain a face feature extraction model includes:
constructing a second loss function according to the discrete degree of the characteristic values in the low-precision face characteristics of the two sample images;
and adjusting the model parameters of the first model according to the first loss function and the second loss function to obtain a face feature extraction model.
In a possible embodiment, the adjusting, by the model training module, the model parameter of the first model according to the first loss function to obtain a face feature extraction model includes:
inputting the low-precision face features of the two sample images into a classification network trained in advance to obtain the prediction classification results of the two sample images;
constructing a third loss function according to the prediction classification results of the two sample images and the labeling classification results of the two sample images;
and adjusting the model parameters of the first model according to the first loss function and the third loss function to obtain a face feature extraction model.
In a possible embodiment, the adjusting, by the model training module, the model parameter of the first model according to the first loss function and the third loss function to obtain a face feature extraction model includes:
constructing a fourth loss function according to the network parameters of the classification network, wherein the fourth loss function is used for representing the regularity of the network parameters of the classification network;
and adjusting the model parameters of the first model according to the first loss function, the third loss function and the fourth loss function to obtain a face feature extraction model.
In a possible embodiment, the model training module is further configured to train a first model in advance by:
respectively inputting two second sample images in a second sample image pair marked with a second matching result into a second model to obtain high-precision face features of the two second sample images output by the second model;
determining whether the two second sample images are matched or not based on the high-precision face features of the two second sample images to obtain a second prediction result;
and adjusting the model parameters of the second model according to the second prediction result and the second labeling result to obtain the first model.
In one possible embodiment, the low-precision face features are face hash features, and the high-precision face features are face floating point features.
An embodiment of the application further provides an electronic device, as shown in fig. 7, including:
a memory 701 for storing a computer program;
the processor 702 is configured to implement the following steps when executing the program stored in the memory 701:
acquiring low-precision face features and high-precision face features of a target face image and a plurality of candidate face images;
determining whether the low-precision face features of the candidate face image are matched with the low-precision face features of the target face image or not aiming at each candidate face image, and if so, taking the candidate face image as a rough screening face image;
and determining whether the high-precision face features of the roughly screened face images are matched with the high-precision face features of the target face images or not aiming at each roughly screened face image, and if so, determining that the roughly screened face images are matched with the target face images.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present application, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above-mentioned face matching methods.
In another embodiment provided by the present application, there is also provided a face matching system, including: the camera is used for acquiring a face image; the readable storage medium is used for storing computer software instructions for any one of the above face matching methods.
In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the face matching methods of the above embodiments.
The face matching system may be, for example, an access control system, a parking lot charging system, a shopping mall charging system, and the like, which is not limited herein.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the embodiments of the apparatus, the electronic device, the computer-readable storage medium, and the computer program product, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (10)

1. A face matching method, characterized in that the method comprises:
acquiring low-precision face features and high-precision face features of a target face image, wherein the precision of the low-precision face features is lower than that of the high-precision face features;
for each candidate face image, determining whether the low-precision face features of the candidate face image match the low-precision face features of the target face image, and if so, taking the candidate face image as a roughly screened face image;
and for each roughly screened face image, determining whether the high-precision face features of the roughly screened face image match the high-precision face features of the target face image, and if so, determining that the roughly screened face image matches the target face image.
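A minimal sketch of the coarse-to-fine flow of claim 1, assuming (as in claim 7) that the low-precision features are binary hash codes compared by Hamming distance and the high-precision features are floating-point vectors compared by cosine similarity; the function names, data layout, and thresholds below are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    # a, b: binary (0/1) hash codes serving as low-precision face features
    return int(np.count_nonzero(a != b))

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # a, b: floating-point vectors serving as high-precision face features
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_face(target, candidates, hamming_thresh=12, cosine_thresh=0.6):
    """Coarse screen with low-precision features, then confirm with high-precision ones.

    target: dict with keys 'low' (hash code) and 'high' (float vector)
    candidates: list of dicts with the same keys plus an 'id'
    """
    # Stage 1: cheap comparison of low-precision features keeps only rough matches
    coarse = [c for c in candidates
              if hamming_distance(target['low'], c['low']) <= hamming_thresh]
    # Stage 2: costly comparison of high-precision features runs on the survivors only
    return [c['id'] for c in coarse
            if cosine_similarity(target['high'], c['high']) >= cosine_thresh]
```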
2. The method of claim 1, wherein the obtaining of the low-precision face features and the high-precision face features of the target face image comprises:
inputting the target face image into a pre-trained face feature extraction model to obtain the high-precision face features and the low-precision face features of the target face image output by the face feature extraction model;
wherein the face feature extraction model is obtained by training in advance with a sample image pair labeled with a matching result, and the matching result indicates whether the sample images in the sample image pair match.
3. The method of claim 2, wherein the face feature extraction model is trained in advance by:
respectively inputting two sample images in a sample image pair labeled with a matching result into an initial model, wherein the initial model comprises a main branch sub-model, a first branch sub-model and a second branch sub-model, the main branch sub-model is used for extracting image features of the images input into the initial model and inputting the extracted image features into the first branch sub-model and the second branch sub-model, the first branch sub-model is used for outputting high-precision face features according to the input image features, and the second branch sub-model is used for outputting low-precision face features according to the input image features;
acquiring the high-precision face features of the two sample images output by the first branch sub-model and the low-precision face features of the two sample images output by the second branch sub-model;
determining whether the two sample images match based on the high-precision face features of the two sample images to obtain a first prediction result; determining whether the two sample images match based on the low-precision face features of the two sample images to obtain a second prediction result;
and adjusting model parameters of the main branch sub-model and the first branch sub-model according to the first prediction result and the matching result, and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result and the matching result, to obtain the face feature extraction model.
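One possible realization of the initial model of claim 3, sketched in PyTorch: a shared main branch (trunk) sub-model feeding a high-precision branch and a low-precision branch, trained on sample pairs labeled with matching results. The backbone, layer sizes, contrastive-style pair loss, and all identifiers are assumptions made for illustration; the patent does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceFeatureExtractor(nn.Module):
    """Main branch sub-model shared by a high-precision and a low-precision branch."""
    def __init__(self, feat_dim=512, hash_bits=128):
        super().__init__()
        # Main branch: extracts shared image features (a real system would use a CNN backbone)
        self.trunk = nn.Sequential(nn.Flatten(), nn.Linear(3 * 112 * 112, 1024), nn.ReLU())
        # First branch: high-precision (floating-point) face features
        self.high_branch = nn.Linear(1024, feat_dim)
        # Second branch: low-precision features, squashed toward binary values by tanh
        self.low_branch = nn.Sequential(nn.Linear(1024, hash_bits), nn.Tanh())

    def forward(self, x):
        shared = self.trunk(x)
        return self.high_branch(shared), self.low_branch(shared)

def pairwise_loss(feat_a, feat_b, matched, margin=1.0):
    # Contrastive-style loss on one feature type for a labeled sample pair
    d = F.pairwise_distance(feat_a, feat_b)
    return torch.mean(matched * d.pow(2) + (1 - matched) * F.relu(margin - d).pow(2))

def training_step(model, img_a, img_b, matched, optimizer):
    high_a, low_a = model(img_a)
    high_b, low_b = model(img_b)
    # The first prediction result drives the main branch + first (high-precision) branch,
    # the second prediction result drives the main branch + second (low-precision) branch.
    loss = pairwise_loss(high_a, high_b, matched) + pairwise_loss(low_a, low_b, matched)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```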
4. The method of claim 3, wherein adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result and the matching result comprises:
constructing a second loss function according to the degree of dispersion of the feature values in the low-precision face features of the two sample images;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the second loss function.
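A sketch of one way to build the second loss function of claim 4 from the degree of dispersion of the low-precision feature values; the claim does not give the functional form, so the variance-based penalty below (which pushes tanh-bounded values toward the binary extremes) is only an assumed interpretation.

```python
import torch

def dispersion_loss(low_a: torch.Tensor, low_b: torch.Tensor) -> torch.Tensor:
    """Second loss built from the dispersion of the low-precision feature values
    of the two sample images (an assumed variance-based form)."""
    feats = torch.cat([low_a, low_b], dim=0)  # stack both samples' low-precision features
    # Per-dimension variance measures how spread out the values are; the loss
    # shrinks as values spread toward the extremes of the tanh range [-1, 1].
    return (1.0 - feats.var(dim=0)).clamp(min=0).mean()
```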
5. The method of claim 3, wherein adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result and the matching result comprises:
inputting the low-precision face features of the two sample images into a classification network trained in advance to obtain the prediction classification results of the two sample images;
constructing a third loss function according to the predicted classification results of the two sample images and the labeled classification results of the two sample images;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the third loss function.
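A sketch of the third loss function of claim 5, which measures how well a pre-trained classification network recovers identity labels from the low-precision features; the classifier shape, the number of classes, and the use of cross-entropy are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical pre-trained classification network over 128-bit low-precision features,
# predicting one of 1000 identity classes (both sizes assumed for illustration).
classifier = nn.Linear(128, 1000)
cross_entropy = nn.CrossEntropyLoss()

def classification_loss(low_a, low_b, label_a, label_b):
    """Third loss: compare the predicted classification results of the two sample
    images against their labeled classification results."""
    logits = classifier(torch.cat([low_a, low_b], dim=0))
    labels = torch.cat([label_a, label_b], dim=0)
    return cross_entropy(logits, labels)
```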
6. The method of claim 5, wherein adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the third loss function comprises:
constructing a fourth loss function according to the network parameters of the classification network, wherein the fourth loss function is used for representing the regularity of the network parameters of the classification network;
and adjusting model parameters of the main branch sub-model and the second branch sub-model according to the second prediction result, the matching result and the fourth loss function.
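A sketch of the fourth loss function of claim 6 as a regularity term over the classification network's parameters; the claim does not state its exact form, so the plain L2 penalty below is an assumption.

```python
import torch
import torch.nn as nn

def weight_regularity_loss(classification_net: nn.Module, weight: float = 1e-4) -> torch.Tensor:
    """Fourth loss: characterizes the regularity of the classification network's
    parameters (sketched here as an L2-norm penalty)."""
    return weight * sum(p.pow(2).sum() for p in classification_net.parameters())
```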
7. The method according to any one of claims 1 to 6, wherein the low-precision face features are face hash features, and the high-precision face features are face floating point features.
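To illustrate the distinction drawn in claim 7, the snippet below binarizes a low-precision branch output into a face hash feature and compares its storage cost with a floating-point feature; the 128-bit and 512-dimension sizes are assumed, not specified by the claim.

```python
import numpy as np

def to_hash(low_feature: np.ndarray) -> np.ndarray:
    # Binarize the low-precision branch output into a compact face hash feature
    return (low_feature > 0).astype(np.uint8)

# Illustrative sizes (assumed): a 128-bit hash packs into 16 bytes, while a
# 512-dimensional float32 feature needs 2048 bytes, which is why hash features
# are cheap to store and compare during the coarse screening stage.
hash_feature = to_hash(np.random.randn(128))
float_feature = np.random.randn(512).astype(np.float32)
print(np.packbits(hash_feature).nbytes, float_feature.nbytes)  # -> 16 2048
```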
8. A face matching apparatus, the apparatus comprising:
a feature acquisition module, used for acquiring low-precision face features and high-precision face features of a target face image, wherein the precision of the low-precision face features is lower than that of the high-precision face features;
a first screening module, used for determining, for each candidate face image, whether the low-precision face features of the candidate face image match the low-precision face features of the target face image, and if so, taking the candidate face image as a roughly screened face image;
and a second screening module, used for determining, for each roughly screened face image, whether the high-precision face features of the roughly screened face image match the high-precision face features of the target face image, and if so, determining that the roughly screened face image matches the target face image.
9. An electronic device, comprising: a processor and a memory;
the memory has stored thereon a computer program which, when executed by the processor, performs the method of any of claims 1-7.
10. A face matching system, comprising: a camera and a readable storage medium,
the camera is used for acquiring a face image;
the readable storage medium is used for storing computer software instructions for the method of any one of claims 1 to 7.
CN202110751346.5A 2021-07-02 2021-07-02 Face matching method and device and electronic equipment Pending CN113420699A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110751346.5A CN113420699A (en) 2021-07-02 2021-07-02 Face matching method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110751346.5A CN113420699A (en) 2021-07-02 2021-07-02 Face matching method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN113420699A (en) 2021-09-21

Family

ID=77720110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110751346.5A Pending CN113420699A (en) 2021-07-02 2021-07-02 Face matching method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113420699A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108920720A (en) * 2018-07-30 2018-11-30 电子科技大学 The large-scale image search method accelerated based on depth Hash and GPU
CN110069985A (en) * 2019-03-12 2019-07-30 北京三快在线科技有限公司 Aiming spot detection method based on image, device, electronic equipment
CN109993102A (en) * 2019-03-28 2019-07-09 北京达佳互联信息技术有限公司 Similar face retrieval method, apparatus and storage medium
CN111523405A (en) * 2020-04-08 2020-08-11 绍兴埃瓦科技有限公司 Face recognition method and system and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Aimin: "Research on Application Technology Based on Fuzzy Information" (《基于模糊信息的应用技术研究》), 31 October 2016, pages 398-400 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114067477A (en) * 2021-11-04 2022-02-18 极视角(上海)科技有限公司 Mask recognition entrance guard for AI image recognition and use method

Similar Documents

Publication Publication Date Title
CN110544155B (en) User credit score acquisition method, acquisition device, server and storage medium
CN109583332B (en) Face recognition method, face recognition system, medium, and electronic device
CN111078639B (en) Data standardization method and device and electronic equipment
CN115311730B (en) Face key point detection method and system and electronic equipment
CN115082920B (en) Deep learning model training method, image processing method and device
CN110489423B (en) Information extraction method and device, storage medium and electronic equipment
CN111626340B (en) Classification method, device, terminal and computer storage medium
US11403875B2 (en) Processing method of learning face recognition by artificial intelligence module
CN110825894A (en) Data index establishing method, data index retrieving method, data index establishing device, data index retrieving device, data index establishing equipment and storage medium
CN112307820A (en) Text recognition method, device, equipment and computer readable medium
CN111159481B (en) Edge prediction method and device for graph data and terminal equipment
EP4089568A1 (en) Cascade pooling for natural language document processing
CN110717407A (en) Human face recognition method, device and storage medium based on lip language password
CN113420699A (en) Face matching method and device and electronic equipment
CN113435531A (en) Zero sample image classification method and system, electronic equipment and storage medium
CN117237757A (en) Face recognition model training method and device, electronic equipment and medium
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN112800813B (en) Target identification method and device
CN114328884B (en) Image-text duplication removing method and device
CN113128278A (en) Image identification method and device
WO2022237065A1 (en) Classification model training method, video classification method, and related device
CN114565797A (en) Neural network training and image classification method and device for classification
CN114117037A (en) Intention recognition method, device, equipment and storage medium
CN113066486A (en) Data identification method and device, electronic equipment and computer readable storage medium
CN112149566A (en) Image processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination