CN111626193A - Face recognition method, face recognition device and readable storage medium - Google Patents

Info

Publication number
CN111626193A
Authority
CN
China
Prior art keywords
sample
image
person
images
face
Legal status
Pending
Application number
CN202010457116.3A
Other languages
Chinese (zh)
Inventor
王艳
张修宝
沈海峰
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010457116.3A
Publication of CN111626193A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; Face representation
    • G06V 40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06V 40/172: Classification, e.g. identification
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application provides a face recognition method, a face recognition apparatus and a readable storage medium. During model learning, the sample images are processed with face detection so that the person's face occupies a different proportion of each sample image and of each corresponding reference person image; combined with corresponding loss functions, this enables the face recognition model to focus its attention on the person's face during recognition. The person's face therefore does not need to be detected while the face recognition model is in use, and single-stage recognition of face mask wearing can be performed directly, so that whether the person in an image to be recognized wears a face mask can be determined quickly and accurately, greatly reducing detection time and increasing the speed of face-mask wearing recognition.

Description

Face recognition method, face recognition device and readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a face recognition method, a face recognition apparatus, and a readable storage medium.
Background
In certain special scenarios, the people involved are usually required to wear facial coverings such as masks for protection, and to keep them on for the whole time they are in the scene, so as to ensure their safety.
At present, a supervisor is usually assigned in these special scenarios to spot-check at regular intervals whether the relevant people are wearing their facial coverings. However, if the flow of people in a scene is large, a single supervisor cannot follow every person at all times. In that situation the number of supervisors must be increased to supervise people in real time, which consumes a great deal of manpower and material resources and also increases the supervisors' workload. How to accurately detect whether the relevant people are wearing facial coverings is therefore an urgent problem to be solved.
Disclosure of Invention
In view of the above, an object of the present application is to provide a face recognition method, a face recognition apparatus and a readable storage medium in which the face recognition model does not need to detect the person's face during use and can directly perform single-stage recognition of face mask wearing, so that whether the person in an image to be recognized wears a face mask can be determined quickly and accurately, greatly reducing detection time and increasing the speed of face-mask wearing recognition.
According to an aspect of the present application, there is provided a face recognition method including:
obtaining a plurality of sample person images and, for each sample person image, a sample category label indicating whether the face of the person in that image wears a face mask;
for each sample person image, determining a plurality of reference person images of the sample person image, wherein, among the sample person image and its corresponding reference person images, the proportion of the image occupied by the person's face differs between any two images;
training the constructed deep convolutional neural network based on the plurality of sample character images, a plurality of reference character images corresponding to each sample character image and a sample type label of each sample character image to obtain a trained face recognition model;
and inputting the acquired image of the person to be recognized into the face recognition model to obtain a face shelter recognition result of the image of the person to be recognized.
In some embodiments of the present application, the determining, for each sample personal image, a plurality of reference personal images of each sample personal image, where the size ratios of the personal faces of any two images in the sample personal image and the corresponding plurality of reference personal images in the images are different includes:
for each sample person image, determining a face position area of the person face in the sample person image;
and determining a plurality of reference person images of the sample person image based on the face position area, wherein the size proportions of the person faces of any two images in the sample person image and the corresponding plurality of reference person images in the images are different.
In some embodiments of the present application, the determining a plurality of reference person images of the sample person image based on the face position region includes:
according to the preset proportion of the human face in the image, carrying out area range expansion on the face position area of the sample human image, and obtaining a first reference human image from the sample human image;
and taking the obtained reference person image as a sample person image, performing area range expansion on the face position area in the sample person image according to the preset proportion to obtain a second reference person image, and repeating the steps until a plurality of reference person images in preset number corresponding to the sample person image are obtained.
In some embodiments of the present application, the determining a plurality of reference person images of the sample person image based on the face position region includes:
acquiring a plurality of area proportions of the face of the person in the image;
and sequentially performing region range expansion on the face position region in the sample person image according to the proportion of each region to obtain a plurality of reference person images corresponding to the sample person image.
In some embodiments of the present application, the training the constructed deep convolutional neural network based on the plurality of sample personal images, the plurality of reference personal images corresponding to each sample personal image, and the sample category label of each sample personal image to obtain the trained face recognition model includes:
inputting the plurality of sample person images and a plurality of reference person images corresponding to each sample person image into a constructed deep convolutional neural network to obtain a face shielding prediction value of each sample person image and each reference person image;
determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample class label of each sample person image;
and adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model.
In some embodiments of the present application, the determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each of the sample human images and each of the reference human images and the sample class label of each of the sample human images comprises:
determining the association loss and the mean square error of each sample person image based on the facial occlusion predicted value of each sample person image and each reference person image of the sample person image and the sample category label of each sample person image;
determining a network loss value of the deep convolutional neural network based on the determined plurality of correlation losses and the plurality of mean square errors.
In some embodiments of the present application, the mean square error of each of the sample human images is determined by:
for each sample character image, determining a first mean square error of the sample character image based on a face occlusion predicted value of the sample character image and a true value corresponding to a sample category label of the sample character image;
determining a plurality of second mean square errors of each sample human image based on the facial occlusion prediction value of each reference human image corresponding to the sample human image and the truth value corresponding to the sample class label of the sample human image;
and determining the mean square error of the sample human image based on the first mean square error and the plurality of second mean square errors.
In some embodiments of the present application, the loss of association for each of the sample person images is determined by:
for each sample person image, determining a plurality of associated image groups from a plurality of images including the sample person image and a plurality of corresponding reference person images, wherein the associated image groups include a first image and a second image, and the size proportion of the face of the person in the first image is smaller than the size proportion of the face of the person in the second image;
determining an association group loss difference of each association group of images based on a face occlusion predicted value corresponding to a first image, a face occlusion predicted value corresponding to a second image and a real value corresponding to a sample class label of the sample character image in each association group of images;
determining an association loss for each of the sample person images based on a plurality of the association group loss differences.
In some embodiments of the present application, the adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model includes:
determining whether the network loss value is greater than a preset loss threshold;
if the network loss value is larger than a preset loss threshold value, adjusting network parameters of the deep convolutional neural network until the network loss value is smaller than or equal to the preset loss threshold value, and determining that the deep convolutional neural network is completely trained;
and determining the trained deep convolutional neural network as a face recognition model.
According to a second aspect of the present application, there is provided a face recognition apparatus comprising:
the image acquisition module is used for obtaining a plurality of sample person images and a sample category label indicating whether the face of the person in each sample person image wears a face mask;
the image determining module is used for determining a plurality of reference character images of the sample character image aiming at each sample character image, wherein the size proportions of the character faces of any two images in the sample character image and the corresponding reference character images in the images are different;
the model training module is used for training the constructed deep convolution neural network based on the plurality of sample character images, the plurality of reference character images corresponding to the sample character images and the sample type labels of the sample character images to obtain a trained face recognition model;
and the image identification module is used for inputting the acquired image of the person to be identified to the face identification model to obtain the face shelter identification result of the image of the person to be identified.
In some embodiments of the present application, the image determining module is configured to determine, for each sample person image, a plurality of reference person images of each sample person image, where the person faces of any two images of the sample person image and the corresponding plurality of reference person images occupy different size proportions in the image, and the image determining module is configured to:
for each sample person image, determining a face position area of the person face in the sample person image;
and determining a plurality of reference person images of the sample person image based on the face position area, wherein the size proportions of the person faces of any two images in the sample person image and the corresponding plurality of reference person images in the images are different.
In some embodiments of the present application, the image determination module, when configured to determine a plurality of reference person images of the sample person image based on the face position region, is configured to:
according to the preset proportion of the human face in the image, carrying out area range expansion on the face position area of the sample human image, and obtaining a first reference human image from the sample human image;
and taking the obtained reference person image as a sample person image, performing area range expansion on the face position area in the sample person image according to the preset proportion to obtain a second reference person image, and repeating the steps until a plurality of reference person images in preset number corresponding to the sample person image are obtained.
In some embodiments of the present application, the image determination module, when configured to determine a plurality of reference person images of the sample person image based on the face position region, is configured to:
acquiring a plurality of area proportions of the face of the person in the image;
and sequentially performing region range expansion on the face position region in the sample person image according to the proportion of each region to obtain a plurality of reference person images corresponding to the sample person image.
In some embodiments of the present application, when the model training module is configured to train the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model, the model training module is configured to:
inputting the plurality of sample person images and a plurality of reference person images corresponding to each sample person image into a constructed deep convolutional neural network to obtain a face shielding prediction value of each sample person image and each reference person image;
determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample class label of each sample person image;
and adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model.
In some embodiments of the present application, the model training module, when configured to determine the network loss value of the deep convolutional neural network based on the facial occlusion prediction values of each of the sample personal images and each of the reference personal images, and the sample class label of each of the sample personal images, is configured to:
determining the association loss and the mean square error of each sample person image based on the facial occlusion predicted value of each sample person image and each reference person image of the sample person image and the sample category label of each sample person image;
determining a network loss value of the deep convolutional neural network based on the determined plurality of correlation losses and the plurality of mean square errors.
In some embodiments of the present application, the model training module determines the mean square error of each of the sample human images by:
for each sample character image, determining a first mean square error of the sample character image based on a face occlusion predicted value of the sample character image and a true value corresponding to a sample category label of the sample character image;
determining a plurality of second mean square errors of each sample human image based on the facial occlusion prediction value of each reference human image corresponding to the sample human image and the truth value corresponding to the sample class label of the sample human image;
and determining the mean square error of the sample human image based on the first mean square error and the plurality of second mean square errors.
In some embodiments of the present application, the model training module determines the loss of correlation for each of the sample human images by:
for each sample person image, determining a plurality of associated image groups from a plurality of images including the sample person image and a plurality of corresponding reference person images, wherein the associated image groups include a first image and a second image, and the size proportion of the face of the person in the first image is smaller than the size proportion of the face of the person in the second image;
determining an association group loss difference of each association group of images based on a face occlusion predicted value corresponding to a first image, a face occlusion predicted value corresponding to a second image and a real value corresponding to a sample class label of the sample character image in each association group of images;
determining an association loss for each of the sample person images based on a plurality of the association group loss differences.
In some embodiments of the present application, when the model training module is configured to adjust the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model, the model training module is configured to:
determining whether the network loss value is greater than a preset loss threshold value;
if the network loss value is larger than a preset loss threshold value, adjusting network parameters of the deep convolutional neural network until the network loss value is smaller than or equal to the preset loss threshold value, and determining that the deep convolutional neural network is completely trained;
and determining the trained deep convolutional neural network as a face recognition model.
An embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the face recognition method as described above.
Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the face recognition method as described above.
The face recognition method, the face recognition device and the readable storage medium provided by the embodiment of the application acquire a plurality of sample person images and sample type labels of whether the face of a person in each sample person image wears a face mask or not; determining a plurality of reference character images of the sample character image aiming at each sample character image, wherein the size proportions of the character faces of any two images in the sample character image and the corresponding plurality of reference character images in the images are different; training the constructed deep convolutional neural network based on the plurality of sample character images, a plurality of reference character images corresponding to each sample character image and a sample type label of each sample character image to obtain a trained face recognition model; and inputting the acquired image of the person to be recognized into the face recognition model to obtain a face shelter recognition result of the image of the person to be recognized.
Thus, the present application obtains a plurality of sample person images for training and determines a plurality of reference person images corresponding to each sample person image; it trains a deep convolutional neural network on the sample person images and their corresponding reference person images to obtain a face recognition model, recognizes a collected image of a person to be recognized with that model, and determines the face mask recognition result for that image. Because the proportion of the person's face differs between each sample image and each of its reference person images, the face recognition model learns to focus its attention on the person's face during recognition. Consequently, the model does not need to detect the person's face during use and can directly perform single-stage recognition of face mask wearing, so that whether the person in the image to be recognized wears a face mask can be determined quickly and accurately, greatly reducing detection time and increasing the speed of face-mask wearing recognition.
In addition, the method and the device can also adjust corresponding parameters of the face recognition model by combining corresponding loss functions in the face recognition model training process, and can further improve the accuracy and recognition efficiency of the recognition result of the face recognition model.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic diagram of an architecture of an image recognition system according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a face recognition method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a sample image of a person;
FIG. 4 is a schematic diagram of a first reference image;
FIG. 5 is a diagram illustrating a second reference image;
FIG. 6 is a schematic diagram of a third reference image;
fig. 7 is a flowchart of a face recognition method according to another embodiment of the present application;
fig. 8 is a schematic structural diagram of a face recognition apparatus according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. Every other embodiment that can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present application falls within the protection scope of the present application.
To enable those skilled in the art to use the present disclosure, the following embodiments are presented in conjunction with a specific application scenario, "person face recognition". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of human face recognition, it should be understood that this is merely one exemplary embodiment.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
Fig. 1 is a schematic architecture diagram of an image recognition system according to an embodiment of the present disclosure. The image recognition system comprises an image storage device, a face recognition device and an image acquisition device. The image storage device stores a plurality of sample person images. The face recognition device acquires these sample person images from the image storage device, determines a plurality of reference person images for each sample person image, and trains a deep convolutional neural network on each sample person image and its reference person images to obtain a face recognition model. Because the proportion of the person's face differs between each sample image and each of its reference person images, the face recognition model learns to focus its attention on the person's face during recognition. The model therefore does not need to detect the person's face during use and can directly perform single-stage recognition of face mask wearing, so that whether the person in an image to be recognized wears a face mask can be determined quickly and accurately, greatly reducing detection time and increasing the speed of face-mask wearing recognition.
It is worth noting that, at present, a supervisor is usually assigned in these special scenarios to spot-check at regular intervals whether the relevant people are wearing their facial coverings. However, if the flow of people in a scene is large, a single supervisor cannot follow every person at all times; the number of supervisors then has to be increased to supervise people in real time, which consumes a great deal of manpower and material resources and increases the supervisors' workload. How to accurately detect whether the relevant people are wearing facial coverings is therefore an urgent problem to be solved.
For example, in some working scenarios, to ensure workers' personal safety, an enterprise usually requires them to wear facial coverings such as masks and goggles throughout the working process. However, workers' workplaces may not be fixed, and a supervisor cannot follow every worker, so the workers cannot be supervised in real time and cannot be promptly given correct operating instructions when they forget to wear their facial coverings.
Based on this, the embodiments of the present application provide a face recognition method that can quickly and accurately determine whether the person in an image to be recognized wears a face mask. Workers captured in such images can thus be monitored in real time and promptly reminded to put on a face mask when they are not wearing one, or not wearing it correctly, which helps ensure their personal safety during work while reducing detection time and increasing the speed of face-mask wearing recognition.
Fig. 2 is a schematic flow chart of a face recognition method according to an embodiment of the present application. As shown in fig. 2, a face recognition method provided in an embodiment of the present application includes:
s201, obtaining a plurality of sample person images, and judging whether the face of a person in each sample person image is a sample type label of a face shelter.
In the step, before training the deep convolutional neural network, a plurality of sample human images for training the deep convolutional neural network and a sample class label of each sample human image are obtained, wherein the sample class label can indicate whether the human face in the sample human image wears a face mask.
The plurality of sample person images includes both positive sample person images, in which a face mask is worn, and negative sample person images, in which none is worn. For example, the true value of the sample category label of a positive sample person image is 1, while the true value of the sample category label of a negative sample person image is 0.
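By way of illustration only (not part of the patent disclosure), this binary labeling scheme could be represented as follows; the file names are hypothetical:

```python
# Hypothetical (image, label) pairs for the labeling scheme above:
# true value 1 = face mask worn (positive sample), 0 = not worn (negative sample).
sample_images = [
    ("positive_sample_001.jpg", 1),
    ("negative_sample_002.jpg", 0),
]
```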
S202, for each sample person image, determining a plurality of reference person images of the sample person image, wherein, among the sample person image and its corresponding reference person images, the proportion of the image occupied by the person's face differs between any two images.
In this step, face recognition processing is performed on each obtained sample person image to determine its plurality of reference person images, where the proportion of the person's face differs between any two of the sample person image and its reference person images. The sample category label of each reference person image is the same as the sample category label of its sample person image.
Illustratively, fig. 3 is a schematic diagram of a sample person image in which the person's face accounts for A% of the entire image. Face recognition is performed on the sample person image to locate the person's face, and the proportion of the face in the image is changed to obtain a first reference image, a second reference image, a third reference image, and so on, where the face accounts for B% of the first reference image (fig. 4), C% of the second reference image (fig. 5), and D% of the third reference image (fig. 6).
That is, as the size ratio occupied by the human face increases, the background information included in the reference human image gradually decreases, and the human face information gradually increases.
In this way, the face information can guide the model to focus on the person's face during training. Once the model has learned to actively locate the person's face and recognize a face mask on it, no face detection is needed in advance in practical applications: the model performs face-mask wearing recognition directly, without detecting the person's face during use, which saves detection time and increases the speed of face-mask wearing recognition.
S203, training the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image and the sample type label of each sample person image to obtain a trained face recognition model.
In the step, the obtained multiple sample character images and multiple reference character images corresponding to each sample character image are used as the input of the constructed deep convolutional neural network, the sample type labels of the sample character images are used as the output of the constructed deep convolutional neural network, and the pre-constructed deep convolutional neural network is trained to obtain the trained face recognition model.
And S204, inputting the acquired image of the person to be recognized into the face recognition model to obtain a face shelter recognition result of the image of the person to be recognized.
In this step, an image of the person to be recognized is collected and input into the trained face recognition model, which recognizes it and determines the face mask recognition result for the person in the image, i.e., whether the person in the image to be recognized wears a face mask.
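A minimal single-stage inference sketch follows, assuming a PyTorch model with a single sigmoid output; the input size, normalization, and decision threshold are assumptions, as the patent does not specify them:

```python
import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing; the patent does not specify input size or normalization.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def recognize_face_mask(model: torch.nn.Module, image_path: str,
                        threshold: float = 0.5) -> bool:
    """Single-stage recognition: no separate face-detection step is run."""
    image = Image.open(image_path).convert("RGB")
    x = preprocess(image).unsqueeze(0)           # shape (1, 3, 224, 224)
    model.eval()
    with torch.no_grad():
        score = torch.sigmoid(model(x)).item()   # face occlusion prediction value
    return score >= threshold                    # True: face mask is worn
```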
The face recognition method provided by this embodiment of the application obtains a plurality of sample person images and a sample category label indicating whether the face of the person in each sample person image wears a face mask; determines, for each sample person image, a plurality of reference person images of the sample person image, wherein the proportion of the person's face differs between any two of the sample person image and its corresponding reference person images; trains the constructed deep convolutional neural network on the sample person images, the reference person images corresponding to each sample person image, and the sample category label of each sample person image to obtain a trained face recognition model; and inputs the collected image of the person to be recognized into the face recognition model to obtain the face mask recognition result of that image.
Thus, the method obtains a plurality of sample person images for training, determines the reference person images corresponding to each sample person image, and trains a deep convolutional neural network on them to obtain a trained face recognition model, which recognizes the collected image of a person to be recognized and determines its face mask recognition result. Because the proportion of the person's face differs between each sample image and each of its reference person images, the face recognition model learns to focus its attention on the person's face during recognition. The model therefore does not need to detect the person's face during use and can directly perform single-stage recognition of face mask wearing, so that whether the person in the image to be recognized wears a face mask can be determined quickly and accurately, greatly reducing detection time and increasing the speed of face-mask wearing recognition.
Referring to fig. 7, fig. 7 is a flowchart of a face recognition method according to another embodiment of the present application. As shown in fig. 7, a face recognition method provided in an embodiment of the present application includes:
s701, obtaining a plurality of sample person images, and judging whether the face of a person in each sample person image is a sample type label of a face shelter.
S702, for each sample person image, determining the face position area of the person's face in the sample person image.
In this step, for each sample personal image obtained, a face position area of the face of the person in the sample personal image may be determined from the sample personal image by a face recognition method.
S703, determining a plurality of reference person images of the sample person image based on the face position area, wherein the proportion of the person's face differs between any two of the sample person image and its corresponding reference person images.
In this step, after the face position area in the sample person image has been determined in step S702, a plurality of reference person images corresponding to the sample person image are determined from that area. Each reference person image contains the same person's face as the sample person image, but the proportion of the image occupied by the face differs between any two of the reference person images and the sample person image.
S704, training the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image and the sample type label of each sample person image to obtain a trained face recognition model.
S705, inputting the acquired image of the person to be identified into the face identification model to obtain a face obstruction identification result of the image of the person to be identified.
The descriptions of S701, S704, and S705 may refer to the descriptions of S201, S203, and S204, and the same technical effect can be achieved, which is not described in detail herein.
Further, step S703 includes: expanding the area of the face position area of the sample person image according to a preset proportion of the person's face in the image, and obtaining a first reference person image from the sample person image; then taking the obtained reference person image as the sample person image and expanding its face position area again according to the preset proportion to obtain a second reference person image; and repeating these steps until a preset number of reference person images corresponding to the sample person image are obtained.
In this step, referring to fig. 3, 3a denotes the face position area of the person's face in the sample person image. A preset proportion of the face in the image is obtained and, based on the determined face position area 3a, the area is expanded toward the edges of the sample person image, changing the proportion that the face position area 3a occupies in the image. This is equivalent to cropping the image within a certain range around the face position area and enlarging the cropped image, which yields the first reference person image. The first reference person image is then taken as the sample person image, and the face position area 4a within it is expanded again according to the preset proportion to obtain a second reference person image; the second reference person image is treated the same way, continuing the expansion from face position area 5a, and so on, until the preset number of reference person images corresponding to the sample person image are obtained.
Further, step S703 includes: acquiring a plurality of area proportions of the person's face in the image; and expanding the face position area in the sample person image according to each area proportion in turn to obtain the plurality of reference person images corresponding to the sample person image.
In this step, referring to fig. 3, 3a denotes the face position area of the person's face in the sample person image. A plurality of preset area proportions of the face in the image are obtained and, starting from face position area 3a, the area is expanded toward the edges of the sample person image according to each area proportion in turn, changing the proportion the face position area occupies in the image. This is equivalent to cropping the image within a certain range around the face position area and enlarging each cropped image, yielding the plurality of reference person images corresponding to each sample person image, for example a first reference image (fig. 4), a second reference image (fig. 5) and a third reference image (fig. 6).
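A sketch of this reference-image construction is given below; the target area ratios, the square crop shape, and the use of PIL are illustrative assumptions, not details taken from the patent:

```python
from PIL import Image

def expand_box(face_box, ratio, img_w, img_h):
    """Grow the face position area (left, top, right, bottom) into a square
    crop in which the face occupies roughly `ratio` of the area, clamped to
    the image borders (clamping can change the achieved ratio)."""
    l, t, r, b = face_box
    face_area = (r - l) * (b - t)
    side = (face_area / ratio) ** 0.5     # square crop with the desired face ratio
    cx, cy = (l + r) / 2.0, (t + b) / 2.0
    half = side / 2.0
    return (int(max(0, cx - half)), int(max(0, cy - half)),
            int(min(img_w, cx + half)), int(min(img_h, cy + half)))

def reference_person_images(sample: Image.Image, face_box,
                            ratios=(0.8, 0.6, 0.4)):
    """One reference person image per target face-area ratio; each crop is
    resized back to the sample image size so only the face proportion varies."""
    img_w, img_h = sample.size
    refs = []
    for ratio in ratios:
        crop = sample.crop(expand_box(face_box, ratio, img_w, img_h))
        refs.append(crop.resize(sample.size))
    return refs
```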
Further, step S704 includes: inputting the plurality of sample person images and a plurality of reference person images corresponding to each sample person image into a constructed deep convolutional neural network to obtain a face shielding prediction value of each sample person image and each reference person image; determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample class label of each sample person image; and adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model.
Specifically, the obtained sample person images and the reference person images corresponding to each sample person image are input into the constructed deep convolutional neural network as input features, while the sample category label of each sample person image is used as the output feature, and the face occlusion prediction value of each sample person image and of each of its reference person images is determined. A network loss value of the deep convolutional neural network is then calculated from the face occlusion prediction value of each sample person image, the face occlusion prediction values of its reference person images, and the sample category label of each sample person image. Finally, the network parameters of the deep convolutional neural network are adjusted based on the determined network loss value to obtain the trained face recognition model.
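Sketched below is how this forward pass over a sample batch and its reference batches might look; treating the network outputs as sigmoid scores is an assumption:

```python
import torch

def occlusion_predictions(model, sample_batch, reference_batches):
    """Run the sample person images and each set of reference person images
    through the same network to obtain face occlusion prediction values."""
    sample_preds = torch.sigmoid(model(sample_batch)).squeeze(-1)
    reference_preds = [torch.sigmoid(model(refs)).squeeze(-1)
                       for refs in reference_batches]
    return sample_preds, reference_preds
```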
Further, the determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each of the sample human images and each of the reference human images and the sample class label of each of the sample human images includes:
determining the association loss and the mean square error of each sample person image based on the facial occlusion predicted value of each sample person image and each reference person image of the sample person image and the sample category label of each sample person image; determining a network loss value of the deep convolutional neural network based on the determined plurality of correlation losses and the plurality of mean square errors.
In this step, before the network loss value of the deep convolutional neural network is calculated, the association loss and the mean square error of each sample person image are computed from the face occlusion prediction values of the sample person image and of each of its reference person images; the network loss value of the deep convolutional neural network is then determined from the resulting association losses and mean square errors.
Specifically, the network loss value of the deep convolutional neural network is calculated by the following formula:

$$L = \sum_{i=1}^{n}\left(L_{mse(i)} + L_{i}\right)$$

where $L$ denotes the network loss value, $i$ indexes the sample person images, $n$ denotes the total number of sample person images, $L_{mse(i)}$ denotes the mean square error of the $i$-th sample person image, and $L_{i}$ denotes the association loss of the $i$-th sample person image.
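Under this reconstruction, the total loss can be computed as in the following sketch; whether the patent additionally normalizes by $n$ is not recoverable from the text, so no normalization is applied here:

```python
def network_loss(mse_losses, association_losses):
    """Network loss value L: sum over the n sample person images of the
    per-image mean square error plus the per-image association loss."""
    assert len(mse_losses) == len(association_losses)
    return sum(m + a for m, a in zip(mse_losses, association_losses))
```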
Further, the mean square error of each sample person image is determined by the following steps:
for each sample character image, determining a first mean square error of the sample character image based on a face occlusion predicted value of the sample character image and a true value corresponding to a sample category label of the sample character image; determining a plurality of second mean square errors of each sample human image based on the facial occlusion prediction value of each reference human image corresponding to the sample human image and the truth value corresponding to the sample class label of the sample human image; and determining the mean square error of the sample human image based on the first mean square error and the plurality of second mean square errors.
Specifically, a first mean square error of each sample person image is determined from the face occlusion prediction value of the sample person image and the true value of its sample category label; the sample category label indicates whether the person in the sample person image wears a face mask, the true value being 1 if a face mask is worn and 0 if it is not. Meanwhile, a second mean square error is determined from the face occlusion prediction value of each reference person image of the sample person image and the same true value. Finally, the mean square error of the sample person image is computed from the first mean square error and the plurality of second mean square errors.
Specifically, the mean square error of each sample person image is calculated by the following formula:

$$L_{mse(i)} = \frac{1}{m}\sum_{j=1}^{m}\left(x_{ij} - y_{i}\right)^{2}$$

where $L_{mse(i)}$ denotes the mean square error of the $i$-th sample person image, $x_{ij}$ denotes the face occlusion prediction value of the $j$-th of the $m$ images associated with the $i$-th sample person image (the sample person image itself and its reference person images), and $y_{i}$ denotes the true value of the sample category label of the $i$-th sample person image.
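A sketch of this per-image mean square error follows; stacking the predictions for the sample image and its reference images into one tensor and averaging over them is an assumption consistent with the reading above:

```python
import torch

def mse_per_sample(predictions: torch.Tensor, true_label: float) -> torch.Tensor:
    """Mean square error L_mse(i): `predictions` holds the face occlusion
    prediction values of the i-th sample person image and its reference
    person images; `true_label` is the true value y_i of its category label."""
    target = torch.full_like(predictions, true_label)
    return torch.mean((predictions - target) ** 2)
```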
Further, a loss of correlation for each of the sample person images is determined by: for each sample person image, determining a plurality of associated image groups from a plurality of images including the sample person image and a plurality of corresponding reference person images, wherein the associated image groups include a first image and a second image, and the size proportion of the face of the person in the first image is smaller than the size proportion of the face of the person in the second image; determining an association group loss difference of each association group of images based on a face occlusion predicted value corresponding to a first image, a face occlusion predicted value corresponding to a second image and a real value corresponding to a sample class label of the sample character image in each association group of images; determining an association loss for each of the sample person images based on a plurality of the association group loss differences.
Specifically, for each sample person image, a plurality of associated image groups are determined from the sample person image and its reference person images. Each associated image group contains two images, a first image and a second image, where the proportion of the person's face in the first image is smaller than in the second image. The association group loss difference of each associated image group is determined from the face occlusion prediction value of the first image, the face occlusion prediction value of the second image, and the true value of the sample category label of the sample person image. Finally, the association loss of each sample person image is determined from the association group loss differences of its associated image groups.
Specifically, the loss of correlation of each sample person image is calculated by the following formula:
$$L_{i} = \sum \max\left(0,\; \left|x_{i2} - y_{i}\right| - \left|x_{i1} - y_{i}\right| + \alpha\right)$$

where the sum runs over the associated image groups of the $i$-th sample person image, $L_{i}$ denotes the association loss of the $i$-th sample person image, $x_{i2}$ denotes the face occlusion prediction value of the second image, $x_{i1}$ denotes the face occlusion prediction value of the first image, $y_{i}$ denotes the true value of the sample category label of the $i$-th sample person image, and $\alpha$ is a preset positive number.
In addition, the meaning of the association loss is as follows: the difference between the second image's error, |x_{i2} − y_i|, and the first image's error, |x_{i1} − y_i|, is increased by a positive number (α in this example); if the result is greater than 0, a loss is incurred.
Because the amount of face information contained in the first image and the second image increases in turn, the difference between each image's prediction and the true value of the sample category label is expected to become smaller and smaller; that is, the more face information an image contains, the more accurate the model's judgment should be, and otherwise a loss is generated during training. Through this loss mechanism, the distance between the face occlusion prediction value of the second image and the true value of the sample category label is gradually reduced as the model learns, i.e., the model's attention gradually concentrates from the whole image onto the face area, so there is no need to locate the face area through a separate face detection step.
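A sketch of the association group loss under this description follows; the hinge structure and absolute errors follow the text above, while the default margin value is an assumed placeholder:

```python
import torch

def association_group_loss(pred_small_face: torch.Tensor,
                           pred_large_face: torch.Tensor,
                           true_label: float,
                           alpha: float = 0.5) -> torch.Tensor:
    """Hinge-style loss for one associated image group: the image whose face
    occupies the larger proportion (second image) should predict closer to
    the true value than the image with the smaller proportion (first image);
    otherwise a loss is incurred. alpha is a preset positive margin."""
    err_small = (pred_small_face - true_label).abs()   # |x_i1 - y_i|
    err_large = (pred_large_face - true_label).abs()   # |x_i2 - y_i|
    return torch.clamp(err_large - err_small + alpha, min=0.0)
```

The association loss $L_i$ of a sample person image would then be the sum of these group losses over all of its associated image groups.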
Further, the adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model includes: determining whether the network loss value is greater than a preset loss threshold; if so, adjusting the network parameters of the deep convolutional neural network until the network loss value is less than or equal to the preset loss threshold, at which point training of the deep convolutional neural network is complete; and determining the trained deep convolutional neural network as the face recognition model.
Specifically, based on the calculated network loss value of the deep convolutional neural network, it is determined whether that value is greater than the preset loss threshold. If so, each network parameter of the deep convolutional neural network is adjusted until the network loss value is less than or equal to the preset loss threshold, at which point the deep convolutional neural network is deemed trained; the trained network is then determined to be the face recognition model used to obtain the face mask recognition result of an image of a person to be recognized.
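A sketch of this threshold-based stopping rule inside a training loop is shown below; the threshold, optimizer choice, learning rate, and step cap are all assumptions, not values from the patent:

```python
import torch

def train_until_threshold(model, batches, loss_fn,
                          loss_threshold: float = 0.01,
                          lr: float = 1e-3, max_steps: int = 100_000):
    """Adjust network parameters while the network loss value exceeds the
    preset loss threshold; stop once it falls to the threshold or below."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for step, (images, labels) in enumerate(batches):
        if step >= max_steps:
            break
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        if loss.item() <= loss_threshold:
            break  # training complete: use `model` as the face recognition model
    return model
```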
The face recognition method provided by the embodiments of the application obtains a plurality of sample person images and, for each sample person image, a sample category label indicating whether the person's face wears a face obstruction; determines, for each sample person image, a face position area of the person's face in the sample person image; determines a plurality of reference person images of the sample person image based on the face position area, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image; trains the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model; and inputs an acquired person image to be recognized into the face recognition model to obtain a face obstruction recognition result for that image.
In this way, a plurality of sample person images is obtained for training; a face position area of the person's face is determined in each sample person image; a plurality of reference person images is determined for each sample person image according to the face position area; and the deep convolutional neural network is trained on the sample person images and their reference person images to obtain the face recognition model. The trained model then recognizes an acquired person image to be recognized and determines its face obstruction recognition result. Because the proportion of the person's face differs between each sample image and each of its reference person images, the face recognition model learns to place its attention on the person's face during recognition. Consequently, no face detection is needed when the model is used: single-stage recognition of face obstruction wearing can be performed directly, whether the person in the image to be recognized wears a face obstruction can be determined quickly and accurately, detection time is greatly shortened, and the speed of face obstruction wearing recognition is improved.
In addition, during training of the face recognition model, the corresponding parameters of the model can be adjusted in combination with the corresponding loss functions, which further improves the accuracy of the model's recognition results.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a face recognition device according to an embodiment of the present application, where the face recognition device 800 includes:
an image obtaining module 810, configured to obtain a plurality of sample person images and, for each sample person image, a sample category label indicating whether the person's face wears a face obstruction;
an image determining module 820, configured to determine, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image;
the model training module 830 is configured to train the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model;
the image recognition module 840 is configured to input the acquired image of the person to be recognized to the face recognition model, so as to obtain a face obstruction recognition result of the image of the person to be recognized.
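A sketch of how these four modules could be wired together is shown below (plain Python; all names other than the module numbers in fig. 8 are illustrative assumptions):

```python
class FaceRecognitionDevice:
    """Mirrors the structure of the device 800 in fig. 8."""

    def __init__(self, image_obtaining, image_determining,
                 model_training, image_recognition):
        self.image_obtaining = image_obtaining      # module 810
        self.image_determining = image_determining  # module 820
        self.model_training = model_training        # module 830
        self.image_recognition = image_recognition  # module 840
        self.model = None

    def build_model(self):
        # Obtain sample person images and their sample category labels,
        # derive reference person images, and train the network.
        samples, labels = self.image_obtaining()
        references = [self.image_determining(s) for s in samples]
        self.model = self.model_training(samples, references, labels)

    def recognize(self, person_image):
        # Single-stage recognition: no separate face detection step.
        return self.image_recognition(self.model, person_image)
```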
Further, when configured to determine, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image, the image determining module 820 is configured to:
for each sample person image, determining a face position area of the person face in the sample person image;
and determining a plurality of reference person images of the sample person image based on the face position area, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image.
Further, the image determination module 820, when configured to determine a plurality of reference person images of the sample person image based on the face position area, is configured to:
according to a preset proportion of the person's face in the image, performing area range expansion on the face position area of the sample person image, and obtaining a first reference person image from the sample person image;
and taking the most recently obtained reference person image as the new base image, performing area range expansion on the face position area within it according to the preset proportion to obtain a second reference person image, and so on, until a preset number of reference person images corresponding to the sample person image are obtained.
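A minimal sketch of this iterative expansion, assuming axis-aligned boxes and a PIL-style image; the expansion ratio, count, and helper names are illustrative (and, for simplicity, each crop is taken from the original sample image using the progressively expanded box, which yields the same pixels as cropping from the previous reference image):

```python
def expand_box(box, ratio, img_w, img_h):
    """Grow a box (x1, y1, x2, y2) by `ratio` per side, clipped to the image."""
    x1, y1, x2, y2 = box
    dw = (x2 - x1) * ratio / 2
    dh = (y2 - y1) * ratio / 2
    return (max(0, x1 - dw), max(0, y1 - dh),
            min(img_w, x2 + dw), min(img_h, y2 + dh))

def reference_person_images(sample_img, face_box, ratio=0.5, count=3):
    """Crop `count` reference person images, each expanded from the last,
    so the face size proportion shrinks from crop to crop."""
    img_w, img_h = sample_img.size
    refs, box = [], face_box
    for _ in range(count):
        box = expand_box(box, ratio, img_w, img_h)
        refs.append(sample_img.crop(tuple(int(v) for v in box)))
    return refs
```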
Further, the image determination module 820, when configured to determine a plurality of reference person images of the sample person image based on the face position area, is configured to:
acquiring a plurality of area proportions of the face of the person in the image;
and sequentially performing area range expansion on the face position area in the sample person image according to each area proportion, to obtain the plurality of reference person images corresponding to the sample person image.
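A sketch of this variant, which targets each preset area proportion directly rather than chaining expansions; the proportion values and names are illustrative assumptions:

```python
def reference_images_by_proportion(sample_img, face_box,
                                   proportions=(0.8, 0.6, 0.4)):
    """One reference person image per target face-area proportion."""
    img_w, img_h = sample_img.size
    x1, y1, x2, y2 = face_box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2   # face box center
    fw, fh = x2 - x1, y2 - y1               # face box width and height
    refs = []
    for p in proportions:
        s = (1.0 / p) ** 0.5                # scale each side so face/crop area ~ p
        half_w, half_h = fw * s / 2, fh * s / 2
        box = (max(0, cx - half_w), max(0, cy - half_h),
               min(img_w, cx + half_w), min(img_h, cy + half_h))
        refs.append(sample_img.crop(tuple(int(v) for v in box)))
    return refs
```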
Further, when configured to train the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image to obtain a trained face recognition model, the model training module 830 is configured to:
inputting the plurality of sample person images and the plurality of reference person images corresponding to each sample person image into the constructed deep convolutional neural network to obtain a facial occlusion prediction value for each sample person image and each reference person image;
determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample category label of each sample person image;
and adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model.
Further, when configured to determine the network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample category label of each sample person image, the model training module 830 is configured to:
determining the association loss and the mean square error of each sample person image based on the facial occlusion prediction value of the sample person image and each of its reference person images and the sample category label of the sample person image;
determining a network loss value of the deep convolutional neural network based on the determined plurality of association losses and the plurality of mean square errors.
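As a sketch, the two kinds of per-sample losses might be combined as follows; the simple averaging and equal weighting are assumptions, since the application does not fix the combination rule here:

```python
import torch

def network_loss(association_losses, mean_square_errors):
    """Combine per-sample association losses and mean square errors
    into a single network loss value."""
    assoc = torch.stack(association_losses).mean()
    mse = torch.stack(mean_square_errors).mean()
    return assoc + mse   # equal weighting, for illustration only
```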
Further, the model training module 830 determines the mean square error of each sample person image by:
for each sample person image, determining a first mean square error of the sample person image based on the facial occlusion prediction value of the sample person image and the true value corresponding to the sample category label of the sample person image;
determining a plurality of second mean square errors of the sample person image based on the facial occlusion prediction value of each reference person image corresponding to the sample person image and the true value corresponding to the sample category label of the sample person image;
and determining the mean square error of the sample person image based on the first mean square error and the plurality of second mean square errors.
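A minimal sketch of this per-sample computation; averaging the first and second errors together is an assumption about the aggregation step:

```python
def sample_mean_square_error(pred_sample, preds_refs, y):
    """pred_sample: facial occlusion prediction value of the sample image;
    preds_refs: prediction values of its reference images;
    y: true value of the sample category label."""
    first = (pred_sample - y) ** 2                # first mean square error
    seconds = [(p - y) ** 2 for p in preds_refs]  # second mean square errors
    return (first + sum(seconds)) / (1 + len(preds_refs))
```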
Further, the model training module 830 determines the association loss of each sample person image by:
for each sample person image, determining a plurality of associated image groups from a plurality of images comprising the sample person image and its corresponding plurality of reference person images, wherein each associated image group includes a first image and a second image, and the size proportion of the person's face in the first image is smaller than that in the second image;
determining an association group loss difference of each associated image group based on the facial occlusion prediction value corresponding to the first image, the facial occlusion prediction value corresponding to the second image, and the true value corresponding to the sample category label of the sample person image;
determining the association loss of each sample person image based on the plurality of association group loss differences.
Further, when the model training module 830 is configured to adjust the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model, the model training module 830 is configured to:
determining whether the network loss value is greater than a preset loss threshold value;
if the network loss value is larger than a preset loss threshold value, adjusting network parameters of the deep convolutional neural network until the network loss value is smaller than or equal to the preset loss threshold value, and determining that the deep convolutional neural network is completely trained;
and determining the trained deep convolutional neural network as a face recognition model.
The face recognition device provided by the embodiments of the application acquires a plurality of sample person images and, for each sample person image, a sample category label indicating whether the person's face wears a face obstruction; determines, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image; trains the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model; and inputs an acquired person image to be recognized into the face recognition model to obtain a face obstruction recognition result for that image.
In this way, a plurality of sample person images is obtained for training; a plurality of reference person images is determined for each sample person image; and the deep convolutional neural network is trained on the sample person images and their reference person images to obtain the face recognition model. The trained model then recognizes the acquired person image to be recognized and determines its face obstruction recognition result. Because the proportion of the person's face differs between each sample image and each of its reference person images, the face recognition model learns to place its attention on the person's face during recognition. Consequently, no face detection is needed when the model is used: single-stage recognition of face obstruction wearing can be performed directly, whether the person in the image to be recognized wears a face obstruction can be determined quickly and accurately, detection time is greatly shortened, and the speed of face obstruction wearing recognition is improved.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 9, the electronic device 900 includes a processor 910, a memory 920, and a bus 930.
The memory 920 stores machine-readable instructions executable by the processor 910. When the electronic device 900 runs, the processor 910 communicates with the memory 920 through the bus 930, and when the machine-readable instructions are executed by the processor 910, the steps of the face recognition method in the method embodiments shown in fig. 2 and fig. 7 may be performed.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the face recognition method in the method embodiments shown in fig. 2 and fig. 7 may be executed.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A face recognition method, characterized in that the face recognition method comprises:
obtaining a plurality of sample person images and, for each sample person image, a sample category label indicating whether the face of the person in the sample person image wears a face obstruction;
determining, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image;
training the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model;
and inputting an acquired image of a person to be recognized into the face recognition model to obtain a face obstruction recognition result of the image of the person to be recognized.
2. The face recognition method according to claim 1, wherein the determining, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image, comprises:
for each sample person image, determining a face position area of the person face in the sample person image;
and determining a plurality of reference person images of the sample person image based on the face position area, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image.
3. The face recognition method according to claim 2, wherein the determining a plurality of reference person images of the sample person image based on the face position area includes:
according to a preset proportion of the person's face in the image, performing area range expansion on the face position area of the sample person image, and obtaining a first reference person image from the sample person image;
and taking the most recently obtained reference person image as the new base image, performing area range expansion on the face position area within it according to the preset proportion to obtain a second reference person image, and so on, until a preset number of reference person images corresponding to the sample person image are obtained.
4. The face recognition method according to claim 2, wherein the determining a plurality of reference person images of the sample person image based on the face position area includes:
acquiring a plurality of area proportions of the face of the person in the image;
and sequentially performing area range expansion on the face position area in the sample person image according to each area proportion, to obtain the plurality of reference person images corresponding to the sample person image.
5. The face recognition method according to claim 1, wherein the training the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image to obtain the trained face recognition model comprises:
inputting the plurality of sample person images and the plurality of reference person images corresponding to each sample person image into the constructed deep convolutional neural network to obtain a facial occlusion prediction value for each sample person image and each reference person image;
determining a network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample category label of each sample person image;
and adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model.
6. The face recognition method according to claim 5, wherein the determining the network loss value of the deep convolutional neural network based on the facial occlusion prediction value of each sample person image and each reference person image and the sample category label of each sample person image comprises:
determining the association loss and the mean square error of each sample person image based on the facial occlusion prediction value of the sample person image and each of its reference person images and the sample category label of the sample person image;
determining a network loss value of the deep convolutional neural network based on the determined plurality of association losses and the plurality of mean square errors.
7. The face recognition method according to claim 6, wherein the mean square error of each sample person image is determined by:
for each sample person image, determining a first mean square error of the sample person image based on the facial occlusion prediction value of the sample person image and the true value corresponding to the sample category label of the sample person image;
determining a plurality of second mean square errors of the sample person image based on the facial occlusion prediction value of each reference person image corresponding to the sample person image and the true value corresponding to the sample category label of the sample person image;
and determining the mean square error of the sample person image based on the first mean square error and the plurality of second mean square errors.
8. The face recognition method according to claim 6, wherein the association loss of each sample person image is determined by:
for each sample person image, determining a plurality of associated image groups from a plurality of images comprising the sample person image and its corresponding plurality of reference person images, wherein each associated image group includes a first image and a second image, and the size proportion of the person's face in the first image is smaller than that in the second image;
determining an association group loss difference of each associated image group based on the facial occlusion prediction value corresponding to the first image, the facial occlusion prediction value corresponding to the second image, and the true value corresponding to the sample category label of the sample person image;
determining the association loss of each sample person image based on the plurality of association group loss differences.
9. The face recognition method according to claim 5, wherein the adjusting the network parameters of the deep convolutional neural network based on the network loss value to obtain a trained face recognition model comprises:
determining whether the network loss value is greater than a preset loss threshold;
if the network loss value is larger than a preset loss threshold value, adjusting network parameters of the deep convolutional neural network until the network loss value is smaller than or equal to the preset loss threshold value, and determining that the deep convolutional neural network is completely trained;
and determining the trained deep convolutional neural network as a face recognition model.
10. A face recognition apparatus characterized by comprising:
the image acquisition module is used for acquiring a plurality of sample person images and, for each sample person image, a sample category label indicating whether the person's face wears a face obstruction;
the image determining module is used for determining, for each sample person image, a plurality of reference person images of the sample person image, wherein any two images among the sample person image and the corresponding plurality of reference person images differ in the size proportion of the person's face within the image;
the model training module is used for training the constructed deep convolutional neural network based on the plurality of sample person images, the plurality of reference person images corresponding to each sample person image, and the sample category label of each sample person image, to obtain a trained face recognition model;
and the image recognition module is used for inputting an acquired image of a person to be recognized into the face recognition model to obtain a face obstruction recognition result of the image of the person to be recognized.
11. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the face recognition method according to any one of claims 1 to 9.
12. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the face recognition method according to one of claims 1 to 9.
CN202010457116.3A 2020-05-26 2020-05-26 Face recognition method, face recognition device and readable storage medium Pending CN111626193A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010457116.3A CN111626193A (en) 2020-05-26 2020-05-26 Face recognition method, face recognition device and readable storage medium


Publications (1)

Publication Number Publication Date
CN111626193A true CN111626193A (en) 2020-09-04

Family

ID=72271169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010457116.3A Pending CN111626193A (en) 2020-05-26 2020-05-26 Face recognition method, face recognition device and readable storage medium

Country Status (1)

Country Link
CN (1) CN111626193A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818901A (en) * 2021-02-22 2021-05-18 成都睿码科技有限责任公司 Wearing mask face recognition method based on eye attention mechanism


Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106658169A (en) * 2016-12-18 2017-05-10 北京工业大学 Universal method for segmenting video news in multi-layered manner based on deep learning
US20200012859A1 (en) * 2017-03-28 2020-01-09 Zhejiang Dahua Technology Co., Ltd. Methods and systems for fire detection
CN107145867A (en) * 2017-05-09 2017-09-08 电子科技大学 Face and face occluder detection method based on multitask deep learning
CN107633204A (en) * 2017-08-17 2018-01-26 平安科技(深圳)有限公司 Face occlusion detection method, apparatus and storage medium
CN107492115A (en) * 2017-08-30 2017-12-19 北京小米移动软件有限公司 The detection method and device of destination object
US20190222817A1 (en) * 2017-09-27 2019-07-18 University Of Miami Vision defect determination and enhancement
CN107609536A (en) * 2017-09-29 2018-01-19 百度在线网络技术(北京)有限公司 Information generating method and device
CN108491794A (en) * 2018-03-22 2018-09-04 腾讯科技(深圳)有限公司 The method and apparatus of face recognition
CN110555339A (en) * 2018-05-31 2019-12-10 北京嘀嘀无限科技发展有限公司 target detection method, system, device and storage medium
CN110059634A (en) * 2019-04-19 2019-07-26 山东博昂信息科技有限公司 A kind of large scene face snap method
CN110177191A (en) * 2019-05-10 2019-08-27 惠州市航泰光电有限公司 A kind of cover board and its production method for 3D camera face recognition module
CN110443132A (en) * 2019-07-02 2019-11-12 南京理工大学 A kind of Face datection and the more attribute convergence analysis methods of face based on deep learning
CN110490115A (en) * 2019-08-13 2019-11-22 北京达佳互联信息技术有限公司 Training method, device, electronic equipment and the storage medium of Face datection model
CN110781728A (en) * 2019-09-16 2020-02-11 北京嘀嘀无限科技发展有限公司 Face orientation estimation method and device, electronic equipment and storage medium
CN110909595A (en) * 2019-10-12 2020-03-24 平安科技(深圳)有限公司 Facial motion recognition model training method and facial motion recognition method
CN110909690A (en) * 2019-11-26 2020-03-24 电子科技大学 Method for detecting occluded face image based on region generation
CN111046949A (en) * 2019-12-10 2020-04-21 东软集团股份有限公司 Image classification method, device and equipment
CN111062328A (en) * 2019-12-18 2020-04-24 中新智擎科技有限公司 Image processing method and device and intelligent robot

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Wu, Yue et al., "Deep Face Recognition with Center Invariant Loss", Thematic Workshops '17 *
Zhong-Qiu Zhao et al., "Corrupted and occluded face recognition via cooperative sparse representation", Pattern Recognition *
Wang Zhenhua et al., "Partially Occluded Face Recognition Based on Deep Learning", Electronic Technology & Software Engineering *
Wang Ying, "Multi-form Face Recognition Based on Deep Learning", China Masters' Theses Full-text Database, Information Science and Technology Series *
Mo Jianhong et al., "Gender Recognition Based on Facial Features", Science of Social Psychology *
Dong Yanhua et al., "A Survey of Occluded Face Recognition Methods", Computer Engineering and Applications *
Ma Ning, "Research on Key Technologies in Image-based Face Recognition", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20220311