CN111860456A - Mask face recognition method - Google Patents


Info

Publication number
CN111860456A
Authority
CN
China
Prior art keywords
mask
face
recognition
current
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010773738.7A
Other languages
Chinese (zh)
Other versions
CN111860456B (en
Inventor
苏文烈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Weizhilian Technology Co Ltd
Original Assignee
Guangzhou Weizhilian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Weizhilian Technology Co Ltd filed Critical Guangzhou Weizhilian Technology Co Ltd
Priority to CN202010773738.7A priority Critical patent/CN111860456B/en
Publication of CN111860456A publication Critical patent/CN111860456A/en
Application granted granted Critical
Publication of CN111860456B publication Critical patent/CN111860456B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V40/161 Human faces: Detection; Localisation; Normalisation
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06F18/214 Pattern recognition: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/044 Neural networks: Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Neural networks: Combinations of networks
    • Y02T10/40 Climate change mitigation technologies related to transportation: Engine management systems

Abstract

The invention belongs to the technical field of artificial intelligence and discloses a mask face recognition method comprising the following steps: S1: obtaining a mask face image data set from non-mask face image data; S2: establishing a mask recognition model from the mask face image data set, based on a neural network, a confidence coefficient adaptation mechanism, and a multi-frame clustering prediction mechanism; S3: acquiring a live video in real time and performing recognition on it with the mask recognition model to obtain a recognition result. The invention addresses the problems of poor recognition accuracy, low recognition efficiency, complex face recognition network structure, and strong scene dependence in the prior art.

Description

Mask face recognition method
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a mask face recognition method.
Background
As social security receives increasing attention, identity authentication of masked individuals has become particularly important. Although video surveillance equipment is installed in many places, the complexity of surveillance environments, including crowd density, dispersed motion, and adverse imaging conditions, poses great practical challenges to identifying the faces of masked individuals under large-scale monitoring.
With economic development, the demand for mask face recognition has also grown markedly. Many leading manufacturers have developed face recognition algorithms specifically for masked faces. However, the research results of different manufacturers vary, as does the recognition accuracy, and these algorithms are basically applied to relatively static, close-range recognition scenarios such as gates and face-based attendance.
The prior art has the following defects:
in the field of face recognition, wearing a mask or face covering constitutes large-area facial occlusion, a long-standing problem. The difficulty is mainly reflected in the following four points:
1) face recognition algorithms judge identity mainly from facial features; when a mask is worn, the algorithm cannot accurately detect the face position or locate key facial landmarks, so the recognition accuracy drops sharply.
2) The deep learning techniques used by face recognition algorithms depend on massive training data; it is difficult to collect and manually annotate a large number of mask-wearing photos in a short time, so recognition efficiency is low.
3) A face recognition neural network has a complex, multi-module structure; wearing a mask affects not only the face comparison module but also modules such as face detection and tracking, interfering greatly with the overall design.
4) Mask recognition in surveillance scenes is affected by scene complexity and diversity; blurred imaging in complex scenes in particular poses great challenges to visual mask recognition.
Disclosure of Invention
The present invention aims to solve at least one of the above technical problems to a certain extent.
Therefore, the invention aims to provide a mask face recognition method that addresses the problems of poor recognition accuracy, low recognition efficiency, complex face recognition network structure, and strong scene dependence in the prior art.
The technical scheme adopted by the invention is as follows:
a mask face recognition method comprises the following steps:
s1: acquiring a mask face image data set according to the non-mask face image data;
s2: establishing a mask recognition model by using a mask face image data set based on a neural network, a confidence coefficient adaptation mechanism and a multi-frame clustering prediction mechanism;
s3: acquiring a live video in real time, and performing recognition on it with the mask recognition model to obtain a recognition result.
Further, in step S1, the non-mask face image data are processed with the MaskGAN method for diverse and interactive facial image manipulation, so as to obtain a mask face image data set.
Further, in step S2, the neural network is a residual network equipped with an LSTM (long short-term memory) module, and the LSTM module is provided with a forget gate.
Further, in step S2, the formula of the forget gate is:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
where f_t is the forget gate output; σ(·) is the sigmoid activation function; h_{t-1} is the output of the previous time step t-1; t is the time step index; x_t is the input at the current time step; b_f is the convolutional layer bias term; W_f is the convolutional layer weight.
Further, in step S2, the output formula of the neural network is:
o_i = Σ_{j=1}^{n} α_ij · h_j
where o_i is the attention-weighted output feature; α_ij is the normalized attention weight; i is the attention index; j is the one-way time step; n is the number of one-way time steps; h_j is the output of each time step.
Further, the specific step of step S2 is:
s2-1: dividing a mask face image data set into a training set and a testing set;
s2-2: training the neural network by using a training set to obtain an initial mask recognition model;
s2-3: and adding a confidence coefficient adaptation mechanism and a multi-frame clustering prediction mechanism into the initial mask recognition model, and optimizing the initial mask recognition model by using a test set to obtain and output an optimal mask recognition model.
Further, in step S2, the confidence coefficient adaptation mechanism is: the confidence threshold of the current model is adaptively adjusted according to the imaging blurriness of the face image data input into the current model.
Further, the specific steps of step S3 are:
S3-1: acquire a live video in real time, perform human head detection on the live video, and, after a head is detected, acquire the head features of the current head;
S3-2: taking the human head as the individual identifier, set a head detection frame and continuously track the current individual in the live video according to the head features and the position of the corresponding head detection frame; perform individual tracking and matching to obtain the individual's matching result; if the matching result is a new individual or an existing individual, collect a plurality of single-frame images of the current individual; otherwise, end the mask face recognition method;
S3-3: perform face detection on the current single-frame image and, after a face is detected, perform frontal face detection on the current single-frame image using the mask recognition model;
S3-4: after a frontal face is detected, obtain the imaging blurriness of the current single-frame image, adaptively adjust the confidence threshold of the mask recognition model according to the imaging blurriness, and perform face recognition on the current single-frame image with the adjusted mask recognition model to obtain the face recognition result of the current frame image;
S3-5: update the current single-frame image from the plurality of single-frame images collected in step S3-2 and return to step S3-3 until face recognition has been performed on all collected single-frame images, yielding a plurality of face recognition results;
S3-6: obtain the final recognition result of the current individual from the plurality of face recognition results, based on the multi-frame clustering prediction mechanism.
Further, in step S3-3, the face detection result is either: the current single-frame image is a mask face image, or the current single-frame image is a non-mask face image.
Further, in step S3-3, for a non-mask face image, frontal face detection yields a non-mask face, and face recognition is performed directly on it to obtain a recognition result;
for a mask face image, frontal face detection yields a mask face, and the method proceeds to step S3-4 to obtain the imaging blurriness of the current single-frame image from the mask face.
The invention has the beneficial effects that:
1) mask recognition accuracy in surveillance scenes is relatively high; adaptive mask face recognition can be achieved in most surveillance scenes, and scene influence is avoided;
2) the residual structure effectively prevents vanishing gradients, so the capacity for learning target features is strong, the running speed is good, and recognition efficiency is improved;
3) the neural network is tailored to faces occluded by masks/face coverings in surveillance scenes, and an enhanced bidirectional LSTM network is used, which greatly improves the robustness of the algorithm, strengthens the ability to identify masked individuals in surveillance scenes, and avoids a complex face recognition network structure;
4) the post-processing logic of multi-frame clustering prediction improves the accuracy of mask recognition prediction and thus the recognition accuracy.
Other advantageous effects of the present invention will be described in detail in the detailed description.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of a mask face recognition method;
FIG. 2 is a non-masked face data image;
FIG. 3 is a mask face image dataset processing image;
fig. 4 is a structural diagram of the LSTM long short-term memory module.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Functional details disclosed herein are merely illustrative of example embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments of the invention. When the terms "comprises," "comprising," "includes," and/or "including" are used herein, they specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
It should be understood that specific details are provided in the following description to facilitate a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure the examples in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring example embodiments.
Example 1
As shown in fig. 1, the present embodiment provides a mask face recognition method, including the following steps:
s1: acquiring a mask face image data set according to the non-mask face image data;
processing the non-mask face image data with the MaskGAN method for diverse and interactive facial image manipulation to obtain a mask face image data set;
as shown in fig. 2, a conventional face database of 8 classes and 32 subclasses, totaling 2 million images, undergoes a second round of automatic labeling by affine transformation: using 68 facial landmark point pairs and an affine transformation, a mask is fitted onto each face automatically, and the MaskGAN method is then used to process the non-mask face image data, as shown in fig. 3. The processing covers:
1) the area occluded by the mask (with 5% as the occlusion-ratio interval, 16 gears are divided between 5% and 80%);
2) to adapt to the influence of multiple mask styles on face recognition, N95 masks, common medical masks, ordinary masks, and facial tissues are used for classified processing;
3) to adapt to the influence of masks/facial tissues of different colors on face recognition, N95 masks in four colors (blue, gray, white, and black), common medical masks, ordinary masks, and facial tissues are used for classified mixed processing;
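The automatic mask-fitting step above relies on an affine transformation computed from landmark point pairs. A minimal sketch of that fitting step, with hypothetical anchor points (the patent's 68-point model and the MaskGAN pipeline are not reproduced here), could look like:

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine transform mapping mask-template anchor
    points (src) onto detected facial landmark points (dst)."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Design matrix [x, y, 1]; solve A @ M = dst for M (3x2), return M.T (2x3).
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T

# Hypothetical values: three mask-template corners -> three facial landmarks.
template = [(0, 0), (100, 0), (50, 80)]
landmarks = [(40, 120), (140, 118), (92, 200)]
M = fit_affine(template, landmarks)
# Warp any template point into face coordinates:
warped = M @ np.array([50.0, 40.0, 1.0])
```

With exactly three non-collinear point pairs the least-squares fit is exact; with all 68 landmark pairs it becomes a best-fit placement.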
s2: based on a neural network, a confidence coefficient adaptation mechanism, and a multi-frame clustering prediction mechanism, a mask recognition model is established using the mask face image data set; the post-processing logic of multi-frame clustering prediction improves the accuracy of mask recognition prediction and thus the recognition accuracy;
the neural network is a residual network equipped with an LSTM (long short-term memory) module, and the LSTM module is provided with a forget gate. The residual structure effectively prevents vanishing gradients, so the capacity for learning target features is strong and the running speed is good, which improves recognition efficiency. The neural network is tailored to faces occluded by masks/face coverings in surveillance scenes, and an enhanced bidirectional LSTM network is used, which greatly improves the robustness of the algorithm, strengthens the ability to identify masked individuals in surveillance scenes, and avoids a complex face recognition network structure;
the formula of the forget gate is:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
where f_t is the forget gate output; σ(·) is the sigmoid activation function; h_{t-1} is the output of the previous time step t-1; t is the time step index; x_t is the input at the current time step; b_f is the convolutional layer bias term; W_f is the convolutional layer weight. Throughout the calculation, the output of the previous time step and the input of the current time step are fused by the convolutional layer and then activated by the sigmoid function, which limits the output to between 0 and 1, where 0 means forget everything and 1 means keep everything;
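A minimal numerical sketch of this forget gate, with illustrative dimensions and randomly initialized weights (the patent does not publish its trained parameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    """f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f); every component lies
    in (0, 1): 0 means forget everything, 1 means keep everything."""
    concat = np.concatenate([h_prev, x_t])  # fuse previous output and current input
    return sigmoid(W_f @ concat + b_f)

# Toy dimensions: hidden size 3, input size 2, so W_f is 3 x (3 + 2).
rng = np.random.default_rng(0)
W_f = rng.standard_normal((3, 5))
b_f = np.zeros(3)
f_t = forget_gate(np.ones(3), np.array([0.5, -0.5]), W_f, b_f)
```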
the formula of the attention module of the neural network is:
e_ij = tanh((h_s · w) + b) * u
where e_ij is the attention weight before normalization; tanh(·) is the hyperbolic tangent function; h_s is the output of each time step; w is the convolution weight; b is the convolution bias term; u is a scaling factor; i is the attention index; j is the one-way time step;
the formula for the attention weight of the neural network is:
Figure BDA0002617591990000071
in the formula, alphaijIs the normalized attention weight; e.g. of the typeijAttention weight before normalization; i is an attention indicator; j is a one-way time step; k is a time step indicator; n is the number of one-way time steps; the calculation is that activation of a normalization index softmax function is carried out, the output is limited to be 0-1, and attention distribution is obtained;
the formula of the attention-weighted output feature of the neural network is:
o_i = Σ_{j=1}^{n} α_ij · h_j
where o_i is the attention-weighted output feature; α_ij is the normalized attention weight; i is the attention index; j is the one-way time step; n is the number of one-way time steps; h_j is the output of each time step;
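Taken together, the three attention formulas above (unnormalized weights, softmax normalization, weighted output) can be sketched as follows, with illustrative shapes and randomly initialized w, b, u:

```python
import numpy as np

def attention(h, w, b, u):
    """h: (n, d) array of n one-way time-step outputs.
    e_j     = tanh(h_j . w + b) * u   (attention weight before normalization)
    alpha_j = softmax(e)_j            (normalized; sums to 1)
    o       = sum_j alpha_j * h_j     (attention-weighted output feature)"""
    e = np.tanh(h @ w + b) * u
    alpha = np.exp(e - e.max())       # subtract max for numerical stability
    alpha /= alpha.sum()
    o = alpha @ h
    return alpha, o

rng = np.random.default_rng(1)
h = rng.standard_normal((4, 8))       # 4 time steps, 8-dim outputs
alpha, o = attention(h, rng.standard_normal(8), 0.1, 2.0)
```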
the confidence adaptation mechanism is as follows: adaptively adjusting a confidence threshold of the current model according to the imaging fuzziness of the face image data set input into the current model, and turning down the confidence threshold to obtain a single recognition result when the imaging fuzziness is increased so as to avoid the problem of unrecognizable imaging;
the specific steps of step S2 are:
s2-1: dividing a mask face image data set into a training set and a testing set;
s2-2: training the neural network by using a training set to obtain an initial mask recognition model;
s2-3: adding a confidence coefficient adaptation mechanism and a multi-frame clustering prediction mechanism into an initial mask recognition model, and optimizing the initial mask recognition model by using a test set to obtain and output an optimal mask recognition model;
s3: acquire a live video in real time and perform recognition on it with the mask recognition model to obtain recognition results. Mask recognition accuracy in surveillance scenes is relatively high, adaptive mask face recognition can be achieved in most surveillance scenes, and scene influence is avoided. The specific steps are:
S3-1: acquire a live video in real time, perform human head detection on the live video, and, after a head is detected, acquire the head features of the current head;
S3-2: taking the human head as the individual identifier, set a head detection frame and continuously track the current individual in the live video according to the head features and the position of the corresponding head detection frame; perform individual tracking and matching to obtain the individual's matching result; if the matching result is a new individual or an existing individual, collect a plurality of single-frame images of the current individual; otherwise, end the mask face recognition method;
S3-3: perform face detection on the current single-frame image and, after a face is detected, perform frontal face detection on the current single-frame image using the mask recognition model;
the face detection result is either: the current single-frame image is a mask face image, or the current single-frame image is a non-mask face image;
for a non-mask face image, frontal face detection yields a non-mask face, and face recognition is performed directly on it to obtain a recognition result;
for a mask face image, frontal face detection yields a mask face; the method proceeds to step S3-4 and obtains the imaging blurriness of the current single-frame image from the mask face;
S3-4: after a frontal face is detected, obtain the imaging blurriness of the current single-frame image, adaptively adjust the confidence threshold of the mask recognition model according to the imaging blurriness, and perform face recognition on the current single-frame image with the adjusted mask recognition model to obtain the face recognition result of the current frame image;
imaging blurriness calculation rule:
1) pixel RGB to gray value:
Gray_n = R*0.3 + G*0.59 + B*0.11;
2) mean over all pixels of the gray image:
μ = Σ Gray_n / N, where μ is the mean over the pixels of the gray image; Gray_n is the gray value of pixel n; N is the total number of pixels;
3) gray-value variance of the whole gray image:
S² = Σ (Gray_n − μ)² / (N − 1), where S² is the gray-value variance; the smaller S is, the more blurred the image is;
in this embodiment, for 10 ≤ S < 20 the confidence weight is adjusted to 0.8, so the current confidence threshold is 0.8 × the preset confidence threshold;
for 20 ≤ S < 30 the confidence weight is adjusted to 0.9, so the current confidence threshold is 0.9 × the preset confidence threshold;
for S ≥ 30 the confidence weight is adjusted to 1, so the current confidence threshold is 1 × the preset confidence threshold; S = 30 is the optimal critical value for blur detection;
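The blur metric and the confidence-weight schedule above can be sketched in a few lines. The preset threshold value is a placeholder, and the behavior below S = 10 (which the embodiment does not state) is an assumption:

```python
import numpy as np

def gray_variance(rgb):
    """rgb: (H, W, 3) uint8 image. Returns the sample variance S^2 of the
    gray image, Gray = 0.3R + 0.59G + 0.11B; smaller means more blurred."""
    gray = rgb[..., 0] * 0.3 + rgb[..., 1] * 0.59 + rgb[..., 2] * 0.11
    return gray.var(ddof=1)  # divide by N - 1, as in the patent

def adapted_threshold(S, preset=0.75):
    """Weight schedule from the embodiment: 0.8 for 10 <= S < 20,
    0.9 for 20 <= S < 30, 1.0 for S >= 30. preset is a placeholder."""
    if S >= 30:
        w = 1.0
    elif S >= 20:
        w = 0.9
    else:
        w = 0.8  # also reused below S = 10 (assumption; patent is silent)
    return w * preset
```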
S3-5: update the current single-frame image from the plurality of single-frame images collected in step S3-2, i.e. take the next single-frame image as the current single-frame image, and return to step S3-3 until face recognition has been performed on all collected single-frame images, yielding a plurality of face recognition results;
S3-6: from the plurality of face recognition results, based on the multi-frame clustering prediction mechanism, obtain the identity information of the current individual, i.e. the final recognition result. The individual ID is continuously updated during continuous tracking and recognition; in the initial tracking stage, the recognition result is likely to keep changing because of adverse factors such as mask occlusion, long distance, and insufficient light, but after continuous tracking the recognition result stabilizes and accurate identity information is obtained, greatly improving the recognition success rate;
the multi-frame clustering prediction mechanism is as follows:
matched individual P0-PkThe confidence levels of (a) are respectively:
individual P0Contains m0 results:
Figure BDA0002617591990000101
individual P1Contains m1 results:
Figure BDA0002617591990000111
……
individual PkContains mk results:
Figure BDA0002617591990000112
the inter-frame enhancement similarity of the recognition results of each individual is as follows:
Figure BDA0002617591990000113
Figure BDA0002617591990000114
...
Figure BDA0002617591990000115
then R (F)0-Fn)=Max(∑P0,∑P1,...,∑Pk) The corresponding individual P;
namely, a plurality of individuals are possibly matched in continuous tracking identification, the mechanism carries out comprehensive prediction according to the matching times and confidence degrees, the final confidence degrees of different individuals are calculated and compared, and the confidence degrees correspond to the maximum inter-frame enhancement similarity sigma PjThe individual P is the individual R (F) of the output resulti) Where j is 0,1, and k, i is 0,1, and n, the identification result becomes more stable with continuous tracking, and the identity information of the individual becomes more and more certain.
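A minimal sketch of this multi-frame voting step, with hypothetical confidence lists: each candidate individual's per-frame confidences are summed and the individual with the largest sum is returned.

```python
def predict_identity(matches):
    """matches: dict mapping candidate individual id -> list of per-frame
    recognition confidences. Returns the id whose summed confidence
    (the 'inter-frame enhancement similarity') is largest, so both the
    number of matches and their confidences contribute."""
    sums = {pid: sum(confs) for pid, confs in matches.items()}
    return max(sums, key=sums.get)

# Hypothetical tracked results over several frames:
matches = {
    "P0": [0.62, 0.58, 0.71],   # matched 3 times at moderate confidence
    "P1": [0.93, 0.88],         # matched 2 times at high confidence
    "P2": [0.40],
}
winner = predict_identity(matches)  # P0: 1.91 beats P1: 1.81
```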
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.
The embodiments described above are merely illustrative, and may or may not be physically separate, if referring to units illustrated as separate components; if reference is made to a component displayed as a unit, it may or may not be a physical unit, and may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: modifications of the technical solutions described in the embodiments or equivalent replacements of some technical features may still be made. And such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
The present invention is not limited to the above-described alternative embodiments, and various other forms of products can be obtained by anyone in light of the present invention. The above detailed description should not be taken as limiting the scope of the invention, which is defined in the claims, and which the description is intended to be interpreted accordingly.

Claims (10)

1. A mask face recognition method is characterized in that: the method comprises the following steps:
s1: acquiring a mask face image data set according to the non-mask face image data;
s2: establishing a mask recognition model by using a mask face image data set based on a neural network, a confidence coefficient adaptation mechanism and a multi-frame clustering prediction mechanism;
s3: acquiring a live video in real time, and performing recognition on it with the mask recognition model to obtain a recognition result.
2. The mask face recognition method according to claim 1, characterized in that: in step S1, the non-mask face image data are processed with the MaskGAN method for diverse and interactive facial image manipulation, so as to obtain a mask face image data set.
3. The mask face recognition method according to claim 1, characterized in that: in step S2, the neural network is a residual network equipped with an LSTM (long short-term memory) module, and the LSTM module is provided with a forget gate.
4. A mask face recognition method according to claim 3, characterized in that: in step S2, the formula of the forget gate is:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
where f_t is the forget gate output; σ(·) is the sigmoid activation function; h_{t-1} is the output of the previous time step t-1; t is the time step index; x_t is the input at the current time step; b_f is the convolutional layer bias term; W_f is the convolutional layer weight.
5. The mask face recognition method according to claim 1, characterized in that: in step S2, the output formula of the neural network is:
o_i = Σ_{j=1}^{n} α_ij · h_j
where o_i is the attention-weighted output feature; α_ij is the normalized attention weight; i is the attention index; j is the one-way time step; n is the number of one-way time steps; h_j is the output of each time step.
6. The mask face recognition method according to claim 1, characterized in that: the specific steps of step S2 are:
S2-1: dividing the mask face image data set into a training set and a test set;
S2-2: training the neural network with the training set to obtain an initial mask recognition model;
S2-3: adding the confidence adaptation mechanism and the multi-frame clustering prediction mechanism to the initial mask recognition model, and optimizing the initial mask recognition model with the test set to obtain and output the optimal mask recognition model.
7. The mask face recognition method according to claim 1, characterized in that: in step S2, the confidence adaptation mechanism is: the confidence threshold of the current model is adaptively adjusted according to the imaging blur of the face image data input to the current model.
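The claim does not fix a particular blur measure or adjustment rule. One illustrative sketch, assuming a blur score already normalized to [0, 1] and a simple linear mapping (both assumptions, not the patented mechanism):

```python
def adapt_threshold(blur, base=0.8, min_thr=0.5, k=0.3):
    """Adaptively lower the recognition confidence threshold as blur rises.

    blur: imaging blur of the current image, assumed normalized to [0, 1]
    (e.g. derived from variance of the Laplacian); base: threshold for a
    sharp image; min_thr: floor so the threshold never collapses; k: slope.
    All parameter values here are illustrative.
    """
    return max(min_thr, base - k * blur)

# A sharp image keeps the base threshold; a very blurry one hits the floor.
sharp = adapt_threshold(0.0)
blurry = adapt_threshold(1.0)
```

The design intent is that blurrier inputs produce lower-confidence embeddings, so a fixed threshold would reject too many genuine matches.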
8. The mask face recognition method according to claim 1, characterized in that: the specific steps of step S3 are:
S3-1: acquiring live video in real time, performing human-head detection on the live video, and, once a head is detected, extracting the head features of the current head;
S3-2: using the human head as the individual identifier, setting a head detection box, continuously tracking the current individual in the live video according to the head features and the position of the corresponding head detection box, and performing individual tracking and matching to obtain the individual's matching result; if the matching result is a new individual or an existing individual, collecting a plurality of single-frame images of the current individual; otherwise, ending the mask face recognition method;
S3-3: performing face detection on the current single-frame image, and, after a face is detected, performing frontal-face detection on the current single-frame image with the mask recognition model;
S3-4: after a frontal face is detected, obtaining the imaging blur of the current single-frame image, adaptively adjusting the confidence threshold of the mask recognition model according to the imaging blur, and performing face recognition on the current single-frame image with the adjusted mask recognition model to obtain the face recognition result for the current frame;
S3-5: updating the current single-frame image from the plurality of single-frame images collected in step S3-2 and returning to step S3-3 until face recognition has been performed on all collected single-frame images, yielding a plurality of face recognition results;
S3-6: obtaining the final recognition result for the current individual from the plurality of face recognition results, based on the multi-frame clustering prediction mechanism.
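The claim leaves the clustering rule open. A simple stand-in for step S3-6 (an assumption, not the patented mechanism): group the per-frame results by predicted identity and return the identity whose cluster accumulates the most confidence.

```python
from collections import Counter

def fuse_frames(results):
    """Fuse per-frame (identity, confidence) results into one prediction.

    results: list of (identity, confidence) pairs, one per recognized frame.
    Clusters frames by predicted identity and returns the identity whose
    cluster has the largest total confidence.
    """
    totals = Counter()
    for identity, conf in results:
        totals[identity] += conf
    return totals.most_common(1)[0][0]

# Two moderately confident "alice" frames outweigh one "bob" frame.
final = fuse_frames([("alice", 0.9), ("bob", 0.6), ("alice", 0.7)])
```

Aggregating over multiple frames in this way makes the final decision robust to a single misrecognized or blurry frame.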
9. The mask face recognition method of claim 8, wherein: in step S3-3, face detection is performed, and the face detection result is either: the current single-frame image is a mask face image, or the current single-frame image is a non-mask face image.
10. The mask face recognition method of claim 9, wherein: in step S3-3, frontal-face detection is performed on the non-mask face image to obtain a non-mask frontal face, and face recognition is performed directly on the non-mask frontal face to obtain the recognition result;
frontal-face detection is performed on the mask face image to obtain a mask frontal face, and the method proceeds to step S3-4 to obtain the imaging blur of the current single-frame image from the mask frontal face.
CN202010773738.7A 2020-08-04 2020-08-04 Face recognition method Active CN111860456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010773738.7A CN111860456B (en) 2020-08-04 2020-08-04 Face recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010773738.7A CN111860456B (en) 2020-08-04 2020-08-04 Face recognition method

Publications (2)

Publication Number Publication Date
CN111860456A true CN111860456A (en) 2020-10-30
CN111860456B CN111860456B (en) 2024-02-02

Family

ID=72953217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010773738.7A Active CN111860456B (en) 2020-08-04 2020-08-04 Face recognition method

Country Status (1)

Country Link
CN (1) CN111860456B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777114A (en) * 2009-01-08 2010-07-14 北京中星微电子有限公司 Intelligent analysis system and intelligent analysis method for video monitoring, and system and method for detecting and tracking head and shoulder
CN102622584A (en) * 2012-03-02 2012-08-01 成都三泰电子实业股份有限公司 Method for detecting mask faces in video monitor
CN104866843A (en) * 2015-06-05 2015-08-26 中国人民解放军国防科学技术大学 Monitoring-video-oriented masked face detection method
CN108875763A (en) * 2017-05-17 2018-11-23 北京旷视科技有限公司 Object detection method and object detecting device
CN109785363A (en) * 2018-12-29 2019-05-21 中国电子科技集团公司第五十二研究所 A kind of unmanned plane video motion Small object real-time detection and tracking
CN109919977A (en) * 2019-02-26 2019-06-21 鹍骐科技(北京)股份有限公司 A kind of video motion personage tracking and personal identification method based on temporal characteristics
CN110781784A (en) * 2019-10-18 2020-02-11 高新兴科技集团股份有限公司 Face recognition method, device and equipment based on double-path attention mechanism
CN110853033A (en) * 2019-11-22 2020-02-28 腾讯科技(深圳)有限公司 Video detection method and device based on inter-frame similarity
US20200086879A1 (en) * 2018-09-14 2020-03-19 Honda Motor Co., Ltd. Scene classification prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FANG Ming et al.: "Facial micro-expression recognition combining a residual network and target masks", Journal of Jilin University, vol. 51, no. 01, pages 303-313 *

Also Published As

Publication number Publication date
CN111860456B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
Wang et al. Cross-domain face presentation attack detection via multi-domain disentangled representation learning
CN109886121B (en) Human face key point positioning method for shielding robustness
Bai et al. Infrared ship target segmentation based on spatial information improved FCM
Ilbeygi et al. A novel fuzzy facial expression recognition system based on facial feature extraction from color face images
Iwama et al. Gait verification system for criminal investigation
Zhang et al. Moving cast shadows detection using ratio edge
Dass Markov random field models for directional field and singularity extraction in fingerprint images
Migdal et al. Background subtraction using markov thresholds
CN103035013B (en) A kind of precise motion shadow detection method based on multi-feature fusion
Cheng et al. Scene analysis for object detection in advanced surveillance systems using Laplacian distribution model
JP2003317101A (en) Method for verifying face using method for automatically updating database and system therefor
KR20140067604A (en) Apparatus, method and computer readable recording medium for detecting, recognizing and tracking an object based on a situation recognition
De Smet et al. A generalized EM approach for 3D model based face recognition under occlusions
Wang et al. Data mapping by probabilistic modular networks and information-theoretic criteria
Katircioglu et al. Self-supervised human detection and segmentation via background inpainting
Mohamed et al. Automated face recogntion system: Multi-input databases
Amato et al. Robust real-time background subtraction based on local neighborhood patterns
Côté et al. Comparative study of adaptive segmentation techniques for gesture analysis in unconstrained environments
Huang et al. A people-counting system using a hybrid RBF neural network
CN111104857A (en) Identity recognition method and system based on gait energy diagram
CN111860456B (en) Face recognition method
Méndez-Llanes et al. On the use of local fixations and quality measures for deep face recognition
Nasiri et al. Video Surveillance Framework Based on Real-Time Face Mask Detection and Recognition
Gritzman et al. Threshold-based outer lip segmentation using support vector regression
Jang et al. Skin region segmentation using an image-adapted colour model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant