CN113723243A - Thermal infrared image face recognition method for wearing mask and application - Google Patents
- Publication number
- CN113723243A (application number CN202110960380.3A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- image
- convolutional neural
- face
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a thermal infrared image face recognition method for mask-wearing faces, and applications thereof. The method comprises the following steps: acquiring a thermal infrared face image to be recognized and inputting it into a trained first convolutional neural network, wherein the first convolutional neural network detects whether the image shows a mask-wearing face and outputs a face detection frame; inputting the output data of the first convolutional neural network into a trained second convolutional neural network, wherein the second convolutional neural network outputs a mask-free face detection image; and inputting the output data of the second convolutional neural network into a trained third convolutional neural network, wherein the third convolutional neural network outputs the face recognition result. The invention can improve the accuracy of face recognition for mask-wearing faces.
Description
Technical Field
The invention belongs to the interdisciplinary field of biometric recognition, infrared technology and counter-terrorism technology, and particularly relates to a thermal infrared image face recognition method for mask-wearing faces and applications thereof.
Background
Masked face recognition means that, given a thermal infrared image of a face wearing a mask, the label or name of the corresponding person can be determined. Masked face recognition is applicable in many fields and has particularly high application value in the field of national security.
Visible-light face recognition technology has already been widely applied in many fields. However, it cannot work without an external light source, and it cannot recognize a face covered by a mask: a camera based on reflected visible light cannot recover information from the occluded region. When a mask is worn, parts of the face are hidden, which makes visible-light face recognition very difficult.
A thermal infrared image is a thermal radiation image, formed from spatial differences in the infrared radiation emitted by an object. When the face is covered by a mask, heat from the face still passes through the mask by conduction and radiation and can be captured by a thermal infrared camera; this technology can compensate for the fact that a visible-light camera completely loses the features of the occluded region.
In the prior art, there is no mature solution for face recognition using thermal infrared technology.
Disclosure of Invention
To address at least one defect or improvement requirement of the prior art, the invention provides a thermal infrared image face recognition method for mask-wearing faces, and applications thereof, which can improve the accuracy of masked face recognition.
To achieve the above object, according to a first aspect of the present invention, there is provided a thermal infrared image face recognition method for mask-wearing faces, comprising the following steps:
acquiring a thermal infrared face image to be recognized and inputting it into a trained first convolutional neural network, wherein the first convolutional neural network detects whether the image shows a mask-wearing face and outputs a face detection frame;
inputting the output data of the first convolutional neural network into a trained second convolutional neural network, wherein the second convolutional neural network outputs a mask-free face detection image;
and inputting the output data of the second convolutional neural network into a trained third convolutional neural network, wherein the third convolutional neural network outputs the face recognition result.
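The three steps above can be sketched as a composition of three stages. The functions below are hypothetical placeholder stubs standing in for the trained networks; the names and the dictionary-based "image" representation are illustrative assumptions, not the patent's implementation:

```python
# Minimal sketch of the three-stage recognition pipeline (hypothetical stubs).

def detect_face(thermal_image):
    """Stage 1 stub: return (is_masked, face_crop) for a thermal infrared image."""
    is_masked = thermal_image.get("masked", False)   # a real first CNN would decide this
    face_crop = {"face": thermal_image["face"], "masked": is_masked}
    return is_masked, face_crop

def remove_mask(face_crop):
    """Stage 2 stub: return a mask-free face image (generated if the input was masked)."""
    out = dict(face_crop)
    out["masked"] = False        # a real second CNN would generate the unmasked face
    return out

def recognize(face_image):
    """Stage 3 stub: return an identity label for the mask-free face image."""
    return face_image["face"]    # a real third CNN would classify face features

def recognize_thermal_face(thermal_image):
    is_masked, crop = detect_face(thermal_image)
    if is_masked:
        crop = remove_mask(crop)  # only masked inputs need the mask-removal stage
    return recognize(crop)
```

The point of the sketch is the data flow: the second network sits between detection and recognition and is only needed when the first network reports a mask.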
Preferably, the training of the first convolutional neural network comprises the steps of:
constructing a first training set and its training labels, wherein the first training set comprises thermal infrared face images of different people wearing masks and thermal infrared face images of different people not wearing masks, and the training label of each image comprises a face calibration frame and a class label indicating whether a mask is worn;
and setting a plurality of anchor boxes for the first convolutional neural network, and inputting the first training set and its training labels into the first convolutional neural network for training.
Preferably, the training of the second convolutional neural network comprises the steps of:
constructing a second training set and its training labels, wherein the second training set comprises two training subsets, one containing face detection images with masks and the other containing face detection images without masks, each face detection image carrying a class label indicating whether it shows a mask or not;
and inputting the second training set and the training labels thereof into the second convolutional neural network for training.
Preferably, the training of the third convolutional neural network comprises the steps of:
constructing a third training set and its training labels, wherein the third training set comprises mask-removed face detection images and face detection images without masks; each face detection image carries a class label and an identity label, the class label indicating whether the image is mask-removed or originally mask-free, and the identity label distinguishing different people, with the mask-removed and mask-free face detection images of the same person sharing the same identity label;
and inputting the third training set and its training labels into the third convolutional neural network for training.
Preferably, the second convolutional neural network comprises a first adversarial network and a second adversarial network; the first adversarial network comprises a first generator and a first discriminator, and the second adversarial network comprises a second generator and a second discriminator. The class label of a mask-wearing face detection image is real_A and that of a mask-free face detection image is real_B; a mask-wearing image generated by the second generator is labeled fake_A, and a mask-free image generated by the first generator is labeled fake_B;
the first discriminator judges whether an input image, or an image generated by the second generator, belongs to class real_A, and the first generator generates a fake_B-class image from an input or generated real_A-class image;
the second discriminator judges whether an input image, or an image generated by the first generator, belongs to class real_B, and the second generator generates a fake_A-class image from an input or generated real_B-class image.
Preferably, the loss function of the second convolutional neural network is:
Loss_full = Loss_Gan(G, D_X) + Loss_Gan(F, D_Y) + λ·Loss_cycle
wherein Loss_full is the total loss; the first generator is denoted G and the first discriminator D_X, so that Loss_Gan(G, D_X) is the loss function of the first generator and the first discriminator; the second generator is denoted F and the second discriminator D_Y, so that Loss_Gan(F, D_Y) is the loss function of the second generator and the second discriminator; Loss_cycle is the cycle-consistency loss; and λ is a weighting factor.
Preferably, the third convolutional neural network is trained with squared-distance loss functions Loss1, Loss2 and Loss3, wherein i denotes the i-th image, N denotes the total number of samples in the third training set, real_B_others denotes other real_B-class images, fake_B_others denotes other fake_B-class images, and others denotes images belonging to different people;
Loss1 is a loss function that reduces the distance between real_B-class images of the same person and increases the distance between images of different people;
Loss2 is a loss function that reduces the distance between a person's real_B-class image and fake_B-class image and increases the distance to images of different people;
Loss3 is a loss function that reduces the distance between fake_B-class images of the same person and increases the distance between fake_B-class images of different people.
According to a second aspect of the present invention, there is provided a thermal infrared image face recognition system for wearing a mask, comprising:
a first module for acquiring a thermal infrared face image to be recognized and inputting it into a trained first convolutional neural network, wherein the first convolutional neural network detects whether the image shows a mask-wearing face and outputs a face detection frame;
a second module for inputting the output data of the first convolutional neural network into a trained second convolutional neural network, wherein the second convolutional neural network outputs a mask-free face detection image;
and a third module for inputting the output data of the second convolutional neural network into a trained third convolutional neural network, wherein the third convolutional neural network outputs the face recognition result.
According to a third aspect of the invention, there is provided an electronic device comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of any of the methods described above.
According to a fourth aspect of the invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the methods described above.
In general, the face recognition of the present invention comprises three steps — face detection frame recognition, mask removal and face information recognition — and can improve the accuracy of masked face recognition, specifically as follows:
1) The invention trains a convolutional neural network on thermal infrared images with and without masks, obtaining a network that can rapidly and automatically detect whether a mask is worn and accurately frame the position of the face.
2) The invention removes the influence of the mask using thermal infrared technology: it can recover the features of the facial region occluded by the mask and makes full use of the facial features to complete mask removal, compensating for visible light's inability to use information from the occluded region.
3) After the influence of the mask is removed, the invention can identify the person in the thermal infrared face image using thermal infrared face recognition technology.
Drawings
Fig. 1 is a flow chart of a thermal infrared image face recognition method of wearing a mask according to an embodiment of the invention.
FIG. 2 is a schematic diagram of anchor boxes at two scales in an embodiment of the present invention;
FIG. 3 is a schematic illustration of mask detection results in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a training image pair for the mask-removal neural network in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the second convolutional neural network for mask removal in an embodiment of the present invention;
FIG. 6 is a comparison of images before and after mask removal in an embodiment of the present invention;
FIG. 7 is a schematic diagram of the loss functions for face recognition after mask removal in an embodiment of the present invention;
fig. 8 is a flow chart of training and testing of a thermal infrared image face recognition method for wearing a mask according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The term "mask" in this application should not be construed narrowly as a face mask, but as any object that blocks part or all of the face.
As shown in fig. 1, a thermal infrared image face recognition method for wearing a mask according to an embodiment of the present invention includes steps S1 to S3.
S1, acquiring a thermal infrared face image to be recognized and inputting it into a trained first convolutional neural network, wherein the first convolutional neural network detects whether the image shows a mask-wearing face and outputs a face detection frame.
The thermal infrared face image to be recognized may be an original image collected by a thermal infrared camera, and may include a face and a background other than the face.
The first convolutional neural network judges whether the thermal infrared face image to be recognized shows a mask-wearing face, and outputs a face detection frame in either case; if a mask is worn, the frame includes the mask region.
Preferably, the training of the first convolutional neural network comprises the steps of:
(1) constructing a first training set and its training labels, wherein the first training set comprises thermal infrared face images of different people wearing masks and thermal infrared face images of different people not wearing masks, and the training label of each image comprises a face calibration frame and a class label indicating whether a mask is worn;
(2) setting a plurality of anchor boxes for the first convolutional neural network, and inputting the first training set and its training labels into the first convolutional neural network for training.
In one embodiment, the training and testing of the first convolutional neural network specifically comprises the steps of:
(1) Combine N thermal infrared face images with masks and L thermal infrared face images without masks into a training set, acquire M thermal infrared images as a test set, and frame the face in each thermal infrared image as a calibration frame. Each thermal infrared image with a mask is labeled 1, and each without a mask is labeled 0.
To guarantee a sufficient number of thermal infrared images, enough experimental data must be collected. Specifically, a medium-wave thermal infrared imager (model TAURUS-110kM, IRCAM, Germany) can be used. The data were captured with the face at distances of 2, 3 and 5 meters from the camera; a video of set duration was recorded for each person, and a set number of frames was cropped from each video according to a set frame interval. 86 people were photographed, with frames extracted every 50 frames, covering both masked and unmasked faces and different scene backgrounds; extensive experiments guaranteed the accuracy of the subsequent mask detection model. The frames captured from the videos were then screened to prevent the network from learning spurious parameters — for example, the blurred images that commonly arise during cropping were removed. In the end, 270,000 thermal infrared images were obtained, from which 25,000 masked and 25,000 unmasked thermal infrared face images formed the training set, and M = 20,000 thermal infrared images served as the test set.
(2) Build the first convolutional neural network, set the sizes of six anchor boxes according to the particularities of face images, and input the training set and training labels into the first convolutional neural network for training, thereby obtaining the required trained model.
(2.1) The sizes of the six anchor boxes, set according to the particularities of face images, are as follows:
In one embodiment, as shown in FIG. 2, the fast anchor box algorithm (Fast Anchor) is implemented by designing six anchor boxes at two scales: the large anchor boxes are (379, 387), (251, 292) and (142, 189), and the small anchor boxes are (190, 193), (110, 116) and (54, 45).
(2.2) the parameters of the first convolutional neural network are as follows:
the number of times of the invention (epoch) was set to 200, the batch (batch) was set to 8, the input image was set to 416 × 416, the momentum (moment) was set to 0.9, the attenuation (decay) was set to 0.0005, the learning rate (Ir) was set to 0.001, the color saturation (saturation) was set to 1.5, the exposure (exposure) was set to 1.5, the hue change range (hue) was set to.1, and the training time was 83 hours.
(3) Input the thermal infrared images of the test set and obtain the face detection frame (including the mask region, where present) through the convolutional neural network. The method processes a single image in 0.026 s with high precision: on a test sample of 20,000 images, the accuracy is 96.5%.
The mask detection result output through step S1 is shown in fig. 3.
S2, inputting the output data of the first convolutional neural network into a trained second convolutional neural network, wherein the second convolutional neural network outputs a mask-free face detection image.
The face detection frame image output by the first convolutional neural network is input into the second convolutional neural network.
The second convolutional neural network removes the mask and recovers more information from the image, which facilitates the subsequent recognition by the third convolutional neural network.
Preferably, the training of the second convolutional neural network comprises the steps of:
(1) constructing a second training set and its training labels, wherein the second training set comprises two training subsets, one containing face detection images with masks and the other containing face detection images without masks, each face detection image carrying a class label indicating whether it shows a mask or not;
(2) and inputting the second training set and the training labels thereof into a second convolutional neural network for training.
In one embodiment, the training and testing of the second convolutional neural network specifically comprises the steps of:
(1) Combine A = 50,000 thermal infrared face images detected as mask-wearing and B = 50,000 thermal infrared face images detected as not mask-wearing into a training set, and acquire C = 10,000 mask-wearing thermal infrared images as a test set. Volunteers 0–59 were assigned to the training set and volunteers 60–85 to the test set. A thermal infrared face image detected as mask-wearing is labeled real_A, and one detected as not mask-wearing is labeled real_B. Build the mask-removal convolutional neural network and input the training set and training labels into it for training, thereby obtaining the required trained model.
A training image pair of the second convolutional neural network for the de-mask processing is shown in fig. 4.
(1.1) The composition of the mask-removal second convolutional neural network:
As shown in FIG. 5, the structure of the second convolutional neural network is as follows. The mask-removal network contains two GAN networks, each with a generator and a discriminator: the first GAN network comprises generator 1 and discriminator A, and the second GAN network comprises generator 2 and discriminator B. The class label of a mask-wearing face detection image is real_A and that of a mask-free face detection image is real_B; a mask-wearing image generated by generator 2 is labeled fake_A, and a mask-free image generated by generator 1 is labeled fake_B. Discriminator A judges whether an input image, or an image generated by generator 2, belongs to class real_A, and generator 1 generates a fake_B-class image from an input or generated real_A-class image; discriminator B judges whether an input image, or an image generated by generator 1, belongs to class real_B, and generator 2 generates a fake_A-class image from an input or generated real_B-class image. The two mirrored GAN networks form a ring: they compete with each other, share information and learn jointly.
As shown in FIG. 5(a), the first GAN network takes a real_A-class image as input: discriminator A first judges whether the image belongs to class real_A, generator 1 then generates a fake_B-class image, discriminator B judges whether that image belongs to class real_B, and the fake_B image is finally sent to generator 2 to generate a fake_A-class image.
As shown in FIG. 5(b), the second GAN network takes a real_B-class image as input: discriminator B first judges whether the image belongs to class real_B, generator 2 then generates a fake_A-class image, discriminator A judges whether that image belongs to class real_A, and the fake_A image is finally sent to generator 1 to generate a fake_B-class image.
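The two mirrored flows of FIG. 5 can be sketched with placeholder generators; this is a toy illustration in which "images" are tagged dictionaries and G, F are hypothetical stand-ins for generator 1 and generator 2, not the patent's implementation:

```python
# Sketch of the two cycles: G maps masked (real_A) images to unmasked (fake_B)
# images; F maps unmasked (real_B) images to masked (fake_A) images.

def G(image):   # generator 1: real_A / fake_A -> fake_B
    return {"person": image["person"], "masked": False, "generated": True}

def F(image):   # generator 2: real_B / fake_B -> fake_A
    return {"person": image["person"], "masked": True, "generated": True}

def forward_cycle(real_a):
    """real_A -> fake_B -> fake_A; cycle consistency asks that fake_A ≈ real_A."""
    fake_b = G(real_a)
    rec_a = F(fake_b)
    return fake_b, rec_a

def backward_cycle(real_b):
    """real_B -> fake_A -> fake_B; cycle consistency asks that fake_B ≈ real_B."""
    fake_a = F(real_b)
    rec_b = G(fake_a)
    return fake_a, rec_b
```

The ring structure is visible in the sketch: each generator's output is fed to the other generator, so identity information must survive a round trip.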
The generator in each GAN network is a network composed of ResBlocks, with strided convolutions for downsampling and transposed convolutions for upsampling. The discriminator uses the 70 × 70 PatchGAN structure from the Pix2Pix network.
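The 70 × 70 figure is the discriminator's receptive field: each output unit judges a 70 × 70 patch of the input. Assuming the standard pix2pix PatchGAN stack of three 4 × 4 stride-2 convolutions followed by two 4 × 4 stride-1 convolutions (an assumption, since the patent states only the size), this can be verified layer by layer:

```python
# Compute the receptive field of one output unit of a convolutional stack.

def receptive_field(layers):
    """layers: list of (kernel, stride) in input-to-output order."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel   # standard backward receptive-field recursion
    return rf

# Assumed pix2pix 70x70 PatchGAN: 3 x (k=4, s=2) then 2 x (k=4, s=1).
PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
```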
(1.2) loss function of the network:
As in formula (1), the loss function during training is divided into three parts. The first part is the loss function of generator 1 and discriminator A, where generator 1 (real_A to fake_B) is denoted G and discriminator A is denoted D_X. The second part is the loss function of generator 2 and discriminator B, where generator 2 (real_B to fake_A) is denoted F and discriminator B is denoted D_Y.
In Loss_Gan(G, D_X), G denotes the generator from domain real_A to domain real_B and D_X its discriminator; in Loss_Gan(F, D_Y), F denotes the generator from domain real_B to domain real_A and D_Y its discriminator.
The third part is the cycle-consistency loss, and λ is a weighting factor controlling the weight of the cycle-consistency loss in the total loss.
Loss_full = Loss_Gan(G, D_X) + Loss_Gan(F, D_Y) + λ·Loss_cycle (1)
The number of epochs is set to 200, the batch size to 6, and the input image size to 256 × 256. The learning rate (lr) is set to 0.0002. In the Adam optimizer, b1 is set to 0.5 and b2 to 0.999; the cycle-loss coefficient (lambda_cycle) is set to 10.0 and the identity-loss coefficient (lambda_id) to 5.0.
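A numerical sketch of equation (1), assuming the usual CycleGAN choices of a least-squares adversarial term and an L1 cycle-consistency term; the patent specifies the structure of the sum but not these exact terms, so treat the formulas below as illustrative assumptions:

```python
# Toy computation of Loss_full over flat lists of pixel values / scores.

def loss_gan(d_fake):
    """Least-squares generator loss: push the discriminator's fake scores toward 1."""
    return sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)

def loss_cycle(real, reconstructed):
    """L1 cycle-consistency loss between an image and its round-trip reconstruction."""
    return sum(abs(r, ) if False else abs(r - c) for r, c in zip(real, reconstructed)) / len(real)

def loss_full(dx_fake, dy_fake, real_a, rec_a, real_b, rec_b, lam=10.0):
    """Equation (1): two adversarial terms plus a weighted cycle-consistency term."""
    return (loss_gan(dx_fake) + loss_gan(dy_fake)
            + lam * (loss_cycle(real_a, rec_a) + loss_cycle(real_b, rec_b)))
```

With perfect reconstructions and fully fooled discriminators, every term vanishes and Loss_full is 0; λ = 10.0 matches the lambda_cycle setting above.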
With a traditional adversarial generation network, the masked and unmasked images of each person would need to correspond one to one, and every face would have to be exactly aligned. In practice this cannot be achieved with a collected dataset: breathing through the mask makes every thermal infrared face image slightly different, and exhaled breath produces steam, so masked images can never be perfectly registered with unmasked ones. A traditional adversarial generation network completes image translation using per-pixel deviations, so problems such as face tilt in the dataset lead to very poor results in generalization tests. The adversarial generation network of the present application, by contrast, does not require registered training pairs for each person, so the trained model generalizes well.
(2) Input the thermal infrared images of the masked faces of volunteers 60–85 from the test set, obtain mask-free thermal infrared face images through the convolutional neural network, and output the generated mask-removed fake_B data.
A before-and-after comparison of mask removal is shown in FIG. 6.
S3, inputting the output data of the second convolutional neural network into a trained third convolutional neural network, wherein the third convolutional neural network outputs the face recognition result.
The input of the third convolutional neural network is data processed by the second convolutional neural network, and the output is recognized face information.
In one embodiment, the training of the third convolutional neural network specifically includes the steps of:
(1) constructing a third training set and training labels thereof, wherein the third training set comprises a plurality of face detection images without masks and a plurality of face detection images without masks, the face detection images are marked with category labels and identity labels, the category labels are used for representing the face detection images without masks or the face detection images without masks, the category labels are used for distinguishing different people, and the identity labels of the face detection images without masks and the face detection images without masks belonging to the same person are the same;
(2) inputting the third training set and its training labels into the third convolutional neural network for training.
In one embodiment, the training and testing of the third convolutional neural network specifically comprises the steps of:
(1) combining 20,000 thermal infrared face images without masks and 20,000 generated mask-removed thermal infrared face images into a training set, and acquiring 40,000 thermal infrared images as a test set. The unmasked and mask-removed images of each person are labeled jointly. A convolutional neural network is built, and the training set together with its training labels is input into it for training, thereby obtaining the required trained model of the convolutional neural network;
As shown in fig. 7, preferably, the Sudoku (nine-grid) loss function used during training is:
in Loss1, i represents the ith image, N represents N sets of samples, real _ B represents an image without a mask, real _ B _ others represents a different image without a mask, and others represents an image of a different person. Loss1 is used to pull in the Euclidean distance between the same class of real _ B and pull up the Euclidean distance between different classes. The euclidean distance may be replaced by other calculation methods for representing the difference between pictures.
Real _ B in Loss2 represents an image of an unweared mask, fake _ B represents a generated unweared mask image, and others represents images of different people. Loss2 is used to pull in the Euclidean distance between real _ B and fake _ B, and to pull up the Euclidean distance from different classes.
In Loss3, fake _ B indicates the generated image without the mask, fake _ B _ others indicates the generated different images without the mask, and others indicates the images of different persons. Loss3 is used to pull up the Euclidean distance between similar class fake _ B and pull up the Euclidean distance between different classes.
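The three loss terms share one pattern: pull same-identity embeddings together and push different-identity embeddings apart. The exact formulas are given only as figures in the source, so the hinge-style sketch below, including the `margin` hyperparameter, is an assumption rather than the patent's precise definition:

```python
import numpy as np

def euclid(a, b):
    """Row-wise Euclidean distance between two embedding batches."""
    return np.sqrt(((a - b) ** 2).sum(axis=-1))

def nine_grid_loss(real_B, real_B_others, fake_B, fake_B_others, others, margin=1.0):
    """Sketch of Loss1 + Loss2 + Loss3 over feature embeddings:
    minimize same-person distances, keep different-person distances >= margin."""
    def pull_push(anchor, positive, negative):
        # hinge: zero once the negative is at least `margin` farther than the positive
        return np.maximum(euclid(anchor, positive) - euclid(anchor, negative) + margin, 0.0).mean()
    loss1 = pull_push(real_B, real_B_others, others)   # Loss1: real_B vs real_B
    loss2 = pull_push(real_B, fake_B, others)          # Loss2: real_B vs fake_B
    loss3 = pull_push(fake_B, fake_B_others, others)   # Loss3: fake_B vs fake_B
    return loss1 + loss2 + loss3
```

With identical same-person embeddings and well-separated different-person embeddings, all three terms vanish, which is the state the training drives toward.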
(2) Inputting the thermal infrared images of the test set, and obtaining the identity information of each thermal infrared face image through the convolutional neural network. The performance of the model is examined using TAR (true acceptance rate), FAR (false acceptance rate) and Accuracy (recognition rate) as evaluation indexes. Suppose there are total_True groups of same-person pairs {two different thermal infrared face images of the same person} and total_False groups of different-person pairs {two thermal infrared face images of different people}; TAR and FAR are defined as follows:

TAR = total_TrueAccept / total_True

FAR = total_FalseAccept / total_False

In the formulas, total_TrueAccept is the number of same-person pairs that the model judges to belong to the same person; total_FalseAccept is the number of different-person pairs that the model judges to belong to the same person; total_FalseReject is the number of same-person pairs that the model judges to belong to different people. Choosing different maximum similarity distances yields different TAR and FAR values. The data set used contained 20000 groups of same-person pairs and 20000 groups of different-person pairs. The area under the ROC curve (AUC) was 0.512. With the maximum similarity distance taken as 18, the recognition rate was 89.68%.
A thermal infrared image face recognition method of wearing a mask according to another embodiment of the present invention is shown in fig. 8.
The thermal infrared image face recognition system of the embodiment of the invention comprises:
the system comprises a first module, a second module and a third module, wherein the first module is used for acquiring a thermal infrared face image to be identified and inputting the thermal infrared face image into a trained first convolution neural network, and the first convolution neural network is used for detecting whether the thermal infrared face image is a mask wearing image and a face detection frame;
the second module is used for inputting the output data of the first convolutional neural network into a trained second convolutional neural network, and the second convolutional neural network is used for outputting the face detection image without the mask;
and the third module is used for inputting the output data of the second convolutional neural network into a trained third convolutional neural network, and the third convolutional neural network is used for outputting a face recognition result.
The implementation principle and technical effect of the system are similar to those of the method, and are not described herein again.
The present embodiment further provides an electronic device comprising at least one processor and at least one memory. The memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the mask-wearing thermal infrared image face recognition method of the foregoing embodiments, which are not described herein again. The types of the processor and the memory are not particularly limited in this embodiment: for example, the processor may be a microprocessor, a digital signal processor, an on-chip programmable logic system, or the like; the memory may be volatile memory, non-volatile memory, a combination thereof, or the like.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, wherein the computer program is executed by a processor to implement the technical scheme of any one of the above embodiments of the thermal infrared image face recognition method for wearing a mask. The implementation principle and technical effect are similar to those of the above method, and are not described herein again.
It should be noted that in any of the above embodiments, the steps need not be executed in the order of their sequence numbers; unless the execution logic dictates a particular order, they may be executed in any other feasible order.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A thermal infrared image face recognition method of wearing a mask is characterized by comprising the following steps:
acquiring a thermal infrared face image to be recognized, and inputting the thermal infrared face image into a trained first convolutional neural network, wherein the first convolutional neural network is used for detecting whether the thermal infrared face image is a mask-wearing image and for outputting a face detection frame;
inputting output data of the first convolutional neural network into a trained second convolutional neural network, wherein the second convolutional neural network is used for outputting a face detection image without a mask;
and inputting the output data of the second convolutional neural network into a trained third convolutional neural network, wherein the third convolutional neural network is used for outputting a face recognition result.
2. The method of claim 1, wherein the training of the first convolutional neural network comprises the steps of:
constructing a first training set and training labels thereof, wherein the first training set comprises a plurality of thermal infrared face images of different people wearing masks and a plurality of thermal infrared face images of different people not wearing masks, and the training labels of the thermal infrared face images comprise face calibration frames and class labels for indicating whether the masks are worn or not;
and setting a plurality of anchor points for the first convolutional neural network, and inputting the first training set and the training labels thereof into the first convolutional neural network for training.
3. The method of claim 1, wherein the training of the second convolutional neural network comprises the steps of:
constructing a second training set and training labels thereof, wherein the second training set comprises two training subsets, one training subset comprises a plurality of mask-wearing face detection images, the other training subset comprises a plurality of mask-not-wearing face detection images, and the face detection images are marked with class labels which represent the mask-wearing face detection images or the mask-not-wearing face detection images;
and inputting the second training set and the training labels thereof into the second convolutional neural network for training.
4. The method of claim 1, wherein the training of the third convolutional neural network comprises the steps of:
constructing a third training set and training labels thereof, wherein the third training set comprises a plurality of face detection images without masks and a plurality of generated mask-removed face detection images; each face detection image is marked with a category label and an identity label, the category label indicating whether the image is an unmasked face detection image or a generated mask-removed face detection image, the identity label being used for distinguishing different people, and the unmasked and mask-removed face detection images belonging to the same person sharing the same identity label;
and inputting the third training set and training labels thereof into the third convolutional neural network for training.
5. The method according to claim 3, wherein the second convolutional neural network comprises a first adversarial network and a second adversarial network, the first adversarial network comprises a first generator and a first discriminator, and the second adversarial network comprises a second generator and a second discriminator; the class label of the face detection images with masks is real_A, the class label of the face detection images without masks is real_B, the masked images generated by the second generator are labeled as class fake_A, and the unmasked images generated by the first generator are labeled as class fake_B;

the first discriminator is used for judging whether an input image or an image generated by the second generator is of class real_A, and the first generator is used for generating fake_B class images from real_A class images;

the second discriminator is used for judging whether an input image or an image generated by the first generator is of class real_B, and the second generator is used for generating fake_A class images from real_B class images.
6. The method according to claim 5, wherein the loss function of the second convolutional neural network is:
Loss_full = Loss_Gan(G, D_X) + Loss_Gan(F, D_Y) + λ·Loss_cycle

wherein Loss_full is the total loss function; the first generator is denoted G and the first discriminator D_X, with Loss_Gan(G, D_X) representing the loss function of the first generator and the first discriminator; the second generator is denoted F and the second discriminator D_Y, with Loss_Gan(F, D_Y) representing the loss function of the second generator and the second discriminator; Loss_cycle is the cycle consistency loss, and λ is a weighting factor.
7. The method according to claim 5, wherein the Sudoku loss function of the third convolutional neural network is:
wherein i represents the i-th image, N represents the total number of samples of the third training set, real_B_others represents a different image of the real_B class, others represents images belonging to different people, and fake_B_others represents a different image of the fake_B class;

Loss1 is a loss function used to reduce the difference between real_B class images of the same person and to enlarge the difference between images of different people;

Loss2 is a loss function used to reduce the difference between real_B class images and fake_B class images of the same person and to enlarge the difference between images of different people;

Loss3 is a loss function used to reduce the difference between fake_B class images of the same person and to enlarge the difference between images of different people.
8. A thermal infrared image face recognition system for wearing a mask, comprising:
the system comprises a first module, a second module and a third module, wherein the first module is used for acquiring a thermal infrared face image to be recognized and inputting it into a trained first convolutional neural network, the first convolutional neural network being used for detecting whether the thermal infrared face image is a mask-wearing image and for outputting a face detection frame;
the second module is used for inputting the output data of the first convolutional neural network into a trained second convolutional neural network, and the second convolutional neural network is used for outputting a face detection image without a mask;
and the third module is used for inputting the output data of the second convolutional neural network into a trained third convolutional neural network, and the third convolutional neural network is used for outputting a face recognition result.
9. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110960380.3A CN113723243B (en) | 2021-08-20 | 2021-08-20 | Face recognition method of thermal infrared image of wearing mask and application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113723243A true CN113723243A (en) | 2021-11-30 |
CN113723243B CN113723243B (en) | 2024-05-17 |
Family
ID=78677109
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110960380.3A Active CN113723243B (en) | 2021-08-20 | 2021-08-20 | Face recognition method of thermal infrared image of wearing mask and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113723243B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114821702A (en) * | 2022-03-15 | 2022-07-29 | 电子科技大学 | Thermal infrared face recognition method based on face shielding |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295521A (en) * | 2016-07-29 | 2017-01-04 | 厦门美图之家科技有限公司 | A kind of gender identification method based on multi output convolutional neural networks, device and the equipment of calculating |
CN107609481A (en) * | 2017-08-14 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | The method, apparatus and computer-readable storage medium of training data are generated for recognition of face |
CN109086718A (en) * | 2018-08-02 | 2018-12-25 | 深圳市华付信息技术有限公司 | Biopsy method, device, computer equipment and storage medium |
CN109508694A (en) * | 2018-12-10 | 2019-03-22 | 上海众源网络有限公司 | A kind of face identification method and identification device |
CN109543640A (en) * | 2018-11-29 | 2019-03-29 | 中国科学院重庆绿色智能技术研究院 | A kind of biopsy method based on image conversion |
CN110378234A (en) * | 2019-06-20 | 2019-10-25 | 合肥英威晟光电科技有限公司 | Convolutional neural networks thermal imagery face identification method and system based on TensorFlow building |
CN110516576A (en) * | 2019-08-20 | 2019-11-29 | 西安电子科技大学 | Near-infrared living body faces recognition methods based on deep neural network |
CN110633691A (en) * | 2019-09-25 | 2019-12-31 | 北京紫睛科技有限公司 | Binocular in-vivo detection method based on visible light and near-infrared camera |
CN110991266A (en) * | 2019-11-13 | 2020-04-10 | 北京智芯原动科技有限公司 | Binocular face living body detection method and device |
WO2020159437A1 (en) * | 2019-01-29 | 2020-08-06 | Agency For Science, Technology And Research | Method and system for face liveness detection |
CN111695406A (en) * | 2020-04-23 | 2020-09-22 | 西安电子科技大学 | Face recognition anti-spoofing method, system and terminal based on infrared ray |
CN111723655A (en) * | 2020-05-12 | 2020-09-29 | 五八有限公司 | Face image processing method, device, server, terminal, equipment and medium |
CN111767877A (en) * | 2020-07-03 | 2020-10-13 | 北京视甄智能科技有限公司 | Living body detection method based on infrared features |
CN111931594A (en) * | 2020-07-16 | 2020-11-13 | 广州广电卓识智能科技有限公司 | Face recognition living body detection method and device, computer equipment and storage medium |
CN112633130A (en) * | 2020-12-18 | 2021-04-09 | 成都三零凯天通信实业有限公司 | Face mask removing method based on key point restoration image |
CN112818722A (en) * | 2019-11-15 | 2021-05-18 | 上海大学 | Modular dynamically configurable living body face recognition system |
CN113128481A (en) * | 2021-05-19 | 2021-07-16 | 济南博观智能科技有限公司 | Face living body detection method, device, equipment and storage medium |
CN113221771A (en) * | 2021-05-18 | 2021-08-06 | 北京百度网讯科技有限公司 | Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product |
Non-Patent Citations (1)
Title |
---|
JI QIAOBIN; XU SHUGONG: "Efficient Anti-spoofing Face Recognition System", Industrial Control Computer (工业控制计算机), no. 05 *
Also Published As
Publication number | Publication date |
---|---|
CN113723243B (en) | 2024-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
George et al. | On the effectiveness of vision transformers for zero-shot face anti-spoofing | |
Li et al. | Generating multiple hypotheses for 3d human pose estimation with mixture density network | |
Chaudhari et al. | Face detection using viola jones algorithm and neural networks | |
CN106503687A (en) | The monitor video system for identifying figures of fusion face multi-angle feature and its method | |
CN110008846B (en) | Image processing method | |
CN111814661A (en) | Human behavior identification method based on residual error-recurrent neural network | |
Chakraborty et al. | Network consistent data association | |
Lin et al. | Near-realtime face mask wearing recognition based on deep learning | |
CN111639580B (en) | Gait recognition method combining feature separation model and visual angle conversion model | |
Hoffman et al. | Iris+ ocular: Generalized iris presentation attack detection using multiple convolutional neural networks | |
CN109635634A (en) | A kind of pedestrian based on stochastic linear interpolation identifies data enhancement methods again | |
CN111310605B (en) | Image processing method and device, electronic equipment and storage medium | |
CN115273150A (en) | Novel identification method and system for wearing safety helmet based on human body posture estimation | |
CN113033305A (en) | Living body detection method, living body detection device, terminal equipment and storage medium | |
Naveen et al. | Face recognition and authentication using LBP and BSIF mask detection and elimination | |
CN113723243B (en) | Face recognition method of thermal infrared image of wearing mask and application | |
Chu et al. | Semi-supervised 3d human pose estimation by jointly considering temporal and multiview information | |
CN112464864A (en) | Face living body detection method based on tree-shaped neural network structure | |
CN111912531A (en) | Dynamic human body trunk temperature analysis method based on intelligent thermal imaging video analysis | |
CN106846527B (en) | A kind of attendance checking system based on recognition of face | |
Galiyawala et al. | Dsa-pr: discrete soft biometric attribute-based person retrieval in surveillance videos | |
Lau et al. | Atdetect: Face detection and keypoint extraction at range and altitude | |
Thißen et al. | Why existing multimodal crowd counting datasets can lead to unfulfilled expectations in real-world applications | |
Siegmund et al. | Face presentation attack detection in ultraviolet spectrum via local and global features | |
CN115170897A (en) | Image processing method based on mask region convolution neural network and application thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||