CN113392696A

CN113392696A - Intelligent court monitoring face recognition system and method based on fractional calculus

Info

Publication number: CN113392696A
Application number: CN202110369258.9A
Authority: CN
Inventors: 彭朝霞; 蒲亦非; 王竹; 周激流; 张妮
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2021-04-06
Filing date: 2021-04-06
Publication date: 2021-09-14

Abstract

The invention relates to the field of computer vision and image processing, and discloses an intelligent court monitoring face recognition system and method based on fractional calculus, which can quickly extract meaningful features from a picture and improve the ability to capture features of specific regions, thereby improving face recognition accuracy. The method comprises the following steps: a. extracting the face region of interest from the captured face image to obtain a face picture to be recognized; b. preprocessing the face picture to be recognized; c. recognizing the preprocessed face picture with a trained improved residual network to obtain a face recognition result: extracting features from the face picture to obtain a face feature map; compressing the face feature map in the spatial dimension with a channel attention mechanism to generate a channel attention map; taking the channel attention map as input and compressing it in the channel dimension with a spatial attention mechanism to generate a spatial attention map; and finally, using a classifier to compare the output feature map with the face images in the database to obtain the recognition result.

Description

Intelligent court monitoring face recognition system and method based on fractional calculus

Technical Field

The invention relates to the field of computer vision and image processing, in particular to an intelligent court monitoring face recognition system and method based on fractional calculus.

Background

In recent years, face recognition technology, as a powerful means of capturing facial biometric information and matching it against face data in an existing database, has offered advantages such as contactless operation, automatic capture and low application cost. It plays an important role in economic security, information security and public security, and is being applied in more and more scenarios.

However, in real life the face image captured by a device is affected by natural illumination, pose and expression, the environmental background and other factors, as well as by face occlusion caused by wearing a mask during the current COVID-19 epidemic. As a result, face recognition still faces a number of challenges.

A residual network can extract rich face features while simplifying the training of deeper network structures, and it avoids the model degradation that occurs as a model is made deeper. Many current face recognition models therefore use the residual network ResNet as their network model. However, existing models are insufficient at extracting meaningful features from pictures and at capturing the features of certain specific regions, and their recognition accuracy is also lacking.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an intelligent court monitoring face recognition system and method based on fractional calculus that quickly extract the meaningful features in a picture and improve the ability to capture features of specific regions, thereby improving face recognition accuracy.

The technical solution adopted by the invention to solve the above technical problem is as follows:

An intelligent court monitoring face recognition system based on fractional calculus, comprising:

a face picture detection unit, used for extracting the face region of interest from the captured face image to obtain a face picture to be recognized;

a face picture preprocessing unit, used for preprocessing the face picture to be recognized;

a face recognition unit, used for recognizing the preprocessed face picture with a trained improved residual network to obtain a face recognition result;

wherein the improved residual network comprises a convolution block, a channel attention module, a spatial attention module and a classifier; the convolution block is used for extracting features from the face picture to obtain a face feature map; the channel attention module is used for compressing the face feature map in the spatial dimension to generate a channel attention map; the spatial attention module is used for taking the channel attention map as input and compressing it in the channel dimension to generate a spatial attention map; and the classifier is used for comparing the spatial attention map with the face images in the database to obtain a recognition result.

In addition, based on the above face recognition system, another aspect of the present invention provides an intelligent court monitoring face recognition method based on fractional calculus, which comprises the following steps:

a. extracting the face region of interest from the captured face image to obtain a face picture to be recognized;

b. preprocessing the face picture to be recognized;

c. recognizing the preprocessed face picture with a trained improved residual network to obtain a face recognition result:

extracting features from the face picture to obtain a face feature map; compressing the face feature map in the spatial dimension with a channel attention mechanism to generate a channel attention map; taking the channel attention map as input and compressing it in the channel dimension with a spatial attention mechanism to generate a spatial attention map; and finally, using a classifier to compare the spatial attention map with the face images in the database to obtain the recognition result.

As a further optimization, in step b, the preprocessing comprises rectifying and cropping the face picture to be recognized with a trained MTCNN (multi-task cascaded convolutional neural network).

As a further optimization, in step c, compressing the face feature map in the spatial dimension with the channel attention mechanism to generate the channel attention map comprises:

compressing the face feature map in the spatial dimension with the channel attention mechanism: the spatial information of the feature map is aggregated with average pooling and maximum pooling, the features produced by the average pooling and the maximum pooling are each fed to a shared multilayer neural network, which compresses the spatial dimension of the input feature map, the two outputs are summed element by element, and the channel attention map is finally generated.

As a further optimization, the channel attention mechanism is expressed as:

where σ is the Sigmoid function, and W₀ and W₁ are the weights of the convolution (multiplication) operations.
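The expression itself is not reproduced in this text. As a sketch only, assuming the module follows the widely used CBAM-style channel attention that the pooling and shared-network description matches, it could be written as below. The symbols F (the face feature map), M_c (the channel attention map), the ReLU hidden activation and the pooled descriptors F_avg^c and F_max^c are assumptions, not taken from the source:

```latex
\begin{aligned}
M_c(F) &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) \\
       &= \sigma\big(W_1\,\mathrm{ReLU}(W_0 F^{c}_{\mathrm{avg}}) + W_1\,\mathrm{ReLU}(W_0 F^{c}_{\mathrm{max}})\big)
\end{aligned}
```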

As a further optimization, in step c, taking the channel attention map as input and compressing it in the channel dimension with the spatial attention mechanism to generate the spatial attention map comprises:

compressing the input channel attention map along the channel dimension by performing average pooling and maximum pooling over the channel dimension, combining the pooled features to obtain a two-dimensional feature map, reducing its dimensionality through a convolution operation, and finally generating the spatial attention map.

As a further optimization, the spatial attention mechanism is expressed as:

where σ is the Sigmoid function and 7 × 7 is the convolution kernel size.
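The expression itself is not reproduced in this text. Assuming a CBAM-style spatial attention consistent with the description (pooling over the channel dimension, combining, then a 7 × 7 convolution), one plausible form is given below. The symbols F' (the input from the channel attention stage), M_s (the spatial attention map) and f^{7×7} (the convolution) are assumptions introduced for illustration:

```latex
M_s(F') = \sigma\Big(f^{7\times 7}\big([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')]\big)\Big)
```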

As a further optimization, in step c, the classifier uses the ArcFace loss function as the classification function.
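The source names ArcFace but gives no parameters. As a hedged sketch, the additive angular margin loss for a single sample can be written as below. The scale `s = 64` and margin `m = 0.5` are commonly used defaults assumed here, not values from the source, and the function name and list-based interface are illustrative only.

```python
import math

def arcface_loss(cosines, target, s=64.0, m=0.5):
    """Additive angular margin loss for one sample (illustrative sketch).

    cosines[j] is cos(theta_j) between the L2-normalised embedding and the
    L2-normalised weight vector of class j; target is the true class index.
    s and m are assumed defaults, not values given in the source.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding
        if j == target:
            # Add the margin m to the angle of the true class only.
            # (Handling of theta + m > pi is omitted in this sketch.)
            c = math.cos(math.acos(c) + m)
        logits.append(s * c)
    # Numerically stable softmax cross-entropy.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]

# The margin makes the same prediction cost more than plain softmax (m = 0).
example = [0.8, 0.1, -0.2]
print(arcface_loss(example, 0, s=10.0, m=0.0))
print(arcface_loss(example, 0, s=10.0, m=0.5))
```

With `m = 0` the function reduces to ordinary softmax cross-entropy on scaled cosines, which makes the effect of the margin easy to isolate.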

As a further optimization, the Sigmoid function is represented as:

and is processed using fractional-order differentiation.
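The expression is not reproduced in this text. The standard Sigmoid function is given first below. The source does not say which definition of the fractional derivative is used, so the Grünwald-Letnikov form of order v shown second is only one common choice, stated here as an assumption (the symbols v, h and k are introduced for illustration):

```latex
\sigma(x) = \frac{1}{1 + e^{-x}},
\qquad
D^{v}\sigma(x) = \lim_{h \to 0} \frac{1}{h^{v}} \sum_{k=0}^{\infty} (-1)^{k} \binom{v}{k}\, \sigma(x - kh)
```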

The invention has the beneficial effects that:

The improved residual network structure adds channel attention and spatial attention mechanisms to the ResNet residual network, so that features are extracted along both the channel and spatial dimensions; meaningful features in a face picture can thus be extracted quickly, and the ability to capture features of specific regions is improved. In addition, the node function is processed with fractional-order differentiation, which can speed up the convergence of the network model during training and reduce the time cost incurred by increasing the depth of the network model. The invention can therefore improve face recognition accuracy without adding excessive computational overhead.

Drawings

FIG. 1 is a block diagram of a face recognition system in accordance with the present invention;

FIG. 2 is a flow chart of a face recognition method in the present invention;

FIG. 3 is a schematic diagram of the principle of the improved residual network in the present invention.

Detailed Description

The invention aims to provide an intelligent court monitoring face recognition system and method based on fractional order calculus.

In a specific implementation, as shown in FIG. 1, the face recognition system of the present invention comprises a face picture detection unit, a face picture preprocessing unit and a face recognition unit.

The improved residual network of the invention is based on the ResNet network structure, with channel attention and spatial attention added after the convolution block of the model. As shown in FIG. 3, the improved residual network specifically comprises a convolution block, a channel attention module, a spatial attention module and a classifier; the convolution block is used for extracting features from the face picture to obtain a face feature map; the channel attention module is used for compressing the face feature map in the spatial dimension to generate a channel attention map; the spatial attention module is used for taking the channel attention map as input and compressing it in the channel dimension to generate a spatial attention map; and the classifier is used for comparing the spatial attention map with the face images in the database to obtain a recognition result.

Based on the above face recognition system, the flow of the face recognition method provided by the invention is shown in FIG. 2, and the method comprises the following steps:

S201, extracting the face region of interest from the captured face image to obtain a face picture to be recognized;

S202, preprocessing the face picture to be recognized;

To make processing by the network model easier and to improve recognition accuracy, the preprocessing in this step comprises rectifying and cropping the face picture to be recognized with the trained MTCNN.

S203, recognizing the preprocessed face picture with the trained improved residual network to obtain a face recognition result: in this step, features are first extracted from the face picture to obtain a face feature map; the face feature map is compressed in the spatial dimension with a channel attention mechanism to generate a channel attention map; the channel attention map is taken as input and compressed in the channel dimension with a spatial attention mechanism to generate a spatial attention map; and finally, a classifier compares the spatial attention map with the face images in the database to obtain the recognition result.

Specifically, after the improved residual network uses the convolution block to extract face features and obtain a face feature map, the channel attention mechanism compresses the feature map in the spatial dimension: average pooling and maximum pooling aggregate the spatial information of the feature map, the features produced by the average pooling and the maximum pooling are each fed to a shared multilayer neural network, which compresses the spatial dimension of the input feature map, the two outputs are summed element by element, and the channel attention map is finally generated.
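The channel attention computation described above can be sketched as follows. This is a minimal, dependency-free illustration and not the patented implementation: the function names, the ReLU hidden activation, the hidden width and the nested-list layout (C × H × W) are assumptions in the style of CBAM, and a real model would use a tensor library.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def shared_mlp(w0, w1, descriptor):
    """Two-layer network shared by both pooled descriptors (ReLU is assumed)."""
    hidden = [max(0.0, h) for h in matvec(w0, descriptor)]
    return matvec(w1, hidden)

def channel_attention(feature_map, w0, w1):
    """feature_map: C x H x W nested lists. Returns one weight in (0, 1) per channel."""
    # Compress the spatial dimension: one average and one maximum per channel.
    avg_desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    max_desc = [max(map(max, ch)) for ch in feature_map]
    # Same weights for both descriptors, then element-wise sum and Sigmoid.
    avg_out = shared_mlp(w0, w1, avg_desc)
    max_out = shared_mlp(w0, w1, max_desc)
    return [sigmoid(a + m) for a, m in zip(avg_out, max_out)]

# Tiny example: 2 channels of 2 x 2, hidden width 1 (hypothetical weights).
features = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
w0 = [[1.0, 0.0]]    # hidden x C
w1 = [[1.0], [0.0]]  # C x hidden
print(channel_attention(features, w0, w1))
```

Sharing `w0` and `w1` between the two pooled descriptors keeps the parameter count independent of how many pooling branches are used.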

The channel attention mechanism is expressed as:

Then, the spatial attention module takes the output of the channel attention module as its input feature map and compresses it along the channel dimension: average pooling and maximum pooling are each performed over the channel dimension, the pooled features are combined to obtain a two-dimensional feature map, its dimensionality is reduced through a convolution operation, and the spatial attention map is finally generated.

The spatial attention mechanism is expressed as:

where σ is the Sigmoid function and 7 × 7 is the convolution kernel size.
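The spatial attention step described above can be sketched in the same dependency-free style. The function name, the zero padding at the borders, the optional bias and the nested-list layout are assumptions made for illustration; only the channel-wise average and maximum pooling, the 7 × 7 convolution and the Sigmoid come from the text.

```python
import math

def spatial_attention(feature_map, kernel, bias=0.0):
    """feature_map: C x H x W nested lists; kernel: 2 x k x k with k odd.

    Returns an H x W map with values in (0, 1).
    """
    c, h, w = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    k = len(kernel[0])
    pad = k // 2
    # Compress the channel dimension: average and maximum over channels.
    avg = [[sum(feature_map[ch][i][j] for ch in range(c)) / c for j in range(w)]
           for i in range(h)]
    mx = [[max(feature_map[ch][i][j] for ch in range(c)) for j in range(w)]
          for i in range(h)]
    pooled = [avg, mx]  # combined two-plane descriptor
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:  # zero padding (assumed)
                            acc += kernel[p][di][dj] * pooled[p][y][x]
            out[i][j] = 1.0 / (1.0 + math.exp(-acc))
    return out

# Example with a hypothetical 7 x 7 kernel that only looks at the centre pixel.
kernel = [[[0.0] * 7 for _ in range(7)] for _ in range(2)]
kernel[0][3][3] = kernel[1][3][3] = 1.0
features = [[[1.0] * 3 for _ in range(3)], [[3.0] * 3 for _ in range(3)]]
print(spatial_attention(features, kernel)[0][0])
```

The convolution maps the two pooled planes to a single plane, which is the "dimension reduction" the text refers to.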

Because the improved residual network model extracts features along both the channel and spatial dimensions, it has a stronger feature representation capability than a model using only a channel attention module or only a spatial attention module. It also uses both average pooling and maximum pooling along each dimension and combines them to generate feature descriptors, so meaningful features in the face picture can be extracted quickly and the ability to capture features of specific regions is improved.

Meanwhile, the node function Sigmoid is processed using fractional-order differentiation. Compared with the integer-order derivative of Sigmoid, the 0.5-order derivative changes much faster than the 1st-order derivative near the positions where the function takes the values 0 and 1, so the convergence of the network model during training can be significantly accelerated and the time cost of increasing the depth of the network model is reduced. Finally, the ArcFace loss function is used as the classification function, and the recognized face image is compared with the face images in the database to obtain the recognition result.
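The source does not state which definition of the fractional derivative is applied to Sigmoid. As an illustration only, the sketch below uses the Grünwald-Letnikov approximation (an assumption) to evaluate the 0.5-order and 1st-order derivatives numerically. The names `gl_fractional_derivative`, `h` and `terms` are hypothetical, and truncating the sum fixes a finite lower terminal at `x - terms * h`.

```python
import math

def sigmoid(x):
    # Numerically stable form for large negative x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def gl_fractional_derivative(f, x, order, h=1e-2, terms=2000):
    """Truncated Grünwald-Letnikov estimate of the derivative of f at x.

    Uses the weights w_0 = 1, w_k = w_{k-1} * (1 - (order + 1) / k), which
    equal (-1)^k * binom(order, k). For order = 1 this reduces to the
    backward difference (f(x) - f(x - h)) / h.
    """
    weight = 1.0
    total = f(x)
    for k in range(1, terms + 1):
        weight *= 1.0 - (order + 1.0) / k
        total += weight * f(x - k * h)
    return total / h ** order

# Compare the two orders at a few sample points.
for x in (-4.0, 0.0, 4.0):
    half = gl_fractional_derivative(sigmoid, x, 0.5)
    first = gl_fractional_derivative(sigmoid, x, 1.0)
    print(f"x={x:+.1f}  order 0.5: {half:.4f}  order 1: {first:.4f}")
```

How such a derivative is wired into backpropagation is not described in the source, so this sketch stops at evaluating the derivative itself.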

The Sigmoid function is expressed as:

Example:

In this example, CASIA-WebFace is used as the training data set; it contains 494,414 face images of 10,575 individuals collected from the Internet. The pictures in the data set are first processed with a trained MTCNN to detect faces, the detected face pictures are cropped to 112 × 112 pixels, and the improved residual network model of the invention is trained on these preprocessed face pictures.

In training, the batch size is set to 64, the initial learning rate to 0.05 and the total number of epochs to 25; the learning rate is decayed to 0.1 times its previous value at the 14th and 22nd epochs. To prevent the model from overfitting, the weight decay parameter is set to 5 × 10^-4. The model is optimized with stochastic gradient descent (SGD), with the momentum parameter set to 0.9. When the node function Sigmoid is processed, changing the differentiation order may reduce the face recognition performance to a certain extent, but it can effectively shorten the convergence time of training; the convergence time of model training can therefore be effectively reduced by appropriately adjusting the fractional differentiation order.
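The training schedule above can be written down as a small sketch. The dictionary keys and the function name are illustrative, and the code assumes 1-based epoch numbering with the decay taking effect from the start of epochs 14 and 22, which the text does not state explicitly.

```python
# Hyperparameters as stated in the example (key names are illustrative).
TRAINING_CONFIG = {
    "batch_size": 64,
    "initial_lr": 0.05,
    "epochs": 25,
    "lr_milestones": (14, 22),
    "lr_decay": 0.1,
    "weight_decay": 5e-4,
    "momentum": 0.9,
}

def learning_rate(epoch, config=TRAINING_CONFIG):
    """Step-decay learning rate for a 1-based epoch index (assumed convention)."""
    drops = sum(1 for milestone in config["lr_milestones"] if epoch >= milestone)
    return config["initial_lr"] * config["lr_decay"] ** drops

for epoch in (1, 13, 14, 21, 22, 25):
    print(epoch, learning_rate(epoch))
```

Under this reading the model trains at 0.05 for epochs 1 to 13, at one tenth of that for epochs 14 to 21, and at one hundredth for epochs 22 to 25.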

To verify the effect of the new model, the conventional residual network and the improved residual network were each tested on three face data sets: LFW, AgeDB-30 and CFP-FP. The LFW data set contains 13,233 face images of 5,749 individuals taken under unconstrained conditions; each image is labeled with the corresponding name, and the vast majority of people have only one picture. The AgeDB-30 data set contains 16,488 images belonging to 568 different people, each with identity, age and gender attributes. The CFP data set contains face pictures of 500 different identities, with both frontal and profile faces for each person.

The test results of the conventional residual network and of the improved residual network of the invention on the three face data sets are given in Table 1 and Table 2, respectively.

TABLE 1 Test results of the conventional residual network on LFW, AgeDB-30 and CFP-FP

Data set	Recognition accuracy (%)
LFW	99.383
AgeDB-30	93.336
CFP-FP	95.557

TABLE 2 Test results of the improved residual network on LFW, AgeDB-30 and CFP-FP

Data set	Recognition accuracy (%)
LFW	99.583
AgeDB-30	94.583
CFP-FP	96.104

It can be seen that, after the channel attention mechanism and the spatial attention mechanism are introduced into the residual network, the recognition accuracy on all three data sets is higher than that of the conventional residual network (99.583 versus 99.383 on LFW, 94.583 versus 93.336 on AgeDB-30, and 96.104 versus 95.557 on CFP-FP). The face recognition system therefore has better recognition performance and is particularly suitable for court monitoring.
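The size of the improvement can be read off Tables 1 and 2 with a short calculation. The variable names are illustrative, and the gains are expressed in percentage points on the assumption that the reported accuracies are percentages.

```python
# Accuracies copied from Table 1 (conventional) and Table 2 (improved).
baseline = {"LFW": 99.383, "AgeDB-30": 93.336, "CFP-FP": 95.557}
improved = {"LFW": 99.583, "AgeDB-30": 94.583, "CFP-FP": 96.104}

# Absolute gain per data set, rounded to the precision of the tables.
gains = {name: round(improved[name] - baseline[name], 3) for name in baseline}

for name, gain in gains.items():
    print(f"{name}: +{gain:.3f}")
# prints:
# LFW: +0.200
# AgeDB-30: +1.247
# CFP-FP: +0.547
```

The largest gain is on AgeDB-30 and the smallest on LFW, where the conventional network already exceeds 99.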

Claims

1. An intelligent court monitoring face recognition system based on fractional calculus, characterized by comprising:

the improved residual network comprises a convolution block, a channel attention module, a spatial attention module and a classifier; the convolution block is used for extracting features from the face picture to obtain a face feature map; the channel attention module is used for compressing the face feature map in the spatial dimension to generate a channel attention map; the spatial attention module is used for taking the channel attention map as input and compressing it in the channel dimension to generate a spatial attention map; the classifier is used for comparing the output feature map with the face images in the database to obtain a recognition result.

2. An intelligent court monitoring face recognition method based on fractional calculus, applied to the face recognition system as claimed in claim 1, characterized by comprising the following steps:

b. preprocessing the face picture to be recognized;

extracting features from the face picture to obtain a face feature map; compressing the face feature map in the spatial dimension with a channel attention mechanism to generate a channel attention map; taking the channel attention map as input and compressing it in the channel dimension with a spatial attention mechanism to generate a spatial attention map; and finally, using a classifier to compare the output feature map with the face images in the database to obtain the recognition result.

3. The method as claimed in claim 2, wherein in step b the preprocessing comprises rectifying and cropping the face picture to be recognized with a trained MTCNN.

4. The method as claimed in claim 2, wherein in step c, compressing the face feature map in the spatial dimension with the channel attention mechanism to generate the channel attention map comprises:

5. The intelligent court monitoring face recognition method based on fractional calculus as claimed in claim 4, wherein

the channel attention mechanism is expressed as:

6. The intelligent court monitoring face recognition method based on fractional calculus as claimed in claim 2, wherein

in step c, taking the channel attention map as input and compressing it in the channel dimension with the spatial attention mechanism to generate the spatial attention map comprises:

compressing the input channel attention map along the channel dimension by performing average pooling and maximum pooling over the channel dimension, combining the pooled features to obtain a two-dimensional feature map, reducing its dimensionality through a convolution operation, and finally generating the spatial attention map.

7. The intelligent court monitoring face recognition method based on fractional calculus as claimed in claim 6, wherein

the spatial attention mechanism is expressed as:

where σ is the Sigmoid function and 7 × 7 is the convolution kernel size.

8. The method as claimed in any one of claims 2 to 7, wherein in step c, the classifier uses the ArcFace loss function as the classification function.

9. The intelligent court monitoring face recognition method based on fractional calculus as claimed in any one of claims 2 to 7, wherein the Sigmoid function is expressed as:

and is processed using fractional-order differentiation.