Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a hyper-spectral face recognition method, a hyper-spectral face recognition device, an electronic device, and a storage medium thereof.
One embodiment of the invention provides a hyper-spectral face recognition method, which comprises the following steps:
acquiring a visible light face image set and an infrared face image set, and dividing the visible light face image set into a visible light face training image set and a visible light face testing image set, wherein the infrared face image set is divided into an infrared face training image set and an infrared face testing image set;
respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set;
constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model;
inputting the visible light face test image set and the infrared face test image set into the trained hyperspectral image fusion network model to obtain a hyperspectral face image set, and dividing the hyperspectral face image set into a hyperspectral face training image set and a hyperspectral face test image set;
constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model;
and inputting the hyper-spectral face test image set into the trained convolutional neural network face recognition model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face recognition.
In an embodiment of the present invention, the pre-processing the visible light face training image set, the visible light face testing image set, the infrared face training image set, and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set, and an infrared face preprocessing testing image set respectively includes:
respectively carrying out gray level conversion and normalization processing on the visible light face training image set and the visible light face testing image set to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set;
and respectively carrying out image enhancement and normalization processing on the infrared face training image set and the infrared face testing image set to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
In one embodiment of the invention, the constructed hyperspectral image fusion network model comprises a pre-fusion layer, an encoder module, a fusion layer and a decoder module, wherein,
the input of the pre-fusion layer is connected with an input image, the output of the pre-fusion layer is connected with the input of the encoder module, the encoder module comprises a first convolution layer and a dense residual module which are sequentially connected, the output of the dense residual module and the output of the pre-fusion layer are subjected to global residual connection output, and the global residual connection output is connected with the input of the fusion layer;
the output of the fusion layer is connected with the input of the decoder module, the decoder module comprises second to fifth convolution layers and a feedback layer which are sequentially connected, and the output of the feedback layer and the output of the fusion layer are input to the second convolution layer again to form feedback connection.
In an embodiment of the present invention, the dense residual module includes first to third dense residual connection layers, a multi-scale splicing layer, and a fourth dense residual connection layer, which are sequentially connected; an input of the first dense residual connection layer is further connected to an output of the first dense residual connection layer, an output of the second dense residual connection layer, and an output of the third dense residual connection layer; an input of the second dense residual connection layer is further connected to an output of the second dense residual connection layer and an output of the third dense residual connection layer; an input of the third dense residual connection layer is further connected to an output of the third dense residual connection layer; an output of the fourth dense residual connection layer and an input of the first convolution layer form a local residual connection output; and the local residual connection output and an output of the pre-fusion layer form the global residual connection output.
In an embodiment of the present invention, training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model, includes:
constructing a composite loss function based on structural loss, pixel loss and average gradient loss;
and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set by using the composite loss function based on structural loss, pixel loss and average gradient loss to obtain the trained hyperspectral image fusion network model.
In one embodiment of the invention, the constructed convolutional neural network face recognition model comprises a ResNet feature extraction module, a feature normalization module and a feature space mapping module which are sequentially connected.
In an embodiment of the present invention, training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model, including:
constructing a triplet loss function (Triplet loss);
and training the convolutional neural network face recognition model according to the hyper-spectral face training image set by using the triplet loss function to obtain the trained convolutional neural network face recognition model.
Another embodiment of the present invention provides a hyper-spectral face recognition apparatus, including:
the data acquisition module is used for acquiring a visible light face image set and an infrared face image set, dividing the visible light face image set into a visible light face training image set and a visible light face testing image set, and dividing the infrared face image set into an infrared face training image set and an infrared face testing image set;
the data preprocessing module is used for respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set;
the first model construction training module is used for constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model;
the data generation module is used for inputting the visible light face test image set and the infrared face test image set into the trained hyper-spectral image fusion network model to obtain a hyper-spectral face image set, and dividing the hyper-spectral face image set into a hyper-spectral face training image set and a hyper-spectral face test image set;
the second model construction training module is used for constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model;
and the data identification module is used for inputting the hyper-spectral face test image set into the trained convolutional neural network face identification model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face identification.
A further embodiment of the present invention provides a hyper-spectral face recognition electronic device, which comprises an image collector, a display, a processor, a communication interface, a memory and a communication bus, wherein the image collector, the display, the processor, the communication interface and the memory communicate with one another through the communication bus;
the image collector is used for collecting image data;
the display is used for displaying the image identification data;
the memory is used for storing a computer program;
the processor is configured to implement any one of the above hyper-spectral face recognition methods when executing the computer program stored in the memory.
Yet another embodiment of the present invention provides a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements a hyper-spectral face recognition method as described in any of the above.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a complete hyper-spectral face recognition technique that overcomes the narrow application range, low recognition performance, and poor feature-extraction robustness of traditional face recognition technology; the embodiment provides a new theory and a new method for putting face recognition technology into practice, making it more practical, more reliable, and easier to deploy widely.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
At present, the face recognition technology mostly adopts visible light imaging and acquisition means, so the technology is mostly limited to the good background condition with sufficient light in the daytime and generally has poor performance in the environments of insufficient light, severe climate and the like. With the emergence of applications in various complex environments in the real world, the face recognition technology based on visible light is more and more difficult to meet the requirements, and the infrared imaging technology has the advantages of low background light requirements, imaging in rainy days, foggy days and other climates, and the like, and makes up for the defects of the visible light imaging environment. Therefore, referring to fig. 1, fig. 1 is a schematic flow chart of a hyper-spectral face recognition method according to an embodiment of the present invention, where the embodiment provides a hyper-spectral face recognition method, and the method includes the following steps:
Step 1, a visible light face image set and an infrared face image set are obtained, the visible light face image set is divided into a visible light face training image set and a visible light face testing image set, and the infrared face image set is divided into an infrared face training image set and an infrared face testing image set.
Specifically, in this embodiment, a visible light camera and an infrared camera simultaneously capture face images of each individual to obtain the visible light face image set and the infrared face image set, which are used for the subsequent face recognition.
Step 2, respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
Specifically, to achieve better image fusion, this embodiment performs normalization and contrast adjustment, before image fusion, on the face images in the visible light face training image set, the visible light face testing image set, the infrared face training image set, and the infrared face testing image set. Step 2 of this embodiment includes step 2.1 and step 2.2:
Step 2.1, respectively carrying out gray level conversion and normalization processing on the visible light face training image set and the visible light face testing image set to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set.
Specifically, in this embodiment, the face images in the visible light face training image set and the visible light face testing image set are first converted to grayscale images according to the following formula:
$$I_{gray} = 0.2989 \times R + 0.5870 \times G + 0.1140 \times B \tag{1}$$
where $I_{gray}$ is the grayscale image output by the conversion, and R, G and B are the RGB values of the image before conversion, which in this embodiment are the RGB values of the face images in the visible light face training image set and the visible light face testing image set.
Then, the grayscale image $I_{gray}$ is normalized to [0, 255] according to the following formula:
$$I_n = 255 \times \frac{I_{gray} - I_{min}}{I_{max} - I_{min}} \tag{2}$$
where $I_n$ is the normalized output of the grayscale image $I_{gray}$, and $I_{max}$ and $I_{min}$ are respectively the maximum and minimum gray values of $I_{gray}$.
In this embodiment, each of the visible light face training image and the visible light face testing image in the visible light face training image set and the visible light face testing image set is processed by the above formula (1) and formula (2), so as to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set.
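The visible light preprocessing of step 2.1 can be sketched as follows. This is a minimal, dependency-free illustration on nested lists, assuming formula (2) is the usual min-max rescaling to [0, 255]; the function names `to_gray` and `normalize_to_255` and the handling of a constant image are assumptions, not part of the embodiment.

```python
def to_gray(rgb_image):
    """Apply formula (1) to an image given as rows of (R, G, B) tuples."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
            for row in rgb_image]


def normalize_to_255(image):
    """Min-max rescale a 2-D image to [0, 255] (assumed form of formula (2))."""
    flat = [v for row in image for v in row]
    i_min, i_max = min(flat), max(flat)
    if i_max == i_min:
        # Assumption: a constant image maps to all zeros to avoid division by zero.
        return [[0.0 for _ in row] for row in image]
    scale = 255.0 / (i_max - i_min)
    return [[(v - i_min) * scale for v in row] for row in image]
```

For example, `normalize_to_255(to_gray(image))` would yield one preprocessed visible light image; a practical pipeline would more likely use an array library for speed.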
Step 2.2, respectively carrying out image enhancement and normalization processing on the infrared face training image set and the infrared face testing image set to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
Specifically, in this embodiment, the face images in the infrared face training image set and the infrared face testing image set are first enhanced with a log operator according to the following formula:
$$I = \log(1 + X) \tag{3}$$
where I is the image after enhancement and X is the image before enhancement, which in this embodiment is a face image in the infrared face training image set or the infrared face testing image set.
Then, the enhanced image I is normalized to [0, 255] by the above formula (2).
In this embodiment, each of the infrared face training image set and the infrared face testing image set is processed by the above formula (3) and formula (2), so as to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
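The infrared preprocessing of step 2.2 can be sketched in the same style. The natural logarithm and the min-max form of formula (2) are assumptions, and the function names are hypothetical.

```python
import math


def log_enhance(image):
    """Formula (3): I = log(1 + X) per pixel (natural logarithm assumed)."""
    return [[math.log1p(x) for x in row] for row in image]


def normalize_to_255(image):
    """Min-max rescale a 2-D image to [0, 255] (assumed form of formula (2))."""
    flat = [v for row in image for v in row]
    i_min, i_max = min(flat), max(flat)
    if i_max == i_min:
        # Assumption: a constant image maps to all zeros.
        return [[0.0 for _ in row] for row in image]
    scale = 255.0 / (i_max - i_min)
    return [[(v - i_min) * scale for v in row] for row in image]
```

The log operator stretches dark regions: for the pixel row `[0, 15, 255]`, the middle value 15 ends up near the middle of the output range after enhancement and normalization, because log(16) is half of log(256).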
Step 3, constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain the trained hyperspectral image fusion network model.
Specifically, in order to obtain a high-quality hyperspectral image with rich texture information and thermal information, this embodiment proposes a novel feedback-type deep fusion framework to fuse an infrared image and a visible light image. Step 3 of this embodiment includes step 3.1 and step 3.2:
Step 3.1, constructing a hyperspectral image fusion network model.
Specifically, referring to fig. 2, fig. 2 is a schematic structural diagram of the hyper-spectral image fusion network model in the hyper-spectral face recognition method according to an embodiment of the present invention. This embodiment provides a highly robust hyper-spectral image fusion network model, RFFuseNet, designed specifically for hyper-spectral face image fusion. The constructed hyperspectral image fusion network model comprises a pre-fusion layer PF, an encoder module, a fusion layer F and a decoder module. The input of the pre-fusion layer PF is connected with the input image, and the output of the pre-fusion layer PF is connected with the input of the encoder module. The encoder module comprises a first convolution layer C1 and a dense residual module RDB which are sequentially connected; the output of the dense residual module RDB and the output of the pre-fusion layer PF form a global residual connection output, and the global residual connection output is connected with the input of the fusion layer F. The output of the fusion layer F is connected with the input of the decoder module. The decoder module comprises second to fifth convolution layers C2 to C5 and a feedback layer FB which are sequentially connected, and the output of the feedback layer FB and the output of the fusion layer F are input to the second convolution layer C2 again to form a feedback connection.
Referring to fig. 3, fig. 3 is a schematic structural diagram of the dense residual module in the hyper-spectral face recognition method according to an embodiment of the present invention. The dense residual module RDB includes first to third dense residual connection layers C_R_1 to C_R_3, a multi-scale splicing layer CC, and a fourth dense residual connection layer C_R_4, which are sequentially connected. An input of the first dense residual connection layer is further connected to an output of the first dense residual connection layer, an output of the second dense residual connection layer, and an output of the third dense residual connection layer; an input of the second dense residual connection layer is further connected to an output of the second dense residual connection layer and an output of the third dense residual connection layer; an output of the fourth dense residual connection layer C_R_4 and an input of the first convolution layer C1 form a local residual connection output; and the local residual connection output and an output of the pre-fusion layer PF form the global residual connection output.
Step 3.2, training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain the trained hyperspectral image fusion network model.
Specifically, in the training process of this embodiment, a composite loss function based on structural loss, pixel loss and average gradient loss is constructed as a weighted sum of the three losses:
$$Loss = \lambda L_{ssim} + L_p + 0.05 L_{ag} \tag{4}$$
where λ is a parameter that may take the value 1, 10, 100 or 1000, and $L_{ssim}$, $L_p$ and $L_{ag}$ are respectively the structural loss, the pixel loss and the average gradient loss, designed as follows:
$$L_{ssim} = 1 - SSIM(O, I) \tag{5}$$
$$L_p = \lVert O - I \rVert_2 \tag{6}$$
where O and I respectively represent the output image and the input image of the hyper-spectral image fusion network model, M × N represents the size of the output image, the two gradient terms of the average gradient loss $L_{ag}$ respectively represent the gradients of the output image in the horizontal direction and the vertical direction, and SSIM(O, I) represents the structural similarity between the output image and the input image, calculated as follows:
$$SSIM(O, I) = l(O, I)^{\alpha} \cdot c(O, I)^{\beta} \cdot s(O, I)^{\gamma} \tag{8}$$
where l(O, I) is the luminance term, computed from the mean intensities of the output image and the input image; c(O, I) is the contrast term, computed from their variances; s(O, I) is the structure term of the output image and the input image; and α, β and γ are parameters that adjust the relative weight of the three components.
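How the terms of formula (4) combine can be sketched as follows. This is an illustration under several assumptions: images are flat lists of pixel values, SSIM is evaluated over a single global window with α = β = γ = 1 and the conventional stabilizing constants K1 = 0.01 and K2 = 0.03, and the average gradient loss is passed in as a precomputed number `l_ag`. All function names are hypothetical.

```python
import math


def ssim_global(o, i, data_range=255.0):
    """Single-window SSIM with alpha = beta = gamma = 1 (an assumption)."""
    n = len(o)
    mu_o, mu_i = sum(o) / n, sum(i) / n
    var_o = sum((a - mu_o) ** 2 for a in o) / n
    var_i = sum((b - mu_i) ** 2 for b in i) / n
    cov = sum((a - mu_o) * (b - mu_i) for a, b in zip(o, i)) / n
    c1 = (0.01 * data_range) ** 2  # conventional luminance stabilizer
    c2 = (0.03 * data_range) ** 2  # conventional contrast stabilizer
    return ((2 * mu_o * mu_i + c1) * (2 * cov + c2)) / (
        (mu_o ** 2 + mu_i ** 2 + c1) * (var_o + var_i + c2))


def pixel_loss(o, i):
    """Formula (6): L2 norm of O - I."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(o, i)))


def composite_loss(o, i, l_ag, lam=1.0):
    """Formula (4): lam * L_ssim + L_p + 0.05 * L_ag, with L_ag supplied by the caller."""
    l_ssim = 1.0 - ssim_global(o, i)  # formula (5)
    return lam * l_ssim + pixel_loss(o, i) + 0.05 * l_ag
```

When the output equals the input, the structural and pixel terms vanish and only the weighted gradient term remains. Practical SSIM implementations usually average the index over local sliding windows rather than one global window.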
And further, training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set by using a composite loss function based on structural loss, pixel loss and average gradient loss to obtain the trained hyperspectral image fusion network model.
Specifically, in this embodiment, using the composite loss function constructed by the above formula (4), the face training images in the visible light face preprocessing training image set and the infrared face preprocessing training image set are input into the hyperspectral image fusion network model for training. A back propagation algorithm may be used during training. The resulting trained hyperspectral image fusion network model is used to generate the subsequent hyperspectral face image set.
Step 4, inputting the visible light face test image set and the infrared face test image set into the trained hyperspectral image fusion network model to obtain a hyperspectral face image set, and dividing the hyperspectral face image set into a hyperspectral face training image set and a hyperspectral face test image set.
Specifically, in this embodiment, a trained hyper-spectral image fusion network model is obtained in step 3, and the hyper-spectral image fusion network model is used to perform face image fusion processing on the face images in the visible light face test image set and the infrared face test image set, so as to obtain a high-quality hyper-spectral face image set having rich texture information and thermal information. The hyper-spectral face image set is divided into a hyper-spectral face training image set and a hyper-spectral face testing image set for subsequent hyper-spectral face recognition, and preferably, the division ratio of the hyper-spectral face training image set and the hyper-spectral face testing image set is 3:1.
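The 3:1 division can be sketched as follows. The embodiment specifies only the ratio; the shuffle, the fixed seed, and the function name `split_three_to_one` are assumptions for illustration.

```python
import random


def split_three_to_one(items, seed=0):
    """Shuffle a copy of the items and split it 3:1 into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]
```

With eight fused images, this returns six training images and two test images, and every image lands in exactly one of the two sets.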
Step 5, constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
Specifically, this embodiment combines deep learning face recognition with deep hyper-spectral fusion to improve the final face recognition rate. After the hyper-spectral face image set is generated in step 4, a deep-learning-based hyper-spectral face recognition framework, namely the convolutional neural network face recognition model, is proposed. Step 5 of this embodiment includes step 5.1 and step 5.2:
Step 5.1, constructing a convolutional neural network face recognition model.
Specifically, referring to fig. 4, fig. 4 is a schematic structural diagram of the convolutional neural network face recognition model in the hyper-spectral face recognition method provided in an embodiment of the present invention. The convolutional neural network face recognition model constructed in this embodiment includes a ResNet feature extraction module, a feature normalization module L2, and a feature space mapping module ED, which are connected in sequence. The input image batch of the convolutional neural network face recognition model is first cropped to a fixed size; the ResNet feature extraction module, the feature normalization module L2, and the feature space mapping module ED may each be implemented by existing common methods, and the specific implementation is not limited.
Step 5.2, training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
Specifically, in this embodiment, a triplet loss function (Triplet loss) is constructed during face recognition training and used to train the convolutional neural network face recognition model. The triplet loss function is defined over three input images of the convolutional neural network face recognition model: one pair of them are two images from the same category, another pair are two images from different categories, and α is the margin parameter. Accordingly, the output features that the convolutional neural network face recognition model produces for the three images form the triplet that is input to the triplet loss function.
It can be seen that in this embodiment the triplet loss function takes the features of three images and their corresponding labels as input, and aims, through training on a large number of triplets, to make the intra-class distance smaller than the inter-class distance. The optimizer used during training is Adam, the learning rate is 0.001, and the batch size is 128. The curve of the loss value against the number of iterations is inspected with the TensorBoard plug-in, and a suitable threshold is set to stop model training, yielding the trained convolutional neural network face recognition model. The images received by the triplet loss function come from the hyper-spectral face training image set.
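The objective described above can be illustrated with the widely used margin-based form of the triplet loss. This is a sketch under assumptions: the squared Euclidean distance, the default margin of 0.2, and the names `anchor`, `positive` and `negative` are not taken from the embodiment.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """max(d(anchor, positive) - d(anchor, negative) + alpha, 0).

    anchor and positive are features of two images from the same category;
    negative is the feature of an image from a different category.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + alpha, 0.0)
```

The loss is zero once the same-category pair is closer than the different-category pair by at least the margin α, which is what drives the intra-class distance below the inter-class distance.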
Step 6, inputting the hyper-spectral face test image set into the trained convolutional neural network face recognition model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face recognition.
Specifically, in this embodiment, the Euclidean distance is used as the metric function. The hyper-spectral face test image set is input to the trained convolutional neural network face recognition model for feature extraction to obtain a face feature set; the face feature set and its labels are input to a support vector machine (SVM) classifier; and the face recognition accuracy is obtained through the SVM classifier, thereby realizing hyper-spectral face recognition. The face recognition accuracy is obtained through the SVM classifier as follows:
(1) The maximum intra-class Euclidean distance $D_{max}$ is set as the threshold. The minimum Euclidean distance $D_{bw\_min}$ between the predicted label and the current image is compared with $D_{max}$: if $D_{bw\_min}$ is less than $D_{max}$, the predicted classification is considered correct; otherwise it is considered wrong. The numbers of correct and wrong classifications are counted.
(2) The classification accuracy (Accuracy) is calculated from the numbers of correct and wrong classifications counted in step (1), according to the following formula:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP denotes the true positive cases, TN denotes the true negative cases, FP denotes the false positive cases, and FN denotes the false negative cases.
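Steps (1) and (2) can be sketched as follows, assuming the minimum distances have already been computed for each test image. The function names are hypothetical, and the accuracy is expressed directly as correct / (correct + wrong), which equals (TP + TN) / (TP + TN + FP + FN).

```python
def count_decisions(min_distances, d_max):
    """Step (1): a prediction counts as correct when its distance is below d_max."""
    correct = sum(1 for d in min_distances if d < d_max)
    return correct, len(min_distances) - correct


def accuracy(correct, wrong):
    """Step (2): fraction of correct classifications."""
    return correct / (correct + wrong)
```

For example, with distances `[0.2, 0.9, 0.5]` and a threshold of 0.6, two predictions are counted as correct and one as wrong, giving an accuracy of two thirds.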
In summary, the hyper-spectral face recognition technique based on deep image fusion introduced in this embodiment progressively improves the final recognition accuracy through deep learning and image fusion. Specifically: first, before image fusion, normalization and contrast adjustment are performed on the visible light image and the infrared image; then the infrared image and the visible light image are fused by the deep fusion framework RFFuseNet (the hyper-spectral image fusion network model) provided by the present application to obtain a high-quality hyper-spectral image with rich texture information and thermal information; finally, the hyper-spectral face image is recognized by a deep learning face recognition method (the convolutional neural network face recognition model), thereby improving on traditional face recognition based on visible light images.
To verify the superiority of the hyper-spectral face recognition method provided by the present application, the face data sets used are the CASIA face image set and the QFIRE face image set. Referring to fig. 5a to 5d, fig. 5a to 5d are schematic diagrams of example visible light face images and infrared face images in the hyper-spectral face recognition method provided by the embodiment of the present invention. Specifically, fig. 5(a) and 5(c) show face images captured in visible light at 1.5 m, and fig. 5(b) and 5(d) show face images captured in the near infrared at 1.5 m; fig. 5(a) is from the CASIA visible light image set, fig. 5(b) is from the CASIA near infrared image set, fig. 5(c) is from the QFIRE visible light image set, and fig. 5(d) is from the QFIRE near infrared image set.
In the verification process of this embodiment, the parameters of each layer in the hyperspectral image fusion network model are shown in Table 1, and the padding mode used in convolution is zero padding. The parameters of each layer in the hyperspectral image fusion network model may be designed according to actual conditions; this embodiment uses the specific parameters in Table 1 for recognition and verification.
TABLE 1 design of parameters for each layer in a hyperspectral image fusion network model
The experiments designed in this example were demonstrated in two ways:
(1) To show that the fusion effect of the hyper-spectral image fusion network model RFFuseNet is superior to that of traditional face image fusion and other mainstream deep learning methods, RFFuseNet is compared with CBF-based image fusion (a traditional method), DenseFuse-based image fusion (a mainstream deep learning method) and others, and evaluation indexes including the entropy value (EN), structural similarity (SSIM), image edge fidelity ($Q_{abf}$) and artificial noise ($N_{abf}$) are calculated. Referring to fig. 6a to 6b, fig. 6a to 6b are schematic diagrams of face images after hyper-spectral fusion in the hyper-spectral face recognition method according to an embodiment of the present invention, where fig. 6(a) is the QFIRE image fusion result and fig. 6(b) is the CASIA image fusion result. Referring to Table 2, Table 2 shows the comparison of the entropy value (EN), image edge fidelity ($Q_{abf}$), structural similarity (SSIM) and artificial noise ($N_{abf}$) of the different image fusion methods provided in this embodiment. Larger values of EN, $Q_{abf}$ and SSIM indicate a better fusion effect, and a smaller value of $N_{abf}$ indicates a better fusion effect.
Table 2 comparison results of different image fusion methods provided in this embodiment
The comparison results in table 2 show that the hyper-spectral face images fused by the RFFuseNet network of the present application have the largest entropy value, structural similarity and image edge fidelity and the smallest artificial noise, so the fusion performance of the RFFuseNet method designed by the present application is superior to that of the traditional image fusion method and the mainstream deep learning method.
(2) To show that face recognition using the hyper-spectral image fusion network model RFFuseNet performs better than recognition without fusion, face recognition experiments are carried out for cases including unfused single-spectrum images (namely visible light images and infrared images), CBF-based image fusion, DenseFuse-based image fusion, and RFFuseNet fusion. Referring to table 3, table 3 shows the face recognition accuracy comparison results of the different methods provided in this embodiment, including results before and after the deep fusion technique is used.
Table 3 comparison results of face recognition accuracy rates of different methods provided in this embodiment
As can be seen from the comparison results in table 3, the recognition rate on hyper-spectral images fused by RFFuseNet is significantly higher than that on either of the two single-spectrum images, which shows that the RFFuseNet fusion method of the present application can effectively improve face recognition performance. Meanwhile, the recognition rate with RFFuseNet fusion is also higher than that with CBF and DenseFuse, which shows that the RFFuseNet model of the present application gives better hyper-spectral face recognition performance than the other face fusion methods.
Therefore, to address the limitation that traditional face recognition methods use only visible light, this embodiment introduces image fusion and fuses the visible light face image with the infrared face image to obtain a hyper-spectral face image, so that the face image carries the complementary information of visible light and infrared (namely rich texture and thermal information) at the same time, which improves face recognition performance. For the face image fusion problem, this embodiment designs a novel residual feedback hyper-spectral image fusion network model, RFFuseNet. Experiments show that, compared with traditional fusion methods, RFFuseNet does not require manually designed fusion rules or manually selected fusion parameters and can produce hyper-spectral face images of higher quality that contain richer information, and that it achieves better fusion indexes than other fusion methods based on deep learning. This embodiment also combines deep learning face recognition with deep hyper-spectral fusion to provide a hyper-spectral face recognition framework based on deep learning, which can solve the problems that traditional face recognition is limited to visible light and has insufficient recognition performance; experiments show that the method significantly improves the final face recognition rate.
This embodiment provides a complete hyper-spectral face recognition technology that can overcome the defects of traditional face recognition technology, such as a narrow application range, low recognition performance and poor robustness of feature extraction. It provides new theoretical and methodological support for making face recognition practical, so that the technology becomes more practical, more reliable and easier to popularize. It can be widely applied to attendance checking, civil monitoring, public security law enforcement, access control, residential community entrances and other applications in complex environments such as outdoors, at night, and in rain or snow.
Example two
On the basis of the first embodiment, please refer to fig. 7, and fig. 7 is a schematic structural diagram of a hyper-spectral face recognition apparatus according to an embodiment of the present invention. This embodiment provides a hyperspectral human face recognition device, and the device includes:
and the data acquisition module is used for acquiring a visible light face image set and an infrared face image set, dividing the visible light face image set into a visible light face training image set and a visible light face test image set, and dividing the infrared face image set into an infrared face training image set and an infrared face test image set.
And the data preprocessing module is used for respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
Specifically, the data preprocessing module of this embodiment respectively preprocesses a visible light face training image set, a visible light face test image set, an infrared face training image set, and an infrared face test image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing test image set, an infrared face preprocessing training image set, and an infrared face preprocessing test image set, and the data preprocessing module includes:
respectively carrying out gray level conversion and normalization processing on the visible light face training image set and the visible light face testing image set to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set;
and respectively carrying out image enhancement and normalization processing on the infrared face training image set and the infrared face testing image set to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
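The two preprocessing branches above can be sketched as follows. This is a minimal illustration under stated assumptions, not the preprocessing of the present application: the gray level conversion uses the common ITU-R BT.601 weights, normalization is taken to be min-max scaling to [0, 1], and histogram equalization stands in for the unspecified image enhancement. All function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to grayscale (BT.601 weights, assumed)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb @ np.array([0.299, 0.587, 0.114])

def normalize(img):
    """Min-max scale an image to [0, 1] (assumed form of normalization)."""
    img = np.asarray(img, dtype=np.float64)
    span = img.max() - img.min()
    if span == 0:
        return np.zeros_like(img)
    return (img - img.min()) / span

def equalize(img, levels=256):
    """Histogram equalization of an 8-bit image (assumed enhancement step)."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest used level
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant image: nothing to equalize
        return img.copy()
    # Lookup table mapping each gray level to its equalized value.
    lut = np.clip(np.round((cdf - cdf_min) / denom * (levels - 1)), 0, levels - 1)
    return lut.astype(np.uint8)[img]

def preprocess_visible(rgb):
    """Gray level conversion followed by normalization."""
    return normalize(to_gray(rgb))

def preprocess_infrared(ir):
    """Image enhancement followed by normalization."""
    return normalize(equalize(ir))
```

The same two functions would be applied to both the training and the testing image sets, so that the fusion network sees inputs on the same scale in both phases.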
And the first model construction training module is used for constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain the trained hyperspectral image fusion network model.
Specifically, the hyperspectral image fusion network model constructed in the first model construction training module of the embodiment includes a pre-fusion layer, an encoder module, a fusion layer and a decoder module, wherein,
the input of the pre-fusion layer is connected with the input image, the output of the pre-fusion layer is connected with the input of the encoder module, the encoder module comprises a first convolution layer and a dense residual module which are sequentially connected, the output of the dense residual module and the output of the pre-fusion layer are subjected to global residual connection output, and the global residual connection output is connected with the input of the fusion layer;
the output of the fusion layer is connected with the input of the decoder module, the decoder module comprises second to fifth convolution layers and a feedback layer which are sequentially connected, and the output of the feedback layer and the output of the fusion layer are input into the second convolution layer again to form feedback connection.
The dense residual module comprises first to third dense residual connection layers, a multi-scale splicing layer and a fourth dense residual connection layer which are sequentially connected, wherein the input of the first dense residual connection layer is further connected with the output of the first dense residual connection layer, the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the second dense residual connection layer is further connected with the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the third dense residual connection layer is further connected with the output of the third dense residual connection layer; the output of the fourth dense residual connection layer and the input of the first convolution layer are subjected to local residual connection output; and the local residual connection output and the output of the pre-fusion layer are subjected to global residual connection output.
Further, training the hyperspectral image fusion network model according to the visible face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model, comprising:
constructing a composite loss function based on structural loss, pixel loss and average gradient loss;
and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set by utilizing a composite loss function based on structural loss, pixel loss and average gradient loss to obtain the trained hyperspectral image fusion network model.
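The three terms of the composite loss can be illustrated numerically as follows. This is a rough sketch under several assumptions that are not stated in the present application: the structural term is taken as 1 minus a single-window (global) SSIM, the pixel term as the mean squared error, the gradient term as the absolute difference of the average gradients, the images are scaled to [0, 1], and the weights `w_ssim`, `w_pixel` and `w_grad` as well as the `reference` image are hypothetical. For actual training, the same terms would be written in a framework that supports automatic differentiation.

```python
import numpy as np

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over the whole image as one window; inputs assumed in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den

def average_gradient(img):
    """Mean magnitude of horizontal and vertical finite differences."""
    dx = img[:-1, 1:] - img[:-1, :-1]
    dy = img[1:, :-1] - img[:-1, :-1]
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

def composite_loss(fused, reference, w_ssim=1.0, w_pixel=1.0, w_grad=1.0):
    """Weighted sum of structural, pixel and average gradient losses.

    Assumption: each term compares the network output with a reference image;
    the weights are placeholders, not values from the source.
    """
    fused = np.asarray(fused, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    structural = 1.0 - global_ssim(fused, reference)
    pixel = np.mean((fused - reference) ** 2)
    gradient = abs(average_gradient(fused) - average_gradient(reference))
    return float(w_ssim * structural + w_pixel * pixel + w_grad * gradient)
```

With this formulation the loss is zero when the output equals the reference and grows as structure, intensity or edge strength diverge, so each term penalizes a different kind of fusion error.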
And the data generation module is used for inputting the visible light face test image set and the infrared face test image set into the trained hyperspectral image fusion network model to obtain a hyperspectral face image set, and dividing the hyperspectral face image set into a hyperspectral face training image set and a hyperspectral face test image set.
And the second model construction training module is used for constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
Specifically, the convolutional neural network face recognition model constructed in the second model construction training module in this embodiment includes a ResNet feature extraction module, a feature normalization module, and a feature space mapping module, which are connected in sequence.
Further, training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model, comprising:
constructing a triplet loss function (Triplet loss);
and training the convolutional neural network face recognition model with the triplet loss function according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
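For reference, a triplet loss in its commonly used form compares an anchor embedding with a positive embedding (same identity) and a negative embedding (different identity). The sketch below assumes squared Euclidean distances and a margin of 0.2; both choices, and the function name, are assumptions and are not specified in the present application.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean triplet loss over a batch; each row is one embedding.

    Assumption: squared Euclidean distance and a hypothetical margin of 0.2.
    """
    anchor = np.asarray(anchor, dtype=np.float64)
    positive = np.asarray(positive, dtype=np.float64)
    negative = np.asarray(negative, dtype=np.float64)
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor to same identity
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor to other identity
    # Penalize triplets where the negative is not farther than the positive
    # by at least the margin.
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))
```

The loss is zero once every negative is farther from its anchor than the positive by at least the margin, which pulls images of the same face together and pushes different faces apart in the feature space.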
And the data identification module is used for inputting the hyper-spectral face test image set into a trained convolutional neural network face identification model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face identification.
The hyper-spectral face recognition device provided by this embodiment can execute the above hyper-spectral face recognition method embodiment, and its implementation principle and technical effect are similar, and are not described herein again.
Example three
On the basis of the second embodiment, please refer to fig. 8, and fig. 8 is a schematic structural diagram of an electronic device for hyper-spectral face recognition according to an embodiment of the present invention. This embodiment provides a hyper-spectral face recognition electronic device, which comprises an image collector, a display, a processor, a communication interface, a memory and a communication bus, wherein the image collector, the display, the processor, the communication interface and the memory communicate with one another through the communication bus;
the image collector is used for collecting image data;
the display is used for displaying the image identification data;
a memory for storing a computer program;
a processor for executing the computer program stored in the memory, the computer program when executed by the processor performing the steps of:
step 1, controlling an image collector to collect face images, acquiring a visible light face image set and an infrared face image set, dividing the visible light face image set into a visible light face training image set and a visible light face testing image set, and dividing the infrared face image set into an infrared face training image set and an infrared face testing image set.
And 2, respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
Specifically, in step 2 of this embodiment, the pre-processing is performed on the visible light face training image set, the visible light face testing image set, the infrared face training image set, and the infrared face testing image set respectively to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set, and an infrared face preprocessing testing image set, and the method includes:
respectively carrying out gray level conversion and normalization processing on the visible light face training image set and the visible light face testing image set to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set;
and respectively carrying out image enhancement and normalization processing on the infrared face training image set and the infrared face testing image set to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
And 3, constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain the trained hyperspectral image fusion network model.
Specifically, the hyperspectral image fusion network model constructed in step 3 of this embodiment includes a pre-fusion layer, an encoder module, a fusion layer, and a decoder module, wherein,
the input of the pre-fusion layer is connected with the input image, the output of the pre-fusion layer is connected with the input of the encoder module, the encoder module comprises a first convolution layer and a dense residual module which are sequentially connected, the output of the dense residual module and the output of the pre-fusion layer are subjected to global residual connection output, and the global residual connection output is connected with the input of the fusion layer;
the output of the fusion layer is connected with the input of the decoder module, the decoder module comprises second to fifth convolution layers and a feedback layer which are sequentially connected, and the output of the feedback layer and the output of the fusion layer are input into the second convolution layer again to form feedback connection.
Furthermore, the dense residual module comprises first to third dense residual connection layers, a multi-scale splicing layer and a fourth dense residual connection layer which are sequentially connected, wherein the input of the first dense residual connection layer is further connected with the output of the first dense residual connection layer, the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the second dense residual connection layer is further connected with the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the third dense residual connection layer is further connected with the output of the third dense residual connection layer; the output of the fourth dense residual connection layer and the input of the first convolution layer are subjected to local residual connection output; and the local residual connection output and the output of the pre-fusion layer are subjected to global residual connection output.
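The feedback connection of the decoder module described above can be illustrated with a toy data-flow sketch. It only shows the wiring, not the trained network: a fixed 3x3 mean filter with zero padding stands in for each learned convolution layer, the feedback layer is taken to be the identity, the feedback is combined with the fusion layer output by addition, and the number of feedback passes `steps` is hypothetical. All of these are assumptions for illustration.

```python
import numpy as np

def conv3x3_mean(x):
    """Stand-in for a learned convolution: 3x3 mean filter, zero padding."""
    h, w = x.shape
    padded = np.pad(x, 1, mode="constant", constant_values=0.0)
    out = np.zeros_like(x, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def decode_with_feedback(fused, num_convs=4, steps=2):
    """Run the convolution chain `steps` times with a feedback connection.

    fused     : output of the fusion layer (2-D array)
    num_convs : stands in for the second to fifth convolution layers
    steps     : hypothetical number of feedback passes
    """
    fused = np.asarray(fused, dtype=np.float64)
    feedback = np.zeros_like(fused)
    out = fused
    for _ in range(steps):
        # Assumption: feedback and fusion output are combined by addition
        # before re-entering the second convolution layer.
        out = fused + feedback
        for _ in range(num_convs):
            out = conv3x3_mean(out)
        # Assumption: the feedback layer passes the decoder output unchanged.
        feedback = out
    return out
```

In this reading, the first pass decodes the fused features alone, and each later pass refines the result by re-decoding the fused features together with the previous output.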
Further, training the hyperspectral image fusion network model according to the visible face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model, comprising:
constructing a composite loss function based on structural loss, pixel loss and average gradient loss;
and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set by utilizing a composite loss function based on structural loss, pixel loss and average gradient loss to obtain the trained hyperspectral image fusion network model.
And 4, inputting the visible light face test image set and the infrared face test image set into the trained hyper-spectral image fusion network model to obtain a hyper-spectral face image set, and dividing the hyper-spectral face image set into a hyper-spectral face training image set and a hyper-spectral face test image set.
And 5, constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
Specifically, the convolutional neural network face recognition model constructed in step 5 of this embodiment includes a ResNet feature extraction module, a feature normalization module, and a feature space mapping module, which are connected in sequence.
Further, training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model, comprising:
constructing a triplet loss function (Triplet loss);
and training the convolutional neural network face recognition model with the triplet loss function according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
And 6, inputting the hyper-spectral face test image set into a trained convolutional neural network face recognition model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face recognition. And finally, outputting the hyper-spectrum face recognition result to a display.
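The classification in step 6 can be sketched with an off-the-shelf support vector machine. The example below assumes scikit-learn is available and that labelled face features are used to fit the classifier before the test features are classified; the linear kernel and the function name `classify_features` are assumptions, since the present application does not specify the kernel or the implementation.

```python
import numpy as np
from sklearn.svm import SVC

def classify_features(train_x, train_y, test_x, kernel="linear"):
    """Fit an SVM on labelled face features and predict identities.

    Assumption: a linear kernel; the source does not name one.
    """
    clf = SVC(kernel=kernel)
    clf.fit(np.asarray(train_x), np.asarray(train_y))
    return clf.predict(np.asarray(test_x))
```

Each row of `train_x` and `test_x` would be one face feature vector produced by the trained convolutional neural network face recognition model, and the predicted labels are what the device would send to the display.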
The hyper-spectral face recognition electronic device provided by this embodiment may implement the hyper-spectral face recognition method embodiment and the hyper-spectral face recognition apparatus embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
Example four
On the basis of the third embodiment, please refer to fig. 9, and fig. 9 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention. The present embodiment provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps:
step 1, a visible light face image set and an infrared face image set are obtained, the visible light face image set is divided into a visible light face training image set and a visible light face testing image set, and the infrared face image set is divided into an infrared face training image set and an infrared face testing image set.
And 2, respectively preprocessing the visible light face training image set, the visible light face testing image set, the infrared face training image set and the infrared face testing image set to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
Specifically, in step 2 of this embodiment, the pre-processing is performed on the visible light face training image set, the visible light face testing image set, the infrared face training image set, and the infrared face testing image set respectively to obtain a visible light face preprocessing training image set, a visible light face preprocessing testing image set, an infrared face preprocessing training image set, and an infrared face preprocessing testing image set, and the method includes:
respectively carrying out gray level conversion and normalization processing on the visible light face training image set and the visible light face testing image set to obtain a visible light face preprocessing training image set and a visible light face preprocessing testing image set;
and respectively carrying out image enhancement and normalization processing on the infrared face training image set and the infrared face testing image set to obtain an infrared face preprocessing training image set and an infrared face preprocessing testing image set.
And 3, constructing a hyperspectral image fusion network model, and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set to obtain the trained hyperspectral image fusion network model.
Specifically, the hyperspectral image fusion network model constructed in step 3 of this embodiment includes a pre-fusion layer, an encoder module, a fusion layer, and a decoder module, wherein,
the input of the pre-fusion layer is connected with the input image, the output of the pre-fusion layer is connected with the input of the encoder module, the encoder module comprises a first convolution layer and a dense residual module which are sequentially connected, the output of the dense residual module and the output of the pre-fusion layer are subjected to global residual connection output, and the global residual connection output is connected with the input of the fusion layer;
the output of the fusion layer is connected with the input of the decoder module, the decoder module comprises second to fifth convolution layers and a feedback layer which are sequentially connected, and the output of the feedback layer and the output of the fusion layer are input into the second convolution layer again to form feedback connection.
The dense residual module comprises first to third dense residual connection layers, a multi-scale splicing layer and a fourth dense residual connection layer which are sequentially connected, wherein the input of the first dense residual connection layer is further connected with the output of the first dense residual connection layer, the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the second dense residual connection layer is further connected with the output of the second dense residual connection layer and the output of the third dense residual connection layer; the input of the third dense residual connection layer is further connected with the output of the third dense residual connection layer; the output of the fourth dense residual connection layer and the input of the first convolution layer are subjected to local residual connection output; and the local residual connection output and the output of the pre-fusion layer are subjected to global residual connection output.
Further, training the hyperspectral image fusion network model according to the visible face preprocessing training image set and the infrared face preprocessing training image set to obtain a trained hyperspectral image fusion network model, comprising:
constructing a composite loss function based on structural loss, pixel loss and average gradient loss;
and training the hyperspectral image fusion network model according to the visible light face preprocessing training image set and the infrared face preprocessing training image set by utilizing a composite loss function based on structural loss, pixel loss and average gradient loss to obtain the trained hyperspectral image fusion network model.
And 4, inputting the visible light face test image set and the infrared face test image set into the trained hyperspectral image fusion network model to obtain a hyperspectral face image set, and dividing the hyperspectral face image set into a hyperspectral face training image set and a hyperspectral face test image set.
And 5, constructing a convolutional neural network face recognition model, and training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
Specifically, the convolutional neural network face recognition model constructed in step 5 of this embodiment includes a ResNet feature extraction module, a feature normalization module, and a feature space mapping module, which are connected in sequence.
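The last two modules of this model can be sketched on top of already extracted features. The example assumes that the feature normalization module performs L2 normalization and that the feature space mapping module is a linear projection followed by re-normalization; both are assumptions made for illustration, and the ResNet feature extraction itself is not reproduced here. The names `l2_normalize`, `map_to_embedding` and `weight` are hypothetical.

```python
import numpy as np

def l2_normalize(features, eps=1e-12):
    """Scale each feature row to unit Euclidean length (assumed normalization)."""
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # eps guards against division by zero for all-zero rows.
    return features / np.maximum(norms, eps)

def map_to_embedding(features, weight):
    """Project normalized features into the embedding space.

    Assumption: the mapping is a linear projection `weight` (shape
    in_dim x out_dim) followed by another L2 normalization.
    """
    return l2_normalize(l2_normalize(features) @ np.asarray(weight, dtype=np.float64))
```

Keeping the embeddings on the unit sphere means that the distances used by the subsequent loss function and classifier depend only on direction, not on the magnitude of the raw ResNet features.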
Further, training the convolutional neural network face recognition model according to the hyper-spectral face training image set to obtain a trained convolutional neural network face recognition model, comprising:
constructing a triplet loss function (Triplet loss);
and training the convolutional neural network face recognition model with the triplet loss function according to the hyper-spectral face training image set to obtain the trained convolutional neural network face recognition model.
And 6, inputting the hyper-spectral face test image set into a trained convolutional neural network face recognition model to obtain a face feature set, and classifying the face feature set by using a support vector machine classifier to realize hyper-spectral face recognition.
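Steps 1 and 4 above both divide an image set into a training image set and a testing image set. A minimal sketch of such a division is given below, assuming that the visible light and infrared images are stored as paired lists in the same order and that the pairs must stay aligned after the split; the 20% test ratio, the fixed seed and the function name `split_paired` are hypothetical.

```python
import random

def split_paired(visible, infrared, test_ratio=0.2, seed=0):
    """Split paired image lists into training and testing subsets.

    Assumption: visible[i] and infrared[i] show the same subject, so the
    same shuffled indices are applied to both lists.
    """
    if len(visible) != len(infrared):
        raise ValueError("visible and infrared sets must have the same length")
    indices = list(range(len(visible)))
    random.Random(seed).shuffle(indices)   # reproducible shuffle
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, ids):
        return [seq[i] for i in ids]

    return (pick(visible, train_idx), pick(visible, test_idx),
            pick(infrared, train_idx), pick(infrared, test_idx))
```

For step 4, the same helper could be called with the hyper-spectral face image set passed for both arguments, keeping only the first two returned lists.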
The computer-readable storage medium provided in this embodiment may implement the above-mentioned hyper-spectrum face recognition method embodiment, the above-mentioned hyper-spectrum face recognition apparatus embodiment, and the above-mentioned hyper-spectrum face recognition electronic device embodiment, and the implementation principle and technical effect thereof are similar, and details are not repeated herein.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.