CN115690856B - Large thenar palmprint identification method based on feature fusion - Google Patents


Info

Publication number
CN115690856B
Authority
CN
China
Prior art keywords
resnet152
thenar
vgg
palmprint
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310009813.6A
Other languages
Chinese (zh)
Other versions
CN115690856A (en)
Inventor
杨翠云
徐英豪
侯钧译
曹怡亮
吕玉超
朱习军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University of Science and Technology
Original Assignee
Qingdao University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University of Science and Technology filed Critical Qingdao University of Science and Technology
Priority to CN202310009813.6A
Publication of CN115690856A
Application granted
Publication of CN115690856B
Legal status: Active


Abstract

The invention provides a method for identifying the large thenar palmprint based on feature fusion, belonging to the technical field of artificial intelligence. First, a palmprint image of the thenar region is obtained by a key-point localization method, and data augmentation is applied to the extracted thenar palmprint. ResNet152 and VGG-19 are then used as the main feature-extraction networks, and an AdaBoost algorithm is introduced to integrate the two. An LBP operator is used alongside the ResNet152 network to extract thenar texture features, which are fused by weighted summation with the high-level semantic features extracted by ResNet152; an RGA attention module is introduced into the VGG-19 network so that VGG-19 better extracts global features of the large thenar palmprint. The final integrated model achieves higher accuracy than either of the two original network models.

Description

Large thenar palmprint identification method based on feature fusion
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a method for identifying the large thenar palmprint based on feature fusion.
Background
Recognition of the thenar palmprint is of great importance to modern medicine. Four methods for thenar palmprint recognition used domestically and internationally in recent years are reviewed below.
In 2016, Ajay Kumar proposed applying a convolutional neural network (CNN) to palmprint recognition, establishing key points along the palmprint creases to produce more accurate matches. This was the first application of CNNs to palmprint matching, and it mainly investigated whether left and right palmprint images can be matched to determine an individual's identity. The network consists of three convolutional units followed by two fully connected layers: the input data first enters the convolutional layers, the three convolutional units produce three outputs that are activated by the corresponding nonlinear activation functions, and the fully connected layers then process the outputs of the three convolutional units, connecting all outputs of the current layer. The network learns an optimal parameter set automatically through forward and backward propagation during training. The method was evaluated on the public PolyU dataset, and the results showed clearly improved performance over the BLPOC method that relies on reference points.
In 2018, Ramachandra et al. proposed a new method based on transfer learning that improves the accuracy of large thenar palmprint recognition by fine-tuning a pre-trained AlexNet architecture; it is applicable to small amounts of data and performs well across various biometric applications. Ramachandra fine-tuned the architecture with controllable learning-rate parameters, mainly increasing the learning rate of the last few layers so that they adapt more quickly, and then scored the extracted palmprint representation using the sum of the scores of an SVM (support vector machine) and a softmax classifier, improving recognition precision. Experimental results obtained on a contact palmprint database of newborn babies also confirm the usefulness of this method.
In 2020, Yang Bing et al. proposed a three-dimensional palmprint feature extraction algorithm that performs recognition by describing local features of the 3D palmprint as input. The description mainly uses curvature features, the shape index, and the surface type, and the described features serve as input to a subsequent deep learning method, mainly a convolutional neural network. Unlike Ajay Kumar's work, Yang et al. used more convolutional layers and ran comparison experiments across different convolutional models, including AlexNet, GoogLeNet, VGG16, and ResNet50. AlexNet comprises 8 layers, the first 5 convolutional and the last 3 fully connected; GoogLeNet comprises 22 layers, adopts the Inception module structure, and adds 2 auxiliary softmax branches for forward gradient conduction, introducing 2 extra losses to keep gradients from vanishing during backpropagation; VGG16 consists of 5 pooling layers and 13 convolutional layers, with a ReLU activation function in every hidden layer; ResNet50 differs most from conventional network structures in that it passes the input to the output as an initial result via shortcut connections. Finally, comparison on the 3D palmprint database of The Hong Kong Polytechnic University showed that all four models recognize effectively and outperform other methods, demonstrating that deep learning is very effective for palmprint recognition.
In 2021, Wu Biqiao et al. proposed a palmprint image recognition algorithm based on transfer learning. The method uses VGG16 as the base network and transfers knowledge learned in a source domain to the target domain, addressing the recognition problem when labeled sample data in the target domain are scarce. Wu et al. made several changes to the original VGG16 model: the flatten operation was replaced with global max pooling, which effectively reduces the number of parameters and helps prevent overfitting, and the softmax activation of the 3rd fully connected layer was replaced with a ReLU activation, followed by a fully connected layer of 160 neurons representing 160 classes. With weights pre-trained on the ImageNet dataset, the accuracy of high-resolution palmprint recognition reaches 96.56%, and the VGG16 network with transfer learning surpasses a randomly initialized model in accuracy, convergence speed, and stability.
However, although the above methods achieve high accuracy, they remain largely theoretical and suffer from problems such as difficult convergence during actual training and unstable model results, which prevents their adoption in practical applications.
Disclosure of Invention
The invention provides a method for identifying a big thenar palmprint based on feature fusion, which is used for overcoming the defects of the prior art.
The method preprocesses the data, extracts high-level semantic features and local information from the processed images, fuses the extracted features, and feeds the fused information into a softmax classifier for the final decision.
In order to realize the purpose of the invention, the invention adopts the following technical scheme to realize:
a method for identifying a big thenar palmprint based on feature fusion comprises the following steps:
s1: collecting palmprint data, preprocessing it, and extracting the thenar region using a method based on key-point localization; dividing the data into a training set and a testing set;
s2: realizing a model on the basis of ResNet152, introducing the SoftPool soft-pooling operation into the model to extract high-level semantic features, and extracting local texture features with the uniform-pattern LBP operator; training the improved ResNet152 model with the training set to obtain its classification results;
s3: introducing an AdaBoost algorithm, adjusting the weights of the training-set data according to the ResNet152 classification result, and retraining VGG-19 with the weighted training set; RGA is introduced into VGG-19 to increase the model's ability to extract global features; the AdaBoost algorithm integrates the two network models ResNet152 and VGG-19;
s4: the test set data are processed through the trained ResNet152 and VGG-19 models to obtain their respective prediction results;
s5: based on the AdaBoost algorithm, the prediction results of ResNet152 and VGG-19 are weighted and summed to obtain the final classification result.
Further, in S1:
s1-1: image segmentation is realized more accurately by combining the Otsu threshold segmentation algorithm on the Cr component of the YCrCb color space:
(1) Extracting the hand contour of the collected data under the Cr component in the YCrCb color space;
(2) Combined with the Otsu threshold segmentation algorithm, the segmentation threshold between the background and the target region on the Cr component is obtained by traversal, improving the image segmentation precision;
s1-2: locating the inter-finger valley points with a convex hull and convexity-defect search algorithm to obtain the palmprint image of the thenar region:
(1) Extracting the outline of the hand area under the Cr component by using a convex hull and convex defect search algorithm;
(2) Positioning valley points between fingers according to convex hulls and a convex defect detection algorithm;
s1-3: extracting the large thenar palmprint after angle calibration of the tilted data:
(1) Obtaining a rotation angle formula according to the Pythagorean theorem, and judging left hand versus right hand after angle-calibrating the tilted data;
(2) Marking and cropping the target region according to the thenar position characteristics of the left and right hands;
(3) Scaling the labeling box in equal proportion according to the ratio of the inter-finger valley point distance to the reference value.
Further, S2 is specifically as follows:
s2-1: introducing a soft pooling (SoftPool) method after the ResNet152 convolutional layers, assigning weight information to the feature maps obtained by convolution and preserving more high-level semantic information;
ResNet152 uses max pooling, which loses part of the feature information while compressing the feature map size, so SoftPool is introduced after the convolutional layers to weight the convolved feature maps and retain more feature information in the down-sampled activation maps.
S2-2: reducing the dimension of the Pattern type of the LBP operator by adopting an equivalent Pattern (Uniform Pattern), solving the problem of excessive binary patterns and simultaneously extracting local features;
an equivalence schema (Uniform Pattern) is employed to dimension the schema classes of the LBP operator. When a cyclic binary number corresponding to a certain LBP has at most two transitions from 0 to 1 or from 1 to 0, the binary number corresponding to the LBP is called an equivalent pattern class; modes other than the equivalent mode class are referred to as mixed mode classes. This reduces the amount of computation and does not lose feature detail.
S2-3: setting weighting coefficients
Figure SMS_1
Carrying out weighted feature fusion on the high-level semantic feature information extracted by the ResNet152 model and the bottom-layer texture detail feature extracted by the equivalent pattern LBP operator:
Figure SMS_2
wherein, the first and the second end of the pipe are connected with each other,
Figure SMS_3
in order to be a feature after the fusion,
Figure SMS_4
advanced features extracted for the ResNet152 network,
Figure SMS_5
the texture features extracted for the circular LBP operator,
Figure SMS_6
is a weighting coefficient;
inputting the fused features into a Soft Max classifier for classification:
performing secondary classification on the generated feature vectors through a Soft max classifier;
then, the proportion of different samples is adjusted according to the classification result of the Soft Max classifier, the weight of the sample with the wrong ResNet152 network classification is improved, and the weight of the sample with the correct ResNet152 network classification is reduced;
s2-4: training the improved ResNet152 model with the training set to obtain its classification results.
Further, in step S3, an RGA module is introduced at the bottom layers of VGG-19 to extract global features of the large thenar palmprint; semantic features of the input image are then extracted by the higher convolutional layers, and the extracted high-level semantic features are input into a softmax classifier to classify the large thenar palmprint.
Further, in S5, the large thenar palmprint image is input into ResNet152 and VGG-19 respectively; each softmax classifier yields a classification result, the results are weighted according to each network's accuracy, and the weighted sum gives the final classification result.
The method can be applied to the identification of the thenar palmprint.
The invention has the advantages and technical effects that:
according to the method, a key point positioning-based method is adopted to extract the big thenar area of the collected palmprint data, then a deep learning model is adopted to realize category prediction on the preprocessed big thenar palmprint data set, the model is realized on the basis of ResNet152 and VGG-19, an adaboost algorithm is utilized to integrate two classification models to improve the model identification accuracy, the improved model is utilized to identify the big thenar palmprint data, and the performance of the model is evaluated. Through practical verification, the precision of the big thenar palmprint data extracted by the method is high.
According to the method, two different classification models are integrated, the stability of a model training result can be improved, an LBP operator and an RGA are respectively introduced into a ResNet152 network and a VGG-19 network, the extraction of global information of local information is respectively emphasized, the accuracy of thenar palmprint classification is improved by the two networks from different angles, and the integration of the two networks can complement respective advantages, so that higher accuracy is obtained.
Drawings
FIG. 1 is a detailed flow chart of the present invention;
FIG. 2 is an overall model diagram of an embodiment of the present invention;
FIG. 3 is a geometric model diagram of an embodiment of the present invention;
FIG. 4 is a large thenar palmprint extraction diagram of one embodiment of the present invention: (a) hand contour extraction result; (b) target region segmentation effect; (c) inter-finger valley point localization; (d) thenar region cropping effect.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples.
Example 1:
the embodiment provides a method for identifying a big thenar palmprint based on feature fusion, wherein the specific flow of an algorithm is shown in fig. 1, and the block diagram of an algorithm module is shown in fig. 2.
This embodiment provides a large thenar palmprint recognition algorithm based on feature fusion, taking a labeled, clinically confirmed palmprint dataset provided by a hospital in Qingdao as an example. As shown in fig. 1, the embodiment includes the following steps:
s1: preprocessing the collected palmprint data and extracting the thenar region using a method based on key-point localization;
s2: realizing a model on the basis of ResNet152, as shown in FIG. 3, introducing the SoftPool soft-pooling operation into the model to extract high-level semantic features, and extracting local texture features with the uniform-pattern LBP operator;
s3: introducing an AdaBoost algorithm to integrate the two networks ResNet152 and VGG-19, adjusting the weights of the training-set data appropriately according to the ResNet152 result, and further training VGG-19 with the weighted dataset;
s4: introducing RGA into VGG-19 to increase the model's ability to extract global features;
s5: weighting and summing the prediction results of ResNet152 and VGG-19 to obtain the final classification result.
The S1 specifically comprises the following steps:
s1-1: image segmentation is realized more accurately by combining the Otsu threshold segmentation algorithm on the Cr component of the YCrCb color space:
(1) The hand contour of the collected data is extracted on the Cr component of the YCrCb color space; the extraction result is shown in fig. 4(a);
(2) Combined with the Otsu threshold segmentation algorithm, the segmentation threshold between the background and the target region on the Cr component is obtained by traversal, improving the image segmentation precision; the segmentation effect is shown in fig. 4(b);
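As an illustration of this segmentation step, the following is a minimal sketch assuming OpenCV as the implementation library; the function name, file name, and the blur step are illustrative assumptions rather than details from the patent.

import cv2

def segment_hand(bgr_image):
    # Convert to YCrCb and keep the Cr channel (index 1 in OpenCV),
    # where skin tones separate well from typical backgrounds.
    cr = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)[:, :, 1]
    # Light smoothing stabilizes the histogram before thresholding.
    cr = cv2.GaussianBlur(cr, (5, 5), 0)
    # Otsu's method picks the threshold that best separates
    # background from the hand region on the Cr histogram.
    _, mask = cv2.threshold(cr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask

mask = segment_hand(cv2.imread("palm.jpg"))  # "palm.jpg" is a placeholder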
s1-2: the inter-finger valley points are located with a convex hull and convexity-defect search algorithm to obtain the palmprint image of the thenar region:
(1) The contour of the hand region is extracted on the Cr component using the convex hull and convexity-defect search algorithm:
The convex hull and convexity-defect search locate the valley points between the fingers, and the thenar palmprint image is obtained based on these valley points.
(2) The inter-finger valley points are located by convex hull and convexity-defect detection:
The hand contour is extracted on the Cr component, the convex hull is detected along the outer boundary, and the convexity defects yield the four inter-finger valley points, denoted here P1, P2, P3 and P4. The localization result is shown in fig. 4(c).
s1-3: the large thenar palmprint is extracted after angle calibration of the tilted data:
(1) A rotation angle formula is obtained according to the Pythagorean theorem, and left/right hand judgment is performed after angle-calibrating the tilted data:
Because the left and right hands differ in orientation, the thenar region lies in different positions, so the hand must be judged left or right before thenar extraction, which in turn requires angle calibration of the tilted data. After valley point detection, the coordinates of two valley points, denoted here P1 = (x1, y1) and P2 = (x2, y2), are known, and the rotation angle follows as θ = arctan((y2 − y1)/(x2 − x1)).
(2) The target region is marked and cropped according to the thenar position characteristics of the left and right hands:
After angle calibration, the valley points P1–P4 take new coordinates. The point whose vertical coordinate is smallest is the valley between the thumb and the index finger; when this point lies to the left of the other three points the hand is a left hand, otherwise it is a right hand. The target region is marked and cropped according to the thenar position of the respective hand; the effect is shown in fig. 4(d).
(3) The labeling box is scaled in equal proportion according to the ratio of the inter-finger valley point distance to the reference value:
Considering that the distance between the palm and the acquisition device varies and that different people's palms differ in size, the distance between two valley points (P1 and P2 here) is taken as the reference when labeling the thenar region. For each palm, the labeling box is scaled in equal proportion by the ratio of the measured inter-valley distance to the reference value, which improves the generalization ability and robustness of the ROI extraction method.
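The following sketch ties the three sub-steps together: rotation by the arctangent angle, handedness judgment from the thumb-index valley, and proportional scaling of the crop box. The reference distance, box size, and cropping offsets are illustrative assumptions, not values from the patent.

import cv2
import numpy as np

def extract_thenar_roi(image, valleys, ref_dist=100.0, base_size=128):
    (x1, y1), (x2, y2) = valleys[0], valleys[1]
    # Rotation angle of the line through two valley points.
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    upright = cv2.warpAffine(image, M, (w, h))
    # Apply the same transform to the valley points themselves.
    pts = cv2.transform(np.array([valleys], dtype=np.float32), M)[0]
    # Thumb-index valley: smallest vertical coordinate; the hand is a
    # left hand if that point lies left of the remaining points.
    thumb = pts[np.argmin(pts[:, 1])]
    is_left = thumb[0] < np.median(pts[:, 0])
    # Scale the crop box by measured / reference inter-valley distance.
    size = int(base_size * np.hypot(x2 - x1, y2 - y1) / ref_dist)
    cx = int(thumb[0]) - (size if is_left else 0)  # heuristic offset
    cy = int(thumb[1])
    cx, cy = max(cx, 0), max(cy, 0)
    return upright[cy:cy + size, cx:cx + size], is_left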
The S2 specifically comprises the following steps:
s2-1: a soft pooling (SoftPool) method is introduced after the ResNet152 convolutional layers, assigning weight information to the feature maps obtained by convolution and preserving more high-level semantic information:
Residual networks solve the degradation problem caused by an excessive number of hidden layers, and among them the ResNet152 model excels on classification problems, having once brought the top-5 error rate down to 3.57%; this method therefore selects the 152-layer residual network as the base model for large thenar palmprint classification. ResNet152 uses max pooling, which loses part of the feature information while compressing the feature map size, so SoftPool is introduced after the convolutional layers to weight the convolved feature maps and retain more feature information in the down-sampled activation maps.
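SoftPool can be written compactly with two average-pooling calls: each activation is weighted by its own exponential within the window, so larger responses dominate without discarding the rest outright. A minimal PyTorch sketch (the kernel size and the test shape are arbitrary assumptions):

import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=2):
    # Window-wise sum(exp(x) * x) / sum(exp(x)): avg_pool of the
    # weighted values divided by avg_pool of the weights.
    e = torch.exp(x)
    return (F.avg_pool2d(x * e, kernel_size, stride)
            / F.avg_pool2d(e, kernel_size, stride))

feat = torch.randn(1, 64, 56, 56)
print(soft_pool2d(feat).shape)  # torch.Size([1, 64, 28, 28])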
S2-2: and (3) reducing the dimension of the Pattern type of the LBP operator by adopting an equivalent Pattern (Uniform Pattern), solving the problem of excessive binary patterns and simultaneously extracting local features:
the LBP operator is a texture description operator for extracting local features as a judgment basis and is used for extracting local texture information of the image. The basic idea of LBP is to define it in 8 neighborhoods of pixels (3 × 3 windows), use the gray value of the central pixel as the threshold, and mark it according to the principle of 1 big or 0 small, to obtain the 8-bit binary number of the central point, convert it into decimal LBP value which is the central pixel point, and use this value to reflect the texture information of this area.
In order to solve the problem of excessive binary patterns, the invention adopts an equivalent Pattern (Uniform Pattern) to reduce the dimension of the Pattern type of the LBP operator. When a cyclic binary number corresponding to a certain LBP has at most two transitions from 0 to 1 or from 1 to 0, the binary number corresponding to the LBP is called an equivalent pattern class; modes other than the equivalent mode class are referred to as mixed mode classes. This reduces the amount of computation and does not lose feature detail.
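With 8 neighbors the uniform mapping shrinks 256 possible patterns to 10 bins (8 + 2). A minimal sketch using scikit-image's built-in uniform LBP; the radius and the histogram normalization are assumptions of this sketch:

import numpy as np
from skimage.feature import local_binary_pattern

def uniform_lbp_histogram(gray, n_points=8, radius=1):
    # method="uniform" gives each uniform pattern its own label and
    # pools all mixed patterns into one, i.e. n_points + 2 labels.
    lbp = local_binary_pattern(gray, n_points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=n_points + 2,
                           range=(0, n_points + 2), density=True)
    return hist  # a compact local-texture descriptor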
S3-1: setting weighting coefficients
Figure SMS_22
Carrying out weighted fusion on the high-level semantic feature information extracted by the ResNet152 model and the bottom-layer texture detail feature extracted by the equivalent pattern LBP operator:
Figure SMS_23
wherein the content of the first and second substances,
Figure SMS_24
in order to be a feature after the fusion,
Figure SMS_25
advanced features extracted for the ResNet152 network,
Figure SMS_26
the texture features extracted for the circular LBP operator,
Figure SMS_27
are weighting coefficients.
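A one-line realization of the fusion formula, with an L2 normalization added so the two feature scales are comparable; the normalization, the common feature length, and α = 0.7 are assumptions of this sketch, not values stated in the patent.

import torch.nn.functional as F

def fuse_features(f_r, f_l, alpha=0.7):
    # f_w = alpha * f_R + (1 - alpha) * f_L; both descriptors are
    # assumed already projected to the same length.
    f_r = F.normalize(f_r, dim=-1)
    f_l = F.normalize(f_l, dim=-1)
    return alpha * f_r + (1 - alpha) * f_l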
S3-2: inputting the fused features into a Soft Max classifier for classification:
and performing weighted fusion on the high-level semantic features and the bottom-level texture features, and performing secondary classification on the generated feature vectors through a Soft max classifier.
S3-3: and adjusting the proportion of different samples according to the classification result of the Soft Max classifier, and using the adjusted training set for training the VGG-19.
And (3) readjusting the weight of each picture in the training set according to the classification result of the ResNet152 network model, improving the weight of the sample with the wrong ResNet152 network classification, reducing the weight occupied by the sample with the correct ResNet152 network classification, and training the VGG-19 network by using the adjusted data set. And finally, weighted summation is carried out on the results obtained by ResNet152 and VGG-19.
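A sketch of the reweighting scheme in the style of a standard AdaBoost update: misclassified samples gain weight and correctly classified ones lose it. The exact update rule used by the patent is not given, so the exponential form below is an assumption.

import numpy as np

def reweight_samples(weights, correct, eps=1e-10):
    # Weighted error of the first classifier (ResNet152 here).
    err = weights[~correct].sum() / weights.sum()
    beta = 0.5 * np.log((1.0 - err + eps) / (err + eps))
    # Increase weight on mistakes, decrease on hits, then renormalize.
    weights = weights * np.exp(np.where(correct, -beta, beta))
    return weights / weights.sum()

w = np.full(1000, 1 / 1000)              # uniform initial weights
correct = np.random.rand(1000) > 0.2     # stand-in for ResNet152 hits
w = reweight_samples(w, correct)         # drives VGG-19's training set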
s4: an RGA module is introduced into VGG-19 to increase the model's ability to extract global features;
s4-1: RGA is introduced at the lower layers of VGG-19 to extract global features of the input image, and the extracted features are fed into the upper layers of VGG-19 to obtain the classification result for the large thenar palmprint:
The adjusted thenar palmprint dataset is input into VGG-19, global features of the thenar palmprint are extracted by the RGA module at the bottom layers, and the classification result is then obtained through the higher convolutional layers.
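A much-simplified sketch of a spatial relation-aware global attention block: pairwise affinities between all spatial positions are summarized into an attention map that rescales the input. The published RGA module has additional embedding and channel branches, so treat this as a structural illustration only.

import torch
import torch.nn as nn

class SpatialRGA(nn.Module):
    def __init__(self, channels, inter=32):
        super().__init__()
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(2, 1, 1), nn.BatchNorm2d(1), nn.Sigmoid())

    def forward(self, x):
        b, c, h, w = x.shape
        t = self.theta(x).flatten(2)            # (b, inter, h*w)
        p = self.phi(x).flatten(2)              # (b, inter, h*w)
        rel = torch.bmm(t.transpose(1, 2), p)   # pairwise relations
        # Summaries of incoming/outgoing relations per position.
        r_in = rel.mean(dim=1).view(b, 1, h, w)
        r_out = rel.mean(dim=2).view(b, 1, h, w)
        attn = self.gate(torch.cat([r_in, r_out], dim=1))
        return x * attn                          # globally re-weighted

x = torch.randn(2, 64, 28, 28)
print(SpatialRGA(64)(x).shape)  # torch.Size([2, 64, 28, 28])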
s5: the prediction results of ResNet152 and VGG-19 are combined by weighted summation to obtain the final classification result;
s5-1: the large thenar palmprint images are input into ResNet152 and VGG-19 respectively, and the classification results of the two models are weighted by their respective accuracies and summed to give the final classification result:
The images are input into ResNet152 and VGG-19 respectively, each softmax classifier yields a classification result, the results are weighted according to each network's accuracy, and the weighted sum is the final classification result.
A flow chart of the feature-fusion-based large thenar palmprint recognition algorithm is shown in fig. 1; the pseudocode is as follows:
Input: large thenar palmprint data in the YCrCb color space.
1. Segment the image to obtain the whole palm region X.
2. Locate the valley points of X, obtaining the inter-finger valley points P1, P2, P3, P4.
3. Angle-calibrate the coordinates of P1–P4 using the rotation angle formula θ = arctan((y2 − y1)/(x2 − x1)).
4. Mark and crop the target region at the positions of the four points.
5. Scale the labeling box in equal proportion according to the distance between the two reference valley points.
6. Input the processed image K into the ResNet-152 network to obtain feature map Z, and process K with the LBP operator to obtain feature map F.
7. Fuse Z and F to obtain feature map Y.
8. Input Y into the softmax classifier to perform the explicit/implicit classification J.
9. Adjust the weights of K according to J, and input the adjusted K' into the VGG-19 network for training to obtain classification result M.
10. Weight and sum M and J to obtain the final classification result N.
Output: binary result (explicit or implicit).
The method extracts the thenar region of the collected palmprint data using key-point localization, then applies a deep learning model to predict categories on the preprocessed thenar palmprint dataset; the model is built on ResNet152 and VGG-19, soft pooling and a feature fusion mechanism are introduced to improve recognition accuracy, the improved model is used to recognize the thenar palmprint data, and the model's performance is evaluated.
TABLE 1 comparison of training results for different models
[Table 1 is reproduced as an image in the original publication; its contents are not available as text.]
As Table 1 shows, the proposed model identifies the large thenar palmprint with higher accuracy than the other models, and the integrated model is better suited to large thenar palmprint recognition research. The VGG19 and ResNet50 models beat the integrated classification model on training time, but their accuracy is lower. The proposed method excels in classification accuracy, improving on the other models by at least 11.95%; however, integrating two networks and introducing the LBP operator and RGA attention increases the number of training parameters, which incurs some training-time cost, and detection time rises moderately in exchange for the performance gain.
The above-mentioned embodiments are only preferred embodiments of the present invention, and the scope of the present invention is not limited thereby, and all changes made in the shape and principle of the present invention should be covered within the scope of the present invention.

Claims (3)

1. A large thenar palmprint recognition method based on feature fusion, characterized by comprising the following steps:
s1: collecting palmprint data, preprocessing it, and extracting the thenar region using a method based on key-point localization; dividing the data into a training set and a testing set;
s2: realizing a model on the basis of ResNet152, introducing the SoftPool soft-pooling operation into the model to extract high-level semantic features, and extracting local texture features with the uniform-pattern LBP operator; training the improved ResNet152 model with the training set to obtain its classification results;
s3: introducing an AdaBoost algorithm, adjusting the weight of data in a training set according to the classification result of ResNet152, and retraining VGG-19 by using the weighted training data set; introducing RGA into the VGG-19; the AdaBoost algorithm integrates two network models of ResNet152 and VGG-19;
s4: processing the test set data through the trained ResNet152 and VGG-19 models to obtain their respective prediction results;
s5: based on the AdaBoost algorithm, weighting and summing the prediction results of ResNet152 and VGG-19 to obtain the final classification result;
the S2 specifically comprises the following steps:
s2-1: introducing a soft pooling method after the ResNet152 convolutional layers, assigning weight information to the feature maps obtained by convolution and preserving more high-level semantic information;
s2-2: adopting the uniform pattern to reduce the dimensionality of the LBP operator's pattern classes, solving the problem of too many binary patterns while extracting local features;
the S3 specifically comprises the following steps:
s3-1: setting a weighting coefficient α, and fusing by weighted summation the high-level semantic features extracted by the ResNet152 model with the low-level texture detail features extracted by the uniform-pattern LBP operator:
f_w = α·f_R + (1 − α)·f_L
wherein f_w is the fused feature, f_R is the high-level feature extracted by the ResNet152 network, f_L is the texture feature extracted by the circular LBP operator, and α is the weighting coefficient;
s3-2: inputting the fused features into a softmax classifier for classification:
performing weighted fusion of the high-level semantic features and the low-level texture features, and performing binary classification on the generated feature vectors with the softmax classifier;
s3-3: adjusting the proportions of the samples according to the softmax classification result, and using the adjusted training set to train VGG-19: RGA is introduced at the lower layers of VGG-19 to extract global features of the input image, and the extracted features are input into the upper layers of VGG-19 to obtain the classification result for the large thenar palmprint:
inputting the adjusted thenar palmprint dataset into VGG-19, extracting global features of the thenar palmprint through the RGA module at the bottom layers, and then obtaining the classification result through the higher convolutional layers;
s3-4: readjusting the weight of each image in the training set according to the classification result of the ResNet152 model, increasing the weights of samples misclassified by ResNet152, decreasing the weights of correctly classified samples, and training the VGG-19 network with the adjusted dataset.
2. The large thenar palmprint identification method as claimed in claim 1, wherein in S1:
s1-1: image segmentation is realized more accurately by combining the Otsu threshold segmentation algorithm on the Cr component of the YCrCb color space:
(1) Extracting the hand contour of the collected data under the Cr component in the YCrCb color space;
(2) Combined with the Otsu threshold segmentation algorithm, the segmentation threshold between the background and the target region on the Cr component is obtained by traversal, improving the image segmentation precision;
s1-2: locating the inter-finger valley points with a convex hull and convexity-defect search algorithm to obtain the palmprint image of the thenar region:
(1) Extracting the outline of the hand region under the Cr component by using a convex hull and convex defect search algorithm;
(2) Positioning valley points between fingers according to convex hulls and a convex defect detection algorithm;
s1-3: extracting the large thenar palmprint after angle calibration of the tilted data:
(1) Obtaining a rotation angle formula according to the Pythagorean theorem, and judging left hand versus right hand after angle-calibrating the tilted data;
(2) Marking and cropping the target region according to the thenar position characteristics of the left and right hands;
(3) Scaling the labeling box in equal proportion according to the ratio of the inter-finger valley point distance to the reference value.
3. The large thenar palmprint identification method as claimed in claim 1, wherein in S5, the large thenar palmprint image is input into ResNet152 and VGG-19 respectively, corresponding classification results are obtained through the softmax classifiers, the results are weighted according to the accuracy of each network, and the final classification result is obtained by weighted summation.
CN202310009813.6A 2023-01-05 2023-01-05 Large thenar palmprint identification method based on feature fusion Active CN115690856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310009813.6A CN115690856B (en) 2023-01-05 2023-01-05 Large thenar palmprint identification method based on feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310009813.6A CN115690856B (en) 2023-01-05 2023-01-05 Large thenar palmprint identification method based on feature fusion

Publications (2)

Publication Number Publication Date
CN115690856A (en) 2023-02-03
CN115690856B true CN115690856B (en) 2023-03-17

Family

ID=85056924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310009813.6A Active CN115690856B (en) 2023-01-05 2023-01-05 Large thenar palmprint identification method based on feature fusion

Country Status (1)

Country Link
CN (1) CN115690856B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114022914A (en) * 2021-11-11 2022-02-08 江苏理工学院 Palm print identification method based on fusion depth network
CN114581962A (en) * 2022-02-25 2022-06-03 浙江工业大学 High-resolution palm print recognition method based on multi-input convolutional neural network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845388B (en) * 2017-01-18 2020-04-14 北京交通大学 Mobile terminal palm print interesting area extraction method based on complex scene
CN109300121B (en) * 2018-09-13 2019-11-01 华南理工大学 A kind of construction method of cardiovascular disease diagnosis model, system and the diagnostic device
CN110543895B (en) * 2019-08-08 2023-06-23 淮阴工学院 Image classification method based on VGGNet and ResNet
DE102019129029A1 (en) * 2019-10-28 2021-04-29 Bayerische Motoren Werke Aktiengesellschaft OBJECT DETECTION SYSTEM AND METHOD
US20210287040A1 (en) * 2020-03-16 2021-09-16 Fares AL-QUNAIEER Training system and processes for objects to be classified
CN111292026A (en) * 2020-04-27 2020-06-16 江苏金恒信息科技股份有限公司 Scrap steel grading method and device based on neural network model fusion
CN112766413A (en) * 2021-02-05 2021-05-07 浙江农林大学 Bird classification method and system based on weighted fusion model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114022914A (en) * 2021-11-11 2022-02-08 江苏理工学院 Palm print identification method based on fusion depth network
CN114581962A (en) * 2022-02-25 2022-06-03 浙江工业大学 High-resolution palm print recognition method based on multi-input convolutional neural network

Also Published As

Publication number Publication date
CN115690856A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN108520216B (en) Gait image-based identity recognition method
CN107610087B (en) Tongue coating automatic segmentation method based on deep learning
CN106599854B (en) Automatic facial expression recognition method based on multi-feature fusion
CN105825183B (en) Facial expression recognizing method based on partial occlusion image
CN112580590A (en) Finger vein identification method based on multi-semantic feature fusion network
CN111126240B (en) Three-channel feature fusion face recognition method
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN110555380A (en) Finger vein identification method based on Center Loss function
Bao et al. Extracting region of interest for palmprint by convolutional neural networks
CN109902585A (en) A kind of three modality fusion recognition methods of finger based on graph model
CN111339932B (en) Palm print image preprocessing method and system
CN111968124B (en) Shoulder musculoskeletal ultrasonic structure segmentation method based on semi-supervised semantic segmentation
CN111079514A (en) Face recognition method based on CLBP and convolutional neural network
CN109145704B (en) Face portrait recognition method based on face attributes
CN107818299A (en) Face recognition algorithms based on fusion HOG features and depth belief network
CN111652273A (en) Deep learning-based RGB-D image classification method
Tao et al. DGLFV: Deep generalized label algorithm for finger-vein recognition
CN109902692A (en) A kind of image classification method based on regional area depth characteristic coding
CN111582057B (en) Face verification method based on local receptive field
CN115690856B (en) Large thenar palmprint identification method based on feature fusion
CN112883941A (en) Facial expression recognition method based on parallel neural network
CN110910497A (en) Method and system for realizing augmented reality map
Srininvas et al. A framework to recognize the sign language system for deaf and dumb using mining techniques
CN114627493A (en) Gait feature-based identity recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant