CN114596622A - Iris and periocular adversarial adaptive fusion recognition method driven by contrastive knowledge - Google Patents

Iris and periocular adversarial adaptive fusion recognition method driven by contrastive knowledge

Info

Publication number
CN114596622A
Authority
CN
China
Prior art keywords
iris
periocular
training
layer
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210264824.4A
Other languages
Chinese (zh)
Inventor
刘元宁
周智勇
朱晓冬
董立岩
李沅峰
刘煜
张天悦
刘帅
崔靖威
张亚星
孙野
袁一航
董楠
杨恩斌
张少强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN202210264824.4A
Publication of CN114596622A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a contrastive-knowledge-driven iris and periocular adversarial adaptive fusion recognition method, which comprises the following steps: step one, establishing a data set and dividing it into a training set and a test set; step two, obtaining a model architecture; step three, training the model architecture; step four, obtaining a convolutional encoding part; step five, training a target detection model to locate the iris and periocular regions; step six, removing the last fully-connected layer after training to obtain a periocular feature extraction model E_per; step seven, training with the iris visible-light image training set and personnel identity labels; step eight, registering the feature distributions of the different modalities in a low-dimensional subspace; step nine, setting two learnable parameters; step ten, obtaining the model feature extractor; and step eleven, determining the decision mechanism of the bimodal fusion recognition model. Advantages: the reliability and safety of the system are improved, recognition performance on small-sample data sets is improved, and the final recognition performance can be improved.

Description

Iris and periocular adversarial adaptive fusion recognition method driven by contrastive knowledge
Technical Field
The invention relates to an iris and periocular adaptive fusion recognition method, and in particular to a contrastive-knowledge-driven iris and periocular adversarial adaptive fusion recognition method.
Background
At present, as information security occupies an increasingly important position in the information society, biometric recognition technologies that use physiological traits represented by the iris, fingerprint and face, and behavioral traits represented by gait, signature and keystroke, solve problems of traditional identity authentication such as easy loss and forgery, and are being applied on a large scale in scenarios such as access control and identity recognition. Iris recognition is a non-contact biometric technology; owing to its uniqueness, stability, resistance to theft and ease of use, it is applied in fields such as the military, finance, banking, intelligent construction sites, intelligent mines and smart homes, and its accuracy among biometric technologies is second only to DNA.
Because the iris lies in an internal region of the eye and occupies a small area, the user must cooperate at a short distance for an iris image to be captured effectively; moreover, because the irises of Asian subjects are dark brown, iris acquisition requires relatively expensive near-infrared capture equipment, which has limited the civilian popularization of iris recognition. Single-modality iris recognition is also susceptible to noise, a low upper bound on the recognition rate, presentation attacks and the like. In addition, traditional multi-modal solutions such as fusing the iris with the face are limited by the long-term COVID-19 environment at home and abroad, where masks occlude the face and lower the recognition rate. Heterogeneous modality fusion further suffers from inconsistent feature distributions, which reduces the recognition rate after fusion.
Therefore, how to address the key problems of the short iris acquisition distance, the expensive acquisition equipment, the low robustness of single-modality iris recognition systems, the long-term influence of the COVID-19 environment at home and abroad on biometric technologies, and the improvement of the heterogeneous-modality fusion recognition rate is a focus of current iris recognition and multi-modal biometric technology.
Disclosure of Invention
The main object of the invention is to solve the problems of demanding acquisition conditions and low safety and reliability of existing single-modality iris recognition;
another object of the invention is to improve the performance of multi-modal biometric recognition that fuses the iris with other modalities;
a further object of the invention is to provide a training paradigm that alleviates overfitting of deep neural networks when a large-scale iris database is unavailable, improving the recognition performance of the iris and other biometric traits.
To solve the above problems and achieve these objects, the invention provides a contrastive-knowledge-driven iris and periocular adversarial adaptive fusion recognition method.
The contrastive-knowledge-driven iris and periocular adversarial adaptive fusion recognition method provided by the invention comprises the following steps:
Step one, establishing an eye visible-light image data set, an iris and periocular target detection data set, an iris visible-light image data set and a periocular visible-light image data set, and dividing each into a training set and a test set, wherein the training sets are used for training the iris and periocular region detection network model and the iris and periocular feature fusion model, and the test sets are used for evaluating the accuracy of the models;
step two, initializing three MobileNetV3 deep convolutional neural networks, each using MobileNetV3 with the last fully-connected layer removed as an encoding layer, adding a projection layer after the encoding layer and a prediction layer after the projection layer, to obtain three initialized model architectures;
step three, training the three initialized model architectures obtained in step two with the contrastive-knowledge-driven algorithm, using the eye visible-light image training set collected in step one;
step four, removing the projection layer and prediction layer of the three trained model architectures to obtain three trained MobileNetV3 convolutional encoding parts;
step five, constructing an IrisPer_YoloV3 target detection model with the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in step four as its backbone, freezing the MobileNetV3 convolutional encoding part, and training the IrisPer_YoloV3 target detection model with the iris and periocular target detection training set to locate the iris and periocular regions;
step six, adding a multilayer perceptron as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in step four, freezing the MobileNetV3 convolutional encoding part, training with the periocular visible-light image training set and the personnel identity labels, and removing the last fully-connected layer after training to obtain the periocular feature extraction model E_per;
step seven, adding a multilayer perceptron as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in step four, freezing the MobileNetV3 convolutional encoding part, training with the iris visible-light image training set and the personnel identity labels, and removing the last fully-connected layer after training to obtain the iris feature extraction model E_iris;
step eight, constructing a fully-connected neural network C that integrates classification and discrimination, and performing joint adversarial training of the iris feature extraction model E_iris and the periocular feature extraction model E_per obtained in steps six and seven together with the classification and discrimination network C, using the iris visible-light training set, the periocular visible-light training set and the identity class labels, with the aim of registering the feature distributions of the different modalities in a low-dimensional subspace;
step nine, setting two learnable parameters, constructing an iris and periocular fusion model from the iris feature extractor E_iris and the periocular feature extractor E_per obtained through the adversarial training of step eight with a multilayer perceptron added as the fusion classifier F, freezing the iris and periocular feature extraction models, training the whole fusion recognition model with the iris visible-light training set and the periocular visible-light training set, and optimizing the multilayer perceptron of the fusion classifier F;
step ten, jointly training the iris and periocular fusion model obtained in step nine with the IrisPer_YoloV3 iris and periocular synchronous detection model obtained in step five, and removing the last neural network layer to obtain the final iris and periocular region detection and fusion recognition model feature extractor;
and step eleven, using a multi-stage cooperative decision strategy as the decision mechanism of the trained iris and periocular bimodal fusion recognition model.
The specific process of establishing the eye visible-light image data set, the iris and periocular target detection data set, the iris visible-light image data set and the periocular visible-light image data set in step one is as follows:
first, 2500 eye visible-light images of template testers are collected with a high-definition camera, of which 2000 images are divided into the training set and 500 images into the test set, forming the eye visible-light image data set;
second, the iris region and periocular region of every sample in the training and test sets of eye visible-light images are labeled with the labelme tool to obtain target detection position labels for the iris and periocular regions; combining these labels with the eye visible-light image data set yields the iris and periocular target detection data set;
and third, every sample in the eye visible-light image data set is cropped according to the labels in the iris and periocular target detection data set obtained in the second step to obtain iris visible-light images and periocular visible-light images, an identity class label is assigned to each image, and the identity labels, iris visible-light images and periocular visible-light images respectively form the iris visible-light data set and the periocular visible-light data set.
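As an illustration of the third sub-step, the following minimal Python sketch (not part of the original disclosure) crops the iris and periocular regions from an eye image using labelme-style rectangle annotations and files them by identity; the file layout follows the standard labelme JSON format, and the helper name is hypothetical.

```python
import json
from pathlib import Path

from PIL import Image

def crop_regions(eye_image_path, labelme_json_path, out_dir, identity_label):
    """Crop the iris and periocular regions of one eye image from its labelme
    rectangle annotations and save them into per-modality, per-identity folders."""
    image = Image.open(eye_image_path)
    annotation = json.loads(Path(labelme_json_path).read_text())
    for shape in annotation["shapes"]:            # one rectangle per labeled region
        (x1, y1), (x2, y2) = shape["points"]      # labelme rectangle corner points
        region = image.crop((min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
        target = Path(out_dir) / shape["label"] / str(identity_label)   # e.g. iris/0017/
        target.mkdir(parents=True, exist_ok=True)
        region.save(target / Path(eye_image_path).name)
```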
The specific process of constructing the projection layer and the prediction layer in step two is as follows:
first, a 3-layer multilayer perceptron is constructed as the projection layer; each fully-connected layer has 1024 neurons and is followed by a BN (batch normalization) layer;
and second, a 2-layer multilayer perceptron is constructed as the prediction layer; every layer except the output layer is followed by a BN layer, the hidden layer has dimension 256 and the other layers have dimension 1024.
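A minimal PyTorch sketch of the encoding, projection and prediction layers described in step two, using torchvision's mobilenet_v3_large (whose 960-dimensional pooled feature agrees with the 960-neuron input stated later in the text); the use of ReLU activations between the fully-connected layers is an assumption.

```python
import torch.nn as nn
from torchvision.models import mobilenet_v3_large

def make_encoder():
    # Encoding layer f: MobileNetV3 with the final classifier removed, global-pooled to a 960-d vector.
    backbone = mobilenet_v3_large()
    return nn.Sequential(backbone.features, nn.AdaptiveAvgPool2d(1), nn.Flatten())

def make_projector(in_dim=960, dim=1024):
    # Projection layer g: 3-layer MLP, 1024 neurons per layer, each followed by batch normalization.
    return nn.Sequential(
        nn.Linear(in_dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
        nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
        nn.Linear(dim, dim), nn.BatchNorm1d(dim),
    )

def make_predictor(dim=1024, hidden=256):
    # Prediction layer h: 2-layer MLP, hidden width 256, BN on every layer except the output.
    return nn.Sequential(
        nn.Linear(dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(inplace=True),
        nn.Linear(hidden, dim),
    )
```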
The contrastive-knowledge-driven training process in step three is as follows:
first, random data augmentation is applied to the samples of the eye visible-light image training set: the eye visible-light image is randomly cropped at a scale of [0.5, 2.0] of the original image size and resized with a bilinear interpolation algorithm; the image is then randomly flipped left-right; its contrast, brightness, hue and saturation are randomly adjusted; it is randomly converted to grayscale; and finally a random blur is applied with a mean filter, Gaussian blur, median blur, bilateral filter or box blur;
second, each image X undergoes the random augmentation of the first step twice to obtain images X_1 and X_2; X_1 and X_2 are input to the convolutional encoding part f, with outputs m_1 = f(X_1) and m_2 = f(X_2); m_1 and m_2 are input to the projection layer g, with outputs n_1 = g(m_1) and n_2 = g(m_2); n_1 and n_2 are input to the prediction layer h, with outputs v_1 = h(n_1) and v_2 = h(n_2); following the architecture of a twin (Siamese) neural network, f, g and h share weights, forming the contrastive-knowledge-driven training paradigm;
third, the eye visible-light image training set is fed into the contrastive-knowledge-driven learning framework for training with a BatchSize of 512, an SGD optimizer and an initial learning rate of 0.1; the contrastive-knowledge-driven loss function is formula (1);
L_contrast = (1/2)·D(v_1, stopgrad(n_2)) + (1/2)·D(v_2, stopgrad(n_1)),   D(a, b) = −(a·b)/(‖a‖_2·‖b‖_2)   (1)
wherein: L_contrast represents the contrastive-knowledge-driven loss function, n_1 represents the output of image X_1 through f and g, n_2 represents the output of image X_2 through f and g, v_1 represents the output of n_1 through h, v_2 represents the output of n_2 through h, and stopgrad(n_2) denotes that the gradient of the vector n_2 is stopped during training (likewise for stopgrad(n_1)).
The specific process of training the IrisPer_YoloV3 target detection model in step five to locate the iris and periocular regions is as follows:
first, data-augmentation preprocessing is applied to the iris and periocular target detection data set: each sample in the target detection training set undergoes random brightness change, random contrast change, random color change, random padding, random cropping, random flipping and random rotation;
second, the training labels of the target detection model are marked: according to the image down-sampling scale of MobileNetV3, a single sample image in the training set is divided into several grids of size scale × scale, and three anchor boxes are generated in each grid; the center coordinates of each anchor box coincide with the center of its grid, and the anchor sizes are chosen to match the sizes of the iris and periocular real boxes in the training samples as closely as possible; the periocular and iris real boxes are traversed according to their position information in the training samples, and assuming the center of a real box has abscissa i_center and ordinate j_center, it lies in the grid at position (i_center/scale, j_center/scale); the IOU of the 3 anchor boxes of that grid with the real box is calculated with equation (2);
IOU = area(truth ∩ anchor) / area(truth ∪ anchor)   (2)
wherein: S_truth represents the area of the real box and S_anchor represents the area of the anchor box, so that area(truth ∪ anchor) = S_truth + S_anchor − area(truth ∩ anchor);
The anchor box with the largest IOU is taken and the objectness label of its corresponding prediction box is set to 1, where objectness represents the confidence, i.e., the possibility that the prediction box contains an iris or periocular region; if the IOU of another anchor box in the grid is not the maximum but exceeds a set threshold, the objectness training label of its corresponding prediction box is set to −1; the objectness training labels of the remaining prediction boxes are set to 0 (a label-assignment sketch follows this step). For the prediction boxes whose objectness label is 1, the position and size are marked with a vector (t_x, t_y, t_h, t_w), where t_x is the center abscissa of the iris or periocular region, t_y the center ordinate, t_h the height and t_w the width; the goal of training the network is to make the position and size of the output prediction box as close as possible to the real box, and the relationship between the prediction box and the real box should satisfy equations (3), (4), (5) and (6):
σ(t_x) + c_x = gt_x   (3)
σ(t_y) + c_y = gt_y   (4)
a_w·e^(t_w) = gt_w   (5)
a_h·e^(t_h) = gt_h   (6)
wherein: c_x is the abscissa of the cell in which the prediction box is located, c_y is the ordinate of that cell, a_w is the width of the anchor box corresponding to the prediction box, a_h is its height, gt_x represents the abscissa of the real box, gt_y the ordinate of the real box, gt_w the width of the real box, and gt_h the height of the real box;
Third, an IrisPer_YoloV3 target detection model based on MobileNetV3 is constructed. The detection model adds a three-branch convolutional detection head network after the convolutional part of MobileNetV3. Denoting the kernel size and number of a convolutional layer by (W_kernel × H_kernel, N_kernel), where W_kernel is the kernel width, H_kernel the kernel height and N_kernel the number of kernels: the first branch adds 7 convolutional layers after the fifth down-sampling convolutional layer of MobileNetV3, with kernel sizes and numbers (1×1, 512), (3×3, 1024), (1×1, 512), (3×3, 1024), (1×1, 512), (3×3, 1024), (1×1, 21); the second branch passes the output of the fifth convolutional layer of the first branch through one (1×1, 256) convolutional layer and one up-sampling operation, concatenates it along the channel direction with the output of the fourth down-sampling of MobileNetV3, and adds 7 convolutional layers after the concatenation, with kernel sizes and numbers (1×1, 256), (3×3, 512), (1×1, 256), (3×3, 512), (1×1, 256), (3×3, 512), (1×1, 21); the third branch passes the output of the fifth convolutional layer of the first branch through one (1×1, 256) convolutional layer and one up-sampling operation, concatenates it along the channel direction with the output of the third down-sampling of MobileNetV3, and adds 7 convolutional layers after the concatenation, with kernel sizes and numbers (1×1, 128), (3×3, 256), (1×1, 128), (3×3, 256), (1×1, 128), (3×3, 256), (1×1, 21);
Fourth, the training set augmented in the first step is fed into the MobileNetV3-based IrisPer_YoloV3 target detection model built in the third step; with the MobileNetV3 convolutional part frozen, the three-branch target detection head built in the third step is trained in a supervised manner according to the labeling rules of the second step. Since the prediction boxes output by the three-branch detection head have 3 scales, a loss function is computed for the prediction boxes of each scale and the results are averaged; the iris and periocular target detection loss functions are shown in (7) and (8):
Loss_k = λ_coord·Σ_i Σ_j 1_{ij}^{obj}·[(x_ij − x'_ij)² + (y_ij − y'_ij)² + (w_ij − w'_ij)² + (h_ij − h'_ij)²] + Σ_i Σ_j 1_{ij}^{obj}·[(c_ij − c'_ij)² + (p_ij − p'_ij)²] + λ_noobj·Σ_i Σ_j 1_{ij}^{noobj}·(p_ij − p'_ij)²   (7)
where the sums run over i = 0, …, S_k²−1 and j = 0, 1, 2;
Loss_mean = (Loss_1 + Loss_2 + Loss_3)/3   (8)
wherein: Loss_k represents the loss of the k-th branch output feature map; S_k denotes the size of the k-th branch output feature map; i denotes the (i+1)-th cell and j the (j+1)-th prediction box; λ_coord is the weight coefficient of the position loss; 1_{ij}^{obj} equals 1 when the (j+1)-th prediction box of the (i+1)-th cell contains an iris or periocular region and 0 otherwise; x_ij, y_ij, w_ij and h_ij are the center abscissa, center ordinate, width and height of the iris or periocular real box, and x'_ij, y'_ij, w'_ij and h'_ij are those of the prediction box; c_ij is the category of the real box and c'_ij the category of the prediction box; λ_noobj is the weight coefficient of the no-object loss; 1_{ij}^{noobj} equals 1 when the (j+1)-th prediction box of the (i+1)-th cell contains no iris or periocular region and 0 otherwise; p_ij represents the confidence of the real box and p'_ij the confidence of the prediction box.
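The IoU of equation (2) and the objectness label-assignment rule of the second sub-step can be sketched as follows for axis-aligned boxes given as (center x, center y, width, height); the exact ignore threshold is not stated in the text, so the value 0.5 used here is an assumption.

```python
def iou(box_a, box_b):
    """Equation (2): intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def assign_objectness(anchors, truth_box, ignore_thresh=0.5):
    """Objectness labels for the 3 anchors of the cell containing the real box:
    1 for the best-IoU anchor, -1 (ignored) for other anchors above the threshold, 0 otherwise."""
    ious = [iou(a, truth_box) for a in anchors]
    best = max(range(len(anchors)), key=lambda k: ious[k])
    return [1 if k == best else (-1 if ious[k] > ignore_thresh else 0) for k in range(len(anchors))]
```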
The specific process of training the periocular feature extraction model E_per in step six and the iris feature extraction model E_iris in step seven is as follows:
first, data augmentation is applied to the periocular visible-light image training set and the iris visible-light image training set; the augmentation strategy comprises random horizontal flipping, random vertical flipping, random rotation and random adjustment of brightness, hue, saturation and contrast, and each image of the augmented periocular and iris training sets is resized to 224 × 224;
second, a three-layer fully-connected neural network is constructed as the multilayer perceptron for the periocular features and for the iris features respectively; the input layer has 960 neurons, the hidden layer has 1280 neurons, and the output layer has k neurons corresponding to the total number of identity labels of the periocular and iris training sets;
third, the augmented periocular visible-light image training set or iris visible-light image training set is fed into the model as input for supervised training; the training loss function is formula (9);
L_modal = −(1/n)·Σ_{i=1}^{n} Σ_{j=1}^{k} t_j^(i)·log(q_j^(i))   (9)
wherein: modal takes the value iris to denote the loss function for training the iris feature extractor and per to denote the loss function for training the periocular feature extractor; the loss is the cross-entropy of the classification model used to train the periocular or iris network; n is the BatchSize; q_j^(i) is the soft output of the j-th neuron after the i-th periocular or iris image passes through the last layer of the model; and t_j^(i) is the j-th value of the one-hot encoding of the identity label of the currently input periocular or iris image.
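The supervised fine-tuning of steps six and seven can be sketched as follows: the contrastively pre-trained MobileNetV3 encoder is frozen and only the added multilayer perceptron (960 input, 1280 hidden, k output neurons, as stated above) is optimized with the cross-entropy loss of equation (9). The `make_encoder` helper from the earlier sketch is reused; the optimizer choice, the example class count and the `train_loader` are assumptions.

```python
import torch
import torch.nn as nn

def build_classifier(num_ids):
    # Three-layer MLP: 960 input neurons, 1280 hidden neurons, num_ids output neurons.
    return nn.Sequential(nn.Linear(960, 1280), nn.ReLU(inplace=True), nn.Linear(1280, num_ids))

encoder = make_encoder()              # contrastively pre-trained convolutional encoding part
encoder.eval()                        # frozen: BN statistics are not updated either
for p in encoder.parameters():
    p.requires_grad = False

classifier = build_classifier(num_ids=100)                        # example class count
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)    # optimizer choice is an assumption
criterion = nn.CrossEntropyLoss()                                 # cross-entropy of equation (9)

for images, identity in train_loader:   # periocular (or iris) visible-light training set
    with torch.no_grad():
        features = encoder(images)       # 960-d features from the frozen encoder
    loss = criterion(classifier(features), identity)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# E_per (or E_iris) is then the frozen encoder followed by the classifier with its
# last fully-connected layer removed, i.e. a 1280-d feature extractor.
```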
The specific process of the adversarial training in step eight is as follows:
First, the constructed classification and discrimination integrator neural network, the iris feature extraction network and the periocular feature extraction network form an adversarial modality-adaptation architecture; the integrated neural network is a fully-connected neural network with 1280 neurons in the input layer and K+1 neurons in the output layer;
Second, the samples from the iris visible-light training set are labeled {(x_i^iris, y_i^iris)}, i = 1, …, n_iris, where x_i^iris represents the i-th iris image of the training set, y_i^iris represents the identity label in one-hot vector form corresponding to the i-th iris image, and n_iris represents the total number of samples in the iris training set; the images from the periocular visible-light training set are labeled analogously as {(x_i^per, y_i^per)}, i = 1, …, n_per; modality labels are then set for the two different modalities: samples from the iris have modality label 0 and samples from the periocular region have modality label 1;
Third, the iris samples are input into the iris feature extractor E_iris and the periocular samples into the periocular feature extractor E_per, and the resulting iris and periocular feature vectors are input into the classification and discrimination integrator C to obtain the classification output vectors C(E_modal(x^modal)), where modal takes the value iris or per; C_k(E_modal(x^modal)) denotes the k-th element of the output vector of the classification and discrimination integrator, k = 1, 2, …, K. The conditional probability of the output vector is then computed; its k-th element is given by equation (10):
p_k(x^modal) = exp(C_k(E_modal(x^modal))) / Σ_{k'=1}^{K} exp(C_{k'}(E_modal(x^modal)))   (10)
From the k-th element of the output conditional probability of the classification and discrimination integrator, the k'-th element of the modality prediction vector of the k-th identity class is computed as shown in equation (11), where d_{k,k'}(x^modal) denotes the k'-th element of the modality prediction vector of the k-th identity class output by the classification and discrimination integrator (equation (11) appears only as an image in the original publication);
Fourth, in order to register the feature distributions of the iris and periocular modalities by adversarial training, the classification and discrimination integrator is trained to correctly classify the identity labels of the iris and periocular samples and, at the same time, to correctly classify their modality labels; when the iris feature extractor and the periocular feature extractor are trained, the features they generate should prevent the classification and discrimination integrator from distinguishing the modality labels, thereby achieving the goal of registering the feature distributions of the iris and periocular modalities (equations (12) to (17) below appear only as images in the original publication; their roles are described here). For the iris modality, the loss function L_iris(E_iris, C) associated with the iris feature extractor and a joint probabilistic registration loss L_align^iris(E_iris, C) are computed. The loss function L_iris(E_iris, C), shown in equation (12), is designed to maintain the adversarial suppression relationship between the prediction of the iris class and the prediction of the modality type by the iris feature extractor; it denotes the adversarial loss of the iris feature extractor E_iris. The loss function L_align^iris(E_iris, C), shown in equation (13), is designed to keep the modality type of the features extracted by the iris feature extractor discriminable while preserving the discriminability of the iris class of the feature vector, achieving registration of the joint probability distribution of iris class and modality prediction; it represents the joint probabilistic registration loss of the iris feature extractor. The joint probabilistic registration loss L_align^iris(C), seen from the viewpoint of the classification and discrimination integrator on the iris data, is computed as shown in equation (14). For the periocular modality, the loss function L_per(E_per, C) associated with the periocular feature extractor, shown in equation (15), is designed to maintain the adversarial suppression relationship between the prediction of the periocular class and the prediction of the modality type by the periocular feature extractor; it denotes the adversarial loss of the periocular feature extractor E_per. The loss function L_align^per(E_per, C), shown in equation (16), is designed to keep the modality type of the features extracted by the periocular feature extractor discriminable while preserving the discriminability of the periocular class of the feature vector, achieving registration of the joint probability distribution of periocular class and modality prediction; it represents the joint probabilistic registration loss of the periocular feature extractor. The joint probabilistic registration loss L_align^per(C) from the viewpoint of the classification and discrimination integrator on the periocular data is computed as shown in equation (17). To estimate the clusters with low error, a minimum-entropy rule is applied, as shown in equation (18):
L_ent = −Σ_{k=1}^{K} p_k(x^modal)·log p_k(x^modal)   (18)
wherein L_ent represents the minimized entropy loss of the iris or periocular modality;
Fifth, the classification and discrimination integrator is unfrozen so that its weights are learnable, while the iris feature extractor and the periocular feature extractor are frozen so that their weights are not learnable; the classification and discrimination integrator C is then trained, i.e., C is trained so that its learning objective, built from the losses defined above for both modalities, takes its minimum value;
sixth, the classification and discrimination integrator is frozen so that its weights are not learnable, while the iris feature extractor and the periocular feature extractor are unfrozen so that their weights are learnable; the iris feature extractor is then trained, i.e., E_iris is trained so that its learning objective takes its maximum value, and the periocular feature extractor is trained, i.e., E_per is trained so that its learning objective takes its maximum value;
and seventh, sub-steps three, four, five and six are repeated in a loop until the model converges or the maximum number of training rounds is reached.
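The alternating optimization of sub-steps five and six can be illustrated with the simplified PyTorch sketch below. Because equations (11) to (17) are available only as images in the original document, the sketch substitutes a plain stand-in objective: C outputs K identity logits plus one modality logit, phase A updates C to classify identities and modalities, and phase B updates the extractors to classify identities while making the modality prediction uninformative. E_iris, E_per, the paired data loaders and the use of integer identity labels are assumptions; this shows the alternation scheme, not the patent's exact losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 100                                  # number of identity classes (example value)
C = nn.Linear(1280, K + 1)               # integrated classifier/discriminator: K identity logits + 1 modality logit
opt_C = torch.optim.Adam(C.parameters(), lr=1e-4)
opt_E = torch.optim.Adam(list(E_iris.parameters()) + list(E_per.parameters()), lr=1e-4)

def split(logits):
    return logits[:, :K], logits[:, K]   # identity logits, single modality logit

for (x_iris, y_iris), (x_per, y_per) in zip(iris_loader, per_loader):   # assumed paired loaders
    ids = torch.cat([y_iris, y_per])                                    # integer identity labels (assumption)
    modality = torch.cat([torch.zeros(len(x_iris)), torch.ones(len(x_per))])   # iris = 0, periocular = 1

    # Phase A (sub-step five): C learnable, extractors held fixed via detach();
    # C learns to classify both identities and modalities.
    feats = torch.cat([E_iris(x_iris), E_per(x_per)])
    id_logits, mod_logit = split(C(feats.detach()))
    loss_C = F.cross_entropy(id_logits, ids) + F.binary_cross_entropy_with_logits(mod_logit, modality)
    opt_C.zero_grad(); loss_C.backward(); opt_C.step()

    # Phase B (sub-step six): only the extractor optimizer steps, so C is effectively frozen;
    # the extractors keep identities separable while pushing the modality prediction toward 0.5.
    id_logits, mod_logit = split(C(torch.cat([E_iris(x_iris), E_per(x_per)])))
    confused = torch.full_like(modality, 0.5)       # target: modalities indistinguishable
    loss_E = F.cross_entropy(id_logits, ids) + F.binary_cross_entropy_with_logits(mod_logit, confused)
    opt_E.zero_grad(); loss_E.backward(); opt_E.step()
```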
The specific process of step nine is as follows:
first, data augmentation is applied to the iris visible-light image training set with a strategy of random horizontal flipping, random vertical flipping and random rotation, and each image of the augmented iris visible-light training set is resized to 224 × 224; the same augmentation is applied to the periocular visible-light image training set, and each image of the augmented periocular training set is likewise resized to 224 × 224;
second, the iris and periocular fusion classifier F is constructed; the fusion classifier is a three-layer perceptron with 2560 neurons in the input layer, 2560 neurons in the hidden layer and K neurons in the output layer;
third, the augmented iris visible-light training set and the augmented periocular visible-light training set are input into the iris feature extractor and the periocular feature extractor respectively; the output of the iris feature extractor is the iris feature vector and the output of the periocular feature extractor is the periocular feature vector; two learnable parameters are set as the contribution trade-off coefficients of the iris and periocular feature vectors, each feature vector is multiplied by its trade-off coefficient, and the two vectors are concatenated along the feature direction into a 2560-dimensional fused feature vector; the whole fusion process is expressed as V_fusion = [γ_iris·V_iris, γ_per·V_per], where V_fusion represents the final iris and periocular fused feature vector, γ_iris the iris contribution trade-off coefficient, V_iris the iris feature vector output by the iris feature extractor, γ_per the periocular contribution trade-off coefficient and V_per the feature vector output by the periocular feature extractor;
fourth, the fused feature vector is input into the iris and periocular fusion classifier F and the cross-entropy loss is calculated; the iris feature extractor and the periocular feature extractor are frozen, and the fusion classifier F and the trade-off coefficients γ_per and γ_iris are updated by error back-propagation training.
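A minimal sketch of the adaptive fusion of step nine: two learnable trade-off coefficients scale the 1280-dimensional iris and periocular feature vectors, which are concatenated into the 2560-dimensional fused vector V_fusion = [γ_iris·V_iris, γ_per·V_per] and classified by the three-layer perceptron F. The frozen extractors are assumed to output 1280-dimensional vectors, and initializing the coefficients to 1 is an assumption.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, extractor_iris, extractor_per, num_ids):
        super().__init__()
        self.e_iris, self.e_per = extractor_iris, extractor_per
        for p in list(self.e_iris.parameters()) + list(self.e_per.parameters()):
            p.requires_grad = False                     # both feature extractors stay frozen
        self.gamma_iris = nn.Parameter(torch.ones(1))   # learnable contribution trade-off coefficients
        self.gamma_per = nn.Parameter(torch.ones(1))
        self.fusion_mlp = nn.Sequential(                # fusion classifier F: 2560 -> 2560 -> K
            nn.Linear(2560, 2560), nn.ReLU(inplace=True), nn.Linear(2560, num_ids))

    def forward(self, iris_img, per_img):
        v_iris = self.gamma_iris * self.e_iris(iris_img)   # 1280-d iris feature vector
        v_per = self.gamma_per * self.e_per(per_img)       # 1280-d periocular feature vector
        v_fusion = torch.cat([v_iris, v_per], dim=1)       # V_fusion = [gamma_iris*V_iris, gamma_per*V_per]
        return self.fusion_mlp(v_fusion)
```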
The specific process of training the iris and periocular region detection and fusion recognition model feature extractor in step ten is as follows:
first, data augmentation is applied to the eye visible-light image training set with a strategy of random horizontal flipping, random vertical flipping and random rotation, and each image of the augmented eye visible-light training set is resized to 224 × 224;
second, the eye visible-light image is input into the lightweight IrisPer_YoloV3 iris and periocular target detection model, which outputs the iris region coordinates and the periocular region coordinates; the iris region image and the periocular region image are cropped out using these coordinates and resized to 224 × 224 to construct a fused recognition sample, which is input into the iris and periocular feature extraction and feature fusion classification model; the output of the fusion classification model is denoted r;
third, the fusion classification loss function L_fusion is calculated; the iris and periocular region detection model IrisPer_YoloV3 is frozen, and the fusion classification model F is trained with the objective of minimizing L_fusion; the fusion classification loss function is shown in equation (19):
L_fusion = −(1/n_eye)·Σ_{i=1}^{n_eye} Σ_j t_j^(i)·log(r_j^(i))   (19)
wherein: n_eye represents the total number of fused recognition samples, r_j the j-th element of the soft output vector of the fusion classification model, and t_j the j-th element of the one-hot identity-label vector of the fused recognition sample.
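An inference-time sketch of the pipeline described in step ten: the detector returns iris and periocular box coordinates, the two regions are cropped, resized to 224 × 224 and fed to the fusion classifier, whose soft output is r. The `detector` interface and tensor layout shown here are hypothetical.

```python
import torch
import torch.nn.functional as F

def recognize(eye_image, detector, fusion_model):
    """eye_image: float tensor of shape (3, H, W). `detector` is assumed to return
    (x1, y1, x2, y2) pixel boxes for the iris and the periocular region; `fusion_model`
    is the frozen-extractor fusion classifier sketched above."""
    iris_box, per_box = detector(eye_image)                          # hypothetical detector interface

    def crop_resize(box):
        x1, y1, x2, y2 = [int(v) for v in box]
        region = eye_image[:, y1:y2, x1:x2].unsqueeze(0)             # crop the region by its coordinates
        return F.interpolate(region, size=(224, 224), mode="bilinear", align_corners=False)

    r = fusion_model(crop_resize(iris_box), crop_resize(per_box))    # soft output of the fusion classifier
    return r.softmax(dim=1)
```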
The multi-stage decision process of step eleven is as follows:
first, registration: for each class in the eye visible-light image test set, a random image is taken and passed through the iris and periocular bimodal recognition model to obtain the iris and periocular fused feature vector; all the feature vectors are stored in a feature-vector database and marked with their identity class information;
second, first-order authentication: an image is randomly selected from the eye visible-light image test set and passed through the iris and periocular bimodal recognition model to obtain its iris and periocular fused feature vector; the Euclidean-distance decision matching function between this feature vector and each registered iris and periocular fused feature vector in the database is calculated one by one, and the registered fused feature vector with the smallest distance from the feature vector to be authenticated is selected; the Euclidean-distance decision matching function is calculated with formula (20):
EuclideanDistance(X_prob, Y_gallery) = sqrt( Σ_{i=1}^{L} (X_prob^(i) − Y_gallery^(i))² )   (20)
wherein: EuclideanDistance(X_prob, Y_gallery) represents the Euclidean distance between X_prob and Y_gallery, X_prob represents the iris and periocular fused feature vector to be authenticated, X_prob^(i) represents the i-th element of X_prob, Y_gallery represents a registered iris and periocular fused feature vector in the database, Y_gallery^(i) represents the i-th element of Y_gallery, and L represents the dimension of the feature vector;
Third, second-order authentication: the Euclidean distances between the feature vector to be authenticated and the registered feature vectors in the database are compared, the registered feature vector with the smallest Euclidean distance is selected, and the cosine similarity between the feature vector to be authenticated and that registered feature vector is calculated; if the cosine similarity is greater than the threshold of 0.7, the class of the feature vector to be recognized is considered to be the same as the class of that registered iris and periocular feature vector; otherwise the identity class of the feature vector to be recognized is not in the system database and recognition fails (a two-stage matching sketch follows this step). The cosine-similarity matching decision function is calculated with formula (21):
CosineDistance(X_prob, Y_chosen) = ( Σ_{i=1}^{L} X_prob^(i)·Y_chosen^(i) ) / ( sqrt(Σ_{i=1}^{L} (X_prob^(i))²)·sqrt(Σ_{i=1}^{L} (Y_chosen^(i))²) )   (21)
wherein: CosineDistance(X_prob, Y_chosen) represents the cosine similarity between X_prob and Y_chosen, Y_chosen represents the registered feature vector selected in the second step as having the smallest Euclidean distance from X_prob, and Y_chosen^(i) represents the i-th element value of Y_chosen.
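The two-stage decision of step eleven can be sketched directly from equations (20) and (21) and the 0.7 acceptance threshold; the gallery is assumed to be a dictionary mapping registered identities to their fused feature vectors.

```python
import numpy as np

def authenticate(x_prob, gallery, threshold=0.7):
    """First-order: select the registered vector with the smallest Euclidean distance (eq. 20).
    Second-order: accept only if its cosine similarity with x_prob exceeds the threshold (eq. 21)."""
    best_id, y_chosen = min(
        gallery.items(),
        key=lambda item: np.sqrt(np.sum((x_prob - item[1]) ** 2)))    # Euclidean decision matching
    cosine = np.dot(x_prob, y_chosen) / (np.linalg.norm(x_prob) * np.linalg.norm(y_chosen))
    return best_id if cosine > threshold else None                    # None: identity not in the database
```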
The invention has the beneficial effects that:
the iris and periocular antagonism self-adaptive fusion recognition method based on contrast knowledge drive provided by the invention fuses novel biological feature periocular and iris features, avoids harsh requirements on acquisition equipment by utilizing the advantage of larger recognizable size of periocular and the high accuracy rate of small-size iris, and simultaneously improves the reliability and safety of the system through multi-mode biological feature fusion.
The invention designs a multi-stage training algorithm based on comparison knowledge drive and data drive, firstly, the comparison training is used for fully mining the discriminability feature coding spatial relationship of the iris and periocular data sets, and then the iris and periocular region detection and the iris and periocular classification downstream task fine tuning are carried out, so that the overfitting problem of training a deep neural network under the condition that the number of ocular biological feature data sets is limited is effectively avoided, and the identification performance on a small sample data set is improved.
The invention designs a modal impedance self-adaptive fusion method, which is used for registering the feature distribution of different biological modes in a low-dimensional feature subspace through impedance learning, and fusing the registered feature vectors, thereby improving the final identification performance.
Drawings
Fig. 1 is a schematic flow chart of an identification method according to the present invention.
Detailed Description
The flow chart of the present invention is shown in FIG. 1:
example 1:
The whole operating procedure carried out under the framework of claim 1 for a certain person (named Tom, whose information has not been registered before; the same iris collector is used for the test eye image and for the eye images in the database used to generate the iris and periocular fused feature vectors) is as follows:
1) Eye images for model training are collected from 100 people with any commercially available high-definition image collector, 15 eye images per person, 1500 training eye images in total; an eye visible-light original-image training data set, an iris visible-light training data set and a periocular visible-light training data set are established.
2) Three MobileNetV3 deep convolutional neural networks are initialized; each network uses MobileNetV3 with the last fully-connected layer removed as the encoding layer, a projection layer is added after the encoding layer and a prediction layer after the projection layer, giving three initialized model architectures.
3) The three initialized model architectures obtained in 2) are trained with the contrastive-knowledge-driven learning algorithm using the eye visible-light original-image training set collected in 1).
4) The projection layers and prediction layers of the three model architectures are removed, giving three trained MobileNetV3 convolutional encoding parts.
5) An IrisPer_YoloV3 target detection model is constructed with the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4) as its backbone; the MobileNetV3 convolutional encoding part is frozen, and the IrisPer_YoloV3 target detection model is trained with the iris and periocular target detection training set to locate the iris and periocular regions.
6) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the MobileNetV3 convolutional encoding part is frozen, the model is trained with the periocular visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the periocular feature extraction model E_per.
7) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the model is trained with the iris visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the iris feature extraction model E_iris.
8) A fully-connected neural network C integrating classification and discrimination is constructed; the iris feature extraction model E_iris and the periocular feature extraction model E_per obtained in 6) and 7) are adversarially trained together with C, using the iris visible-light training set, the periocular visible-light training set and the identity class labels, so as to register the feature distributions of the different modalities in a low-dimensional subspace.
9) Two learnable parameters are set; using the iris feature extractor and the periocular feature extractor obtained through the adversarial training of 8), a multilayer perceptron is added as the fusion classifier F to construct the iris and periocular fusion model; the iris and periocular feature extraction models are frozen, the whole fusion recognition model is trained with the iris visible-light training set and the periocular visible-light training set, and the multilayer perceptron of the fusion classifier F is optimized.
10) The iris and periocular fusion model obtained in 9) and the IrisPer_YoloV3 iris and periocular synchronous detection model obtained in 5) are jointly trained, and the last neural network layer is removed to obtain the final iris and periocular region detection and fusion recognition model feature extractor.
11) One eye image of each of the 100 persons in the test-set eye data set is taken, 100 images in total.
12) Inputting 100 images in 11) into a detection and fusion recognition model feature extractor for the iris and periocular region obtained in 10), outputting 100 template feature vectors, and storing the template feature vectors into a database.
13) An eye image of Tom is collected with the acquisition instrument and input into the iris and periocular region detection and fusion recognition model extractor obtained in 10); the template feature vector of Tom is output and stored in the database.
14) An eye image of Tom is collected with the acquisition instrument and input into 10); the iris and periocular region detection and fusion recognition model extractor outputs the test feature vector of Tom.
15) Using the two-stage decision strategy, the test feature vector to be authenticated is compared one by one with the template feature vectors in the database; the template feature vector whose Euclidean matching decision function value with the test feature vector is smallest is screened out, and the cosine matching decision function value between this template feature vector and the test feature vector is calculated; it is greater than the threshold of 0.7, so the template feature vector and the test feature vector are successfully matched.
16) The identity label of the matched template feature vector is Tom, so the conclusion is drawn: the template feature vector and the test feature vector belong to the same class, and both belong to Tom.
Example 2:
The whole operating procedure carried out under the framework of claim 1 for two persons (named Betty and John, whose information has not been registered before; the same iris acquisition instrument is used for the test eye images and for the eye images in the database used to generate the iris and periocular fused feature vectors) is as follows:
1) Eye images for model training are collected from 100 people with any commercially available high-definition image collector, 15 eye images per person, 1500 training eye images in total; an eye visible-light original-image training data set, an iris visible-light training data set and a periocular visible-light training data set are established.
2) Three MobileNetV3 deep convolutional neural networks are initialized; each network uses MobileNetV3 with the last fully-connected layer removed as the encoding layer, a projection layer is added after the encoding layer and a prediction layer after the projection layer, giving three initialized model architectures.
3) The three initialized model architectures obtained in 2) are trained with the contrastive-knowledge-driven learning algorithm using the eye visible-light original-image training set collected in 1).
4) The projection layers and prediction layers of the three model architectures are removed, giving three trained MobileNetV3 convolutional encoding parts.
5) An IrisPer_YoloV3 target detection model is constructed with the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4) as its backbone; the MobileNetV3 convolutional encoding part is frozen, and the IrisPer_YoloV3 target detection model is trained with the iris and periocular target detection training set to locate the iris and periocular regions.
6) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the MobileNetV3 convolutional encoding part is frozen, the model is trained with the periocular visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the periocular feature extraction model E_per.
7) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the model is trained with the iris visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the iris feature extraction model E_iris.
8) A fully-connected neural network C integrating classification and discrimination is constructed; the iris feature extraction model E_iris and the periocular feature extraction model E_per obtained in 6) and 7) are adversarially trained together with C, using the iris visible-light training set, the periocular visible-light training set and the identity class labels, so as to register the feature distributions of the different modalities in a low-dimensional subspace.
9) Two learnable parameters are set; using the iris feature extractor and the periocular feature extractor obtained through the adversarial training of 8), a multilayer perceptron is added as the fusion classifier F to construct the iris and periocular fusion model; the iris and periocular feature extraction models are frozen, the whole fusion recognition model is trained with the iris visible-light training set and the periocular visible-light training set, and the multilayer perceptron of the fusion classifier F is optimized.
10) The iris and periocular fusion model obtained in 9) and the IrisPer_YoloV3 iris and periocular synchronous detection model obtained in 5) are jointly trained, and the last neural network layer is removed to obtain the final iris and periocular region detection and fusion recognition model feature extractor.
11) One eye image of each of the 100 persons in the test-set eye data set is taken, 100 images in total.
12) Inputting 100 images in 11) into a detection and fusion recognition model feature extractor for the iris and periocular region obtained in 10), outputting 100 template feature vectors, and storing the template feature vectors into a database.
13) An eye image of Betty is collected with the acquisition instrument and input into 10); the iris and periocular region detection and fusion recognition model extractor outputs the template feature vector of Betty, which is stored in the database.
14) An eye image of John is collected with the acquisition instrument and input into 10); the iris and periocular region detection and fusion recognition model extractor outputs the test feature vector of John.
15) Using the two-stage decision strategy, the test feature vector to be authenticated is compared one by one with the template feature vectors in the database; the template feature vector whose Euclidean matching decision function value with the test feature vector is smallest is screened out, and the cosine matching decision function value between this template feature vector and the test feature vector is calculated; it is less than the threshold, so the template feature vector and the test feature vector are not matched.
16) No template feature vector in the database simultaneously has the smallest Euclidean matching decision function value with the test feature vector and a cosine matching function value greater than the threshold of 0.7, so the conclusion is drawn: the template feature vector and the test feature vector belong to different classes, and Betty's template feature vector is not of the same class as John's test feature vector.
Example 3:
The whole operating procedure carried out under the framework of claim 1 for two persons (named Martin and Robert, whose information has not been registered before; the same iris acquisition instrument is used for the test eye images and for the eye images in the database used to generate the iris and periocular fused feature vectors) is as follows:
1) Eye images for model training are collected from 100 people with any commercially available high-definition image collector, 15 eye images per person, 1500 training eye images in total; an eye visible-light original-image training data set, an iris visible-light training data set and a periocular visible-light training data set are established.
2) Three MobileNetV3 deep convolutional neural networks are initialized; each network uses MobileNetV3 with the last fully-connected layer removed as the encoding layer, a projection layer is added after the encoding layer and a prediction layer after the projection layer, giving three initialized model architectures.
3) The three initialized model architectures obtained in 2) are trained with the contrastive-knowledge-driven learning algorithm using the eye visible-light original-image training set collected in 1).
4) The projection layers and prediction layers of the three model architectures are removed, giving three trained MobileNetV3 convolutional encoding parts.
5) An IrisPer_YoloV3 target detection model is constructed with the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4) as its backbone; the MobileNetV3 convolutional encoding part is frozen, and the IrisPer_YoloV3 target detection model is trained with the iris and periocular target detection training set to locate the iris and periocular regions.
6) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the MobileNetV3 convolutional encoding part is frozen, the model is trained with the periocular visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the periocular feature extraction model E_per.
7) A multilayer perceptron is added as a classifier after the contrastive-knowledge-driven MobileNetV3 convolutional encoding part obtained in 4); the model is trained with the iris visible-light image training set and the personnel identity labels, and the last fully-connected layer is removed to obtain the iris feature extraction model E_iris.
8) A fully-connected neural network C integrating classification and discrimination is constructed; the iris feature extraction model E_iris and the periocular feature extraction model E_per obtained in 6) and 7) are adversarially trained together with C, using the iris visible-light training set, the periocular visible-light training set and the identity class labels, so as to register the feature distributions of the different modalities in a low-dimensional subspace.
9) Two learnable parameters are set; using the iris feature extractor and the periocular feature extractor obtained through the adversarial training of 8), a multilayer perceptron is added as the fusion classifier F to construct the iris and periocular fusion model; the iris and periocular feature extraction models are frozen, the whole fusion recognition model is trained with the iris visible-light training set and the periocular visible-light training set, and the multilayer perceptron of the fusion classifier F is optimized.
10) The iris and periocular fusion model obtained in 9) and the IrisPer_YoloV3 iris and periocular synchronous detection model obtained in 5) are jointly trained, and the last neural network layer is removed to obtain the final iris and periocular region detection and fusion recognition model feature extractor.
11) Taking any one eye image of each person in the eye data set of 100 persons in the test set, and totaling 100 images
12) Inputting 100 images in 11) into a detection and fusion recognition model feature extractor for the iris and periocular region obtained in 10), outputting 100 template feature vectors, and storing the template feature vectors into a database.
13) An eye image input 10) of Martin is acquired by using an acquisition instrument, and an iris and periocular region detection and fusion recognition model extractor is obtained, and template feature vectors of Martin are output and stored in a database.
14) The eye image input 10) of Robert is acquired by using an acquisition instrument, and the iris and periocular region detection and fusion recognition model extractor is obtained, and the test feature vector of Robert is output.
15) And comparing the test feature vector to be authenticated with the template feature vectors in the database one by using a two-stage decision strategy, and screening out the template feature vector which has the minimum Euclidean matching decision function value calculated with the test feature vector, wherein the cosine matching decision function value is calculated by the template feature vector and the test feature vector, and is less than a threshold value, and the template feature vector is unsuccessfully matched with the test feature vector.
16) No template feature vector in the database satisfies the requirement that the decision function value of the Euclidean matching with the test feature vector is minimum and the cosine matching function value is greater than the threshold value of 0.7, so that the conclusion is obtained: the template feature vector and the test feature vector are of different categories, and the Martin template feature vector and the Robert test feature vector are not of the same category.
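The registration and authentication flow of steps 11)-16) can be condensed into a short sketch. Here `extractor` stands in for the iris and periocular region detection and fusion recognition model feature extractor from step 10); its call signature, the image handling and the helper names are illustrative assumptions rather than code from this disclosure.

```python
import numpy as np

def enroll(extractor, images_by_person):
    """Steps 11)-13): one template feature vector per registered person."""
    return {name: extractor(image) for name, image in images_by_person.items()}

def authenticate(extractor, probe_image, database, threshold=0.7):
    """Steps 14)-16): two-stage decision against the template database."""
    probe = extractor(probe_image)
    # stage 1: template with the minimum Euclidean matching decision value
    name, template = min(database.items(),
                         key=lambda kv: np.linalg.norm(probe - kv[1]))
    # stage 2: cosine matching decision value must exceed the 0.7 threshold
    cosine = np.dot(probe, template) / (np.linalg.norm(probe) * np.linalg.norm(template))
    return name if cosine > threshold else None  # None: match fails, identity unregistered
```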

Claims (10)

1. The iris and periocular antagonism adaptive fusion recognition method based on contrast knowledge drive is characterized by comprising the following steps:
step one, establishing an eye visible light image data set, an iris and periocular target detection data set, an iris visible light image data set and a periocular visible light image data set, and dividing the data sets into a training set and a test set, wherein the training set is used for training the iris and periocular region detection network model and the iris and periocular feature fusion model, and the test set is used for evaluating the accuracy of the models;
step two, initializing three MobileNetV3 deep convolutional neural networks, wherein each network uses the MobileNetV3 with the last fully connected layer removed as the encoding layer, a projection layer is added behind the encoding layer and a prediction layer is added behind the projection layer, to obtain three initialized model architectures;
step three, training the three initialized model architectures obtained in step two with the contrastive knowledge-driven algorithm, using the eye visible light image training set acquired in step one;
step four, removing the projection layers and the prediction layers of the three trained model architectures to obtain three trained MobileNetV3 convolutional encoding parts;
step five, using the MobileNetV3 convolutional encoding part obtained by contrastive knowledge-driven training in step four as the backbone, constructing the IrisPer_YoloV3 target detection model, freezing the MobileNetV3 convolutional encoding part, and training the IrisPer_YoloV3 target detection model with the iris and periocular target detection training set to locate the iris and the periocular region;
step six, adding a multilayer perceptron as a classifier behind the MobileNetV3 convolutional encoding part obtained by contrastive knowledge-driven training in step four, freezing the MobileNetV3 convolutional encoding part, training with the periocular visible light image training set and the person identity labels, and removing the last fully connected layer after training to obtain the periocular feature extraction model E_per;
step seven, adding a multilayer perceptron as a classifier behind the MobileNetV3 convolutional encoding part obtained by contrastive knowledge-driven training in step four, freezing the MobileNetV3 convolutional encoding part, training with the iris visible light image training set and the person identity labels, and removing the last fully connected layer after training to obtain the iris feature extraction model E_iris;
step eight, constructing a classification and discriminator integrated fully connected neural network C, and carrying out joint adversarial training of the iris feature extraction model E_iris and the periocular feature extraction model E_per obtained in step six and step seven, together with the classification and discriminator integrated fully connected neural network C and the identity class labels, using the iris visible light training set and the periocular visible light training set, with the aim of registering the features of the different modalities distributed in a low-dimensional subspace;
step nine, setting two learnable parameters, using the iris feature extractor E_iris and the periocular feature extractor E_per obtained through the adversarial training of step eight, adding a multilayer perceptron as a fusion classifier F to construct the iris and periocular fusion model, freezing the iris feature extraction model and the periocular feature extraction model, training the whole fusion recognition model with the iris visible light training set and the periocular visible light training set, and optimizing the multilayer perceptron of the fusion classifier F;
step ten, jointly training the iris and periocular fusion model obtained in step nine and the iris and periocular synchronous detection model, namely the IrisPer_YoloV3 target detection model obtained in step five, and removing the last neural network layer to obtain the final iris and periocular region detection and fusion recognition model feature extractor;
and step eleven, using a scheme of cooperating multi-stage decision strategies as the decision mechanism of the trained iris and periocular bimodal fusion recognition model.
2. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of establishing the eye visible light image data set, the iris and periocular target detection data set, the iris visible light image data set and the periocular visible light image data set in the first step is as follows:
the method comprises the following steps: firstly, 2500 eye visible light images of testers are collected with a high-definition camera, of which 2000 images are divided into the training set and 500 images into the test set, forming the eye visible light image data set;
secondly, labeling an iris region and a periocular region of each sample in the training set and the testing set of the eye visible light image by using a labelme tool to obtain target detection position labels of the iris region and the periocular region, and combining the labels and the eye visible light image data set to obtain an iris and periocular target detection data set;
and thirdly, cutting each sample in the eye visible light image dataset according to the labels in the iris and periocular target detection dataset obtained in the second step to obtain an iris visible light image and a periocular visible light image, distributing an identity type label for each image, and respectively constructing an iris visible light dataset and a periocular visible light dataset by the identity labels, the iris visible light image and the periocular visible light image.
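The cropping in the third step can be illustrated with a short sketch. It assumes labelme rectangle annotations with one shape per region and hypothetical label names `iris` and `periocular`; the file layout and label names are illustrative, not taken from this disclosure.

```python
import json
from PIL import Image

def crop_regions(image_path, labelme_json_path):
    """Cut the iris and periocular regions out of one eye image using the
    rectangle annotations produced by labelme (one shape per region)."""
    image = Image.open(image_path)
    with open(labelme_json_path, "r", encoding="utf-8") as f:
        annotation = json.load(f)
    crops = {}
    for shape in annotation["shapes"]:
        (x1, y1), (x2, y2) = shape["points"]           # two rectangle corners
        box = (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
        crops[shape["label"]] = image.crop(box)         # e.g. "iris" or "periocular"
    return crops

# Hypothetical usage: save both crops, keeping the identity label in the file name.
# crops = crop_regions("eye_0001.jpg", "eye_0001.json")
# crops["iris"].save("iris_0001.jpg"); crops["periocular"].save("periocular_0001.jpg")
```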
3. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of constructing the projection layer and the prediction layer in the second step is as follows:
the method comprises the following steps: firstly, a 3-layer multilayer perceptron is constructed as the projection layer, each fully connected layer having 1024 neurons and being followed by a BN layer, namely a batch normalization layer;
secondly, a 2-layer multilayer perceptron is constructed as the prediction layer; except for the output layer, every layer is followed by a BN layer, the dimension of the hidden layer is 256 and the dimensions of the other layers are 1024.
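A minimal PyTorch sketch of the two heads described in this claim. The 960-dimensional encoder output is an assumption borrowed from the 960-neuron input layer mentioned in claim 6, and the ReLU activations are likewise assumptions, since the claim only fixes layer widths and batch normalization.

```python
import torch
import torch.nn as nn

class ProjectionLayer(nn.Module):
    """3-layer MLP projection head: 1024 units per fully connected layer,
    each followed by batch normalization (BN)."""
    def __init__(self, in_dim=960, dim=1024):
        super().__init__()
        # ReLU activations are an assumption; the claim only specifies widths and BN
        self.net = nn.Sequential(
            nn.Linear(in_dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
            nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
            nn.Linear(dim, dim), nn.BatchNorm1d(dim),
        )

    def forward(self, x):
        return self.net(x)

class PredictionLayer(nn.Module):
    """2-layer MLP prediction head: hidden dimension 256, other dimensions 1024,
    BN on every layer except the output layer."""
    def __init__(self, dim=1024, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):
        return self.net(x)
```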
4. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the contrastive knowledge-driven training process in step three is as follows:
firstly, random data enhancement is carried out on the samples in the eye visible light image training set: the eye visible light image is randomly cropped at a ratio in [0.5, 2.0] of the original image size and resized with a bilinear interpolation algorithm, then randomly flipped left-right; the contrast, brightness, hue and saturation of the image are randomly adjusted, the image is randomly converted to grayscale, and finally the image is randomly blurred with a mean filter, Gaussian blur, median blur, a bilateral filter and box blur;
secondly, the random enhancement operation of the first step is applied twice to each image X to obtain two images X_1 and X_2; the two images X_1 and X_2 are respectively input into the convolutional encoding part f, with outputs denoted m_1 = f(X_1) and m_2 = f(X_2); m_1 and m_2 are input into the projection layer g, with outputs n_1 = g(m_1) and n_2 = g(m_2); n_1 and n_2 are input into the prediction layer h, with outputs denoted v_1 = h(n_1) and v_2 = h(n_2); following the weight-sharing architecture idea of the Siamese neural network for f, g and h, the contrastive knowledge-driven training paradigm is constructed;
thirdly, the eye visible light image training set is fed into the contrastive knowledge-driven learning framework for training with BatchSize 512, the SGD optimizer and an initial learning rate of 0.1, where the contrastive knowledge-driven loss function is formula (1);
L_contrast = -(1/2)·(v_1·Stopgrad(n_2))/(‖v_1‖·‖Stopgrad(n_2)‖) - (1/2)·(v_2·Stopgrad(n_1))/(‖v_2‖·‖Stopgrad(n_1)‖) #(1)
wherein L_contrast represents the contrastive knowledge-driven loss function, n_1 is the output of image X_1 via f and g, n_2 is the output of image X_2 via f and g, v_1 is the output of n_1 via h, v_2 is the output of n_2 via h, and Stopgrad(n_2) indicates that the gradient through the vector n_2 is stopped during training.
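Because formula (1) is only given as an image in the original, the sketch below shows one plausible reading of the contrastive knowledge-driven loss: a SimSiam-style negative cosine similarity with stop-gradient, symmetrized over the two augmented views, using the symbols defined above.

```python
import torch
import torch.nn.functional as F

def contrast_loss(v1, v2, n1, n2):
    """Contrastive knowledge-driven loss with stop-gradient:
    v1, v2 are prediction-layer outputs, n1, n2 are projection-layer outputs."""
    # detach() plays the role of Stopgrad(.), cutting the gradient through n1/n2
    loss12 = -F.cosine_similarity(v1, n2.detach(), dim=-1).mean()
    loss21 = -F.cosine_similarity(v2, n1.detach(), dim=-1).mean()
    return 0.5 * (loss12 + loss21)

# One training step sketch: x1, x2 are two random augmentations of a batch X.
# m1, m2 = f(x1), f(x2); n1, n2 = g(m1), g(m2); v1, v2 = h(n1), h(n2)
# loss = contrast_loss(v1, v2, n1, n2); loss.backward(); optimizer.step()
```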
5. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of training the IrisPer_YoloV3 target detection model to locate the iris and the periocular region in step five is as follows:
firstly, data enhancement preprocessing is carried out on the iris and periocular target detection data set: each sample in the target detection training set is subjected to random brightness change, random contrast change, random color change, random padding, random cropping, random flipping and random rotation;
secondly, the training labels of the target detection model are marked; according to the image down-sampling scale of MobileNetV3, a single sample image in the training set is divided into a number of cells of size scale × scale, and three anchor boxes are generated in each cell, the center point coordinate of each anchor box being the same as the center point coordinate of the cell and the anchor box sizes being chosen to be as consistent as possible with the sizes of the iris and periocular real boxes in the training samples; the iris real box and the periocular real box are traversed according to their position information in the training sample; assuming that the abscissa of the center point of a real box is i_center and its ordinate is j_center, the real box is located in the cell at position (i_center/scale, j_center/scale), and the IOU of the 3 anchor boxes of this cell with the real box is calculated using equation (2);
IOU = (S_truth ∩ S_anchor) / (S_truth ∪ S_anchor) #(2)
wherein: S_truth represents the area of the real box and S_anchor represents the area of the anchor box;
the anchor box with the largest IOU is taken and the object label of the corresponding prediction box is set to 1, wherein object represents the confidence, i.e. the possibility that the prediction box contains an iris or a periocular region; if the IOU of another anchor box in the cell is not the maximum but exceeds the set threshold, the object training label of the corresponding prediction box is set to -1, and the object training labels of the remaining prediction boxes are set to 0; for the prediction boxes with object label 1, the position and size are marked with a vector (t_x, t_y, t_h, t_w), where t_x is the central abscissa of the iris or periocular region, t_y is the central ordinate of the iris or periocular region, t_h is the height of the iris or periocular region and t_w is the width of the iris or periocular region; the goal of training the network is to make the position and size of the output prediction box as close as possible to the real box, and the position and size relationship between the prediction box and the real box should satisfy the following equations (3), (4), (5), (6):
σ(t_x) + c_x = gt_x #(3)
σ(t_y) + c_y = gt_y #(4)
a_w·e^(t_w) = gt_w #(5)
a_h·e^(t_h) = gt_h #(6)
wherein: c_x is the abscissa of the cell in which the prediction box is located, c_y is the ordinate of the cell in which the prediction box is located, a_w is the width of the anchor box corresponding to the prediction box, a_h is the height of the anchor box corresponding to the prediction box, gt_x represents the abscissa of the real box, gt_y represents the ordinate of the real box, gt_w represents the width of the real box, and gt_h represents the height of the real box;
thirdly, the IrisPer_YoloV3 target detection model based on MobileNetV3 is constructed; the detection model is formed by adding a three-branch convolutional detection head network after the convolutional part of MobileNetV3; the kernel size and number of a convolutional layer are denoted (W_kernel × H_kernel, N_kernel), where W_kernel represents the width of the convolution kernel, H_kernel represents the height of the convolution kernel and N_kernel represents the number of convolution kernels; the first branch takes the convolutional layer obtained by the fifth down-sampling of MobileNetV3 and adds 7 convolutional layers, whose kernel sizes and numbers are respectively (1 × 1, 512), (3 × 3, 1024), (1 × 1, 512), (3 × 3, 1024), (1 × 1, 512), (3 × 3, 1024), (1 × 1, 21); the second branch is obtained by first passing the output of the fifth convolutional layer of the first branch through a (1 × 1, 256) convolutional layer and an up-sampling operation, splicing the result with the output of the fourth down-sampling of MobileNetV3 along the channel direction, and adding 7 convolutional layers after the splicing operation, whose kernel sizes and numbers are respectively (1 × 1, 256), (3 × 3, 512), (1 × 1, 256), (3 × 3, 512), (1 × 1, 256), (3 × 3, 512), (1 × 1, 21); the third branch is obtained by first passing the output of the fifth convolutional layer of the first branch through a (1 × 1, 256) convolutional layer and an up-sampling operation, splicing the result with the output of the third down-sampling of MobileNetV3 along the channel direction, and adding 7 convolutional layers after the splicing operation, whose kernel sizes and numbers are respectively (1 × 1, 128), (3 × 3, 256), (1 × 1, 128), (3 × 3, 256), (1 × 1, 128), (3 × 3, 256), (1 × 1, 21);
and fourthly, the training set after the data enhancement of the first step is fed into the IrisPer_YoloV3 target detection model based on MobileNetV3 established in the third step; the convolutional part of MobileNetV3 is frozen and, according to the labeling rule of the training labels in the second step, supervised training is performed on the three-branch target detection head established in the third step; since the prediction boxes output by the three-branch target detection head network have 3 scales, the loss function must be computed for the prediction boxes of each scale and the values then averaged, the iris and periocular target detection loss functions being as shown in (7) and (8):
Loss_k = λ_coord·Σ_{i=0}^{S_k×S_k-1} Σ_{j=0}^{2} 1_{ij}^{obj}·[(x_ij − x̂_ij)² + (y_ij − ŷ_ij)² + (w_ij − ŵ_ij)² + (h_ij − ĥ_ij)²] + Σ_{i=0}^{S_k×S_k-1} Σ_{j=0}^{2} 1_{ij}^{obj}·(c_ij − ĉ_ij)² + Σ_{i=0}^{S_k×S_k-1} Σ_{j=0}^{2} 1_{ij}^{obj}·(p_ij − p̂_ij)² + λ_noobj·Σ_{i=0}^{S_k×S_k-1} Σ_{j=0}^{2} 1_{ij}^{noobj}·(p_ij − p̂_ij)² #(7)
Loss_mean = (Loss_1 + Loss_2 + Loss_3)/3 #(8)
wherein Loss_k represents the loss of the k-th branch output feature map, S_k denotes the size of the k-th branch output feature map, i denotes the (i+1)-th cell, j denotes the (j+1)-th prediction box, λ_coord represents the weight coefficient of the position loss, 1_{ij}^{obj} takes the value 1 when the (j+1)-th prediction box in the (i+1)-th cell contains an iris or a periocular region and 0 otherwise, x_ij is the central abscissa of the iris or periocular real box and x̂_ij is the central abscissa of the prediction box, y_ij is the central ordinate of the iris or periocular real box and ŷ_ij is the central ordinate of the prediction box, w_ij is the width of the iris or periocular real box and ŵ_ij is the width of the prediction box, h_ij is the height of the iris or periocular real box and ĥ_ij is the height of the prediction box, c_ij is the category of the real box and ĉ_ij is the category of the prediction box, λ_noobj represents the weight coefficient of the no-object class loss, 1_{ij}^{noobj} takes the value 1 when the (j+1)-th prediction box in the (i+1)-th cell contains no iris or periocular region and 0 otherwise, p_ij represents the confidence of the real box and p̂_ij represents the confidence of the prediction box.
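For the anchor-assignment step, a small sketch of the IOU of formula (2) on axis-aligned boxes; representing a box as (x1, y1, x2, y2) corner coordinates is an assumption made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# e.g. pick, for each real box, the anchor with the largest IOU among the 3 anchors:
# best_anchor = max(range(3), key=lambda j: iou(real_box, anchors[j]))
```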
6. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of training the periocular feature extraction model E_per in step six and the iris feature extraction model E_iris in step seven is as follows:
firstly, data expansion is carried out on the periocular visible light image training set and the iris visible light image training set, the expansion strategy being random flipping in the horizontal direction, random flipping in the vertical direction, random rotation and random adjustment of brightness, hue, saturation and contrast, and the size of each image in the expanded periocular image training set and the expanded iris image training set is adjusted to 224 × 224;
secondly, a three-layer fully connected neural network is constructed for the periocular features and for the iris features respectively to serve as the multilayer perceptron, wherein the input layer comprises 960 neurons, the hidden layer comprises 1280 neurons, and the output layer comprises k neurons corresponding to the total number of identity labels of the periocular and iris training sets;
thirdly, the expanded periocular visible light image training set or the expanded iris visible light image training set is fed into the model as input for supervised training, the training loss function being formula (9);
L_CE^modal = -(1/n)·Σ_{i=1}^{n} Σ_{j=1}^{k} y_ij·log(ŷ_ij) #(9)
wherein: modal takes the value iris when formula (9) represents the loss function for training the iris feature extractor and per when it represents the loss function for training the periocular feature extractor; L_CE^modal denotes the cross-entropy loss of the classification model used for training the periocular or iris branch; n is the BatchSize; ŷ_ij is the soft output of the j-th neuron after the i-th periocular image or iris image passes through the last layer of the model; and y_ij is the j-th value in the Onehot encoding of the identity label of the currently input periocular image or iris image.
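A hedged PyTorch sketch of how one modality's feature extractor could be trained as described in this claim: a frozen MobileNetV3 encoder, a 960-1280-k perceptron classifier on top, cross-entropy supervision, and the last fully connected layer dropped afterwards. Using torchvision's mobilenet_v3_large as a stand-in for the contrastively pretrained encoder, and the optimizer settings, are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_large

k = 100  # total number of identity labels (assumption for illustration)

encoder = mobilenet_v3_large(weights=None).features   # stand-in for the trained encoder
pool = nn.AdaptiveAvgPool2d(1)
classifier = nn.Sequential(                            # 960 -> 1280 -> k perceptron
    nn.Linear(960, 1280), nn.ReLU(inplace=True), nn.Linear(1280, k)
)

for p in encoder.parameters():                         # freeze the encoder weights
    p.requires_grad = False

optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()                      # cross-entropy supervision, cf. formula (9)

def train_step(images, labels):
    feats = pool(encoder(images)).flatten(1)           # (B, 960) pooled features
    loss = criterion(classifier(feats), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

# After training, drop the last fully connected layer to keep the feature extractor:
feature_extractor = nn.Sequential(encoder, pool, nn.Flatten(1), classifier[:-1])
```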
7. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of the confrontation training in the step eight is as follows:
the method comprises the following steps: firstly, the constructed classification-and-discrimination integrator neural network, the iris feature extraction network and the periocular feature extraction network form an adversarial modality-adaptation architecture, wherein the integrated neural network is a fully connected neural network with 1280 neurons in the input layer and k+1 neurons in the output layer;
secondly, the samples from the iris visible light training set are labeled as pairs consisting of an iris image and the identity label of that image in Onehot vector form, the i-th pair containing the i-th iris image of the training set and its corresponding Onehot identity label, with n_iris denoting the total number of samples in the iris training set; the images from the periocular visible light training set are labeled in the same way; a modal label is set for the two different modalities of iris and periocular region, the samples from the iris being given modal label 0 and the samples from the periocular region modal label 1;
thirdly, the iris samples are input into the iris feature extractor E_iris and the periocular samples into the periocular feature extractor E_per, and the resulting iris and periocular feature vectors are input into the classification-and-discrimination integrator C to obtain its classification output vector, where the modality indicator item takes the value iris or per according to the modality of the input sample; for k = 1, 2, …, K, the conditional probability of the output vector is calculated, the k-th element value of the conditional probability being as shown in formula (10); the modal prediction vector of the k-th class identity label is then calculated from the k-th element value of the conditional probability output by the classifier-and-discrimination integrator, as shown in formula (11), where the k'-th element values denote the elements of the modal prediction vector of the k-th class identity label output by the classifier-and-discrimination integrator;
fourthly, in order to register the feature distributions of the iris and periocular modalities through adversarial training, the classification-and-discrimination integrator is trained to correctly classify the identity labels of the iris and periocular modalities and, at the same time, to correctly classify their modal labels, while the iris feature extractor and the periocular feature extractor are trained so that the features they generate prevent the classification-and-discrimination integrator from distinguishing the modal labels, thereby achieving the goal of registering the feature distributions of the iris and periocular modalities; the loss functions related to the iris feature extractor are the adversarial loss L_iris(E_iris, C) and the joint probabilistic registration loss of the iris feature extractor; the loss function L_iris(E_iris, C), shown in formula (12), is designed to ensure a state of mutual adversarial suppression between the prediction of the iris class and the prediction of the modality type, L_iris(E_iris, C) being the adversarial loss of the iris model feature extractor E_iris; the joint probabilistic registration loss of the iris feature extractor, shown in formula (13), is designed to keep the type of the feature modality extracted by the iris feature extractor discriminative and to preserve the discriminability of the iris class of the feature vector, so as to achieve registration of the joint probability distribution of the predicted iris class and modality; the joint probabilistic registration loss from the view angle of the classification-and-integrated-discriminator associated with the iris data is calculated as shown in formula (14); the loss functions related to the periocular feature extractor are calculated analogously: the loss function L_per(E_per, C), shown in formula (15), is designed to ensure a state of mutual adversarial suppression between the prediction of the periocular class and the prediction of the modality type, L_per(E_per, C) being the adversarial loss of the periocular feature extractor E_per; the joint probabilistic registration loss of the periocular feature extractor, shown in formula (16), is designed to keep the type of the feature modality extracted by the periocular feature extractor discriminative and to preserve the discriminability of the periocular class of the feature vector, so as to achieve registration of the joint probability distribution of the predicted periocular class and modality; the joint probabilistic registration loss from the view angle of the classification-and-integrated-discriminator associated with the periocular data is calculated as shown in formula (17); in addition, to estimate the clusters with low error, a minimum entropy rule, representing the minimum entropy loss of the iris or periocular features, is calculated as shown in formula (18);
fifthly, the classification-and-discrimination integrator is unfrozen so that its weights are learnable, while the iris feature extractor is frozen so that its weights are not learnable and the periocular feature extractor is frozen so that its weights are not learnable; the classification-and-discrimination integrator C is then trained so that its learning objective takes the minimum value;
sixthly, the classification-and-discrimination integrator is frozen so that its weights are not learnable, while the iris feature extractor is unfrozen so that its weights are learnable and the periocular feature extractor is unfrozen so that its weights are learnable; the iris feature extractor E_iris is trained so that its learning objective takes the maximum value, and the periocular feature extractor E_per is trained so that its learning objective takes the maximum value;
and step seven, repeating steps three, four, five and six cyclically until the model converges or the maximum number of training rounds is reached.
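The alternating optimization of the fifth and sixth steps can be sketched as a freeze/unfreeze loop. Since formulas (12)-(18) are only given as images in the original, `integrator_loss` and `extractor_loss` below are placeholders for those objectives, and the whole skeleton is an assumption-level illustration rather than the disclosed implementation.

```python
import torch

def set_trainable(module, flag):
    """Freeze or unfreeze all weights of a sub-network."""
    for p in module.parameters():
        p.requires_grad = flag

def adversarial_round(E_iris, E_per, C, opt_C, opt_E,
                      integrator_loss, extractor_loss, batch):
    # Step five: train the classification-and-discrimination integrator C
    set_trainable(C, True); set_trainable(E_iris, False); set_trainable(E_per, False)
    loss_c = integrator_loss(E_iris, E_per, C, batch)      # objective minimized w.r.t. C
    opt_C.zero_grad(); loss_c.backward(); opt_C.step()

    # Step six: train the two feature extractors against the frozen C
    set_trainable(C, False); set_trainable(E_iris, True); set_trainable(E_per, True)
    loss_e = extractor_loss(E_iris, E_per, C, batch)       # extractor objective
    opt_E.zero_grad(); (-loss_e).backward(); opt_E.step()  # maximize by negating
    return loss_c.item(), loss_e.item()

# Step seven: call adversarial_round(...) repeatedly until convergence
# or until a maximum number of rounds is reached.
```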
8. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the concrete process of the ninth step is as follows:
firstly, data expansion is carried out on the iris visible light image training set, the expansion strategy being random flipping in the horizontal direction, random flipping in the vertical direction and random rotation, and the size of each image of the expanded iris visible light image training set is adjusted to 224 × 224; data expansion is likewise carried out on the periocular visible light image training set with the same strategy, and the size of each image of the expanded periocular visible light image training set is adjusted to 224 × 224;
secondly, constructing an iris and periocular fusion classifier F, wherein the fusion classifier is a three-layer perceptron, the number of neurons of an input layer is 2560, the number of neurons of a hidden layer is 2560, and the number of neurons of an output layer is K;
thirdly, the expanded iris visible light image training set and the expanded periocular visible light training set are respectively input into the iris feature extractor and the periocular feature extractor; the output of the iris feature extractor is the iris feature vector and the output of the periocular feature extractor is the periocular feature vector; two learnable parameters are set as the contribution trade-off coefficients of the iris feature vector and the periocular feature vector, the iris feature vector and the periocular feature vector are multiplied by their respective trade-off coefficients, and the two vectors are spliced in the feature direction to obtain a 2560-dimensional fusion feature vector, the whole fusion process being expressed as V_fusion = [γ_iris·V_iris, γ_per·V_per], where V_fusion represents the final iris and periocular fused feature vector, γ_iris represents the iris contribution trade-off coefficient, V_iris represents the iris feature vector output by the iris feature extractor, γ_per represents the periocular contribution trade-off coefficient and V_per represents the periocular feature vector output by the periocular feature extractor;
fourthly, the fusion feature vector is input into the iris and periocular fusion classifier F and the cross-entropy loss is calculated; the iris feature extractor and the periocular feature extractor are frozen, and the iris and periocular fusion classifier F and the trade-off coefficients γ_per and γ_iris are updated through error back-propagation training.
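A small PyTorch sketch of the fusion described in the third and fourth steps: two learnable trade-off coefficients scale the per-modality feature vectors before concatenation into the 2560-dimensional fused vector that feeds the fusion classifier F. The 1280 dimensions per modality follow from the 2560-dimensional concatenation; the initial coefficient values and the ReLU activation are assumptions.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Learnable-weight concatenation of iris and periocular features followed by
    the three-layer fusion perceptron F (2560 -> 2560 -> K)."""
    def __init__(self, num_classes, feat_dim=1280):
        super().__init__()
        self.gamma_iris = nn.Parameter(torch.tensor(1.0))  # iris trade-off coefficient
        self.gamma_per = nn.Parameter(torch.tensor(1.0))   # periocular trade-off coefficient
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 2 * feat_dim), nn.ReLU(inplace=True),
            nn.Linear(2 * feat_dim, num_classes),
        )

    def forward(self, v_iris, v_per):
        # V_fusion = [gamma_iris * V_iris, gamma_per * V_per], a 2560-d vector
        v_fusion = torch.cat([self.gamma_iris * v_iris,
                              self.gamma_per * v_per], dim=1)
        return self.mlp(v_fusion)

# Training sketch: freeze E_iris and E_per, then update only F and the two gammas
# by back-propagating a cross-entropy loss on the fused logits.
```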
9. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of training the iris and periocular region detection and fusion recognition model feature extractor in step ten is as follows:
firstly, data expansion is carried out on the eye visible light image training set, the expansion strategy being random flipping in the horizontal direction, random flipping in the vertical direction and random rotation, and the size of each image of the expanded eye visible light image training set is adjusted to 224 × 224;
secondly, the eye visible light image is input into the lightweight IrisPer_YoloV3 iris and periocular target detection model; the detection model outputs the iris region coordinates and the periocular region coordinates, the iris region image and the periocular region image are cut out from these coordinates and resized to 224 × 224 to construct a fusion recognition sample; the fusion recognition sample is input into the iris and periocular feature extraction and feature fusion classification model, and the output of the fusion classification model is denoted r;
thirdly, the fusion classification loss function L_fusion is calculated; the iris and periocular region detection model IrisPer_YoloV3 is frozen and the fusion classification model F is trained with the goal of minimizing the loss L_fusion, the fusion classification loss function being shown in equation (19):
L_fusion = -(1/n_eye)·Σ_{i=1}^{n_eye} Σ_{j} t_j·log(r_j) #(19)
wherein: n_eye represents the total number of fusion recognition samples, r_j represents the j-th element value of the soft output vector of the fusion classification model, and t_j represents the j-th element value of the identity label Onehot vector of the fusion recognition sample.
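The joint forward pass of the second step can be sketched end to end, assuming the detection model returns one iris box and one periocular box per eye image as (x1, y1, x2, y2) pixel coordinates (the disclosure does not fix that output format); the resize to 224 × 224 follows the claim, and `fusion_classifier` can be a module such as the one sketched under claim 8.

```python
import torch
import torch.nn.functional as F

def fused_forward(eye_image, detector, E_iris, E_per, fusion_classifier):
    """eye_image: (1, 3, H, W) tensor; detector returns iris and periocular boxes."""
    iris_box, per_box = detector(eye_image)            # assumed (x1, y1, x2, y2) each

    def crop_and_resize(box):
        x1, y1, x2, y2 = [int(v) for v in box]
        region = eye_image[:, :, y1:y2, x1:x2]          # crop the detected region
        return F.interpolate(region, size=(224, 224), mode="bilinear",
                             align_corners=False)

    iris_img, per_img = crop_and_resize(iris_box), crop_and_resize(per_box)
    v_iris, v_per = E_iris(iris_img), E_per(per_img)    # per-modality feature vectors
    return fusion_classifier(v_iris, v_per)             # soft output r of the fusion model
```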
10. The adaptive iris and periocular antagonism fusion recognition method based on contrast knowledge driving as claimed in claim 1, wherein: the specific process of the multi-stage decision of the step eleven is as follows:
the method comprises the following steps: firstly, registration: a random image of each class in the eye visible light image test set is passed through the iris and periocular bimodal recognition model to extract the iris and periocular fusion feature vector, all feature vectors are stored in a feature vector database, and their identity class information is marked;
secondly, first-order authentication: an image is randomly selected from the eye visible light image test set and passed through the iris and periocular bimodal recognition model to extract the iris and periocular fusion feature vector; the Euclidean distance decision matching function between this feature vector and each registered iris and periocular fusion feature vector in the database is calculated one by one, and the registered fusion feature vector with the minimum distance to the feature vector to be authenticated is selected, the Euclidean distance decision matching function being calculated as in (20):
EuclideanDistance(X_prob, Y_gallery) = sqrt( Σ_{i=1}^{L} (X_prob,i − Y_gallery,i)² ) #(20)
wherein: EuclideanDistance(X_prob, Y_gallery) represents the Euclidean distance between X_prob and Y_gallery, X_prob represents the iris and periocular fusion feature vector to be authenticated and X_prob,i its i-th element, Y_gallery represents an iris and periocular fusion feature vector registered in the database and Y_gallery,i its i-th element, and L represents the dimension of the feature vector;
and thirdly, second-order authentication: among the registered feature vectors in the database, the one with the minimum Euclidean distance to the feature vector to be authenticated is selected, and the cosine similarity between the feature vector to be authenticated and this registered feature vector is calculated; if the cosine similarity is greater than the threshold 0.7, the class of the feature vector to be identified is determined to be the same as the class of that iris and periocular feature vector in the database; otherwise the identity class of the feature vector to be identified is determined not to be in the system database and the recognition fails; the cosine similarity matching decision function is calculated as in (21):
CosineDistance(X_prob, Y_chosen) = ( Σ_{i=1}^{L} X_prob,i·Y_chosen,i ) / ( sqrt(Σ_{i=1}^{L} X_prob,i²)·sqrt(Σ_{i=1}^{L} Y_chosen,i²) ) #(21)
wherein: CosineDistance(X_prob, Y_chosen) represents the cosine similarity between X_prob and Y_chosen, Y_chosen represents the registered feature vector selected in the second step with the minimum Euclidean distance to X_prob, and Y_chosen,i represents the i-th element value of Y_chosen.
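Formulas (20) and (21) and the two-stage decision of this claim translate directly into a NumPy sketch; the fused feature vectors are assumed to be one-dimensional arrays of equal length L.

```python
import numpy as np

def euclidean_distance(x_prob, y_gallery):
    # Formula (20): Euclidean distance decision matching function
    return np.sqrt(np.sum((x_prob - y_gallery) ** 2))

def cosine_distance(x_prob, y_chosen):
    # Formula (21): cosine similarity matching decision function
    return np.dot(x_prob, y_chosen) / (np.linalg.norm(x_prob) * np.linalg.norm(y_chosen))

def two_stage_decision(x_prob, gallery, threshold=0.7):
    """First order: pick the registered vector with the minimum Euclidean distance.
    Second order: accept its class only if the cosine similarity exceeds 0.7."""
    label, y_chosen = min(gallery.items(),
                          key=lambda kv: euclidean_distance(x_prob, kv[1]))
    return label if cosine_distance(x_prob, y_chosen) > threshold else None
```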
CN202210264824.4A 2022-03-17 2022-03-17 Iris and periocular antagonism adaptive fusion recognition method based on contrast knowledge drive Pending CN114596622A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210264824.4A CN114596622A (en) 2022-03-17 2022-03-17 Iris and periocular antagonism adaptive fusion recognition method based on contrast knowledge drive

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210264824.4A CN114596622A (en) 2022-03-17 2022-03-17 Iris and periocular antagonism adaptive fusion recognition method based on contrast knowledge drive

Publications (1)

Publication Number Publication Date
CN114596622A true CN114596622A (en) 2022-06-07

Family

ID=81810435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210264824.4A Pending CN114596622A (en) 2022-03-17 2022-03-17 Iris and periocular antagonism adaptive fusion recognition method based on contrast knowledge drive

Country Status (1)

Country Link
CN (1) CN114596622A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294639A (en) * 2022-07-11 2022-11-04 惠州市慧昊光电有限公司 Color temperature adjustable lamp strip and control method thereof
CN115294639B (en) * 2022-07-11 2023-07-07 惠州市慧昊光电有限公司 Color temperature adjustable lamp strip and control method thereof
CN115083006A (en) * 2022-08-11 2022-09-20 北京万里红科技有限公司 Iris recognition model training method, iris recognition method and iris recognition device
CN116343320A (en) * 2023-03-31 2023-06-27 西南大学 Iris recognition method based on phase change and diffusion neural network
CN116343320B (en) * 2023-03-31 2024-06-07 西南大学 Iris recognition method
CN116824681A (en) * 2023-08-24 2023-09-29 北京集联网络技术有限公司 Eye detection method, system and equipment based on deep convolutional neural network
CN116824681B (en) * 2023-08-24 2023-11-24 北京集联网络技术有限公司 Eye detection method, system and equipment based on deep convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination