Disclosure of Invention
The invention aims to provide a wrist vein authentication system that is simple to operate and highly robust, solving the problem that traditional identity authentication in the prior art is based on a token or on knowledge, methods that are inconvenient, insecure and unreliable.
The technical solution of the invention is as follows:
a wrist vein authentication system comprises an image acquisition module, an image preprocessing module, a feature extraction module and an identification module;
the image acquisition module acquires the wrist vein infrared image through the wrist vein acquisition device;
the image preprocessing module is used for intercepting a region of interest from the acquired wrist vein infrared image, filtering the region by a mean filtering method, performing graying and normalization on the region, and performing contrast enhancement by a histogram stretching method;
the feature extraction module is used for extracting principal component features and wavelet features of the preprocessed image;
and the identification module is used for classifying and identifying the wrist vein images with the extreme learning machine, comparing the principal components and the high-frequency and low-frequency features of the image information against the wrist vein feature database.
Further, the extreme learning machine model is:

∑_{i=1}^{L} β_i g(w_i · x_j + b_i) = y_j,   j = 1, 2, 3, …, N

where w_i = [w_i1, w_i2, …, w_in]^T is the weight vector connecting all inputs to the i-th hidden-layer node; β_i = [β_i1, β_i2, …, β_im]^T is the weight vector connecting the i-th hidden-layer node to all outputs; b_i is the bias of the i-th hidden-layer node.
further, the matrix representation of the extreme learning machine model is in the form of:
Hβ=Y (5.4)
in the formula, H is the hidden-layer output matrix of the network, β is the output weight matrix, and Y is the expected output matrix.
further, before the extreme learning machine is trained, w and b are generated at random; β can be calculated once the number of hidden-layer neurons and the excitation function g(x) are determined. The specific steps are as follows:
determining the number L of the neurons of the hidden layer, randomly setting a connection weight w of the input layer and the hidden layer and the neuron bias b of the hidden layer;
selecting an infinitely differentiable function g(x) as the excitation function, and calculating the hidden-layer output matrix H;
calculating the output-layer weights: β = H⁻¹Y (when H is not square, the Moore–Penrose generalized inverse H† is used, giving β = H†Y).
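The three training steps above can be sketched in a few lines. The following is a minimal NumPy illustration, not the embodiment's implementation (which uses MATLAB); the sigmoid excitation function and the Moore–Penrose pseudoinverse are the usual choices and are assumed here:

```python
import numpy as np

def elm_train(X, Y, L, seed=0):
    """Train a single-hidden-layer extreme learning machine.
    X: (N, n) inputs, Y: (N, m) targets, L: number of hidden neurons."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], L))   # input weights w, fixed at random
    b = rng.standard_normal(L)                 # hidden biases b, fixed at random
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # hidden-layer output matrix (sigmoid g)
    beta = np.linalg.pinv(H) @ Y               # beta = H† Y (Moore-Penrose pseudoinverse)
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Only beta is learned; W and b stay at their random initial values, which is what makes ELM training a single least-squares solve rather than an iterative optimization.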
Further, intercepting an interested area of the acquired image, specifically:
intercepting an image to be processed: removing the background of the original image, recording the coordinates of the junction point of the wrist vein and the background, and then intercepting the area within the junction as a picture to be processed;
intercepting image blocks for training and testing on an image to be processed: randomly taking a point on the image to be processed as a center, and taking an image block as a training and identifying sample for the side length.
Further, the principal component analysis step is:
s1, carrying out standardization processing on the original data;
s2, calculating a sample correlation coefficient matrix;
s3, solving for the eigenvalues of the correlation coefficient matrix and the corresponding eigenvectors by the Jacobi method;
s4, selecting a principal component, and writing a principal component expression;
s5, calculating the score of the principal component;
s6, from the principal component scores, the principal components with high scores are extracted as features and used in the subsequent classification.
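Steps s1–s6 can be illustrated with a short NumPy sketch. This is an illustrative outline, not the embodiment's MATLAB code; NumPy's `eigh` uses LAPACK rather than the Jacobi method named in s3, but returns the same eigenpairs:

```python
import numpy as np

def pca_features(X, threshold=0.85):
    """PCA feature extraction following steps s1-s6.
    X: (n_samples, p) data matrix. Returns scores of the first k
    principal components whose cumulative contribution rate reaches
    the threshold, plus those components' contribution rates."""
    # s1: standardize the raw data
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # s2: sample correlation coefficient matrix
    R = np.corrcoef(Z, rowvar=False)
    # s3: eigenvalues and eigenvectors of R
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # s4: choose k by cumulative contribution rate
    contrib = eigvals / eigvals.sum()
    k = int(np.searchsorted(np.cumsum(contrib), threshold) + 1)
    # s5/s6: principal component scores, used as classification features
    return Z @ eigvecs[:, :k], contrib[:k]
```

The 85% cumulative-contribution default follows the selection rule stated later in the document.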
Further, the extraction of features by wavelet analysis is specifically as follows: after the wrist vein image is sampled, a wavelet multi-scale decomposition is performed on the signal in a large, finite frequency band, dividing the acquired signal into two parts, a high-frequency part and a low-frequency part.
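As an illustration of splitting an image into low- and high-frequency parts, the following sketch performs one level of 2-D Haar wavelet decomposition in plain NumPy. The Haar basis is assumed here for simplicity; the document does not specify which wavelet the embodiment uses:

```python
import numpy as np

def haar_dwt2(img):
    """One level of 2-D Haar wavelet decomposition: splits the image into
    a low-frequency approximation LL and high-frequency detail sub-bands
    (LH, HL, HH). Image side lengths must be even."""
    a = np.asarray(img, dtype=float)
    # transform along rows: pairwise averages (low) and differences (high)
    lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # transform along columns
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    LH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    HL = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    HH = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return LL, (LH, HL, HH)
```

Applying the same function to LL again yields the next scale of the multi-scale decomposition.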
Further, in the wrist vein collection device, a mixture of the 850nm and 940nm wavebands is adopted as the irradiation light source, a CMOS image sensor is selected, a 1.3-megapixel USB webcam is adopted, and the background color is set to black.
A wrist vein image recognition method based on an extreme learning machine is characterized in that the extreme learning machine classifies and recognizes the wrist vein image, comparing the principal components and high-frequency and low-frequency features of the image information against a wrist vein feature database.
Further, the extreme learning machine model is:

∑_{i=1}^{L} β_i g(w_i · x_j + b_i) = y_j,   j = 1, 2, 3, …, N

where w_i = [w_i1, w_i2, …, w_in]^T is the weight vector connecting all inputs to the i-th hidden-layer node; β_i = [β_i1, β_i2, …, β_im]^T is the weight vector connecting the i-th hidden-layer node to all outputs; b_i is the bias of the i-th hidden-layer node;
the matrix representation of the extreme learning machine model is:
Hβ=Y (5.4)
in the formula, H is the hidden-layer output matrix of the network, β is the output weight matrix, and Y is the expected output matrix.
before the extreme learning machine is trained, w and b are generated at random; β is calculated once the number of hidden-layer neurons and the excitation function g(x) are determined. The specific steps are as follows:
determining the number L of the neurons of the hidden layer, randomly setting a connection weight w of the input layer and the hidden layer and the neuron bias b of the hidden layer;
selecting an infinitely differentiable function g(x) as the excitation function, and calculating the hidden-layer output matrix H;
calculating the output-layer weights: β = H⁻¹Y (when H is not square, the Moore–Penrose generalized inverse H† is used, giving β = H†Y).
The invention has the beneficial effects that: the system obtains images with clear, distinct features by preprocessing the acquired images; after analyzing various image feature extraction methods, principal component analysis and wavelet analysis are selected to extract image features in light of the actual conditions; an extreme learning machine classifies the images, and by comparing repeated test results and selecting the optimal parameters, the recognition rate reaches more than 95%. The system identifies the principal components and the high- and low-frequency features of the image information, and has the advantages of simple operation and strong robustness.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
In the wrist vein authentication system, the wrist vein image acquisition device was designed after studying cameras, light sources and optical filters, and images meeting the experimental requirements were acquired after repeated tests and adjustment. Image preprocessing and feature extraction were then implemented in the Matlab language. Finally, an extreme learning machine completes the classification and identification of the vein images, and a graphical user interface is designed to demonstrate the functions of the system. The experimental results show that the identification accuracy of the system exceeds 95%. The overall framework of the wrist vein authentication system design is shown in figure 1.
The image acquisition module acquires the wrist vein infrared image through the wrist vein acquisition device;
the image preprocessing module is used for intercepting a region of interest from the acquired wrist vein infrared image, filtering the region by a mean filtering method, performing graying and normalization on the region, and performing contrast enhancement by a histogram stretching method;
the feature extraction module is used for extracting principal component features and wavelet features of the preprocessed image;
and the identification module is used for classifying and identifying the wrist vein images with the extreme learning machine, comparing the principal components and the high-frequency and low-frequency features of the image information against the wrist vein feature database. The system has the advantages of simple operation and strong robustness.
In the wrist vein acquisition device, the acquisition equipment should be enclosed as much as possible, and the background color should be one with a relatively low gray value, such as black. An infrared light-emitting diode (LED) is used as the infrared light source. The embodiment tested infrared LED lamps of three wavebands: 850nm, 880-900nm and 940nm. Experiments with single-light-source irradiation and pairwise mixed irradiation showed that mixed irradiation with 850nm and 940nm gives the best effect as the irradiation light source. A CMOS image sensor is selected, and a 1.3-megapixel USB webcam is adopted. Photos taken by the camera can be transmitted directly into a computer for processing. An 800-1100nm optical filter is used; its optical characteristic is that it passes light with wavelengths of 800-1100nm and cuts off light of other wavebands.
The infrared acquisition principle is as follows: owing to the characteristics of human skeleton and muscle tissue, NIR light with wavelengths of 700nm-1000nm penetrates human tissue strongly, and deep physiological information can be extracted by measuring the optical parameters of the tissue. Meanwhile, hemoglobin in the wrist vein vessels absorbs more IR radiation than the other tissues under the skin, so the vein vessel structure is presented well. This wavelength range belongs to near-infrared light; according to the relevant Chinese regulations on medical infrared monitoring light sources, the infrared light intensity peaks at 800-1500nm. Thus, light in the waveband between 750nm and 900nm may be selected as the light source: light of this waveband both penetrates the epidermal tissue of the wrist well and is absorbed more strongly by the vein vessels.
And the image preprocessing module comprises the steps of extracting and normalizing the interested region of the wrist vein image and enhancing the contrast of the wrist vein. The acquired near-infrared wrist image is a color image and contains background and edge information, so that preprocessing of the wrist vein image is very necessary.
The image is collected using invisible light in the near-infrared band at 0.85 μm. Because the device is difficult to seal completely, the collected picture still contains color information (i.e. R, G, B information); to reduce later computation, the image is first converted to grayscale to eliminate the color information. Graying reduces the data volume of the image to 1/3 of the original, thereby reducing the computation of subsequent processing.
In the RGB color model, color information is represented by the three color components (R, G, B). A 256-level grayscale image is obtained by computing a value from (R, G, B). There are mainly 3 methods: the maximum value method, the average value method and the weighted average method.
According to the importance or other indexes, different weights are given to R, G and B, and the values of R, G and B are weighted, namely:
Gray=(WRR+WGG+WBB)/(WR+WG+WB) (3.3)
in the formula, W_R, W_G and W_B are the weights of R, G and B respectively. Different values of W_R, W_G and W_B give different grayscale images under the weighted average method. Since the human eye is most sensitive to green, less sensitive to red and least sensitive to blue, setting W_G > W_R > W_B yields a more reasonable grayscale image.
In MATLAB, a color image I is converted into a grayscale image by the rgb2gray(I) function, which works on the following principle: the sensitivity of the human eye to the R, G, B components ranks, from low to high, blue, red, green, so the grayscale conversion adopted here weights the three components according to their importance. The weights of the three components are W_G = 0.5870, W_R = 0.2989 and W_B = 0.1140, which gives the best grayscale-converted image:
I′(i,j)=0.2989×R(i,j)+0.5870×G(i,j)+0.1140×B(i,j) (3.4)
wherein I′(i, j) is the pixel value at point (i, j) of the image after grayscale conversion, and I(i, j) is the pixel value at point (i, j) of the original image. The original image (left part of fig. 2) is processed by the function rgb2gray(I) to obtain the grayscale image (right part of fig. 2).
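The weighted-average conversion of Eq. (3.4) can be written directly. A small Python sketch follows (the embodiment uses MATLAB's rgb2gray, which applies these same weights per pixel):

```python
def rgb_to_gray(r, g, b):
    """Weighted-average graying per Eq. (3.4): W_G > W_R > W_B,
    matching the eye's sensitivity to green, red and blue."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b
```

Applying the function to every pixel of an RGB image yields the grayscale image I′.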
Intercepting an image into two parts, wherein the first part is to intercept the image to be processed; the second part is to intercept the training and testing image blocks on the image to be processed.
Intercepting an original image: this operation is to remove the background of the original image and to reserve a region with rich information. The operation method comprises the steps of firstly previewing the acquired image, recording the coordinates of the junction point of the wrist vein and the background, and then intercepting the area within the junction as the picture to be processed.
Intercepting an image block for training and testing: a point on the image to be processed is taken at random as the center, and a square image block of a certain side length is cropped as a sample for training and recognition. The side length is the optimal length obtained from experimental tests, i.e. the smallest square that still meets the required recognition rate. One problem is that, since the center point is random, it may fall on or very close to the boundary, so that the square extends beyond the image. Therefore an out-of-bounds guard is set in the implementation: if the square goes out of bounds, the center point is reselected until a suitable image block is obtained.
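The out-of-bounds guard described above can be sketched as follows. This is a minimal illustration; the image size, side length and random source are placeholder parameters, not values from the embodiment:

```python
import random

def crop_patch_corner(h, w, side, rng):
    """Draw a random center on an h x w image; if the side x side square
    around it would cross the image boundary, redraw the center (the
    out-of-bounds guard described in the text). Returns the top-left
    corner (i0, j0) of a valid patch."""
    half = side // 2
    while True:
        ci, cj = rng.randrange(h), rng.randrange(w)
        i0, j0 = ci - half, cj - half
        if i0 >= 0 and j0 >= 0 and i0 + side <= h and j0 + side <= w:
            return i0, j0
```

The returned corner can then be used to slice the patch, e.g. `img[i0:i0+side, j0:j0+side]`.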
Because the influence of the external environment and of the equipment itself cannot be avoided, the wrist vein image contains some noise, and differences in the noise might be mistaken for differences in the vein information, reducing the system recognition rate; the image is therefore filtered to remove the noise. The embodiment adopts a mean filtering method.
The mean filtering is a typical linear filtering algorithm, which gives a template to a target pixel, the template includes 8 pixels around the target pixel (excluding the target pixel), the template is called a filtering template, and the average value of all pixels in the template is used to replace the value of the target pixel.
The formula for mean filtering is:

I′(i, j) = (1/D) Σ_{(m, n) ∈ S} I(m, n)

wherein I(i, j) is the pixel value at point (i, j) of the original image; I′(i, j) is the pixel value at point (i, j) of the image after mean filtering; S is the set of pixels in the filter window centered at (i, j); and D is the size of the filter window, typically 3 × 3 or 5 × 5 pixels.
After the image is subjected to mean filtering, the noise of spots in the image is greatly reduced, the image looks smoother, and the filtering process lays a foundation for the subsequent image processing.
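The 8-neighbour mean filter described above can be sketched as follows. Borders are left unchanged in this illustration, since the text does not specify border handling:

```python
import numpy as np

def mean_filter8(img):
    """3x3 mean filtering as described in the text: each target pixel is
    replaced by the mean of its 8 surrounding neighbours (center excluded).
    Border pixels are left unchanged in this sketch."""
    a = np.asarray(img, dtype=float)
    out = a.copy()
    for i in range(1, a.shape[0] - 1):
        for j in range(1, a.shape[1] - 1):
            win = a[i-1:i+2, j-1:j+2]              # 3x3 window around (i, j)
            out[i, j] = (win.sum() - a[i, j]) / 8.0  # mean of the 8 neighbours
    return out
```

The more common variant includes the center pixel and divides by 9; the sketch follows the text's 8-neighbour description.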
Illumination normalization: when image samples are collected, differences in the distance from the wrist to the lens, the relative position between the wrist and the lens, and the light intensity make the brightness of different images of the same wrist non-uniform, which is called illumination non-uniformity. During later classification, the classifier may put images of the same wrist with different illumination intensities into different categories, which harms classification accuracy. The sample images are therefore illumination-normalized.
The difference of the illumination intensity reflects the difference of the image brightness on the sample image, and the intrinsic factor determining the image brightness is the pixel value of the image, so the pixel value is used to reflect the illumination intensity.
In the experiment, 50 pictures are collected from one wrist, and the illumination normalization algorithm is as follows: first the mean pixel value Mean_k (k = 1, 2, …, 50) of each picture is computed, and then the 50 means are averaged to obtain Mean. The overall mean Mean is taken as the normalized illumination value of the wrist, and the mean of each of the 50 pictures is shifted to this overall mean, completing the illumination normalization. The calculation formula is:

I_k(i, j)′ = I_k(i, j) + Mean − Mean_k   (3.6)

in the formula, Mean_k is the pixel mean of the k-th image; I_k(i, j) is the pixel value at point (i, j) of the k-th image; m_k and n_k are the numbers of rows and columns of the k-th image matrix; Mean is the mean over all image pixels; I_k(i, j)′ is the pixel value at point (i, j) of the k-th image after illumination normalization.
From the processing results of fig. 3 it can be seen that, before illumination normalization, the upper images are lit more strongly and the lower images more weakly; after normalization the lighting of different images of the same wrist is unified, i.e. images of the same class no longer differ in illumination, which lays a foundation for the later classification.
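Equation (3.6) amounts to shifting each image's mean brightness to the overall mean. A minimal sketch:

```python
import numpy as np

def normalize_illumination(images):
    """Illumination normalization per Eq. (3.6): each image I_k is shifted
    by (Mean - Mean_k) so that all images share the same mean brightness."""
    means = [img.mean() for img in images]   # Mean_k for each image
    mean_all = sum(means) / len(means)       # overall Mean
    return [img + (mean_all - mk) for img, mk in zip(images, means)]
```

After the shift, every image in the list has the same mean pixel value, so brightness differences no longer separate images of the same wrist.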
After the image is intercepted, because the light source is strong and each part of the wrist absorbs near-infrared light differently, the wrist area appears bright overall and the contrast between the wrist vein and the muscle tissue is poor. To facilitate subsequent processing of the image, a gray-scale normalization method is used. The processed contrast image is shown in fig. 6.
The gray-scale normalization mainly uses the following formula:

I′(i, j) = (I(i, j) − min(I(i, j))) / (max(I(i, j)) − min(I(i, j)))

wherein I(i, j) is the gray value at point (i, j) of the original image; max(I(i, j)) is the maximum gray value of the original image; min(I(i, j)) is the minimum gray value of the original image; I′(i, j) is the gray value at (i, j) of the normalized image.
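The min-max formula maps the image's gray range onto a fixed interval. A sketch mapping to [0, 1]:

```python
import numpy as np

def gray_normalize(img):
    """Min-max gray-scale normalization: linearly map the image's gray
    range [min, max] onto [0, 1]."""
    a = np.asarray(img, dtype=float)
    return (a - a.min()) / (a.max() - a.min())
```

Multiplying the result by 255 gives the [0, 255] variant if integer gray levels are needed downstream.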
Contrast enhancement of the wrist image: after gray-level normalization, the image is enhanced. Contrast enhancement is realized by a histogram stretching method. The histogram of an image is an important statistical feature of the image, representing the statistical relationship between each gray level in a digital image and its frequency of occurrence. Generally, the gray histogram of a uniformly quantized natural image is concentrated in a narrow low-value gray interval, so the details of the image are unclear. To make the image clear, a transformation can expand the gray range of the image or homogenize the gray distribution over the dynamic range, increasing contrast and sharpening image detail, thus achieving image enhancement. Histogram transformations are classified as linear, piecewise linear, nonlinear, and other transformations.
Piecewise linear transformation can stretch the gray levels of desired image details and enhance contrast while compressing undesired detail gray levels, so as to highlight objects or gray ranges of interest in the image and relatively suppress uninteresting gray levels without losing detail at other levels. Usually the three-segment linear transformation shown in fig. 4 is adopted; with the input gray interval [a, b] mapped to the output interval [c, d] and e the maximum gray level, its mathematical expression is:

f(x) = (c/a)·x,                        0 ≤ x < a
f(x) = ((d − c)/(b − a))·(x − a) + c,  a ≤ x ≤ b
f(x) = ((e − d)/(e − b))·(x − b) + d,  b < x ≤ e
in fig. 4, the gray scale interval [ a, b ] is linearly expanded, while the gray scale intervals [0, a ] and [ b, e ] are compressed. Any gray scale interval can be expanded and compressed by adjusting the position of the inflection point of the broken line and controlling the slope of the segmented straight line.
Histogram-stretching contrast enhancement: the contrast enhancement method adopted by the embodiment is piecewise linear transformation. The method lowers the gray values at the blood vessels and raises those at the muscle tissue, so the vessel lines become clearer. Image enhancement by histogram stretching has proven to be an effective method. The processed contrast image is shown in fig. 6.
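The three-segment transform of fig. 4 can be sketched as below. The output interval [c, d] and the maximum gray level e are illustrative parameters; the embodiment's actual inflection points are not given in the text:

```python
import numpy as np

def piecewise_stretch(x, a, b, c, d, e=255.0):
    """Three-segment linear gray transform: stretch input gray levels in
    [a, b] onto [c, d] while compressing [0, a] onto [0, c] and [b, e]
    onto [d, e]."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    lo, mid, hi = x < a, (x >= a) & (x <= b), x > b
    y[lo] = (c / a) * x[lo]
    y[mid] = (d - c) / (b - a) * (x[mid] - a) + c
    y[hi] = (e - d) / (e - b) * (x[hi] - b) + d
    return y
```

Choosing d − c larger than b − a makes the middle segment's slope exceed 1, which is what stretches (rather than compresses) the gray interval of interest.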
Image features generally concentrate in the parts of the image with sharp changes. Wrist vein images often have simple, clear texture features, and as the typical wrist vein image shows, the veins are better distinguishable in gray level than the muscle tissue, which greatly facilitates wrist vein extraction.
Feature selection: feature data is formed from the pattern to be recognized, i.e. the original features of the pattern. The type and number of features greatly affect the performance of the classifier. The original features obtained during feature formation may be numerous; if all of them are fed to the classifier as classification features, the classifier becomes more complex, the classification computation grows, and the classification error probability is not necessarily smaller. The number of features therefore needs to be reduced, which is what feature selection and extraction, discussed here, are about. Feature selection chooses some of the most effective features from a feature set so as to reduce the dimension of the feature space.
The information attributes that an image can provide can be roughly divided into four aspects: structural characteristics, statistical characteristics, fuzzy characteristics and knowledge characteristics. In vein recognition, the first two types, i.e. the structural (geometric) features and statistical features, are mainly used.
Feature extraction picks out the easily distinguishable elements of an image; for minutiae-based geometric feature extraction of veins, the intersection points and end points in the vein image are extracted and represented. The vein minutiae mainly comprise the end points and intersection points of the veins. Using minutiae has the advantage of high recognition accuracy; its drawback is that it places high demands on the preceding image preprocessing.
For the thinned image (a single-pixel-wide image with gray values of 0 or 255), extracting the end points is simple and the idea is clear: scan the image line by line, find a black point f (gray value 0), and check the number N of black points among its eight neighbors; if N = 1, the point f is an end point. Iterate until the whole image has been scanned and all end points found.
The intersection types appearing in a vein image are mainly three-way and four-way intersections. The algorithm for three-way intersections is similar to that for end points, except that the number of black points among the eight neighbors must be three.
For a four-way intersection point, two lines can cross either perpendicularly or non-perpendicularly. In the perpendicular case, the eight-neighborhood of the center intersection contains exactly four black points, and only then. In the non-perpendicular case, two three-way points are generated, yet two lines should intersect at only one point in the image, which causes errors; the exclusion rule is that once a point is found to be an intersection, no other intersection is allowed in its eight-neighborhood, i.e. points in the surrounding eight-neighborhood are not considered as intersections.
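The endpoint and three-way intersection scan can be sketched as below. For convenience the sketch codes vein pixels as 1 rather than gray value 0 as in the text; the neighbour-count rule is the same:

```python
import numpy as np

def find_minutiae(skel):
    """Scan a thinned binary image (1 = vein pixel) and classify each vein
    pixel by the number N of vein pixels among its 8 neighbours:
    N == 1 -> end point, N == 3 -> three-way cross point."""
    s = np.asarray(skel)
    ends, crosses = [], []
    for i in range(1, s.shape[0] - 1):
        for j in range(1, s.shape[1] - 1):
            if s[i, j] != 1:
                continue
            n = s[i-1:i+2, j-1:j+2].sum() - 1   # count the 8 neighbours
            if n == 1:
                ends.append((i, j))
            elif n == 3:
                crosses.append((i, j))
    return ends, crosses
```

The exclusion rule from the text (no second intersection inside an intersection's eight-neighborhood) would be applied as a post-filter on the `crosses` list.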
After the end points and the cross points are extracted, matching is generally divided into a local comparison method and a global comparison method.
Local comparison: different comparison algorithms lead to different computational complexity.
The comparison algorithm adopted first judges whether the types of the center points are the same; if so, it compares the parameters of the neighborhood feature points one by one for a match, including whether the types of the neighborhood feature points are the same and whether the parameters between pairs of points and among triples of points fall within the threshold range (error less than 5%-10%). The center point of a star structure that passes the comparison is kept as a feature point; the center point of one that fails is deleted from the feature points. The unqualified feature points are thus removed, and the remaining points serve as feature points for the next matching step. This avoids matching between two widely different star structures in vein images of the same class.
Global matching. First the end points and intersection points remaining after local matching are counted; then the distances between all end points and between all intersection points are calculated and sorted from small to large, yielding about 100 distances. Matching experiments are performed with these 100 distances and the accumulated error is computed: if the error is larger than the threshold the match fails, and if smaller it succeeds, the threshold being set to 5%. This global matching method is evidently sensitive to the effects of spurious minutiae and local deformations, which cannot be neglected, especially when the number of vein feature points is small: the fewer the feature points, the greater the influence. There are two main effects:
(1) the influence of spurious details is a change in feature point type, an addition of spurious feature points, or a loss of true feature points.
(2) Unpredictable local nonlinear deformation, so that the relative distance, angle and direction angle of the minutiae point pair are greatly changed and exceed the range of a matching threshold value, so that the originally matched points are not matched; or to have the local features completely changed.
Such pseudo-details and local deformations are ubiquitous in real vein images. For example, in the vein images used in the embodiment, each image has pseudo-details and local deformation, and if only global feature point matching is adopted, most of misjudgment is caused by the two effects. Therefore, it can be seen that the local-global matching scheme of the embodiment is reasonable, and the recognition rate is improved a lot through experimental verification.
With the rapid development of intelligent computer methods, the global feature of the vein image makes possible a global correlation matching method based on the whole image. The matching method has the following characteristics:
(1) the matching method based on the global correlation does not need to extract the detail features of the image, and further removes the influence on image matching caused by inaccurate detail feature extraction.
(2) The global correlation-based matching method is simpler than a detail feature-based matching algorithm.
(3) The extracted global features include some features implicit in the image.
The matching method adopts a transformed template matching method. Unlike conventional template matching, in the absence of standard templates, a moving transformation template is used (a template of a specific size is selected from a starting point in a sample image, each template is template-matched with an image to be recognized, and then the transformation template is moved point by point according to the specific size) to finally find the closest matching result. In the matching process, the invariant moment of the corresponding part is calculated every time the template moves once, and then the invariant moment of the corresponding part of the template and the image to be matched are matched.
Feature extraction: most authentication systems (such as fingerprint identification and face recognition systems) use image matching methods that extract specific features, such as the number of feature points, image texture features, and the like. However, the wrist vein authentication system studied in the embodiment must complete identification even when the information is incomplete (not the whole wrist vein image), which means the embodiment cannot rely on specific features such as feature points. For this reason, two methods, principal component analysis and wavelet analysis, are adopted to extract the internal features of the image, as shown in FIG. 7.
The principal component analysis adopts a mathematical dimension reduction method to find out a plurality of comprehensive variables to replace the original variables, so that the comprehensive variables can represent the information content of the original variables as much as possible and are mutually independent. This statistical analysis method, which will quantify a plurality of variables into a few mutually independent synthetic variables, is called principal component analysis or principal component analysis.
Principal component analysis recombines the original variables, which have some correlation, into a new set of mutually independent comprehensive variables that replace the original ones. The usual mathematical treatment takes linear combinations of the original variables as the new comprehensive variables; but there are many possible combinations, so how should they be chosen if no restriction is imposed? Denote the first linear combination, i.e. the first comprehensive variable, by F_1. Naturally, we want it to reflect as much information of the original variables as possible, where "information" is measured by variance: the larger Var(F_1) is, the more information F_1 contains. The F_1 selected among all linear combinations should therefore have the largest variance, and F_1 is called the first principal component. If the first principal component is not sufficient to represent the information of the original p variables, a second linear combination F_2 is selected; to reflect the original information effectively, the information already present in F_1 should not appear in F_2, which in mathematical language requires Cov(F_1, F_2) = 0. F_2 is then called the second principal component, and by analogy the third, fourth, …, p-th principal components can be constructed.
Mathematical model of principal component analysis: p variables x_1, x_2, …, x_p are observed; the data matrix of the n samples is X = (x_ij)_{n×p}, where x_ij (i = 1, 2, …, n; j = 1, 2, …, p) is the value of the j-th variable for the i-th sample.
principal component analysis is to synthesize p observed variables into p new variables (synthesized variables), i.e.
The abbreviation is:
F_j = α_j1 x_1 + α_j2 x_2 + … + α_jp x_p,   j = 1, 2, …, p
the model is required to satisfy the following condition:
(1) F_i and F_j are mutually uncorrelated (i ≠ j; i, j = 1, 2, …, p);
(2) the variance of F_1 is greater than the variance of F_2, which is greater than the variance of F_3, and so on;
(3) a_k1² + a_k2² + … + a_kp² = 1, k = 1, 2, …, p.
Then F_1 is called the first principal component, F_2 the second principal component, and so on up to the p-th principal component. Principal components are also called principal factors. Here the a_ij are called the principal component coefficients.
The above model can be represented in matrix form as F = AX, where A is the principal component coefficient matrix.
Steps of sample principal component analysis. The sample observation matrix is X = (xij)n×p, with n samples and p variables.
the first step is as follows: standardizing the raw data
Wherein,
the second step is that: calculating a sample correlation coefficient matrix
For convenience, assuming that the raw data is still represented by X after normalization, the correlation coefficient of the normalized data is:
the third step: calculating characteristic value (lambda) of correlation coefficient matrix R by using Jacobian method1,λ2,…,λp) And corresponding feature vector ai=(ai1,ai2,…,aip),i=1,2…p。
The fourth step: select the important principal components and write the principal component expressions.
Since the variance of successive principal components decreases, the amount of information they contain decreases accordingly. In actual analysis, therefore, the first k principal components are selected according to the cumulative contribution rate of the principal components rather than taking all p. The contribution rate of a principal component is the proportion of its variance in the total variance, in practice the proportion of its eigenvalue in the sum of all eigenvalues, i.e.

contribution rate of Fi = λi / (λ1 + λ2 + ... + λp)

The larger the contribution rate, the more information about the original variables the principal component contains. The number k of principal components is selected mainly according to their cumulative contribution rate, which is generally required to reach 85% or more; this ensures that the composite variables include most of the information of the original variables.
In addition, in practical applications, after the important principal components are selected, attention should be paid to interpreting their actual meaning. A key problem in principal component analysis is how to give the new principal components reasonable interpretations. In general, the interpretation is based on the coefficients of the principal component expressions combined with qualitative analysis. Since each principal component is a linear combination of the original variables, and the coefficients in that combination differ in magnitude and sign, a principal component cannot simply be regarded as the attribute of a single original variable.
The fifth step: calculate the principal component scores. Substituting the standardized raw data of each sample into the principal component expressions yields the new data of each sample under each principal component, namely the principal component scores.
The sixth step: according to the principal component score data, extract the principal components with high scores as the features used in the subsequent classification.
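The six steps above can be sketched in numpy as follows (a minimal illustration; the function name, the 85% default threshold and the use of numpy's eigh in place of the Jacobi method are assumptions of this sketch — both compute the eigenvalues and eigenvectors of R):

```python
import numpy as np

def pca_features(X, cum_contrib=0.85):
    """Principal component analysis following the steps above.

    X : (n_samples, p) raw data matrix.
    Returns the kept eigenvalues, coefficient matrix and scores.
    """
    # Step 1: standardize the raw data (zero mean, unit variance per column).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    n = Xs.shape[0]
    # Step 2: sample correlation coefficient matrix of the standardized data.
    R = (Xs.T @ Xs) / (n - 1)
    # Step 3: eigenvalues and eigenvectors of the symmetric matrix R.
    lam, A = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]          # sort by decreasing variance
    lam, A = lam[order], A[:, order]
    # Step 4: keep the first k components whose cumulative contribution
    # rate (share of the total eigenvalue sum) reaches the threshold.
    contrib = lam / lam.sum()
    k = int(np.searchsorted(np.cumsum(contrib), cum_contrib)) + 1
    # Step 5: principal component scores F = Xs A for the kept components.
    scores = Xs @ A[:, :k]
    # Step 6: the caller takes these scores as classification features.
    return lam[:k], A[:, :k], scores
```

By construction the returned scores are mutually uncorrelated, which is exactly condition (1) of the model.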
Feature extraction using principal component analysis: the experiment performs principal component analysis on the image matrix, and through analysis of a number of test results the image size is set to 300 × 300. Experiments show that taking the first six eigenvalues as features achieves the desired accuracy, so only the first six eigenvalues λ1, λ2, λ3, λ4, λ5, λ6 are selected in the experiment and later combined with the features extracted by wavelet analysis into the overall feature set.
Wavelet analysis feature extraction: wavelet analysis is a local analysis in time (space) and frequency. Through dilation and translation operations it refines the signal (function) step by step over multiple scales, finally achieving time refinement at high frequencies and frequency refinement at low frequencies. It automatically adapts to the requirements of time-frequency signal analysis and can therefore focus on arbitrary details of a signal, overcoming the difficulties of the Fourier transform.
The multi-scale concept was introduced together with the construction of orthogonal wavelet bases and the pyramid algorithm of the discrete orthogonal dyadic wavelet transform: any function f(x) ∈ L²(R) can be completely reconstructed from its low-frequency part at resolution 2^(−N) and its high-frequency parts (detail parts) at resolutions 2^(−j) (1 ≤ j ≤ N). In multi-scale analysis only the low-frequency part is decomposed further; the high-frequency part is not decomposed again. The decomposition satisfies the relationship:
f(x) = An + Dn + Dn−1 + ... + D2 + D1    (4.12)
where f(x) is the signal; A is the low-frequency approximation part; D are the high-frequency detail parts; n is the number of decomposition layers.
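Relation (4.12) can be illustrated with a short numpy sketch using the Haar wavelet (the choice of Haar and the function names are assumptions of this sketch): each level splits only the current low-frequency part, and mapping every coefficient band back to the signal domain yields parts that sum to f.

```python
import numpy as np

S2 = np.sqrt(2.0)

def haar_split(x):
    """One level of the Haar decomposition: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] + x[1::2]) / S2, (x[0::2] - x[1::2]) / S2

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / S2
    x[1::2] = (a - d) / S2
    return x

def dyadic_parts(f, n):
    """Signal-domain parts [An, Dn, ..., D1] of relation (4.12):
    only the low-frequency part is decomposed further at each level."""
    dets, a = [], np.asarray(f, dtype=float)
    for _ in range(n):
        a, d = haar_split(a)
        dets.append(d)                       # dets[0] = D1, dets[-1] = Dn

    def rebuild(approx, details):
        rec = approx
        for d in reversed(details):
            rec = haar_merge(rec, d)
        return rec

    zeros = [np.zeros_like(d) for d in dets]
    parts = [rebuild(a, zeros)]              # An: all details zeroed
    for k in range(n - 1, -1, -1):           # Dn, Dn-1, ..., D1
        dk = [z.copy() for z in zeros]
        dk[k] = dets[k]
        parts.append(rebuild(np.zeros_like(a), dk))
    return parts
```

Because the transform is linear and orthogonal, the returned parts satisfy f = An + Dn + ... + D1 exactly.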
Continuous wavelet transform and discrete wavelet transform
The starting point of the wavelet transform is to obtain a family of wavelets with similar shapes by scaling and translating a basic wavelet. The basic wavelet is the mother wavelet, and the wavelets obtained after dilation and translation are called daughter wavelets or wavelet basis functions[18-19]. The mother wavelet is defined mathematically as follows:

let ψ(t) ∈ L²(R) have the Fourier transform ψ̂(ω); if it satisfies the admissibility condition

Cψ = ∫ |ψ̂(ω)|² / |ω| dω < ∞

then ψ(t) is called a basic wavelet or mother wavelet. This equation is also called the admissibility condition of the wavelet. The basic wavelet ψ(t) always satisfies ψ̂(0) = 0, that is, it has a band-pass property, an oscillating waveform alternating between positive and negative, and a mean value of zero. Generally a compactly supported real function with regularity is selected, so that the wavelet mother function is localized in both the time domain and the frequency domain, meeting the requirements of time-frequency analysis. The wavelet basis functions are generated by translating and scaling ψ(t):

ψa,τ(t) = |a|^(−1/2) ψ((t − τ) / a), a, τ ∈ R, a ≠ 0
where a is the scale (dilation) factor and τ is the translation factor.
Since a and τ vary continuously, ψa,τ(t) is also called the continuous wavelet basis function. The scale factor scales the wavelet: for a scale a, the basic wavelet ψ(t) becomes ψ(t/a); when a is larger the wavelet becomes wider, and conversely narrower. That is, as the scale factor changes, the time-frequency resolution of the wavelet changes correspondingly.
The widths of the time window and the frequency window vary in opposite directions: stretching the time window necessarily compresses the frequency window, and compressing the time window necessarily stretches the frequency window. This time-frequency window structure of the wavelet transform suits practical needs very well: a low-frequency signal lasts a long time, so the time window should be as wide as possible and the frequency resolution as fine as possible; when analyzing a high-frequency signal, a narrower time window is desired and coarser frequency resolution is acceptable.
Projecting a finite-energy signal f(t) ∈ L²(R) onto the continuous wavelet basis functions ψa,τ(t) gives the continuous wavelet transform, defined as:

WTf(a, τ) = <f(t), ψa,τ(t)> = |a|^(−1/2) ∫ f(t) ψ*((t − τ) / a) dt

where WTf(a, τ) are the wavelet transform coefficients; <f(t), ψa,τ(t)> is the inner product of f(t) and ψa,τ(t); ψ*(t) is the complex conjugate of ψ(t).
What has been described above is a continuous wavelet transform of the function f (t).
Since modern computers process data digitally, the wavelet transform must be discretized to be usable on a digital computer. The Discrete Wavelet Transform (DWT) is a transform method defined relative to the Continuous Wavelet Transform (CWT); in essence it discretizes the scale factor a and the translation factor τ. The classical, commonly accepted discretization method takes the scale factor a on a power series and samples the translation factor τ uniformly at intervals of τ0, as follows:

a = a0^j, τ = k a0^j τ0, j, k ∈ Z, a0 > 1

where the specific form of the mother wavelet ψ(t) determines the values of a0 and τ0. The wavelet thereby becomes the discrete wavelet

ψj,k(t) = a0^(−j/2) ψ(a0^(−j) t − k τ0)

and the discretized wavelet transform can be expressed as:

WTf(j, k) = ∫ f(t) ψ*j,k(t) dt
Comparing the discrete wavelet transform coefficients with the continuous ones: the former are a two-dimensional discrete sequence in the integers j, k, while the latter are a function of the two continuous variables a, τ.
Multi-layer wavelet decomposition of an image: the low and high frequency bands of the wavelet decomposition play different roles. The low-frequency component mainly carries the global description, while the high-frequency components mainly carry the description of local detail. Therefore, after the wavelet transform of the image, the wrist vein information can be well preserved by removing the high-frequency components and keeping only the low-frequency component. One level of wavelet decomposition yields four subband images, namely LL, HL, LH and HH, each one quarter the size of the original image. The subband image LL represents the low-frequency components of the image after low-pass filtering in both the horizontal and vertical directions; HL represents the high-frequency component of the original image in the horizontal direction and the low-frequency component in the vertical direction; LH represents the low-frequency component in the horizontal direction and the high-frequency component in the vertical direction; HH represents the high-frequency components in both directions. The subband image LL preserves the structural information of the original image well, so the subband images HL, LH and HH are removed and only LL is kept. LL is then decomposed again by the wavelet transform, its first subband image kept and the other subband images removed. Repeating the decomposition on each level's subband image LL in this way yields the multi-layer wavelet decomposition result.
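The one-level subband split into LL, HL, LH and HH can be sketched with a separable Haar filter (a minimal illustration under the assumption of a Haar basis; the subband naming follows the text, with the first letter referring to the horizontal direction):

```python
import numpy as np

def haar2d_level(img):
    """One level of 2D Haar decomposition: filter along rows, then along
    columns, giving the subbands LL, HL, LH, HH, each a quarter the size."""
    img = np.asarray(img, dtype=float)
    # low-pass / high-pass along rows (horizontal direction)
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # then along columns (vertical direction)
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    LH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)  # horiz. low, vert. high
    HL = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)  # horiz. high, vert. low
    HH = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return LL, HL, LH, HH
```

Because the filter is orthonormal, the total energy of the four subbands equals that of the original image, and further levels are obtained by applying the same split to LL again.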
After two levels of wavelet decomposition, the main structural information of the vein image remains unchanged, the noise in the image is reduced, and in addition the size of the image is reduced by a factor of 16. Performing feature extraction after the wavelet transform is therefore convenient for the vein recognition system.
Feature extraction using wavelet analysis: sampling a wrist vein image yields a signal on a large but finite frequency band. Wavelet multi-scale decomposition divides this signal into two parts, a high-frequency part and a low-frequency part; the low-frequency part generally contains the main information of the signal, while the high-frequency part is associated with noise and disturbance. The resulting low-frequency part can be decomposed further as the analysis requires, giving a still lower-frequency part and a relatively higher-frequency part of the signal.
In this embodiment, two levels of wavelet decomposition are used to obtain the high- and low-frequency coefficients of the image: two low-frequency coefficient bands and six high-frequency coefficient bands. Averaging each of these eight bands yields 8 feature values, so wavelet analysis contributes 8 features per image. These eight feature values are combined with the six feature values extracted by principal component analysis into a feature matrix that serves as the input features of the later classifier.
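A sketch of the eight-feature extraction (assuming, as one reading of the text, that the two low-frequency bands are the LL subbands of the two levels and the six high-frequency bands are HL, LH, HH of each level; the Haar basis is likewise an assumption):

```python
import numpy as np

def haar2d(img):
    """One separable Haar level: returns the LL, HL, LH, HH subbands."""
    img = np.asarray(img, dtype=float)
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    LL = (lo[0::2] + lo[1::2]) / np.sqrt(2)
    LH = (lo[0::2] - lo[1::2]) / np.sqrt(2)
    HL = (hi[0::2] + hi[1::2]) / np.sqrt(2)
    HH = (hi[0::2] - hi[1::2]) / np.sqrt(2)
    return LL, HL, LH, HH

def wavelet_features(img):
    """Two-level decomposition -> 8 features: the means of the two
    low-frequency subbands (LL1, LL2) and of the six high-frequency
    subbands (HL, LH, HH at each level)."""
    LL1, HL1, LH1, HH1 = haar2d(img)
    LL2, HL2, LH2, HH2 = haar2d(LL1)      # decompose only the LL part again
    bands = [LL1, LL2, HL1, LH1, HH1, HL2, LH2, HH2]
    return np.array([b.mean() for b in bands])
```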
Feature combination and feature value matrix generation: the two kinds of extracted features are combined and the expected output is appended, forming a feature value matrix that is used directly as the training and recognition samples of the extreme learning machine.
Using only the features extracted by principal component analysis, or only those extracted by wavelet analysis, for the later image recognition has many drawbacks: such features are prone to error, so the later recognition accuracy fluctuates widely, is too random, and the recognition result is unreliable. In general, the more features there are, the higher the recognition accuracy, which argues for selecting as many features as possible; but more features also mean longer recognition time. Resolving this contradiction requires completing high-accuracy recognition with fewer feature values.
In principal component extraction, many eigenvalues can be used, and the recognition accuracy improves as their number grows; but experiments show that diversifying the features improves accuracy further. For example, adding wavelet-extracted features to the principal component features gives higher test accuracy than adding another principal component feature alone. The two kinds of features are therefore combined into one feature matrix used as the recognition features, achieving the aim of completing high-accuracy recognition with fewer feature values.
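Combining the two kinds of features with the expected output, as described above, reduces to a simple column-wise concatenation (function and argument names are illustrative):

```python
import numpy as np

def build_feature_matrix(pca_feats, wavelet_feats, labels):
    """Combine the 6 principal-component features and 8 wavelet features
    of each sample, appending the expected output (class label) as the
    last column; the result is fed to the extreme learning machine."""
    pca_feats = np.atleast_2d(pca_feats)          # (n_samples, 6)
    wavelet_feats = np.atleast_2d(wavelet_feats)  # (n_samples, 8)
    labels = np.asarray(labels).reshape(-1, 1)    # (n_samples, 1)
    return np.hstack([pca_feats, wavelet_feats, labels])
```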
Wrist vein image recognition based on extreme learning machine
Given N learning sample pairs (xi, yi), where

xi = [xi1, xi2, xi3, ..., xin]T
yi = [yi1, yi2, yi3, ..., yim]T    (5.1)
Given L single-hidden-layer nodes of the constructed network and a hidden-node excitation function g(x):
the extreme learning machine model can therefore be expressed as:
wherein j is 1,2,3 … N; w is aiConnecting weights to the ith node for all inputs
wi=[wi1,wi2,…,win]T;βiConnecting weights to the ith node for all outputs
βi=[βi1,βi2,…,βim]T。
From the above analysis, the ELM model can be written in matrix form as:
Hβ=Y (5.4)
where H is the hidden-layer output matrix of the network, H = (g(wi · xj + bi))N×L; β = [β1, β2, ..., βL]T; Y = [y1, y2, ..., yN]T.
For any given w and b there is β = H−1Y. Therefore, before training the extreme learning machine, w and b are generated randomly, and β can be calculated once the number of hidden-layer neurons and the excitation function g(x) are determined.
General steps of the extreme learning machine algorithm:
1. Determine the number L of hidden-layer neurons, and randomly set the connection weights w between the input layer and the hidden layer and the biases b of the hidden-layer neurons.
2. Select an infinitely differentiable function g(x) as the excitation function, and calculate the hidden-layer output matrix H.
3. Calculate the output-layer weights: β = H−1Y.
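A minimal numpy sketch of these three steps (the sigmoid excitation and the use of numpy's Moore-Penrose pseudo-inverse pinv to realize the inverse of H are standard ELM choices assumed here, since H is generally not square):

```python
import numpy as np

def elm_train(X, Y, L, rng=None):
    """Train an extreme learning machine following steps 1-3 above.
    X: (N, n) inputs, Y: (N, m) targets, L: hidden-layer neurons."""
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    # Step 1: random input weights w and hidden biases b (never retrained).
    w = rng.uniform(-1, 1, size=(n, L))
    b = rng.uniform(-1, 1, size=L)
    # Step 2: infinitely differentiable excitation g(x) (sigmoid) and
    # the hidden-layer output matrix H.
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    # Step 3: output weights via the Moore-Penrose generalized inverse of H.
    beta = np.linalg.pinv(H) @ Y
    return w, b, beta

def elm_predict(X, w, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return H @ beta
```

No iteration is involved: a single least-squares solve replaces the gradient descent of a BP network, which is the source of the speed advantage described below.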
Because the input weights and biases need no adjustment during ELM training, parameter selection is simple and the whole process requires no iteration; the most prominent advantage of the ELM algorithm is therefore its very high speed. The training speed is improved markedly while the quality of the result is still well guaranteed, and the ELM algorithm avoids the local-minimum and overfitting problems of traditional gradient-based algorithms (such as the BP algorithm). In summary, compared with other feed-forward network algorithms, the ELM structure is much simpler and has good generalization capability.
Classification of wrist vein images by the extreme learning machine: in this embodiment, wrist images of five persons are collected, 50 images per hand, for a total of 500 wrist vein images used as raw data.
Generation of training and test data: the original image I has size 640 × 480, and a sub-image I1 of size 332 × 604 is first cut out. With 10, 20, 30, ..., 300 as side lengths, a point in I1 is randomly selected as the center and a square image block of that side length is cut out. Fifty image blocks are cut from each wrist vein image, of which 33 serve as training samples and 17 as test samples. The increments of the side length thus generate 30 sets of training and test data.
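The block-cutting procedure can be sketched as follows (a minimal illustration; the function names and the choice of a uniformly random in-bounds center are assumptions):

```python
import numpy as np

def random_square_patch(img, side, rng=None):
    """Cut one square block of the given side length from the image,
    picking a random center whose block stays inside the image."""
    rng = np.random.default_rng(rng)
    h, w = img.shape
    half = side // 2
    cy = rng.integers(half, h - (side - half) + 1)
    cx = rng.integers(half, w - (side - half) + 1)
    return img[cy - half: cy - half + side, cx - half: cx - half + side]

def make_samples(img, side, n_train=33, n_test=17, rng=None):
    """50 random blocks per wrist image: 33 training and 17 test samples."""
    rng = np.random.default_rng(rng)
    blocks = [random_square_patch(img, side, rng) for _ in range(n_train + n_test)]
    return blocks[:n_train], blocks[n_train:]
```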
Determination of the number of hidden-layer nodes: the number of hidden-layer nodes of the extreme learning machine has a very large influence on the test and training accuracy, so a suitable number of nodes must be selected. Through comparison and analysis of many experimental results (figure 9), with 50 hidden-layer nodes the accuracy essentially reaches an ideal value and the time is within an acceptable range.
Determination of the recognition block size: the principle for selecting the size of the recognition image block is the smaller the better, provided a certain recognition accuracy is guaranteed. From the curve of accuracy against block size for 50 hidden-layer nodes, the recognition accuracy is highest when the square block is about 240 × 240, so the size of the image block finally used for recognition is set to 240 × 240.
Classification of the wrist vein images: after the division of training and test data is determined, with 50 hidden-layer nodes in the extreme learning machine and an image block size of 240 × 240, classification of the 10 collected wrist classes begins. The extracted feature matrix, together with the expected values, is sent to the extreme learning machine for training, and the test samples are then classified using the trained input weights, hidden-layer biases and output weights. The class label of each test sample is derived from the actual output value of the extreme learning machine. In this example all 10 wrists are classified, and the test and training accuracies are plotted. As seen from fig. 10, under the selected conditions the training accuracy is always 1 and the test accuracy fluctuates but stays generally above 0.95, meeting the recognition accuracy requirement.
By means of the designed user interface, an operator can complete the functions of the system by clicking simple buttons according to the text prompts or dialog boxes on the interface.