Disclosure of Invention
To overcome the low recognition rate and poor feasibility of existing face recognition methods when images contain occlusion and illumination changes, the invention provides a face recognition method based on mixed error coding that achieves a high recognition rate and good feasibility.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a face recognition method based on mixed error coding comprises the following steps:
step 1, take an m × n dimensional face image y in the real number domain as the face image to be recognized, and collect N face images that contain no occlusion and are acquired under normal illumination conditions from K individuals as the training set A; here each element of A is an m × n matrix in the real number domain representing the i-th training sample of the k-th individual, k = 1, 2, …, K, and N_k is the number of face image samples of the k-th person;
step 2, transform y and A into the gradient direction domain, respectively:
here, the formula above uses two-dimensional convolution, arctan(·) is the arctangent function, h_d is the coefficient matrix of the difference filter, and the symbol T denotes matrix transpose;
step 3, transform the image y to be recognized and the training set A into a high-dimensional feature space:
wherein the two quantities above denote, respectively, the representation of the image to be recognized y and the representation of the training set A in the high-dimensional feature space, and V(·) is the stretch transform, namely: an m × n matrix is stretched into a D = m × n dimensional column vector, or every m × n matrix in the set A is stretched in turn into a D = m × n dimensional column vector and the stretched column vectors are then assembled into a matrix, that is:
wherein,
step 4, initialize the coding coefficient of the face image y to be recognized with respect to the training set A as:
here, 1_{N×1} denotes an N-dimensional column vector whose elements are all 1;
step 5, initialize parameters: t = 1, ζ* = +∞, where t denotes the iteration count and ζ* records the minimum value of the discrimination-error quality measure produced in previous iterations;
step 6, in the pixel space, calculate the reconstructed image of the image y to be recognized according to the following formula:
wherein M_{m×n}(·) is the inverse operation of V(·), i.e., it restores a D = m × n dimensional column vector to an m × n matrix; x^(t) is an N-dimensional column vector over the real number field representing the coding coefficient, generated in the t-th iteration, of the face image y to be recognized with respect to the training set A; and V(A) is the stretch transform of the set A:
step 7, transform the reconstructed image into the gradient direction domain:
step 8, calculate the difference Δθ^(t) between the gradient directions of the image y to be recognized and of its reconstructed image:
step 9, apply the following transformation to Δθ^(t) in the cosine domain:
step 10, apply mean filtering to the result of step 9:
here, h_m is the coefficient matrix of the mean filter;
step 11, obtain the structured error through the following transformation:
where D = m × n is the dimension, |·| denotes taking the absolute value, and ||·||_1 denotes the ℓ1 norm;
step 12, apply threshold clustering to the structured error to obtain a preliminary estimate s_τ^(t) of the effective feature support of the image y to be recognized:
wherein the operator in the formula above denotes threshold filtering, the subscript i ∈ {1, 2, …, m × n} indexes the elements of s_τ^(t) and of the structured error, and τ is a given threshold;
step 13, further estimate the effective feature support s^(t) of the image y to be recognized through the following formula:
wherein s denotes the optimization variable corresponding to the effective feature support s^(t), s has the same dimension as s_τ^(t), the subscripts i, j ∈ {1, 2, …, m × n} index the elements of s and s_τ^(t), the set of neighborhood nodes of i is used in the smoothness term, λ_s is a smoothing parameter, and λ_μ is a data parameter;
step 14, using the effective feature support s^(t) of the image y to be recognized, compute the sparse coding coefficient x^(t) of the representation of y in the high-dimensional feature space with respect to the representation of the training set A in the high-dimensional feature space, together with the discrimination error associated with x^(t):
wherein s.t. denotes a constraint condition, ||·||_2 denotes the ℓ2 norm, ⊙ denotes point-wise multiplication, x denotes the optimization variable corresponding to the sparse coding coefficient x^(t), and the remaining variable denotes the optimization variable corresponding to the discrimination error;
step 15, measure the quality ζ^(t) of the discrimination error; if ζ^(t) < ζ*, then set ζ* = ζ^(t) and x* = x^(t); here the starred quantities record the optimal values obtained so far;
step 16, set t = t + 1 and iterate steps 6 to 15 until the given maximum number of iterations is reached;
step 17, in the high-dimensional feature space, calculate the projection residual r_k of the image y to be recognized in the sample space of each person in the training set according to the following formula:
wherein k ∈ {1, 2, …, K}, and δ_k(x*) denotes the vector obtained from x* by setting to 0 all elements except the coding coefficients corresponding to the k-th person;
step 18, classifying the image y to be recognized according to the following formula:
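To make the alternation in steps 6 to 16 easier to follow, one iteration can be summarized schematically as below; the symbols are those defined in the steps above, and only the order of the operations is shown, not the omitted formulas themselves.

```latex
% One iteration t of the mixed-error loop (summary of steps 6-15):
%   reconstruction -> structured error -> support -> sparse code + discrimination error
\begin{aligned}
\hat{y}^{(t)} &\leftarrow \text{reconstruct}\bigl(A, x^{(t)}\bigr), && \text{(step 6)}\\
\tilde{e}^{(t)} &\leftarrow \text{structured\_error}\bigl(\phi(y), \phi(\hat{y}^{(t)})\bigr), && \text{(steps 7--11)}\\
s^{(t)} &\leftarrow \text{support}\bigl(\tilde{e}^{(t)};\ \tau, \lambda_s, \lambda_\mu\bigr), && \text{(steps 12--13)}\\
\bigl(x^{(t)}, e^{(t)}\bigr) &\leftarrow \text{sparse\_code}\bigl(s^{(t)}\bigr), \qquad
\zeta^{(t)} \leftarrow \text{quality}\bigl(e^{(t)}\bigr). && \text{(steps 14--15)}
\end{aligned}
```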
the technical conception of the invention is as follows: one of the major difficulties with reality-oriented face recognition systems is: face images often have occlusions and lighting variations. The shading and illumination changes can cause the characteristic loss of the image to be recognized on one hand, and can form local obvious characteristics on the other hand, so that higher error recognition rate is caused. Existing methods generally utilize the same error metric, namely: the projection error (also called reconstruction error) of the face image to be recognized in the training image space is used for selecting effective characteristics (so as to eliminate the influence of occlusion and illumination change), and then recognition is carried out. Because the targets and requirements of effective feature selection and image recognition are different, the two are difficult to simultaneously reach the optimum by using the same error measurement, and the method utilizes two errors: and structuring errors and distinguishing errors to respectively select effective characteristics and identify images. Spatial structure with occlusion and illumination variation, namely: continuity and local directionality, construct structured errors; and constructing a discrimination error by using a sparse representation theory and representation of the image in a high-dimensional feature space. The two errors are independent (in function and measurement) and influence each other (in performance), so the invention refers to the face recognition method based on the two error codes as a mixed error coding method. Experiments show that by using the mixed error coding method, effective features in the face image to be recognized can be effectively selected at the same time, and a good recognition effect is achieved.
The invention has the following beneficial effects: compared with the prior art, the method of the invention performs better on real-world face images, especially face images with occlusion and illumination changes, and therefore has clear practical value.
Drawings
FIG. 1 is a flow chart of a face recognition method of the present invention;
FIG. 2 is an image to be recognized y from an AR face database and a training set image A;
FIG. 3 is the result of transformation of an image y to be recognized and a training set image A in the gradient direction domain;
FIG. 4 is a schematic diagram of stretching an image matrix into column vectors;
FIG. 5 is a transformation result of an image y to be recognized in a spatial domain, a gradient direction domain and a high-dimensional space;
FIG. 6 is a partial intermediate result generated in an iterative process of recognition by the method provided by the present invention for the training set image and the image to be recognized provided by FIG. 2: reconstructing an image and a gradient direction characteristic image thereof, various intermediate error images for generating a structured error, twice estimation of effective characteristic support of an image to be recognized by using the structured error image, a sparse coding coefficient of the image to be recognized calculated based on a discriminant error minimization criterion, numerical measurement of the quality of discriminant error, a residual histogram and the like;
FIG. 7 shows face images of the 2nd person from the AR face database, with sunglasses and scarf occlusion, collected under different lighting conditions (normal, left and right strong light), at different feature dimensions;
FIG. 8 is a test result of the recognition performance of the face image with sunglasses shielding collected under normal lighting condition or uneven strong light condition from the AR face database under different feature dimensions by using the method provided by the present invention and four existing methods;
FIG. 9 is a test result of the recognition performance of face images with scarf shielding collected under normal lighting conditions or uneven high light conditions from an AR face database under different feature dimensions by using the method provided by the present invention and four existing methods;
FIG. 10 is partial training set data from subsets I and II of person 1 from the Extended Yale B face database and partial to-be-recognized face images from subsets III, IV and V, respectively, with different simulated occlusion levels (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%);
FIG. 11 is a test result of the recognition performance of the method provided by the present invention and four prior art methods on simulated occlusion (monkey occlusion) face images from subset III of the Extended Yale B database with different sizes (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%);
FIG. 12 is a test result of the recognition performance of the method provided by the present invention and four prior art methods on simulated occlusion (monkey occlusion) face images from subset IV of the Extended Yale B database with different sizes (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%);
FIG. 13 is a test result of the recognition performance of the method provided by the present invention and four existing methods on face images from subset V of the Extended Yale B database with simulated occlusions (monkey occlusion) of different sizes (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%) applied.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 13, a face recognition method based on mixed error coding includes the following steps:
step 1, take an m × n dimensional face image y in the real number domain as the face image to be recognized, and collect N face images that contain no occlusion and are acquired under normal illumination conditions from K individuals into a face image set A used as the training samples; here each element of A is an m × n matrix in the real number domain representing the i-th training sample of the k-th individual, k = 1, 2, …, K, and N_k is the number of face image samples of the k-th person;
step 2, transform y and A into the gradient direction domain, respectively:
here, the formula above uses two-dimensional convolution, arctan(·) is the arctangent function, and h_d is the coefficient matrix of the difference filter, with the suggested value:
the symbol T denotes matrix transpose;
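As a concrete illustration of step 2, the following sketch (Python with NumPy/SciPy, chosen here since the patent specifies no implementation language) computes a gradient-direction representation with a difference filter; the central-difference kernel used for h_d is an assumption, since the suggested value is not reproduced in this text.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_direction(img, h_d=None):
    """Gradient direction transform (sketch of step 2).

    h_d is the difference-filter coefficient matrix; the kernel used here is
    a hypothetical central-difference choice, not the patent's suggested value.
    """
    if h_d is None:
        h_d = np.array([[1.0, 0.0, -1.0]])                      # assumed difference kernel
    gx = convolve2d(img, h_d, mode="same", boundary="symm")     # horizontal gradient
    gy = convolve2d(img, h_d.T, mode="same", boundary="symm")   # vertical gradient (transposed kernel)
    return np.arctan2(gy, gx)                                   # per-pixel gradient direction

# usage: theta_y = gradient_direction(y); the same transform is applied to
# every training image in A.
```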
step 3, transform the image y to be recognized and the training set A into a high-dimensional feature space:
wherein the two quantities above denote, respectively, the representation of the image to be recognized y and the representation of the training set A in the high-dimensional feature space, and V(·) is the stretch transform, namely: an m × n matrix is stretched into a D = m × n dimensional column vector, or every m × n matrix in the set A is stretched in turn into a D = m × n dimensional column vector and the stretched column vectors are then assembled into a matrix, that is:
wherein,
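A minimal sketch of the stretch transform V(·) of step 3 and its application to a set of images; the images are assumed to be supplied as a list of equally sized 2-D arrays, and the function names are illustrative, not taken from the patent.

```python
import numpy as np

def V(img):
    """Stretch an m x n matrix into a D = m*n dimensional column vector."""
    return img.reshape(-1, 1)

def V_set(images):
    """Stretch every m x n matrix in a set and assemble the columns into a D x N matrix."""
    return np.hstack([V(img) for img in images])

# usage (shapes only), reusing the gradient_direction sketch given for step 2:
#   phi_y = V(gradient_direction(y))                          -> (m*n, 1)
#   phi_A = V_set([gradient_direction(a) for a in A_list])    -> (m*n, N)
```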
step 4, initialize the coding coefficient of the face image y to be recognized with respect to the training set A as:
here, 1_{N×1} denotes an N-dimensional column vector whose elements are all 1;
step 5, initialize parameters: t = 1, ζ* = +∞, where t denotes the iteration count and ζ* records the minimum value of the discrimination-error quality measure produced in previous iterations;
step 6, in the pixel space, calculate the reconstructed image of the image y to be recognized according to the following formula:
wherein M_{m×n}(·) is the inverse operation of V(·), i.e., it restores a D = m × n dimensional column vector to an m × n matrix; x^(t) is an N-dimensional column vector over the real number field representing the coding coefficient, generated in the t-th iteration, of the face image y to be recognized with respect to the training set A; and V(A) is the stretch transform of the set A:
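The reconstruction of step 6 can be sketched as below; the linear combination V(A)·x^(t) followed by the inverse stretch M_{m×n}(·) is an assumed reading of the omitted formula, consistent with the surrounding description.

```python
import numpy as np

def reconstruct(VA, x_t, m, n):
    """Sketch of step 6: reconstruct the test image in pixel space.

    VA  : D x N matrix of stretched training images, D = m*n
    x_t : N-dimensional coding coefficient from the current iteration
    The product VA @ x_t followed by M_{m x n}(.) is an assumption, since the
    patent's formula is not reproduced in this text.
    """
    v = VA @ x_t.reshape(-1, 1)   # D x 1 column vector in pixel space
    return v.reshape(m, n)        # M_{m x n}(.): restore to an m x n matrix
```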
step 7, transform the reconstructed image into the gradient direction domain:
step 8, calculate the difference Δθ^(t) between the gradient directions of the image y to be recognized and of its reconstructed image:
step 9, apply the following transformation to Δθ^(t) in the cosine domain:
step 10, apply mean filtering to the result of step 9:
here, h_m is the coefficient matrix of the mean filter, for which a 7 × 7 matrix with all elements equal to 1 is suggested;
step 11, obtain the structured error through the following transformation:
where D = m × n is the dimension, |·| denotes taking the absolute value, and ||·||_1 denotes the ℓ1 norm;
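Steps 7 to 11 can be illustrated with the sketch below. The cosine-domain mapping of step 9 and the normalization of step 11 are only described, not shown, so the particular expressions used here ((1 − cos Δθ)/2 and an ℓ1-normalized magnitude scaled by D) are assumptions; the 7 × 7 mean filter follows the suggestion of step 10, applied here in normalized (averaging) form.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def structured_error(theta_y, theta_rec):
    """Sketch of steps 7-11: structured error from the gradient-direction
    mismatch between the test image and its reconstruction.

    The cosine-domain mapping and the final normalization are assumed forms,
    since the corresponding formulas are not reproduced in this text.
    """
    d_theta = theta_y - theta_rec                  # step 8: gradient-direction difference
    f = (1.0 - np.cos(d_theta)) / 2.0              # step 9 (assumed): small where directions agree
    f_bar = uniform_filter(f, size=7)              # step 10: 7 x 7 mean filtering
    e = np.abs(f_bar)                              # step 11: absolute value
    D = e.size                                     # D = m * n
    return D * e / np.linalg.norm(e.ravel(), 1)    # step 11 (assumed): l1-normalized, scaled by D
```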
step 12, apply threshold clustering to the structured error to obtain a preliminary estimate s_τ^(t) of the effective feature support of the image y to be recognized:
wherein the operator in the formula above denotes threshold filtering, the subscript i ∈ {1, 2, …, m × n} indexes the elements of s_τ^(t) and of the structured error, and τ is a given threshold with a suggested value of 0.3;
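A sketch of the threshold clustering of step 12; marking a pixel as an effective feature (support value 1) when its structured error falls below τ is an assumption consistent with the stated purpose of excluding occluded regions.

```python
import numpy as np

def threshold_support(e_struct, tau=0.3):
    """Sketch of step 12: preliminary effective-feature support s_tau.

    The direction of the comparison is an assumption; tau = 0.3 follows the
    suggested value in the text.
    """
    return (e_struct < tau).astype(np.int8)  # 1 = effective feature, 0 = excluded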
step 13, further estimate the effective feature support s^(t) of the image y to be recognized through the following formula:
wherein s denotes the optimization variable corresponding to the effective feature support s^(t), s has the same dimension as s_τ^(t), the subscripts i, j ∈ {1, 2, …, m × n} index the elements of s and s_τ^(t), the set of neighborhood nodes of i is used in the smoothness term, λ_s is a smoothing parameter, and λ_μ is a data parameter; the suggested value of λ_s and λ_μ is 2. The optimization is suggested to be solved by the graph cut algorithm (GraphCuts); a specific solver is available at http://www.cs.cornell.edu/~rdz/graphcuts.html;
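Step 13 refines the support by minimizing a Markov-random-field style energy with a data term weighted by λ_μ and a smoothness term weighted by λ_s over neighbouring pixels. The sketch below uses a simple iterated-conditional-modes (ICM) sweep purely to illustrate such an energy; it is a stand-in for, not an implementation of, the GraphCuts solver referenced above, and the exact form of the two energy terms is an assumption.

```python
import numpy as np

def refine_support_icm(s_tau, lam_s=2.0, lam_mu=2.0, sweeps=5):
    """Sketch of step 13: refine the binary support s_tau (m x n array of 0/1).

    Energy (assumed form): sum_i lam_mu * [s_i != s_tau_i]
                         + sum_{i,j neighbours} lam_s * [s_i != s_j].
    ICM is used only as an illustrative stand-in for the GraphCuts solver.
    """
    s = s_tau.astype(np.int8).copy()
    m, n = s.shape
    for _ in range(sweeps):
        for i in range(m):
            for j in range(n):
                best_label, best_cost = s[i, j], np.inf
                for label in (0, 1):
                    cost = lam_mu * (label != s_tau[i, j])            # data term
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < m and 0 <= jj < n:
                            cost += lam_s * (label != s[ii, jj])      # smoothness term
                    if cost < best_cost:
                        best_label, best_cost = label, cost
                s[i, j] = best_label
    return s
```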
step 14, using the effective feature support s^(t) of the image y to be recognized, compute the sparse coding coefficient x^(t) of the representation of y in the high-dimensional feature space with respect to the representation of the training set A in the high-dimensional feature space, together with the discrimination error associated with x^(t):
wherein s.t. denotes a constraint condition, ||·||_2 denotes the ℓ2 norm, ⊙ denotes point-wise multiplication, x denotes the optimization variable corresponding to the sparse coding coefficient x^(t), and the remaining variable denotes the optimization variable corresponding to the discrimination error;
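Step 14 restricts the coding to the effective feature support. The sketch below solves a masked ℓ1-regularized least-squares problem by ISTA as a surrogate for the patent's (omitted) optimization; the objective, the regularization weight and the definition of the discrimination error as the masked residual are all assumptions.

```python
import numpy as np

def masked_sparse_code(phi_y, phi_A, support, lam=0.01, n_iter=200):
    """Sketch of step 14: sparse coding restricted to the effective feature support.

    phi_y   : D-dimensional feature vector of the test image
    phi_A   : D x N feature matrix of the training set
    support : D-dimensional 0/1 vector (stretched effective-feature support s)
    Solves min_x 0.5 * || s (*) (phi_y - phi_A x) ||_2^2 + lam * ||x||_1 by ISTA.
    This objective is an assumed surrogate for the patent's formula; the
    discrimination error is taken as the masked residual.
    """
    s = support.reshape(-1, 1).astype(float)
    y = phi_y.reshape(-1, 1) * s
    A = phi_A * s                                   # zero out rows outside the support
    L = np.linalg.norm(A, 2) ** 2 + 1e-12           # Lipschitz constant of the gradient
    x = np.zeros((A.shape[1], 1))
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft thresholding
    e = s * (phi_y.reshape(-1, 1) - phi_A @ x)      # associated discrimination error
    return x.ravel(), e.ravel()
```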
step 15, measure the quality ζ^(t) of the discrimination error; if ζ^(t) < ζ*, then set ζ* = ζ^(t) and x* = x^(t); here the starred quantities record the optimal values obtained so far;
step 16, set t = t + 1 and iterate steps 6 to 15 until the given maximum number of iterations is reached (the recommended maximum number of iterations is 6);
step 17, in the high-dimensional feature space, calculate the projection residual r_k of the image y to be recognized in the sample space of each person in the training set according to the following formula:
wherein k ∈ {1, 2, …, K}, and δ_k(x*) denotes the vector obtained from x* by setting to 0 all elements except the coding coefficients corresponding to the k-th person;
step 18, classifying the image y to be recognized according to the following formula:
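Steps 17 and 18 follow the class-wise residual comparison familiar from sparse-representation classification. In the sketch below, the projection residual is taken as the ℓ2 norm of the difference between the test feature vector and its reconstruction from the coefficients of one person only; this matches the description, although the patent's exact residual formula is not reproduced in this text.

```python
import numpy as np

def classify(phi_y, phi_A, x_star, labels):
    """Sketch of steps 17-18: classify y by the smallest class-wise projection residual.

    labels : length-N array, labels[i] = person index of the i-th training column
    delta_k(x*) keeps only the coefficients of person k and zeroes the rest.
    """
    phi_y = phi_y.ravel()
    residuals = {}
    for k in np.unique(labels):
        x_k = np.where(labels == k, x_star, 0.0)             # delta_k(x*)
        residuals[k] = np.linalg.norm(phi_y - phi_A @ x_k)   # r_k (assumed l2 residual)
    return min(residuals, key=residuals.get), residuals      # identity(y) = argmin_k r_k
```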
For better illustration, a specific example is given below:
Step 1, a face image of the 7th person with sunglasses occlusion under left strong-light illumination is taken from the AR face database as the face image y to be recognized; its dimension is 112 × 92. Samples of the first 4 persons and of the 7th person, without occlusion or illumination change, are taken from the AR face database as the training sample set: 40 samples in total (8 samples per person), each of dimension 112 × 92, used as the training set A. The face image y to be recognized and the training set image A are shown in FIG. 2.
Step 2 transforms y and A into gradient direction domains φ (y) and φ (A), respectively, as shown in FIG. 3.
Step 3 transforms y and A into the high-dimensional feature space; here, V(·) is the stretch transform, and FIG. 4 illustrates the operation of stretching a matrix into a vector; FIG. 5 compares the representations of the image y to be recognized in the spatial domain, the gradient direction domain, and the high-dimensional space.
Step 4 initializes the coding coefficient of the face image y to be recognized with respect to the training set A as:
Step 5 initializes the parameters: t = 1, ζ* = +∞.
Step 6, in pixel space, a reconstructed image of y is calculated:
wherein M_{m×n}(·) is the inverse operation of V(·), i.e., a D = m × n dimensional column vector is restored to an m × n matrix; the reconstructed image of y is shown in the t-th column of FIG. 6(c).
Step 7 transforms the reconstructed image into the gradient direction domain, as shown in the t-th column of FIG. 6(d).
Step 8 calculates the difference between the gradient directions of y and of its reconstructed image, as shown in the t-th column of FIG. 6(e).
Step 9 performs the cosine-domain transformation, with the result shown in the t-th column of FIG. 6(f).
Step 10 performs mean filtering, with the result shown in the t-th column of FIG. 6(g).
Step 11 computes the structured error; here D = 112 × 92, and the result is shown in the t-th column of FIG. 6(h).
Step 12 applies threshold filtering to the structured error to obtain the preliminary estimate of the effective feature support of y; here τ is taken as 0.3, and the result is shown in the t-th column of FIG. 6(i).
Step 13 sets λ_s = λ_μ = 2 and solves the following formula with the GraphCuts algorithm:
A further estimate of the effective feature support of y is obtained, as shown in the t-th column of FIG. 6(j).
Step 14, in the high-dimensional feature space, calculates over the non-occluded area the discrimination error of the image y to be recognized with respect to the training set A, together with the associated coding coefficient x^(t) of y:
wherein ||·||_2 denotes the ℓ2 norm and ⊙ denotes point-wise multiplication; the t-th column of FIG. 6(l) shows the coding coefficient x^(t).
Step 15 computes ζ^(t); if ζ^(t) < ζ*, then set ζ* = ζ^(t) and x* = x^(t); the t-th column of FIG. 6(m) shows the computed value of ζ^(t).
Step 16 sets t = t + 1 and iterates steps 6 to 15 above until t = 6.
Step 17: from the values of ζ^(t) over the 6 iterations shown in FIG. 6(m), ζ^(t) attains its minimum value 1.236719 at t = 5, so the corresponding x^(t) is optimal; take x* = x^(5), and for k ∈ {1, 2, 3, 4, 7} calculate:
wherein δ_k(x*) denotes the vector obtained from x* by setting to 0 all elements except the coding coefficients corresponding to the k-th person. For k ∈ {1, 2, 3, 4, 7}, the specific values of r_k are shown in column 5 of FIG. 6(n).
Step 18 classifies y:
From the residual histogram in column 5 of FIG. 6(n), id(y) = 7, so the recognition result is correct.
To simulate real-world face recognition with occlusion and illumination changes and to further illustrate the beneficial effects of the method provided by the present invention compared with existing methods, FIG. 7 and FIG. 10 show two sets of relatively large-scale test data, and FIGS. 8, 9, 11, 12 and 13 report large-scale recognition experiments on these test data for face images with different occlusion and illumination conditions, under different feature dimensions or different occlusion levels. In these experiments, the method provided by the present invention is denoted MEC (Mixed Error Coding). The specific description is as follows:
(1) FIG. 8 and FIG. 9 show the results of testing the recognition performance of the method provided by the invention (MEC) and 4 existing methods (IGO-PCA, SSEC, CESR, RSC) on face images with sunglasses or scarf occlusion collected from the AR face database under normal lighting conditions or under uneven strong-light conditions, at different feature dimensions. Training set: the non-occluded face images of 119 persons (65 men and 54 women) collected under normal lighting conditions are selected as the training set, with 8 training images per person, similar to the training set in FIG. 2, each of dimension 112 × 92. Test set: 119 × 3 = 357 images of the same 119 persons with uneven strong light (normal, left-side and right-side strong light) and sunglasses (FIG. 7(a)) or scarf (FIG. 7(b)) occlusion are selected as the test set. FIG. 7 shows test images of the 2nd person of the AR database at four different feature dimensions (112 × 92 = 10304, 56 × 46 = 2576, 28 × 23 = 644, 14 × 11 = 154). For the above training and test data, FIG. 8 and FIG. 9 show the recognition results of the method provided by the present invention (MEC) and of the 4 existing methods, respectively. It can be seen that the method provided by the invention almost always achieves the best recognition performance under the different feature dimensions.
(2) FIGS. 11, 12 and 13 show the results of testing the recognition performance of the method provided by the present invention (MEC) and 4 existing methods (IGO-PCA, SSEC, CESR, RSC) on face images from the Extended Yale B database, collected under different lighting conditions (low light), with simulated occlusions (monkey occlusion) of different sizes applied. The face images of the Extended Yale B database are divided into 5 subsets according to lighting conditions: I, II, III, IV, V; from subset I to subset V the lighting conditions gradually deteriorate. Subset I and subset II (717 images in total) are chosen as the training set, and subset III (453 images), subset IV (524 images) and subset V (712 images) are chosen as test sets, respectively; a monkey image at one of 10 different occlusion levels (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%) is superimposed on each image of these test sets as occlusion. FIG. 10(a) shows partial training data, from subset I and subset II, of the first person of the Extended Yale B database, and FIGS. 10(b), (c) and (d) show partial test data of the first person from subset III, subset IV and subset V, respectively, with monkey occlusions of different sizes applied. For the above training and test data, FIGS. 11, 12 and 13 show the recognition performance of the method provided by the present invention (MEC) and of the 4 existing methods, respectively. It can be seen that the method provided by the invention almost always achieves the best recognition performance under the different illumination conditions and occlusion levels.