CN112966649B - Occlusion face recognition method based on sparse representation of kernel extension dictionary

Occlusion face recognition method based on sparse representation of kernel extension dictionary

Info

Publication number
CN112966649B
Authority
CN
China
Prior art keywords
sample
dictionary
occlusion
sample set
samples
Prior art date
Legal status
Active
Application number
CN202110319464.9A
Other languages
Chinese (zh)
Other versions
CN112966649A (en)
Inventor
童莹
马杲东
陈瑞
曹雪虹
赵小燕
Current Assignee
Nanjing Institute of Technology
Original Assignee
Nanjing Institute of Technology
Priority date
Filing date
Publication date
Application filed by Nanjing Institute of Technology
Priority to CN202110319464.9A
Publication of CN112966649A
Application granted
Publication of CN112966649B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; Face representation
    • G06V 40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/374: Thesaurus
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2132: Feature extraction based on discrimination criteria, e.g. discriminant analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2135: Feature extraction based on approximation criteria, e.g. principal component analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/60: Analysis of geometric attributes
    • G06T 7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30196: Human being; Person
    • G06T 2207/30201: Face

Abstract

The invention discloses an occlusion face recognition method based on sparse representation with a kernel extended dictionary, comprising the following steps. Step (S1): construct a training sample set X. Step (S2): construct a standard sample set N. Step (S3): construct a test sample set Y. Step (S4): construct an occlusion dictionary D1 of the training sample set X and an occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D. Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, thereby achieving occlusion face recognition. The method eliminates redundant pixel information from the sample dictionary and yields a more discriminative and representative dictionary, so that the sample dictionary contains only facial structure features, free of redundant pixel and interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, free of facial structure features; combining the two improves the accuracy of occlusion face recognition.

Description

Occlusion face recognition method based on sparse representation of kernel extension dictionary
Technical Field
The invention relates to the technical field of personal identity verification and recognition when the face is occluded in human-computer interaction, and in particular to an occluded face recognition method based on sparse representation with a kernel extension dictionary.
Background
In recent years, thanks to advances in artificial intelligence, computer vision, and Internet-of-Things communication, face recognition has been widely applied in daily life, for example in smart home appliances, smart retail, and smart access control. However, these applications all require the target to remain frontal and unoccluded. In practice, the face of the target is often occluded by accessories such as scarves, hats, masks, and glasses, or affected by illumination, which lowers the accuracy of identity verification. How to eliminate the interference of these occlusion factors and improve recognition accuracy in practical applications has therefore become the key technical difficulty of occluded face recognition.
In 2009, Wright et al. first applied Sparse Representation (SR) theory to face recognition and proposed sparse representation-based classification (SRC). The algorithm builds a dictionary from the training samples and constrains the L1 norm of the coding coefficients of the sample to be tested to be minimal, the aim being to select a minimal subset of the sample dictionary to linearly represent the sample to be tested; the residual between the sample to be tested and each class's reconstruction is then computed, and the class with the minimum residual is assigned. The objective function of SRC is shown in formula (1), where A is the sample dictionary, y is the sample to be tested, and α is the sparse coding coefficient.
min_α ||α||_1   s.t.   ||y - Aα||_2 ≤ ε   (1)
According to sparse regularization theory, as long as the sample dictionary A in formula (1) contains sufficiently rich sample atoms, that is, atoms describing the various interference conditions that may occur in face recognition, the sample to be tested can be linearly and sparsely reconstructed from the sample dictionary without distortion. In practice, however, the collected face images are affected by interference factors such as age, ambient illumination, facial expression, accessory occlusion, and head pose, so the atoms of the sample dictionary A cannot cover every possible variation. Moreover, under occlusion part of the facial information is lost, so an occluded sample to be tested cannot be accurately represented linearly by the occluded samples and is easily misclassified.
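For illustration only (this code is not part of the patent), the following is a minimal Python sketch of SRC classification per formula (1); it assumes NumPy and scikit-learn, relaxes the ε-constrained problem to the usual Lasso form, and the array names A, labels, and y are hypothetical:

import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, lam=0.01):
    """Sparse-representation classification in the spirit of formula (1).

    A      : (d, n) sample dictionary, one training image per column.
    labels : (n,) class label of each column of A.
    y      : (d,) sample to be tested.
    lam    : L1 weight (Lasso relaxation of the epsilon constraint).
    """
    # Solve min ||y - A a||_2^2 + lam * ||a||_1 (relaxation of (1)).
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    solver.fit(A, y)
    a = solver.coef_
    # Residual of each class's reconstruction; assign the minimum.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A[:, labels == c] @ a[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]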
It is therefore inappropriate to represent the sample to be tested with the single sample dictionary A alone. Wright accordingly introduced the identity matrix I into the SRC model as an extended noise dictionary, with the aim of separating interference factors from the original image: the essential structural features of the face are represented by the sample dictionary, while external interference factors are represented by the noise dictionary, further improving the accuracy of the linear sparse representation of the sample to be tested. The objective function is shown in formula (2).
min_{α,e} ||[α; e]||_1   s.t.   y = [A, I][α; e]   (2)
In 2012, Deng improved SRC and proposed the extended sparse representation-based classifier (ESRC). The method subtracts standard samples (frontal, interference-free face images) from variation samples (face images with occlusion, expression, illumination, or other interference) and builds an intra-class difference dictionary V from these differences, replacing the identity interference matrix I of traditional SRC. Compared with traditional SRC, the intra-class difference dictionary V of ESRC carries richer interference information and, combined with the sample dictionary A, describes the sample to be tested more accurately. In 2016, Chen proposed an Adaptive Noise Dictionary (AND), which first uses an iteratively reweighted robust principal component analysis method to adaptively extract the various kinds of occlusion information that may exist in the sample to be tested, and then combines them with unoccluded training samples to achieve an accurate linear representation of the sample to be tested.
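As a concrete reading of the ESRC construction described above, here is a hedged NumPy sketch; the array names and the pairing of each variation sample with its subject's standard sample are assumptions, not details taken from the cited work:

import numpy as np

def build_intra_class_difference_dictionary(variation, subject_ids, standard):
    """variation   : (d, n_v) variation samples, one per column.
    subject_ids : (n_v,) subject index of each variation sample.
    standard    : (d, m) one frontal interference-free image per subject."""
    # Each column of V is a variation sample minus the same subject's standard sample.
    V = variation - standard[:, subject_ids]
    return V

# The extended dictionary used in place of [A, I] of traditional SRC would then be
# B = np.hstack([A, V]).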
Analysing the implementation principles of SRC and its improved variants shows that these methods all build on formula (2): their main purpose is to obtain an accurate extended noise dictionary by different means and to separate the interference information from the original face image, thereby further improving the accuracy of the linear sparse representation of the sample to be tested. Although some success has been achieved in occluded face recognition, the following problems remain:
1. atoms in the sample dictionary A are all represented by original images, so that a sample dictionary constructed based on the images has a large amount of pixel redundant information, the like atoms lack consistency, and the heterogeneous atoms lack discriminability; meanwhile, the dictionary atoms are represented by converting two-dimensional images into one-dimensional column vectors, so that the dimensionality of the dictionary atoms is far greater than the number of the atoms, the problem of 'small samples' is easy to occur, and the optimal sparse solution cannot be obtained in a solution space;
2. The above ways of constructing the extended noise dictionary V are still limited. For example, the traditional SRC algorithm uses the identity matrix I as the extended noise dictionary and can only describe discontinuous single-interference problems such as random pixel corruption and small-scale unrealistic occlusion. Although the intra-class difference dictionary of ESRC is more effective than the identity matrix, it mixes the occlusion information and non-occlusion information of the samples, so its linear sparse representation of occluded samples is poor. Meanwhile, the intra-class difference dictionary of ESRC is obtained by subtracting standard samples from variation samples and therefore contains a large amount of redundant pixel information. If the variation samples are not chosen richly enough, the intra-class difference information obtained is also insufficient, which degrades face recognition accuracy.
Disclosure of Invention
The invention aims to solve the problems of prior-art occlusion face recognition algorithms based on sparse representation. The method improves the construction of the sample dictionary and of the extended noise dictionary in the sparse representation model, eliminating redundant pixel information from the sample dictionary and obtaining a more discriminative and representative dictionary, so that the sample dictionary contains only facial structure features, with no redundant pixel or interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, with no facial structure features; combining the two effectively improves the accuracy of occlusion face recognition.
To achieve this purpose, the invention adopts the following technical scheme:
A method for recognizing an occluded face based on sparse representation of a kernel extension dictionary comprises the following steps.
Step (S1): construct a training sample set X, and learn from X with the Kernel Discriminant Analysis (KDA) algorithm to obtain a KDA projection matrix, denoted P below;
Step (S2): constructing a standard sample set N, performing projection dimensionality reduction on the standard sample set N by adopting a KDA algorithm according to a formula (3) to obtain a low-dimensional basic dictionary A,
Figure BDA0002992548400000042
phi (N) represents a radial basis kernel function, N represents a standard sample set, and T represents the transposition operation of a matrix;
Step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
Step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D;
Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested.
Preferably, in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Preferably, in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
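To make steps (S1)-(S2) concrete, the following hedged Python sketch implements one standard formulation of kernel discriminant analysis (a generalized eigenproblem between kernel-space scatter matrices) together with the basic dictionary A of formula (3). It is not the patent's code: the projection matrix P is realized implicitly as Φ(X)·Alpha through kernel evaluations, and the kernel width gamma, the regularizer eps, and all function names are assumptions.

import numpy as np
from scipy.linalg import eigh

def rbf_kernel(X1, X2, gamma=1e-3):
    # K[i, j] = exp(-gamma * ||X1[:, i] - X2[:, j]||^2); columns are samples.
    # gamma is data-dependent and would be tuned for real face images.
    sq = (np.sum(X1**2, 0)[:, None] + np.sum(X2**2, 0)[None, :]
          - 2.0 * X1.T @ X2)
    return np.exp(-gamma * sq)

def kda_fit(X, labels, gamma=1e-3, eps=1e-6):
    """Learn KDA coefficients Alpha (n x (c-1)) from a training set X (d x n)."""
    n = X.shape[1]
    K = rbf_kernel(X, X, gamma)
    classes = np.unique(labels)
    mean_all = K.mean(axis=1)
    S_b = np.zeros((n, n))          # between-class scatter in kernel space
    S_w = np.zeros((n, n))          # within-class scatter in kernel space
    for c in classes:
        idx = np.where(labels == c)[0]
        Kc = K[:, idx]
        mean_c = Kc.mean(axis=1)
        diff = mean_c - mean_all
        S_b += len(idx) * np.outer(diff, diff)
        centered = Kc - mean_c[:, None]
        S_w += centered @ centered.T
    # Generalized eigenproblem; keep the c-1 leading eigenvectors.
    vals, vecs = eigh(S_b, S_w + eps * np.eye(n))
    Alpha = vecs[:, -(len(classes) - 1):]
    return Alpha

def kda_project(X_train, Alpha, Z, gamma=1e-3):
    # Low-dimensional projection P^T Phi(Z), computed via the kernel trick.
    return Alpha.T @ rbf_kernel(X_train, Z, gamma)

# Formula (3): low-dimensional basic dictionary from the standard set N.
# A = kda_project(X_train, Alpha, N)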
Preferably, step (S4) comprises the following steps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X, use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested (a code sketch follows step (S45));
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2].
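The following hedged Python sketch walks through steps (S41)-(S45), reusing the hypothetical kda_project and Alpha from the earlier KDA sketch. Robust principal component analysis is implemented with a compact inexact-ALM solver, and the patent's use of RPCA is read here as decomposing [X_i, y] into a low-rank part plus a sparse part whose last column serves as the occlusion estimate b_i; that reading, and every name below, is an assumption rather than the patent's definitive procedure (the defaults k = 0.1 and l = 5 mirror the values used in the embodiments).

import numpy as np

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Robust PCA, M = L + S, via the inexact augmented Lagrange multiplier method."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(M, 2)
    Y = M / max(norm2, np.linalg.norm(M.ravel(), np.inf) / lam)
    mu, rho = 1.25 / norm2, 1.5
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular value thresholding recovers the low-rank part L.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # Soft thresholding recovers the sparse part S.
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)
        Z = M - L - S
        Y += mu * Z
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(Z, 'fro') <= tol * np.linalg.norm(M, 'fro'):
            break
    return L, S

def occlusion_dictionary(X_train, Alpha, class_subsets, X_O, X_N, Y_test, k=0.1, l=5):
    # (S41): D1 from projected occluded-minus-standard training pairs, formula (4).
    D1 = kda_project(X_train, Alpha, X_O) - kda_project(X_train, Alpha, X_N)
    cols = []
    rng = np.random.default_rng(0)
    for y in Y_test.T:
        # (S42): average the sparse components of y over l random class subsets.
        picks = rng.choice(len(class_subsets), size=l, replace=False)
        bs = [rpca_ialm(np.column_stack([class_subsets[i], y]))[1][:, -1]
              for i in picks]
        b = np.mean(bs, axis=0)
        # (S43): approximate sample y* = y - k*b, formula (5); project both, formula (6).
        y_star = y - k * b
        do = (kda_project(X_train, Alpha, y[:, None])
              - kda_project(X_train, Alpha, y_star[:, None]))
        cols.append(do[:, 0])
    D2 = np.column_stack(cols)          # (S44): adaptive dictionary of the test set
    return np.hstack([D1, D2])          # (S45): mixed complete dictionary D = [D1, D2]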
Preferably, in step (S5), performing linear sparse representation classification on the sample to be tested with the SRC model according to the mixed complete occlusion dictionary D comprises the following steps:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
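A hedged Python sketch of steps (S51)-(S52) follows, using the hypothetical arrays from the sketches above (A from formula (3), D from steps (S41)-(S45), y_kda the KDA projection of the sample to be tested, and atom_labels the class of each column of A); scikit-learn's Lasso stands in as one convenient solver of the λ-penalized problem of formula (7):

import numpy as np
from sklearn.linear_model import Lasso

def classify_with_mixed_dictionary(A, D, atom_labels, y_kda, lam=0.001):
    """Formulas (7)-(9): sparse coding over [A, D], then per-class residuals."""
    B = np.hstack([A, D])
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    solver.fit(B, y_kda)
    omega = solver.coef_
    beta, beta_tilde = omega[:A.shape[1]], omega[A.shape[1]:]
    occ_part = D @ beta_tilde                    # contribution of the occlusion dictionary
    classes = np.unique(atom_labels)
    residuals = []
    for c in classes:
        delta_j = np.where(atom_labels == c, beta, 0.0)   # delta_j(beta) of formula (8)
        residuals.append(np.linalg.norm(y_kda - A @ delta_j - occ_part))
    return classes[int(np.argmin(residuals))]             # formula (9)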
The beneficial effects of the invention are:
1. The invention abandons the traditional strategy of constructing the dictionary in the original image space and instead improves dictionary construction in a low-dimensional discriminative feature space, eliminating redundant pixel information and obtaining a more discriminative and representative dictionary.
2. Because face images collected in real environments are distributed on a nonlinear complex manifold in sample space, traditional linear dimensionality reduction methods such as Linear Discriminant Analysis (LDA) cannot handle the nonlinearly inseparable case; the invention therefore uses the Kernel Discriminant Analysis (KDA) algorithm to compute the optimal low-dimensional projection directions of the original image space and obtain a more discriminative low-dimensional subspace.
3. The invention improves the construction of the sample dictionary in the KDA low-dimensional projection subspace, removing redundant inter-pixel information from the original images, improving the discriminability of dictionary atoms, reducing atom dimensionality, raising the computational efficiency of the model, and making the optimal sparse solution attainable in the solution space.
4. The invention first improves the construction of the training samples' occlusion dictionary in the KDA low-dimensional projection subspace, eliminating redundant inter-pixel information and facial structure features so that the training occlusion dictionary is more representative. Meanwhile, the occlusion information of the samples to be tested is extracted in the KDA low-dimensional projection subspace with the robust principal component analysis algorithm to supplement the occlusion dictionary, making the occlusion information complete and adaptive.
In conclusion, the invention improves the construction of the sample dictionary and the occlusion dictionary in the KDA low-dimensional projection subspace, so that the sample dictionary contains only facial structure features, with no redundant pixel or interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, with no facial structure features; combining the two effectively improves the accuracy of occlusion face recognition.
Drawings
FIG. 1 is a block diagram of a flow implementation of the method for recognizing an occluded face based on sparse representation of a kernel extended dictionary according to the present invention;
FIG. 2 shows sample face images from part of the CAS-PEAL library;
FIG. 3 is a graph of the mixed recognition rate of the invention for different values of k;
FIG. 4 shows sample face images of one subject class in the AR library;
FIG. 5 shows sample images from part of the Extended Yale B database.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
The invention is evaluated on three face databases, the CAS-PEAL library, the AR library, and Extended Yale B; the experimental environment is a Windows 10 64-bit operating system, 8 GB of memory, and the Matlab R2017a simulation platform.
As shown in FIG. 1, the occluded face recognition method based on sparse representation with a kernel extended dictionary of the invention comprises the following steps.
Step (S1): construct a training sample set X, and learn from X with the KDA algorithm to obtain the KDA projection matrix P. The KDA algorithm referred to in the invention is the Kernel Discriminant Analysis (KDA) algorithm;
Step (S2): construct a standard sample set N, and perform projection dimensionality reduction on N with the KDA algorithm according to formula (3) to obtain the low-dimensional basic dictionary A:

A = P^T Φ(N)   (3)

where Φ(N) denotes the high-dimensional mapping of N by a radial basis kernel function, N is the standard sample set, and T denotes matrix transposition;
Step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
Step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D;
Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested.
Further, in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Further, in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Further, step (S4) comprises the following steps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X (in this embodiment l = 5, in view of the computational complexity of the algorithm), use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested;
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2].
Further, in step (S5), performing linear sparse representation classification on the sample to be tested with the SRC model according to the mixed complete occlusion dictionary D comprises the following steps:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
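Putting the preceding sketches together, a toy end-to-end run might look as follows; the synthetic data are stand-ins for real face matrices, the function definitions from the sketches above are assumed to be in scope, and the toy pairing of X_O with N exists only for shape compatibility:

import numpy as np

rng = np.random.default_rng(1)
d, c, per = 64, 5, 6
X = rng.normal(size=(d, c * per))              # synthetic training set (d x n)
labels = np.repeat(np.arange(c), per)
N = rng.normal(size=(d, c))                    # one "standard" image per class
atom_labels = np.arange(c)                     # class of each column of A
class_subsets = [X[:, labels == j] for j in range(c)]
X_O, X_N = X[:, :c], N                         # toy occluded/standard pairs
Y_test = rng.normal(size=(d, 3))

Alpha = kda_fit(X, labels)                                     # step (S1)
A = kda_project(X, Alpha, N)                                   # step (S2), formula (3)
D = occlusion_dictionary(X, Alpha, class_subsets, X_O, X_N,
                         Y_test, k=0.1, l=3)                   # step (S4)
for y in Y_test.T:                                             # step (S5)
    y_kda = kda_project(X, Alpha, y[:, None])[:, 0]
    print(classify_with_mixed_dictionary(A, D, atom_labels, y_kda))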
Example 1: experiments were performed in the CAS-PEAL database:
the CAS-PEAL face database contains 1040 people, a total of 99594 face images (including 595 men and 445 women). All images are collected in a special collection environment, 4 main change conditions of postures, expressions, ornaments and illumination are covered, and part of face images have changes of backgrounds, distances and time spans. The invention selects 9031 images to carry out the experiment, and partial sample images are shown in figure 2.
The design of the training sample set, the standard sample set, and the test sample set on the CAS-PEAL database is as follows:
(1) The training sample set with rich intra-class variation contains 200 subjects with illumination variation, 100 with expression variation, and 20 with accessory occlusion, 4 images per subject, 1280 variation samples in total. The training sample set additionally includes 1 frontal interference-free image of each subject class, 273 standard samples in total; the two parts together form the training sample set.
(2) The standard sample set contains all 1040 subjects of the CAS-PEAL database, one frontal interference-free image per subject, 1040 samples in total.
(3) The test sample set consists of the remaining samples after the training and standard samples are removed; it contains 6711 samples covering 6 subsets: accessory occlusion, illumination, expression, distance (different shooting distances), time (different shooting intervals), and background (different shooting backgrounds).
Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 1. The ESRC, KDA, and KED algorithms all perform well on the expression, time, background, and distance subsets, with recognition rates close to 99% and some even reaching 100%, but the results differ greatly on the accessory occlusion and illumination subsets; in particular, the SRC algorithm reaches only 17.32% on the illumination subset, showing that describing illumination variation with the identity matrix is inappropriate, and the occlusion dictionaries of the other algorithms also need improvement.
The invention (k = 0.1) achieves the best recognition on the six interference subsets (except the background subset, where it is slightly below KED); in particular, on the accessory occlusion and illumination subsets the recognition rates reach 92.29% and 86.55%, respectively, 1.77% and 3.52% higher than the second-best KED algorithm. The improvements of the invention to the sample dictionary and the extended noise dictionary thus help improve the accuracy of the linear representation of occluded face images.
The mixed recognition rate in Table 1 refers to recognizing the samples of the six interference subsets mixed together. The recognition rate of the invention is again the best, reaching 93.97%, which is 25.23%, 3.39%, 6.23%, and 1.54% higher than SRC, ESRC, KDA, and KED, respectively. This further shows that the invention is robust and adaptive to the various interference factors present in face recognition.
TABLE 1: recognition results on the CAS-PEAL database (%)
To illustrate the effect of the coefficient k in formula (5), we let k take values in [0, 1]. FIG. 3 shows the recognition results on the mixed-interference face images for different values of k. When k = 0 the recognition rate is 92.92%, when k = 0.1 it is 93.97%, and when k = 1 it falls to 92.04%. This shows that when k = 0 the occlusion dictionary contains only the occlusion information of the training samples, with no occlusion information from the samples to be tested, which harms recognition. When k is large, e.g. k = 1, the approximate sample y* computed by formula (5) loses more facial structure features; after nonlinear mapping to the high-dimensional space, y* deviates greatly from the sample Φ(y) to be tested, so the difference of their projections in the KDA low-dimensional space cannot represent the occlusion information of the sample well and recognition accuracy drops. Only when k is small, e.g. k = 0.1, does the approximate sample y* both remove part of the occlusion and retain more facial structure features, so that the projection difference between y* and the sample to be tested in the KDA low-dimensional space effectively represents the sample's occlusion information and improves recognition.
Example 2: experiments were performed in the AR database:
the AR face database contains 126 classes of people (56 women, 70 men), and there are 4000 faces in front alignment. Every kind of people is shot in two stages, and 13 images are shot in each stage, wherein 4 images are changed in illumination, 3 images are changed in expression, 3 images are blocked by glasses, and 3 images are blocked by a neckerchief. The invention selects 100 people to carry out experiment, and cuts and normalizes the image, the size after cutting is 120 multiplied by 100. Fig. 4 is a partial sample image in the AR face library.
The design of the training sample set, the standard sample set, and the test sample set on the AR database is as follows:
(1) The first frontal interference-free image of each of the 100 subjects in the first session forms the standard sample set, 100 samples in total;
(2) the remaining 12 interference images of the 100 subjects in the first session form the training sample set, 1200 in total;
(3) all images of the 100 subjects in the second session form the test sample set, 1300 samples in total.
Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 2. The recognition performance of the invention is comparable to KED, with a mixed recognition rate of 99.15%. The invention also performs best on the illumination, expression, glasses, and scarf subsets (except the scarf occlusion subset, where it is slightly below KED), which fully shows that the sample dictionary and occlusion dictionary designed by the invention are robust to different kinds of interference.
TABLE 2: recognition rate on the AR database (%)
Example 3: experiments were performed in the Extended Yale B database:
the Extended Yale B database contains face elevation views of 38 persons collected under different lighting conditions, and about 64 images of each person, for a total of 2414 samples. FIG. 5 is a partial sample image from Extended Yale B library and an image with 20% occlusion added.
The design of the training sample set, the standard sample set and the test sample set on the Extended Yale B database is as follows:
(1) 7 illumination images of each subject class form the training sample set, 266 samples in total;
(2) one frontal image without illumination interference of each subject class forms the standard sample set, 38 samples in total;
(3) the remaining samples, after the standard and training samples are removed, form the test sample set, 2110 samples in total.
Five experiments are set up on the test sample set. Experiment one uses the original images in the database without any added occlusion. Experiments two to five use images with randomly positioned occlusion blocks whose area accounts for 20%, 30%, 40%, and 50% of the total image area, respectively. Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 3.
As Table 3 shows, as the occlusion block grows, the recognition rates of all methods fall to different degrees, but the invention's falls the least; in particular, when the occlusion block covers 50% of the total area, the recognition rates of SRC and ESRC are only about 20%, while the invention still maintains 76.92%. This indicates that the invention can effectively eliminate the mixed influence of illumination occlusion and large-area block occlusion and has better robustness.
TABLE 3: recognition rate on the Extended Yale B database (%)
The invention has been simulated on the CAS-PEAL, AR, and Extended Yale B databases, and the experimental results show that, compared with the prior art, the innovations of the invention are effective and feasible for the occluded face recognition problem, summarized as follows:
1. The invention abandons the traditional strategy of constructing the dictionary in the original image space and improves dictionary construction based on the KDA low-dimensional discriminative feature space. On one hand, KDA dimensionality reduction of the original data effectively eliminates redundant inter-pixel information and makes the low-dimensional data more discriminative; on the other hand, the improved dictionary construction makes the sample dictionary more discriminative and the occlusion dictionary more representative, which favors accurate recognition of occluded faces.
2. When constructing the occlusion dictionary, the occlusion information contained in the training samples is considered while the occlusion information contained in the test samples is extracted adaptively, overcoming the limitation that in practical applications the occlusion in test samples may differ from that in training samples; the occlusion dictionary constructed this way is more complete.
3. The method is not constrained in sample selection, feature extraction, or similar aspects, and its implementation steps are simple, so it is easier to use and more feasible than the prior art. Meanwhile, the invention processes the dimension-reduced data, so the system is computationally efficient and of practical value.
The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which are presented in the specification merely to illustrate the principles of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention, and these fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (3)

1. An occluded face recognition method based on sparse representation with a kernel extension dictionary, characterized by comprising the following steps:
step (S1): construct a training sample set X, and learn from X with the KDA algorithm to obtain a KDA projection matrix, denoted P below;
Step (S2): constructing a standard sample set N, performing projection dimensionality reduction on the standard sample set N by adopting a KDA algorithm according to a formula (3) to obtain a low-dimensional basic dictionary A,
Figure FDA0003586651620000012
phi (N) represents that data is subjected to high-dimensional mapping by adopting a nonlinear kernel function, the nonlinear kernel function is a radial basis kernel function, N represents a standard sample set, and T represents transposition operation of a matrix;
step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D; this step specifically comprises the following substeps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X, use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested;
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2];
step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested; this step specifically comprises:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
2. The occluded face recognition method based on sparse representation with a kernel extension dictionary according to claim 1, characterized in that: in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
3. The occluded face recognition method based on sparse representation with a kernel extension dictionary according to claim 1, characterized in that: in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
CN202110319464.9A 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary Active CN112966649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110319464.9A CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110319464.9A CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Publications (2)

Publication Number Publication Date
CN112966649A CN112966649A (en) 2021-06-15
CN112966649B (en) 2022-06-03

Family

ID=76278404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110319464.9A Active CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Country Status (1)

Country Link
CN (1) CN112966649B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269137B (en) * 2021-06-18 2023-10-31 常州信息职业技术学院 Non-matching face recognition method combining PCANet and occlusion localization


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025444A (en) * 2017-04-08 2017-08-08 华南理工大学 Block-wise collaborative representation embedded kernel sparse representation method and device for occluded face recognition
WO2021003637A1 (en) * 2019-07-08 2021-01-14 深圳大学 Kernel non-negative matrix factorization face recognition method, device and system based on additive gaussian kernel, and storage medium
CN111723759A (en) * 2020-06-28 2020-09-29 南京工程学院 Non-constrained face recognition method based on weighted tensor sparse graph mapping

Also Published As

Publication number Publication date
CN112966649A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN112580590B (en) Finger vein recognition method based on multi-semantic feature fusion network
CN102682302B (en) Human body posture identification method based on multi-characteristic fusion of key frame
CN110659665B (en) Model construction method of different-dimension characteristics and image recognition method and device
CN108446589B (en) Face recognition method based on low-rank decomposition and auxiliary dictionary in complex environment
CN111126240B (en) Three-channel feature fusion face recognition method
CN106845551B (en) Tissue pathology image identification method
CN109241813B (en) Non-constrained face image dimension reduction method based on discrimination sparse preservation embedding
CN104077742B (en) Human face sketch synthetic method and system based on Gabor characteristic
WO2022178978A1 (en) Data dimensionality reduction method based on maximum ratio and linear discriminant analysis
CN110796022B (en) Low-resolution face recognition method based on multi-manifold coupling mapping
CN107403153A (en) A kind of palmprint image recognition methods encoded based on convolutional neural networks and Hash
Paul et al. Extraction of facial feature points using cumulative histogram
CN111695455B (en) Low-resolution face recognition method based on coupling discrimination manifold alignment
CN114445715A (en) Crop disease identification method based on convolutional neural network
CN112966649B (en) Occlusion face recognition method based on sparse representation of kernel extension dictionary
CN108932501A (en) A kind of face identification method being associated with integrated dimensionality reduction based on multicore
CN111325275A (en) Robust image classification method and device based on low-rank two-dimensional local discriminant map embedding
CN108319891A (en) Face feature extraction method based on sparse expression and improved LDA
CN114937298A (en) Micro-expression recognition method based on feature decoupling
CN112183504B (en) Video registration method and device based on non-contact palm vein image
CN111723759B (en) Unconstrained face recognition method based on weighted tensor sparse graph mapping
CN111611963B (en) Face recognition method based on neighbor preservation canonical correlation analysis
CN112966648B (en) Occlusion face recognition method based on sparse representation of kernel expansion block dictionary
CN115439930A (en) Multi-feature fusion gait recognition method based on space-time dimension screening
Wang et al. Feature extraction method of face image texture spectrum based on a deep learning algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant