CN112966649B - Occlusion face recognition method based on sparse representation of kernel extension dictionary

Occlusion face recognition method based on sparse representation of kernel extension dictionary

Info

Publication number
CN112966649B
Authority
CN
China
Prior art keywords
sample
dictionary
occlusion
sample set
samples
Prior art date
Legal status
Active
Application number
CN202110319464.9A
Other languages
Chinese (zh)
Other versions
CN112966649A (en)
Inventor
童莹
马杲东
陈瑞
曹雪虹
赵小燕
Current Assignee
Nanjing Institute of Technology
Original Assignee
Nanjing Institute of Technology
Priority date
Filing date
Publication date
Application filed by Nanjing Institute of Technology
Priority to CN202110319464.9A
Publication of CN112966649A
Application granted
Publication of CN112966649B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; Face representation
    • G06V 40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/374: Thesaurus
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2132: Feature extraction based on discrimination criteria, e.g. discriminant analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2135: Feature extraction based on approximation criteria, e.g. principal component analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/60: Analysis of geometric attributes
    • G06T 7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30196: Human being; Person
    • G06T 2207/30201: Face

Abstract

The invention discloses an occlusion face recognition method based on sparse representation with a kernel extended dictionary, comprising the following steps. Step (S1): construct a training sample set X. Step (S2): construct a standard sample set N. Step (S3): construct a test sample set Y. Step (S4): construct an occlusion dictionary D1 of the training sample set X and an occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D. Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, thereby achieving occlusion face recognition. The method eliminates redundant pixel information from the sample dictionary and yields a more discriminative and representative dictionary, so that the sample dictionary contains only facial structure features, free of redundant pixel and interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, free of facial structure features; combining the two improves the accuracy of occlusion face recognition.

Description

Occlusion face recognition method based on sparse representation of kernel extension dictionary
Technical Field
The invention relates to the technical field of personal identity verification and recognition when the face is occluded in human-computer interaction, and in particular to an occluded face recognition method based on sparse representation with a kernel extension dictionary.
Background
In recent years, thanks to advances in artificial intelligence, computer vision, and Internet-of-Things communication, face recognition has been widely applied in daily life, for example in smart home appliances, smart retail, and smart access control. However, these applications all require the target to remain frontal and unoccluded. In practice, the face of the target is often occluded by accessories such as scarves, hats, masks, and glasses, or affected by illumination, which lowers the accuracy of identity verification. How to eliminate the interference of these occlusion factors and improve recognition accuracy in practical applications has therefore become the key technical difficulty of occluded face recognition.
In 2009, Wright et al. first applied Sparse Representation (SR) theory to face recognition and proposed sparse representation-based classification (SRC). The algorithm builds a dictionary from the training samples and constrains the L1 norm of the coding coefficients of the sample to be tested to be minimal, the aim being to select a minimal subset of the sample dictionary to linearly represent the sample to be tested; the residual between the sample to be tested and each class's reconstruction is then computed, and the class with the minimum residual is assigned. The objective function of SRC is shown in formula (1), where A is the sample dictionary, y is the sample to be tested, and α is the sparse coding coefficient.
min_α ||α||_1   s.t.   ||y - Aα||_2 ≤ ε   (1)
According to sparse regularization theory, as long as the sample dictionary A in formula (1) contains sufficiently rich sample atoms, that is, atoms describing the various interference conditions that may occur in face recognition, the sample to be tested can be linearly and sparsely reconstructed from the sample dictionary without distortion. In practice, however, the collected face images are affected by interference factors such as age, ambient illumination, facial expression, accessory occlusion, and head pose, so the atoms of the sample dictionary A cannot cover every possible variation. Moreover, under occlusion part of the facial information is lost, so an occluded sample to be tested cannot be accurately represented linearly by the occluded samples and is easily misclassified.
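For illustration only (this code is not part of the patent), the following is a minimal Python sketch of SRC classification per formula (1); it assumes NumPy and scikit-learn, relaxes the ε-constrained problem to the usual Lasso form, and the array names A, labels, and y are hypothetical:

import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, lam=0.01):
    """Sparse-representation classification in the spirit of formula (1).

    A      : (d, n) sample dictionary, one training image per column.
    labels : (n,) class label of each column of A.
    y      : (d,) sample to be tested.
    lam    : L1 weight (Lasso relaxation of the epsilon constraint).
    """
    # Solve min ||y - A a||_2^2 + lam * ||a||_1 (relaxation of (1)).
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    solver.fit(A, y)
    a = solver.coef_
    # Residual of each class's reconstruction; assign the minimum.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A[:, labels == c] @ a[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]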
It is therefore inappropriate to represent the sample to be tested with the single sample dictionary A alone. Wright accordingly introduced the identity matrix I into the SRC model as an extended noise dictionary, with the aim of separating interference factors from the original image: the essential structural features of the face are represented by the sample dictionary, while external interference factors are represented by the noise dictionary, further improving the accuracy of the linear sparse representation of the sample to be tested. The objective function is shown in formula (2).
min_{α,e} ||[α; e]||_1   s.t.   y = [A, I][α; e]   (2)
In 2012, Deng improved SRC and proposed the extended sparse representation-based classifier (ESRC). The method subtracts standard samples (frontal, interference-free face images) from variation samples (face images with occlusion, expression, illumination, or other interference) and builds an intra-class difference dictionary V from these differences, replacing the identity interference matrix I of traditional SRC. Compared with traditional SRC, the intra-class difference dictionary V of ESRC carries richer interference information and, combined with the sample dictionary A, describes the sample to be tested more accurately. In 2016, Chen proposed an Adaptive Noise Dictionary (AND), which first uses an iteratively reweighted robust principal component analysis method to adaptively extract the various kinds of occlusion information that may exist in the sample to be tested, and then combines them with unoccluded training samples to achieve an accurate linear representation of the sample to be tested.
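As a concrete reading of the ESRC construction described above, here is a hedged NumPy sketch; the array names and the pairing of each variation sample with its subject's standard sample are assumptions, not details taken from the cited work:

import numpy as np

def build_intra_class_difference_dictionary(variation, subject_ids, standard):
    """variation   : (d, n_v) variation samples, one per column.
    subject_ids : (n_v,) subject index of each variation sample.
    standard    : (d, m) one frontal interference-free image per subject."""
    # Each column of V is a variation sample minus the same subject's standard sample.
    V = variation - standard[:, subject_ids]
    return V

# The extended dictionary used in place of [A, I] of traditional SRC would then be
# B = np.hstack([A, V]).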
Analysing the implementation principles of SRC and its improved variants shows that these methods all build on formula (2): their main purpose is to obtain an accurate extended noise dictionary by different means and to separate the interference information from the original face image, thereby further improving the accuracy of the linear sparse representation of the sample to be tested. Although some success has been achieved in occluded face recognition, the following problems remain:
1. atoms in the sample dictionary A are all represented by original images, so that a sample dictionary constructed based on the images has a large amount of pixel redundant information, the like atoms lack consistency, and the heterogeneous atoms lack discriminability; meanwhile, the dictionary atoms are represented by converting two-dimensional images into one-dimensional column vectors, so that the dimensionality of the dictionary atoms is far greater than the number of the atoms, the problem of 'small samples' is easy to occur, and the optimal sparse solution cannot be obtained in a solution space;
2. The above ways of constructing the extended noise dictionary V are still limited. For example, the traditional SRC algorithm uses the identity matrix I as the extended noise dictionary and can only describe discontinuous single-interference problems such as random pixel corruption and small-scale unrealistic occlusion. Although the intra-class difference dictionary of ESRC is more effective than the identity matrix, it mixes the occlusion information and non-occlusion information of the samples, so its linear sparse representation of occluded samples is poor. Meanwhile, the intra-class difference dictionary of ESRC is obtained by subtracting standard samples from variation samples and therefore contains a large amount of redundant pixel information. If the variation samples are not chosen richly enough, the intra-class difference information obtained is also insufficient, which degrades face recognition accuracy.
Disclosure of Invention
The invention aims to solve the problems of prior-art occlusion face recognition algorithms based on sparse representation. The method improves the construction of the sample dictionary and of the extended noise dictionary in the sparse representation model, eliminating redundant pixel information from the sample dictionary and obtaining a more discriminative and representative dictionary, so that the sample dictionary contains only facial structure features, with no redundant pixel or interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, with no facial structure features; combining the two effectively improves the accuracy of occlusion face recognition.
To achieve this purpose, the invention adopts the following technical scheme:
A method for recognizing an occluded face based on sparse representation of a kernel extension dictionary comprises the following steps.
Step (S1): construct a training sample set X, and learn from X with the Kernel Discriminant Analysis (KDA) algorithm to obtain a KDA projection matrix, denoted P below;
Step (S2): constructing a standard sample set N, performing projection dimensionality reduction on the standard sample set N by adopting a KDA algorithm according to a formula (3) to obtain a low-dimensional basic dictionary A,
Figure BDA0002992548400000042
phi (N) represents a radial basis kernel function, N represents a standard sample set, and T represents the transposition operation of a matrix;
Step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
Step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D;
Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested.
Preferably, in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Preferably, in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
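To make steps (S1)-(S2) concrete, the following hedged Python sketch implements one standard formulation of kernel discriminant analysis (a generalized eigenproblem between kernel-space scatter matrices) together with the basic dictionary A of formula (3). It is not the patent's code: the projection matrix P is realized implicitly as Φ(X)·Alpha through kernel evaluations, and the kernel width gamma, the regularizer eps, and all function names are assumptions.

import numpy as np
from scipy.linalg import eigh

def rbf_kernel(X1, X2, gamma=1e-3):
    # K[i, j] = exp(-gamma * ||X1[:, i] - X2[:, j]||^2); columns are samples.
    # gamma is data-dependent and would be tuned for real face images.
    sq = (np.sum(X1**2, 0)[:, None] + np.sum(X2**2, 0)[None, :]
          - 2.0 * X1.T @ X2)
    return np.exp(-gamma * sq)

def kda_fit(X, labels, gamma=1e-3, eps=1e-6):
    """Learn KDA coefficients Alpha (n x (c-1)) from a training set X (d x n)."""
    n = X.shape[1]
    K = rbf_kernel(X, X, gamma)
    classes = np.unique(labels)
    mean_all = K.mean(axis=1)
    S_b = np.zeros((n, n))          # between-class scatter in kernel space
    S_w = np.zeros((n, n))          # within-class scatter in kernel space
    for c in classes:
        idx = np.where(labels == c)[0]
        Kc = K[:, idx]
        mean_c = Kc.mean(axis=1)
        diff = mean_c - mean_all
        S_b += len(idx) * np.outer(diff, diff)
        centered = Kc - mean_c[:, None]
        S_w += centered @ centered.T
    # Generalized eigenproblem; keep the c-1 leading eigenvectors.
    vals, vecs = eigh(S_b, S_w + eps * np.eye(n))
    Alpha = vecs[:, -(len(classes) - 1):]
    return Alpha

def kda_project(X_train, Alpha, Z, gamma=1e-3):
    # Low-dimensional projection P^T Phi(Z), computed via the kernel trick.
    return Alpha.T @ rbf_kernel(X_train, Z, gamma)

# Formula (3): low-dimensional basic dictionary from the standard set N.
# A = kda_project(X_train, Alpha, N)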
Preferably, step (S4) comprises the following steps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X, use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested (a code sketch follows step (S45));
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2].
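The following hedged Python sketch walks through steps (S41)-(S45), reusing the hypothetical kda_project and Alpha from the earlier KDA sketch. Robust principal component analysis is implemented with a compact inexact-ALM solver, and the patent's use of RPCA is read here as decomposing [X_i, y] into a low-rank part plus a sparse part whose last column serves as the occlusion estimate b_i; that reading, and every name below, is an assumption rather than the patent's definitive procedure (the defaults k = 0.1 and l = 5 mirror the values used in the embodiments).

import numpy as np

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Robust PCA, M = L + S, via the inexact augmented Lagrange multiplier method."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(M, 2)
    Y = M / max(norm2, np.linalg.norm(M.ravel(), np.inf) / lam)
    mu, rho = 1.25 / norm2, 1.5
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular value thresholding recovers the low-rank part L.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # Soft thresholding recovers the sparse part S.
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)
        Z = M - L - S
        Y += mu * Z
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(Z, 'fro') <= tol * np.linalg.norm(M, 'fro'):
            break
    return L, S

def occlusion_dictionary(X_train, Alpha, class_subsets, X_O, X_N, Y_test, k=0.1, l=5):
    # (S41): D1 from projected occluded-minus-standard training pairs, formula (4).
    D1 = kda_project(X_train, Alpha, X_O) - kda_project(X_train, Alpha, X_N)
    cols = []
    rng = np.random.default_rng(0)
    for y in Y_test.T:
        # (S42): average the sparse components of y over l random class subsets.
        picks = rng.choice(len(class_subsets), size=l, replace=False)
        bs = [rpca_ialm(np.column_stack([class_subsets[i], y]))[1][:, -1]
              for i in picks]
        b = np.mean(bs, axis=0)
        # (S43): approximate sample y* = y - k*b, formula (5); project both, formula (6).
        y_star = y - k * b
        do = (kda_project(X_train, Alpha, y[:, None])
              - kda_project(X_train, Alpha, y_star[:, None]))
        cols.append(do[:, 0])
    D2 = np.column_stack(cols)          # (S44): adaptive dictionary of the test set
    return np.hstack([D1, D2])          # (S45): mixed complete dictionary D = [D1, D2]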
Preferably, in step (S5), performing linear sparse representation classification on the sample to be tested with the SRC model according to the mixed complete occlusion dictionary D comprises the following steps:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
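A hedged Python sketch of steps (S51)-(S52) follows, using the hypothetical arrays from the sketches above (A from formula (3), D from steps (S41)-(S45), y_kda the KDA projection of the sample to be tested, and atom_labels the class of each column of A); scikit-learn's Lasso stands in as one convenient solver of the λ-penalized problem of formula (7):

import numpy as np
from sklearn.linear_model import Lasso

def classify_with_mixed_dictionary(A, D, atom_labels, y_kda, lam=0.001):
    """Formulas (7)-(9): sparse coding over [A, D], then per-class residuals."""
    B = np.hstack([A, D])
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    solver.fit(B, y_kda)
    omega = solver.coef_
    beta, beta_tilde = omega[:A.shape[1]], omega[A.shape[1]:]
    occ_part = D @ beta_tilde                    # contribution of the occlusion dictionary
    classes = np.unique(atom_labels)
    residuals = []
    for c in classes:
        delta_j = np.where(atom_labels == c, beta, 0.0)   # delta_j(beta) of formula (8)
        residuals.append(np.linalg.norm(y_kda - A @ delta_j - occ_part))
    return classes[int(np.argmin(residuals))]             # formula (9)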
The beneficial effects of the invention are:
1. The invention abandons the traditional strategy of constructing the dictionary in the original image space and instead improves dictionary construction in a low-dimensional discriminative feature space, eliminating redundant pixel information and obtaining a more discriminative and representative dictionary.
2. Because face images collected in real environments are distributed on a nonlinear complex manifold in sample space, traditional linear dimensionality reduction methods such as Linear Discriminant Analysis (LDA) cannot handle the nonlinearly inseparable case; the invention therefore uses the Kernel Discriminant Analysis (KDA) algorithm to compute the optimal low-dimensional projection directions of the original image space and obtain a more discriminative low-dimensional subspace.
3. The invention improves the construction of the sample dictionary in the KDA low-dimensional projection subspace, removing redundant inter-pixel information from the original images, improving the discriminability of dictionary atoms, reducing atom dimensionality, raising the computational efficiency of the model, and making the optimal sparse solution attainable in the solution space.
4. The invention first improves the construction of the training samples' occlusion dictionary in the KDA low-dimensional projection subspace, eliminating redundant inter-pixel information and facial structure features so that the training occlusion dictionary is more representative. Meanwhile, the occlusion information of the samples to be tested is extracted in the KDA low-dimensional projection subspace with the robust principal component analysis algorithm to supplement the occlusion dictionary, making the occlusion information complete and adaptive.
In conclusion, the invention improves the construction of the sample dictionary and the occlusion dictionary in the KDA low-dimensional projection subspace, so that the sample dictionary contains only facial structure features, with no redundant pixel or interference information, while the occlusion dictionary contains only the occlusion information of the training and test samples, with no facial structure features; combining the two effectively improves the accuracy of occlusion face recognition.
Drawings
FIG. 1 is a block diagram of a flow implementation of the method for recognizing an occluded face based on sparse representation of a kernel extended dictionary according to the present invention;
FIG. 2 shows sample face images from part of the CAS-PEAL library;
FIG. 3 is a graph of the mixed recognition rate of the invention for different values of k;
FIG. 4 shows sample face images of one subject class in the AR library;
FIG. 5 shows sample images from part of the Extended Yale B database.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
The invention is evaluated on three face databases, the CAS-PEAL library, the AR library, and Extended Yale B; the experimental environment is a Windows 10 64-bit operating system, 8 GB of memory, and the Matlab R2017a simulation platform.
As shown in FIG. 1, the occluded face recognition method based on sparse representation with a kernel extended dictionary of the invention comprises the following steps.
Step (S1): construct a training sample set X, and learn from X with the KDA algorithm to obtain the KDA projection matrix P. The KDA algorithm referred to in the invention is the Kernel Discriminant Analysis (KDA) algorithm;
Step (S2): construct a standard sample set N, and perform projection dimensionality reduction on N with the KDA algorithm according to formula (3) to obtain the low-dimensional basic dictionary A:

A = P^T Φ(N)   (3)

where Φ(N) denotes the high-dimensional mapping of N by a radial basis kernel function, N is the standard sample set, and T denotes matrix transposition;
Step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
Step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D;
Step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested.
Further, in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Further, in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
Further, step (S4) comprises the following steps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X (in this embodiment l = 5, in view of the computational complexity of the algorithm), use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested;
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2].
Further, in step (S5), performing linear sparse representation classification on the sample to be tested with the SRC model according to the mixed complete occlusion dictionary D comprises the following steps:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
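Putting the preceding sketches together, a toy end-to-end run might look as follows; the synthetic data are stand-ins for real face matrices, the function definitions from the sketches above are assumed to be in scope, and the toy pairing of X_O with N exists only for shape compatibility:

import numpy as np

rng = np.random.default_rng(1)
d, c, per = 64, 5, 6
X = rng.normal(size=(d, c * per))              # synthetic training set (d x n)
labels = np.repeat(np.arange(c), per)
N = rng.normal(size=(d, c))                    # one "standard" image per class
atom_labels = np.arange(c)                     # class of each column of A
class_subsets = [X[:, labels == j] for j in range(c)]
X_O, X_N = X[:, :c], N                         # toy occluded/standard pairs
Y_test = rng.normal(size=(d, 3))

Alpha = kda_fit(X, labels)                                     # step (S1)
A = kda_project(X, Alpha, N)                                   # step (S2), formula (3)
D = occlusion_dictionary(X, Alpha, class_subsets, X_O, X_N,
                         Y_test, k=0.1, l=3)                   # step (S4)
for y in Y_test.T:                                             # step (S5)
    y_kda = kda_project(X, Alpha, y[:, None])[:, 0]
    print(classify_with_mixed_dictionary(A, D, atom_labels, y_kda))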
Example 1: experiments were performed in the CAS-PEAL database:
the CAS-PEAL face database contains 1040 people, a total of 99594 face images (including 595 men and 445 women). All images are collected in a special collection environment, 4 main change conditions of postures, expressions, ornaments and illumination are covered, and part of face images have changes of backgrounds, distances and time spans. The invention selects 9031 images to carry out the experiment, and partial sample images are shown in figure 2.
The design of the training sample set, the standard sample set, and the test sample set on the CAS-PEAL database is as follows:
(1) The training sample set with rich intra-class variation contains 200 subjects with illumination variation, 100 with expression variation, and 20 with accessory occlusion, 4 images per subject, 1280 variation samples in total. The training sample set additionally includes 1 frontal interference-free image of each subject class, 273 standard samples in total; the two parts together form the training sample set.
(2) The standard sample set contains all 1040 subjects of the CAS-PEAL database, one frontal interference-free image per subject, 1040 samples in total.
(3) The test sample set consists of the remaining samples after the training and standard samples are removed; it contains 6711 samples covering 6 subsets: accessory occlusion, illumination, expression, distance (different shooting distances), time (different shooting intervals), and background (different shooting backgrounds).
Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 1. The ESRC, KDA, and KED algorithms all perform well on the expression, time, background, and distance subsets, with recognition rates close to 99% and some even reaching 100%, but the results differ greatly on the accessory occlusion and illumination subsets; in particular, the SRC algorithm reaches only 17.32% on the illumination subset, showing that describing illumination variation with the identity matrix is inappropriate, and the occlusion dictionaries of the other algorithms also need improvement.
The invention (k = 0.1) achieves the best recognition on the six interference subsets (except the background subset, where it is slightly below KED); in particular, on the accessory occlusion and illumination subsets the recognition rates reach 92.29% and 86.55%, respectively, 1.77% and 3.52% higher than the second-best KED algorithm. The improvements of the invention to the sample dictionary and the extended noise dictionary thus help improve the accuracy of the linear representation of occluded face images.
The mixed recognition rate in Table 1 refers to recognizing the samples of the six interference subsets mixed together. The recognition rate of the invention is again the best, reaching 93.97%, which is 25.23%, 3.39%, 6.23%, and 1.54% higher than SRC, ESRC, KDA, and KED, respectively. This further shows that the invention is robust and adaptive to the various interference factors present in face recognition.
TABLE 1: recognition results on the CAS-PEAL database (%)
To illustrate the effect of the coefficient k in formula (5), we let k take values in [0, 1]. FIG. 3 shows the recognition results on the mixed-interference face images for different values of k. When k = 0 the recognition rate is 92.92%, when k = 0.1 it is 93.97%, and when k = 1 it falls to 92.04%. This shows that when k = 0 the occlusion dictionary contains only the occlusion information of the training samples, with no occlusion information from the samples to be tested, which harms recognition. When k is large, e.g. k = 1, the approximate sample y* computed by formula (5) loses more facial structure features; after nonlinear mapping to the high-dimensional space, y* deviates greatly from the sample Φ(y) to be tested, so the difference of their projections in the KDA low-dimensional space cannot represent the occlusion information of the sample well and recognition accuracy drops. Only when k is small, e.g. k = 0.1, does the approximate sample y* both remove part of the occlusion and retain more facial structure features, so that the projection difference between y* and the sample to be tested in the KDA low-dimensional space effectively represents the sample's occlusion information and improves recognition.
Example 2: experiments were performed in the AR database:
the AR face database contains 126 classes of people (56 women, 70 men), and there are 4000 faces in front alignment. Every kind of people is shot in two stages, and 13 images are shot in each stage, wherein 4 images are changed in illumination, 3 images are changed in expression, 3 images are blocked by glasses, and 3 images are blocked by a neckerchief. The invention selects 100 people to carry out experiment, and cuts and normalizes the image, the size after cutting is 120 multiplied by 100. Fig. 4 is a partial sample image in the AR face library.
The design of the training sample set, the standard sample set, and the test sample set on the AR database is as follows:
(1) The first frontal interference-free image of each of the 100 subjects in the first session forms the standard sample set, 100 samples in total;
(2) the remaining 12 interference images of the 100 subjects in the first session form the training sample set, 1200 in total;
(3) all images of the 100 subjects in the second session form the test sample set, 1300 samples in total.
Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 2. The recognition performance of the invention is comparable to KED, with a mixed recognition rate of 99.15%. The invention also performs best on the illumination, expression, glasses, and scarf subsets (except the scarf occlusion subset, where it is slightly below KED), which fully shows that the sample dictionary and occlusion dictionary designed by the invention are robust to different kinds of interference.
TABLE 2: recognition rate on the AR database (%)
Example 3: experiments were performed in the Extended Yale B database:
the Extended Yale B database contains face elevation views of 38 persons collected under different lighting conditions, and about 64 images of each person, for a total of 2414 samples. FIG. 5 is a partial sample image from Extended Yale B library and an image with 20% occlusion added.
The design of the training sample set, the standard sample set and the test sample set on the Extended Yale B database is as follows:
(1) 7 illumination images of each subject class form the training sample set, 266 samples in total;
(2) one frontal image without illumination interference of each subject class forms the standard sample set, 38 samples in total;
(3) the remaining samples, after the standard and training samples are removed, form the test sample set, 2110 samples in total.
Five experiments are set up on the test sample set. Experiment one uses the original images in the database without any added occlusion. Experiments two to five use images with randomly positioned occlusion blocks whose area accounts for 20%, 30%, 40%, and 50% of the total image area, respectively. Based on this design of the sample sets, the experimental results of the invention and of SRC, ESRC, KDA, and KED are shown in Table 3.
As Table 3 shows, as the occlusion block grows, the recognition rates of all methods fall to different degrees, but the invention's falls the least; in particular, when the occlusion block covers 50% of the total area, the recognition rates of SRC and ESRC are only about 20%, while the invention still maintains 76.92%. This indicates that the invention can effectively eliminate the mixed influence of illumination occlusion and large-area block occlusion and has better robustness.
TABLE 3: recognition rate on the Extended Yale B database (%)
The invention has been simulated on the CAS-PEAL, AR, and Extended Yale B databases, and the experimental results show that, compared with the prior art, the innovations of the invention are effective and feasible for the occluded face recognition problem, summarized as follows:
1. The invention abandons the traditional strategy of constructing the dictionary in the original image space and improves dictionary construction based on the KDA low-dimensional discriminative feature space. On one hand, KDA dimensionality reduction of the original data effectively eliminates redundant inter-pixel information and makes the low-dimensional data more discriminative; on the other hand, the improved dictionary construction makes the sample dictionary more discriminative and the occlusion dictionary more representative, which favors accurate recognition of occluded faces.
2. When constructing the occlusion dictionary, the occlusion information contained in the training samples is considered while the occlusion information contained in the test samples is extracted adaptively, overcoming the limitation that in practical applications the occlusion in test samples may differ from that in training samples; the occlusion dictionary constructed this way is more complete.
3. The method is not constrained in sample selection, feature extraction, or similar aspects, and its implementation steps are simple, so it is easier to use and more feasible than the prior art. Meanwhile, the invention processes the dimension-reduced data, so the system is computationally efficient and of practical value.
The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which are presented in the specification merely to illustrate the principles of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention, and these fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (3)

1. An occluded face recognition method based on sparse representation with a kernel extension dictionary, characterized by comprising the following steps:
step (S1): construct a training sample set X, and learn from X with the KDA algorithm to obtain a KDA projection matrix, denoted P below;
Step (S2): constructing a standard sample set N, performing projection dimensionality reduction on the standard sample set N by adopting a KDA algorithm according to a formula (3) to obtain a low-dimensional basic dictionary A,
Figure FDA0003586651620000012
phi (N) represents that data is subjected to high-dimensional mapping by adopting a nonlinear kernel function, the nonlinear kernel function is a radial basis kernel function, N represents a standard sample set, and T represents transposition operation of a matrix;
step (S3): construct a test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^),

where R is the set of real numbers, d is the column-vector dimension of a sample, and n^ is the number of samples; Y ∈ R^(d×n^) means that the test sample set Y contains n^ samples, each represented by a d-dimensional column vector whose elements take values in the real number set R;
step (S4): use the KDA algorithm to extract the occlusion information of the training samples in the training sample set X and of the samples to be tested in the test sample set Y, and construct the occlusion dictionary D1 of the training sample set X and the occlusion dictionary D2 of the test sample set Y to obtain a mixed complete occlusion dictionary D; this step specifically comprises the following substeps:
(S41): using the KDA projection matrix P, map the occlusion sample subset X_O of the training sample set X and the corresponding standard sample subset X_N to the low-dimensional space, then subtract the two low-dimensional vectors to obtain the occlusion dictionary D1 of the training samples, as in formula (4):

D1 = P^T Φ(X_O) - P^T Φ(X_N)   (4)
(S42): for any sample to be tested y ∈ R^(d×1) from the test sample set Y = [y_1, y_2, ..., y_n^] ∈ R^(d×n^), randomly select l class subsets X_1, X_2, ..., X_l from the training sample set X, use the robust principal component analysis algorithm to compute the occlusion information b_1, b_2, ..., b_l of y over these training subsets, and take the mean of the l occlusion estimates as the occlusion information b of the sample y to be tested;
(S43): construct the approximate sample y* of the sample y to be tested by formula (5), then use the KDA projection matrix P to project y and y* to the low-dimensional space, obtaining the low-dimensional vectors y_KDA and y*_KDA; subtracting the two gives the low-dimensional occlusion information do of the sample y to be tested, as in formula (6):

y* = y - k·b   (5)

where k is a coefficient and b is the occlusion information of the sample y to be tested;

do = y_KDA - y*_KDA = P^T Φ(y) - P^T Φ(y*)   (6)
(S44): repeat steps (S42)-(S43) to compute the occlusion information of all samples to be tested, and construct the adaptive occlusion dictionary D2 = [do_1, do_2, ..., do_n^] of the test sample set Y;
(S45): combine the occlusion dictionary D1 of the training sample set X with the occlusion dictionary D2 of the test sample set Y to obtain the mixed complete occlusion dictionary D = [D1, D2];
step (S5): according to the mixed complete occlusion dictionary D, perform linear sparse representation classification on the sample to be tested with the SRC model, achieving occlusion face recognition of the sample to be tested; this step specifically comprises:
(S51): solve the sparse coding coefficients of the sample to be tested y by optimizing the following SRC objective function, formula (7):

min_ω ||y_KDA - [A, D]ω||_2^2 + λ||ω||_1,   ω = [β; β̃]   (7)

where A is the low-dimensional basic dictionary, D is the mixed complete occlusion dictionary, β is the coding coefficient corresponding to the low-dimensional basic dictionary A, β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, and λ is a regularization coefficient;
(S52): compute the residual between the sample to be tested and each class's reconstruction by formula (8), and finally assign y to the class with the minimum residual by formula (9):

e_j = ||y_KDA - A·δ_j(β) - D·β̃||_2,   j = 1, 2, ..., c   (8)

identity(y) = argmin_j e_j   (9)

where δ_j(β) selects from β the coefficients associated with class j (j = 1, 2, ..., c), β̃ is the coding coefficient corresponding to the mixed complete occlusion dictionary D, e_j is the residual between the sample y to be tested and the class-j reconstruction (j = 1, 2, ..., c), and identity(y) is the label assigned to the sample y to be tested, namely the label of the minimum e_j.
2. The occluded face recognition method based on sparse representation with a kernel extension dictionary according to claim 1, characterized in that: in step (S1), the training sample set X = [X_1, X_2, ..., X_c] = [x_1, x_2, ..., x_n] ∈ R^(d×n) contains expression, illumination, and occlusion samples with rich intra-class variation; learning the high-dimensional spatial distribution of the training sample set yields a KDA projection matrix P with c-1 projection vectors, where c is the number of classes in the training sample set, X_1, X_2, ..., X_c are the c class subsets, R is the set of real numbers, d is the column-vector dimension of a sample, and n is the number of samples; X ∈ R^(d×n) means that the training sample set X contains n samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
3. The occluded face recognition method based on sparse representation with a kernel extension dictionary according to claim 1, characterized in that: in step (S2), the constructed standard sample set is N = [x_1, x_2, ..., x_m] ∈ R^(d×m), one interference-free frontal face image taken from each of m subjects, where d is the column-vector dimension of a sample and m is the number of samples; N ∈ R^(d×m) means that the standard sample set N contains m samples, each represented by a d-dimensional column vector whose elements take values in the real number set R.
CN202110319464.9A 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary Active CN112966649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110319464.9A CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110319464.9A CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Publications (2)

Publication Number Publication Date
CN112966649A CN112966649A (en) 2021-06-15
CN112966649B (en) 2022-06-03

Family

ID=76278404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110319464.9A Active CN112966649B (en) 2021-03-25 2021-03-25 Occlusion face recognition method based on sparse representation of kernel extension dictionary

Country Status (1)

Country Link
CN (1) CN112966649B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269137B (en) * 2021-06-18 2023-10-31 常州信息职业技术学院 Non-matching face recognition method combining PCANet and occlusion localization


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025444A (en) * 2017-04-08 2017-08-08 华南理工大学 Block-wise collaborative representation embedded kernel sparse representation method and device for occluded face recognition
WO2021003637A1 (en) * 2019-07-08 2021-01-14 深圳大学 Kernel non-negative matrix factorization face recognition method, device and system based on additive gaussian kernel, and storage medium
CN111723759A (en) * 2020-06-28 2020-09-29 南京工程学院 Non-constrained face recognition method based on weighted tensor sparse graph mapping

Also Published As

Publication number Publication date
CN112966649A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN112580590B (en) Finger vein recognition method based on multi-semantic feature fusion network
CN102682302B (en) Human body posture identification method based on multi-characteristic fusion of key frame
CN110659665B (en) Model construction method of different-dimension characteristics and image recognition method and device
CN108446589B (en) Face recognition method based on low-rank decomposition and auxiliary dictionary in complex environment
CN111126240B (en) Three-channel feature fusion face recognition method
CN106845551B (en) Tissue pathology image identification method
CN109241813B (en) Non-constrained face image dimension reduction method based on discrimination sparse preservation embedding
CN104077742B (en) Human face sketch synthetic method and system based on Gabor characteristic
WO2022178978A1 (en) Data dimensionality reduction method based on maximum ratio and linear discriminant analysis
CN110796022B (en) Low-resolution face recognition method based on multi-manifold coupling mapping
CN107403153A (en) A kind of palmprint image recognition methods encoded based on convolutional neural networks and Hash
Paul et al. Extraction of facial feature points using cumulative histogram
CN111695455B (en) Low-resolution face recognition method based on coupling discrimination manifold alignment
CN114445715A (en) Crop disease identification method based on convolutional neural network
CN112966649B (en) Occlusion face recognition method based on sparse representation of kernel extension dictionary
CN108932501A (en) A kind of face identification method being associated with integrated dimensionality reduction based on multicore
CN111325275A (en) Robust image classification method and device based on low-rank two-dimensional local discriminant map embedding
CN108319891A (en) Face feature extraction method based on sparse expression and improved LDA
CN114937298A (en) Micro-expression recognition method based on feature decoupling
CN112183504B (en) Video registration method and device based on non-contact palm vein image
CN111723759B (en) Unconstrained face recognition method based on weighted tensor sparse graph mapping
CN111611963B (en) Face recognition method based on neighbor preservation canonical correlation analysis
CN112966648B (en) Occlusion face recognition method based on sparse representation of kernel expansion block dictionary
CN115439930A (en) Multi-feature fusion gait recognition method based on space-time dimension screening
Wang et al. Feature extraction method of face image texture spectrum based on a deep learning algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant