CN107451537B - Face recognition method based on deep learning multi-layer non-negative matrix decomposition


Info

Publication number
CN107451537B
Authority
CN
China
Prior art keywords
matrix
layer
projection
test sample
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710568578.0A
Other languages
Chinese (zh)
Other versions
CN107451537A (en)
Inventor
同鸣
李明阳
陈逸然
席圣男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201710568578.0A
Publication of CN107451537A
Application granted
Publication of CN107451537B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; extraction of features in feature space; blind source separation
    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face recognition method based on deep-learning multi-layer non-negative matrix decomposition, which mainly addresses the low recognition rate of existing face recognition techniques under complex appearance changes.

Description

Face recognition method based on deep learning multi-layer non-negative matrix decomposition
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a face image recognition method applicable to identity authentication and information security.
Background
With the continuous development of human society, face recognition has found wide application in fields such as security, finance, and e-government, and improvements in its performance broaden those applications further. Current research focuses on extracting efficient, robust, and more discriminative features and on designing classifiers with strong classification ability; both are key to improving the robustness of face recognition.
Non-negative matrix factorization (NMF) is a feature extraction method that decomposes a matrix under non-negativity constraints. It represents data well, can greatly reduce the dimensionality of the features, yields decompositions consistent with human visual perception, and produces results with a clear, interpretable physical meaning. Basic NMF directly decomposes the original data matrix into a basis matrix and a coefficient matrix, both required to be non-negative, so the representation involves only additive combinations. NMF can therefore be regarded as a parts-based model that captures the local structure of the observed data; in some cases, however, the algorithm also yields global features, which limits its classification performance.
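To make the additive, parts-based character of NMF concrete, the following minimal NumPy sketch runs the classical Lee and Seung multiplicative updates for the plain factorization X ≈ WH. It illustrates basic NMF only, not the soft-constrained variant introduced later; all names and parameters are illustrative.

```python
import numpy as np

def nmf(X, r, iters=200, eps=1e-9):
    """Basic NMF: X (m x n, non-negative) ~ W (m x r) @ H (r x n).

    Classical multiplicative updates (Lee & Seung) for the
    Frobenius-norm objective ||X - W H||_F^2; a minimal sketch.
    """
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update coefficient matrix
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update basis matrix
    return W, H
```

Because every factor stays non-negative, each column of X is approximated as a purely additive combination of the columns of W, which is the parts-based behavior described above.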
Deep learning is a newer research direction in feature representation within machine learning, and in recent years it has made breakthrough progress in applications such as speech recognition and computer vision. It forms more abstract, high-level representations or features by combining low-level features, and deep models provide many nonlinear transformation layers and strong generalization ability. In practical applications, however, appearance changes caused by head pose, illumination, occlusion, and similar factors degrade its performance, and no good solution has been found so far.
Disclosure of Invention
In view of the above shortcomings of the prior art, an object of the present invention is to provide a face recognition method based on deep-learning multi-layer non-negative matrix factorization, so as to obtain deeper, more discriminative low-rank robust features and improve the face recognition rate under complex appearance changes.
The key technical idea is to introduce a new multi-layer non-negative matrix factorization on top of deep learning, thereby improving existing deep-learning methods. Specifically, the invention applies non-negative matrix factorization several times to the sample features produced by deep learning, which yields a more discriminative low-rank feature representation and thus improves the face recognition rate. The steps include:
(1) inputting each channel's data of a training sample into the VGG-Face deep convolutional neural network to obtain feature data X(k) for each channel, where k = 1, 2, ..., K and K is the number of channels;
(2) performing the feature extraction process of normalization, nonlinear transformation, and matrix factorization on the feature data X(k) obtained in step (1) to obtain a coefficient matrix H(k);
(3) repeating the feature extraction process of step (2) L times to obtain the low-rank robust features $h_j(k)$, where j = 1, 2, ..., n and n is the total number of training samples;
(4) constructing K nearest-neighbor classifiers from the low-rank robust features $h_j(k)$ obtained in step (3);
(5) inputting each channel's data of a test sample into the VGG-Face deep convolutional neural network to obtain feature data Y(k) for each channel of the test sample;
(6) performing a projection process on the feature data Y(k) obtained in step (5) to obtain projection coefficient vectors $h^P_i(k)$;
(7) inputting the projection coefficient vectors $h^P_i(k)$ obtained in step (6) into the K nearest-neighbor classifiers to obtain a classification result for each channel of the test sample, where i = 1, 2, ..., e and e is the total number of test samples;
(8) integrating the classification results of each channel of the test sample obtained in step (7) to obtain the final classification result of the test sample.
Compared with the prior art, the invention has the following advantages:
1) By combining multi-layer non-negative matrix factorization with deep learning, the method obtains a more discriminative feature representation;
2) By integrating the classification results of different channels, the invention further improves the face recognition rate under complex appearance changes.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
Detailed Description
Referring to fig. 1, the face recognition method based on deep-learning multi-layer non-negative matrix factorization of the present invention comprises the following steps:
step 1, obtaining characteristic data X (k) of each channel data of a training sample.
(1a) Obtaining a face data set VtrainThe method comprises the following steps of (1) taking the training samples as a training data set, wherein the total number of the training samples in the training data set is n, the number of the classes of the training data set is c, each training sample in the training data set is equally divided into K regions, each region is taken as 1 channel data of the training sample, and the training samples contain K channel data;
(1b) according to the training data set, under an L inux operating system, a Caffe deep learning framework is utilized to finely adjust the VGG-Face deep convolution neural network parameters;
(1c) inputting each channel data of each training sample in a training data set into a VGG-Face deep convolution neural network to obtain characteristic data X (K) of each training channel data, wherein K is 1, 2. K is the number of channels of the training sample.
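For illustration, the sketch below shows one way to organize step 1: each sample is split into K regions that serve as channels, and per-channel deep features are collected into matrices X(k). The Caffe-based fine-tuning itself is not reproduced; `net` is a hypothetical stand-in for the fine-tuned VGG-Face feature extractor.

```python
import numpy as np

def split_into_channels(img, K):
    """Step 1a: divide a face image into K equal regions, each treated as
    one channel of the sample. A horizontal split, and divisibility of the
    image height by K, are simplifying assumptions."""
    h = img.shape[0] // K
    return [img[k * h:(k + 1) * h] for k in range(K)]

def extract_channel_features(samples, K, net):
    """Step 1c: X[k] collects the deep features of channel k for all n
    samples as an m x n matrix. `net` stands in for the fine-tuned
    VGG-Face network: any callable mapping a region to a 1-D feature
    vector."""
    X = [[] for _ in range(K)]
    for img in samples:
        for k, region in enumerate(split_into_channels(img, K)):
            X[k].append(net(region))
    return [np.stack(cols, axis=1) for cols in X]
```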
Step 2. Obtain the coefficient matrix H(k) from the feature data X(k).
The feature extraction process applied to X(k) consists of normalization, nonlinear transformation, and matrix factorization:
(2a) normalize the feature data X(k) with the L2 norm;
(2b) apply a sigmoid nonlinear transformation to the normalized result of step (2a) to obtain the transformed result B(k);
(2c) factorize the transformed result B(k) of step (2b) with soft-constrained non-negative matrix factorization, B(k) ≈ Z(k)A(k)F(k), where B(k) is an m × n matrix, Z(k) is an m × φ basis matrix, A(k) is a φ × c auxiliary matrix, F(k) is a c × n predicted label matrix, m is the original feature dimension, φ is the decomposition dimension, c is the number of classes, and n is the total number of training samples;
(2c1) Randomly initialize the basis matrix $Z^{(1)}(k)$, the auxiliary matrix $A^{(1)}(k)$, and the predicted label matrix $F^{(1)}(k)$ as the result of iteration 1, where every element of $Z^{(1)}(k)$ satisfies the non-negativity constraint $z^{(1)}_{p,q}(k) \ge 0$, with $z^{(1)}_{p,q}(k)$ the element in row $p$, column $q$ of $Z^{(1)}(k)$; every element of $A^{(1)}(k)$ satisfies $a^{(1)}_{\alpha,\beta}(k) \ge 0$, with $a^{(1)}_{\alpha,\beta}(k)$ the element in row $\alpha$, column $\beta$ of $A^{(1)}(k)$; and every element of $F^{(1)}(k)$ satisfies $f^{(1)}_{\gamma,\tilde{\gamma}}(k) \ge 0$, with $f^{(1)}_{\gamma,\tilde{\gamma}}(k)$ the element in row $\gamma$, column $\tilde{\gamma}$ of $F^{(1)}(k)$; here $p = 1,2,\ldots,m$, $q = 1,2,\ldots,\varphi$, $\alpha = 1,2,\ldots,\varphi$, $\beta = 1,2,\ldots,c$, $\gamma = 1,2,\ldots,c$, and $\tilde{\gamma} = 1,2,\ldots,n$;
(2c2) Update each element $z_{p,q}(k)$ of the basis matrix $Z(k)$ by
$$z^{(t)\prime}_{p,q}(k) = z^{(t-1)}_{p,q}(k)\,\frac{\left[B(k)\,F^{(t-1)}(k)^{\mathrm T}A^{(t-1)}(k)^{\mathrm T}\right]_{p,q}}{\left[Z^{(t-1)}(k)\,A^{(t-1)}(k)\,F^{(t-1)}(k)\,F^{(t-1)}(k)^{\mathrm T}A^{(t-1)}(k)^{\mathrm T}\right]_{p,q}},$$
where $t$ is the iteration index, $t = 2,\ldots,\mathrm{iter}$, $\mathrm{iter}$ is the maximum number of iterations, $(\cdot)^{\mathrm T}$ denotes the matrix transpose, and $z^{(t)\prime}_{p,q}(k)$ is the element in row $p$, column $q$ of the unnormalized basis matrix $Z^{(t)\prime}(k)$ obtained after $t$ iterations;
(2c3) Normalize the basis matrix $Z^{(t)\prime}(k)$ obtained in step (2c2) to obtain the basis matrix $Z^{(t)}(k)$ after $t$ iterations;
(2c4) Update each element $a_{\alpha,\beta}(k)$ of the auxiliary matrix $A(k)$ by
$$a^{(t)}_{\alpha,\beta}(k) = a^{(t-1)}_{\alpha,\beta}(k)\,\frac{\left[Z^{(t)}(k)^{\mathrm T}B(k)\,F^{(t-1)}(k)^{\mathrm T}\right]_{\alpha,\beta}}{\left[Z^{(t)}(k)^{\mathrm T}Z^{(t)}(k)\,A^{(t-1)}(k)\,F^{(t-1)}(k)\,F^{(t-1)}(k)^{\mathrm T}\right]_{\alpha,\beta}},$$
where $a^{(t)}_{\alpha,\beta}(k)$ is the element in row $\alpha$, column $\beta$ of the auxiliary matrix $A^{(t)}(k)$ obtained after $t$ iterations;
(2c5) Update each element $f_{\gamma,\tilde{\gamma}}(k)$ of the predicted label matrix $F(k)$ by
$$f^{(t)}_{\gamma,\tilde{\gamma}}(k) = f^{(t-1)}_{\gamma,\tilde{\gamma}}(k)\,\frac{\left[A^{(t)}(k)^{\mathrm T}Z^{(t)}(k)^{\mathrm T}B(k) + \lambda\,C(k)\right]_{\gamma,\tilde{\gamma}}}{\left[A^{(t)}(k)^{\mathrm T}Z^{(t)}(k)^{\mathrm T}Z^{(t)}(k)\,A^{(t)}(k)\,F^{(t-1)}(k) + \lambda\,F^{(t-1)}(k)\right]_{\gamma,\tilde{\gamma}}},$$
where $f^{(t)}_{\gamma,\tilde{\gamma}}(k)$ is the element in row $\gamma$, column $\tilde{\gamma}$ of the predicted label matrix $F^{(t)}(k)$ after $t$ iterations, $\lambda$ is the regularization coefficient, and $c_{\gamma,\tilde{\gamma}}(k)$ is the element in row $\gamma$, column $\tilde{\gamma}$ of the predefined local label matrix $C(k)$;
(2c6) Check whether the iteration count $t$ has reached the maximum number of iterations iter: if so, stop iterating and take the basis matrix $Z^{(\mathrm{iter})}(k)$, auxiliary matrix $A^{(\mathrm{iter})}(k)$, and predicted label matrix $F^{(\mathrm{iter})}(k)$ obtained at iteration iter as the final Z(k), A(k), and F(k); otherwise, return to step (2c2);
(2d) Compute the coefficient matrix from the auxiliary matrix A(k) and the predicted label matrix F(k) obtained by the soft-constrained non-negative matrix factorization in step (2c): H(k) = A(k)F(k).
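A compact NumPy sketch of the update loop (2c1) through (2c6) and the coefficient computation (2d) follows. It assumes the soft-constrained objective is $\|B - ZAF\|_F^2 + \lambda\|F - C\|_F^2$, for which the update rules above are the standard multiplicative ones; the initialization range and normalization details are assumptions.

```python
import numpy as np

def soft_constrained_nmf(B, C, phi, lam=0.1, iters=100, eps=1e-9):
    """B (m x n) ~ Z (m x phi) @ A (phi x c) @ F (c x n), all non-negative.

    Multiplicative updates for the assumed objective
    ||B - ZAF||_F^2 + lam * ||F - C||_F^2, with Z column-normalized after
    each update (cf. steps 2c1-2c6). C is the predefined local label
    matrix (c x n). A sketch, not a verbatim transcription of the patent.
    """
    rng = np.random.default_rng(0)
    Z = rng.random((B.shape[0], phi))      # (2c1) random non-negative init
    A = rng.random((phi, C.shape[0]))
    F = rng.random(C.shape)
    for _ in range(iters):
        AF = A @ F
        Z *= (B @ AF.T) / (Z @ AF @ AF.T + eps)              # (2c2)
        Z /= np.linalg.norm(Z, axis=0, keepdims=True) + eps  # (2c3)
        A *= (Z.T @ B @ F.T) / (Z.T @ Z @ A @ F @ F.T + eps)  # (2c4)
        ZA = Z @ A
        F *= (ZA.T @ B + lam * C) / (ZA.T @ ZA @ F + lam * F + eps)  # (2c5)
    return Z, A, F, A @ F                  # (2d) coefficient matrix H = A F
```

The small `eps` guards the divisions; a fixed iteration count stands in for the stopping test of step (2c6).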
Step 3. Obtain the low-rank robust features $h_j(k)$ of the training samples.
Repeat the feature extraction process of step 2 on the feature data X(k) of each channel:
(3a) process the feature data X(k) of each channel of the training samples according to step 2 to obtain the layer-1 basis matrix $Z_1(k)$ and the layer-1 coefficient matrix $H_1(k)$;
(3b) process the layer-1 coefficient matrix $H_1(k)$ obtained in step (3a) according to step 2 to obtain the layer-2 basis matrix $Z_2(k)$ and the layer-2 coefficient matrix $H_2(k)$;
(3c) continue in the same way as steps (3a) and (3b), obtaining the layer-$l$ basis matrix $Z_l(k)$ and coefficient matrix $H_l(k)$ from the layer-$(l-1)$ coefficient matrix $H_{l-1}(k)$, until $l = L$, which yields the layer-$L$ basis matrix $Z_L(k)$ and coefficient matrix $H_L(k)$, where $l = 2,\ldots,L$ and $L$ is the number of layers of the multi-layer non-negative matrix factorization;
(3d) obtain the low-rank robust features $h_j(k)$ of each channel of the training samples from the layer-$L$ coefficient matrix $H_L(k)$ of step (3c), where $j = 1,2,\ldots,n$.
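The layer-stacking loop of step 3 can be sketched as follows, reusing the `soft_constrained_nmf` sketch above. Using a single decomposition dimension φ for every layer, and reading the features $h_j(k)$ off as the columns of $H_L(k)$, are simplifying assumptions.

```python
import numpy as np

def preprocess(X, eps=1e-9):
    """Steps 2a-2b: L2-normalize each column, then apply a sigmoid."""
    X = X / (np.linalg.norm(X, axis=0, keepdims=True) + eps)
    return 1.0 / (1.0 + np.exp(-X))

def multilayer_nmf(X, C, phi, L, lam=0.1):
    """Step 3: apply the step-2 feature extraction L times, feeding each
    layer's coefficient matrix H_l into the next layer. Returns the basis
    matrices Z_1..Z_L (needed for the projection in step 6) and H_L,
    whose j-th column serves as the low-rank robust feature h_j."""
    Zs, H = [], X
    for _ in range(L):
        Z, _, _, H = soft_constrained_nmf(preprocess(H), C, phi, lam=lam)
        Zs.append(Z)
    return Zs, H
```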
Step 4. Construct K nearest-neighbor classifiers from the low-rank robust features $h_j(k)$ obtained in step 3.
(4a) Select the low-rank robust features $h_j(k)$ of the $k$-th channel of every training sample from the result of step 3 to form a feature set;
(4b) build a nearest-neighbor classifier on the feature set obtained in step (4a);
(4c) repeat steps (4a) and (4b) for the remaining channels to obtain K nearest-neighbor classifiers.
Step 5. Obtain the feature data Y(k) of each channel of the test samples.
(5a) Acquire a face data set $V_{\mathrm{test}}$ with the same attributes as the training data set to serve as the test data set, where the total number of test samples is e and the number of classes is c; divide each test sample into K channels of data as in step (1a);
(5b) configure the VGG-Face deep convolutional neural network parameters as in step (1b);
(5c) input each channel of the test samples into the VGG-Face network to obtain the feature data Y(k) of each channel.
Step 6. Project the feature data Y(k) of each channel of the test samples obtained in step 5 to produce the projection coefficient vectors $h^P_i(k)$.
(6a) Apply the projection process of normalization, nonlinear transformation, and projective transformation to the test feature data Y(k) to obtain the layer-1 projection matrix $H^P_1(k)$:
(6a1) normalize the feature data Y(k) of the test samples with the L2 norm;
(6a2) apply a sigmoid nonlinear transformation to the normalized result of step (6a1) to obtain the transformed result $f(Y(k))$, where $f(\cdot)$ denotes the sigmoid nonlinear transformation;
(6a3) project the result $f(Y(k))$ of step (6a2) onto the layer-1 basis matrix $Z_1(k)$ obtained in step (3a):
$$H^P_1(k) = Z_1(k)^{\dagger}\, f(Y(k)),$$
where $(\cdot)^{\dagger}$ denotes the generalized inverse;
(6b) apply the same processing to the layer-1 projection matrix $H^P_1(k)$ obtained in step (6a) and the layer-2 basis matrix $Z_2(k)$ to obtain the layer-2 projection matrix $H^P_2(k)$;
(6c) continue in the same way as steps (6a) and (6b), obtaining the layer-$l$ projection matrix $H^P_l(k)$ from the layer-$(l-1)$ projection matrix $H^P_{l-1}(k)$ and the layer-$l$ basis matrix $Z_l(k)$, until $l = L$, which yields the layer-$L$ projection matrix $H^P_L(k)$, where $l = 2,\ldots,L$;
(6d) obtain the projection coefficient vector $h^P_i(k)$ of each test sample from the layer-$L$ projection matrix $H^P_L(k)$ of step (6c), where $i = 1,2,\ldots,e$.
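A sketch of the projection cascade of step 6, taking the Moore-Penrose pseudo-inverse as the generalized inverse; `preprocess` is the normalize-then-sigmoid step from the earlier sketch, and reusing it at every layer is an assumption.

```python
import numpy as np

def project_test_features(Y, Zs):
    """Step 6: push test features Y (m x e) through the trained layers.
    Each layer repeats the training-side preprocessing (L2 normalization
    plus sigmoid) and then left-multiplies by the generalized inverse of
    that layer's basis matrix: H_l^P = Z_l^+ f(H_{l-1}^P). The i-th column
    of the returned matrix is the projection coefficient vector of test
    sample i."""
    P = Y
    for Z in Zs:
        P = np.linalg.pinv(Z) @ preprocess(P)
    return P
```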
Step 7. Input the projection coefficient vectors $h^P_i(k)$ obtained in step 6 into the K nearest-neighbor classifiers to obtain the classification result of each channel of the test samples.
(7a) Compute the low-dimensional Euclidean distance between the low-rank robust feature $h_j(k)$ of each training sample and the projection coefficient vector $h^P_i(k)$ of a test sample,
$$d_{ji}(k) = \big\| h_j(k) - h^P_i(k) \big\|_2,$$
obtaining the distance set $\{d_{1i}(k), d_{2i}(k), \ldots, d_{ni}(k)\}$, where $j = 1,2,\ldots,n$, $i \in \{1,2,\ldots,e\}$, and $\|\cdot\|_2$ denotes the 2-norm;
(7b) take the class of the $\xi$-th training sample corresponding to the minimum value $d_{\xi i}(k)$ of the distance set as the classification result of the $i$-th test sample on the $k$-th nearest-neighbor classifier, where $\xi \in \{1,2,\ldots,n\}$;
(7c) classify the K channels of each test sample according to steps (7a) and (7b) to obtain the classification result of each test sample on the K nearest-neighbor classifiers.
Step 8. Integrate the classification results of each channel of the test samples obtained in step 7 to obtain the final classification result of the test samples.
(8a) From the classification results of each test sample on the K nearest-neighbor classifiers obtained in step 7, count the number $CN_k$ of correctly classified test samples on each classifier and compute the recognition rate of each classifier as
$$o_k = \frac{CN_k}{e},$$
where $CN_k$ is the number of correctly classified test samples on the $k$-th nearest-neighbor classifier, $e$ is the total number of test samples, and $o_k$ is the recognition rate of the $k$-th classifier;
(8b) from the recognition rates of the K nearest-neighbor classifiers obtained in step (8a), compute the linear weight coefficient $\alpha_k$ of each classifier:
$$\alpha_k = \frac{o_k}{\sum_{k'=1}^{K} o_{k'}};$$
(8c) with the linear weight coefficients $\alpha_k$ obtained in step (8b), compute the weighted distance between the K-channel projection coefficient vectors $h^P_i(k)$ of a test sample and the K-channel low-rank robust features $h_j(k)$ of each training sample:
$$d_{ji} = \sum_{k=1}^{K} \alpha_k\,\big\| h_j(k) - h^P_i(k) \big\|_2,$$
obtaining the weighted distance set $\{d_{1i}, d_{2i}, \ldots, d_{ji}, \ldots, d_{ni}\}$;
(8d) take the class of the $\omega$-th training sample corresponding to the minimum value $d_{\omega i}$ of the weighted distance set $\{d_{1i}, d_{2i}, \ldots, d_{ni}\}$ obtained in step (8c) as the final classification result of the test sample, where $\omega \in \{1,2,\ldots,n\}$.
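Steps 7 and 8 can be sketched jointly as below. Note that the recognition rates $o_k$ in step (8a) are computed from the classification results on the test samples themselves, which presupposes known test labels; the sketch mirrors that, although in deployment a held-out validation set would be the natural source of the weights.

```python
import numpy as np

def fuse_and_classify(h_train, h_test, labels, test_labels):
    """Steps 7-8 sketch. h_train[k]: d x n low-rank features of channel k;
    h_test[k]: d x e projected test features; labels: (n,) array of
    training-sample classes; test_labels: (e,) ground truth, used only to
    reproduce the recognition-rate weighting of step (8a)."""
    K = len(h_train)
    # (7a-7b) per-channel Euclidean distances and nearest-neighbor classes
    D = np.stack([
        np.linalg.norm(h_train[k][:, :, None] - h_test[k][:, None, :], axis=0)
        for k in range(K)
    ])                                      # shape: K x n x e
    per_channel = labels[D.argmin(axis=1)]  # K x e channel-wise decisions
    # (8a-8b) recognition rate o_k per classifier and linear weights alpha_k
    o = (per_channel == test_labels).mean(axis=1)
    alpha = o / o.sum()
    # (8c-8d) weighted distances d_ji and final per-sample decision
    d = np.tensordot(alpha, D, axes=1)      # n x e
    return labels[d.argmin(axis=0)]
```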
The foregoing description is only an example of the present invention and should not be construed as limiting it. After understanding the content and principles of the invention, those skilled in the art may make various modifications and variations in form and detail without departing from its principle and structure, but such modifications and variations remain within the scope of the appended claims.

Claims (6)

1. A face recognition method based on deep-learning multi-layer non-negative matrix factorization, comprising the following steps:
(1) inputting each channel's data of a training sample into the VGG-Face deep convolutional neural network to obtain feature data X(k) for each channel, where k = 1, 2, ..., K and K is the number of channels;
(2) performing the feature extraction process of normalization, nonlinear transformation, and matrix factorization on the feature data X(k) obtained in step (1) to obtain a coefficient matrix H(k);
(3) repeating the feature extraction process of step (2) L times to obtain the low-rank robust features $h_j(k)$, where j = 1, 2, ..., n and n is the total number of training samples, as follows:
(3a) processing the feature data X(k) of each channel of the training samples according to step (2) to obtain the layer-1 basis matrix $Z_1(k)$ and the layer-1 coefficient matrix $H_1(k)$, where k = 1, 2, ..., K;
(3b) processing the layer-1 coefficient matrix $H_1(k)$ obtained in step (3a) according to step (2) to obtain the layer-2 basis matrix $Z_2(k)$ and the layer-2 coefficient matrix $H_2(k)$;
(3c) continuing in the same way as steps (3a) and (3b), obtaining the layer-$l$ basis matrix $Z_l(k)$ and coefficient matrix $H_l(k)$ from the layer-$(l-1)$ coefficient matrix $H_{l-1}(k)$, until $l = L$, which yields the layer-$L$ basis matrix $Z_L(k)$ and coefficient matrix $H_L(k)$, where $l = 2,\ldots,L$ and $L$ is the number of layers of the multi-layer non-negative matrix factorization;
(3d) obtaining the low-rank robust features $h_j(k)$ of each channel of the training samples from the layer-$L$ coefficient matrix $H_L(k)$ of step (3c), where $j = 1,2,\ldots,n$;
(4) constructing K nearest-neighbor classifiers from the low-rank robust features $h_j(k)$ obtained in step (3);
(5) inputting each channel's data of a test sample into the VGG-Face deep convolutional neural network to obtain feature data Y(k) for each channel of the test sample;
(6) performing a projection process on the feature data Y(k) obtained in step (5) to obtain projection coefficient vectors $h^P_i(k)$;
(7) inputting the projection coefficient vectors $h^P_i(k)$ obtained in step (6) into the K nearest-neighbor classifiers to obtain a classification result for each channel of the test sample, where i = 1, 2, ..., e and e is the total number of test samples;
(8) integrating the classification results of each channel of the test sample obtained in step (7) to obtain the classification result of the test sample.
2. The method of claim 1, wherein step (2) is implemented as follows:
(2a) normalizing the feature data X(k) with the L2 norm;
(2b) applying a sigmoid nonlinear transformation to the normalized result of step (2a) to obtain the transformed result B(k);
(2c) factorizing the transformed result B(k) of step (2b) with soft-constrained non-negative matrix factorization, B(k) ≈ Z(k)A(k)F(k), where B(k) is an m × n matrix, Z(k) is an m × φ basis matrix, A(k) is a φ × c auxiliary matrix, F(k) is a c × n predicted label matrix, m is the original feature dimension, φ is the decomposition dimension, c is the number of classes, and n is the total number of training samples;
(2d) computing the coefficient matrix from the auxiliary matrix A(k) and the predicted label matrix F(k) obtained in step (2c): H(k) = A(k)F(k).
3. The method of claim 2, wherein the soft-constrained non-negative matrix factorization in step (2c) factorizes the transformed result B(k) of step (2b) as follows:
(2c1) randomly initialize the basis matrix $Z^{(1)}(k)$, the auxiliary matrix $A^{(1)}(k)$, and the predicted label matrix $F^{(1)}(k)$ as the result of iteration 1, where every element of $Z^{(1)}(k)$ satisfies the non-negativity constraint $z^{(1)}_{p,q}(k) \ge 0$, with $z^{(1)}_{p,q}(k)$ the element in row $p$, column $q$ of $Z^{(1)}(k)$; every element of $A^{(1)}(k)$ satisfies $a^{(1)}_{\alpha,\beta}(k) \ge 0$, with $a^{(1)}_{\alpha,\beta}(k)$ the element in row $\alpha$, column $\beta$ of $A^{(1)}(k)$; and every element of $F^{(1)}(k)$ satisfies $f^{(1)}_{\gamma,\tilde{\gamma}}(k) \ge 0$, with $f^{(1)}_{\gamma,\tilde{\gamma}}(k)$ the element in row $\gamma$, column $\tilde{\gamma}$ of $F^{(1)}(k)$; here $p = 1,2,\ldots,m$, $q = 1,2,\ldots,\varphi$, $\alpha = 1,2,\ldots,\varphi$, $\beta = 1,2,\ldots,c$, $\gamma = 1,2,\ldots,c$, and $\tilde{\gamma} = 1,2,\ldots,n$;
(2c2) update each element $z_{p,q}(k)$ of the basis matrix $Z(k)$ by
$$z^{(t)\prime}_{p,q}(k) = z^{(t-1)}_{p,q}(k)\,\frac{\left[B(k)\,F^{(t-1)}(k)^{\mathrm T}A^{(t-1)}(k)^{\mathrm T}\right]_{p,q}}{\left[Z^{(t-1)}(k)\,A^{(t-1)}(k)\,F^{(t-1)}(k)\,F^{(t-1)}(k)^{\mathrm T}A^{(t-1)}(k)^{\mathrm T}\right]_{p,q}},$$
where $t$ is the iteration index, $t = 2,\ldots,\mathrm{iter}$, $\mathrm{iter}$ is the maximum number of iterations, $(\cdot)^{\mathrm T}$ denotes the matrix transpose, and $z^{(t)\prime}_{p,q}(k)$ is the element in row $p$, column $q$ of the unnormalized basis matrix $Z^{(t)\prime}(k)$ obtained after $t$ iterations;
(2c3) normalize the basis matrix $Z^{(t)\prime}(k)$ obtained in step (2c2) to obtain the basis matrix $Z^{(t)}(k)$ after $t$ iterations;
(2c4) update each element $a_{\alpha,\beta}(k)$ of the auxiliary matrix $A(k)$ by
$$a^{(t)}_{\alpha,\beta}(k) = a^{(t-1)}_{\alpha,\beta}(k)\,\frac{\left[Z^{(t)}(k)^{\mathrm T}B(k)\,F^{(t-1)}(k)^{\mathrm T}\right]_{\alpha,\beta}}{\left[Z^{(t)}(k)^{\mathrm T}Z^{(t)}(k)\,A^{(t-1)}(k)\,F^{(t-1)}(k)\,F^{(t-1)}(k)^{\mathrm T}\right]_{\alpha,\beta}},$$
where $a^{(t)}_{\alpha,\beta}(k)$ is the element in row $\alpha$, column $\beta$ of the auxiliary matrix $A^{(t)}(k)$ obtained after $t$ iterations;
(2c5) update each element $f_{\gamma,\tilde{\gamma}}(k)$ of the predicted label matrix $F(k)$ by
$$f^{(t)}_{\gamma,\tilde{\gamma}}(k) = f^{(t-1)}_{\gamma,\tilde{\gamma}}(k)\,\frac{\left[A^{(t)}(k)^{\mathrm T}Z^{(t)}(k)^{\mathrm T}B(k) + \lambda\,C(k)\right]_{\gamma,\tilde{\gamma}}}{\left[A^{(t)}(k)^{\mathrm T}Z^{(t)}(k)^{\mathrm T}Z^{(t)}(k)\,A^{(t)}(k)\,F^{(t-1)}(k) + \lambda\,F^{(t-1)}(k)\right]_{\gamma,\tilde{\gamma}}},$$
where $f^{(t)}_{\gamma,\tilde{\gamma}}(k)$ is the element in row $\gamma$, column $\tilde{\gamma}$ of the predicted label matrix $F^{(t)}(k)$ after $t$ iterations, $\lambda$ is the regularization coefficient, and $c_{\gamma,\tilde{\gamma}}(k)$ is the element in row $\gamma$, column $\tilde{\gamma}$ of the predefined local label matrix $C(k)$;
(2c6) check whether the iteration count $t$ has reached the maximum number of iterations iter: if so, stop iterating and take the basis matrix $Z^{(\mathrm{iter})}(k)$, auxiliary matrix $A^{(\mathrm{iter})}(k)$, and predicted label matrix $F^{(\mathrm{iter})}(k)$ obtained at iteration iter as the final Z(k), A(k), and F(k); otherwise, return to step (2c2).
4. The method of claim 1, wherein step (6) is implemented as follows:
(6a) applying the projection process of normalization, nonlinear transformation, and projective transformation to the test feature data Y(k) to obtain the layer-1 projection matrix $H^P_1(k)$, where $k \in \{1,2,\ldots,K\}$ and K is the number of sample channels:
(6a1) normalizing the feature data Y(k) of the test sample with the L2 norm;
(6a2) applying a sigmoid nonlinear transformation to the normalized result of step (6a1) to obtain the transformed result $f(Y(k))$, where $f(\cdot)$ denotes the sigmoid nonlinear transformation;
(6a3) projecting the result $f(Y(k))$ of step (6a2) onto the layer-1 basis matrix $Z_1(k)$ obtained in step (3a):
$$H^P_1(k) = Z_1(k)^{\dagger}\, f(Y(k)),$$
where $(\cdot)^{\dagger}$ denotes the generalized inverse;
(6b) applying the same processing to the layer-1 projection matrix $H^P_1(k)$ obtained in step (6a) and the layer-2 basis matrix $Z_2(k)$ to obtain the layer-2 projection matrix $H^P_2(k)$;
(6c) continuing in the same way as steps (6a) and (6b), obtaining the layer-$l$ projection matrix $H^P_l(k)$ from the layer-$(l-1)$ projection matrix $H^P_{l-1}(k)$ and the layer-$l$ basis matrix $Z_l(k)$, until $l = L$, which yields the layer-$L$ projection matrix $H^P_L(k)$, where $l = 2,\ldots,L$ and $L$ is the number of layers of the multi-layer non-negative matrix factorization;
(6d) obtaining the projection coefficient vector $h^P_i(k)$ of each test sample from the layer-$L$ projection matrix $H^P_L(k)$ of step (6c), where $i = 1,2,\ldots,e$.
5. The method of claim 1, wherein step (7) is performed as follows:
(7a) computing the low-dimensional Euclidean distance between the low-rank robust feature $h_j(k)$ of each training sample and the projection coefficient vector $h^P_i(k)$ of a test sample,
$$d_{ji}(k) = \big\| h_j(k) - h^P_i(k) \big\|_2,$$
obtaining the distance set $\{d_{1i}(k), d_{2i}(k), \ldots, d_{ni}(k)\}$, where $j = 1,2,\ldots,n$, $k \in \{1,2,\ldots,K\}$, $i \in \{1,2,\ldots,e\}$, and $\|\cdot\|_2$ denotes the 2-norm;
(7b) taking the class of the $\xi$-th training sample corresponding to the minimum value $d_{\xi i}(k)$ of the distance set as the classification result of the $i$-th test sample on the $k$-th nearest-neighbor classifier, where $\xi \in \{1,2,\ldots,n\}$;
(7c) classifying the K channels of each test sample according to steps (7a) and (7b) to obtain the classification result of each test sample on the K nearest-neighbor classifiers.
6. The method of claim 1, wherein step (8) is performed as follows:
(8a) from the classification results of each test sample on the K nearest-neighbor classifiers obtained in step (7), counting the number $CN_k$ of correctly classified test samples on each classifier and computing the recognition rate of each classifier as $o_k = CN_k / e$, where $CN_k$ is the number of correctly classified test samples on the $k$-th nearest-neighbor classifier, $e$ is the total number of test samples, and $o_k$ is the recognition rate of the $k$-th classifier;
(8b) from the recognition rates of the K nearest-neighbor classifiers obtained in step (8a), computing the linear weight coefficient of each classifier, $\alpha_k = o_k / \sum_{k'=1}^{K} o_{k'}$;
(8c) with the linear weight coefficients $\alpha_k$ obtained in step (8b), computing the weighted distance between the K-channel projection coefficient vectors $h^P_i(k)$ of a test sample and the K-channel low-rank robust features $h_j(k)$ of each training sample,
$$d_{ji} = \sum_{k=1}^{K} \alpha_k\,\big\| h_j(k) - h^P_i(k) \big\|_2,$$
obtaining the weighted distance set $\{d_{1i}, d_{2i}, \ldots, d_{ji}, \ldots, d_{ni}\}$, where $j = 1,2,\ldots,n$ and $i \in \{1,2,\ldots,e\}$;
(8d) taking the class of the $\omega$-th training sample corresponding to the minimum value $d_{\omega i}$ of the weighted distance set obtained in step (8c) as the classification result of the test sample, where $\omega \in \{1,2,\ldots,n\}$.
CN201710568578.0A 2017-07-13 2017-07-13 Face recognition method based on deep learning multi-layer non-negative matrix decomposition Active CN107451537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710568578.0A CN107451537B (en) 2017-07-13 2017-07-13 Face recognition method based on deep learning multi-layer non-negative matrix decomposition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710568578.0A CN107451537B (en) 2017-07-13 2017-07-13 Face recognition method based on deep learning multi-layer non-negative matrix decomposition

Publications (2)

Publication Number Publication Date
CN107451537A CN107451537A (en) 2017-12-08
CN107451537B (en) 2020-07-10

Family

ID=60488656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710568578.0A Active CN107451537B (en) 2017-07-13 2017-07-13 Face recognition method based on deep learning multi-layer non-negative matrix decomposition

Country Status (1)

Country Link
CN (1) CN107451537B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256569B (en) * 2018-01-12 2022-03-18 University of Electronic Science and Technology of China Object recognition method under complex background, and computer technology used therefor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9721202B2 (en) * 2014-02-21 2017-08-01 Adobe Systems Incorporated Non-negative matrix factorization regularized by recurrent neural networks for audio processing

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254328A (en) * 2011-05-17 2011-11-23 西安电子科技大学 Video motion characteristic extracting method based on local sparse constraint non-negative matrix factorization
CN103345624A (en) * 2013-07-15 2013-10-09 武汉大学 Weighing characteristic face recognition method for multichannel pulse coupling neural network
CN105469034A (en) * 2015-11-17 2016-04-06 西安电子科技大学 Face recognition method based on weighted diagnostic sparseness constraint nonnegative matrix decomposition
CN106355138A (en) * 2016-08-18 2017-01-25 电子科技大学 Face recognition method based on deep learning and key features extraction

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"基于NMF与BP神经网络的人脸识别方法研究";熊培;;《中国硕士学位论文全文数据库 信息科技辑》;20101215(第12期);I138-281 *
"基于深度迁移学习的人脸识别方法研究";余化鹏等;;《成都大学学报( 自然科学版)》;20170630;第36卷(第2期);第151-156页 *
"深度非负矩阵分解算法研究";曲省卫;;《中国硕士学位论文全文数据库 信息科技辑》;20170315(第3期);I138-85 *
正交指数约束的平滑非负矩阵分解方法及应用;同鸣;《系统工程与电子技术》;20130704;第35卷(第10期);第2221-2228页 *

Also Published As

Publication number Publication date
CN107451537A (en) 2017-12-08

Similar Documents

Publication Publication Date Title
Yuan et al. Projective nonnegative matrix factorization for image compression and feature extraction
He et al. Data-dependent label distribution learning for age estimation
Zhang et al. Deep cascade model-based face recognition: When deep-layered learning meets small data
Patel et al. Latent space sparse and low-rank subspace clustering
He et al. Online robust subspace tracking from partial information
Yang et al. Adaptive method for nonsmooth nonnegative matrix factorization
CN108171279B (en) Multi-view video adaptive product Grassmann manifold subspace clustering method
CN108564061B (en) Image identification method and system based on two-dimensional pivot analysis
CN112115881B (en) Image feature extraction method based on robust identification feature learning
Hu et al. Single sample face recognition under varying illumination via QRCP decomposition
CN111324791B (en) Multi-view data subspace clustering method
Lu et al. Feature fusion with covariance matrix regularization in face recognition
Sharma et al. Efficient face recognition using wavelet-based generalized neural network
Li et al. Online low-rank representation learning for joint multi-subspace recovery and clustering
He et al. Blind source separation using clustering-based multivariate density estimation algorithm
Li et al. Weighted error entropy-based information theoretic learning for robust subspace representation
Chen et al. Semi-supervised dictionary learning with label propagation for image classification
Li et al. Incoherent dictionary learning with log-regularizer based on proximal operators
CN107423697B (en) Behavior identification method based on nonlinear fusion depth 3D convolution descriptor
Chen et al. Sparse general non-negative matrix factorization based on left semi-tensor product
CN111126169A (en) Face recognition method and system based on orthogonalization graph regular nonnegative matrix decomposition
CN107451537B (en) Face recognition method based on deep learning multi-layer non-negative matrix decomposition
Kobayashi Generalized mutual subspace based methods for image set classification
CN108734206B (en) Maximum correlation principal component analysis method based on deep parameter learning
Xu et al. Nonlinear component analysis based on correntropy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant