CN110826408B - Face recognition method by regional feature extraction - Google Patents

Face recognition method by regional feature extraction

Info

Publication number
CN110826408B
CN110826408B (application CN201910954178.2A)
Authority
CN
China
Prior art keywords
face
expression
key points
feature
gabor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910954178.2A
Other languages
Chinese (zh)
Other versions
CN110826408A (en)
Inventor
李云红
聂梦瑄
周小计
穆兴
李传真
刘旭东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Polytechnic University
Original Assignee
Xian Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Polytechnic University filed Critical Xian Polytechnic University
Priority to CN201910954178.2A
Publication of CN110826408A
Application granted
Publication of CN110826408B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face recognition method by regional feature extraction, implemented according to the following steps: 1, acquiring a face image to be recognized; 2, preprocessing the acquired face image with a multi-task convolutional neural network and marking the key points of the face; 3, dividing the face into an expression-variable region and an expression-invariant region according to the position information of the key points; 4, inputting the images of the expression-variable and expression-invariant regions into a Gabor & block LBP feature extraction channel to obtain feature histograms containing the face feature information; and 5, processing the feature histograms from step 4 with a linear discriminant method and matching the processed face feature information against the face features in the database to obtain the face recognition result. The disclosed method solves the problem of low face recognition rates caused by unconstrained environments in the prior art.

Description

Face recognition method by regional feature extraction
Technical Field
The invention belongs to the technical field of face recognition, and relates to a face recognition method by regional feature extraction.
Background
Face recognition technology is an important component of machine vision. It is convenient, fast, easy to implement and non-intrusive, and is therefore widely used in daily life, for example in video surveillance, access control systems and station security checks.
In recent years, as face recognition technology has continued to advance, the demand for recognition accuracy has grown. Traditional face recognition systems achieve satisfactory accuracy when tested in constrained environments. In unconstrained environments, however, the input face image is affected by factors such as illumination, noise and facial expression changes, and the recognition accuracy drops sharply.
The classical Gabor algorithm uses its multi-scale, multi-orientation characteristics to extract features from the input face image, which effectively reduces the influence of illumination and noise on recognition accuracy; however, because it extracts features at multiple scales and orientations, the feature information contains a large amount of redundancy, which hinders subsequent processing.
The eigenface-based method is another common face recognition method; its main principle is to project the high-dimensional face image into a low-dimensional space, which reduces the feature dimensionality and simplifies computation.
Traditional face recognition methods are each improved with respect to only one influencing factor, whereas in real life multiple influencing factors coexist, so traditional algorithms perform unsatisfactorily in practical applications.
Disclosure of Invention
The invention aims to provide a face recognition method by regional feature extraction, which solves the problem of low face recognition rates caused by unconstrained environments in the prior art.
The technical scheme adopted by the invention is a face recognition method by regional feature extraction, implemented according to the following steps:
step 1, acquiring a face image to be recognized;
step 2, preprocessing the face image obtained in the step 1 by utilizing a multitask convolutional neural network, detecting the face, and marking key points of the face;
step 3, segmenting the face image subjected to the preprocessing operation into an expression variable region and an expression invariable region according to different position information of key points;
step 4, respectively inputting images of the variable regions and the invariable regions of the expression into a Gabor & block LBP feature extraction channel to obtain a feature histogram containing face feature information;
and 5, processing the feature histogram containing the face feature information in the step 4 by using a linear discrimination method, and then matching the processed face feature information with the face features in the database to obtain a face recognition result.
The present invention is further characterized as follows.
In step 2, the face image obtained in step 1 is preprocessed with a multi-task convolutional neural network: a face classifier first determines whether the input image contains a face; if so, a face bounding box is obtained to automatically crop the face region, and the key points are then located.
The face classification result of the face classifier is computed with a cross-entropy loss function, expressed as:
L_i^{det} = -\left( y_i^{det} \log p_i + (1 - y_i^{det}) \log(1 - p_i) \right)   (1)
where L_i^{det} denotes the cross-entropy loss for face classification, p_i is the predicted probability that sample i is a face, and y_i^{det} \in \{0, 1\} is the ground-truth (background/face) label.
The position of the face bounding box is computed with a Euclidean distance loss, and the face bounding box is obtained, expressed as:
L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2   (2)
where L_i^{box} is the Euclidean loss of the face bounding box, \hat{y}_i^{box} is the predicted box position, and y_i^{box} is the ground-truth box position.
The key-point coordinates are located through a Euclidean distance loss, expressed as:
L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2   (3)
where L_i^{landmark} is the Euclidean loss of the key-point localization, \hat{y}_i^{landmark} denotes the predicted key-point positions, and y_i^{landmark} denotes the ground-truth key-point positions.
Formulas (1), (2) and (3) are given different weights for different samples and then summed, expressed as:
Y = \min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} a_j \, b_i \, L_i^{j}   (4)
where N is the number of training samples, a_j denotes the importance of task j, and b_i is the sample-type label. The whole preprocessing process minimizes the value of Y in formula (4), i.e. minimizes the values of formulas (1), (2) and (3).
The step 3 specifically comprises the following steps:
step 3.1, determining a segmentation line to obtain a face segmentation graph;
Determination of the horizontal dividing line:
assume the key-point coordinates of the two eyebrows are (x_1, y_1) and (x_2, y_2), and the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2); the vertical mid-points between the eyebrow and the eye on the left and right sides are computed respectively:
h_1 = \left( \frac{x_1 + m_1}{2}, \frac{y_1 + n_1}{2} \right)   (5)
h_2 = \left( \frac{x_2 + m_2}{2}, \frac{y_2 + n_2}{2} \right)   (6)
the line connecting h_1 and h_2 is taken as the horizontal dividing line, and the horizontal dividing line is treated as a fuzzy band, i.e. as the overlapping part of two different regions;
Determination of the vertical dividing lines:
assume the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2), and the key-point coordinates of the nose are (a_1, b_1); the horizontal mid-points between each eye and the nose are computed:
v_1 = \left( \frac{m_1 + a_1}{2}, \frac{n_1 + b_1}{2} \right)   (7)
v_2 = \left( \frac{m_2 + a_1}{2}, \frac{n_2 + b_1}{2} \right)   (8)
vertical lines are drawn through the two mid-points; these vertical lines are the vertical dividing lines and are likewise treated as fuzzy bands;
each pair of adjacent key points is segmented horizontally or vertically: if the two key points are adjacent vertically, their horizontal dividing line is computed; if they are adjacent horizontally, their vertical dividing line is computed; segmenting along these dividing lines yields the segmentation maps of the different regions of the face;
and step 3.2, regions containing the key points of the eyes, mouth and eyebrows are classified as expression-variable regions, and regions containing the key points of the nose, forehead and cheeks are classified as expression-invariant regions.
The step 4 specifically comprises the following steps:
Step 4.1, the images of the expression-variable regions and the expression-invariant regions are respectively input into the Gabor feature extraction channel, Gabor features are extracted and a Gabor feature map is output; the Gabor feature map is then partitioned 2×2 into four sub-block Gabor feature maps, where the two-dimensional Gabor kernel function is:
\psi_{u,v}(z) = \frac{\| k_{u,v} \|^2}{\sigma^2} \exp\left( -\frac{\| k_{u,v} \|^2 \| z \|^2}{2 \sigma^2} \right) \left[ \exp\left( i\, k_{u,v} \cdot z \right) - \exp\left( -\frac{\sigma^2}{2} \right) \right]   (9)
where u and v denote the indices of the selected kernel scale and orientation respectively, \| \cdot \| is the 2-norm operator, z is the image pixel coordinate, k_{u,v} is the wave vector of the kernel, and σ denotes the standard deviation;
Step 4.2, LBP feature extraction is performed on each sub-block obtained in step 4.1, where the block LBP value is given by formulas (10) and (11):
LBP(x_c, y_c) = \sum_{i=0}^{7} s(g_i - g_c) \, 2^{i}   (10)
s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}   (11)
where g_c is the brightness value of the center pixel and g_i (i = 0, 1, ..., 7) are the brightness values of the eight neighbouring pixels. If g_i > g_c, g_i is assigned 1; if g_i < g_c, g_i is assigned 0. This yields the binarized values of the eight neighbouring pixels, which are written as an eight-bit binary number, starting from the pixel to the right of the center point and proceeding counter-clockwise. The eight-bit binary number is converted to a decimal number, which represents the LBP value of the center point. LBP feature extraction is performed on every point of each of the four sub-block Gabor feature maps, yielding four feature histograms.
The linear discriminant method in step 5 is specifically as follows:
Let the face image features output in step 4 be the sample data x = (x_1, x_2, ..., x_n). After the mapping matrix H, the data are transformed into z = (z_1, z_2, ..., z_n), i.e. z = H^T x. The specific steps of the mapping are as follows:
Let the samples x_i of class X_i in the sample data be one-dimensional column vectors; the class center point is \mu_i, where N_i is the number of samples in X_i:
\mu_i = \frac{1}{N_i} \sum_{x \in X_i} x   (12)
After the H transformation, the center point becomes \tilde{\mu}_i, which is the projection of the center point of the sample set X_i:
\tilde{\mu}_i = H^T \mu_i   (13)
The within-class distance J_H is represented by the sum of squared distances from the projected samples to their class center point, and S_i denotes the within-class scatter:
J_H = \sum_{i} \sum_{z \in Z_i} \| z - \tilde{\mu}_i \|^2 = \sum_{i} H^T S_i H   (14)
S_i = \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T   (15)
The between-class distance J_B is measured mainly by the distance between the center points of two different classes, where S_B denotes the between-class scatter, and the total within-class scatter S_w is the sum of the per-class scatters:
J_B = \| \tilde{\mu}_1 - \tilde{\mu}_2 \|^2 = H^T S_B H   (16)
S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T   (17)
S_w = S_1 + S_2   (18)
The objective function is set as J(H); taking the derivative of the objective function and solving for its maximum gives the projection for which the resulting face feature information has the largest between-class distance and the smallest within-class distance:
J(H) = \frac{H^T S_B H}{H^T S_w H}   (19)
the step 5 specifically comprises the following steps:
The face features processed by the linear discriminant method, which have the minimum within-class distance and the maximum between-class distance, are compared with each face feature in the database; if the final similarity is greater than a preset threshold, the two are judged to be the same face, otherwise they are judged to be different faces.
The beneficial effects of the invention are:
(1) The method exploits the fact that different regions of the face are affected to different degrees by expression changes: the face is divided into an expression-variable region and an expression-invariant region, and the face image features are extracted region by region, which effectively improves recognition accuracy.
(2) The selected Gabor algorithm has multi-scale and multi-orientation characteristics and effectively weakens the influence of illumination changes and noise interference on face feature extraction.
(3) The block LBP algorithm not only reduces the dimensionality of the features extracted by the Gabor algorithm, but also further refines similar feature information so that differences between similar features can be distinguished, which effectively improves the accuracy of face recognition.
(4) The invention selects the classical LDA algorithm as the face discrimination method; this algorithm increases the between-class distance and reduces the within-class distance, so that different face features can be distinguished more accurately.
Drawings
FIG. 1 is a flow chart of the face recognition method by regional feature extraction of the present invention;
FIG. 2 is an LBP mapping diagram of the face recognition method by regional feature extraction of the present invention;
FIG. 3 is a diagram illustrating the selection of orientation and scale for the Gabor algorithm in the face recognition method by regional feature extraction of the present invention;
FIG. 4 is a flow chart of Gabor feature processing in the face recognition method by regional feature extraction of the present invention;
FIG. 5 is a schematic diagram of the selection of the number of blocks for the block LBP in the face recognition method by regional feature extraction of the present invention;
FIG. 6 is a comparison of algorithm accuracy for different sample numbers in the face recognition method by regional feature extraction of the present invention.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
The invention relates to a face recognition method by regional feature extraction, the flow of which is shown in figure 1 and is implemented according to the following steps:
step 1, acquiring a face image to be recognized;
step 2, preprocessing the face image obtained in the step 1 by utilizing a multitask convolutional neural network, detecting the face, and marking key points of the face;
The multi-task convolutional neural network (MTCNN) mainly comprises three sub-networks: P-Net, R-Net and O-Net. P-Net is a fully convolutional network used to generate candidate-box regression vectors, and the candidate boxes are screened using the box regression vectors and non-maximum suppression. The screened results are fed into R-Net, which again screens the boxes using box regression vectors and non-maximum suppression. Finally, O-Net performs the last screening, retaining the face box while marking the key points.
The preprocessing applies the multi-task convolutional neural network to the face image obtained in step 1: a face classifier determines whether the input image contains a face; if so, a face bounding box is obtained to automatically crop the face region, and the key points are then located.
The face classification result of the face classifier is computed with a cross-entropy loss function, expressed as:
L_i^{det} = -\left( y_i^{det} \log p_i + (1 - y_i^{det}) \log(1 - p_i) \right)   (1)
where L_i^{det} denotes the cross-entropy loss for face classification, p_i is the predicted probability that sample i is a face, and y_i^{det} \in \{0, 1\} is the ground-truth (background/face) label.
The position of the face bounding box is computed with a Euclidean distance loss, and the face bounding box is obtained, expressed as:
L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2   (2)
where L_i^{box} is the Euclidean loss of the face bounding box, \hat{y}_i^{box} is the predicted box position, and y_i^{box} is the ground-truth box position.
The key-point coordinates are located through a Euclidean distance loss, expressed as:
L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2   (3)
where L_i^{landmark} is the Euclidean loss of the key-point localization, \hat{y}_i^{landmark} denotes the predicted key-point positions, and y_i^{landmark} denotes the ground-truth key-point positions.
Formulas (1), (2) and (3) are given different weights for different samples and then summed, expressed as:
Y = \min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} a_j \, b_i \, L_i^{j}   (4)
where N is the number of training samples, a_j denotes the importance of task j, and b_i is the sample-type label. The whole preprocessing process minimizes the value of Y in formula (4), i.e. minimizes the values of formulas (1), (2) and (3). The corresponding weight values in P-Net and R-Net are (a_det = 1, a_box = 0.5, a_landmark = 0.5), and in O-Net (a_det = 1, a_box = 0.5, a_landmark = 1), where P-Net generates candidate boxes and bounding-box regression vectors, R-Net refines the candidate boxes, and O-Net outputs the final face box and key-point positions.
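As an illustration only, the following minimal numpy sketch computes the weighted multi-task loss of formulas (1)-(4); the toy sample, the helper names and the dictionary layout are assumptions made for this example and do not come from the patent.

```python
# Sketch (not the patent's implementation): the multi-task loss of formulas (1)-(4)
# computed with numpy. All variable names and the toy data are illustrative assumptions.
import numpy as np

def det_loss(p, y_det):
    """Formula (1): cross-entropy between predicted face probability p and label y_det."""
    eps = 1e-12                      # avoid log(0)
    return -(y_det * np.log(p + eps) + (1 - y_det) * np.log(1 - p + eps))

def box_loss(box_pred, box_true):
    """Formula (2): squared Euclidean distance between predicted and true box (x1, y1, x2, y2)."""
    return np.sum((box_pred - box_true) ** 2)

def landmark_loss(lmk_pred, lmk_true):
    """Formula (3): squared Euclidean distance between predicted and true key points."""
    return np.sum((lmk_pred - lmk_true) ** 2)

def total_loss(samples, a=(1.0, 0.5, 0.5)):
    """Formula (4): weighted sum over samples; a = (a_det, a_box, a_landmark),
    e.g. (1, 0.5, 0.5) for P-Net/R-Net and (1, 0.5, 1) for O-Net.
    Each sample carries b, per-task sample-type indicators that switch tasks on or off."""
    total = 0.0
    for s in samples:
        total += s["b"][0] * a[0] * det_loss(s["p"], s["y_det"])
        total += s["b"][1] * a[1] * box_loss(s["box_pred"], s["box_true"])
        total += s["b"][2] * a[2] * landmark_loss(s["lmk_pred"], s["lmk_true"])
    return total

# Toy usage: one positive sample with a box and five key points (ten coordinates).
sample = {
    "p": 0.9, "y_det": 1, "b": (1, 1, 1),
    "box_pred": np.array([0.1, 0.1, 0.8, 0.9]),
    "box_true": np.array([0.0, 0.0, 1.0, 1.0]),
    "lmk_pred": np.random.rand(10),
    "lmk_true": np.random.rand(10),
}
print(total_loss([sample]))
```

Under this reading of formula (4), samples that lack box or key-point annotations can simply be switched off through the b indicators, which is one way the minimization of Y described above can be organized in practice.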
Since different facial features are affected to different degrees by expression changes, the face can be divided into an expression-variable region and an expression-invariant region according to the position information of the key points. Because the boundary between the variable and invariant regions is blurred, the regions cannot be segmented exactly. The invention therefore proposes an edge-overlap segmentation method, which treats the fuzzy band as the intersection of two different regions and then performs region segmentation. Although this segmentation method increases data redundancy, it avoids the loss of edge features and thus effectively solves the problem that edge regions are difficult to assign. The face is divided into regions according to the marked key points, as follows:
step 3, dividing the face image subjected to preprocessing operation into an expression variable region and an expression invariable region according to the difference of the position information of the key points, specifically:
step 3.1, determining a segmentation line to obtain a face segmentation image;
Determination of the horizontal dividing line:
assume the key-point coordinates of the two eyebrows are (x_1, y_1) and (x_2, y_2), and the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2); the vertical mid-points between the eyebrow and the eye on the left and right sides are computed respectively:
h_1 = \left( \frac{x_1 + m_1}{2}, \frac{y_1 + n_1}{2} \right)   (5)
h_2 = \left( \frac{x_2 + m_2}{2}, \frac{y_2 + n_2}{2} \right)   (6)
the line connecting h_1 and h_2 is taken as the horizontal dividing line, and the horizontal dividing line is treated as a fuzzy band, i.e. as the overlapping part of two different regions;
Determination of the vertical dividing lines:
assume the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2), and the key-point coordinates of the nose are (a_1, b_1); the horizontal mid-points between each eye and the nose are computed:
v_1 = \left( \frac{m_1 + a_1}{2}, \frac{n_1 + b_1}{2} \right)   (7)
v_2 = \left( \frac{m_2 + a_1}{2}, \frac{n_2 + b_1}{2} \right)   (8)
vertical lines are drawn through the two mid-points; these vertical lines are the vertical dividing lines and are likewise treated as fuzzy bands;
each pair of adjacent key points is segmented horizontally or vertically: if the two key points are adjacent vertically, their horizontal dividing line is computed; if they are adjacent horizontally, their vertical dividing line is computed; segmenting along these dividing lines yields the segmentation maps of the different regions of the face;
step 3.2, regions containing the key points of the eyes, mouth and eyebrows are classified as expression-variable regions, and regions containing the key points of the nose, forehead and cheeks are classified as expression-invariant regions (a small illustrative sketch of this split follows);
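The sketch below is a rough illustration, under an assumed key-point layout and an assumed fuzzy-band width, of how the dividing lines of step 3 could be turned into overlapping crops; the band width, the helper names and the cropping choices are illustrative and not taken from the patent.

```python
# Sketch (assumed key-point layout and band width): split a face crop into an
# expression-variable part and expression-invariant parts, with overlapping
# fuzzy bands around each dividing line so no edge features are lost.
import numpy as np

def split_regions(img, brow_l, brow_r, eye_l, eye_r, nose, band=6):
    """img: H x W (x C) array; key points are (x, y) pixel coordinates."""
    # Horizontal dividing line: mid-height between eyebrows and eyes, formulas (5)-(6).
    h1_y = (brow_l[1] + eye_l[1]) / 2.0
    h2_y = (brow_r[1] + eye_r[1]) / 2.0
    horiz_y = int(round((h1_y + h2_y) / 2.0))

    # Vertical dividing lines: mid-x between each eye and the nose, formulas (7)-(8).
    v1_x = int(round((eye_l[0] + nose[0]) / 2.0))
    v2_x = int(round((eye_r[0] + nose[0]) / 2.0))

    H, W = img.shape[:2]
    # Expression-variable region: the eye/eyebrow band and everything below it (mouth),
    # extended upward by the fuzzy band so the two regions overlap at the boundary.
    variable = img[max(0, horiz_y - band):H, :]
    # Expression-invariant regions: the forehead above the horizontal line (plus the band)
    # and the nose/cheek strip between the two vertical lines (plus the band).
    forehead = img[0:min(H, horiz_y + band), :]
    nose_strip = img[:, max(0, v1_x - band):min(W, v2_x + band)]
    return variable, (forehead, nose_strip)

# Usage with made-up coordinates on a 128 x 128 face crop.
face = np.zeros((128, 128), dtype=np.uint8)
var_region, inv_regions = split_regions(face, (40, 38), (88, 38), (42, 52), (86, 52), (64, 74))
```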
Step 4, the images of the expression-variable regions and the expression-invariant regions are respectively input into the Gabor & block LBP feature extraction channel to obtain feature histograms containing the face feature information, specifically:
Step 4.1, the images of the expression-variable regions and the expression-invariant regions are respectively input into the Gabor feature extraction channel, Gabor features are extracted and a Gabor feature map is output; the Gabor feature map is then partitioned 2×2 into four sub-block Gabor feature maps, where the two-dimensional Gabor kernel function is:
\psi_{u,v}(z) = \frac{\| k_{u,v} \|^2}{\sigma^2} \exp\left( -\frac{\| k_{u,v} \|^2 \| z \|^2}{2 \sigma^2} \right) \left[ \exp\left( i\, k_{u,v} \cdot z \right) - \exp\left( -\frac{\sigma^2}{2} \right) \right]   (9)
where u and v denote the indices of the selected kernel scale and orientation respectively, \| \cdot \| is the 2-norm operator, z is the image pixel coordinate, k_{u,v} is the wave vector of the kernel, and σ denotes the standard deviation. As shown by the experiments in FIG. 3, the extracted features improve recognition accuracy most when u = 5 and v = 8, i.e. with five scales and eight orientations. Different scales effectively extract facial features at different levels of detail, and different orientations extract facial features from different angles, which counteracts the influence of illumination and noise on the input image.
The Gabor kernel is a complex function that can be divided into a real part and an imaginary part. Filtering with the real part smooths the image and reduces sensitivity to illumination, while filtering with the imaginary part effectively describes image edge information. As shown in FIG. 4, the real and imaginary parts of the Gabor feature extraction together comprise 40 filters, so the amount of information after processing the input image is 40 times that of the original image; the extracted features contain a large amount of redundant information, and a dimensionality reduction operation is therefore required.
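The following sketch builds a 5-scale, 8-orientation filter bank from a kernel of the form in formula (9) and filters an image with it; the values of k_max and f, the kernel size and the use of magnitude responses are conventional choices from the Gabor face-recognition literature, assumed here purely for illustration.

```python
# Sketch: build 5 x 8 Gabor kernels of the form in formula (9) and filter an image.
# k_max, f and the kernel size are conventional assumed values, not taken from the patent.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(u, v, size=31, sigma=2 * np.pi, k_max=np.pi / 2, f=np.sqrt(2)):
    """u: scale index (0..4), v: orientation index (0..7)."""
    k = (k_max / f**u) * np.exp(1j * v * np.pi / 8)      # wave vector k_{u,v}
    kx, ky = k.real, k.imag
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2, z2 = kx**2 + ky**2, x**2 + y**2
    envelope = (k2 / sigma**2) * np.exp(-k2 * z2 / (2 * sigma**2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)  # DC-compensated complex wave
    return envelope * carrier

def gabor_features(img):
    """Return the 40 magnitude responses (5 scales x 8 orientations)."""
    feats = []
    for u in range(5):
        for v in range(8):
            kern = gabor_kernel(u, v)
            # Filter with the real and imaginary parts separately, then take the magnitude.
            re = convolve2d(img, kern.real, mode="same", boundary="symm")
            im = convolve2d(img, kern.imag, mode="same", boundary="symm")
            feats.append(np.sqrt(re**2 + im**2))
    return np.stack(feats)                                # shape (40, H, W)

img = np.random.rand(64, 64)
print(gabor_features(img).shape)   # (40, 64, 64)
```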
Step 4.2, LBP feature extraction is performed on each sub-block obtained in step 4.1, where the block LBP value is given by formulas (10) and (11):
LBP(x_c, y_c) = \sum_{i=0}^{7} s(g_i - g_c) \, 2^{i}   (10)
s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}   (11)
where g_c is the brightness value of the center pixel and g_i (i = 0, 1, ..., 7) are the brightness values of the eight neighbouring pixels. The process of computing the LBP value of each pixel is shown in FIG. 2: if g_i > g_c, g_i is assigned 1; if g_i < g_c, g_i is assigned 0. This yields the binarized values of the eight neighbouring pixels, which are written as an eight-bit binary number, starting from the pixel to the right of the center point and proceeding counter-clockwise. The eight-bit binary number is converted to a decimal number, which represents the LBP value of the center point. LBP feature extraction is performed on every point of each of the four sub-block Gabor feature maps, yielding four feature histograms;
As shown in FIG. 5, when the number of blocks is 2×2, the recognition accuracy is effectively improved while the recognition speed is maintained.
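A minimal numpy sketch of the block LBP of formulas (10) and (11) with 2×2 blocking follows; the pixel loops are written for clarity rather than speed, and the function names are illustrative.

```python
# Sketch: LBP per formulas (10)-(11) followed by 2 x 2 blocking, with one 256-bin
# histogram per block. Plain loops are used for readability; names are illustrative.
import numpy as np

def lbp_image(gray):
    """8-neighbour LBP; border pixels are skipped for simplicity."""
    H, W = gray.shape
    out = np.zeros((H, W), dtype=np.uint8)
    # Neighbour offsets, starting to the right of the centre and going counter-clockwise.
    offs = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            gc = gray[y, x]
            code = 0
            for i, (dy, dx) in enumerate(offs):
                if gray[y + dy, x + dx] >= gc:     # s(g_i - g_c) from formula (11)
                    code |= (1 << i)               # weight 2^i as in formula (10)
            out[y, x] = code
    return out

def block_histograms(gray, blocks=(2, 2)):
    """Split the LBP map into blocks and concatenate the per-block histograms."""
    lbp = lbp_image(gray)
    H, W = lbp.shape
    bh, bw = H // blocks[0], W // blocks[1]
    hists = []
    for by in range(blocks[0]):
        for bx in range(blocks[1]):
            patch = lbp[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
            hist, _ = np.histogram(patch, bins=256, range=(0, 256))
            hists.append(hist)
    return np.concatenate(hists)     # 4 blocks x 256 bins = 1024-dimensional feature

feat = block_histograms(np.random.randint(0, 256, (64, 64)).astype(np.uint8))
print(feat.shape)   # (1024,)
```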
step 5, processing the feature histogram containing the face feature information in the step 4 by using a linear discrimination method, and then matching the processed face feature information with the face features in the database to obtain a face recognition result;
The LDA algorithm gives the best separability for features distributed in the feature space; that is, in the new space obtained with this method the between-class scatter matrix of the samples is maximized while the within-class scatter matrix is minimized. The linear discriminant method is specifically as follows:
Let the face image features output in step 4 be x = (x_1, x_2, ..., x_n). After the mapping matrix H, the data are transformed into z = (z_1, z_2, ..., z_n), i.e. z = H^T x. The specific steps of the mapping are as follows:
Let the samples x_i of class X_i be one-dimensional column vectors; the class center point is \mu_i, where N_i is the number of samples in X_i:
\mu_i = \frac{1}{N_i} \sum_{x \in X_i} x   (12)
After the H transformation, the center point becomes \tilde{\mu}_i, which is the projection of the center point of the sample set X_i:
\tilde{\mu}_i = H^T \mu_i   (13)
The within-class distance J_H is represented by the sum of squared distances from the projected samples to their class center point, and S_i denotes the within-class scatter:
J_H = \sum_{i} \sum_{z \in Z_i} \| z - \tilde{\mu}_i \|^2 = \sum_{i} H^T S_i H   (14)
S_i = \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T   (15)
The between-class distance J_B is measured mainly by the distance between the center points of two different classes, where S_B denotes the between-class scatter, and the total within-class scatter S_w is the sum of the per-class scatters:
J_B = \| \tilde{\mu}_1 - \tilde{\mu}_2 \|^2 = H^T S_B H   (16)
S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T   (17)
S_w = S_1 + S_2   (18)
The objective function is set as J(H); taking the derivative of the objective function and solving for its maximum gives the projection for which the resulting face feature information has the largest between-class distance and the smallest within-class distance:
J(H) = \frac{H^T S_B H}{H^T S_w H}   (19)
Namely: the face features processed by the linear discriminant method, which have the minimum within-class distance and the maximum between-class distance, are compared with each face feature in the database; if the final similarity is greater than a preset threshold, the two are judged to be the same face, otherwise they are judged to be different faces.
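For illustration, the sketch below projects the feature histograms with an LDA implementation and matches a probe against a gallery by similarity against a preset threshold; scikit-learn's LinearDiscriminantAnalysis is used as a stand-in for the mapping matrix H, and the cosine similarity measure, the threshold value and the placeholder data are assumptions, since the patent only requires that the similarity exceed a preset threshold.

```python
# Sketch: LDA projection of the Gabor & block-LBP histograms followed by gallery
# matching with a preset similarity threshold. All concrete values are illustrative.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Training: feature histograms X (n_samples x n_dims) with identity labels y.
rng = np.random.default_rng(0)
X = rng.random((60, 1024))          # placeholder features; real ones come from step 4
y = np.repeat(np.arange(6), 10)     # 6 identities, 10 images each
lda = LinearDiscriminantAnalysis(n_components=5)   # at most (n_classes - 1) components
Z = lda.fit_transform(X, y)

# Gallery: one mean template per enrolled identity in the projected space.
gallery = {cid: Z[y == cid].mean(axis=0) for cid in np.unique(y)}

def identify(probe_feat, threshold=0.8):
    """Return the best-matching identity, or None if similarity stays below threshold."""
    z = lda.transform(probe_feat.reshape(1, -1))[0]
    best_id, best_sim = None, -1.0
    for cid, template in gallery.items():
        sim = cosine(z, template)
        if sim > best_sim:
            best_id, best_sim = cid, sim
    return (best_id, best_sim) if best_sim > threshold else (None, best_sim)

print(identify(X[3]))
```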
To verify the invention, four algorithms (LBP, Gabor, LBP & SVM, and the Gabor & block LBP of the invention) were simulated on the FERET face database with different numbers of samples. The recognition accuracy is shown in FIG. 6, and Table 1 compares the recognition time of the four algorithms when the number of samples is 600.
TABLE 1
Method               Recognition time (s)
LBP                  1.75
Gabor                2.56
LBP & SVM            1.93
Gabor & block LBP    1.88
From FIG. 6 and Table 1, as the number of samples increases, the average accuracy of the Gabor & block LBP algorithm is about 1% higher than that of LBP & SVM, the most accurate algorithm in the control experiments, while its recognition time at 600 samples is only 0.13 s longer. The results show that, for different sample sizes in the FERET face database, the proposed algorithm achieves higher recognition accuracy than the other algorithms.

Claims (7)

1. A face recognition method by regional feature extraction, characterized by comprising the following steps:
step 1, acquiring a face image to be recognized;
step 2, preprocessing the face image obtained in the step 1 by utilizing a multitask convolutional neural network, detecting the face, and marking key points of the face;
step 3, dividing the face image subjected to preprocessing operation into an expression variable region and an expression invariable region according to the difference of the position information of the key points;
step 4, respectively inputting images of the variable regions and the invariable regions of the expression into a Gabor & block LBP feature extraction channel to obtain a feature histogram containing face feature information;
and 5, processing the feature histogram containing the face feature information in the step 4 by using a linear discrimination method, and then matching the processed face feature information with the face features in the database to obtain a face recognition result.
2. The face recognition method by regional feature extraction according to claim 1, wherein in step 2 the face image obtained in step 1 is preprocessed with a multi-task convolutional neural network: a face classifier determines whether the input image contains a face; if so, a face bounding box is obtained to automatically crop the face region, and the key points are then located.
3. The face recognition method by regional feature extraction according to claim 2, wherein the face classification result of the face classifier is computed with a cross-entropy loss function, expressed as:
L_i^{det} = -\left( y_i^{det} \log p_i + (1 - y_i^{det}) \log(1 - p_i) \right)   (1)
where L_i^{det} denotes the cross-entropy loss for face classification, p_i is the predicted probability that sample i is a face, and y_i^{det} \in \{0, 1\} is the ground-truth (background/face) label;
the position of the face bounding box is computed with a Euclidean distance loss, and the face bounding box is obtained, expressed as:
L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2   (2)
where L_i^{box} is the Euclidean loss of the face bounding box, \hat{y}_i^{box} is the predicted box position, and y_i^{box} is the ground-truth box position;
the key-point coordinates are located through a Euclidean distance loss, expressed as:
L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2   (3)
where L_i^{landmark} is the Euclidean loss of the key-point localization, \hat{y}_i^{landmark} denotes the predicted key-point positions, and y_i^{landmark} denotes the ground-truth key-point positions;
formulas (1), (2) and (3) are given different weights for different samples and then summed, expressed as:
Y = \min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} a_j \, b_i \, L_i^{j}   (4)
where N is the number of training samples, a_j denotes the importance of task j, and b_i is the sample-type label; the whole preprocessing process minimizes the value of Y in formula (4), i.e. minimizes the values of formulas (1), (2) and (3).
4. The face recognition method by regional feature extraction according to claim 3, wherein step 3 specifically comprises:
step 3.1, determining the dividing lines to obtain a face segmentation map;
determination of the horizontal dividing line:
assume the key-point coordinates of the two eyebrows are (x_1, y_1) and (x_2, y_2), and the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2); the vertical mid-points between the eyebrow and the eye on the left and right sides are computed respectively:
h_1 = \left( \frac{x_1 + m_1}{2}, \frac{y_1 + n_1}{2} \right)   (5)
h_2 = \left( \frac{x_2 + m_2}{2}, \frac{y_2 + n_2}{2} \right)   (6)
the line connecting h_1 and h_2 is taken as the horizontal dividing line, and the horizontal dividing line is treated as a fuzzy band, i.e. as the overlapping part of two different regions;
determination of the vertical dividing lines:
assume the key-point coordinates of the two eyes are (m_1, n_1) and (m_2, n_2), and the key-point coordinates of the nose are (a_1, b_1); the horizontal mid-points between each eye and the nose are computed:
v_1 = \left( \frac{m_1 + a_1}{2}, \frac{n_1 + b_1}{2} \right)   (7)
v_2 = \left( \frac{m_2 + a_1}{2}, \frac{n_2 + b_1}{2} \right)   (8)
vertical lines are drawn through the two mid-points; these vertical lines are the vertical dividing lines and are likewise treated as fuzzy bands;
each pair of adjacent key points is segmented horizontally or vertically: if the two key points are adjacent vertically, their horizontal dividing line is computed; if they are adjacent horizontally, their vertical dividing line is computed; segmenting along these dividing lines yields the segmentation maps of the different regions of the face;
and step 3.2, regions containing the key points of the eyes, mouth and eyebrows are classified as expression-variable regions, and regions containing the key points of the nose, forehead and cheeks are classified as expression-invariant regions.
5. The face recognition method by regional feature extraction according to claim 4, wherein step 4 specifically comprises:
step 4.1, the images of the expression-variable regions and the expression-invariant regions are respectively input into the Gabor feature extraction channel, Gabor features are extracted and a Gabor feature map is output; the Gabor feature map is then partitioned 2×2 into four sub-block Gabor feature maps, where the two-dimensional Gabor kernel function is:
\psi_{u,v}(z) = \frac{\| k_{u,v} \|^2}{\sigma^2} \exp\left( -\frac{\| k_{u,v} \|^2 \| z \|^2}{2 \sigma^2} \right) \left[ \exp\left( i\, k_{u,v} \cdot z \right) - \exp\left( -\frac{\sigma^2}{2} \right) \right]   (9)
where u and v denote the indices of the selected kernel scale and orientation respectively, \| \cdot \| is the 2-norm operator, z is the image pixel coordinate, k_{u,v} is the wave vector of the kernel, and σ denotes the standard deviation;
step 4.2, LBP feature extraction is performed on each sub-block obtained in step 4.1, where the block LBP value is given by formulas (10) and (11):
LBP(x_c, y_c) = \sum_{i=0}^{7} s(g_i - g_c) \, 2^{i}   (10)
s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}   (11)
where g_c is the brightness value of the center pixel and g_i (i = 0, 1, ..., 7) are the brightness values of the eight neighbouring pixels; if g_i > g_c, g_i is assigned 1, and if g_i < g_c, g_i is assigned 0; this yields the binarized values of the eight neighbouring pixels, which are written as an eight-bit binary number starting from the pixel to the right of the center point and proceeding counter-clockwise; the eight-bit binary number is converted to a decimal number, which represents the LBP value of the center point; LBP feature extraction is performed on every point of each of the four sub-block Gabor feature maps, yielding four feature histograms.
6. The face recognition method by regional feature extraction according to claim 5, wherein the linear discriminant method in step 5 is specifically as follows:
let the face image features output in step 4 be x = (x_1, x_2, ..., x_n); after the mapping matrix H, the data are transformed into z = (z_1, z_2, ..., z_n), i.e. z = H^T x; the specific steps of the mapping are as follows:
let the samples x_i of class X_i be one-dimensional column vectors; the class center point is \mu_i, where N_i is the number of samples in X_i:
\mu_i = \frac{1}{N_i} \sum_{x \in X_i} x   (12)
after the H transformation, the center point becomes \tilde{\mu}_i, which is the projection of the center point of the sample set X_i:
\tilde{\mu}_i = H^T \mu_i   (13)
the within-class distance J_H is represented by the sum of squared distances from the projected samples to their class center point, and S_i denotes the within-class scatter:
J_H = \sum_{i} \sum_{z \in Z_i} \| z - \tilde{\mu}_i \|^2 = \sum_{i} H^T S_i H   (14)
S_i = \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T   (15)
the between-class distance J_B is measured mainly by the distance between the center points of two different classes, where S_B denotes the between-class scatter and the total within-class scatter S_w is the sum of the per-class scatters:
J_B = \| \tilde{\mu}_1 - \tilde{\mu}_2 \|^2 = H^T S_B H   (16)
S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T   (17)
S_w = S_1 + S_2   (18)
the objective function is set as J(H); taking the derivative of the objective function and solving for its maximum gives the projection for which the resulting face feature information has the largest between-class distance and the smallest within-class distance:
J(H) = \frac{H^T S_B H}{H^T S_w H}   (19)
7. The face recognition method by regional feature extraction according to claim 6, wherein step 5 specifically comprises:
comparing the face features processed by the linear discriminant method, which have the minimum within-class distance and the maximum between-class distance, with each face feature in the database; if the final similarity is greater than a preset threshold, the two are judged to be the same face, otherwise they are judged to be different faces.
CN201910954178.2A 2019-10-09 2019-10-09 Face recognition method by regional feature extraction Active CN110826408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910954178.2A CN110826408B (en) 2019-10-09 2019-10-09 Face recognition method by regional feature extraction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910954178.2A CN110826408B (en) 2019-10-09 2019-10-09 Face recognition method by regional feature extraction

Publications (2)

Publication Number Publication Date
CN110826408A CN110826408A (en) 2020-02-21
CN110826408B true CN110826408B (en) 2023-03-28

Family

ID=69548847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910954178.2A Active CN110826408B (en) 2019-10-09 2019-10-09 Face recognition method by regional feature extraction

Country Status (1)

Country Link
CN (1) CN110826408B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111521270A (en) * 2020-04-23 2020-08-11 烟台艾睿光电科技有限公司 Body temperature screening alarm system and working method thereof
CN112508972A (en) * 2021-01-12 2021-03-16 广东东软学院 Information identification method and device based on artificial intelligence
CN113687716A (en) * 2021-07-29 2021-11-23 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Data three-dimensional visualization platform system and method based on intelligent interaction technology
CN115410265B (en) * 2022-11-01 2023-01-31 合肥的卢深视科技有限公司 Model training method, face recognition method, electronic device and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100608595B1 (en) * 2004-11-16 2006-08-03 삼성전자주식회사 Face identifying method and apparatus
CN106599854B (en) * 2016-12-19 2020-03-27 河北工业大学 Automatic facial expression recognition method based on multi-feature fusion
CN107729835B (en) * 2017-10-10 2020-10-16 浙江大学 Expression recognition method based on fusion of traditional features of face key point region and face global depth features

Also Published As

Publication number Publication date
CN110826408A (en) 2020-02-21

Similar Documents

Publication Publication Date Title
CN110826408B (en) Face recognition method by regional feature extraction
Gunay et al. Automatic age classification with LBP
WO2016149944A1 (en) Face recognition method and system, and computer program product
CN109684959B (en) Video gesture recognition method and device based on skin color detection and deep learning
CN106778586A (en) Offline handwriting signature verification method and system
Asteriadis et al. Facial feature detection using distance vector fields
He et al. Real-time human face detection in color image
CN111126240B (en) Three-channel feature fusion face recognition method
Ibrahim et al. Leaf recognition using texture features for herbal plant identification
CN106203539B (en) Method and device for identifying container number
WO2011037579A1 (en) Face recognition apparatus and methods
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
CN108108760A (en) A kind of fast human face recognition
Mantecon et al. Depth-based face recognition using local quantized patterns adapted for range data
CN108073940B (en) Method for detecting 3D target example object in unstructured environment
CN112597812A (en) Finger vein identification method and system based on convolutional neural network and SIFT algorithm
Santa et al. Bangladeshi hand sign language recognition from video
CN107392105B (en) Expression recognition method based on reverse collaborative salient region features
CN109902692A (en) A kind of image classification method based on regional area depth characteristic coding
CN116342968B (en) Dual-channel face recognition method and device
Shaban et al. A Novel Fusion System Based on Iris and Ear Biometrics for E-exams.
Curran et al. The use of neural networks in real-time face detection
Jida et al. Face segmentation and detection using Voronoi diagram and 2D histogram
Paul et al. Automatic adaptive facial feature extraction using CDF analysis
Lin et al. Face detection algorithm based on multi-orientation gabor filters and feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant