CN107403168B - Face recognition system - Google Patents


Info

Publication number
CN107403168B
Authority
CN
China
Prior art keywords
face
image
facial
probability parameter
gender
Prior art date
Legal status
Active
Application number
CN201710667710.3A
Other languages
Chinese (zh)
Other versions
CN107403168A (en)
Inventor
孙强
崔孔明
Current Assignee
Qingdao Yousuo Intelligent Technology Co ltd
Original Assignee
Qingdao Yousuo Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Qingdao Yousuo Intelligent Technology Co ltd
Priority to CN201710667710.3A
Publication of CN107403168A
Application granted
Publication of CN107403168B
Status: Active

Classifications

    • G06V 40/165: Detection, localisation and normalisation of human faces using facial parts and geometric relationships
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/30: Image preprocessing; noise filtering
    • G06V 10/40: Extraction of image or video features
    • G06V 10/50: Feature extraction by operations within image blocks or by using histograms, e.g. histogram of oriented gradients [HoG]
    • G06V 10/60: Image or video features relating to illumination properties, e.g. using a reflectance or lighting model
    • G06V 40/174: Facial expression recognition
    • G06V 10/467: Encoded features or binary features, e.g. local binary patterns [LBP]

Abstract

The invention discloses a face recognition system in the technical field of image processing and pattern recognition. The system first determines the gender of a facial image to be recognized using a deep learning method, then determines the identity of the face with high accuracy using a convolutional-neural-network-based method combined with the recognized gender, and finally determines the facial micro-expression using a binocular vision technique based on the parallax features of the image. Face recognition that combines the gender, identity and expression information of the facial image is thereby realized. By combining the deep learning method, the convolutional neural network and the binocular vision technique, the system reduces the influence of factors such as illumination conditions and facial posture on the face recognition rate, reduces the amount of data computation in the recognition process, achieves fast and highly accurate face recognition, lowers the cost of the recognition process, and improves recognition accuracy under different facial postures.

Description

Face recognition system
Technical Field
The invention relates to the technical field of image processing and pattern recognition, in particular to a face recognition system.
Background
Face recognition technology performs identity identification by analysing the shape and relative position of facial organs. It is an important biometric technology and is widely applied in fields such as security, access control, and surveillance. The main algorithms of face recognition include methods based on template matching of geometric features, methods based on sample learning, and methods based on texture features. Texture-based face recognition mainly relies on local binary patterns (LBP) to extract facial features.
Currently, face recognition systems are classified into two-dimensional face recognition systems and three-dimensional face recognition systems according to the difference in processed data.
The methods adopted by two-dimensional face recognition systems are relatively mature; the eigenfaces method proposed by Turk and Pentland in 1991 already achieved a good recognition effect. Subsequent studies have continually produced face recognition methods based on neural networks, support vector machines (SVM), wavelet transforms, and more. However, no improvement can overcome the inherent defect of two-dimensional face recognition: the matching of the features of the image to be recognized against the features in the sample library is affected by changes in illumination conditions, facial posture and other factors, which degrades recognition performance.
Most three-dimensional face recognition methods are based on more abstract spatial geometric features, such as matching surface similarity with an iterative closest point method, or matching curves extracted from local regions located by three-dimensional model feature points. However, three-dimensional face recognition technology is not yet mature: three-dimensional data is voluminous, its computational complexity and data volume are high, the recognition rate is low, three-dimensional acquisition equipment is expensive, and acquisition conditions are constrained. Prior-art three-dimensional face recognition is therefore difficult to popularize in practical applications.
Disclosure of Invention
The embodiment of the invention provides a face recognition system that aims to reduce the influence of factors such as illumination conditions and facial posture on the face recognition rate, reduce the amount of data computation in the recognition process, achieve fast and highly accurate face recognition, and reduce the cost of the recognition process.
The specific technical scheme provided by the invention is as follows:
a facial recognition system, the facial recognition system comprising:
the facial image processing module is used for cropping the facial image after determining the facial region in the image to be processed based on a facial feature point positioning algorithm, and performing noise reduction, light supplement, highlighting and normalization processing on the cropped facial image to be recognized;
the facial feature extraction module is used for extracting facial features of the facial image to be recognized based on key points of the facial image, wherein the facial features comprise histogram of oriented gradients (HOG) features, local binary pattern (LBP) features and parallax features of facial pixel points;
a big data-based face recognition module, used for recognizing the gender of the facial image to be recognized based on big data, according to the histogram of oriented gradients (HOG) features and the local binary pattern (LBP) features;
the convolutional neural network-based face recognition module is used for achieving high-accuracy, multi-angle identity recognition of the facial image to be recognized by adopting a trained convolutional neural network model, according to the histogram of oriented gradients (HOG) features and the local binary pattern (LBP) features;
and the binocular vision-based face recognition module is used for acquiring a dedicated facial-expression detection image of the facial image to be recognized according to the parallax features of the facial pixel points, and recognizing the facial micro-expression of the facial image to be recognized based on that dedicated image.
Optionally, the facial image processing module specifically includes:
the multi-view face region detection submodule is used for converting the input color image to grayscale, performing histogram equalization, performing face detection with front, left-side and right-side face detectors respectively, removing face detection results whose area is smaller than a predetermined value, and obtaining a multi-view facial image to be processed;
the face positioning and normalization submodule is used for positioning feature points in the obtained multi-view facial image to be processed based on a mixed tree-structured feature-point model built on HOG (histogram of oriented gradients) features, accurately determining the facial region from the facial outer-contour feature points after positioning, and completing normalization of the facial image to be processed by cropping and scaling the facial region image;
and the facial image post-processing submodule is used for performing noise reduction, light supplement and highlighting on the cropped facial image to be processed to obtain the facial image to be recognized.
Optionally, the facial feature extraction module specifically includes:
the HOG feature extraction submodule is used for carrying out histogram equalization on the facial image to be identified after normalization processing and extracting HOG features;
the LBP characteristic extraction submodule is used for carrying out image blocking on the normalized facial image to be identified according to different blocking strategies and extracting the LBP characteristic of the blocked image by adopting a mixed LBP (local binary pattern) operator;
and the parallax extraction submodule is used for calculating the parallax image of the facial image to be recognized according to the depth images of the same scene respectively shot by the two depth cameras, and acquiring the parallax features of the facial pixel points of the facial image to be recognized based on the parallax image.
Optionally, the big data based face recognition module specifically includes:
the gender recognition model construction sub-module based on deep learning is used for training a gender recognition model based on deep learning for facial gender recognition according to the face sample marked with gender;
and the gender identification submodule is used for carrying out gender identification on the facial image to be identified by adopting the gender identification model based on deep learning.
Optionally, the gender identification model building submodule based on deep learning is specifically configured to:
training a first deep-learning gender identification model for facial gender identification by using all gender-labeled facial samples, wherein output parameters of the first deep-learning gender identification model comprise a first probability parameter for representing that the gender-labeled facial samples are males and a second probability parameter for representing that the gender-labeled facial samples are females, and the sum of the first probability parameter and the second probability parameter is 1;
obtaining a retraining face sample of which the absolute value of the difference value between the first probability parameter and the second probability parameter is smaller than a preset threshold value from all the gender-marked face samples;
and training a second deep learning gender identification model for identifying the gender of the face by adopting the retraining face sample, wherein the output parameters of the second deep learning gender identification model comprise a third probability parameter for representing that the retraining face sample is male and a fourth probability parameter for representing that the retraining face sample is female, and the sum of the third probability parameter and the fourth probability parameter is 1.
Optionally, the gender identification submodule is specifically configured to:
acquire, by using the first deep learning gender recognition model, a first probability parameter that the facial image to be recognized is male and a second probability parameter that it is female, wherein the sum of the first probability parameter and the second probability parameter is 1;
if the difference value between the first probability parameter and the second probability parameter is larger than a preset threshold value, determining that the facial image to be identified is a male;
if the difference value between the second probability parameter and the first probability parameter is larger than a preset threshold value, determining that the facial image to be identified is female;
if the absolute value of the difference value between the first probability parameter and the second probability parameter is smaller than a preset threshold value, acquiring a third probability parameter that the facial image to be recognized is a male and a fourth probability parameter that the facial image to be recognized is a female by using the second deep learning gender recognition model, wherein the sum of the third probability parameter and the fourth probability parameter is 1;
determining the gender of the facial image to be recognized according to the first probability parameter, the second probability parameter, the third probability parameter and the fourth probability parameter.
Optionally, the face recognition module based on the convolutional neural network specifically includes:
an image input sub-module, configured to input the histogram of oriented gradients HOG feature and the local binary pattern LBP feature of the facial image to be processed;
the face recognition model construction submodule based on the convolutional neural network is used for training a face recognition model based on the convolutional neural network for face recognition according to the marked face sample;
and the face recognition submodule is used for realizing high-accuracy face recognition of the face image at multiple angles by adopting the face recognition model based on the convolutional neural network according to the HOG feature of the direction gradient histogram and the LBP feature of the local binary pattern.
Optionally, the facial recognition model building submodule based on the convolutional neural network is specifically configured to:
constructing a deep convolutional neural network for face recognition in a server, wherein the deep convolutional neural network has 9 layers, the number of input-layer neurons equals the pixel size of a face sample, and the remaining layers are set as follows: layers 1, 3, 5 and 7 are convolutional layers C1, C2, C3 and C4, composed of 4, 8 and 12 feature maps with 6 × 6 kernels respectively, each neuron being connected to a 6 × 6 neighborhood of its input; layers 2, 4 and 6 are downsampling layers S1, S2 and S3, each neuron of whose feature maps is connected to a 4 × 4 neighborhood of the corresponding feature map in layers 1, 3 and 5; layer 8 is a hidden layer, which arranges the feature values of the 12 feature maps of C4 into a column vector to form a feature vector, on which one-dimensional features the final classification and recognition are performed; layer 9 is the output layer, whose number of neurons is determined by the number of face identities to be distinguished and represents the total number of possible recognition results;
inputting the collected face images and the corresponding face identities into the configured deep convolutional neural network, and obtaining an output Op through layer-by-layer forward propagation;
and calculating the difference between the output Op and the corresponding ideal output Yp, and adjusting the weight matrices by error minimization until a reasonable convolutional neural network-based face recognition model is obtained.
Optionally, the binocular vision technology-based face recognition module specifically includes:
the color image construction sub-module is used for constructing a color image with the same size as the parallax image of the face image according to the parallax value of each pixel point in the parallax image of the face image; the three primary color values of each pixel point of the color image are related to the parallax values of the corresponding pixel points in the parallax image of the face image;
the three-dimensional distance calculation sub-module is used for dividing the color image into a plurality of candidate regions and determining three-dimensional space information of each candidate region by combining the parallax value of the pixel point corresponding to each candidate region in the parallax image;
the facial organ selection submodule is used for determining whether each candidate region is a facial organ region according to the three-dimensional space information of each candidate region and a preset three-dimensional space information threshold value of a facial organ;
and the face recognition submodule is used for recognizing the identity information of the face image according to the three-dimensional space information of the facial organ of the face image.
Optionally, the three-dimensional distance calculation sub-module specifically includes:
the image segmentation unit is used for dividing the color image into a plurality of areas according to different colors of different areas of the color image;
the coordinate calculation unit is used for calculating the three-dimensional space coordinate of each pixel point according to the parallax value of each pixel point in the parallax map of the face image;
and the region calculation unit is used for determining the size and the position of each candidate region according to the three-dimensional space coordinates of the pixel points contained in each candidate region.
The invention has the following beneficial effects:
the face recognition system provided by the embodiment of the invention respectively extracts the HOG (histogram of oriented gradient), the LBP (local binary pattern) and the parallax features of face pixels of a face image to be recognized, then determines the gender of the face image to be recognized by adopting a deep learning method, then determines the face identity of the face image to be recognized at high accuracy by combining the gender recognized by the deep learning, and finally determines the facial micro-expression of the face image to be recognized by adopting a binocular vision technology based on the parallax features of the face image to be recognized, so that the face recognition combining the gender information, the identity information and the expression information of the face image to be recognized is realized, the deep learning method, the convolutional neural network and the binocular vision technology are combined, the influence of factors such as illumination conditions and facial postures on the face recognition rate is reduced, and the data calculation amount in the face recognition process is reduced, the face recognition method and the face recognition device realize rapid and high-accuracy face recognition, reduce the cost in the face recognition process and improve the accuracy of face recognition under different face postures.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a face recognition system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a distribution of key points of a face according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the distribution of facial features according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of facial region segmentation according to an embodiment of the present invention;
FIG. 5 is a block diagram of a face image processing module 100 according to an embodiment of the present invention;
FIG. 6 is a block diagram of the facial feature extraction module 200 according to an embodiment of the present invention;
FIG. 7 is a block diagram of a big data based face recognition module 300 according to an embodiment of the present invention;
FIG. 8 is a block diagram of a convolutional neural network based face recognition module 400 according to an embodiment of the present invention;
fig. 9 is a schematic block diagram illustrating a structure of a face recognition module 500 based on binocular vision technology according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention.
The terms "first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eighth" and "ninth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first," "second," "third," "fourth," "fifth," "sixth," "seventh," "eighth," and "ninth" may explicitly or implicitly include one or more of the features. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
The face recognition system of the embodiment of the invention can be applied to various intelligent terminals, such as intelligent mobile phones, intelligent cameras, intelligent monitoring equipment, intelligent access control and the like; the face recognition system of the embodiment of the invention is particularly suitable for various terminals with face recognition technology, such as security equipment, monitoring equipment, access control equipment, smart phones and the like which adopt the face recognition technology and binocular vision cameras.
Referring to fig. 1, a face recognition system according to an embodiment of the present invention includes a facial image processing module 100, a facial feature extraction module 200, a big data-based face recognition module 300, a convolutional neural network-based face recognition module 400, and a binocular vision-based face recognition module 500. The facial image processing module 100 is configured to crop the facial image after determining the facial region in the image to be processed based on a facial feature point positioning algorithm, and to perform noise reduction, light supplement, highlighting and normalization on the cropped facial image to be recognized. The facial feature extraction module 200 is configured to extract facial features of the facial image to be recognized based on key points of the facial image; the extracted features include histogram of oriented gradients (HOG) features, local binary pattern (LBP) features, and parallax features of facial pixel points. The big data-based face recognition module 300 is configured to recognize the gender of the facial image to be recognized with a deep learning method, according to the extracted HOG and LBP features. The convolutional neural network-based face recognition module 400 is configured to achieve high-accuracy, multi-angle identity recognition of the facial image to be recognized by adopting a trained convolutional neural network model, according to the extracted HOG and LBP features. The binocular vision-based face recognition module 500 is configured to acquire a dedicated facial-expression detection image of the facial image to be recognized according to the parallax features of the facial pixel points, and to recognize the facial micro-expression of the facial image based on that dedicated image.
Specifically, as shown in fig. 5, the facial image processing module 100 includes: the multi-view face region detection sub-module 101, configured to convert the input color image to grayscale, perform histogram equalization, perform face detection with front, left-side and right-side face detectors respectively, remove face detection results whose area is smaller than a predetermined value, and obtain a multi-view facial image to be processed; the face positioning and normalization sub-module 102, configured to position feature points in the obtained multi-view facial image based on a mixed tree-structured feature-point model built on HOG (histogram of oriented gradients) features, accurately determine the facial region from the feature points of the outer facial contour after positioning, and complete normalization of the facial image by cropping and scaling the facial region; and the facial image post-processing sub-module 103, configured to perform noise reduction, light supplement and highlighting on the cropped facial image to obtain the facial image to be recognized.
Further, the multi-view face region detection submodule 101 first converts the input color image to grayscale and performs histogram equalization, then performs face detection with front, left-side and right-side face detectors respectively and removes face detection results whose area is smaller than a predetermined value, obtaining a multi-view facial image to be processed. In other words, the submodule 101 preprocesses the input color image to obtain a grayscale image of the image to be processed, and the face detectors then perform face detection on that grayscale image.
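By way of illustration, a minimal Python sketch of this preprocessing and detection step, assuming OpenCV's stock Haar cascades stand in for the front and profile face detectors (the patent names no specific detector implementation, and the 40 × 40 minimum area is a placeholder for the predetermined value):

```python
import cv2

frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

def detect_faces(bgr_image, min_area=40 * 40):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)  # graying
    gray = cv2.equalizeHist(gray)                       # histogram equalization
    faces = list(frontal.detectMultiScale(gray, 1.1, 5))
    faces += list(profile.detectMultiScale(gray, 1.1, 5))         # one profile side
    flipped = cv2.flip(gray, 1)                                   # mirror for the other side
    width = gray.shape[1]
    for (x, y, w, h) in profile.detectMultiScale(flipped, 1.1, 5):
        faces.append((width - x - w, y, w, h))
    # remove detection results whose area is below the predetermined value
    return [f for f in faces if f[2] * f[3] >= min_area]
```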
For the process of performing face detection on an image to be processed by using a face detector, first, facial contour key points, eyebrow key points, eye key points, nose key points and mouth key points included in the image to be processed are determined, wherein the facial contour key points, the eyebrow key points, the eye key points, the nose key points and the mouth key points respectively represent the position of a facial contour, the position of an eyebrow, the position of an eye, the position of a nose and the position of a mouth in the image to be processed.
For example, fig. 2 shows the distribution of facial key points used by a face detector according to an embodiment of the present invention. As shown in fig. 2, 83 facial key points are determined in total: 19 facial contour key points representing the position and size of the facial contour, 16 eyebrow key points representing eyebrow position and size, 18 eye key points representing eye position and size, 12 nose key points representing nose position and size, and 18 mouth key points representing mouth position and size. Of course, this is merely an example and does not limit the distribution positions or number of facial key points determined in embodiments of the present invention.
After the facial key points included in the image to be processed are determined, the facial region can be determined from their positions, face detection results whose area is smaller than a predetermined value are removed, and the viewing angle of the facial image can be determined from the ratios between facial key points, yielding a multi-view facial image to be processed.
Further, experience shows that facial angles generally fall into five categories: front face, left side face, right side face, head lowered downward, and head raised upward. Relative to the front face, the facial features of a left or right side face change greatly in the horizontal direction and hardly at all in the vertical direction, while those of a lowered or raised head change little in the horizontal direction and greatly in the vertical direction. For example, fig. 3 shows the distribution of facial features in a frontal posture. Referring to fig. 3, the vertical distance between the eyeballs and the forehead line and the vertical distance between the eyeballs and the highest point of the head each account for half of the vertical extent of the face; that is, the line connecting the two eyeballs is the vertical midline of the face. The vertical distance between the nose-base line and the forehead line and the vertical distance between the nose-base line and the eyebrow line each account for one third of the vertical extent of the face; that is, at a frontal angle, the ratio of the vertical distance between the eyebrow line and the nose-base line to the vertical distance between the nose-base line and the forehead base line is 1:1. Further analysis shows that when the facial angle changes from frontal to raised, the facial part between the eyebrow line and the nose-base line moves farther from the lens than the part between the nose-base line and the forehead base line; the former vertical distance becomes larger than in the frontal state and the latter smaller, so with the head raised the ratio exceeds 1:1. Conversely, when the facial angle changes from frontal to lowered, the part between the eyebrow line and the nose-base line is closer to the lens than the part between the nose-base line and the forehead base line; the former distance becomes smaller and the latter larger, so with the head lowered the ratio falls below 1:1.
Based on the above analysis, the invention can determine whether the facial angle of the image to be processed is lowered or raised by comparing the ratio of the vertical distance between the eyebrow line and the nose-base line to the vertical distance between the nose-base line and the forehead base line against a first threshold Y. Considering measurement error and individual differences in facial feature distribution, the first threshold Y is preferably a range of values rather than a single value, illustratively [0.8, 1.2].
Referring to fig. 3, in the facial features at the frontal angle, the horizontal distance from the left ear root to the outer corner of the left eye, the width of the left eye, the distance between the inner corners of the two eyes, the width of the right eye, and the distance from the outer corner of the right eye to the right ear root each account for one fifth of the overall horizontal extent of the facial image; that is, the ratio of these five horizontal distances is 1:1:1:1:1. Research shows that when the facial angle changes from frontal to a left or right side face, these five horizontal distances change to different degrees relative to the frontal posture, so whether the image shows a left or right side face can be judged from the change in the ratio of any two of the five distances.
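These pitch and yaw heuristics reduce to simple ratio tests. The sketch below assumes the relevant key points are already available as scalar coordinates; the range (0.8, 1.2) follows the illustrative first threshold Y above, and which side face foreshortens which fifth is an assumed sign convention:

```python
def classify_pitch(brow_y, nose_base_y, forehead_y, low=0.8, high=1.2):
    # ratio of brow-to-nose-base span over nose-base-to-forehead-base span
    ratio = abs(nose_base_y - brow_y) / max(abs(forehead_y - nose_base_y), 1e-6)
    if ratio > high:
        return "head raised"
    if ratio < low:
        return "head lowered"
    return "frontal"

def classify_yaw(left_ear_x, left_eye_outer_x, right_eye_outer_x, right_ear_x,
                 low=0.8, high=1.2):
    # compare two of the five horizontal fifths
    left_fifth = abs(left_eye_outer_x - left_ear_x)
    right_fifth = abs(right_ear_x - right_eye_outer_x)
    ratio = left_fifth / max(right_fifth, 1e-6)
    if ratio < low:
        return "left side face"   # assumed: rotation foreshortens the nearer fifth
    if ratio > high:
        return "right side face"
    return "frontal"
```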
After the multi-view facial image to be processed is obtained, the face positioning and normalization sub-module 102 positions feature points in it based on a mixed tree-structured feature-point model built on HOG (histogram of oriented gradients) features; after positioning, the facial region is accurately determined from the facial outer-contour feature points (i.e. the facial contour key points), and normalization of the facial image is completed by cropping and scaling the facial region. Finally, the facial image post-processing sub-module 103 performs post-processing such as noise reduction, light supplement and highlighting on the cropped facial image to obtain the facial image to be recognized.
It should be noted that the noise reduction, light supplement and highlighting post-processing of the facial image is not described here in detail; a person skilled in the art may refer to prior-art noise reduction, light supplement and highlighting algorithms, and the embodiment of the present invention is not limited in this respect.
Further, referring to fig. 6, the facial feature extraction module 200 specifically includes: the HOG feature extraction submodule 201 is configured to perform histogram equalization on the normalized facial image to be identified and extract HOG features; the LBP feature extraction submodule 202 is used for carrying out image blocking on the normalized facial image to be identified according to different blocking strategies, and extracting the LBP features of the blocked image by adopting a mixed LBP (local binary pattern) operator; and the parallax extraction sub-module 203 is configured to calculate a parallax image of the facial image to be recognized according to the depth images of the same scene respectively captured by the two depth cameras, and obtain parallax features of facial pixel points of the facial image to be recognized based on the parallax image.
For example, histogram equalization may be performed directly on the normalized facial image to be recognized by conventional methods and the HOG features extracted, which is not described in detail here. Illustratively, if the normalized facial image is 64 × 64 pixels, the extracted HOG feature has 1764 dimensions.
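As a hedged illustration, the OpenCV sketch below reproduces that dimensionality; the cell, block and stride parameters are assumptions, chosen because on a 64 × 64 window they yield 7 × 7 blocks × 4 cells × 9 bins = 1764 dimensions, matching the figure above:

```python
import cv2

# window 64x64, block 16x16, block stride 8x8, cell 8x8, 9 orientation bins
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def extract_hog(face_gray_64x64):
    equalized = cv2.equalizeHist(face_gray_64x64)  # histogram equalization first
    return hog.compute(equalized).ravel()          # 1764-dimensional HOG vector
```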
Specifically, the LBP feature extraction sub-module 202 may process the facial image to be recognized with a conventional LBP operator and count the values of each pixel of the resulting LBP image to obtain a histogram H0. It then blocks the normalized facial image according to a blocking strategy, processes each block with a uniform LBP operator, counts the corresponding histograms, and concatenates them in a fixed order to form H1; blocking the image more finely and processing it in the same way yields H2. The histograms H0, H1 and H2 are concatenated into a histogram h1. For example, the above process may use an LBP operator with 12 sampling points and a sampling radius of 2; replacing it with an LBP operator with 16 sampling points and a sampling radius of 4 and repeating the process yields a histogram h2. Combining h1 and h2 gives the feature vector h; illustratively, the resulting LBP feature vector has 2872 dimensions.
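A sketch of this multi-block, multi-operator scheme, assuming scikit-image's uniform LBP; the block grids (1 × 1, 4 × 4 and 8 × 8) stand in for the unspecified coarse and fine blocking strategies, so the output dimension will not be exactly the 2872 cited:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histograms(img, points, radius, grids=(1, 4, 8)):
    lbp = local_binary_pattern(img, points, radius, method="uniform")
    n_bins = points + 2                      # uniform patterns plus the "other" bin
    feats = []
    for g in grids:                          # whole image, coarse blocks, fine blocks
        for rows in np.array_split(np.arange(img.shape[0]), g):
            for cols in np.array_split(np.arange(img.shape[1]), g):
                block = lbp[np.ix_(rows, cols)]
                h, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
                feats.append(h / max(block.size, 1))   # normalized block histogram
    return np.concatenate(feats)

def extract_lbp(face_gray):
    h1 = lbp_histograms(face_gray, points=12, radius=2)  # first operator
    h2 = lbp_histograms(face_gray, points=16, radius=4)  # second operator
    return np.concatenate([h1, h2])                      # feature vector h
```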
For example, referring to fig. 4, the LBP feature extraction sub-module 202 divides the facial image to be recognized into four regions, a first to a fourth region, distributed clockwise with the first region directly above the fourth. The first region is the part of the left half of the face above the nose-base line, the second region is the part of the right half above the nose-base line, the third region is the part of the right half below the nose-base line, and the fourth region is the part of the left half below the nose-base line.
For facial images at different facial angles, the extraction windows adopted in the first to fourth regions when extracting LBP features differ. For example, if the facial angle is frontal, the pixel points of all four regions may use the same circular window. If the facial angle is lowered, the pixel points of the first and second regions use elliptical windows whose vertical axis is longer than the horizontal axis, and those of the third and fourth regions use elliptical windows whose vertical axis is shorter than the horizontal axis. This avoids large differences between LBP features extracted at the same key point caused by different local foreshortening of the facial image at different facial angles, improves the effectiveness of key-point-based LBP extraction at different facial angles, and thus further improves recognition accuracy.
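The patent does not spell out how an elliptical window is sampled; the sketch below is one assumption, taking LBP neighbors on an ellipse with separate horizontal and vertical semi-axes (rx, ry) and comparing bilinearly interpolated neighbor values against the center pixel (the caller must keep the whole window inside the image):

```python
import numpy as np

def elliptical_lbp_code(img, cy, cx, rx, ry, points=8):
    angles = 2 * np.pi * np.arange(points) / points
    code = 0
    for k, a in enumerate(angles):
        y = cy + ry * np.sin(a)              # vertical semi-axis ry
        x = cx + rx * np.cos(a)              # horizontal semi-axis rx
        y0, x0 = int(np.floor(y)), int(np.floor(x))
        dy, dx = y - y0, x - x0
        # bilinear interpolation of the neighbor sample
        v = (img[y0, x0] * (1 - dy) * (1 - dx) + img[y0, x0 + 1] * (1 - dy) * dx
             + img[y0 + 1, x0] * dy * (1 - dx) + img[y0 + 1, x0 + 1] * dy * dx)
        code |= int(v >= img[cy, cx]) << k   # threshold against the center pixel
    return code
```

A circular window is the special case rx = ry; for a lowered head, regions above the nose-base line would use ry > rx and regions below it ry < rx, per the text.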
Specifically, the parallax extraction sub-module 203 uses two depth cameras to photograph the same person's face, obtaining two different depth maps of the face. Taking one as the reference map and the other as the matching map, it computes the disparity map of the facial image to be recognized with a disparity calculation method, and derives the parallax features of the facial pixel points from that disparity map.
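The disparity calculation method itself is not named; as an assumption, the sketch below uses OpenCV's semi-global block matcher on a rectified reference/matching pair:

```python
import cv2

def disparity_map(reference_gray, matching_gray):
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
    # StereoSGBM returns fixed-point disparities scaled by 16
    disparity = matcher.compute(reference_gray, matching_gray).astype("float32") / 16.0
    return disparity   # one parallax value per facial pixel point
```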
Further, referring to fig. 7, the big data based face recognition module 300 specifically includes: the gender recognition model building sub-module 301 based on deep learning is used for training a gender recognition model based on deep learning for facial gender recognition according to the gender-labeled face sample; and a gender identification submodule 302, configured to perform gender identification on the facial image to be identified by using the gender identification model based on deep learning.
The big data-based face recognition module 300 first trains the deep-learning gender recognition model for facial gender recognition on the gender-labeled face samples; the features of the facial image to be recognized are then fed into the trained model, and the gender of the image is determined from the model's output parameters.
Specifically, the gender identification model building submodule 301 based on deep learning is specifically configured to: training a first deep learning gender identification model for facial gender identification by adopting all gender-labeled facial samples, wherein the output parameters of the first deep learning gender identification model comprise a first probability parameter for representing that the gender-labeled facial samples are males and a second probability parameter for representing that the gender-labeled facial samples are females, and the sum of the first probability parameter and the second probability parameter is 1; after the first deep learning gender identification model is obtained, retraining face samples of which the absolute value of the difference value between the first probability parameter and the second probability parameter is smaller than a preset threshold value are obtained from all face samples marked with the gender; and training a second deep learning gender identification model for facial gender identification by using all the retrained face samples, wherein the output parameters of the second deep learning gender identification model comprise a third probability parameter for representing that the retrained face samples are male and a fourth probability parameter for representing that the retrained face samples are female, and the sum of the third probability parameter and the fourth probability parameter is 1.
Specifically, the gender identification submodule 302 is configured to: first acquire, with the first deep learning gender recognition model, a first probability parameter that the facial image to be recognized is male and a second probability parameter that it is female, the two summing to 1; then judge whether the absolute value of their difference exceeds the preset threshold. If it does, the image is determined to be male when the first probability parameter exceeds the second by more than the threshold, and female when the second exceeds the first by more than the threshold. If the absolute difference is smaller than the preset threshold, the second deep learning gender recognition model is used to acquire a third probability parameter that the image is male and a fourth probability parameter that it is female, the two summing to 1; finally, the gender of the image is determined from the first, second, third and fourth probability parameters.
Specifically, one way to determine the gender from the four probability parameters, written with p1 to p4 denoting the first to fourth probability parameters, is: if (p1 + p3) - (p2 + p4) is greater than the preset threshold, the facial image to be recognized is determined to be male; if (p2 + p4) - (p1 + p3) is greater than the preset threshold, it is determined to be female.
Alternatively, the gender may be determined by weighted fusion: if w1·(p1 - p2) + w2·(p3 - p4) is greater than the preset threshold, where w1 and w2 are the first and second weight coefficients, the facial image to be recognized is determined to be male; if w1·(p2 - p1) + w2·(p4 - p3) is greater than the preset threshold, it is determined to be female. The sum of the first and second weight coefficients is 1, and the second weight coefficient is larger than the first.
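Putting the two stages together, the sketch below implements the weighted-fusion variant; the threshold 0.3 and the weights 0.4 / 0.6 are illustrative values consistent with the constraints above (w1 + w2 = 1, w2 > w1), and the fallback when neither rule fires is not specified by the text:

```python
def decide_gender(p1, p2, second_model, features, threshold=0.3, w1=0.4, w2=0.6):
    # p1, p2: first and second probability parameters from the first model
    if p1 - p2 > threshold:
        return "male"
    if p2 - p1 > threshold:
        return "female"
    p3, p4 = second_model(features)          # second deep-learning model
    score = w1 * (p1 - p2) + w2 * (p3 - p4)  # weighted fusion
    if score > threshold:
        return "male"
    if -score > threshold:
        return "female"
    return "undetermined"                    # behavior not specified by the text
```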
It should be noted that the second weight coefficient is set larger than the first because the inventors found, in the course of implementing the invention, that the second deep learning gender recognition model, trained on the retraining face samples, determines the gender of ambiguous faces significantly more accurately than the first model; weighting it more heavily therefore improves the accuracy of the fused decision based on the first to fourth probability parameters.
It should also be noted that the embodiment of the present invention does not specifically limit the size of the preset threshold; for example, it may be 0.3, 0.4 or 0.5, and may be set as needed or by the face recognition system itself.
In the face recognition system of the embodiment of the invention, the big data-based face recognition module 300 first trains the first deep learning gender recognition model with all gender-labeled face samples, then takes out the samples for which the absolute difference between the first and second probability parameters output by that model is below the preset threshold to form the retraining face samples, and trains the second deep learning gender recognition model on those samples alone. This focuses training on the ambiguous face samples, so that when the gender of a facial image is ambiguous (i.e. when the first and second probability parameters produced by the first model are close and the gender cannot be determined from the first model alone), the two jointly trained models cooperate to determine it. This raises the accuracy of gender determination, enables gender determination at various angles and levels of sharpness, and effectively reduces the influence of factors such as illumination conditions and facial posture on the gender determination of the facial image.
Further, referring to fig. 8, the convolutional neural network-based face recognition module 400 specifically includes: an image input sub-module 401, configured to input the histogram of oriented gradients (HOG) features and local binary pattern (LBP) features of the facial image to be processed; a convolutional neural network-based face recognition model construction sub-module 402, configured to train a convolutional neural network-based face recognition model on the identity-labeled face samples; and a face recognition sub-module 403, configured to achieve high-accuracy, multi-angle identity recognition of the facial image to be recognized by applying that model to the image's HOG and LBP features.
Specifically, the convolutional neural network-based face recognition model construction submodule 402 is configured to construct a convolutional neural network for face recognition in a server. The network has 9 layers; the number of input-layer neurons equals the pixel size of a face sample, and the remaining layers are set as follows: layers 1, 3, 5 and 7 are convolutional layers C1, C2, C3 and C4, composed of 4, 8 and 12 feature maps with 6 × 6 kernels respectively, each neuron being connected to a 6 × 6 neighborhood of its input; layers 2, 4 and 6 are downsampling layers S1, S2 and S3, each neuron of whose feature maps is connected to a 4 × 4 neighborhood of the corresponding feature map in layers 1, 3 and 5; layer 8 is a hidden layer, which arranges the feature values of the 12 feature maps of C4 into a column vector to form a feature vector on which the final classification is performed; layer 9 is the output layer, whose number of neurons is determined by the number of face identities to be distinguished and represents the total number of possible recognition results. The collected face images and their corresponding identities are input into the configured network and an output Op is obtained by layer-by-layer forward propagation; the difference between Op and the corresponding ideal output Yp is computed, and the weight matrices are adjusted by error minimization until a reasonable convolutional neural network-based face recognition model is obtained.
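A PyTorch sketch of this network follows. The text lists only three feature-map counts for the four convolutional layers, so the channel sequence 4, 8, 8, 12 is an assumption, as are the 128 × 128 input size and the stride-2 application of the 4 × 4 pooling neighborhoods:

```python
import torch
import torch.nn as nn

class FaceCNN(nn.Module):
    def __init__(self, num_identities):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 4, 6), nn.ReLU(),   # C1: 4 feature maps, 6x6 kernels
            nn.MaxPool2d(4, stride=2),       # S1: 4x4 neighborhoods
            nn.Conv2d(4, 8, 6), nn.ReLU(),   # C2
            nn.MaxPool2d(4, stride=2),       # S2
            nn.Conv2d(8, 8, 6), nn.ReLU(),   # C3 (channel count assumed)
            nn.MaxPool2d(4, stride=2),       # S3
            nn.Conv2d(8, 12, 6), nn.ReLU(),  # C4: 12 feature maps
            nn.Flatten(),                    # hidden layer: column feature vector
        )
        with torch.no_grad():                # infer the flattened feature size
            n = self.features(torch.zeros(1, 1, 128, 128)).shape[1]
        self.classifier = nn.Linear(n, num_identities)  # output layer

    def forward(self, x):
        return self.classifier(self.features(x))
```

Training would then proceed as described: forward-propagate a labeled face to obtain Op, compare it with the ideal output Yp, and adjust the weights by error minimization, e.g. stochastic gradient descent on a cross-entropy loss.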
The face recognition system provided by the embodiment of the invention first uses deep learning to determine the gender of the facial image to be recognized, then determines its identity with a convolutional neural network method combined with that gender. Combining deep learning with a convolutional neural network for identity determination of the facial image can effectively improve the accuracy of that determination.
Further, referring to fig. 9, the face recognition module 500 based on the binocular vision technology specifically includes: the color image construction sub-module 501 is configured to construct a color image with the same size as the disparity map of the facial image to be recognized according to the disparity value of each pixel point in the disparity map of the facial image to be recognized; the three primary color values of each pixel point of the color image are related to the parallax values of the corresponding pixel points in the parallax image of the facial image to be recognized; the three-dimensional distance calculation sub-module 502 is configured to divide the color image into a plurality of candidate regions, and determine three-dimensional space information of each candidate region by combining disparity values of pixel points corresponding to each candidate region in the disparity map; a facial organ selecting sub-module 503, configured to determine whether each candidate region is a facial organ region according to the three-dimensional spatial information of each candidate region and a preset three-dimensional spatial information threshold of a facial organ; and the face recognition sub-module 504 is configured to recognize the facial micro-expression of the facial image to be recognized according to the three-dimensional spatial information of the facial organ of the facial image to be recognized.
The disparity map of the facial image to be recognized, generated by the image processing engine, can be further converted by the embedded microprocessor into a color image of the same size as the disparity map. Specifically, the embedded microprocessor creates a color image with the same size as the disparity map, with each pixel in the disparity map corresponding one-to-one to a pixel of the color image, and fills the corresponding pixels of the color image with color according to the disparity value of each pixel in the disparity map.
It should be explained that, based on the computed disparity map, and given the distance B between the two cameras of the binocular camera and the focal length f of the camera lens, the depth of each pixel in the actual three-dimensional space, that is, the Z value, can be calculated by the formula Z = B × f / d, where d is the disparity value. Color filling of the corresponding pixels of the color image can then be carried out according to this depth information. For example, the RGB (three primary color) values of each pixel of the color image may be scaled according to the range of the depth values over all pixels, so that each RGB value falls between 0 and 255.
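As a minimal sketch of the depth computation and color filling just described, assuming the disparity map arrives as a NumPy array and choosing a simple grayscale-style RGB mapping (the text leaves the exact mapping open):

```python
import numpy as np

def disparity_to_color(disparity: np.ndarray, B: float, f: float) -> np.ndarray:
    """Fill a color image of the same size as the disparity map.

    B is the baseline between the two cameras and f the lens focal
    length (in pixels); mapping all three RGB channels to the same
    scaled depth value is one possible choice, not one prescribed
    by the text.
    """
    d = np.where(disparity > 0, disparity, np.nan)  # guard against d = 0
    depth = B * f / d                               # Z = B * f / d per pixel
    # Scale depth into 0..255 over its observed range.
    zmin, zmax = np.nanmin(depth), np.nanmax(depth)
    gray = np.nan_to_num((depth - zmin) / (zmax - zmin) * 255).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)    # identical RGB channels
```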
The three-dimensional distance calculation sub-module 502 specifically includes: the image segmentation unit is used for dividing the color image into a plurality of areas according to the different colors of different areas of the color image; the coordinate calculation unit is used for calculating the three-dimensional space coordinate of each pixel point according to the parallax value of each pixel point in the parallax map of the facial image to be recognized; and the region calculation unit is used for determining the size and the position of each candidate region according to the three-dimensional space coordinates of the pixel points contained in each candidate region.
It should be noted that the color image may be subjected to pixel-level segmentation, and the segmented image may be divided into several candidate regions. For each divided candidate region, the three-dimensional space information of each candidate region can be determined according to the disparity value of the pixel point of the corresponding position of each candidate region in the disparity map. The three-dimensional space information includes information such as length, width, height, position and the like corresponding to each candidate region.
Specifically, regions having the same color may be segmented, so that the nose, eyes, mouth, ears, hair and so on each fall into a different region. Then, according to the disparity value of each pixel in the disparity map, the three-dimensional space coordinate of each pixel is calculated using the following formulas: Z = B × f / d, X = (u − W/2) × B/d − B/2, and Y = H′ − (v − H/2) × B/d, where (X, Y, Z) are the three-dimensional coordinate values in the world coordinate system, B is the distance between the two cameras of the binocular camera, f is the camera lens focal length, d is the disparity value, H′ is the height of the two cameras above the ground, (W, H) is the disparity map size (for example 1280 × 960), and (u, v) are the pixel coordinates in the image coordinate system (for example the pixel (100, 100)). Since B, f, d, H′, (W, H) and (u, v) are all known quantities, the three-dimensional coordinate values of the pixels of each divided candidate region can be calculated by the above formulas. After the coordinates (X, Y, Z) of each pixel of each candidate region are calculated, the length, width and height of each region can be obtained directly from the differences of the coordinate values, and the spatial position of each region can be determined from the three-dimensional coordinates of all the pixels in the region.
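A sketch of the coordinate formulas and the region size/position computation follows, assuming the region's pixels are supplied as NumPy arrays; using the centroid as the region's spatial position is an illustrative choice, since the text only requires a position derived from all pixel coordinates.

```python
import numpy as np

def region_geometry(us, vs, ds, B, f, H_cam, W, H):
    """Three-dimensional size and position of one candidate region.

    us, vs, ds are arrays of pixel coordinates (u, v) and disparity d
    for the region's pixels; B, f, H_cam (camera height H'), and the
    disparity-map size (W, H) follow the formulas above.
    """
    Z = B * f / ds
    X = (us - W / 2) * B / ds - B / 2
    Y = H_cam - (vs - H / 2) * B / ds
    # Length / width / height from coordinate-value differences; the
    # centroid serves as the region's spatial position.
    size = (X.max() - X.min(), Y.max() - Y.min(), Z.max() - Z.min())
    position = (X.mean(), Y.mean(), Z.mean())
    return size, position
```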
The face recognition module 500 based on the binocular vision technology provided by the embodiment of the invention can obtain the disparity map of the facial image to be recognized, and from the disparity map acquires a color image representing the position and depth of each pixel of the facial image. Based on the disparity features in the disparity map and on the color image, candidate regions are extracted, and the facial organ type of each candidate region is determined according to the three-dimensional space information of each candidate region and a preset three-dimensional space information threshold of a facial organ. Finally, based on the positions of, and the disparity information between, the different facial organs, the facial micro-expression of the facial image to be recognized is determined. For example, from the position of the corners of the mouth and the shape of the mouth, the expression of the facial image can be judged as happy or sad: raised mouth corners with a slightly open mouth indicate smiling, while downturned mouth corners and furrowed eyebrows indicate sadness.
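As an illustration only, a rule of the kind just described might look as follows; all input names and thresholds are placeholders rather than quantities defined in this description.

```python
def classify_micro_expression(left_corner, right_corner, mouth_center,
                              mouth_opening, brow_furrow):
    """Rule-of-thumb expression test from 3D organ positions.

    Inputs are illustrative: (X, Y, Z) triples for the mouth corners
    and mouth center, a mouth-opening height, and a brow-furrow
    measure; the thresholds are placeholders, not patent values.
    """
    # Corners above the mouth center (larger Y in the world frame,
    # i.e. higher above ground) suggest raised mouth corners.
    corners_raised = (left_corner[1] + right_corner[1]) / 2 > mouth_center[1]
    if corners_raised and mouth_opening > 0.002:   # slightly open mouth
        return "happy"
    if not corners_raised and brow_furrow > 0.5:   # furrowed eyebrows
        return "sad"
    return "neutral"
```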
The face recognition module 500 based on the binocular vision technology provided by the embodiment of the invention uses binocular vision, combining the disparity map with a color image that represents the position and depth of each pixel of the facial image to be recognized, to locate the different facial organs, and then recognizes the facial micro-expression of the facial image from the positions of, and the disparity information between, those organs. This micro-expression information can serve as face recognition information for intelligent access control and intelligent encrypted face recognition, improving the safety and reliability of face-recognition-based smart homes and further improving the accuracy of face recognition.
The face recognition system provided by the embodiment of the invention extracts the histogram of oriented gradients (HOG) feature, the local binary pattern (LBP) feature, and the disparity feature of the face pixels of the facial image to be recognized. It determines the gender of the facial image by a deep learning method, then determines the face identity with high accuracy by a convolutional neural network combined with the recognized gender, and finally determines the facial micro-expression by a binocular vision technique based on the disparity features of the facial image. Face recognition combining the gender information, identity information and expression information of the facial image to be recognized is thus realized. By combining the deep learning method, the convolutional neural network, and the binocular vision technology, the influence of factors such as illumination conditions and facial posture on the face recognition rate is reduced, the amount of data computation in the recognition process is decreased, rapid and high-accuracy face recognition is realized, the cost of the recognition process is reduced, and the accuracy of face recognition under different facial postures is improved.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (9)

1. A facial recognition system, characterized in that the facial recognition system comprises:
the facial image processing module is used for intercepting a facial image after determining a facial area in the image to be processed based on a facial feature point positioning algorithm, and performing noise reduction, light supplement, highlighting and normalization processing on the intercepted facial image to be recognized;
the face feature extraction module is used for extracting face features of a face image to be recognized based on key points of the face image, wherein the face features comprise histogram of oriented gradients (HOG) features, local binary pattern (LBP) features and parallax features of face pixel points;
a big data-based face recognition module for recognizing gender of the facial image to be recognized based on big data according to the histogram of oriented gradients HOG feature and the local binary pattern LBP feature;
the face identification module based on the convolutional neural network is used for realizing high-accuracy face identity identification of the face image to be identified in multiple angles by adopting a trained convolutional neural network model according to the HOG feature of the histogram of directional gradients and the LBP feature of the local binary pattern;
the face recognition module based on a binocular vision technology is used for acquiring a special facial expression detection image of the facial image to be recognized according to the parallax features of the facial pixel points, and recognizing the facial micro-expression of the facial image to be recognized based on the special facial expression detection image; the binocular vision technology-based face recognition module specifically comprises:
the color image construction sub-module is used for constructing a color image with the same size as the parallax image of the face image according to the parallax value of each pixel point in the parallax image of the face image; the three primary color values of each pixel point of the color image are related to the parallax values of the corresponding pixel points in the parallax image of the face image;
the three-dimensional distance calculation sub-module is used for dividing the color image into a plurality of candidate regions and determining three-dimensional space information of each candidate region by combining the parallax value of the pixel point corresponding to each candidate region in the parallax image;
the facial organ selection submodule is used for determining whether each candidate region is a facial organ region according to the three-dimensional space information of each candidate region and a preset three-dimensional space information threshold value of a facial organ;
and the face recognition submodule is used for recognizing the identity information of the face image according to the three-dimensional space information of the facial organ of the face image.
2. The facial recognition system of claim 1, wherein the facial image processing module specifically comprises:
the multi-view-angle-based face region detection submodule is used for carrying out graying on an input color image, carrying out histogram equalization, respectively carrying out face detection by using a front face detector, a left side face detector and a right side face detector, removing a face detection result with the area smaller than a preset value, and obtaining a multi-view-angle to-be-processed face image;
the face positioning and normalization processing submodule is used for positioning feature points in the obtained multi-view face image to be processed based on a mixed tree-shaped structure feature point model of HOG (histogram of oriented gradient) features, accurately determining a face area according to face outline feature points after the feature points are positioned, and finishing the normalization of the face image to be processed by cutting and scaling the face area image;
and the post-processing sub-module of the facial image is used for carrying out noise reduction, light supplement and highlighting processing on the intercepted facial image to be processed to obtain the facial image to be recognized.
3. The facial recognition system of claim 1, wherein the facial feature extraction module specifically comprises:
the HOG feature extraction submodule is used for carrying out histogram equalization on the facial image to be identified after normalization processing and extracting HOG features;
the LBP characteristic extraction submodule is used for carrying out image blocking on the normalized facial image to be identified according to different blocking strategies and extracting the LBP characteristic of the blocked image by adopting a mixed LBP (local binary pattern) operator;
and the parallax extraction submodule is used for calculating the parallax image of the facial image to be recognized according to the depth images of the same scene respectively shot by the two depth cameras, and acquiring the parallax features of the facial pixel points of the facial image to be recognized based on the parallax image.
4. The facial recognition system of claim 1 wherein the big-data based facial recognition module specifically comprises:
the gender recognition model construction sub-module based on deep learning is used for training a gender recognition model based on deep learning for facial gender recognition according to the face sample marked with gender;
and the gender identification submodule is used for carrying out gender identification on the facial image to be identified by adopting the gender identification model based on deep learning.
5. The face recognition system of claim 4, wherein the deep learning based gender recognition model construction sub-module is specifically configured to:
training a first deep-learning gender identification model for facial gender identification by using all gender-labeled facial samples, wherein output parameters of the first deep-learning gender identification model comprise a first probability parameter for representing that the gender-labeled facial samples are males and a second probability parameter for representing that the gender-labeled facial samples are females, and the sum of the first probability parameter and the second probability parameter is 1;
obtaining a retraining face sample of which the absolute value of the difference value between the first probability parameter and the second probability parameter is smaller than a preset threshold value from all the gender-marked face samples;
and training a second deep learning gender identification model for identifying the gender of the face by adopting the retraining face sample, wherein the output parameters of the second deep learning gender identification model comprise a third probability parameter for representing that the retraining face sample is male and a fourth probability parameter for representing that the retraining face sample is female, and the sum of the third probability parameter and the fourth probability parameter is 1.
6. The face recognition system of claim 5, wherein the gender identification sub-module is specifically configured to:
acquiring a first probability parameter that the facial image to be recognized is male and a second probability parameter that the facial image to be recognized is female by using the first deep learning gender recognition model, wherein the sum of the first probability parameter and the second probability parameter is 1;
if the difference value between the first probability parameter and the second probability parameter is larger than a preset threshold value, determining that the facial image to be identified is a male;
if the difference value between the second probability parameter and the first probability parameter is larger than a preset threshold value, determining that the facial image to be identified is female;
if the absolute value of the difference value between the first probability parameter and the second probability parameter is smaller than a preset threshold value, acquiring a third probability parameter that the facial image to be recognized is a male and a fourth probability parameter that the facial image to be recognized is a female by using the second deep learning gender recognition model, wherein the sum of the third probability parameter and the fourth probability parameter is 1;
determining the gender of the facial image to be recognized according to the first probability parameter, the second probability parameter, the third probability parameter and the fourth probability parameter.
7. The facial recognition system of claim 1 wherein the convolutional neural network-based facial recognition module specifically comprises:
an image input sub-module, configured to input the histogram of oriented gradients HOG feature and the local binary pattern LBP feature of the facial image to be processed;
the face recognition model construction submodule based on the convolutional neural network is used for training a face recognition model based on the convolutional neural network for face recognition according to the marked face sample;
and the face recognition submodule is used for realizing high-accuracy face recognition of the face image at multiple angles by adopting the face recognition model based on the convolutional neural network according to the HOG feature of the direction gradient histogram and the LBP feature of the local binary pattern.
8. The facial recognition system of claim 7 wherein the convolutional neural network-based facial recognition model construction sub-module is specifically configured to:
constructing a deep convolutional neural network for face recognition in a server, wherein the deep convolutional neural network is divided into 9 layers, the number of input layer neurons is the pixel size of a face sample, and the parameters of the remaining layers are set as follows: layers 1, 3, 5 and 7 are convolutional layers C1, C2, C3 and C4, composed of 4, 8 and 12 feature maps of size 6 × 6 respectively, each neuron being connected to a 6 × 6 neighborhood of its input; layers 2, 4 and 6 are downsampling layers S1, S2 and S3, each neuron in their feature maps being connected to a 4 × 4 neighborhood of the corresponding feature map in layers 1, 3 and 5; layer 8 is a hidden layer, in which the feature values in the 12 feature maps of C4 are arranged into a column vector to form a feature vector, and the one-dimensional features are finally classified and identified; layer 9 is the output layer, the number of whose neurons is determined by the number of face identities to be determined and represents the total number of possible recognition results;
inputting the collected face images and the corresponding face identities into the configured deep convolutional neural network, and obtaining an output Op through layer-by-layer forward propagation;
and calculating the difference between the output Op and the corresponding ideal output Yp, and adjusting the weight matrices by error minimization until a satisfactory convolutional neural network face recognition model is obtained.
9. The facial recognition system of claim 8 wherein the three-dimensional distance computation submodule specifically comprises:
the image segmentation unit is used for dividing the color image into a plurality of areas according to different colors of different areas of the color image;
the coordinate calculation unit is used for calculating the three-dimensional space coordinate of each pixel point according to the parallax value of each pixel point in the parallax map of the face image;
and the region calculation unit is used for determining the size and the position of each candidate region according to the three-dimensional space coordinates of the pixel points contained in each candidate region.
CN201710667710.3A 2017-08-07 2017-08-07 Face recognition system Active CN107403168B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710667710.3A CN107403168B (en) 2017-08-07 2017-08-07 Face recognition system

Publications (2)

Publication Number Publication Date
CN107403168A CN107403168A (en) 2017-11-28
CN107403168B true CN107403168B (en) 2020-08-11

Family

ID=60402054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710667710.3A Active CN107403168B (en) 2017-08-07 2017-08-07 Face recognition system

Country Status (1)

Country Link
CN (1) CN107403168B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107978185B (en) * 2017-12-07 2019-12-03 南京白下高新技术产业园区投资发展有限责任公司 A kind of good children learning machine of teaching efficiency
CN108062787B (en) * 2017-12-13 2022-02-11 北京小米移动软件有限公司 Three-dimensional face modeling method and device
CN108363990A (en) * 2018-03-14 2018-08-03 广州影子控股股份有限公司 One boar face identifying system and method
CN108491794B (en) 2018-03-22 2023-04-07 腾讯科技(深圳)有限公司 Face recognition method and device
CN108921942B (en) * 2018-07-11 2022-08-02 北京聚力维度科技有限公司 Method and device for 2D (two-dimensional) conversion of image into 3D (three-dimensional)
CN112911393B (en) * 2018-07-24 2023-08-01 广州虎牙信息科技有限公司 Method, device, terminal and storage medium for identifying part
TWI716938B (en) * 2018-08-10 2021-01-21 宏達國際電子股份有限公司 Facial expression modeling method, apparatus and non-transitory computer readable medium of the same
CN109600546A (en) * 2018-11-26 2019-04-09 维沃移动通信(杭州)有限公司 A kind of image-recognizing method and mobile terminal
CN109492611A (en) * 2018-11-27 2019-03-19 电卫士智能电器(北京)有限公司 Electrical Safety method for early warning and device
CN111444744A (en) 2018-12-29 2020-07-24 北京市商汤科技开发有限公司 Living body detection method, living body detection device, and storage medium
CN110223338A (en) * 2019-06-11 2019-09-10 中科创达(重庆)汽车科技有限公司 Depth information calculation method, device and electronic equipment based on image zooming-out
CN112365586B (en) * 2020-11-25 2023-07-18 厦门瑞为信息技术有限公司 3D face modeling and stereo judging method and binocular 3D face modeling and stereo judging method of embedded platform
CN113077265B (en) * 2020-12-08 2021-11-30 鑫绪(上海)信息技术服务有限公司 Live client credit management system
CN113837976B (en) * 2021-09-17 2024-03-19 重庆邮电大学 Multi-focus image fusion method based on joint multi-domain
WO2023159350A1 (en) * 2022-02-22 2023-08-31 Liu Kin Wing Recognition system detecting facial features
CN114973727B (en) * 2022-08-02 2022-09-30 成都工业职业技术学院 Intelligent driving method based on passenger characteristics
CN116386120B (en) * 2023-05-24 2023-08-18 杭州企智互联科技有限公司 A noninductive control management system for wisdom campus dormitory

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015161816A1 (en) * 2014-04-25 2015-10-29 Tencent Technology (Shenzhen) Company Limited Three-dimensional facial recognition method and system
CN104268539A (en) * 2014-10-17 2015-01-07 中国科学技术大学 High-performance human face recognition method and system
CN104915656A (en) * 2015-06-12 2015-09-16 东北大学 Quick human face recognition method based on binocular vision measurement technology
CN105069400A (en) * 2015-07-16 2015-11-18 北京工业大学 Face image gender recognition system based on stack type sparse self-coding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Fatigue detection with 3D facial features based on binocular stereo vision; Zhang, Ling et al.; Integrated Computer-Aided Engineering; Dec. 2014; full text *
Facial expression deformation measurement based on the three-dimensional digital image correlation method; Zhao Mingzhu et al.; Journal of Experimental Mechanics (《实验力学》); Apr. 2017; vol. 32, no. 2, pp. 152-162 *
Research on face recognition methods based on deep convolutional network algorithms; Long Haiqiang et al.; Computer Simulation (《计算机仿真》); Jan. 2017; vol. 34, no. 1, pp. 322-325, 371 *

Also Published As

Publication number Publication date
CN107403168A (en) 2017-11-28

Similar Documents

Publication Publication Date Title
CN107403168B (en) Face recognition system
CN109558764B (en) Face recognition method and device and computer equipment
González et al. On-board object detection: Multicue, multimodal, and multiview random forest of local experts
Soltany et al. Fast and accurate pupil positioning algorithm using circular Hough transform and gray projection
CN103530599B (en) The detection method and system of a kind of real human face and picture face
CN104008370B (en) A kind of video face identification method
CN109033940B (en) A kind of image-recognizing method, calculates equipment and storage medium at device
CN104915656B (en) A kind of fast human face recognition based on Binocular vision photogrammetry technology
CN103049758B (en) Merge the remote auth method of gait light stream figure and head shoulder mean shape
CN107330371A (en) Acquisition methods, device and the storage device of the countenance of 3D facial models
CN104268539A (en) High-performance human face recognition method and system
CN103870808A (en) Finger vein identification method
WO2009123354A1 (en) Method, apparatus, and program for detecting object
CN109859305A (en) Three-dimensional face modeling, recognition methods and device based on multi-angle two-dimension human face
CN104573634A (en) Three-dimensional face recognition method
CN111178208A (en) Pedestrian detection method, device and medium based on deep learning
CN109948467A (en) Method, apparatus, computer equipment and the storage medium of recognition of face
CN111476222B (en) Image processing method, image processing device, computer equipment and computer readable storage medium
EP3905104B1 (en) Living body detection method and device
CN113963032A (en) Twin network structure target tracking method fusing target re-identification
CN106570447A (en) Face photo sunglass automatic removing method based on gray histogram matching
CN109993089A (en) A kind of video object removal and background recovery method based on deep learning
CN113128428B (en) Depth map prediction-based in vivo detection method and related equipment
Bastias et al. A method for 3D iris reconstruction from multiple 2D near-infrared images
CN110348344A (en) A method of the special facial expression recognition based on two and three dimensions fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant