WO2015037973A1 - A face identification method

A face identification method

Info

Publication number
WO2015037973A1
WO2015037973A1 (PCT/MY2013/000167)
Authority
WO
WIPO (PCT)
Prior art keywords
face
identifying
digital image
image
template
Prior art date
Application number
PCT/MY2013/000167
Other languages
French (fr)
Inventor
Yong Thye LIM
Original Assignee
Data Calibre Sdn Bhd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Data Calibre Sdn Bhd
Publication of WO2015037973A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 — Human faces, e.g. facial parts, sketches or expressions

Abstract

The present invention relates to a method of identifying a face in a digital image (100), characterised by the steps of: detecting a face in a scene of an image; extracting at least one template from the detected face of the image; training the extracted template using an ensemble of classifiers for classifying the face into two classes of face and non-face template; converting the face template to a Greyscale image or Hue Saturation Intensity (HSI) image; resizing the converted face template to a consistent dimension; adjusting the resized face template for consistent brightness and contrast; extracting facial feature from the face template for discrimination; generating face descriptors using a transform matrix corresponding to the respective facial feature; and verifying the input facial feature by comparing generated face descriptors with registered face descriptors using a fuser between multiple classifiers for face identification; wherein the multiple classifiers are fused using an ordinal structure fuzzy module technique; whereby votes for each class are counted over the input classifiers and an output is chosen using the ordinal structure fuzzy module technique.

Description

A FACE IDENTIFICATION METHOD
Background of the Invention
Field of the Invention
This invention relates to a face identification method, and more particularly to a method of identifying a face in a digital image by finding new features in the extraction process and making a decision among multiple outputs by using a fuser.
Description of Related Arts
Augmented reality (AR) is a live view of a physical, real-world environment whose elements are augmented by computer-generated sensory input. AR technology has advanced to the point where information about a user's surroundings becomes interactive and digitally manipulable. Along with the development of AR technology, face recognition via AR has been widely researched for security enhancement, as well as for socialisation.
Face recognition means identifying a person from a digital image or a video frame from a video source by extracting selected facial features from the image and comparing them with templates. There are a number of known algorithms and techniques applied in systems for said face recognition. One example has been disclosed in U.S. Patent Application Publication No. 2003/0215115 A1, wherein a component-based linear discriminant analysis (LDA) face descriptor is used for recognizing a face. The LDA-based face recognition is conducted by classifying poses into several classes, grouping facial images based on the several pose classes, and compensating for changes in a target image due to the pose change.
U.S. Patent Application Publication No. 2012/0148160 A1 is another example of face recognition, wherein a cascaded classifier and a tailored strong classifier are run to detect different types of facial landmarks and to determine one or more respective locations of the facial landmarks. The cascaded classifier is performed using a multi-staged AdaBoost classifier, and the strong classifier is a support vector machine (SVM) classifier with input features processed by a principal component analysis (PCA) of the landmark subimage.
Despite the multiple approaches implemented by the existing systems, many issues remain unaddressed. Issues such as the illumination problem, the pose problem, scale variability, images taken years apart, glasses, moustaches, beards, low-quality image acquisition, partially occluded faces, et cetera, are prominent issues that remain unsolved. Therefore, there is a need for a system that can tackle the said unaddressed issues.
The existing face verifying systems mainly use OpenCV for face detection prior to face verification. However, said technique has some drawbacks, such as a low detection rate for portrait images and high rates of false positive and false negative detections, which affect the accuracy of face detection. In addition, the conventional face recognition algorithms require a plurality of sheets containing query images for face identification. Face detection may not be able to be performed when only one sheet or a few sheets containing query images are given.
Accordingly, it can be seen from the prior art that there exists a need to provide a method of identifying a face in a digital image which solves the existing problems mentioned above.
Summary of Invention
It is an objective of the present invention to provide a method of identifying a face in a digital image.
It is also an objective of the present invention to provide a method of identifying a face in a digital image that achieves a low False Acceptance Rate (FAR) while maintaining the False Rejection Rate (FRR) at a reasonable level.

It is yet another objective of the present invention to provide a method of identifying a face in a digital image which combines an ensemble of classifiers before merging the classifiers into a final strong classifier through voting.

It is a further objective of the present invention to provide a method of identifying a face in a digital image that maintains a low failed enrolment rate by applying fusion techniques.
Accordingly, these objectives may be achieved by following the teachings of the present invention. The present invention relates to a method of identifying a face in a digital image, characterised by the steps of: detecting a face in a scene of an image; extracting at least one template from the detected face of the image; training the extracted template using an ensemble of classifiers for classifying the face into two classes of face and non-face template; converting the face template to a Greyscale image or Hue Saturation Intensity (HSI) image; resizing the converted face template to a consistent dimension; adjusting the resized face template for consistent brightness and contrast; extracting facial feature from the face template for discrimination; generating face descriptors using a transform matrix corresponding to the respective facial feature; and verifying the input facial feature by comparing generated face descriptors with registered face descriptors using a fuser between multiple classifiers for face identification; wherein the multiple classifiers are fused using an ordinal structure fuzzy module technique; whereby votes for each class are counted over the input classifiers and an output is chosen using the ordinal structure fuzzy module technique.
Brief Description of the Drawings
The features of the invention will be more readily understood and appreciated from the following detailed description when read in conjunction with the accompanying drawings of the preferred embodiment of the present invention, in which:
Fig. 1 is a flow chart of a method of identifying a face in a digital image in accordance with the present invention.
Detailed Description of the Invention
As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting but merely as a basis for claims. It should be understood that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents and alternatives falling within the scope of the present invention as defined by the appended claims. As used throughout this application, the word "may" is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words "include," "including," and "includes" mean including, but not limited to. Further, the words "a" or "an" mean "at least one" and the word "plurality" means one or more, unless otherwise mentioned. Where the abbreviations or technical terms are used, these indicate the commonly accepted meanings as known in the technical field. For ease of reference, common reference numerals will be used throughout the figures when referring to the same or similar features common to the figures. The present invention will now be described with reference to Fig. 1.
The present invention provides a method of identifying a face in a digital image for solving pose and illumination issues, wherein a fusion of multiple classifiers is used to enhance the overall performance of face recognition. The method of identifying a face in a digital image is characterised by the steps of:
detecting a face in a scene of an image;
extracting at least one template from the detected face of the image;
training the extracted template using an ensemble of classifiers for classifying the face into two classes of face and non-face template;
converting the face template to a Greyscale image or Hue Saturation Intensity (HSI) image;
resizing the converted face template to a consistent dimension;
adjusting the resized face template for consistent brightness and contrast;
extracting facial feature from the face template for discrimination;
generating face descriptors using a transform matrix corresponding to the respective facial feature; and
verifying the input facial feature by comparing generated face descriptors with registered face descriptors using a fuser between multiple classifiers for face recognition;
wherein the multiple classifiers are fused using an ordinal structure fuzzy module technique;
whereby votes for each class are counted over the input classifiers and an output is chosen using the ordinal structure fuzzy module technique.
In a preferred embodiment of the method of identifying a face in a digital image, the template extracted from the detected face of the image includes skin color to decrease the false acceptance rate (FAR) and false rejection rate (FRR).
In a preferred embodiment of the method of identifying a face in a digital image, the ensemble of classifiers comprises an Adaptive Boosting (AdaBoost) classifier and a Haar cascade classifier using an Adaptive Skin Color Filter to decrease false detection.
In a preferred embodiment of the method of identifying a face in a digital image, the Haar cascade classifier is trained using Open Source Computer Vision (OpenCV).
In a preferred embodiment of the method of identifying a face in a digital image, the Adaptive Skin Color Filter uses a Normalized Lookup Table for skin tone percentage calculation.

In a preferred embodiment of the method of identifying a face in a digital image, the Adaptive Skin Color Filter is categorised into methods based on a non-linear skin distribution model by the Normalized Lookup Table.

In a preferred embodiment of the method of identifying a face in a digital image, the ensemble of classifiers are weak classifiers modelled by simple histograms and are combined into a strong classifier in a cascade.
In a preferred embodiment of the method of identifying a face in a digital image, the weighted vote of weak classifiers determines the strong classifier.
In a preferred embodiment of the method of identifying a face in a digital image, the converted face template is resized using sizing filter technique.

In a preferred embodiment of the method of identifying a face in a digital image, the resized face template is adjusted using Histogram Equalization.
In a preferred embodiment of the method of identifying a face in a digital image, the converted face template is subjected to further processing comprising color saturation filtering, noisy face filtering, and face dimensional filtering.
In a preferred embodiment of the method of identifying a face in a digital image, the face descriptors are generated using the Eigenface OpenCV algorithm to represent a face as a linear combination of a set of basis images.
In a preferred embodiment of the method of identifying a face in a digital image, the face descriptors are generated using Bayesian Principal Component Analysis (PCA) to transform a set of process variables by rotating axes of representation.

In a preferred embodiment of the method of identifying a face in a digital image, the variables include face distance, appearance variations, dimension reduction, hyperplane, and properties of a signal.

In a preferred embodiment of the method of identifying a face in a digital image, the face descriptors are generated using Client Specific Linear Discriminant Analysis (CSLDA), which employs client specific projections for class separation.
In a preferred embodiment of the method of identifying a face in a digital image, the face descriptors are generated using Support Vector Machine (SVM) to predict classification of each input facial feature.

In a preferred embodiment of the method of identifying a face in a digital image, the face descriptors are generated using Hidden Markov Model (HMM) to characterise statistical properties of the input facial feature.
A face identification system mainly comprises a face detection unit for detecting a face from a digital image, a feature extraction unit that generates face descriptors corresponding to respective facial feature images for face recognition, a registered face descriptor database that stores registered face descriptors, and a decision unit that verifies the input facial feature by comparing generated face descriptors with the registered face descriptors. Besides the hardware system, a series of instructions is needed in order to operate the system to successfully identify a face from an image.
In accordance with the present invention, the method for face identification adopts a fusion technique, wherein an ensemble of classifiers is used to arrive at a final strong classifier. It is a complicated engineering process and it is difficult to imitate because every possible variable has to be considered. The variables include face distance, appearance variations, dimension reduction, hyperplane, and the properties of a signal, which could affect successful matching. The overall flow chart of the present invention is shown in Fig. 1.
According to the present invention, the steps for identifying a face in a digital image include detecting a face in a scene of an image. The location and size of a face in the digital image are determined in real time. Non-face content such as the background and bodies is undesired information and is removed, preferably by cropping the noisy background and by rotating and tilting the image. Said detection may be performed on a digital image or even a video frame from a video source. The detection of the present invention gives a roll tolerance of preferably up to 180 degrees and a yaw tolerance of preferably up to 45 degrees of a face over an augmented reality.
Then, at least one template is extracted from the detected face of the image. The extracted template includes skin color to decrease the false acceptance rate (FAR) and false rejection rate (FRR). Next, the extracted template is trained using an ensemble of classifiers for classifying the face into the two classes of face and non-face template. In accordance with a preferred embodiment, the AdaBoost learning algorithm and abstracted Haar-like features of faces are adopted to train a classifier using the OpenCV library, which examines each image location and classifies it as "face" or "non-face". OpenCV comes with several different classifiers for frontal face detection, as well as for profile faces in side view, eye detection, nose detection, mouth detection, whole-body detection, et cetera. In the preferred embodiment, the Haar Cascade classifier of the OpenCV library is used as the face detector for frontal face detection.
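As a concrete illustration of this detection step, the sketch below drives OpenCV's stock frontal-face Haar cascade; the cascade file and the parameter values are illustrative choices, not values prescribed by the invention:

```python
import cv2

# Load OpenCV's bundled Haar cascade for frontal faces (the stock XML ships
# with standard OpenCV installations; path and parameters are illustrative).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(image_bgr):
    """Return (x, y, w, h) boxes for faces found in a BGR image."""
    grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    grey = cv2.equalizeHist(grey)  # normalise brightness before detection
    return cascade.detectMultiScale(
        grey,
        scaleFactor=1.1,   # image-pyramid step between scales
        minNeighbors=5,    # overlapping hits a box needs to survive
        minSize=(30, 30))  # ignore detections smaller than 30x30 pixels
```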
As the AdaBoost algorithm may have some limitations in detecting portrait images, a white margin may be introduced to the image prior to detecting a face in a scene of the image. This enhances the performance of face detection. During detection of the face in an image, inner changes of the face have to be considered. Therefore, an Adaptive Skin Color Filter is adopted to calculate the percentage of skin tone over the whole image. According to a preferred embodiment, the Adaptive Skin Color Filter is categorised into methods based on a non-linear skin distribution model by a Normalized Lookup Table for specifying the skin tone.
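A minimal sketch of the two pre-checks described in this paragraph, assuming a 10% white border and a crude HSV skin range in place of the patent's (undisclosed) Normalized Lookup Table:

```python
import cv2
import numpy as np

def add_white_margin(image_bgr, frac=0.1):
    """Pad the image with a white border (the 10% width is an assumption)."""
    h, w = image_bgr.shape[:2]
    m = int(frac * max(h, w))
    return cv2.copyMakeBorder(image_bgr, m, m, m, m,
                              cv2.BORDER_CONSTANT, value=(255, 255, 255))

def skin_tone_percentage(image_bgr):
    """Fraction of pixels whose HSV values fall in a crude skin range.
    The range below is a common heuristic, not the patent's lookup table."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))
    return float(np.count_nonzero(mask)) / mask.size
```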
Changes of external conditions such as illumination or the impact of lighting are considered in the present invention. Histogram Equalization is applied over the image to equalize the distribution of brightness, thereby increasing the contrast of the image and its dynamic range. Besides that, a sizing filter technique is applied in the present invention for imaging conditions of an image, such as the focal length of the camera equipment and the imaging distance, which affect the dimension of an image. Furthermore, the Greyscale color space method and HSI are applied to overcome the sensitivity of color to illumination changes, thereby increasing the tolerance level to intensity changes in the image and separating intensity from chromaticity, whereby only the chromaticity part will be used further. The Greyscale color space method converts the color of an image to greyscale, particularly for a face that is not detected in the 24-bit color image. The converted image is then sent for detection again using the OpenCV library.
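A sketch of this normalisation chain, with an assumed 100x100 target size; OpenCV offers no direct HSI conversion, so HSV stands in for the HSI space named in the text:

```python
import cv2

def normalise_face(face_bgr, size=(100, 100)):
    """Grey conversion, resizing and histogram equalization for a face crop."""
    grey = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    grey = cv2.resize(grey, size, interpolation=cv2.INTER_AREA)
    return cv2.equalizeHist(grey)  # flatten the brightness distribution

def split_intensity_chroma(face_bgr):
    """Separate intensity from chromaticity; HSV approximates the HSI space
    named in the text (OpenCV has no built-in HSI conversion)."""
    h, s, v = cv2.split(cv2.cvtColor(face_bgr, cv2.COLOR_BGR2HSV))
    return v, (h, s)  # intensity channel, chromaticity channels
```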
Color saturation filtering, noisy face filtering and face dimensional filtering are preferably adopted to reduce the value of False Acceptance Error (FAE). Color saturation filtering specifies the number of colors in the whole image, identifies the three dominant colors which appear the most in the whole image, and determines whether these are skin colors based on their relation to skin properties. Noisy face filtering specifies the face area and calculates the number of detected colors in the face area. If the number of detected colors is less than the given threshold, the face is considered a noisy face and the system will report "no face detected". The face dimensional filter calculates the dimensions of the detected face and compares them against the dimensions of the whole image. The filter works based on the location and the dimensions of the detected face, particularly through a width filter and a height filter, as sketched after this paragraph. According to the width filter, if the width of the detected face is lower than 8% of the whole image and the face lies within 1% of the upper or lower edge, the detected face is considered a noisy face and is not sent to the next stage. For the height filter, if the height of the detected face is lower than 12% of the whole image and the face lies within 1% of the left or right edge, the detected face is considered a noisy face and is excluded from the calculation.
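The width and height filters described above translate directly into code; the 8%, 12% and 1% thresholds come from the text, while the function signature is an assumption:

```python
def is_noisy_face(face_box, image_size):
    """Apply the width/height dimensional filters described in the text.

    face_box   -- (x, y, w, h) of the detected face in pixels
    image_size -- (width, height) of the whole image
    Returns True if the detection should be discarded as a noisy face.
    """
    x, y, w, h = face_box
    img_w, img_h = image_size

    # Width filter: face narrower than 8% of the image and within 1%
    # of the top or bottom edge.
    near_top_bottom = y <= 0.01 * img_h or (img_h - (y + h)) <= 0.01 * img_h
    if w < 0.08 * img_w and near_top_bottom:
        return True

    # Height filter: face shorter than 12% of the image and within 1%
    # of the left or right edge.
    near_left_right = x <= 0.01 * img_w or (img_w - (x + w)) <= 0.01 * img_w
    if h < 0.12 * img_h and near_left_right:
        return True

    return False
```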
Thereafter, the facial feature is extracted from the face template to generate face descriptors using a transform matrix corresponding to the respective facial feature. A combination of technologies is adopted in the present invention as fusion techniques for better performance of the face identification. According to one of the preferred embodiments, the Eigenface OpenCV algorithm is used to generate the face descriptors, wherein a face is represented as a linear combination of a set of basis images. The advantage of using the Eigenface OpenCV algorithm is that a periodic signal can be split into simple sines and cosines, and the signal can be approximately reconstructed.
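A minimal sketch of the Eigenface recogniser route, assuming the opencv-contrib-python build (which provides the cv2.face module) and an illustrative component count:

```python
import cv2
import numpy as np

# Assumes opencv-contrib-python; faces are equal-sized greyscale arrays
# and labels are integer identities. 80 components is an assumed value.
recognizer = cv2.face.EigenFaceRecognizer_create(num_components=80)

def train(faces, labels):
    """faces: list of equal-sized greyscale images; labels: list of ints."""
    recognizer.train(faces, np.asarray(labels))

def identify(face):
    """Return (predicted label, distance in eigenspace) for a probe face."""
    return recognizer.predict(face)
```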
According to one of the preferred embodiments, Bayesian Principal Component Analysis (PCA) is used to generate the face descriptors. A probabilistic similarity measure derived from image intensity differences is utilised. PCA defines facial appearance variations as either intrapersonal variations, corresponding to changes in appearance due to facial expression, illumination or pose of the same individual, or extra-personal variations, corresponding to changes in identity. PCA transforms a set of process variables by rotating the axes of representation, whereby every possible variable is considered. The variables include face distance, appearance variations, dimension reduction, hyperplane, and properties of a signal, which affect a successful matching. The system estimates possible density differences using registered data by rotating their axes and creating a full probability model with all possible variables. Thus, when a face is passed to the system, the system reads the differences in image density and decides whether this is a true identity from a face in a face class with noise or simply an unknown face.
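One way to read this paragraph in code is the difference-space test sketched below: a PCA is fitted to intrapersonal difference images, and a probe pair is scored by how well its difference is explained by that subspace. This is an interpretive sketch using scikit-learn, with an assumed component count, not the patent's exact procedure:

```python
import numpy as np
from sklearn.decomposition import PCA

class IntrapersonalModel:
    """Score whether two face vectors show the same identity by how well
    their difference is explained by intrapersonal variation."""

    def __init__(self, n_components=20):  # component count is assumed
        self.pca = PCA(n_components=n_components)

    def fit(self, same_person_pairs):
        # Differences between images of the same individual (expression,
        # illumination and pose changes only).
        diffs = np.array([a - b for a, b in same_person_pairs])
        self.pca.fit(diffs)
        return self

    def similarity(self, face_a, face_b):
        # Higher score = the difference lies closer to the intrapersonal
        # subspace, i.e. the pair is more likely the same identity.
        d = (face_a - face_b).reshape(1, -1)
        recon = self.pca.inverse_transform(self.pca.transform(d))
        return -float(np.linalg.norm(d - recon))  # negative residual norm
```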
According to one of the preferred embodiments, the face descriptors are generated using Client Specific Linear Discriminant Analysis (CSLDA) which employs client specific projections for class separation.
According to one of the preferred embodiments, the face descriptors are generated using a Support Vector Machine (SVM). The SVM takes a set of input data and predicts, for each given input, which of the possible classes it falls into: given that each data point belongs to one of the classes, the goal is to decide which class a new data point will be in (a minimal sketch follows after this passage). According to one of the preferred embodiments, the face descriptors are generated using a Hidden Markov Model (HMM) to characterise the statistical properties of the input facial feature. The HMM approach provides a learning set comprising some input examples and the known-correct output for each case. The input-output examples are then used to show the network the expected behaviour, and the back-propagation algorithm then allows the network to adapt.
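For the SVM variant referred to above, a minimal scikit-learn sketch; the linear kernel, the probability flag, and the shape of the training data are assumptions, as the text does not fix them:

```python
from sklearn.svm import SVC

# X_train: rows of facial-feature vectors; y_train: integer identities.
# A linear kernel is assumed here; the patent does not prescribe one.
classifier = SVC(kernel="linear", probability=True)

def train_svm(X_train, y_train):
    classifier.fit(X_train, y_train)

def predict_identity(feature_vector):
    """Return the predicted class (identity) for one feature vector."""
    return classifier.predict([feature_vector])[0]
```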
Thereafter, the input facial feature is verified by comparing the generated face descriptors with registered face descriptors using a fuser between multiple classifiers for face recognition. Said fusion techniques compute the distance between the faces stored in the database, select the stored face closest to the detected face, and identify it as the "most likely known person". Thereafter, the recognized individual is analyzed and the performance of the different classifiers is combined. The multiple classifiers are fused using the ordinal structure fuzzy module technique, whereby votes for each class are counted over the input classifiers and an output is chosen using the ordinal structure fuzzy module technique.
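The ordinal structure fuzzy module itself is not detailed in the text, so the sketch below implements only the plain vote-counting part of the fusion, under stated assumptions: each classifier nominates its closest identity with a distance, votes are tallied per identity, and ties are broken by the smallest distance.

```python
from collections import Counter

def fuse_by_votes(classifier_outputs):
    """classifier_outputs: list of (identity, distance) pairs, one per
    classifier. Votes are counted per identity; ties are broken by the
    smallest distance. The ordinal-structure fuzzy weighting described
    in the text is not reproduced here."""
    votes = Counter(identity for identity, _ in classifier_outputs)
    best_count = max(votes.values())
    tied = [ident for ident, n in votes.items() if n == best_count]
    if len(tied) == 1:
        return tied[0]
    # Tie-break: pick the tied identity with the smallest distance.
    return min((dist, ident) for ident, dist in classifier_outputs
               if ident in tied)[1]
```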
Below is an example of algorithms adopted for generating face descriptors, from which the advantages of the present invention may be more readily understood. It is to be understood that the following example is for illustrative purpose only and should not be construed to limit the present invention in any way.
Examples
(a) Eigenfaces
Firstly, the original images of a training set are transformed into a set of Eigenfaces, E; then, the weights are calculated for each image of the training set and stored in the set, W; upon observing an unknown image, X, the weights are calculated for that particular image and stored in the vector, WX; WX is then compared to the weights of images that are known with certainty to be faces, i.e. the weights of the training set, W.
In order to determine whether an image is a face, each weight vector is regarded as a point in space, and an average distance, D, is calculated between the weight vectors in W and the weight vector of the unknown image, WX. If the average distance exceeds the set threshold value, the weight vector of the unknown image WX lies too far apart from the weights of the faces, and the image is considered "not a face". Otherwise, if X is a face, its weight vector WX is stored for later classification. The optimal threshold value has to be determined empirically.
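As a worked sketch of the procedure just described, the numpy code below derives the eigenfaces E via an SVD, computes the training weights W, and classifies an unknown image by its average distance to W; the component count and the threshold are assumed values to be tuned empirically, as the text notes.

```python
import numpy as np

def train_eigenfaces(training_images, n_components=50):
    """training_images: (n, d) matrix, one flattened face per row."""
    mean = training_images.mean(axis=0)
    centered = training_images - mean
    # Eigenfaces E: principal directions of the centered training set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    E = vt[:n_components]                 # (k, d) eigenfaces
    W = centered @ E.T                    # (n, k) training weights
    return mean, E, W

def classify(x, mean, E, W, threshold):
    """Project an unknown image X and compare its weights WX against W."""
    wx = (x - mean) @ E.T
    distances = np.linalg.norm(W - wx, axis=1)
    if distances.mean() > threshold:      # too far from all known faces
        return None                       # "not a face"
    return int(distances.argmin())        # index of the closest training face
```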
(b) Bayesian PCA
A full probability model of all observable and unobservable quantities is set up based on the assumption that all variables are random. The conditional density of the variables to be estimated given the observed data (the "posterior") is calculated. The implications of the posterior are evaluated and the accuracy of the estimated quantities is checked.

The posterior density function is then computed. If the likelihood and the prior densities are mathematically simple, this computation is done analytically; otherwise, Markov Chain Monte Carlo is used for the complicated cases. A decision about the sample is then selected from the posterior as the final Bayesian estimate.
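For the analytically tractable case, the classic normal-mean example below shows what such a closed-form posterior looks like; it is a standard textbook illustration under an assumed normal likelihood and prior, not the patent's specific model.

```python
import numpy as np

def normal_mean_posterior(data, sigma, mu0, tau0):
    """Analytic posterior for the mean of N(mu, sigma^2) data under a
    N(mu0, tau0^2) prior -- the simple case computed without MCMC."""
    n = len(data)
    precision = 1.0 / tau0**2 + n / sigma**2
    post_var = 1.0 / precision
    post_mean = post_var * (mu0 / tau0**2 + np.sum(data) / sigma**2)
    return post_mean, post_var  # posterior mean is the Bayesian estimate
```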
(c) Client Specific Linear Discriminant Analysis (CSLDA)
The process of finding the client specific template commences with dimension reduction using PCA, which is necessary so that the within-class scatter matrix is not rank deficient. If the dimension of the image vectors is greater than the number of training samples, the "Snapshot" method is used to find the eigenvectors that project the training data into lower dimensions, as sketched below.
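The "Snapshot" method mentioned above can be sketched in a few lines of numpy: instead of eigendecomposing the huge d x d scatter matrix, the n x n Gram matrix of the n training images is used and its eigenvectors are lifted back to image space. This is a generic sketch of the trick, not code from the patent.

```python
import numpy as np

def snapshot_eigenvectors(A):
    """A: (d, n) matrix of n centered image vectors of dimension d, d >> n.
    Returns orthonormal eigenvectors of A @ A.T without forming the
    d x d matrix, via the n x n Gram matrix A.T @ A."""
    gram = A.T @ A                          # (n, n) instead of (d, d)
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1]          # descending eigenvalues
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > 1e-10                     # drop numerically null directions
    U = A @ vecs[:, keep]                   # lift back to image space
    return U / np.linalg.norm(U, axis=0)    # normalise columns
```

(d) Support Vector Machine (SVM)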
The SVM finds the hyperplane that separates the largest possible fraction of points of the same class on the same side, while maximizing the distance from either class to the hyperplane. It supports both regression and classification tasks and is able to handle multiple continuous and categorical variables. For categorical variables, a dummy variable is created with case values of either 0 or 1. Thus, a categorical dependent variable consisting of three levels is represented by a set of three dummy variables, illustrated as "A", "B", and "C" as follows:
A: {1 0 0}, B: {0 1 0}, C: {0 0 1}
The component-based detector runs over each image in the training set to extract the components for face recognition purposes. Each of the components is normalized in size, and their grey values are combined into a single feature vector that forms the input to the face recognition classifier.
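In code, the component-vector construction described here amounts to resizing each component crop to a fixed size and concatenating the normalised grey values; the component list and the 16x16 patch size are illustrative assumptions:

```python
import cv2
import numpy as np

def component_feature_vector(image_grey, component_boxes, size=(16, 16)):
    """Crop each detected component (eyes, nose, mouth, ...), normalise it
    to a fixed size and stack the grey values into one feature vector.
    component_boxes: list of (x, y, w, h); the 16x16 size is assumed."""
    parts = []
    for x, y, w, h in component_boxes:
        crop = cv2.resize(image_grey[y:y + h, x:x + w], size)
        parts.append(crop.astype(np.float32).ravel() / 255.0)
    return np.concatenate(parts)  # input to the face recognition classifier
```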
(e) Hidden Markov Model (HMM)
HMMs are a set of statistical models used to characterise the statistical properties of a signal. The HMM comprises two interrelated processes: an underlying, unobservable Markov chain with a finite number of states, a state transition probability matrix and an initial state probability distribution; and a set of probability density functions associated with each state.

Although the present invention has been described with reference to specific embodiments, also shown in the appended figures, it will be apparent to those skilled in the art that many variations and modifications can be made within the scope of the invention as described in the specification and defined in the following claims.

Claims

I/We claim:
1. A method of identifying a face in a digital image (100), characterised by the steps of:
detecting a face in a scene of an image;
extracting at least one template from the detected face of the image;
training the extracted template using an ensemble of classifiers for classifying the face into two classes of face and non-face template;
converting the face template to a Greyscale image or Hue Saturation Intensity (HSI) image;
resizing the converted face template to a consistent dimension;
adjusting the resized face template for consistent brightness and contrast;
extracting facial feature from the face template for discrimination;
generating face descriptors using a transform matrix corresponding to the respective facial feature; and
verifying the input facial feature by comparing generated face descriptors with registered face descriptors using a fuser between multiple classifiers for face identification;
wherein the multiple classifiers are fused using an ordinal structure fuzzy module technique;
whereby votes for each class are counted over the input classifiers and an output is chosen using the ordinal structure fuzzy module technique.
2. A method of identifying a face in a digital image (100) according to claim 1, wherein the template extracted from the detected face of the image includes skin color to decrease the false acceptance rate (FAR) and false rejection rate (FRR).
3. A method of identifying a face in a digital image (100) according to claim 1, wherein the ensemble of classifiers comprises an Adaptive Boosting (AdaBoost) classifier and a Haar cascade classifier using an Adaptive Skin Color Filter to decrease false detection.
4. A method of identifying a face in a digital image (100) according to claim 3, wherein the Haar cascade classifier is trained using Open Source Computer Vision (OpenCV).
5. A method of identifying a face in a digital image (100) according to claim 3, wherein the Adaptive Skin Color Filter uses Normalized Lookup Table for skin tone percentage calculation.
6. A method of identifying a face in a digital image (100) according to claim 3, wherein the Adaptive Skin Color Filter can be categorised into methods based on non-linear skin distribution model by Normalized Lookup Table.
7. A method of identifying a face in a digital image (100) according to claim 1, wherein the ensemble of classifiers are weak classifiers modelled by simple histograms and are combined into a strong classifier in a cascade.
8. A method of identifying a face in a digital image (100) according to claim 1, wherein the weighted vote of weak classifiers determines the strong classifier.
9. A method of identifying a face in a digital image (100) according to claim 1, wherein the converted face template is resized using sizing filter technique.
10. A method of identifying a face in a digital image (100) according to claim 1, wherein the resized face template is adjusted using Histogram Equalization.
11. A method of identifying a face in a digital image (100) according to claim 1, wherein the converted face template is subjected to further process comprising color saturation filtering, noisy face filtering, and face dimensional filtering.
12. A method of identifying a face in a digital image (100) according to claim 1, wherein the face descriptors are generated using Eigenface OpenCV algorithm to represent a face as a linear combination of a set of basis images.
13. A method of identifying a face in a digital image (100) according to claim 1, wherein the face descriptors are generated using Bayesian Principal Component Analysis (PCA) to transform a set of process variables by rotating axes of representation.
14. A method of identifying a face in a digital image (100) according to claim 13, wherein the variables include face distance, appearance variations, dimension reduction, hyperplane, and properties of a signal.
15. A method of identifying a face in a digital image (100) according to claim 1, wherein the face descriptors are generated using Client Specific Linear Discriminant Analysis (CSLDA) which employs client specific projections for class separation.
16. A method of identifying a face in a digital image (100) according to claim 1, wherein the face descriptors are generated using Support Vector Machine (SVM) to predict classification of each input facial feature.
17. A method of identifying a face in a digital image (100) according to claim 1, wherein the face descriptors are generated using Hidden Markov Model (HMM) to characterise statistical properties of the input facial feature.
PCT/MY2013/000167 2013-09-12 2013-09-12 A face identification method WO2015037973A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2013701636 2013-09-12
MYPI2013701636 2013-09-12

Publications (1)

Publication Number Publication Date
WO2015037973A1 (en) 2015-03-19

Family

ID=52665990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2013/000167 WO2015037973A1 (en) 2013-09-12 2013-09-12 A face identification method

Country Status (1)

Country Link
WO (1) WO2015037973A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030215115A1 (en) * 2002-04-27 2003-11-20 Samsung Electronics Co., Ltd. Face recognition method and apparatus using component-based face descriptor
US20120120304A1 (en) * 2003-06-26 2012-05-17 DigitalOptics Corporation Europe Limited Digital Image Processing Using Face Detection and Skin Tone Information
KR20120092644A (en) * 2009-10-27 2012-08-21 애플 인크. Method and system for generating and labeling events in photo collections
US20120148160A1 (en) * 2010-07-08 2012-06-14 Honeywell International Inc. Landmark localization for facial imagery
KR20120067131A (en) * 2010-12-15 2012-06-25 한국전자통신연구원 Method for providing map service by using motion capture and iptv for controlling the same

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019120016A1 (en) * 2017-12-20 2019-06-27 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium, and electronic device
CN108875542A (en) * 2018-04-04 2018-11-23 北京旷视科技有限公司 A kind of face identification method, device, system and computer storage medium
CN108875542B (en) * 2018-04-04 2021-06-25 北京旷视科技有限公司 Face recognition method, device and system and computer storage medium
CN110473181A (en) * 2019-07-31 2019-11-19 天津大学 Screen content image based on edge feature information without ginseng quality evaluating method
CN111666866A (en) * 2020-06-02 2020-09-15 中电福富信息科技有限公司 Cross-platform off-line multi-thread face recognition method based on OpenCV
CN111814571A (en) * 2020-06-12 2020-10-23 深圳禾思众成科技有限公司 Mask face recognition method and system based on background filtering

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13893337

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28/06/2016)

122 Ep: pct application non-entry in european phase

Ref document number: 13893337

Country of ref document: EP

Kind code of ref document: A1