KR20150135745A - Device and method for face recognition - Google Patents

Device and method for face recognition

Info

Publication number: KR20150135745A
Application number: KR1020150070651A
Authority: KR (South Korea)
Prior art keywords: quality, face, quality measurement, image, value
Other languages: Korean (ko)
Other versions: KR101756919B1 (en)
Inventors: 박강령, 이원오, 홍형길, 김영곤
Original Assignee: 동국대학교 산학협력단
Application filed by 동국대학교 산학협력단
Publication of KR20150135745A (publication of application)
Application granted
Publication of KR101756919B1 (publication of granted patent)

Classifications

    • G06K9/00268
    • G06K9/00281
    • G06K9/00597

Abstract

The present invention relates to a face recognition apparatus and method, and more particularly, to a face recognition apparatus and method with improved face recognition accuracy.

Description

DEVICE AND METHOD FOR FACE RECOGNITION

The present invention relates to a face recognition apparatus and method, and more particularly, to a face recognition apparatus and method with improved face recognition accuracy.

Biometric technologies applied to existing user-authentication systems use characteristic human information such as fingerprints, face, iris, and voice. In particular, since the shape of the face is what people mainly use to identify one another, face recognition is the most natural and least intrusive biometric technology.

However, conventional face recognition technology using face feature information may suffer degraded recognition performance due to changes in pose, focus state, and illumination state during face detection and tracking in a continuous image sequence.

The background art of the present invention is disclosed in Korean Patent Laid-Open Publication No. 2013-0114893 (published on October 21, 2013).

The present invention proposes a face recognition apparatus and method capable of selecting high-quality face images by adaptively setting the quality measurement items for each image included in a continuous face image sequence and calculating an image quality score.

The present invention also proposes a face recognition apparatus and method that improve face recognition performance by selecting high-quality face images from a continuous face image sequence and performing face recognition on them.

According to an aspect of the present invention, a face recognition apparatus is disclosed.

A face recognition apparatus according to an embodiment of the present invention includes an input unit that receives a plurality of continuous face images; a detection unit that detects feature points in the plurality of continuous face images; a quality measurement unit that measures the quality of each face image for each of a plurality of preset quality measurement items using the detected feature points and calculates a plurality of quality measurement values of each face image; a score calculation unit that adaptively selects at least two of the plurality of quality measurement values, inputs the selected quality measurement values into fuzzy logic, and calculates a quality score of each face image from the resulting output value; a selection unit that selects a preset number of continuous face images from the top on the basis of the quality score; and a face recognition unit that performs face recognition using the selected continuous face images.

The score calculation unit calculates a variance value of the quality measurement values of the plurality of continuous face images for each quality measurement item, compares the calculated variance values to select a preset number of quality measurement items having high variance values, and adaptively selects the quality measurement values by using the quality measurement values corresponding to the selected quality measurement items as the inputs of the fuzzy logic.

The quality measurement unit measures the degree of difference between the registered head pose and the predicted head pose, the degree of illumination change of the face image, the sharpness of the face image, the degree of opening of the detected eyes, the contrast of the face image, and the image resolution to calculate the plurality of quality measurement values.

The quality measurement values are normalized to between 0 and 1 for application to the fuzzy logic.

The quality measurement unit measures the rotation angle of the face recognition object based on the positions of the feature points, compares the measured rotation angle with the registered rotation angle, and calculates the difference value as a quality measurement value.

The quality measurement unit measures the degree of left-right symmetry with reference to the line dividing the face region detected from the face image into left and right halves, and calculates the degree of illumination change of the face image as a quality measurement value.

The quality measurement unit obtains the difference between the pixel values of the face image and the result of applying a low-pass filter to those pixel values, thereby calculating a sharpness reflecting the mid-frequency and high-frequency components as a quality measurement value.

The quality measurement unit calculates, as a quality measurement value, the eye-openness value computed using the standard deviation of the number of black pixels projected onto the horizontal axis.

The quality measurement unit calculates the contrast value by dividing the difference between the pixel values at the 25% and 75% positions of the cumulative histogram of the face image by the range of pixel brightness values.

The quality measuring unit calculates a distance between the detected two eyes as a resolution value.

The face recognition unit calculates the weight of each selected continuous face image as the ratio of its quality score to the sum of the quality scores of the selected continuous face images, and performs face recognition using the calculated weights.

According to another aspect of the present invention, a face recognition method performed by a face recognition apparatus using a plurality of continuous face images is disclosed.

A face recognition method according to an embodiment of the present invention includes receiving a plurality of continuous face images; detecting feature points in the plurality of continuous face images; measuring the quality of each face image for each of a plurality of preset quality measurement items using the detected feature points to calculate a plurality of quality measurement values of each face image; adaptively selecting at least two of the plurality of quality measurement values; calculating a quality score of each face image from the output value obtained by inputting the selected quality measurement values into fuzzy logic; selecting a preset number of continuous face images from the top based on the quality scores; and performing face recognition using the selected continuous face images.

Adaptively selecting at least two of the plurality of quality measurement values comprises: calculating a variance value of the quality measurement values of the plurality of continuous face images for each quality measurement item; comparing the calculated variance values to select a preset number of quality measurement items having high variance values; and determining the quality measurement values corresponding to the selected quality measurement items as inputs of the fuzzy logic.

Calculating the plurality of quality measurement values of each face image measures the degree of difference between the registered head pose and the predicted head pose, the degree of illumination change of the face image, the sharpness of the face image, the degree of opening of the detected eyes, the contrast of the face image, and the image resolution.

Performing face recognition calculates the weight of each selected continuous face image as the ratio of its quality score to the sum of the quality scores of the selected continuous face images, and performs face recognition using the calculated weights.

The present invention improves face recognition performance by selecting high-quality face images from a continuous face image sequence and performing face recognition on them.

The present invention can also improve face recognition performance by adaptively setting the quality measurement items for each image included in the continuous face image sequence and calculating an image quality score.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating an environment to which a face recognition apparatus according to an embodiment of the present invention is applied.
FIG. 2 is a view for explaining the concept of a face recognition apparatus according to an embodiment of the present invention.
FIG. 3 is a view schematically illustrating the configuration of a face recognition apparatus according to an embodiment of the present invention.
FIG. 4 is a view showing an example of binarized images of an open eye and a closed eye and the corresponding histograms according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a procedure for finally calculating the quality score of an input image using a fuzzy system according to an embodiment of the present invention.
FIG. 6 illustrates an example of a symmetric input fuzzy membership function according to an embodiment of the present invention.
FIG. 7 illustrates an example of a symmetric output fuzzy membership function according to an embodiment of the present invention.
FIG. 8 illustrates an example of obtaining output values using the input membership functions according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating an example of defuzzification based on the output membership function and the IVs according to an embodiment of the present invention.
FIG. 10 is a view showing an example of a continuous face image sequence and the face images selected into the face log according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating the concept of MLBP feature extraction according to an embodiment of the present invention.
FIG. 12 is a flowchart illustrating a face recognition method according to an embodiment of the present invention.
FIG. 13 is a view showing an example of face images obtained in the initial registration step according to an embodiment of the present invention.
FIG. 14 illustrates an example of the face and facial features detected and tracked in the recognition step according to an embodiment of the present invention.
FIGS. 15 to 22 are views for explaining experimental results and analyses of the face recognition method according to an embodiment of the present invention.

While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, the present invention will be described in detail with reference to the accompanying drawings. Numerals used in the description of the present invention (e.g., first, second, etc.) are merely identifiers for distinguishing one component from another.

Also, in this specification, when an element is referred to as being "connected" or "coupled" to another element, the element may be directly connected or coupled to the other element, but it should be understood that, unless stated otherwise, it may also be connected or coupled via an intervening element.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. To facilitate a thorough understanding of the present invention, the same reference numerals are used for the same elements regardless of the figure number.

FIG. 1 is a diagram illustrating an environment to which a face recognition apparatus according to an embodiment of the present invention is applied.

Referring to FIG. 1, the face recognition apparatus 100 may perform face recognition on face images of a user obtained by the camera 200. For example, the camera 200 may be a visible-light camera with a zoom lens, the resolution of the images acquired by the camera 200 may be 1600 x 1200 pixels, and each pixel may occupy 3 (RGB) bytes. The capture speed of the camera 200 may be about 10 frames per second due to the bandwidth limitation of the USB interface. The Z distance between the user and the camera 200 can be set to 2 to 2.5 m.

FIG. 2 is a view for explaining the concept of a face recognition apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the face recognition apparatus 100 according to the embodiment of the present invention may have a form in which the face recognition unit 116 is added to the image selection apparatus 110.

Here, in order to improve face recognition performance when the face recognition unit 116 performs face recognition using continuous face images, the image selection apparatus 110 evaluates the quality of each image included in the plurality of continuous face images and selects a preset number of high-quality face images from the continuous face images according to the quality evaluation result.

Accordingly, the face recognition unit 116 performs face recognition on the selected high-quality face images, thereby achieving improved face recognition.

FIG. 3 is a view schematically illustrating the configuration of a face recognition apparatus according to an embodiment of the present invention.

Referring to FIG. 3, the face recognition apparatus 100 includes an input unit 111, a detection unit 112, a quality measurement unit 113, a score calculation unit 114, a selection unit 115, and a face recognition unit 116.

The input unit 111 receives a plurality of continuous face images. Here, the plurality of continuous face images is obtained through face detection and tracking and comprises a plurality of consecutive images including the face region of the face recognition object. Across the continuous face images, image quality may vary due to changes in the face pose, illumination, and focus of the face recognition object.

The detection unit 112 detects feature points from each face image. For example, the detection unit 112 can detect feature points such as the eyes, nose, and mouth in each continuous face image using techniques such as Adaboost, Rapid Eye, Camshift, binarization, and labeling.

The quality measurement unit 113 calculates the quality measurement values of each face image for the plurality of preset quality measurement items using the detected feature points. Here, the quality measurement items may include head pose, illumination, sharpness, openness of the eyes, contrast, and image resolution. In this specification, it is assumed that these six quality measurement items are used to evaluate the quality of the face image.

That is, the quality measurement unit 113 measures the degree of difference between the registered head pose and the predicted head pose, the degree of illumination change of the face image, the sharpness of the face image, the degree of opening of the detected eyes, the contrast of the face image, and the image resolution. The calculated quality measurement values are normalized to between 0 and 1 for application to the fuzzy logic. The normalization may be performed using the maximum or minimum of the quality measurement values: for items for which a larger value indicates better quality, normalization is based on the maximum of the values over the continuous face images, and for items for which a smaller value indicates better quality, normalization is based on the minimum. Accordingly, the closer the normalized quality measurement value is to 1, the better the quality of the face image can be judged to be.
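
For illustration, this per-sequence normalization can be sketched as follows, assuming plain min-max scaling (the function and variable names are illustrative, not taken from the embodiment):

```python
import numpy as np

def normalize_scores(values, larger_is_better=True):
    """Normalize a sequence of quality measurement values to [0, 1].

    A minimal sketch of the normalization described above; min-max
    scaling over the sequence is assumed. After scaling, values closer
    to 1 indicate better quality for the item.
    """
    v = np.asarray(values, dtype=float)
    lo, hi = v.min(), v.max()
    if hi == lo:                       # constant measurements: neutral score
        return np.full_like(v, 0.5)
    scaled = (v - lo) / (hi - lo)      # min-max normalization over the sequence
    return scaled if larger_is_better else 1.0 - scaled
```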

The quality measurement unit 113 measures the rotation angle of the face recognition object based on the positions of the feature points and compares the measured rotation angle with the registered rotation angles to calculate the difference value as the quality measurement value for the head pose. One of the main purposes of face recognition is personal identification despite changes in head pose. In general, out-of-plane rotation (rotation in the horizontal or vertical direction) occurs during face recognition. It is therefore important that the quality measurement item for head pose distinguish between different rotations and assign a higher quality score to the image closest to one of the registered images with respect to head pose. For example, the head pose can be calculated based on the feature points of the detected eyes and nostrils: to calculate the out-of-plane rotation, the distance between the two eyes and the distance between the eyes and the nostrils can be used, so that rotation angles in the X-axis (horizontal) and Y-axis (vertical) directions are calculated from these feature points. Facial feature points and face histogram feature information are obtained for each head pose in the registration step, and the head pose value for a face image in the recognition step may be calculated using the following equation.

[Equation 1]

P(t, a) = sqrt( (x_t - xm_a)^2 + (y_t - ym_a)^2 )

Here, P(t, a) is the head pose value, t is the face image number, and a is the head pose number indexing the five head poses obtained in the registration step. x_t and y_t are the rotation angles calculated in the X- and Y-axis directions for the face image in the recognition step, and xm_a and ym_a are the average rotation angles of all registered users in head pose a. The final head pose value (F_1) of the t-th image, which is the closest of the five head pose values, can then be calculated by the following equation.

[Equation 2]

F_1 = min_{a = 1, ..., 5} P(t, a)
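
For illustration, Equations (1) and (2) can be sketched as follows, under the assumption that P(t, a) is the Euclidean distance between the measured rotation angles and the registered average angles (the names are illustrative):

```python
import math

def head_pose_quality(xt, yt, registered_means):
    """Final head pose value F1 for the t-th face image.

    Assumes P(t, a) is the Euclidean distance between the image's rotation
    angles (xt, yt) and the registered average angles (xm_a, ym_a) of head
    pose a; `registered_means` holds the five (xm_a, ym_a) pairs.
    """
    distances = [math.hypot(xt - xm, yt - ym)      # Equation (1)
                 for xm, ym in registered_means]
    return min(distances)                          # Equation (2)
```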

The quality measurement unit 113 may measure the degree of left-right symmetry of the face region detected from the face image and calculate the degree of illumination change of the face image as a quality measurement value. The left-right symmetry can be calculated as the difference between the average pixel values of the left and right regions; the smaller the difference, the better the quality of the face image. Another important element of face recognition is the change caused by illumination. When the face is evenly illuminated, no shadow or saturated area appears and uniform illumination exists over the entire face region. Thus, by dividing the face region into left and right regions, the difference between the average pixel values of the two regions, the illumination value (F_2), can be calculated using the following equation.

[Equation 3]

I_left = (2 / (W * H)) * sum_{x=1}^{W/2} sum_{y=1}^{H} IMG(x, y)_t
I_right = (2 / (W * H)) * sum_{x=W/2+1}^{W} sum_{y=1}^{H} IMG(x, y)_t

F_2 = | I_left - I_right |

Here, I denotes the average pixel value of a (left or right) face region, IMG(x, y)_t is the pixel value at position (x, y) of the t-th face image, and W and H are the width and height of the face region. F_2 represents an illumination value based on the difference between the average pixel values of the left and right face regions. By the symmetry of the left and right face regions, F_2 becomes smaller when the entire face region is uniformly illuminated.
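
For illustration, a minimal sketch of this measure, assuming F_2 is the absolute difference between the mean pixel values of the left and right halves of the detected face region:

```python
import numpy as np

def illumination_quality(face):
    """Illumination measure F2: absolute difference between the average
    pixel values of the left and right halves of the face region (smaller
    means more uniform illumination)."""
    h, w = face.shape
    left_mean = face[:, : w // 2].mean()
    right_mean = face[:, w // 2 :].mean()
    return abs(left_mean - right_mean)             # Equation (3)
```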

The quality measurement unit 113 can measure the sharpness of the face image and calculate it as a quality measurement value. Since the user's head usually moves in front of the camera, the acquired face image is affected by motion blur, which reduces image quality; a quality measure defined by sharpness is therefore useful. An image blurred by motion has lower sharpness than a well-focused image, so its quality measurement value for sharpness is calculated to be lower. The quality measurement value for sharpness can be calculated using the following equation.

[Equation 4]

F_3 = (1 / (W * H)) * sum_{x=1}^{W} sum_{y=1}^{H} | IMG(x, y)_t - LowPass(IMG(x, y)_t) |

Here, IMG(x, y)_t is the pixel value at position (x, y) of the t-th face image, and LowPass(IMG(x, y)_t) is the result of applying a low-pass filter to IMG(x, y)_t. By taking the difference between IMG(x, y)_t and LowPass(IMG(x, y)_t), F_3, which reflects the mid- and high-frequency components of IMG(x, y)_t, can be calculated.
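
For illustration, a minimal sketch of Equation (4), assuming a simple box filter as the low-pass filter (the embodiment does not fix a particular filter):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sharpness_quality(face, size=3):
    """Sharpness measure F3: mean absolute difference between the image
    and a low-pass-filtered copy, reflecting mid- and high-frequency
    content. A box (uniform) filter is assumed as the low-pass filter."""
    img = face.astype(float)
    low = uniform_filter(img, size=size)           # LowPass(IMG(x, y)_t)
    return float(np.mean(np.abs(img - low)))       # Equation (4)
```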

The quality measurement unit 113 can measure the degree of opening of the detected eyes and calculate it as a quality measurement value. Since eye-related information is important for face recognition, the degree of eye opening can be an important quality measurement item. To calculate the degree of eye opening, a thresholding method that separates the detected eye image into the eye and the background region may be used. After the eye region is acquired, component labeling may be used to remove noise such as hair or eyebrows. The resulting binarized image is projected onto the horizontal axis (x axis) to obtain a histogram of black pixels. For example, FIG. 4 shows an example of binarized images of an open eye and a closed eye and the corresponding histograms. As shown in FIG. 4 (a), for an open eye, the middle area of the histogram has more black pixels than the side areas (a high standard deviation of the number of black pixels). On the other hand, as shown in FIG. 4 (b), for a closed eye, the middle and side areas are similar (a low standard deviation of the number of black pixels). Based on this characteristic, the standard deviation of the number of black pixels can be calculated as the quality measurement value F_4 using the following equation.

[Equation 5]

F_4 = sqrt( (1 / n) * sum_{i=1}^{n} (x_i - x_mean)^2 )

Here, F_4 is the eye-openness value calculated using the standard deviation of the number of black pixels projected onto the horizontal axis. i and n represent the position of the projected pixels and the range of the histogram, respectively. x_i is the number of black pixels at the i-th position, and x_mean is the average number of black pixels. F_4 is calculated as a high value for an open eye and a low value for a closed eye.
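
For illustration, a minimal sketch of Equation (5) operating on the binarized eye image after noise removal:

```python
import numpy as np

def eye_openness_quality(eye_binary):
    """Eye-openness measure F4: standard deviation of the number of black
    (eye) pixels per column after projecting the binarized eye image onto
    the horizontal axis; high for an open eye, low for a closed eye."""
    histogram = eye_binary.sum(axis=0)             # black pixels per column, x_i
    return float(np.std(histogram))                # Equation (5)
```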

The quality measurement unit 113 can measure the contrast of the face image and calculate a quality measurement value for contrast. Contrast has a significant impact both on human perception and on the image quality relevant to image analysis; a poor lighting environment reduces contrast and produces an unnatural image. The contrast value F_5 can be calculated using the following equation.

[Equation 6]

F_5 = (H_q3 - H_q1) / I_r

Here, H_q1 and H_q3 are the pixel values at the 25% and 75% positions of the cumulative histogram of the face image, respectively, and I_r is the range of pixel brightness values of the face image. In general, a high-contrast image has a wider range of pixel brightness values than a low-contrast image; therefore, F_5 is larger for a high-contrast image.
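
For illustration, a minimal sketch of Equation (6), assuming 8-bit pixel brightness so that I_r = 256:

```python
import numpy as np

def contrast_quality(face, brightness_range=256):
    """Contrast measure F5: difference between the pixel values at the
    25% and 75% points of the cumulative histogram (Hq1, Hq3), divided
    by the brightness range Ir (assumed 256 for 8-bit images)."""
    hq1, hq3 = np.percentile(face, [25, 75])
    return (hq3 - hq1) / brightness_range          # Equation (6)
```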

The quality measurement unit 113 may measure the image resolution of the face image and calculate a quality measurement value for resolution. Resolution is the easiest quality measure to obtain. In general, a high-resolution face image is preferred to a low-resolution one because it produces a better face recognition result. To reflect this in the quality evaluation, the distance between the two detected eyes can be calculated as the quality measurement value F_6, so that the highest quality scores are assigned to high-resolution face images.

The score calculation unit 114 calculates the quality score of each face image using fuzzy logic. In other words, the score calculation unit 114 inputs a plurality of quality measurement values into the fuzzy logic and calculates the quality score of each face image from the resulting output value. At this time, the score calculation unit 114 adaptively selects at least two of the quality measurement values calculated for the plurality of quality measurement items and uses the selected values as the inputs of the fuzzy logic. For example, after the six quality measurement values (F_1 to F_6) described above are calculated, some of these basic quality measurement values that affect recognition performance may be adaptively selected. The reason is that the factor that reduces face recognition performance changes from one continuous face image sequence to another: in a first sequence, changes in illumination and head pose may be the main factors, while in a second sequence, changes in resolution and sharpness may dominate. Therefore, using a fixed set of quality measurement items for all sequences cannot cope with all of these factors. To solve this problem, four quality measurement values may be adaptively selected by comparing the variance values of the six quality measurement values over the continuous face image sequence: a quality measurement item with a high variance is a more meaningful factor for discriminating between images.
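
For illustration, this variance-based adaptive selection can be sketched as follows (names are illustrative):

```python
import numpy as np

def select_adaptive_measures(quality_matrix, k=4):
    """Adaptively pick the k quality measurement items with the largest
    variance across a continuous face image sequence.

    quality_matrix: array of shape (num_images, 6) holding the normalized
    measurements F1..F6 for every image in the sequence.
    """
    variances = np.var(quality_matrix, axis=0)     # per-item variance over the sequence
    selected = np.argsort(variances)[::-1][:k]     # indices of the k highest-variance items
    return selected, quality_matrix[:, selected]   # QM1..QM4 inputs for the fuzzy system
```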

FIG. 5 is a diagram illustrating a procedure for finally calculating the quality score of an input image using the fuzzy system according to an embodiment of the present invention.

As shown in FIG. 5, four quality measurement values may be selected and input to the fuzzy system to obtain the quality score of the face image. To use the fuzzy system, the membership functions must be determined.

FIG. 6 is a diagram illustrating an example of a symmetric input fuzzy membership function according to an embodiment of the present invention.

As shown in FIG. 6, the membership functions can be designed according to the input values QM_1 to QM_4. Hereinafter, the four selected quality measurement values (QM_1 to QM_4) are referred to as elements 1 to 4 for convenience of explanation. The larger the values of elements 1 to 4, the higher the quality of the image. As shown in FIG. 6, the input values of the fuzzy system can be classified into two types, low (L) and high (H). The output membership function of the fuzzy system can be designed as shown in FIG. 7, which shows an example of a symmetric output fuzzy membership function. Here, the output value (quality score) can be classified as low (L), middle (M), or high (H). In general, a membership function is designed by considering the distributions of the input and output values. The distributions of low and high are assumed to be similar to each other, so the membership functions of low and high can be symmetric with respect to 0.5. Additionally, the left part of the middle distribution can be assumed to be similar to the right part about the mean (0.5), so the middle membership function can be symmetric with respect to the mean (0.5). As described above, the larger the values of elements 1 to 4, the higher the quality of the image. Accordingly, based on the fuzzy rules, the relationship between the input values (elements 1 to 4) and the output value (quality score) can be expressed as shown in Table 1 below.

| Factor 1 | Factor 2 | Factor 3 | Factor 4 | Output |
| L | L | L | L | L |
| L | L | L | H | L |
| L | L | H | L | L |
| L | H | L | L | L |
| H | L | L | L | L |
| L | L | H | H | M |
| L | H | L | H | M |
| L | H | H | L | M |
| H | L | L | H | M |
| H | L | H | L | M |
| H | H | L | L | M |
| L | H | H | H | H |
| H | L | H | H | H |
| H | H | L | H | H |
| H | H | H | L | H |
| H | H | H | H | H |

For example, if the values of all elements are low, the image quality can be considered low and the output value is given as low. If the values of two elements are low and the values of the remaining two are high, the output value is given as middle. If the values of all elements are high, the image quality can be considered high and the output value is given as high. Since the weights of elements 1 to 4 are all considered similar, the fuzzy rule table of Table 1 is designed with symmetry in mind. That is, just as the output value is low when elements 1 to 4 are all low, the output value is high when they are all high; likewise, the rule for the inputs (low, high, low, low) mirrors the rule for the inputs (high, low, high, high). An output value (quality score) can then be obtained based on the fuzzy membership functions and the fuzzy rules designed in consideration of this symmetry.

FIG. 8 is a diagram illustrating an example of obtaining output values using the input membership functions according to an embodiment of the present invention.

As shown in FIG. 8, one input value of an element corresponds to two output values through the membership functions. Since there are four input values (elements 1 to 4), eight output values are obtained in total. For example, as shown in FIG. 8, a pair of output values (0.2 (low), 0.8 (high)) can be calculated from element 1 (0.8). Similarly, assuming that elements 2 to 4 also have the value 0.8, the four pairs {(0.2 (low), 0.8 (high)), (0.2 (low), 0.8 (high)), (0.2 (low), 0.8 (high)), (0.2 (low), 0.8 (high))} are obtained. Taking one output value from each pair yields 16 subsets, such as (0.2 (low), 0.2 (low), 0.2 (low), 0.2 (low)) and (0.2 (low), 0.2 (low), 0.2 (low), 0.8 (high)). From each subset, one output value (0.2 or 0.8) and its symbol (low, middle, or high) can be determined based on the MIN or MAX method and the fuzzy rules of Table 1. The MIN method selects the minimum output value within the subset, while the MAX method selects the maximum. For example, using the MAX method, the output value for the subset (0.2 (low), 0.2 (low), 0.2 (low), 0.8 (high)) is determined to be 0.8, and its symbol is determined to be low using Table 1; thus 0.8 (low) is obtained from this subset. In this specification, such a value, for example 0.8 (low), is called an inference value (IV). If the MIN method is applied instead, the IV obtained from the same subset is 0.2 (low). In this way, 16 IVs are obtained from the 16 subsets, and the final quality score can be calculated from these 16 IVs using a defuzzification process.
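
For illustration, this inference step can be sketched as follows, assuming linear input membership functions (consistent with the example above, where an input of 0.8 yields 0.2 (low) and 0.8 (high)) and the rule table of Table 1:

```python
import itertools

# Symmetric input membership functions, assumed linear so that an input
# of 0.8 yields 0.2 (low) and 0.8 (high), matching the example above.
def mu_low(x):
    return 1.0 - x

def mu_high(x):
    return x

def rule(symbols):
    """Fuzzy rule of Table 1: output symbol from the count of 'H' inputs."""
    n_high = symbols.count('H')
    if n_high <= 1:
        return 'L'
    return 'M' if n_high == 2 else 'H'

def inference_values(inputs, combine=max):
    """Compute the 16 inference values (IVs) for the four inputs QM1..QM4.

    `combine` is max for the MAX method or min for the MIN method.
    Returns a list of (value, symbol) pairs.
    """
    pairs = [((mu_low(x), 'L'), (mu_high(x), 'H')) for x in inputs]
    ivs = []
    for subset in itertools.product(*pairs):   # 2^4 = 16 subsets
        value = combine(v for v, _ in subset)  # MIN or MAX of the four values
        symbol = rule([s for _, s in subset])  # symbol looked up in Table 1
        ivs.append((value, symbol))
    return ivs

# Example: four inputs of 0.8 each, MAX method.
ivs = inference_values([0.8, 0.8, 0.8, 0.8], combine=max)
```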

FIG. 9 is a diagram illustrating an example of defuzzification based on the output membership function and the IVs according to an embodiment of the present invention.

As shown in FIG. 9, one of two output values (quality scores) can be obtained for each IV. If the IV is 0.2 (low), the corresponding output value is S_1. Thus, a plurality of output values (S_1, S_2, ..., S_N) can be obtained from the sixteen IVs, and the final output value (quality score) is determined by a defuzzification method. Five defuzzification methods can be used: first of maxima (FOM), last of maxima (LOM), middle of maxima (MOM), mean of maxima (MeOM), and center of gravity (COG). FOM selects, as the output value, the first output value (S_2) among those calculated using the maximum IV (0.8 (middle)). LOM selects the last output value (S_3) among those calculated using the maximum IV. MOM selects the middle output value ((S_2 + S_3) / 2) among those calculated using the maximum IV, and MeOM selects the average output value ((S_2 + S_3) / 2) among those calculated using the maximum IV. The output value of COG can be determined as S_5, the geometric center of the common region of the three regions R_1, R_2, and R_3, as shown in FIG. 9 (b); the geometric center is calculated as the weighted average over all regions defined by all IVs.
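
For illustration, the maxima-based defuzzification variants can be sketched as follows (COG is omitted, since it depends on the geometry of the overlapping output membership regions):

```python
def defuzzify(candidate_scores, method='MeOM'):
    """Combine the candidate output scores derived from the maximum IVs.

    A minimal sketch of the maxima-based defuzzification methods named
    above; `candidate_scores` are the output values (e.g., S2, S3)
    obtained from the maximum IV.
    """
    s = sorted(candidate_scores)
    if method == 'FOM':                 # first of maxima
        return s[0]
    if method == 'LOM':                 # last of maxima
        return s[-1]
    if method == 'MOM':                 # middle of maxima
        return (s[0] + s[-1]) / 2.0
    if method == 'MeOM':                # mean of maxima
        return sum(s) / len(s)
    raise ValueError('unknown method: ' + method)

# Example: two candidate scores from the maximum IV.
print(defuzzify([0.55, 0.75], 'MOM'))   # 0.65
```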

After the quality scores of all of the plurality of continuous face images are calculated, the selection unit 115 selects a preset number of continuous face images from the top based on the quality score. Using the output quality scores, a face log for face recognition can be created; here, the face log means a set of high-quality face images. The quality score can be stored for each face region of a person observed in a continuous face image sequence composed of n images, and m face images can be selected in order of quality score to form the face log.

FIG. 10 is a view showing an example of a continuous face image sequence and the face images selected into the face log according to an embodiment of the present invention. The quality score shown in FIG. 10 (b) represents the quality score of the i-th image in FIG. 10 (a).

The face recognition unit 116 can perform face recognition using the selected continuous face images and weights calculated from their quality scores. For example, the m face images selected into the face log can be used for face recognition. After the selected face images are obtained, the face recognition unit 116 may redefine the face region based on the detected and tracked eye positions in order to normalize the size of the face region, since the size of the face region changes with the Z-axis distance between the camera and the face. In addition, illumination changes may occur in the face area while the user is watching TV; to address this, retinex filtering can be performed for illumination normalization. Features for face recognition can then be extracted from the illumination-normalized face image. According to an embodiment of the present invention, the face recognition unit 116 may use a multi-level local binary pattern (MLBP) method to extract features from the normalized face image. Alternatively, the face recognition unit 116 may use at least one of principal component analysis (PCA), linear discriminant analysis (LDA), and local binary pattern (LBP).
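
For illustration, the kind of block-wise LBP histogram extraction that MLBP builds on (described with reference to FIG. 11 below) can be sketched as follows; the 4 x 4 grid and single-radius operator are assumptions:

```python
import numpy as np

def lbp_histogram(block):
    """Basic 8-neighbor LBP histogram of one sub-block (256 bins)."""
    h, w = block.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = block[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = block[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= ((neighbor >= center).astype(np.uint8) << bit)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)               # normalized histogram

def face_feature(face, grid=(4, 4)):
    """Concatenate per-block LBP histograms into one feature vector."""
    h, w = face.shape
    bh, bw = h // grid[0], w // grid[1]
    blocks = [face[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
              for r in range(grid[0]) for c in range(grid[1])]
    return np.concatenate([lbp_histogram(b) for b in blocks])
```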

FIG. 11 is a diagram illustrating the concept of MLBP feature extraction according to an embodiment of the present invention.

Referring to FIG. 11, the face region is divided into sub-blocks, and an LBP histogram can be obtained from each block as shown in FIG. 11 (c). The histograms of all blocks may be concatenated to form the final feature vector for face recognition, as shown in FIG. 11 (d). For example, a chi-squared distance (matching score) can be used to measure the difference between a registered face histogram feature and the face histogram feature of the input image. To deal with head pose changes (horizontal and vertical rotation), the histogram feature of the input image is matched against the five registered face histogram features, and the matching score may be taken as the smallest of the five. To obtain the final matching score, the matching scores of the face images selected into the face log are fused. The weight of each matching score can be calculated by the following equation.

[Equation 7]

w_i = Score_i / sum_{j=1}^{m} Score_j

Here, w_i represents the weight of the matching score of the i-th face image in the face log, m represents the number of face images in the face log, and Score_i represents the i-th quality score in the face log.

The final matching score for face recognition can be calculated using the weights of the selected face images, as shown in the following equation.

[Equation 8]

FMS = sum_{i=1}^{m} w_i * MS_i

Using the matching scores (MS_1, ..., MS_m) obtained by MLBP and the weights (w_1, ..., w_m) of the face images in the face log, the fused matching score (FMS) can be calculated by Equation (8).
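
For illustration, the chi-squared matching and the weighted fusion of Equations (7) and (8) can be sketched as follows (inputs are assumed to be parallel lists over the m face images of the face log):

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two histogram feature vectors,
    usable as the matching score between registered and input features."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def fused_matching_score(matching_scores, quality_scores):
    """Fuse the matching scores MS_1..MS_m of the face-log images with
    quality-score weights."""
    ms = np.asarray(matching_scores, dtype=float)
    qs = np.asarray(quality_scores, dtype=float)
    w = qs / qs.sum()              # Equation (7): w_i = Score_i / sum_j Score_j
    return float(np.sum(w * ms))   # Equation (8): FMS = sum_i w_i * MS_i
```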

FIG. 12 is a flowchart illustrating a face recognition method according to an embodiment of the present invention, schematically showing the face recognition method performed by the face recognition apparatus.

First, the initial registration step will be described.

In step S1210, the face recognition apparatus 100 starts acquiring continuous face images of the user. For example, the face recognition apparatus 100 can acquire the user's RGB color images captured by the camera 200 at a predetermined Z-axis distance of 2 m.

In step S1220, the face recognition apparatus 100 acquires continuous face images according to the user's gazing point. For example, the user gazes, by rotating the head, at five points at the upper left, upper right, center, lower left, and lower right of the TV screen, and the face recognition apparatus 100 acquires the continuous face images for each gazing point.

In step S1230, the face recognition apparatus 100 acquires and stores facial feature information from the continuous face images for each gazing point. For example, the facial feature information may include the face texture, eyes, and nostrils. In addition, the face recognition apparatus can acquire face histogram feature information according to the various head poses as well as the facial feature information.

Next, the recognition step will be described.

In step S1240, the face recognition apparatus 100 acquires continuous face images of the user. For example, the face recognition apparatus 100 may acquire RGB color images of the user captured by the camera 200.

In step S1250, the face recognition apparatus 100 detects the face region in the acquired continuous face image. For example, the face recognition apparatus 100 can detect a face region using an adaptive boosting (Adaboost) algorithm.

In step S1260, the face recognition apparatus 100 tracks the detected face area. For example, the face recognition apparatus 100 may track the detected face region using a continuously adaptive mean shift (CamShift) algorithm.

In step S1270, the face recognition apparatus 100 detects the eye region. For example, the face recognition apparatus 100 can detect an eye region using an Adaboost algorithm and an adaptive template matching (ATM) method.

In step S1280, the face recognition apparatus 100 performs nostril detection. For example, the face recognition apparatus 100 may detect nostrils by performing template matching based on subblocks.

In step S1290, the face recognition apparatus 100 selects a preset number of face images from the continuous face images through quality measurement and fuzzy-based quality evaluation. The face recognition apparatus 100 measures the quality of each face image against the plurality of preset quality measurement items using the detected feature points. The face recognition apparatus 100 then inputs the quality measurement values corresponding to the adaptively selected quality measurement items into the fuzzy logic and calculates the quality score of each face image from the resulting output value. Here, the adaptive selection among the plurality of quality measurement values may be based on the variance of the quality measurement values of the plurality of continuous face images for each quality measurement item. Finally, the face recognition apparatus 100 selects a preset number of continuous face images from the top based on the calculated quality scores.

In step S1300, the face recognition apparatus 100 performs face recognition by fusing the matching scores of the selected images.

Five face images according to the gazing points are obtained in the initial registration step to improve the accuracy of face recognition irrespective of head pose. In the initial registration step, the user can face the TV at a Z-axis distance of 2 m.

FIG. 13 is a diagram showing an example of face images obtained in the initial registration step. Face and eye regions can be detected in the acquired face images using the Adaboost algorithm, and the nostril regions can be detected using a sub-block-based template matching method. Based on the facial features (face, eyes, and nostrils), facial feature information used to estimate the head pose of an image at the recognition step is stored, and face histogram feature information according to head pose is registered to identify each user.

Also, in the recognition step, the continuous face image information can be used for face detection and tracking. The face region detected by the Adaboost algorithm can be tracked by the CamShift algorithm. Although the Adaboost method has a high detection rate, it requires considerable processing time; face tracking using the CamShift algorithm, in contrast, has the advantages of high processing speed and robustness to head pose changes.

Eye detection and tracking can be performed using the Adaboost algorithm and the ATM method, respectively. The nostril region can then be detected using a nostril detection mask based on sub-block-based template matching and tracked by the ATM method. A face image for quality evaluation can be obtained using this facial feature information. FIG. 14 is a diagram showing an example of the face and facial features detected and tracked in the recognition step.

FIGS. 15 to 22 are diagrams for explaining the experimental results and analyses of the face recognition method according to the embodiment of the present invention.

Although many face databases exist, most do not include all of the elements considered here (head pose, illumination, sharpness, openness of the eyes, contrast, and image resolution) in continuous face images, so a self-built database (Database I) was used in the experiment. The self-built database contains all of these elements in continuous face images.

When creating the self-built database, 20 groups were defined based on the 20 people participating in the experiment. Three subjects in each group performed three captures while varying the Z-axis distance (2 m and 2.5 m) and sitting position (left, middle, and right). Participants blinked and looked naturally at random points on the TV screen, and successive images were acquired during this period. The self-built database contains a total of 31,234 images for measuring the performance of the face recognition system. In addition, at the initial registration stage, five images per person in each group were acquired at a Z-axis distance of 2 m. FIG. 15 shows an example of the images obtained for the experiment.

FIG. 16 shows, for the self-built database, how often each of the quality measurement items (head pose, illumination, sharpness, openness of the eyes, contrast, and image resolution) was selected. The most influential quality measurement item for creating a face log is head pose. Since there was little change in the Z-axis distance, image resolution was observed to have little influence as a quality measurement item. In FIG. 16, the bars in the first column represent the number of times each quality measurement value was selected first; similarly, the bars in the fourth column represent the number of times each was selected fourth. That is, the vertical axis in FIG. 16 is the number of times each quality measurement value was selected.

In the first experiment, the accuracy of the face recognition method was measured based on the genuine acceptance rate (GAR), with the number of registered persons set to three. Either the MIN or the MAX method was selected to obtain the IVs, and the final quality score was obtained using one of the five defuzzification methods (FOM, LOM, MOM, MeOM, COG). As shown in Table 2 below, the accuracy of face recognition was compared between the MIN and MAX methods and across the defuzzification methods.

(Accuracy: GAR, %; m is the number of selected images in the face log.)

| Method | Defuzzification | m = 1, no fusion | m = 2, no fusion | m = 2, fusion | m = 3, no fusion | m = 3, fusion | m = 4, no fusion | m = 4, fusion | m = 5, no fusion | m = 5, fusion |
| Fuzzy MIN rule | FOM | 92.05 | 89.74 | 91.95 | 91.35 | 92.3 | 91.36 | 92.42 | 91.32 | 92.44 |
| | LOM | 91.61 | 91.05 | 91.78 | 91.3 | 91.95 | 91.28 | 92.17 | 91.33 | 92.23 |
| | MOM | 92.29 | 91.35 | 91.88 | 91.43 | 92.05 | 91.45 | 92.38 | 91.45 | 92.45 |
| | MeOM | 92.23 | 91.54 | 92.29 | 91.53 | 92.63 | 91.48 | 92.92 | 91.41 | 92.82 |
| | COG | 92.27 | 91.61 | 92.42 | 91.63 | 92.73 | 91.63 | 92.89 | 91.58 | 92.94 |
| Fuzzy MAX rule | FOM | 92.12 | 91.49 | 91.74 | 91.49 | 91.98 | 91.46 | 92.1 | 91.45 | 92.16 |
| | LOM | 92.31 | 91.38 | 91.9 | 91.41 | 92.18 | 91.36 | 92.2 | 91.36 | 92.46 |
| | MOM | 91.77 | 91.31 | 92.01 | 91.53 | 92.43 | 91.46 | 92.4 | 91.43 | 92.42 |
| | MeOM | 92.43 | 91.47 | 92.09 | 91.52 | 92.68 | 91.51 | 92.72 | 91.47 | 92.72 |
| | COG | 92.37 | 91.4 | 92.33 | 91.58 | 92.77 | 91.42 | 92.72 | 91.54 | 92.86 |

Here, the number of continuous face images is 10, and the accuracy of face recognition was measured while varying the number of selected images in the face log. In Table 2, "no fusion" means that the matching scores (MS_1, ..., MS_m) of Equation (8) were used to calculate the GAR without the weight-based fusion of Equation (7); "fusion" means that the FMS of Equation (8) was used to calculate the GAR.

The experimental results show that the methods based on MIN and COG give higher face recognition accuracy than the other methods. The highest accuracy (92.94%) was obtained with the fuzzy MIN rule and COG when fusing five selected images.

In the next experiment, the accuracy of face recognition (GAR) was measured according to the number of selected face images, as shown in Table 3 below.

(Accuracy: GAR, %; n is the number of images in the video sequence; all results use fusion.)

| Number of selected images (m) | n = 10 | n = 15 | n = 20 | n = 25 |
| 2 | 92.42 | 93.12 | 93.63 | 93.92 |
| 3 | 92.73 | 93.59 | 93.89 | 94.33 |
| 4 | 92.89 | 93.57 | 94.06 | 94.59 |
| 5 | 92.94 | 93.72 | 94.18 | 94.71 |

In Table 3, fusion means that the FMS of Equation (8) was used to calculate the GAR. As the number of selected face images increases, the accuracy of face recognition improves.

In addition, the accuracy of the proposed method was compared with methods based on fixed quality measurement items. Using the results of FIG. 16, four influential quality measurement items (head pose, illumination, sharpness, and openness of the eyes) were selected for the fixed-measure method. In this experiment, n and m were set to 25 and 5, respectively. The method based on fixed quality measurement items is less accurate than the proposed method because it cannot evaluate other quality measurement items that affect the accuracy of face recognition. The accuracy of the existing method using all images and of the method based on fixed quality measurement items was compared with that of the proposed method; Table 4 below shows the results. The experimental results show that the proposed method is more accurate than the other methods.

| | Using all images in a video sequence [40] | Fixed quality measures-based approach [21] | Adaptive quality measures-based approach (proposed method) |
| Accuracy | 89.74 | 93.09 | 94.71 |

FIGS. 17 and 18 show face images from correct recognition results according to an embodiment of the present invention. As shown in FIGS. 17 (a) and 17 (b), even when part of a face is hidden by a hand, the face region is not detected correctly, or the eyes are closed, such low-quality images can be excluded. Based on these results, it can be seen that the proposed method can accurately select good images and correctly match them with registered face images of the same person.

FIGS. 19 and 20 show face images from incorrect recognition results according to the embodiment of the present invention. As shown in FIGS. 19 and 20, even when images with closed eyes are present, the proposed method can exclude the low-quality images and accurately select good-quality face images; however, the selected images were incorrectly matched with registered face images of other people. This is because there is a size difference between the face region of the registered image and that of the input image due to incorrect detection of the face and eye regions. This can be addressed by increasing the accuracy of redefining the face region based on a more accurate detection algorithm.

For the experiment, as shown in FIG. 13, images obtained while each user naturally watched TV were used. As shown in FIG. 21, when severe head rotation occurs, it is difficult to detect the face and facial features; such severe head rotation does not occur when a user watches TV normally, so images of this kind were not used in the experiment.

When a user watches TV as usual, the basic factors determining the degree of head pose change are the size of the TV and the viewing distance between the user and the TV. The relationship between TV size and proper viewing distance is already defined: the larger the TV, the greater the viewing distance should be, and the smaller the TV, the closer the viewing distance should be.

FIGS. 22 (a) to 22 (c) illustrate cases where (a) a user views a 50-inch TV at a viewing distance of about 2 m, (b) a user views a 60-inch TV, and (c) a user views a 70-inch TV at a viewing distance of about 2.8 m. As shown in FIGS. 22 (a) to 22 (c), although the image resolution decreases as the viewing distance increases, when each user gazes at the same position (lower left) on the TV, the degree of change in head pose is similar in all cases. As shown in the bottom images of FIGS. 22 (a) to 22 (c), accurate regions of the face and facial features are detected by the proposed method, and the proposed face image quality measurement operates successfully. This shows that the performance of the proposed method is not affected by the use of a small or large TV when the proper viewing distance is considered.

Meanwhile, the face recognition method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various means for electronically processing information and recorded in a storage medium. The storage medium may include program instructions, data files, data structures, and the like, alone or in combination.

Program instructions recorded on the storage medium may be those specially designed and constructed for the present invention or those available to those skilled in the software art. Examples of storage media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. The medium may also be a transmission medium, such as an optical or metal wire or a waveguide, including a carrier wave transmitting a signal specifying program instructions, data structures, and the like. Examples of program instructions include machine code, such as that produced by a compiler, and high-level language code that can be executed by a device for electronically processing information, for example using an interpreter.

The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit and scope of the invention; the invention is intended to cover all such modifications, equivalents, and alternatives falling within the scope of the appended claims.

100: face recognition apparatus
111: input unit
112: detection unit
113: quality measurement unit
114: score calculation unit
115: selection unit
116: face recognition unit

Claims (16)

An input unit for receiving a plurality of continuous face images;
A detecting unit detecting feature points in the plurality of continuous face images;
A quality measurement unit for measuring the quality of each face image for each of a plurality of preset quality measurement items using the detected feature points and calculating a plurality of quality measurement values of each face image;
A score calculation unit for adaptively selecting at least two of the plurality of quality measurement values, inputting the selected quality measurement value into fuzzy logic, and calculating a quality score of each face image with the calculated output value;
A selection unit for selecting a preset number of continuous face images from the top based on the quality score; and
And a face recognition unit for performing face recognition using the selected continuous face image.

The method according to claim 1,
Wherein the score calculation unit calculates a variance value of the quality measurement values of the plurality of continuous face images for each quality measurement item, compares the calculated variance values to select a preset number of quality measurement items having high variance values, and adaptively selects the quality measurement values by using the quality measurement values corresponding to the selected quality measurement items as the input of the fuzzy logic.
The method according to claim 1,
Wherein the quality measurement unit measures the degree of difference between the registered head pose and the predicted head pose, the degree of illumination change of the face image, the sharpness of the face image, the degree of opening of the detected eyes, the contrast of the face image, and the image resolution to calculate the plurality of quality measurement values.
The method of claim 3,
Wherein the quality measurement value is normalized between 0 and 1 for application to the fuzzy logic.
The method of claim 3,
Wherein the quality measurement unit measures the rotation angle of the face recognition object based on the positions of the feature points and compares the measured rotation angle with the registered rotation angle to calculate the difference value as a quality measurement value.
The method of claim 3,
Wherein the quality measurement unit measures the degree of left-right symmetry with reference to the line dividing the face region detected from the face image into left and right halves, and calculates the degree of illumination change of the face image as a quality measurement value.
The method of claim 3,
Wherein the quality measurement unit calculates, as a quality measurement value, a sharpness reflecting the mid-frequency and high-frequency components by obtaining the difference between the pixel values of the face image and the result of applying a low-pass filter to those pixel values.
The method of claim 3,
Wherein the quality measuring unit calculates an open value of eyes calculated using a standard deviation of the number of black pixels projected on a horizontal axis as a quality measurement value.
The method of claim 3,
Wherein the quality measurement unit calculates the contrast value by dividing the difference between the pixel values at the 25% and 75% positions of the cumulative histogram of the face image by the range of pixel brightness values.
The method of claim 3,
Wherein the quality measuring unit calculates a distance between the detected two eyes as a resolution value.
The method according to claim 1,
Wherein the face recognition unit calculates the weight of each selected continuous face image as the ratio of its quality score to the sum of the quality scores of the selected continuous face images and performs face recognition using the calculated weights.
A face recognition method performed by a face recognition apparatus using a plurality of continuous face images,
Receiving the plurality of continuous face images;
Detecting feature points in the plurality of continuous face images;
Calculating a plurality of quality measurement values of each face image by measuring quality of each face image for each of a plurality of preset quality measurement items using the detected feature points;
Adaptively selecting at least two of the plurality of quality measurements;
Inputting the selected quality measurement value into a fuzzy logic and calculating a quality score of each of the face images using the calculated output value;
Selecting a preset number of continuous face images from the top based on the quality scores; and
And performing face recognition using the selected continuous face image.
13. The method of claim 12,
Wherein adaptively selecting at least two of the plurality of quality measurements comprises:
Calculating a variance value of quality measurement values of the plurality of continuous face images for each quality measurement item;
Comparing the calculated variance values to select a preset number of quality metrics having a high variance value; And
and determining the quality measurement values corresponding to the selected quality measurement items as inputs of the fuzzy logic.
13. The method of claim 12,
Wherein the step of calculating a plurality of quality measurement values of each face image comprises:
Measuring the degree of difference between the registered head pose and the predicted head pose, the degree of illumination change of the face image, the sharpness of the face image, the degree of opening of the detected eyes, the contrast of the face image, and the image resolution to calculate the plurality of quality measurement values.
13. The method of claim 12,
Wherein the step of performing face recognition comprises:
Calculating the weight of each selected continuous face image as the ratio of its quality score to the sum of the quality scores of the selected continuous face images, and performing face recognition using the calculated weights.
A computer program for executing the face recognition method according to any one of claims 12 to 15.

KR1020150070651A 2014-05-23 2015-05-20 Device and method for face recognition KR101756919B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20140062385 2014-05-23
KR1020140062385 2014-05-23

Publications (2)

Publication Number Publication Date
KR20150135745A true KR20150135745A (en) 2015-12-03
KR101756919B1 KR101756919B1 (en) 2017-07-13

Family

ID=54872027

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150070651A KR101756919B1 (en) 2014-05-23 2015-05-20 Device and method for face recognition

Country Status (1)

Country Link
KR (1) KR101756919B1 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101254181B1 (en) * 2012-12-13 2013-04-19 위아코퍼레이션 주식회사 Face recognition method using data processing technologies based on hybrid approach and radial basis function neural networks

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017112310A1 (en) * 2015-12-24 2017-06-29 Intel Corporation Facial contour recognition for identification
US10977509B2 (en) 2017-03-27 2021-04-13 Samsung Electronics Co., Ltd. Image processing method and apparatus for object detection
US11908117B2 (en) 2017-03-27 2024-02-20 Samsung Electronics Co., Ltd. Image processing method and apparatus for object detection
US11403878B2 (en) 2018-12-31 2022-08-02 Samsung Electronics Co., Ltd. Apparatus and method with user verification

Also Published As

Publication number Publication date
KR101756919B1 (en) 2017-07-13

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant