CN101908152B

CN101908152B - Customization classifier-based eye state identification method

Info

Publication number: CN101908152B
Application number: CN2010101979800A
Authority: CN
Inventors: 马争; 解梅; 孙睿
Original assignee: University of Electronic Science and Technology of China
Current assignee: Houpu Clean Energy Group Co ltd
Priority date: 2010-06-11
Filing date: 2010-06-11
Publication date: 2012-04-25
Anticipated expiration: 2030-06-11
Also published as: CN101908152A

Abstract

The invention belongs to the technical field of image processing and pattern recognition and is suitable for driver fatigue detection. The method comprises the following steps: establishing a general face image library and a user face image library, extracting the eye images from each face image, and mixing the eye images of the two libraries in different proportions; calculating the Haar-like feature vector of each image in the mixed eye image library and constructing strong classifiers with the AdaBoost method; randomly selecting a number of eye images from the user face image library, evaluating the constructed strong classifiers on them, and selecting the strong classifier with the highest recognition accuracy as the eye state classifier used while the user drives. Following the idea of customization, the method mixes user data with general face library data so that different users get different classifiers, which improves recognition accuracy and reduces the risk of misrecognition. The invention further provides two separate classifiers for users with and without glasses, which makes eye state identification more flexible.

Description

An eye state identification method based on a customized classifier

Technical field

The invention belongs to the field of image processing and pattern recognition and relates to driver fatigue detection technology.

Background technology

At present, traffic accidents cause tens of thousands of vehicle collisions and heavy casualties every year. According to incomplete statistics, road traffic accidents kill more than 600,000 people worldwide; at least 100,000 of these are attributed to accidents caused by driver fatigue, with direct economic losses of 12.5 billion US dollars. Fatigued driving, like drunk driving, has become a major hidden danger for traffic accidents. With the development of computer technology, researchers in many countries have begun to study fatigue-driving detection from various angles. In 1998, tests by the US Federal Highway Administration confirmed that PERCLOS (the percentage of time per unit time during which the eyes are closed) is highly correlated with driver fatigue, which opened up a new approach to fatigue-driving detection. For details see D. F. Dinges and R. Grace, "PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance," US Department of Transportation, Federal Highway Administration, Publication Number FHWA-MCRT-98-006.
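As a rough illustration of the PERCLOS measure, the following sketch computes the fraction of frames in a time window in which the eye is judged closed. The function name `perclos`, the window length, and the per-frame encoding (1 for open, 0 for closed) are assumptions made for this example; the cited report defines the measure in more detail.

```python
def perclos(eye_states):
    """Fraction of frames with closed eyes (assumed encoding: 0 = closed, 1 = open)."""
    if not eye_states:
        raise ValueError("need at least one frame")
    closed = sum(1 for state in eye_states if state == 0)
    return closed / len(eye_states)


# Hypothetical window of 10 frames, 3 of them with closed eyes.
window = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
print(f"PERCLOS = {perclos(window):.0%}")
```

A fatigue detector would compare this value against a threshold; the source does not state which threshold to use.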

Fatigue-driving detection methods based on the PERCLOS feature usually capture video of the driver's face, especially the eye region, and process it. The whole detection method consists of three stages: face localization, eye localization, and eye state recognition. Each stage reduces to a classification problem in pattern recognition: face versus non-face, eye versus non-eye, and open eye versus closed eye. Several classical methods are commonly used for such classification problems: (1) SVM, the support vector machine. SVM is a learning machine founded on statistical learning theory and structural risk minimization, and is widely used in every branch of pattern recognition. It was first proposed by Vapnik et al.; it is particularly suited to high-dimensional, small-sample problems and generalizes well. (2) FLD, Fisher's linear discriminant. FLD seeks a projection direction that best separates the samples of 2 classes. Once the optimal projection direction w^* has been found, every sample X is projected onto it to obtain y = w^{*T}X, and a threshold y₀ is chosen to divide the samples into 2 classes. (3) The AdaBoost algorithm based on Haar-like rectangular features. AdaBoost is a learning algorithm that has been widely used in recent years. It was first proposed by Schapire et al.; its main idea is to select some weak classifiers from a large space of weak classifiers and combine them into a strong classifier.
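The FLD projection and threshold described above can be sketched for two-dimensional samples as follows. This is a minimal illustration, not part of the claimed method: the direction is computed with the textbook formula w = S_w⁻¹(m_a − m_b), the threshold is taken as the midpoint of the projected class means (one common choice; the text only says that a threshold is chosen), and all function names and data are hypothetical.

```python
def mean(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(2)]


def scatter(vectors, m):
    """2x2 within-class scatter matrix of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def fisher_direction(class_a, class_b):
    """Return (w, y0): w = Sw^-1 (m_a - m_b), y0 = midpoint of the projected means."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[r][c] + sb[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    y0 = (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1])) / 2
    return w, y0


def classify(v, w, y0):
    """Project v onto w and compare with the threshold y0."""
    return "a" if w[0] * v[0] + w[1] * v[1] > y0 else "b"


# Hypothetical toy data for the two classes.
class_a = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.5)]
class_b = [(0.0, 0.5), (1.0, 0.0), (0.5, 1.0)]
w, y0 = fisher_direction(class_a, class_b)
print(w, y0, classify((2.2, 2.8), w, y0))
```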

Experiments show that the AdaBoost algorithm based on Haar-like rectangular features is robust, accurate, and fast, and has significant practical value. In practice, Haar-like feature vectors are extracted from positive and negative samples, a classifier model is built with the cascaded AdaBoost method, and the classifier's parameters are trained. For details see Paul Viola and Michael J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," IEEE CVPR, 2001, and R. Lienhart, A. Kuranov, and V. Pisarevsky, "Empirical analysis of detection cascades of boosted classifiers for rapid object detection," in DAGM 25th Pattern Recognition Symposium, 2003.

In practical applications of the AdaBoost algorithm based on Haar-like rectangular features, classifier parameters trained on a general face sample library work for face localization and eye localization. For eye state recognition, however, this approach reaches a reasonable accuracy only for most people; for the remainder the misclassification rate is relatively high, and the result may even be entirely wrong. The reason is that the appearance of open and closed eyes differs greatly between individuals, and habits such as wearing glasses are hard to handle with a single general classifier.

Summary of the invention

The present invention provides an eye state identification method based on a customized classifier. The method can generate different eye state classifiers for different users, improving the accuracy and the range of applicability of eye state identification.

To make the description of the invention easier to follow, some terms are defined first.

Definition 1: Eye state. For fatigue-driving detection, the eye state is divided into two classes: open and closed.

Definition 2: Face sample library. In the present invention, a face sample library is an image library containing frontal faces of different people. Its images should be collected under different lighting conditions, and the library is divided into a with-glasses database and a without-glasses database according to whether the subject wears glasses.

Definition 3: Eye center point. For an open-eye image, the eye center point is defined as the center of the pupil; for a closed-eye image, it is defined as the midpoint of the eye slit.

Definition 4: Three courts and five eyes. "Three courts and five eyes" describes the proportions of face length and face width. In the present invention, the width of the eye region is taken to be three tenths of the face width, and the distance between the two eyes is exactly the width of one eye.

Definition 5: Haar-like feature vector. Haar-like features were first applied to face representation by Papageorgiou et al. In their work on frontal face detection and human body detection, Papageorgiou et al. used Haar wavelet basis functions; they found that the standard orthogonal Haar wavelet basis was limited in application and, to obtain better spatial resolution, used features of 3 forms. Viola et al. extended this work, using 2 types of features in 4 forms in total. Finally, Lienhart added several tilted rectangular features, bringing the feature set to 3 types in 14 forms (as shown in Fig. 2).
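To make the notion of a Haar-like rectangular feature concrete, the following sketch evaluates one upright two-rectangle (edge) feature with an integral image, the technique introduced in the Viola and Jones paper cited in the background. The function names, the sign convention (upper half minus lower half), and the toy image are assumptions for illustration; the 14 forms of Fig. 2, including the tilted ones, are not reproduced here.

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and columns < c (with a zero border)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def edge_feature(ii, top, left, height, width):
    """Two-rectangle edge feature: upper half minus lower half (assumed sign)."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower


# Toy 4x4 grayscale image: dark upper half, bright lower half.
img = [[10] * 4 for _ in range(2)] + [[200] * 4 for _ in range(2)]
ii = integral_image(img)
print(edge_feature(ii, 0, 0, 4, 4))  # 2*4*10 - 2*4*200 = -1520
```

A Haar-like feature vector is then the list of such values over all chosen feature forms, positions, and scales in the 24 × 24 eye image.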

Definition 6: AdaBoost. AdaBoost, short for Adaptive Boosting, is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine them into a strong classifier. The algorithm works by changing the data distribution: it sets the weight of each training sample according to whether that sample was classified correctly in the current round and according to the overall accuracy of the previous round. The reweighted training set is passed to the next classifier for training, and the classifiers obtained in all rounds are finally combined into the decision classifier (strong classifier). An AdaBoost classifier can discard unnecessary training features and base its decision mainly on the key features. Common variants are Discrete AdaBoost, Real AdaBoost, and Gentle AdaBoost. Discrete AdaBoost is an AdaBoost algorithm in which the output of each weak classifier is restricted to {−1, +1} and the strong classifier is produced by weight adjustment. Real AdaBoost is an AdaBoost algorithm in which the weak classifier output ranges over the real numbers R and the strong classifier is produced by weight adjustment. Gentle AdaBoost is a variant created to address a problem of the two preceding algorithms: they assign very high weights to "atypical" positive samples, which lowers classifier efficiency.
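The reweighting loop described in this definition can be sketched as a small Discrete AdaBoost trainer. This is an illustrative sketch only: the weak classifiers are single-feature threshold stumps (an assumption; the source builds them on Haar-like features without giving details), and all names and the toy data are hypothetical.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Weak classifier: outputs +1 or -1 from one thresholded feature."""
    return polarity if x[feature] > threshold else -polarity


def best_stump(samples, labels, weights):
    """Pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({x[feature] for x in samples}):
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(samples, labels, weights)
                          if stump_predict(x, feature, threshold, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost_train(samples, labels, rounds):
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(samples, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)          # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)        # vote of this weak classifier
        ensemble.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified samples, lower the weight of the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, feature, threshold, polarity))
                   for x, y, w in zip(samples, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data that no single stump can separate.
samples = [[1], [2], [3], [4], [5], [6]]
labels = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_train(samples, labels, rounds=3)
print([adaboost_predict(ensemble, x) for x in samples])
```

On this toy set, three rounds are enough for the weighted vote to classify every training sample correctly, although each individual stump misclassifies at least two samples.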

The technical scheme of the present invention is as follows:

An eye state identification method based on a customized classifier, as shown in Fig. 1, comprises the following steps:

Step 1: Build face image database A. Database A comprises two sub-libraries, A1 and A2. Sub-library A1 consists of frontal grayscale face images of different individuals other than the user, not wearing glasses; sub-library A2 consists of frontal grayscale face images of different individuals other than the user, wearing glasses. In every grayscale face image in database A, the distance between the center points of the two eyes is not less than 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal.

Step 2: Build user face image database B. Database B comprises two sub-libraries, B1 and B2. Sub-library B1 consists of frontal grayscale face images of the user not wearing glasses; sub-library B2 consists of frontal grayscale face images of the user wearing glasses. In every grayscale face image in database B, the distance between the center points of the two eyes is not less than 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal.

Step 3: Compute the eye image of every face image in face image database A and user face image database B, obtaining eye image database A' with sub-libraries A1' and A2' corresponding to A1 and A2, and eye image database B' with sub-libraries B1' and B2' corresponding to B1 and B2. The eye image is computed as follows: first compute the pixel distance d between the two eyes in the grayscale face image; then, following the "three courts and five eyes" principle, crop a square region centered on the eye center point whose length and width are both d/2 pixels; scale every cropped region to 24 × 24 pixels and rotate it by a random angle in the range −10° to 10°, measured clockwise, to obtain the final eye image.
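The eye image computation of step 3 can be sketched as follows, under several assumptions that the source does not specify: images are lists of rows of gray values, eye centers are given as (row, column) pairs, resizing and rotation use nearest-neighbor sampling, and pixels that fall outside the image are clamped to the border. All function names are hypothetical.

```python
import math
import random


def crop_square(img, center_row, center_col, side):
    """Crop a side x side region centered on the eye center (clamped at borders)."""
    h, w = len(img), len(img[0])
    top, left = center_row - side // 2, center_col - side // 2
    return [[img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
             for c in range(left, left + side)]
            for r in range(top, top + side)]


def resize_nearest(img, size):
    """Nearest-neighbor resize to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def rotate_nearest(img, angle_deg):
    """Rotate a square image about its center by inverse mapping."""
    n = len(img)
    mid = (n - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = []
    for r in range(n):
        row = []
        for c in range(n):
            dy, dx = r - mid, c - mid
            src_r = min(max(round(mid + cos_a * dy - sin_a * dx), 0), n - 1)
            src_c = min(max(round(mid + sin_a * dy + cos_a * dx), 0), n - 1)
            row.append(img[src_r][src_c])
        out.append(row)
    return out


def eye_image(face, left_eye, right_eye, which="left", rng=random):
    """Crop a d/2 square around one eye, scale to 24 x 24, rotate by a random angle."""
    d = math.dist(left_eye, right_eye)           # pixel distance between the eyes
    side = max(round(d / 2), 1)
    center = left_eye if which == "left" else right_eye
    patch = crop_square(face, center[0], center[1], side)
    patch = resize_nearest(patch, 24)
    return rotate_nearest(patch, rng.uniform(-10.0, 10.0))


# Synthetic 100 x 100 face image with eye centers 48 pixels apart.
face = [[(r * 100 + c) % 256 for c in range(100)] for r in range(100)]
eye = eye_image(face, (40, 30), (40, 78))
print(len(eye), len(eye[0]))
```

Because the angle is drawn symmetrically around zero, the sign convention chosen for "clockwise" does not change the distribution of the generated images.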

Step 4: Build the mixed eye image database C. Database C comprises 2N sub-libraries in two groups of N (1 ≤ i ≤ N, N a natural number). The i-th sub-library of the first group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A1' and sub-library B1' from step 3; the i-th sub-library of the second group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A2' and sub-library B2' from step 3. Each of these sub-libraries contains no fewer than 2000 eye images.
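The mixing of step 4 can be sketched as follows. The source states only that the proportions differ between sub-libraries and that each sub-library holds at least 2000 eye images; the concrete fractions, the function name `mix_libraries`, and the placeholder image records below are assumptions for illustration.

```python
import random


def mix_libraries(generic, user, user_fractions, size=2000, seed=0):
    """Build one randomly mixed sub-library per user fraction."""
    rng = random.Random(seed)
    mixed = []
    for fraction in user_fractions:
        n_user = round(size * fraction)
        n_generic = size - n_user
        if n_user > len(user) or n_generic > len(generic):
            raise ValueError("not enough images for this proportion")
        sub_library = rng.sample(user, n_user) + rng.sample(generic, n_generic)
        rng.shuffle(sub_library)
        mixed.append(sub_library)
    return mixed


# Placeholder records standing in for eye images from A1' (generic) and B1' (user).
generic_eyes = [("generic", k) for k in range(3000)]
user_eyes = [("user", k) for k in range(3000)]
sub_libraries = mix_libraries(generic_eyes, user_eyes, user_fractions=(0.25, 0.5, 0.75))
for lib in sub_libraries:
    print(len(lib), sum(1 for source, _ in lib if source == "user"))
```

The same function would be called a second time with the with-glasses sub-libraries A2' and B2' to obtain the other N mixed sub-libraries.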

Step 5: Compute the Haar-like feature vector x of every eye image in each of the 2N eye image sub-libraries of database C; the Haar-like feature vector x covers 3 types of features in 14 forms. For each eye image sub-library, combine all of its feature vectors x into a training sequence, giving 2N training sequences (N from the sub-libraries mixed from A1' and B1' and N from the sub-libraries mixed from A2' and B2', 1 ≤ i ≤ N). Each training sequence can be written in the form {(x₁, y₁), (x₂, y₂), ..., (x_i, y_i), ..., (x_M, y_M)}, where x_i is the i-th Haar-like feature vector of the corresponding sub-library; y_i ∈ {−1, 1} indicates whether the eye image corresponding to x_i is open or closed; and M is the number of eye images in that sub-library.

Step 6: For each of the 2N training sequences obtained in step 5, use the AdaBoost method to build a strong classifier, giving 2N corresponding strong classifiers.

Step 7: Randomly select more than 1000 eye images from the user eye image sub-library B1' built in step 3, compute their Haar-like feature vectors x, and classify each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A1' and B1', obtaining a judgment for each image: 1 for open eye, 0 for closed eye. Likewise, randomly select more than 1000 eye images from the user eye image sub-library B2' built in step 3, compute their Haar-like feature vectors x, and classify each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A2' and B2', obtaining a judgment for each image: 1 for open eye, 0 for closed eye.

Step 8: Compare the judgments obtained in step 7 with the actual open or closed state of the selected eye images, and compute the recognition accuracy of every strong classifier in each of the two groups. From the group built for the without-glasses sub-libraries, choose the strong classifier with the highest recognition accuracy as the classifier for eye state identification while the user drives without glasses; from the group built for the with-glasses sub-libraries, choose the strong classifier with the highest recognition accuracy as the classifier for eye state identification while the user drives wearing glasses.
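The evaluation and selection of steps 7 and 8 can be sketched as follows for one of the two groups. A classifier is modeled here as any callable that maps a feature vector to 1 (open) or 0 (closed); that interface, the function names, and the toy threshold classifiers are assumptions for illustration.

```python
import random


def accuracy(classifier, samples):
    """samples: list of (feature_vector, true_state) with 1 = open, 0 = closed."""
    correct = sum(1 for x, state in samples if classifier(x) == state)
    return correct / len(samples)


def select_best(classifiers, user_samples, n_eval=1000, seed=0):
    """Return (index, accuracy) of the most accurate classifier on a random user sample."""
    rng = random.Random(seed)
    chosen = rng.sample(user_samples, min(n_eval, len(user_samples)))
    scores = [accuracy(clf, chosen) for clf in classifiers]
    best = max(range(len(classifiers)), key=scores.__getitem__)
    return best, scores[best]


# Toy stand-ins: one-number "feature vectors" whose true state is 1 when x > 0.5.
user_samples = [([k / 2000], 1 if k / 2000 > 0.5 else 0) for k in range(2000)]
candidates = [
    lambda x: 1 if x[0] > 0.2 else 0,
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[0] > 0.9 else 0,
]
best_index, best_accuracy = select_best(candidates, user_samples)
print(best_index, best_accuracy)
```

The function would be run once on samples from B1' with the without-glasses group and once on samples from B2' with the with-glasses group, giving the two classifiers kept for driving.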

Step 9: While the user is driving, capture the user's face image in real time, compute the 24 × 24 pixel eye image and the Haar-like feature vector x of that eye image in real time, and finally, according to whether the user is wearing glasses, use the corresponding strong classifier selected in step 8 to identify the eye state.
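The run-time choice between the two selected classifiers in step 9 can be sketched as below. How the wearing-glasses flag is obtained for each frame is not described in the source, so it is simply passed in here; the function name and the toy classifiers are hypothetical.

```python
def identify_states(frames, clf_without_glasses, clf_with_glasses):
    """frames: iterable of (feature_vector, wearing_glasses) pairs, one per video frame."""
    states = []
    for features, wearing_glasses in frames:
        clf = clf_with_glasses if wearing_glasses else clf_without_glasses
        states.append(clf(features))          # 1 = open, 0 = closed
    return states


# Toy stand-ins for the two classifiers selected in step 8.
without_glasses = lambda x: 1 if x[0] > 0.5 else 0
with_glasses = lambda x: 1 if x[0] > 0.3 else 0
frames = [([0.4], False), ([0.4], True), ([0.9], False), ([0.1], True)]
print(identify_states(frames, without_glasses, with_glasses))
```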

Through the above steps, a customized eye state classifier can be used for each user, which improves the accuracy of individual eye state identification.

It should be noted that:

1. When building face database A and user face database B in steps 1 and 2, the face images are preferably collected under a variety of lighting conditions. A collection environment can be set up first, preferably a darkroom equipped with an adjustable light source so that the illumination can be varied from dark to bright; thousands of face images of one individual can then be collected within a few minutes.

2. Step 6 places no particular restriction on the AdaBoost method used; any AdaBoost variant can be used, and only the final accuracy differs slightly.

Following the idea of customization, the present invention uses a method in which the features are kept unchanged. First, a general face image database and a user face image database are built. Then the eye image of every image in both databases is computed. The eye images of the general face image database are mixed with those of the user face image database in different proportions to obtain the mixed eye image database. The Haar-like feature vector of every image in the mixed eye image database is computed, and strong classifiers are built with the AdaBoost method. A number of eye images are then randomly selected from the user face image database, their Haar-like feature vectors are computed, and the strong classifiers built with the AdaBoost method classify them; the recognition accuracy of each strong classifier is computed, and the one with the highest recognition accuracy is chosen as the eye state classifier used while the user drives. Finally, this classifier is used for eye state identification while the user drives.

The innovations of the present invention are:

1. The idea of customization is applied to eye state identification: different classifiers are used for different users, which improves the accuracy of individual eye state identification.

2. The classifier's training samples mix user data with general face database data, so the classifier becomes more accurate for the individual without losing generality, which reduces the risk of misrecognition.

3. Recognition accuracy is improved for users who wear glasses, and a user can choose between two different classifiers, one for wearing glasses and one for not wearing glasses, which provides flexibility.

Description of drawings

Fig. 1 is a flow chart of the present invention.

Fig. 2 is a schematic diagram of the Haar-like features, comprising 3 types in 14 forms.

Fig. 3 shows the number of Haar-like features of each kind, taking a 24 × 24 image as an example.
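The kind of count that Fig. 3 tabulates can be reproduced for upright features by enumerating every scale and position inside the window. The sketch below assumes base shapes of 2 × 1 pixels for a two-rectangle edge feature and 3 × 1 pixels for a three-rectangle line feature, following the convention of the Lienhart paper cited in the background; Fig. 3 itself is not reproduced here, so the printed numbers are not checked against it, and tilted features are not covered.

```python
def count_upright_features(window, base_w, base_h):
    """Count all positions and scales of an upright base_w x base_h feature."""
    total = 0
    for w in range(base_w, window + 1, base_w):        # scaled widths
        for h in range(base_h, window + 1, base_h):    # scaled heights
            total += (window - w + 1) * (window - h + 1)
    return total


print(count_upright_features(24, 2, 1))  # two-rectangle edge feature: 43200
print(count_upright_features(24, 3, 1))  # three-rectangle line feature: 27600
```

These counts show why a selection procedure such as AdaBoost is needed: even one feature form yields tens of thousands of candidate features in a 24 × 24 image.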

Embodiment

Step 4: Build the mixed eye image database C. Database C comprises 2N sub-libraries in two groups of N (1 ≤ i ≤ N, N a natural number). The i-th sub-library of the first group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A1' and sub-library B1' from step 3; the i-th sub-library of the second group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A2' and sub-library B2' from step 3. Each of these sub-libraries contains no fewer than 2000 eye images.

Step 5: Compute the Haar-like feature vector x of every eye image in each of the 2N eye image sub-libraries; the Haar-like feature vector x covers 3 types of features in 14 forms. For each eye image sub-library, combine all of its feature vectors x into a training sequence, giving 2N training sequences (1 ≤ i ≤ N), where M denotes the number of eye images in the corresponding sub-library.

Step 6: For each of the 2N training sequences obtained in step 5, use the AdaBoost method to build a strong classifier, giving 2N corresponding strong classifiers.

Step 7: Randomly select more than 1000 eye images from the user eye image sub-library B1' built in step 3, compute their Haar-like feature vectors x, and classify each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A1' and B1', obtaining a judgment for each image: 1 for open eye, 0 for closed eye. Likewise, randomly select more than 1000 eye images from the user eye image sub-library B2' built in step 3, compute their Haar-like feature vectors x, and classify each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A2' and B2', obtaining a judgment for each image: 1 for open eye, 0 for closed eye.

Compared with a method trained only on general face database images, the method of the invention improves accuracy by about 2% for individuals in general and by 3% to 5% for individuals wearing glasses, with a computation time of less than 0.1 s.

In summary, the method of the invention uses the idea of customization, combines user data with general face database data, and trains the eye state classifier with the features kept unchanged, thereby achieving fast and accurate eye state identification.

Claims

1. An eye state identification method based on a customized classifier, comprising the following steps:

Step 1: building face image database A;

Database A comprises two sub-libraries, A1 and A2; sub-library A1 consists of frontal grayscale face images of different individuals other than the user, not wearing glasses, and sub-library A2 consists of frontal grayscale face images of different individuals other than the user, wearing glasses; in every grayscale face image in database A, the distance between the center points of the two eyes is not less than 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal;

Step 2: building user face image database B;

Database B comprises two sub-libraries, B1 and B2; sub-library B1 consists of frontal grayscale face images of the user not wearing glasses, and sub-library B2 consists of frontal grayscale face images of the user wearing glasses; in every grayscale face image in database B, the distance between the center points of the two eyes is not less than 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal;

Step 3: computing the eye image of every face image in face image database A and user face image database B, obtaining eye image database A' with sub-libraries A1' and A2' corresponding to A1 and A2, and eye image database B' with sub-libraries B1' and B2' corresponding to B1 and B2; the eye image is computed as follows: first computing the pixel distance d between the two eyes in the grayscale face image; then, following the "three courts and five eyes" principle, cropping a square region centered on the eye center point whose length and width are both d/2 pixels; scaling every cropped region to 24 × 24 pixels and rotating it by a random angle in the range −10° to 10°, measured clockwise, to obtain the final eye image;

Step 4: building the mixed eye image database C;

Database C comprises 2N sub-libraries in two groups of N; the i-th sub-library of the first group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A1' and sub-library B1' from step 3; the i-th sub-library of the second group is a random mixture, in a proportion that differs with i, of the eye images of sub-library A2' and sub-library B2' from step 3; each of these sub-libraries contains no fewer than 2000 eye images; wherein 1 ≤ i ≤ N, and N is a natural number;

Step 5: computing the Haar-like feature vector x of every eye image in each of the 2N eye image sub-libraries of database C, the Haar-like feature vector x covering 3 types of features in 14 forms, and, for each eye image sub-library, combining all of its feature vectors x into a training sequence, giving 2N training sequences; each training sequence can be written in the form {(x₁, y₁), (x₂, y₂), ..., (x_i, y_i), ..., (x_M, y_M)}, wherein 1 ≤ i ≤ M, x_i is the i-th Haar-like feature vector of the corresponding sub-library; y_i ∈ {−1, 1} indicates whether the eye image corresponding to x_i is open or closed; and M is the number of eye images in that sub-library;

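Step 6: for each of the 2N training sequences obtained in step 5, using the AdaBoost method to build a strong classifier, giving 2N corresponding strong classifiers;

Step 7: randomly selecting more than 1000 eye images from the user eye image sub-library B1' built in step 3, computing their Haar-like feature vectors x, and classifying each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A1' and B1', obtaining a judgment for each image: 1 for open eye, 0 for closed eye; likewise, randomly selecting more than 1000 eye images from the user eye image sub-library B2' built in step 3, computing their Haar-like feature vectors x, and classifying each of them with the strong classifiers built in step 6 for the sub-libraries mixed from A2' and B2', obtaining a judgment for each image: 1 for open eye, 0 for closed eye;

Step 8: comparing the judgments obtained in step 7 with the actual open or closed state of the selected eye images, and computing the recognition accuracy of every strong classifier in each of the two groups; choosing, from the group built for the without-glasses sub-libraries, the strong classifier with the highest recognition accuracy as the classifier for eye state identification while the user drives without glasses, and choosing, from the group built for the with-glasses sub-libraries, the strong classifier with the highest recognition accuracy as the classifier for eye state identification while the user drives wearing glasses;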