CN105069396A - Dynamic percentage characteristic cutting AdaBoost face detection algorithm - Google Patents

Dynamic percentage characteristic cutting AdaBoost face detection algorithm

Info

Publication number
CN105069396A
CN105069396A
Authority
CN
China
Prior art keywords
training
sample
feature
weak classifier
error rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510391351.4A
Other languages
Chinese (zh)
Other versions
CN105069396B (en)
Inventor
李东新
左卜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN201510391351.4A
Publication of CN105069396A
Application granted
Publication of CN105069396B
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dynamic-percentage feature-cutting AdaBoost face detection algorithm. Concretely, the algorithm comprises the steps of: at the start of each iteration round, determining the percentage f of features to be cut, then selecting the features with good classification performance to take part in the next round of training; if the best weak-classifier error rate obtained in an iteration exceeds a randomly selected value, reducing that iteration's cutting coefficient so that more features take part in training; and stopping the iteration if the error rate still exceeds 0.5 when all features are used for training. When the number of features taking part in training is large, the algorithm saves training time by selecting the features with low error rates in the previous round to take part in the next round of training.

Description

Dynamic percentage feature cutting AdaBoost face detection algorithm
Technical field
The present invention relates to a dynamic-percentage feature-cutting AdaBoost face detection algorithm, and belongs to the technical field of pattern recognition.
Background technology
Biometric identification technology confirms or distinguishes an individual's identity through the physiological and behavioural characteristics unique to each person. The face, as one such biometric feature, is easy to acquire and offers a friendly interface; compared with conventional means such as passwords, credit cards and identity cards, it is hard to forge, always carried and highly distinctive. It therefore has broad prospects in fields such as video surveillance, smart homes and criminal investigation. As the computing power of embedded devices grows stronger, intelligent algorithms are increasingly applied in embedded development to realise different functions. Among them, face detection, as the foundation of face recognition, has become a research hotspot in the field of artificial intelligence.
The core of the AdaBoost algorithm is to extract, by iteration, the features with the best classification performance from a large number of Haar features to serve as weak classifiers; the final strong classifier is composed of many such weak classifiers. AdaBoost is practical and simple. For the detection of a single face image, face detection algorithms based on AdaBoost achieve both high detection accuracy and fast detection speed, so face recognition technology based on this algorithm is widely used.
When the numbers of training samples, sample features and weak classifiers are large, training a classifier with the AdaBoost algorithm consumes a great deal of time. The number of features determines the number of iterations: each iteration computes the error rate of every feature on the training sample set, and the best weak classifier is obtained by comparing the error rates. Each time a best weak classifier is trained, the weights of the training samples change accordingly, so obtaining more weak classifiers requires repeating the above steps a corresponding number of times. Hence, as the numbers of training samples, sample features and weak classifiers all increase, the training time grows with the product of the three, i.e. cubically.
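The cubic growth can be made concrete with a toy cost model (an illustrative sketch, not part of the patent; the function name and the example counts are invented for illustration):

```python
def training_cost(num_samples, num_features, num_weak_classifiers):
    """Rough count of weak-classifier evaluations in naive AdaBoost
    training: each round evaluates every feature on every sample."""
    return num_weak_classifiers * num_features * num_samples

# Doubling all three quantities multiplies the work by 2^3 = 8,
# which is the cubic growth described above.
base = training_cost(1000, 10000, 100)
doubled = training_cost(2000, 20000, 200)
```

This is why cutting a percentage of the features each round pays off: the feature count enters the product once per round, every round.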
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a dynamic-percentage feature-cutting AdaBoost face detection algorithm, thereby solving the technical problem that training a classifier with the AdaBoost algorithm consumes a large amount of training time.
To solve the above technical problem, the invention adopts the following technical solution, a dynamic-percentage feature-cutting AdaBoost face detection algorithm: at the start of each iteration, the percentage f of features to be cut is first determined, and the features with good classification performance are then selected to take part in the next round of training; when the best weak-classifier error rate of the current iteration exceeds a randomly drawn value, the cutting coefficient of the current iteration is reduced to expand the number of features taking part in training; if the error rate still exceeds 0.5 when all features are used for training, the iteration stops.
The algorithm specifically comprises the following steps:
Step 1: let the total number of input training samples be N, of which m are negative samples and n are positive samples; the training sample set is S = {(x_1, y_1), …, (x_N, y_N)}, where x_i denotes the i-th sample and y_i ∈ {1, 0} identifies positive and negative samples respectively;
Step 2: initialise the sample weights:
w_{1,i} = 1/(2m) if y_i = 0, and w_{1,i} = 1/(2n) if y_i = 1;
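A minimal sketch of this initialisation in Python (the function name `init_weights` is an assumption, not from the patent):

```python
def init_weights(labels):
    """Step 2: negative samples (y_i = 0) each get weight 1/(2m),
    positive samples (y_i = 1) each get 1/(2n), so the two classes
    start with equal total weight of 1/2 each."""
    m = sum(1 for y in labels if y == 0)  # number of negatives
    n = sum(1 for y in labels if y == 1)  # number of positives
    return [1.0 / (2 * n) if y == 1 else 1.0 / (2 * m) for y in labels]

w = init_weights([1, 1, 0, 0, 0])  # 2 positives, 3 negatives
# positives weigh 1/4 each, negatives 1/6 each; the total weight is 1
```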
Step 3: suppose the percentage of samples discarded in each round is f, so the number of samples taking part in training in each round is N × (1 − f); the iteration index is t = 1, 2, …, T;
Step 4: obtain the optimal weak classifier and find the weighting coefficient α_t of weak classifier h_t in the strong classifier, as follows:
Step 401: normalise the sample weights: w_{t,i} ← w_{t,i} / Σ_{j=1}^{N} w_{t,j};
Step 402: for each feature j, train a simple weak classifier h_j(x, f_j, p_j, θ_j), which outputs 1 when p_j·f_j(x) < p_j·θ_j and 0 otherwise;
where f_j(x) is the feature value, p_j indicates the direction of the inequality, and θ_j is the weak-classifier threshold;
Step 403: select the weak classifier h_t(x) corresponding to the minimal error rate, where the minimal error rate is defined as:
ε_t = min_{f,p,θ} Σ_i w_i·|h_j(x_i, f, p, θ) − y_i|;
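Steps 402 and 403 can be sketched as an exhaustive decision-stump search, under the usual Viola-Jones conventions (the function name, and the restriction of threshold candidates to observed feature values, are assumptions for illustration):

```python
def train_stump(feature_values, labels, weights):
    """Steps 402-403 for a single feature: exhaustively search the
    threshold theta (over observed values) and polarity p, returning
    (error, theta, p) minimising sum_i w_i * |h(x_i) - y_i|,
    where h(x) = 1 if p * f(x) < p * theta else 0."""
    best = (float("inf"), 0.0, 1)
    for theta in sorted(set(feature_values)):
        for p in (1, -1):
            err = sum(
                w for f, y, w in zip(feature_values, labels, weights)
                if (1 if p * f < p * theta else 0) != y
            )
            if err < best[0]:
                best = (err, theta, p)
    return best

# Separable toy data: positives have small feature values.
err, theta, p = train_stump([1, 2, 8, 9], [1, 1, 0, 0], [0.25] * 4)
# err == 0 with p == 1 and theta == 8 (the stump fires when f < 8)
```

Running this search for every feature and keeping the stump with the smallest weighted error yields the round's ε_t and h_t.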
Step 404: if ε_t = 0, or ε_t ≥ 0.5 occurs in the very first round of training, set T = t − 1 and jump to step 6; if ε_t ≥ 0.5 and it is not the first round, set T = t − 1, then test whether f is greater than 2/3: if so, set f = 2f − 1, otherwise set f = f/2, and jump to step 5;
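The cutting-coefficient adjustment in step 404 can be sketched as follows (the function name is an assumption; note that both branches strictly reduce f for 0 < f < 1, so more features survive the cut on the retry):

```python
def adjust_cut_fraction(f):
    """Step 404 retry path: when the round's best error reaches 0.5,
    shrink the cutting percentage so more features rejoin training:
    f -> 2f - 1 when f > 2/3, otherwise f -> f / 2."""
    return 2 * f - 1 if f > 2 / 3 else f / 2

# e.g. 0.8 -> 0.6 and 0.5 -> 0.25 (up to float rounding)
```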
Step 405: update the sample weights: w_{t+1,i} = w_{t,i}·β_t^{e_i}, where β_t = ε_t/(1 − ε_t), e_i = 0 when sample x_i is misclassified, and e_i = 1 otherwise;
Step 406: find the weighting coefficient of weak classifier h_t in the strong classifier:
α_t = (1/2)·ln((1 − ε_t)/ε_t);
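Steps 405 and 406 can be sketched as follows, using the text's convention that e_i = 0 marks a misclassified sample (the function names are assumptions):

```python
import math

def update_weights(weights, misclassified, eps):
    """Step 405 with the text's convention (e_i = 0 for a misclassified
    sample, e_i = 1 otherwise): beta_t = eps/(1 - eps) multiplies only
    the correctly classified samples, so hard samples gain relative weight."""
    beta = eps / (1.0 - eps)
    return [w * (1.0 if bad else beta) for w, bad in zip(weights, misclassified)]

def alpha(eps):
    """Step 406: alpha_t = (1/2) * ln((1 - eps_t) / eps_t); a lower
    error rate earns the weak classifier a larger vote."""
    return 0.5 * math.log((1.0 - eps) / eps)

new_w = update_weights([0.5, 0.5], [True, False], 0.2)
# beta = 0.25: the misclassified sample keeps 0.5, the correct one drops to 0.125
```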
Step 5: sort the features by classification error from largest to smallest; if t = 1, then according to the cutting percentage f, crop the first n × f features with the largest classification error from the features taking part in training; if t > 1, in addition to cropping those n × f worst features, the features that did not take part in training in the previous round are also added back into the next round of training;
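A minimal sketch of the step-5 cut for the first round (it ignores the re-admission of previously unused features when t > 1; the names are assumptions):

```python
def cut_features(feature_ids, errors, f):
    """Step 5, first round: rank features by classification error
    (largest first) and drop the worst fraction f; the survivors
    take part in the next round of training."""
    ranked = sorted(zip(feature_ids, errors), key=lambda fe: fe[1], reverse=True)
    n_cut = int(len(ranked) * f)
    return [fid for fid, _ in ranked[n_cut:]]

kept = cut_features([0, 1, 2, 3], [0.45, 0.10, 0.30, 0.49], 0.5)
# drops the two worst features (ids 3 and 0); keeps [2, 1]
```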
Step 6: output the strong classifier: H(x) = 1 if Σ_{t=1}^{T} α_t·h_t(x) ≥ (1/2)·Σ_{t=1}^{T} α_t, and 0 otherwise.
Compared with the prior art, the beneficial effect achieved by the invention is: when the number of features taking part in training is too large, training time is saved by choosing the features with lower error rates in the previous round to take part in the next round of training.
Brief description of the drawings
Fig. 1 is the flow chart of the method of the invention.
Fig. 2 is the flow chart for obtaining the optimal weak classifier.
Embodiment
The invention is further described below in conjunction with the accompanying drawings.
The functions represented in the drawings have the following meanings:
Function cvGetTickCount(): returns the number of milliseconds from OS start-up to the current moment; the time spent in training can be obtained by computing the difference of two readings.
Function Single_Classifier(int i): generates a strong classifier; the parameter passed in represents the number of weak classifiers making up the strong classifier.
Function Generate_AllFeatures(int count): generates all Haar-like features; count represents the number of feature types used. The invention selects 5 common feature templates, so count is 5.
Function Input_Samples(): reads positive and negative samples from the designated directory.
Function Select_WeakClassifier(): obtains the optimal weak classifier.
Function Output_WeakClassifier(): outputs the generated classifier.
Function Cal_HaarValue(j, k): calculates the j-th feature of the k-th sample.
Function qsort(): sorts the samples according to feature value.
As shown in Fig. 1, in the dynamic-percentage sample-cutting AdaBoost face detection algorithm, at the start of each iteration the percentage f of samples to be cut is first determined; in each round the samples with smaller weights are cropped according to f, and training proceeds with the remaining samples.
When the best weak-classifier error rate of the current iteration exceeds a randomly generated value, the cutting constant f is reduced to enlarge the sample set, and training is redone for the current iteration.
If the error rate still exceeds 0.5 when all samples are used in training, the iteration stops.
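The embodiment's sample-cutting variant can be sketched as follows (the function name is an assumption; ties in weight are broken arbitrarily):

```python
def cut_samples(samples, weights, f):
    """Embodiment variant (Fig. 1): each round drops the fraction f of
    samples with the smallest weights and trains on the remaining
    N * (1 - f) samples, preserving the original sample order."""
    order = sorted(range(len(samples)), key=lambda i: weights[i])
    keep = sorted(order[int(len(samples) * f):])
    return [samples[i] for i in keep], [weights[i] for i in keep]

s, w = cut_samples(["a", "b", "c", "d"], [0.4, 0.1, 0.3, 0.2], 0.5)
# the two lightest samples ("b" and "d") are dropped
```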
The algorithm specifically comprises the following steps:
Step 1: let the total number of input training samples be N, of which m are negative samples and n are positive samples; the training sample set is S = {(x_1, y_1), …, (x_N, y_N)}, where x_i denotes the i-th sample and y_i ∈ {1, 0} identifies positive and negative samples respectively;
Step 2: initialise the sample weights:
w_{1,i} = 1/(2m) if y_i = 0, and w_{1,i} = 1/(2n) if y_i = 1;
Step 3: suppose the percentage of samples discarded in each round is f, so the number of samples taking part in training in each round is N × (1 − f); the iteration index is t = 1, 2, …, T;
Step 4: obtain the optimal weak classifier and find the weighting coefficient α_t of weak classifier h_t in the strong classifier, as shown in Fig. 2; the method is as follows:
Step 401: normalise the sample weights: w_{t,i} ← w_{t,i} / Σ_{j=1}^{N} w_{t,j};
Step 402: for each feature j, train a simple weak classifier h_j(x, f_j, p_j, θ_j), which outputs 1 when p_j·f_j(x) < p_j·θ_j and 0 otherwise;
where f_j(x) is the feature value, p_j indicates the direction of the inequality, and θ_j is the weak-classifier threshold;
Step 403: select the weak classifier h_t(x) corresponding to the minimal error rate, where the minimal error rate is defined as:
ε_t = min_{f,p,θ} Σ_i w_i·|h_j(x_i, f, p, θ) − y_i|;
Step 404: if ε_t = 0, or ε_t ≥ 0.5 occurs in the very first round of training, set T = t − 1 and jump to step 6; if ε_t ≥ 0.5 and it is not the first round, set T = t − 1, then test whether f is greater than 2/3: if so, set f = 2f − 1, otherwise set f = f/2, and jump to step 5;
Step 405: update the sample weights: w_{t+1,i} = w_{t,i}·β_t^{e_i}, where β_t = ε_t/(1 − ε_t), e_i = 0 when sample x_i is misclassified, and e_i = 1 otherwise;
Step 406: find the weighting coefficient of weak classifier h_t in the strong classifier:
α_t = (1/2)·ln((1 − ε_t)/ε_t);
Step 5: sort the features by classification error from largest to smallest; if t = 1, then according to the cutting percentage f, crop the first n × f features with the largest classification error from the features taking part in training; if t > 1, in addition to cropping those n × f worst features, the features that did not take part in training in the previous round are also added back into the next round of training;
Step 6: output the strong classifier: H(x) = 1 if Σ_{t=1}^{T} α_t·h_t(x) ≥ (1/2)·Σ_{t=1}^{T} α_t, and 0 otherwise.
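The steps above can be tied together in a hypothetical end-to-end sketch (not the patent's reference implementation: it simplifies step 404 to a plain stop instead of retrying with a smaller f, never re-admits cut features, and searches thresholds only over observed feature values):

```python
import math

def adaboost_feature_cut(X, y, f=0.5, rounds=10):
    """End-to-end sketch of the training loop of Fig. 1 / Fig. 2.
    X[i][j] is feature j of sample i; y[i] is in {0, 1}."""
    m = sum(1 for v in y if v == 0)
    n = len(y) - m
    w = [1 / (2 * n) if v == 1 else 1 / (2 * m) for v in y]  # step 2
    active = list(range(len(X[0])))   # feature indices still in play
    strong = []                       # list of (alpha, j, theta, p)
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]  # step 401: normalise
        per_feat = []
        for j in active:              # steps 402-403: best stump per feature
            best = (1.0, 0.0, 1)
            for theta in {row[j] for row in X}:
                for p in (1, -1):
                    err = sum(wi for row, yi, wi in zip(X, y, w)
                              if (1 if p * row[j] < p * theta else 0) != yi)
                    if err < best[0]:
                        best = (err, theta, p)
            per_feat.append((best[0], j, best[1], best[2]))
        per_feat.sort()               # ascending error
        eps, j, theta, p = per_feat[0]
        if eps >= 0.5:                # simplified step 404: just stop
            break
        beta = eps / (1 - eps) if eps > 0 else 1e-10
        strong.append((0.5 * math.log(1 / beta), j, theta, p))  # step 406
        if eps == 0:
            break
        w = [wi * (1.0 if (1 if p * row[j] < p * theta else 0) != yi else beta)
             for row, yi, wi in zip(X, y, w)]                   # step 405
        n_cut = int(len(per_feat) * f)                          # step 5
        active = [jj for _, jj, _, _ in per_feat[:len(per_feat) - n_cut]]
    return strong

def predict(strong, x):
    """Step 6: H(x) = 1 iff sum_t alpha_t * h_t(x) >= (1/2) * sum_t alpha_t."""
    score = sum(a for a, j, theta, p in strong if p * x[j] < p * theta)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0
```

On a trivially separable one-feature problem the loop terminates after a single round with a single stump, which is the ε_t = 0 exit of step 404.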
The above is only a preferred embodiment of the invention. It should be pointed out that those skilled in the art can make further improvements and variations without departing from the technical principle of the invention, and such improvements and variations should also be regarded as falling within the protection scope of the invention.

Claims (1)

1. A dynamic-percentage feature-cutting AdaBoost face detection algorithm, characterised in that: at the start of each iteration, the percentage f of features to be cut is first determined; then, according to the weak-classifier error rates, the features with good classification performance are selected to take part in the next round of training; when the best weak-classifier error rate of the current iteration exceeds a randomly drawn value, the cutting coefficient of the current iteration is reduced to expand the number of features taking part in training; and if the error rate still exceeds 0.5 when all features are used for training, the iteration stops;
The algorithm specifically comprises the following steps:
Step 1: let the total number of input training samples be N, of which m are negative samples and n are positive samples; the training sample set is S = {(x_1, y_1), …, (x_N, y_N)}, where x_i denotes the i-th sample and y_i ∈ {1, 0} identifies positive and negative samples respectively;
Step 2: initialise the sample weights: w_{1,i} = 1/(2m) if y_i = 0, and w_{1,i} = 1/(2n) if y_i = 1;
Step 3: suppose the percentage of samples discarded in each round is f, so the number of samples taking part in training in each round is N × (1 − f); the iteration index is t = 1, 2, …, T;
Step 4: obtain the optimal weak classifier and find the weighting coefficient α_t of weak classifier h_t in the strong classifier, as follows:
Step 401: normalise the sample weights: w_{t,i} ← w_{t,i} / Σ_{j=1}^{N} w_{t,j};
Step 402: for each feature j, train a simple weak classifier h_j(x, f_j, p_j, θ_j), which outputs 1 when p_j·f_j(x) < p_j·θ_j and 0 otherwise; where f_j(x) is the feature value, p_j indicates the direction of the inequality, and θ_j is the weak-classifier threshold;
Step 403: select the weak classifier h_t(x) corresponding to the minimal error rate, where the minimal error rate is defined as ε_t = min_{f,p,θ} Σ_i w_i·|h_j(x_i, f, p, θ) − y_i|;
Step 404: if ε_t = 0, or ε_t ≥ 0.5 occurs in the very first round of training, set T = t − 1 and jump to step 6; if ε_t ≥ 0.5 and it is not the first round, set T = t − 1, then test whether f is greater than 2/3: if so, set f = 2f − 1, otherwise set f = f/2, and jump to step 5;
Step 405: update the sample weights: w_{t+1,i} = w_{t,i}·β_t^{e_i}, where β_t = ε_t/(1 − ε_t), e_i = 0 when sample x_i is misclassified, and e_i = 1 otherwise;
Step 406: find the weighting coefficient of weak classifier h_t in the strong classifier: α_t = (1/2)·ln((1 − ε_t)/ε_t);
Step 5: sort the features by classification error from largest to smallest; if t = 1, then according to the cutting percentage f, crop the n × f features with the largest classification error from the features taking part in training; if t > 1, in addition to cropping those n × f worst features, the features that did not take part in training in the previous round are also added back into the next round of training;
Step 6: output the strong classifier: H(x) = 1 if Σ_{t=1}^{T} α_t·h_t(x) ≥ (1/2)·Σ_{t=1}^{T} α_t, and 0 otherwise.
CN201510391351.4A 2015-07-06 2015-07-06 Dynamic percentage feature cutting AdaBoost face detection algorithm Expired - Fee Related CN105069396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510391351.4A CN105069396B (en) 2015-07-06 2015-07-06 Dynamic percentage feature cutting AdaBoost face detection algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510391351.4A CN105069396B (en) 2015-07-06 2015-07-06 Dynamic percentage feature cutting AdaBoost face detection algorithm

Publications (2)

Publication Number Publication Date
CN105069396A true CN105069396A (en) 2015-11-18
CN105069396B CN105069396B (en) 2018-10-30

Family

ID=54498758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510391351.4A Expired - Fee Related CN105069396B (en) 2015-07-06 2015-07-06 Dynamic percentage feature cutting AdaBoost face detection algorithm

Country Status (1)

Country Link
CN (1) CN105069396B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107358207A (en) * 2017-07-14 2017-11-17 重庆大学 A kind of method for correcting facial image
CN107477809A (en) * 2017-09-20 2017-12-15 四川长虹电器股份有限公司 Air conditioner energy source management system based on Adaboost
CN107832722A (en) * 2017-11-17 2018-03-23 江南大学 A kind of Face datection grader building method based on AdaBoost
CN107871103B (en) * 2016-09-23 2021-10-19 北京眼神科技有限公司 Face authentication method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101964063B (en) * 2010-09-14 2012-06-27 南京信息工程大学 Method for constructing improved AdaBoost classifier
CN103902968B (en) * 2014-02-26 2015-03-25 中国人民解放军国防科学技术大学 Pedestrian detection model training method based on AdaBoost classifier

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871103B (en) * 2016-09-23 2021-10-19 北京眼神科技有限公司 Face authentication method and device
CN107358207A (en) * 2017-07-14 2017-11-17 重庆大学 A kind of method for correcting facial image
CN107477809A (en) * 2017-09-20 2017-12-15 四川长虹电器股份有限公司 Air conditioner energy source management system based on Adaboost
CN107832722A (en) * 2017-11-17 2018-03-23 江南大学 A kind of Face datection grader building method based on AdaBoost
CN107832722B (en) * 2017-11-17 2021-05-28 江南大学 Face detection classifier construction method based on AdaBoost

Also Published As

Publication number Publication date
CN105069396B (en) 2018-10-30

Similar Documents

Publication Publication Date Title
CN103870811B (en) A kind of front face Quick method for video monitoring
CN101739555B (en) Method and system for detecting false face, and method and system for training false face model
CN107103281A (en) Face identification method based on aggregation Damage degree metric learning
CN104463128A (en) Glass detection method and system for face recognition
CN110348416A (en) Multi-task face recognition method based on multi-scale feature fusion convolutional neural network
CN106096538A (en) Face identification method based on sequencing neural network model and device
CN104408440A (en) Identification method for human facial expression based on two-step dimensionality reduction and parallel feature fusion
CN102156885B (en) Image classification method based on cascaded codebook generation
CN105069396A (en) Dynamic percentage characteristic cutting AdaBoost face detection algorithm
CN100561502C (en) A kind of method and apparatus of face authentication
CN103400105A (en) Method identifying non-front-side facial expression based on attitude normalization
CN100580703C (en) Supervision-free Markov random field image segmentation method
CN104091169A (en) Behavior identification method based on multi feature fusion
CN106504255A (en) A kind of multi-Target Image joint dividing method based on multi-tag multi-instance learning
CN101226590A (en) Method for recognizing human face
CN101099675A (en) Method for detecting human face with weak sorter composite coefficient
Wu et al. Some analysis and research of the AdaBoost algorithm
CN104463194A (en) Driver-vehicle classification method and device
CN105760858A (en) Pedestrian detection method and apparatus based on Haar-like intermediate layer filtering features
Zhou et al. Convolutional neural networks based pornographic image classification
CN104156945A (en) Method for segmenting gray scale image based on multi-objective particle swarm optimization algorithm
CN109961093A (en) A kind of image classification method based on many intelligence integrated studies
Tavallali et al. An efficient training procedure for viola-jones face detector
CN103208007A (en) Face recognition method based on support vector machine and genetic algorithm
CN105469080A (en) Facial expression recognition method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181030

Termination date: 20210706