CN100336070C - Method of robust human face detection in complicated background image - Google Patents
- Publication number
- CN100336070C (application CNB2005100862485A / CN200510086248A)
- Authority
- CN
- China
- Prior art keywords
- face
- sample
- people
- image
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Analysis (AREA)
Abstract
The present invention relates to a technology for detecting human faces against complex backgrounds, and belongs to the field of face recognition. It provides a face detection method whose performance is robust in images with complex backgrounds. The method uses efficient, highly redundant microstructure features to describe the gray-level distribution of regions such as the eyes and mouth in the face pattern, and uses a risk-sensitive AdaBoost algorithm to select, from these candidates, the microstructure features that best separate faces from non-faces and combine them into a strong classifier. Each classifier obtained by training reduces the false acceptance rate on non-face samples as far as possible while guaranteeing a low rejection rate on faces. High-performance face detection in complex background images is thereby achieved with a simple structure, and a post-processing algorithm further reduces the false detection rate. Results on several public databases and in competitive evaluations demonstrate the excellent performance of the invention.
Description
Technical field
The invention, a method for detecting human faces in images with complex backgrounds, belongs to the field of face recognition technology.
Background art
Face detection means determining the position, size, and other information of human faces in an image or image sequence. It is now widely used in systems such as face recognition, video surveillance, and intelligent human-machine interfaces. Face detection, especially detection of faces against complex backgrounds, remains a difficult problem. External factors such as illumination, together with properties of the face itself such as pose, skin color, expression, motion in three-dimensional space, beards, hair, and glasses, cause huge within-class variation of the face pattern; moreover, background objects can be very complex and hard to distinguish from faces.
The mainstream face detection methods today are based on statistical learning from samples. These methods generally introduce a "non-face" class and obtain the features that distinguish the "face" class from the "non-face" class, together with the model parameters, by statistical learning on collected samples rather than by deriving surface rules from visual intuition. This is more reliable in the statistical sense: it avoids the errors caused by incomplete or imprecise observation, the detection scope can be extended by adding training samples, and the robustness of the detection system improves. In addition, most of these methods adopt a multi-layer classifier structure from simple to complex: classifiers with simple structure first exclude most background windows, and the remaining windows are then judged by more complex classifiers, which yields faster detection. However, these methods do not account for the extremely unbalanced misclassification risks of the face and non-face classes in real images. The prior probability of a face appearing in an image is far lower than that of a non-face, and since the basic purpose of face detection is to find the positions of faces, the risk of misclassifying a face as non-face is much larger than that of the opposite error. Training each layer of classifiers only by the minimum-classification-error criterion, and then lowering the false rejection rate (FRR) on faces by adjusting the classifier threshold, cannot simultaneously achieve a low false acceptance rate (FAR) on non-face patterns; the number of classifier layers grows too large, the structure becomes too complex, detection slows down, and the overall performance of the algorithm degrades. Addressing these defects, the present invention proposes a face detection method based on the Cost-Sensitive AdaBoost algorithm (CS-AdaBoost). By adopting the principle of minimizing the classification risk, each layer of classifiers obtained by training reduces the false acceptance rate on the non-face class as much as possible while guaranteeing an extremely low rejection rate on the face pattern, so that higher-performance face detection in complex background images is achieved with fewer classifier layers and a simpler classifier structure; no other existing document uses this method.
Summary of the invention
The objective of the invention is a face detector that can robustly locate faces against complex backgrounds. Realizing this detector comprises two stages: training and detection.
In the training stage, samples are first collected, including face and non-face samples, and then normalized in size and illumination. Microstructure features are extracted from the training samples to build a feature library. Using the feature library together with the CS-AdaBoost algorithm, one layer of face/non-face strong classifier is trained. This training process is repeated to obtain multiple layers of classifiers, from simple to complex in structure. Finally these classifiers are cascaded into a complete face detector.
In the detection stage, the input image is first scaled repeatedly by a fixed ratio, and in the resulting image series every sub-window of a certain size is examined (a sub-window is defined as a rectangular sub-image of the input image). Each sub-window is first gray-level normalized, its microstructure features are then extracted, and the trained face detector judges it: if the output of any layer of classifiers falls below its threshold, the sub-window is regarded as non-face and no further judgment is made; only sub-windows that pass all layers of classifiers are regarded as faces. High face detection accuracy is obtained in this way. The method has been applied in systems such as face-based attendance registration.
The invention consists of the following parts: sample collection and normalization; integral image computation and microstructure feature extraction; feature selection and classifier design; and cascading of the multi-layer classifiers.
1. sample collection and normalization
1.1 Sample collection
Face images are cut manually from pictures containing faces, and non-face images are cut at random from scenery pictures containing no faces. The face images and non-face images serve as positive and negative samples, respectively, for training the classifiers. The collection process is shown in Fig. 2.
1.2 size normalization
Each collected face and non-face image is normalized to a specified size. Let the original sample image be [F(x,y)]_{M×N}, with width M and height N, where F(x,y) (0≤x<M, 0≤y<N) is the gray value of the pixel at (x,y); let the image after size normalization be [G(x,y)]_{W×H}, with width W and height H; W=H=20 in the experiments. Size normalization can thus be regarded as mapping the source lattice [F(x,y)]_{M×N} onto the target lattice [G(x,y)]_{W×H}. The invention uses backward mapping (back projection) and linear interpolation to transform the original sample image into the standard-size sample image; the correspondence between the input image [F(x,y)]_{M×N} and the normalized image [G(x,y)]_{W×H} is

G(x,y) = F(x/r_x, y/r_y)

where r_x and r_y are the scale factors in the x and y directions: r_x = N/H, r_y = M/W.
By this formula, the point (x,y) of the output lattice corresponds to the point (x/r_x, y/r_y) of the input image. Since x/r_x and y/r_y are generally not integers, F(x/r_x, y/r_y) must be estimated from the values at nearby known discrete points. Following the linear interpolation method, for a given (x,y) let x0 = [x/r_x] and y0 = [y/r_y], where [·] is the bracket (integer-part) function, and let Δx and Δy be the corresponding interpolation weights (defined in the accompanying formula). The interpolation process can then be expressed as:

G(x,y) = F(x0+Δx, y0+Δy)
       = F(x0,y0)·Δx·Δy + F(x0+1,y0)·(1−Δx)·Δy + F(x0,y0+1)·Δx·(1−Δy) + F(x0+1,y0+1)·(1−Δx)·(1−Δy)
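To make the mapping concrete, here is a minimal Python sketch of the backward-mapping resize (NumPy assumed). Since the patent defines Δx and Δy only in a formula figure, the sketch uses the standard fractional-part weights, which is our assumption:

```python
import numpy as np

def resize_bilinear(src: np.ndarray, out_w: int = 20, out_h: int = 20) -> np.ndarray:
    """Backward-mapping resize with bilinear interpolation.

    Every target pixel (x, y) is mapped back to the source point
    (x/r_x, y/r_y); its four integer neighbours are blended with
    weights given by the fractional parts (our assumption for the
    weight formula, which the patent gives only as a figure).
    """
    src_h, src_w = src.shape
    r_x = out_w / src_w              # scale factor in x
    r_y = out_h / src_h              # scale factor in y
    dst = np.empty((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            sx = min(x / r_x, src_w - 1.001)   # source coordinate
            sy = min(y / r_y, src_h - 1.001)
            x0, y0 = int(sx), int(sy)          # bracket function [.]
            dx, dy = sx - x0, sy - y0          # fractional parts
            dst[y, x] = (src[y0, x0] * (1 - dx) * (1 - dy)
                         + src[y0, x0 + 1] * dx * (1 - dy)
                         + src[y0 + 1, x0] * (1 - dx) * dy
                         + src[y0 + 1, x0 + 1] * dx * dy)
    return dst
```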
1.3 gray scale normalization
Factors such as ambient illumination and the imaging device may make image brightness or contrast abnormal, producing strong shadows or highlights. The geometrically normalized samples therefore also need gray-level equalization to improve their intensity distribution and enhance consistency between patterns. The invention equalizes a sample by normalizing its gray mean and variance, adjusting the mean μ and variance σ of the sample picture to given values μ0 and σ0.

First the mean μ and variance σ of the sample image G(x,y) (0≤x<W, 0≤y<H) are computed:

μ = (1/(W·H)) Σ_{x,y} G(x,y),  σ² = (1/(W·H)) Σ_{x,y} (G(x,y) − μ)²

Then the gray value of every pixel is transformed:

I(x,y) = (σ0/σ)·(G(x,y) − μ) + μ0

so that the mean and variance of the image gray levels are adjusted to the given values μ0 and σ0, completing the gray-level normalization of the sample.
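A minimal sketch of this gray-level equalization, assuming the standard linear mean/variance transform implied by the text (the target values mu0 and sigma0 below are illustrative placeholders):

```python
import numpy as np

def normalize_gray(G: np.ndarray, mu0: float = 128.0, sigma0: float = 40.0) -> np.ndarray:
    """Adjust the gray mean and variance of sample G to mu0, sigma0."""
    mu = G.mean()
    sigma = G.std()
    if sigma < 1e-6:              # guard against flat patches
        sigma = 1e-6
    return (sigma0 / sigma) * (G - mu) + mu0
```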
2. Fast extraction of microstructure features
The invention uses the five types of microstructure templates in Fig. 5 to extract high-dimensional microstructure features from face and non-face samples. Each microstructure feature is obtained by computing the difference between the sums of pixel gray levels in the image regions covered by the black and white parts of the template (the two colors merely distinguish the two regions, here and below); both the position of the template in the image and its size may vary. Features are extracted as follows:
Define S(x1,y1; x2,y2) as the sum of the pixel gray levels in the region (x1≤x′≤x2, y1≤y′≤y2). If the pixel coordinate of the upper-left corner of the microstructure template is (x,y), then the five types of microstructure features (in the first four types the black and white regions have equal area; in the fifth type the black region is centered symmetrically inside the white region) are, as shown in Fig. 5:
(a):S(x,y;x+w-1,y+h-1)-S(x+w,y;x+2w-1,y+h-1)
(b):S(x,y;x+w-1,y+h-1)-S(x,y+h;x+w-1,y+2h-1)
(c):2S(x+w,y;x+2w-1,y+h-1)-S(x,y;x+3w-1,y+h-1)
(d):S(x,y;x+2w-1,y+2h-1)-2S(x,y;x+w-1,y+h-1)-2S(x+w,y+h;x+2w-1,y+2h-1)
(e):S(x,y;x+w-1,y+h-1)-S(x+2,y+2;x+w-3,y+h-3)
Since each feature extraction involves only sums of pixels over rectangular regions, any microstructure feature, at any scale and position, can be obtained quickly from the integral image of the whole image.
2.1 Integral image
For an image I(x,y) (x≥0, y≥0), define its integral image II(x,y) as the sum of all pixels in the range from (0,0) to (x,y):

II(x,y) = Σ_{x′≤x, y′≤y} I(x′,y′)

and define II(−1,y)=0, II(x,−1)=0. Then:

S(x1,y1; x2,y2) = II(x2,y2) + II(x1−1,y1−1) − II(x2,y1−1) − II(x1−1,y2)

That is, the pixel sum S(x1,y1; x2,y2) over any rectangular region of the original image I(x,y) can be computed from the integral image with three additions/subtractions. Similarly, define the squared integral image SqrII(x,y) as the sum of the squares of all pixels in the range from (0,0) to (x,y):

SqrII(x,y) = Σ_{x′≤x, y′≤y} I(x′,y′)²

The squared integral image is used to compute the variance of each rectangular region (see Section 2.3).
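The following sketch shows the integral-image bookkeeping in Python (NumPy assumed); a one-pixel zero border stands in for the boundary convention II(−1,y)=II(x,−1)=0, and rect_sum reproduces the three-addition formula for S(x1,y1; x2,y2):

```python
import numpy as np

def integral_images(img: np.ndarray):
    """Return (II, SqrII), each padded with a leading zero row/column
    so that index 0 plays the role of II(-1, y) = II(x, -1) = 0."""
    II = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    SqrII = np.zeros_like(II)
    II[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    SqrII[1:, 1:] = (img.astype(np.int64) ** 2).cumsum(axis=0).cumsum(axis=1)
    return II, SqrII

def rect_sum(II: np.ndarray, x1: int, y1: int, x2: int, y2: int) -> int:
    """S(x1,y1; x2,y2): pixel sum over the inclusive rectangle,
    via three additions/subtractions on the padded integral image.
    Arrays are indexed [row=y, col=x]."""
    return int(II[y2 + 1, x2 + 1] + II[y1, x1]
               - II[y1, x2 + 1] - II[y2 + 1, x1])
```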
2.2 Fast extraction of the microstructure features
Since each feature involves only rectangular pixel sums, any of the above microstructure features can be computed quickly by a few additions and subtractions on the integral image. The formula for the type (a) microstructure feature (illustrated in Fig. 6) is:
g(x,y,w,h)=2·II(x+w-1,y-1)+II(x+2·w-1,y+h-1)
+II(x-1,y+h-1)-2·II(x+w-1,y+h-1)
-II(x+2·w-1,y-1)-II(x-1,y-1)
(b) type microstructure features:
g(x,y,w,h)=2II(x+w-1,y+h-1)+II(x-1,y-1)-II(x+w-1,y-1)
-2II(x-1,y+h-1)-II(x+w-1,y+2h-1)+II(x-1,y+2h-1)
(c) type microstructure features:
g(x,y,w,h)=2II(x+2w-1,y+h-1)+2II(x+w-1,y-1)-2II(x+2w-1,y-1)
-2II(x+w-1,y+h-1)-II(x+3w-1,y+h-1)-II(x-1,y-1)
+II(x-1,y+h-1)+II(x+3w-1,y-1)
(d) type microstructure features:
g(x,y,w,h)=-II(x-1,y-1)-II(x+2w-1,y-1)-II(x-1,y+2h-1)
-4II(x+w-1,y+h-1)+2II(x+w-1,y-1)+2II(x-1,y+h-1)
-II(x+2w-1,y+2h-1)+2II(x+2w-1,y+h-1)+2II(x+w-1,y+2h-1)
(e) type microstructure features:
g(x,y,w,h)=II(x+w-1,y+h-1)+II(x-1,y-1)-II(x+w-1,y-1)-II(x-1,y+h-1)
-II(x+w-3,y+h-3)-II(x+1,y+1)+II(x+1,y+h-3)+II(x+w-1,y+1)
Varying the parameters x, y, w, h extracts features at different positions and scales. For a 20×20-pixel sample image, 92267 microstructure features of the five types can be obtained in total, forming the feature vector FV(j) of the sample, 1≤j≤92267.
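With rect_sum from the sketch above, each feature type reduces to a handful of rectangle sums. A sketch for types (a) and (b); the remaining types follow the same pattern:

```python
def feature_a(II, x, y, w, h):
    """Type (a): horizontal pair, left block minus right block.
    Equivalent to the expanded integral-image formula in the text."""
    left = rect_sum(II, x, y, x + w - 1, y + h - 1)
    right = rect_sum(II, x + w, y, x + 2 * w - 1, y + h - 1)
    return left - right

def feature_b(II, x, y, w, h):
    """Type (b): vertical pair, top block minus bottom block."""
    top = rect_sum(II, x, y, x + w - 1, y + h - 1)
    bottom = rect_sum(II, x, y + h, x + w - 1, y + 2 * h - 1)
    return top - bottom
```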
2.3 Feature normalization
To reduce the influence of illumination on face detection, the gray mean and variance of every 20×20-pixel sample image must be normalized. The mean μ and variance σ of the sub-window are first computed quickly, and every feature dimension is then normalized. For a 20×20-pixel sub-window region (x0≤x′≤x0+19, y0≤y′≤y0+19), the pixel statistics μ and σ are (as shown in Fig. 6):

μ = [II(x0+19,y0+19) + II(x0−1,y0−1) − II(x0−1,y0+19) − II(x0+19,y0−1)] / 400

σ = {[SqrII(x0+19,y0+19) + SqrII(x0−1,y0−1) − SqrII(x0−1,y0+19) − SqrII(x0+19,y0−1)] / 400 − μ²}^(1/2)

Each dimension of the microstructure features can then be normalized using μ and σ (formula given in the figure). For a 20×20-pixel sample image, 92267-dimensional microstructure features FV(j), 1≤j≤92267, are obtained in total.
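A sketch of the window statistics and the per-feature normalization. Dividing each raw feature by σ is our assumption for the normalization step, whose exact formula the patent gives only as a figure:

```python
import math

def window_stats(II, SqrII, x0, y0, size=20):
    """Mean and standard deviation of a size x size sub-window,
    from the padded integral images built by integral_images."""
    n = size * size
    s = rect_sum(II, x0, y0, x0 + size - 1, y0 + size - 1)
    sq = rect_sum(SqrII, x0, y0, x0 + size - 1, y0 + size - 1)
    mu = s / n
    var = max(sq / n - mu * mu, 0.0)
    return mu, math.sqrt(var)

def normalize_feature(fv_raw, sigma):
    """Assumed normalization: scale the raw feature by the window std."""
    return fv_raw / max(sigma, 1e-6)
```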
3. Feature selection and classifier design
To reach a sufficiently high detection speed, a face detector must adopt a hierarchical structure (Fig. 7), formed by cascading strong classifiers from simple to complex. Strong classifiers with simple structure first exclude background windows in the image, and the remaining windows are then judged by strong classifiers with complex structure (a strong classifier here means a classifier that reaches sufficiently high performance on the training set; a weak classifier below means a classifier whose error rate on the training set is slightly below 0.5).
The invention trains every layer of strong classifier with the CS-AdaBoost algorithm. CS-AdaBoost is a weak-classifier ensemble algorithm that combines weak classifiers into a strong classifier on the training set; unlike standard AdaBoost, it treats the risks of the two kinds of classification error differently, minimizing the total misclassification risk on the training set. For the face detection problem, the strong classifier obtained by training keeps the classification error on the face class (FRR) low enough while reducing the classification error on the non-face class (FAR) as much as possible.
3.1 Structure of the weak classifier
In the invention a weak classifier is a tree (stump) classifier built on a single feature dimension: h_j(sub) compares the feature value g_j(sub) with the threshold θ_j and outputs 1 (face) or 0 (non-face) accordingly, where sub is a 20×20-pixel sample, g_j(sub) is the j-th feature extracted from the sample, and θ_j is the decision threshold for the j-th feature (obtained from the statistics of the j-th feature over all collected face and non-face samples so that the FRR of the face samples meets the specification). Each weak classifier thus completes its decision with a single threshold comparison; 92267 weak classifiers are available in total.
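A weak classifier is thus a one-feature decision stump. A minimal sketch (the polarity flag is our addition; the patent's exact decision rule appears only as a figure):

```python
from dataclasses import dataclass

@dataclass
class Stump:
    j: int          # feature index, 1 <= j <= 92267
    theta: float    # decision threshold theta_j
    polarity: int   # +1 or -1: which side of theta is "face" (assumed)

    def predict(self, fv) -> int:
        """fv: the sample's feature vector, indexable by j.
        Returns 1 (face) or 0 (non-face) with a single comparison."""
        return 1 if self.polarity * fv[self.j] >= self.polarity * self.theta else 0
```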
3.2 Strong classifier design based on the CS-AdaBoost algorithm
The CS-AdaBoost algorithm, combined with the weak-classifier construction above, is used to train the face/non-face strong classifiers. Denote the training set L = {(sub_i, l_i)}, i=1,…,n, where l_i = 0,1 is the class label of sample image sub_i (non-face and face class respectively), with n_face face samples and n_nonface non-face samples. The training steps are:
3.2.1 Parameter initialization
Initialization of the misclassification risks: each face sample and each non-face sample is assigned its misclassification risk (c is the multiple by which the misclassification risk of the face class exceeds that of the non-face class; c should be greater than 1 and should decrease toward 1 as the number of strong-classifier layers grows; concrete values are given in Table 1).
Initialization of the sample weights: every sample starts with an equal initial weight.
Choose the iteration count T (the number of weak classifiers to be used); T should grow as the number of strong-classifier layers grows; concrete values are given in Table 1.
Compute the maximum Fmax(j) and minimum Fmin(j) of every feature dimension over the sample set (j is the feature index, 1≤j≤92267).
3.2.2 Repeat the following process T times (t=1,…,T):
3.2.2.1 For each feature j (1≤j≤92267) build the weak classifier h_j, then exhaustively search the threshold parameter θ_j between Fmin(j) and Fmax(j) so that the weighted error rate ε_j of h_j is minimal;
3.2.2.2 Take the weak classifier with the minimal risk-weighted error as h_t;
3.2.2.3 Compute the combination parameter (the weight of h_t in the strong classifier);
3.2.2.4 Update the sample weights w_i, i=1,…,n, increasing the weights of misclassified samples according to their misclassification risks.
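Since the patent's risk-initialization, combination-parameter, and weight-update formulas appear only as figures, the following sketch substitutes the standard discrete-AdaBoost forms and folds the risk multiple c into the initial weights; it is a cost-sensitive variant consistent with the text, not the patent's exact equations:

```python
import math

def train_cs_adaboost(X, labels, c, T, build_stump):
    """X: list of feature vectors; labels: 1 = face, 0 = non-face.
    c > 1: misclassification-risk multiple of the face class.
    build_stump(X, labels, w) is an assumed helper that exhaustively
    searches all features/thresholds and returns (stump, weighted_error).
    Returns the list of (alpha_t, stump_t) forming one strong classifier."""
    n = len(X)
    # Risk-weighted initialization: faces cost c times more to reject.
    w = [c if l == 1 else 1.0 for l in labels]
    total = sum(w)
    w = [wi / total for wi in w]

    strong = []
    for _ in range(T):
        stump, eps = build_stump(X, labels, w)       # exhaustive theta search
        eps = min(max(eps, 1e-10), 0.5 - 1e-10)      # keep alpha finite
        alpha = 0.5 * math.log((1.0 - eps) / eps)    # standard AdaBoost form
        # Re-weight: misclassified samples gain weight.
        for i in range(n):
            correct = (stump.predict(X[i]) == labels[i])
            w[i] *= math.exp(-alpha if correct else alpha)
        z = sum(w)
        w = [wi / z for wi in w]                     # renormalize
        strong.append((alpha, stump))
    return strong
```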
3.3 Cascade of the multi-layer strong classifiers
Since a single-layer strong classifier can hardly achieve high classification speed, extremely low FRR, and extremely low FAR at the same time, the whole face detector must adopt a hierarchical structure in which multiple strong classifiers, from simple to complex, are cascaded, as shown in Fig. 7. During detection, as soon as an image window fails any layer it is excluded immediately without further judgment; otherwise it is judged further by the subsequent, more complex strong classifiers. Windows that obviously are not faces are thus excluded by the first few layers without any subsequent computation, which saves a great deal of work.
With 11580 face samples and 2000000 non-face samples as the training set, the concrete training steps of the cascade of strong classifiers are as follows:
(1) Initialize i=1. Define the training target of every layer of strong classifier as FRR ≤ 0.02% on the face training set and FAR ≤ 60% on the non-face training set; define the target of the whole face detector as FRR ≤ 0.5% on the face training set and FAR ≤ 3.2×10⁻⁶ on the non-face training set, where FAR and FRR are defined as:
FAR = (number of non-face samples classified as faces ÷ total number of non-face samples) × 100%
FRR = (number of face samples classified as non-faces ÷ total number of face samples) × 100%
(2) Train the i-th layer of strong classifier on the training set with the method of Section 3.2;
(3) Run the first i layers of classifiers obtained so far on the sample set;
(4) If FRR and FAR have not reached the preset values, set i ← i+1 and return to step (2) to continue training; otherwise stop training.
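A sketch of this layer-by-layer loop around the trainer above. The half-of-total-alpha acceptance threshold in cascade_accepts is a common convention and an assumption here, as is the per-layer (c, T) schedule argument:

```python
def cascade_accepts(cascade, fv):
    """Early-rejecting cascade decision: a window must pass every layer.
    Each layer accepts when the alpha-weighted vote reaches half of the
    total alpha (assumed threshold convention)."""
    for layer in cascade:
        total = sum(a for a, _ in layer)
        vote = sum(a for a, s in layer if s.predict(fv) == 1)
        if vote < 0.5 * total:
            return False
    return True

def train_cascade(faces, nonfaces, schedule, build_stump,
                  target_frr=0.005, target_far=3.2e-6):
    """schedule: per-layer (c, T) pairs as in Table 1 (assumed form)."""
    cascade = []
    X = faces + nonfaces
    labels = [1] * len(faces) + [0] * len(nonfaces)
    for c, T in schedule:
        cascade.append(train_cs_adaboost(X, labels, c, T, build_stump))
        frr = sum(not cascade_accepts(cascade, f) for f in faces) / len(faces)
        far = sum(cascade_accepts(cascade, g) for g in nonfaces) / len(nonfaces)
        if frr <= target_frr and far <= target_far:
            break
    return cascade
```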
The face detector finally obtained comprises 19 layers of strong classifiers using 3139 weak classifiers in total. The FRR of the whole detector on the face validation set is about 0.15%, and its FAR on the non-face training set is about 3.2×10⁻⁶. Table 1 gives the training results of several of the layers.
Table 1. Training results of several face/non-face strong classifiers

Layer i | c | T | FRR on face training set | FAR on non-face validation set
---|---|---|---|---
1 | 100 | 1 | 0.10% | 64.2%
2 | 60 | 1 | 0.0% | 83.5%
3 | 3.5 | 5 | 0.0% | 75.4%
7 | 1.5 | 65 | 0.0% | 42.5%
8 | 1.4 | 87 | 0.0% | 40.1%
9 | 1.4 | 120 | 0.0% | 35.4%
17 | 1.2 | 355 | 0.01% | 67.6%
18 | 1.15 | 361 | 0.02% | 60.2%
19 | 1.10 | 397 | 0.02% | 68.3%
During detection, a window is considered to contain a face only if it passes the judgment of all layers of classifiers.
The invention is characterized by being a technology that robustly detects various faces under complex backgrounds and illumination, reaching real-time detection speed on normal video. It first normalizes the collected samples in size and illumination, eliminating to the greatest extent the within-class differences of input samples caused by illumination and size; it then efficiently extracts microstructure features that finely distinguish the structural characteristics of face and non-face patterns; on this basis the CS-AdaBoost algorithm trains strong classifiers with extremely low FRR and extremely low FAR; the multi-layer strong classifiers are then cascaded into a complete face detector, which yields the final face positions.
In a system composed of an image acquisition device and a computer, the detection method comprises a training stage and a detection stage. The training stage contains the following steps in order:
1. Sample collection
Images are acquired with devices such as a camera, digital camera, or scanner; the faces in them are calibrated manually and cut out to build the face training sample database, and non-face training images are cut at random from scenery pictures containing no faces. In total 11580 face samples and 2000000 non-face samples are obtained as the training set.
2. Normalization, comprising linear normalization of sample illumination and size
(2.1) Size normalization
Let the original sample image be [F(x,y)]_{M×N}, with width M and height N, and the size-normalized image be [G(x,y)]_{W×H}; W=H=20 in the experiments. Backward mapping and linear interpolation produce the normalized sample image from the original; the correspondence between the input image and the normalized image is

G(x,y) = F(x/r_x, y/r_y)

where r_x and r_y are the scale factors in the x and y directions: r_x = N/H, r_y = M/W. Since x/r_x and y/r_y are generally not integers, F(x/r_x, y/r_y) is estimated from the values at nearby known discrete points; the invention uses linear interpolation. For a given (x,y), let x0 = [x/r_x] and y0 = [y/r_y], where [·] is the bracket function, and let Δx and Δy be the interpolation weights; then:

G(x,y) = F(x0+Δx, y0+Δy)
       = F(x0,y0)·Δx·Δy + F(x0+1,y0)·(1−Δx)·Δy + F(x0,y0+1)·Δx·(1−Δy) + F(x0+1,y0+1)·(1−Δx)·(1−Δy)
(2.2) Gray-level normalization
The gray value of every pixel of the size-normalized sample image G(x,y) is transformed so that the mean μ and variance σ are adjusted to the given values μ0 and σ0, yielding the sample image I(x,y):

I(x,y) = (σ0/σ)·(G(x,y) − μ) + μ0
3. Building the sample feature library
The integral image is computed for fast extraction of the microstructure features, in the following steps:
(3.1) Compute the integral image of each sample
Following the definition II(x,y) = Σ_{x′≤x, y′≤y} I(x′,y′), compute the integral image II(x,y) of each sample, with II(−1,y)=0 and II(x,−1)=0.
(3.2) Extraction of the microstructure feature library
Using the definition of each microstructure feature and the integral image, the 92267 features of each sample are extracted quickly, forming the feature libraries of the face samples and of the non-face samples respectively.
4. Classifier design
Every layer of face/non-face strong classifier is trained with the above training set and the CS-AdaBoost algorithm, and the multi-layer strong classifiers are cascaded into a complete face detector. The steps are:
(4.1) Initialize i=1. Define the training target of every layer of strong classifier as FRR ≤ 0.02% on the face training set and FAR ≤ 60% on the non-face training set; define the target of the whole face detector as FRR ≤ 0.5% on the face training set and FAR ≤ 3.2×10⁻⁶ on the non-face training set.
(4.2) Train the i-th layer of strong classifier;
(4.3) Run the first i layers of classifiers obtained so far on the sample set;
(4.4) If FRR and FAR have not reached the preset values, set i ← i+1 and return to step (4.2) to continue training; otherwise stop training.
Step (4.2) contains the following steps in order:
(4.2.1) Parameter initialization
Initialize the misclassification risks of the training samples: each face sample and each non-face sample is assigned its misclassification risk (c is the multiple by which the misclassification risk of the face class exceeds that of the non-face class; c should be greater than 1 and decrease toward 1 as the number of strong-classifier layers grows; concrete values are in Table 1).
Initialize the training-sample weights: every sample starts with an equal initial weight.
Choose the iteration count T (the number of weak classifiers to use); T grows with the number of strong-classifier layers; concrete values are in Table 1.
Compute the maximum Fmax(j) and minimum Fmin(j) of every feature dimension over the sample set (j is the feature index, 1≤j≤92267).
(4.2.2) Repeat the following process T times (t=1,…,T):
(4.2.2.1) For each feature j (1≤j≤92267) build the weak classifier h_j, then exhaustively search the threshold θ_j between Fmin(j) and Fmax(j) so that the error rate ε_j of h_j is minimal;
(4.2.2.2) Take the weak classifier with the minimal risk-weighted error as h_t;
(4.2.2.3) Compute the combination parameter of h_t;
(4.2.2.4) Update the sample weights w_i, i=1,…,n.
A complete face detector is obtained by training through the above steps.
In the detection stage, the invention judges whether the input image contains faces by the following steps (an actual detection process is shown in the figure):
(1) Acquisition of the input image
Images are acquired with devices such as a camera, digital camera, or scanner.
(2) Scaling of the input image and fast judgment of every sub-window in it
To detect faces of different sizes, the linear interpolation method used earlier reduces the input image 12 times in succession at a fixed ratio (the invention uses 1.25), giving 13 input images of different sizes in total; all 20×20-pixel sub-windows of every input image are judged, so faces with sizes from 20×20 up to 280×280 pixels can be detected. The concrete steps are:
(2.1) scaling of input picture
Adopt linear interpolation method that preamble uses in proportion q=1.25 dwindle 12 input picture I continuously (x y) obtain input image sequence { I
i(x, y) } (and i=0 ..., 12);
(2.2) calculating of integral image
Use above iterative formula to calculate each image I respectively
i(x, y) pairing integral image II
i(x is y) with square integral image SqrII
i(x, y), (i=0 ..., 9);
(2.3) the exhaustive judgement of wicket
From every width of cloth image I
i(x, upper left corner y) begins the wicket of all 20 * 20 Pixel Dimensions of exhaustive differentiation, to any wicket [x
0, y
0x
0+ 19, y
0+ 19] treatment step is as follows:
(2.3.1) The mean μ and variance σ of the sub-window are computed from the integral image II_i(x,y) and the squared integral image SqrII_i(x,y) of the whole image:

μ = [II_i(x0+19,y0+19) + II_i(x0−1,y0−1) − II_i(x0−1,y0+19) − II_i(x0+19,y0−1)] / 400

σ = {[SqrII_i(x0+19,y0+19) + SqrII_i(x0−1,y0−1) − SqrII_i(x0−1,y0+19) − SqrII_i(x0+19,y0−1)] / 400 − μ²}^(1/2)
(2.3.2) The microstructure features of the sub-window are extracted quickly with the method introduced above, and feature normalization is performed;
(2.3.3) The trained multi-layer face/non-face strong classifiers judge the sub-window; if it passes the judgment of all layers of strong classifiers, the sub-window is considered to contain a face and its position is output; otherwise the sub-window is discarded without further processing.
With the above steps every face in the input image can be detected quickly and robustly.
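The whole detection stage can be sketched by combining the pieces above: pyramid, integral images, window statistics, and early-reject cascade evaluation. extract_features is an assumed helper that would evaluate the 92267 normalized microstructure features of a window:

```python
import numpy as np

def detect_faces(img, cascade, levels=13, q=1.25, size=20):
    """Scan every size x size sub-window of a 13-level image pyramid
    built by repeated shrinking at ratio q, as in the detection stage.
    Returns (x, y, side) boxes in original-image coordinates."""
    faces = []
    cur = img.astype(np.float64)
    for i in range(levels):
        II, SqrII = integral_images(cur.astype(np.int64))
        h, w = cur.shape
        for y0 in range(h - size + 1):
            for x0 in range(w - size + 1):
                mu, sigma = window_stats(II, SqrII, x0, y0, size)
                # extract_features is an assumed helper returning the
                # sigma-normalized feature vector of this window.
                fv = extract_features(II, x0, y0, sigma)
                if cascade_accepts(cascade, fv):     # early-reject cascade
                    s = q ** i                       # map back to original scale
                    faces.append((int(x0 * s), int(y0 * s), int(size * s)))
        if i < levels - 1 and min(h, w) / q >= size:
            cur = resize_bilinear(cur, int(w / q), int(h / q))
    return faces
```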
To verify the validity of the invention, we tested on several public databases; a concrete implementation example is also given.
We compared the performance of the invention with the best currently acknowledged algorithms on the CMU test set. The CMU test set comprises 130 pictures with complex backgrounds, containing 507 faces in total. In the experiment each image was scaled up to 13 times at the ratio 1.25, and 71040758 image windows were judged in all. The comparison results are in Table 2. The overall performance of our algorithm is better than the methods of Viola [Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. Proc. on Computer Vision and Pattern Recognition, 2001], Schneiderman [Schneiderman H, Kanade T. Probabilistic modeling of local appearance and spatial relationships for object recognition. Proc. on CVPR, 1998], and Rowley [Rowley H A, Baluja S, Kanade T. Neural network-based face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(1): 23-38], especially at low false-alarm counts: at 10 false alarms, for example, the face detection rate of our algorithm is 90.1%, which is 7%-14% higher than the detection rates of the other algorithms. Compared in particular with Viola's detector trained by the conventional AdaBoost algorithm, our face detector uses 3139 weak classifiers in 19 layers of strong classifiers, whereas Viola's uses more than 6000 weak classifiers in 38 layers; our detector is structurally much simpler, so the invention achieves both better performance and faster detection speed. On normal 386×288 video the algorithm reaches a detection speed above 18 frames per second (PIII 1.8 GHz CPU, 512 MB memory).
Table 2. Performance comparison with other detection methods on the CMU frontal face test set
In addition, on the BANCA database we compared the detection performance with FaceIT, the well-known product of Identix. The BANCA database comprises 6540 pictures with complex backgrounds and illumination; every picture contains a frontal face, and the pitch of the faces varies greatly. The correct detection rate of the invention is 98.8%, versus 94.9% for FaceIT. In a test on an image set in which every image contains a face, provided by a third party (China Aerospace information company), the detection accuracy of our algorithm is 98.6%, versus 98.0% for FaceIT.
Description of drawings
Fig. 1 Hardware configuration of a typical face detection system.
Fig. 2 Acquisition process of the training samples.
Fig. 3 Examples of typical collected face samples.
Fig. 4 Structure of the face detection system.
Fig. 5 The five types of microstructure feature templates.
Fig. 6 Computation of the integral image and example of microstructure feature extraction.
Fig. 7 Cascade of the multi-layer strong classifiers.
Fig. 8 Training process of a strong classifier.
Fig. 9 Example of the actual face detection process in an image.
Fig. 10 Face recognition attendance system based on this algorithm.
Embodiment
To realize a face detection system, the face detector must first be obtained by training on sufficiently many collected samples; the detector can then detect faces in any input image. The hardware configuration of the whole system is shown in Fig. 1, and its training and detection processes in Fig. 4. The parts of the system are introduced in detail below:
A) Realization of the training system
A.1 Acquisition of training samples
Images are acquired with devices such as a camera, digital camera, or scanner; the faces in them are calibrated manually and cut out to build the face training sample database, while non-face training samples are extracted at random from scenery pictures and the like containing no faces. In this example 11580 face samples and 2000000 non-face samples are collected and used as the training set.
A.2 Sample normalization
A.2.1 Size normalization
Let the original sample image be [F(x,y)]_{M×N}, with width M and height N, and the size-normalized image be [G(x,y)]_{W×H}; W=H=20 in the experiments. Backward mapping and linear interpolation produce the normalized sample image from the original; the correspondence between the input image and the normalized image is

G(x,y) = F(x/r_x, y/r_y)

where r_x and r_y are the scale factors in the x and y directions: r_x = N/H, r_y = M/W. For a given (x,y), let x0 = [x/r_x] and y0 = [y/r_y], where [·] is the bracket function, and let Δx and Δy be the interpolation weights; then:

G(x,y) = F(x0+Δx, y0+Δy)
       = F(x0,y0)·Δx·Δy + F(x0+1,y0)·(1−Δx)·Δy + F(x0,y0+1)·Δx·(1−Δy) + F(x0+1,y0+1)·(1−Δx)·(1−Δy)
A.2.2 Illumination normalization
The gray value of every pixel of the size-normalized sample image G(x,y) is transformed so that the mean μ and variance σ are adjusted to the given values μ0 and σ0, yielding the sample image I(x,y):

I(x,y) = (σ0/σ)·(G(x,y) − μ) + μ0
A.3 Building the sample feature library
A.3.1 Computation of the sample integral images
Following the definition II(x,y) = Σ_{x′≤x, y′≤y} I(x′,y′), compute the integral image II(x,y) of each sample, with II(−1,y)=0 and II(x,−1)=0.
A.3.2 Extraction of the microstructure feature library
Using the definition of each microstructure feature and the integral image, the 92267 features of each sample are extracted quickly and normalized, forming the feature libraries of the face samples and of the non-face samples respectively.
A.4 Training of the face detector
Every layer of face/non-face strong classifier is trained with the above training set and the CS-AdaBoost algorithm, and the multi-layer strong classifiers are cascaded into a complete face detector. The steps are:
A.4.1 Initialize i=1. Define the training target of every layer of strong classifier as FRR ≤ 0.02% on the face training set and FAR ≤ 60% on the non-face training set; define the target of the whole face detector as FRR ≤ 0.5% on the face training set and FAR ≤ 3.2×10⁻⁶ on the non-face training set.
A.4.2 Train the i-th layer of strong classifier;
A.4.3 Run the first i layers of classifiers obtained so far on the sample set;
A.4.4 If FRR and FAR have not reached the preset values, set i ← i+1 and return to step A.4.2 to continue training; otherwise stop training.
Step A.4.2 contains the following steps in order:
A.4.2.1 Parameter initialization
Initialize the misclassification risks of the training samples: each face sample and each non-face sample is assigned its misclassification risk (c is the multiple by which the misclassification risk of the face class exceeds that of the non-face class; c should be greater than 1 and decrease toward 1 as the number of strong-classifier layers grows; concrete values are in Table 1).
Initialize the training-sample weights: every sample starts with an equal initial weight.
Choose the iteration count T (the number of weak classifiers to use); T grows with the number of strong-classifier layers; concrete values are in Table 1.
Compute the maximum Fmax(j) and minimum Fmin(j) of every feature dimension over the sample set (j is the feature index, 1≤j≤92267).
A.4.2.2 Repeat the following process T times (t=1,…,T):
A.4.2.2.1 For each feature j (1≤j≤92267) build the weak classifier h_j, then exhaustively search the threshold θ_j between Fmin(j) and Fmax(j) so that the error rate ε_j of h_j is minimal;
A.4.2.2.2 Take the weak classifier with the minimal risk-weighted error as h_t;
A.4.2.2.3 Compute the combination parameter of h_t;
A.4.2.2.4 Update the sample weights w_i, i=1,…,n.
B) Realization of the detection system
In the detection stage the invention comprises the following steps:
B.1 Image acquisition
Images are acquired with devices such as a camera, digital camera, or scanner.
B.2 Construction of the input-image pyramid and computation of the integral images
To detect faces of different sizes, the linear interpolation method used earlier reduces the input image 12 times in succession at a fixed ratio (the invention uses 1.25), giving 13 input images of different sizes in total; every 20×20-pixel sub-window of every input image is judged (a sub-window is defined as a rectangular sub-image of the input image), so faces with sizes from 20×20 up to 280×280 pixels can be detected. The concrete steps are:
B.2.1 Scaling of the input image
Using the linear interpolation method above, the input image I(x,y) is reduced 12 times in succession at the ratio q=1.25, giving the input image sequence {I_i(x,y)} (i=0,…,12);
B.2.2 Computation of the integral images
Using the iterative formulas above, the integral image II_i(x,y) and the squared integral image SqrII_i(x,y) corresponding to each image I_i(x,y) are computed (i=0,…,9);
B.2.3 Exhaustive judgment of the sub-windows
Starting from the upper-left corner of every image I_i(x,y), all 20×20-pixel sub-windows are judged exhaustively. Any sub-window [x0, y0; x0+19, y0+19] is processed as follows:
B.2.3.1 The mean μ and variance σ of the sub-window are computed from the integral image II_i(x,y) and the squared integral image SqrII_i(x,y) of the whole image:

μ = [II_i(x0+19,y0+19) + II_i(x0−1,y0−1) − II_i(x0−1,y0+19) − II_i(x0+19,y0−1)] / 400

σ = {[SqrII_i(x0+19,y0+19) + SqrII_i(x0−1,y0−1) − SqrII_i(x0−1,y0+19) − SqrII_i(x0+19,y0−1)] / 400 − μ²}^(1/2)
B.2.3.2 The microstructure features of the sub-window are extracted quickly with the method introduced above, and feature normalization is performed;
B.2.3.3 The trained multi-layer face/non-face strong classifiers judge the sub-window; if it passes the judgment of all layers of strong classifiers, the sub-window is considered to contain a face and its position is output; otherwise the sub-window is discarded without further processing.
With the above steps every face in the input image can be detected quickly and robustly.
Embodiment 1: Attendance registration system based on face recognition (Fig. 10)
Face authentication is the friendliest of the biometric authentication technologies that have recently attracted broad attention. It aims to let computers verify personal identity automatically from face images, replacing traditional identity authentication such as passwords, certificates, and seals, with the advantages of being hard to forge, impossible to lose, and convenient to use. This system verifies a person's identity automatically from face information; its face detection module is the research result of this work. The system also took part in the FAT2004 contest organized with ICPR 2004, in which 13 face recognition algorithms participated, from 11 scientific institutions and commercial organizations including Carnegie Mellon University of the USA, the Neuroinformatik institute of Germany, and the University of Surrey of the UK. The system submitted by our laboratory took first place on all three evaluation indices, with error rates about 50% lower than the runner-up. The research result of this work is applied in the face detection module of the submitted system and guarantees that the overall performance of the system reaches the international state of the art.
In summary, the invention can robustly detect faces in images with complex backgrounds; it obtained excellent detection results in the experiments and has very promising application prospects.
Claims (1)
1. A method of robust face detection in images with complex backgrounds, characterized in that the method designs the face detector on the basis of the misclassification risks of the face and non-face patterns. To design this detector, the collected samples are first normalized in size and illumination to eliminate the within-class differences of input samples caused by illumination and size; the CS-AdaBoost algorithm then selects the microstructure features that reflect the difference between the face and non-face patterns, and these features are combined into one layer of strong classifier with a false rejection rate below 10e-3 and a false acceptance rate below 10e-6; the multi-layer strong classifiers are then cascaded into a complete face detector, whose processing yields the final face positions;
In a system composed of an image acquisition device and a computer, the face detection method comprises a training stage and a detection stage, where the training stage contains the following steps in order:
Step 1. Sample collection
Images are acquired with any device including a camera, digital camera, or scanner; the faces in them are calibrated manually and cut out to build the face training sample database; non-face training images are cut at random from scenery pictures containing no faces; in total 11580 face samples and 2000000 non-face samples are obtained as the training set;
Step 2. Normalization, comprising linear normalization of sample illumination and size;
Step 2.1 Size normalization: the face and non-face images obtained in step 1 are normalized to a specified size;
Let the original sample image be [F(x,y)]_{M×N}, with width M and height N, and the size-normalized image be [G(x,y)]_{W×H}; the correspondence between the input image and the normalized image is

G(x,y) = F(x/r_x, y/r_y)

where r_x and r_y are the scale factors in the x and y directions: r_x = N/H, r_y = M/W; F(x/r_x, y/r_y) is the estimated pixel value at the point (x/r_x, y/r_y); let x0 = [x/r_x] and y0 = [y/r_y], where [·] is the bracket function, and let Δx and Δy be the interpolation weights; then:

G(x,y) = F(x0+Δx, y0+Δy)
       = F(x0,y0)·Δx·Δy + F(x0+1,y0)·(1−Δx)·Δy + F(x0,y0+1)·Δx·(1−Δy) + F(x0+1,y0+1)·(1−Δx)·(1−Δy);
Step 2.2 Gray-level normalization
The gray value of every pixel of the size-normalized sample image G(x,y) is transformed so that the mean μ and variance σ are adjusted to the given values μ0 and σ0, yielding the sample image I(x,y):

I(x,y) = (σ0/σ)·(G(x,y) − μ) + μ0
Step 3. Building the sample feature library
The integral image is computed to extract the microstructure features, in the following steps:
Step 3.1 Compute the integral image of each sample
Following the definition II(x,y) = Σ_{x′≤x, y′≤y} I(x′,y′), compute the integral image II(x,y) of each sample, with II(−1,y)=0 and II(x,−1)=0;
Step 3.2 Extraction of the microstructure feature library
Five kinds of microstructure features of the face samples are extracted with the following five types of microstructure templates; each microstructure feature is obtained by computing the difference between the sums of pixel gray levels in the image regions covered by the black and white parts of the template; the five kinds of microstructure features g(x,y,w,h) are expressed as follows:
(a) class: the black and white regions are left-right symmetric and of equal area, where w is the width of each region and h its height:
g(x,y,w,h)=2·II(x+w-1,y-1)+II(x+2·w-1,y+h-1)
+II(x-1,y+h-1)-2·II(x+w-1,y+h-1)
-II(x+2·w-1,y-1)-II(x-1,y-1)
(b) class: the black and white regions are vertically symmetric and of equal area; w and h are defined as in class (a):
g(x,y,w,h)=2II(x+w-1,y+h-1)+II(x-1,y-1)-II(x+w-1,y-1)
-2II(x-1,y+h-1)-II(x+w-1,y+2h-1)+II(x-1,y+2h-1)
(c) class: in the horizontal direction the black region lies between two white regions, and the area of the black region equals that of each white region; w and h are defined as in class (a):
g(x,y,w,h)=2II(x+2w-1,y+h-1)+2II(x+w-1,y-1)-2II(x+2w-1,y-1)
-2II(x+w-1,y+h-1)-II(x+3w-1,y+h-1)-II(x-1,y-1)
+II(x-1,y+h-1)+II(x+3w-1,y-1)
(d) class: the two black regions occupy the first and third quadrants, the two white regions the second and fourth quadrants; the area of each black region equals that of each white region; w and h are defined as in class (a):
g(x,y,w,h)=-II(x-1,y-1)-II(x+2w-1,y-1)-II(x-1,y+2h-1)
-4II(x+w-1,y+h-1)+2II(x+w-1,y-1)+2II(x-1,y+h-1)
-II(x+2w-1,y+2h-1)+2II(x+2w-1,y+h-1)+2II(x+w-1,y+2h-1)
(e) class: the black region lies at the center of the white region, its top, bottom, left, and right sides each 2 pixels from the corresponding sides of the white region; w and h are the width and height of the white region respectively:
g(x,y,w,h)=II(x+w-1,y+h-1)+II(x-1,y-1)-II(x+w-1,y-1)-II(x-1,y+h-1)
-II(x+w-3,y+h-3)-II(x+1,y+1)+II(x+1,y+h-3)+II(x+w-1,y+1)
For a 20×20-pixel sample image and the above five types of microstructure templates, there are 92267 combinations of the parameters x, y, w, h, from which the feature quantities FV(j), 1≤j≤92267, of the sample image are extracted;
Step 3.3 Feature normalization, i.e. the gray mean and variance of each pixel-sample image are normalized:
Let μ be the mean and σ the variance of the pixel gray levels in each 20×20-pixel sub-window region (x0≤x′≤x0+19, y0≤y′≤y0+19); then:

μ = [II(x0+19,y0+19) + II(x0−1,y0−1) − II(x0−1,y0+19) − II(x0+19,y0−1)] / 400

σ = {[SqrII(x0+19,y0+19) + SqrII(x0−1,y0−1) − SqrII(x0−1,y0+19) − SqrII(x0+19,y0−1)] / 400 − μ²}^(1/2)

Each microstructure feature is then normalized using μ and σ; for a 20×20-pixel sample image, 92267-dimensional microstructure features FV(j), 1≤j≤92267, are obtained in total;
Step 4. Feature selection and classifier design
Every layer of face/non-face strong classifier is trained with the above training set and the CS-AdaBoost algorithm, and the multi-layer strong classifiers are cascaded into a complete face detector, in the following steps:
Step 4.1 Initialize i=1; define the training target of every layer of strong classifier as false rejection rate FRR ≤ 0.02% on the face training set and false acceptance rate FAR ≤ 60% on the non-face training set; define the target of the whole face detector as FRR ≤ 0.5% on the face training set and FAR ≤ 3.2×10⁻⁶ on the non-face training set;
Step 4.2 Train the i-th layer of strong classifier;
Step 4.3 Run the first i layers of classifiers obtained so far on the sample set and compute FRR and FAR:
FAR = (number of non-face samples classified as faces ÷ total number of non-face samples) × 100%
FRR = (number of face samples classified as non-faces ÷ total number of face samples) × 100%
Step 4.4 If FRR and FAR have not reached the preset values of step 4.1, increase i by 1 and return to step 4.2 to continue training; otherwise stop training;
Step 4.2 contains the following steps in order:
Step 4.2.1 Parameter initialization
Initialization of the misclassification risks of the training samples: each face sample and each non-face sample is assigned its misclassification risk, where c is the multiple by which the misclassification risk of the face class exceeds that of the non-face class; c is greater than 1 and decreases toward 1 as the number of strong-classifier layers grows;
Initialization of the training-sample weights: every sample starts with an equal initial weight;
Choose the iteration count T, the number of weak classifiers to use; T increases with the number of strong-classifier layers;
Compute the maximum Fmax(j) and minimum Fmin(j) of the distribution of every feature over the sample set, where j is the feature index, 1≤j≤92267;
Step 4.2.2 Repeat the following process T times, t=1,…,T:
Step 4.2.2.1 Use the j-th feature, 1≤j≤92267, to build the weak classifier h_j, then exhaustively search the threshold parameter θ_j between Fmin(j) and Fmax(j) so that the error rate ε_j of h_j is minimal;
The weak classifier is denoted h_j(sub), h_j for short, where:
sub is a 20×20-pixel sample;
g_j(sub) is the j-th feature extracted from the sample;
θ_j is the decision threshold based on the j-th feature, obtained from the statistics of the j-th feature over all collected face and non-face samples so that the FRR of the face samples meets the specification;
h_j(sub) is the decision output of the tree classifier built on the j-th feature; correspondingly, 92267 weak classifiers are obtained;
l_i = 0,1 is the class label of the sample image sub_i, corresponding to the non-face class and the face class respectively; there are n_face face samples and n_nonface non-face samples, together forming the training set L = {(sub_i, l_i)}, i=1,…,n, l_i = 0,1;
Step 4.2.2.2 Take the weak classifier with the minimal risk-weighted error as h_t;
Step 4.2.2.3 Compute the combination parameter of h_t;
Step 4.2.2.4 Update the sample weights w_i, i=1,…,n;
A complete face detector can be trained through the above steps;
In the detection stage, the following steps judge whether the input image contains faces:
Step 1. Acquisition of the input image
Images are acquired with any of a camera, digital camera, or scanner;
Step 2. Scaling of the input image and judgment of every sub-window in it
To detect faces of different sizes, the linear interpolation method of training-stage step 2.1 reduces the input image 12 times in succession at a fixed ratio, giving 13 input images of different sizes in total; every 20×20-pixel window of every input image is judged, so faces with sizes from 20×20 up to 280×280 pixels can be detected; the steps are:
Step 2.1 Scaling of the input image
Using the linear interpolation method of training-stage step 2.1, the input image I(x,y) is reduced 12 times in succession at the ratio q=1.25, giving the input image sequence {I_i(x,y)}, i=0,…,12;
Step 2.2 Computation of the integral images
Using the iterative formula of training-stage step 3.1, the integral image II_i(x,y) and the squared integral image SqrII_i(x,y) corresponding to each image I_i(x,y) are computed, i=0,…,9;
Step 2.3 Judgment of every sub-window in the image
Starting from the upper-left corner of every image I_i(x,y), each 20×20-pixel sub-window in the image is examined; any sub-window [x0, y0; x0+19, y0+19] is processed as follows:
Step 2.3.1 utilizes the integrogram II of entire image
i(x is y) with square integrogram SarII
i(x, y) the average μ and the variances sigma of calculating wicket;
μ = [II_i(x_0 + 19, y_0 + 19) + II_i(x_0 − 1, y_0 − 1) − II_i(x_0 − 1, y_0 + 19) − II_i(x_0 + 19, y_0 − 1)] / 400
σ = {[SqrII_i(x_0 + 19, y_0 + 19) + SqrII_i(x_0 − 1, y_0 − 1) − SqrII_i(x_0 − 1, y_0 + 19) − SqrII_i(x_0 + 19, y_0 − 1)] / 400 − μ²}^(1/2)
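A direct transcription of these two formulas, treating any index of −1 as zero (i.e. a zero-padded border), might look like this; window_mean_std is an illustrative name:

```python
import numpy as np

def window_mean_std(ii, sqr_ii, x0, y0, size=20):
    """Mean and standard deviation of the size x size window whose top-left
    corner is (x0, y0), computed from the integral images."""
    def rect_sum(tbl):
        at = lambda y, x: tbl[y, x] if y >= 0 and x >= 0 else 0.0
        y1, x1 = y0 + size - 1, x0 + size - 1
        return at(y1, x1) + at(y0 - 1, x0 - 1) - at(y0 - 1, x1) - at(y1, x0 - 1)
    area = float(size * size)           # 400 for a 20 x 20 window
    mu = rect_sum(ii) / area
    var = rect_sum(sqr_ii) / area - mu * mu
    return mu, np.sqrt(max(var, 0.0))
```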
Step 2.3.2 Extract the microstructure features of this sub-window with the method introduced in training-stage step 3.2, and apply the feature normalization;
Step 2.3.3 Judge the sub-window with the trained multilayer face/non-face strong classifiers: if it passes the judgment of every strong-classifier layer, the sub-window is considered to contain a face and its position is output; otherwise the sub-window is discarded and receives no further processing;
With the above steps, every face in the input image can be detected; an end-to-end sketch follows.
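Tying the detection stage together, here is a hedged end-to-end sketch reusing the helpers above. For brevity it normalizes raw window pixels in place of the microstructure-feature extraction and normalization of step 2.3.2, and it assumes each cascade layer is a callable returning True/False; both are simplifications, not the patent's exact pipeline.

```python
def detect_faces(img, cascade, q=1.25, win=20):
    """Scan every 20 x 20 sub-window of all 13 pyramid levels through the
    multilayer cascade; a window must pass every layer to be reported.
    Returns (x, y, side) boxes in original-image coordinates."""
    hits = []
    for level, scaled in enumerate(image_pyramid(img, q=q, levels=13)):
        ii, sqr_ii = integral_images(scaled)
        h, w = scaled.shape
        for y0 in range(h - win + 1):
            for x0 in range(w - win + 1):
                mu, sigma = window_mean_std(ii, sqr_ii, x0, y0, win)
                window = (scaled[y0:y0 + win, x0:x0 + win] - mu) / max(sigma, 1e-6)
                if all(layer(window) for layer in cascade):
                    s = q ** level      # map back to the original scale
                    hits.append((int(x0 * s), int(y0 * s), int(win * s)))
    return hits
```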
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005100862485A CN100336070C (en) | 2005-08-19 | 2005-08-19 | Method of robust human face detection in complicated background image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005100862485A CN100336070C (en) | 2005-08-19 | 2005-08-19 | Method of robust human face detection in complicated background image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1731417A CN1731417A (en) | 2006-02-08 |
CN100336070C true CN100336070C (en) | 2007-09-05 |
Family
ID=35963765
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2005100862485A Expired - Fee Related CN100336070C (en) | 2005-08-19 | 2005-08-19 | Method of robust human face detection in complicated background image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100336070C (en) |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100389429C (en) * | 2006-06-01 | 2008-05-21 | 北京中星微电子有限公司 | AdaBoost based characteristic extracting method for pattern recognition |
CN101196984B (en) * | 2006-12-18 | 2010-05-19 | 北京海鑫科金高科技股份有限公司 | Fast face detecting method |
JP5058681B2 (en) * | 2007-05-31 | 2012-10-24 | キヤノン株式会社 | Information processing method and apparatus, program, and storage medium |
CN101315670B (en) | 2007-06-01 | 2010-08-11 | 清华大学 | Specific shot body detection device, learning device and method thereof |
WO2008151470A1 (en) * | 2007-06-15 | 2008-12-18 | Tsinghua University | A robust human face detecting method in complicated background image |
CN101406390B (en) * | 2007-10-10 | 2012-07-18 | 三星电子株式会社 | Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects |
CN101196990B (en) * | 2007-10-11 | 2011-04-27 | 北京海鑫科金高科技股份有限公司 | Network built-in type multiplex face detecting system and method thereof |
CN101178770B (en) * | 2007-12-11 | 2011-02-16 | 北京中星微电子有限公司 | Image detection method and apparatus |
CN101655914B (en) * | 2008-08-18 | 2014-10-22 | 索尼(中国)有限公司 | Training device, training method and detection method |
CN101350063B (en) * | 2008-09-03 | 2011-12-28 | 北京中星微电子有限公司 | Method and apparatus for locating human face characteristic point |
CN101360246B (en) * | 2008-09-09 | 2010-06-02 | 西南交通大学 | Video error masking method combined with 3D human face model |
CN101751551B (en) * | 2008-12-05 | 2013-03-20 | 比亚迪股份有限公司 | Method, device, system and device for identifying face based on image |
CN101447023B (en) * | 2008-12-23 | 2013-03-27 | 北京中星微电子有限公司 | Method and system for detecting human head |
CN101872477B (en) | 2009-04-24 | 2014-07-16 | 索尼株式会社 | Method and device for detecting object in image and system containing device |
CN102024149B (en) * | 2009-09-18 | 2014-02-05 | 北京中星微电子有限公司 | Method of object detection and training method of classifier in hierarchical object detector |
JP2011198268A (en) * | 2010-03-23 | 2011-10-06 | Sony Corp | Information processing apparatus, method, and program |
CN102004904B (en) * | 2010-11-17 | 2013-06-19 | 东软集团股份有限公司 | Automatic teller machine-based safe monitoring device and method and automatic teller machine |
CN102136075B (en) * | 2011-03-04 | 2013-05-15 | 杭州海康威视数字技术股份有限公司 | Multiple-viewing-angle human face detecting method and device thereof under complex scene |
CN102170563A (en) * | 2011-03-24 | 2011-08-31 | 杭州海康威视软件有限公司 | Intelligent person capture system and person monitoring management method |
CN102436578B (en) * | 2012-01-16 | 2014-06-04 | 宁波江丰生物信息技术有限公司 | Formation method for dog face characteristic detector as well as dog face detection method and device |
CN102693417A (en) * | 2012-05-16 | 2012-09-26 | 清华大学 | Method for collecting and optimizing face image sample based on heterogeneous active visual network |
CN104318049A (en) * | 2014-10-30 | 2015-01-28 | 济南大学 | Coronal mass ejection event identification method |
CN106295668A (en) * | 2015-05-29 | 2017-01-04 | 中云智慧(北京)科技有限公司 | Robust gun detection method |
CN105205453B (en) * | 2015-08-28 | 2019-01-08 | 中国科学院自动化研究所 | Human eye detection and localization method based on depth self-encoding encoder |
CN105488456B (en) * | 2015-11-23 | 2019-04-23 | 中国科学院自动化研究所 | Method for detecting human face based on adaptive threshold adjustment rejection sub-space learning |
CN105488472B (en) * | 2015-11-30 | 2019-04-09 | 华南理工大学 | A kind of digital cosmetic method based on sample form |
CN106725341A (en) * | 2017-01-09 | 2017-05-31 | 燕山大学 | A kind of enhanced lingual diagnosis system |
CN106874867A (en) * | 2017-02-14 | 2017-06-20 | 江苏科技大学 | A kind of face self-adapting detecting and tracking for merging the colour of skin and profile screening |
CN107784263B (en) * | 2017-04-28 | 2021-03-30 | 新疆大学 | Planar rotation face detection method based on improved accelerated robust features |
CN109961455B (en) | 2017-12-22 | 2022-03-04 | 杭州萤石软件有限公司 | Target detection method and device |
CN108108724B (en) * | 2018-01-19 | 2020-05-08 | 浙江工商大学 | Vehicle detector training method based on multi-subregion image feature automatic learning |
CN108537143B (en) * | 2018-03-21 | 2019-02-15 | 光控特斯联(上海)信息科技有限公司 | A kind of face identification method and system based on key area aspect ratio pair |
CN109344868B (en) * | 2018-08-28 | 2021-11-16 | 广东奥普特科技股份有限公司 | General method for distinguishing different types of objects which are mutually axisymmetric |
CN110502992B (en) * | 2019-07-18 | 2021-06-15 | 武汉科技大学 | Relation graph based fast face recognition method for fixed scene video |
CN110674690B (en) * | 2019-08-21 | 2022-06-14 | 成都华为技术有限公司 | Detection method, detection device and detection equipment |
CN113822105B (en) * | 2020-07-07 | 2024-04-19 | 湖北亿立能科技股份有限公司 | Artificial intelligence water level monitoring system based on online two classifiers of SVM water scale |
CN113283378B (en) * | 2021-06-10 | 2022-09-27 | 合肥工业大学 | Pig face detection method based on trapezoidal region normalized pixel difference characteristics |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5781650A (en) * | 1994-02-18 | 1998-07-14 | University Of Central Florida | Automatic feature detection and age classification of human faces in digital images |
JPH09101579A (en) * | 1995-10-05 | 1997-04-15 | Fuji Photo Film Co Ltd | Face area extraction method and copying condition determination method |
CN1508752A (en) * | 2002-12-13 | 2004-06-30 | 佳能株式会社 | Image processing method and apparatus |
Non-Patent Citations (2)
Title |
---|
Robust precise eye location under probabilistic framework. Yong Ma, Xiaoqing Ding, Zhenger Wang, Ning Wang. IEEE FGR'04, 2004 *
Face detection based on hierarchical support vector machines. Yong Ma, Xiaoqing Ding. Journal of Tsinghua University (Science and Technology), Vol. 43, No. 1, 2003 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010072153A1 (en) * | 2008-12-25 | 2010-07-01 | 南京壹进制信息技术有限公司 | Computer intelligent energy saving method |
Also Published As
Publication number | Publication date |
---|---|
CN1731417A (en) | 2006-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100336070C (en) | Method of robust human face detection in complicated background image | |
CN1214340C (en) | Multi-neural net imaging appts. and method | |
CN1794266A (en) | Biocharacteristics fusioned identity distinguishing and identification method | |
CN100336071C (en) | Method of robust accurate eye positioning in complicated background image | |
CN1811793A (en) | Automatic positioning method for characteristic point of human faces | |
CN101996405B (en) | Method and device for rapidly detecting and classifying defects of glass image | |
CN100345165C (en) | Method and apparatus for image-based photorealistic 3D face modeling | |
He et al. | Real-time human face detection in color image | |
CN1828632A (en) | Object detection apparatus, learning apparatus, object detection system, object detection method | |
CN1552041A (en) | Face meta-data creation and face similarity calculation | |
CN1664846A (en) | On-line hand-written Chinese characters recognition method based on statistic structural features | |
CN110781829A (en) | Light-weight deep learning intelligent business hall face recognition method | |
CN1924897A (en) | Image processing apparatus and method and program | |
CN1973757A (en) | Computerized disease sign analysis system based on tongue picture characteristics | |
CN1818927A (en) | Fingerprint identifying method and system | |
CN1885310A (en) | Human face model training module and method, human face real-time certification system and method | |
CN1695164A (en) | A method for generating a quality oriented signficance map for assessing the quality of an image or video | |
CN1945602A (en) | Characteristic selecting method based on artificial nerve network | |
CN1200387C (en) | Statistic handwriting identification and verification method based on separate character | |
CN1267849C (en) | Finger print identifying method based on broken fingerprint detection | |
CN1305002C (en) | Multiple registered fingerprint fusing method | |
WO2008151471A1 (en) | A robust precise eye positioning method in complicated background image | |
JP4795864B2 (en) | Feature point detection apparatus and method, and program | |
Babu et al. | Handwritten digit recognition using structural, statistical features and k-nearest neighbor classifier | |
CN1129331A (en) | Character recognition method and apparatus based on 0-1 pattern representation of histogram of character image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20070905; Termination date: 20190819 |