CN103455826A - Efficient matching kernel body detection method based on rapid robustness characteristics - Google Patents
- Publication number: CN103455826A
- Application number: CN201310405276A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention provides an efficient match kernel (EMK) human body detection method based on fast robust features. The method mainly addresses the problem that traditional methods cope poorly with cluttered image backgrounds or uneven illumination. The method comprises: a first step of selecting training sample set images; a second step of extracting SURF feature points from the images; a third step of constructing an initial basis vector for each layer; a fourth step of obtaining the maximum kernel function feature of each sampling layer; a fifth step of obtaining the efficient match kernel feature of the image; a sixth step of classification training; a seventh step of scanning the input image; an eighth step of detecting the scanning windows; and a ninth step of outputting the detection result. Local image information is extracted layer by layer for feature learning, the features are mapped to a low-dimensional space and aggregated into a feature set, and a linear classifier is then trained on the feature set to obtain a human body detection classifier. The method can accurately detect human body information in natural images in the field of image processing.
Description
Technical field
The invention belongs to the technical field of image processing, and further relates to an efficient match kernel (Efficient Match Kernel, EMK) human body detection method based on fast robust features, in the field of static human body detection. The invention can be used to detect human body information in still images, so as to identify human body targets.
Background art
Human body detection is the process of locating human body information in natural images. Owing to its application value in fields such as intelligent surveillance, driver assistance systems, human motion capture, and pornographic image filtering, it has become a key technique in computer vision in recent years. However, the diversity of human postures, cluttered backgrounds, clothing texture, illumination conditions, self-occlusion, and other factors make human body detection a very difficult problem. At present, human body detection methods for still images fall into two main classes: methods based on human body models and methods based on learning.
The first class comprises human body detection methods based on human body models. Such a method requires no training database; instead, an explicit human body model is defined, and human bodies are recognized according to the relations between the body parts constructed by the model.
Beijing Jiaotong University discloses a model-based detection method in its patent application "A human body detection method" (application number CN201010218630.8, publication number CN101908150A). The method builds a human detection template with a certain degree of fuzziness from human samples of multiple body shapes and postures, and uses it to determine candidate human body regions. The method handles occlusion well and can infer the human posture, improving the efficiency and precision of detection. Its remaining shortcoming is that the matching algorithm is rather complex and computationally expensive, so good detection results are hard to obtain against complex backgrounds.
The second class comprises human body detection methods based on learning. Such a method obtains a classifier from a series of training data by machine learning, and then uses this classifier to classify and identify input windows.
Harbin Engineering University discloses a human body detection method combining multi-scale histogram-of-oriented-gradients (HOG) features and a head color histogram feature in its patent application "Real-time human body detection method based on the AdaBoost framework and head color" (application number CN201110104892.6, publication number CN102163281A). By combining feature templates when extracting HOG features, the method adds head-feature discrimination and improves the detection rate over classical methods; it works particularly well on images with little background change. Its remaining shortcoming is that the detection result is disturbed when the background is cluttered or the illumination is uneven.
Summary of the invention
The object of the invention is to overcome the shortcomings of the above prior art by proposing an efficient match kernel human body detection method based on fast robust features. Adopting a learning-based approach, the method extracts local image information layer by layer, performs dictionary learning to map the features to a low-dimensional space, aggregates them into a feature set, trains a linear classifier on the feature set to obtain a human body detection classifier, and then uses this classifier to detect human bodies in images.
To achieve the above object, the invention comprises two processes: obtaining the detection classifier, and using the obtained classifier to detect images. The specific implementation steps are as follows:
First process: the steps for obtaining the detection classifier are as follows:
(1) Select the training sample set images:
1a) Using the bootstrapping operation, obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database;
1b) Combine the obtained negative sample images with the negative sample set of the INRIA database to form a new negative sample set;
1c) Combine the new negative sample set with the positive sample set of the INRIA database to form the human body training sample set.
(2) Extract the SURF feature points of the images:
2a) Divide every image of the human body training sample set into grids of 8×8 pixels; sample each grid at scales of 16, 25, and 36 pixels, each sampling scale forming one sampling layer;
2b) For each 8×8 pixel grid, after sampling each layer, compute the sum of squares of the horizontal and vertical gradients of the sampled points in the grid, and take the sampled point with the maximum gradient sum of squares as the fast robust feature (SURF) point of this grid on that sampling layer;
2c) For every image of the human body training sample set, randomly choose 15 feature points from the SURF feature points of all grids on each sampling layer, as the SURF feature points of that image on that sampling layer.
(3) Construct the initial basis vector of each layer:
Using the k-means clustering method, cluster the SURF feature points of all images of the human body training sample set on each sampling layer into 450 cluster centers, obtaining a 450-dimensional visual vocabulary of the whole training set on that sampling layer, which forms the initial basis vector of the sampling layer.
(4) Obtain the maximum kernel function feature of each sampling layer:
For the initial basis vector of each sampling layer, perform dictionary learning with the constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sampling layer.
(5) Obtain the efficient match kernel feature of the image:
5a) For each sampling layer, sort the element values of the layer's maximum kernel function feature in descending order and check whether exactly one element attains the maximum value; if so, output the maximum kernel function feature as the feature vector of the layer; otherwise, set the elements equal to the maximum value to zero and output the zeroed feature as the feature vector of the layer;
5b) Compute a weighted sum of the feature vectors of all sampling layers to obtain the all-layer feature, and store it;
5c) Average each row element of the all-layer feature vectors, accumulate the counts of the averages on the horizontal axis to obtain the distribution of the row means, and select the features whose mean distribution is approximately Gaussian as the final efficient match kernel feature of the fast robust characteristics of the image.
(6) Classification training:
Train a support vector machine (SVM) classifier on the extracted efficient match kernel features to obtain the detection classifier.
Second process: the steps for detecting an image with the obtained detection classifier are as follows:
(7) Scan the input image:
Input an image to be detected, scan the whole image with the window scanning method to obtain a group of scanning window images, and input this group of scanning window images to the detection classifier.
(8) Detect the scanning windows:
8a) Use the detection classifier to judge whether each input scanning window image contains human body information; if none does, the detected image is classified as a non-human natural image; otherwise, among all scanning window images judged to contain human body information, find the one with the highest classifier score as the main window image;
8b) Among the remaining scanning window images containing human body information, take those whose overlap with the main window image exceeds 50% and merge them with the main window image by the window combination operation; save the combined window as one detection result and delete all images that participated in the combination;
8c) Judge whether any scanning window images with human body information remain; if so, find the remaining image with the highest classifier score as the new main window image and go to step 8b); otherwise, go to step (9).
(9) Output the detection result:
Mark all windows obtained by window combination on the detected image and output the marked image as the human body detection result of the detected image.
Compared with the prior art, the present invention has the following advantages:
First, the invention adopts fast robust features in the feature extraction stage of human body detection. The fast robust features collect statistics over local image regions by computing local gradient changes, forming a statistical representation of the whole image. This avoids the fuzzy edge-based representations and the contour-based representations of the prior art, so the invention obtains better detection results on images with cluttered backgrounds or uneven illumination.
Second, the invention extracts features from the image layer by layer, effectively exploiting the feature point information at different scales and avoiding the local matching errors caused by too small a scale in the prior art, so the invention obtains better detection results.
Third, the invention uses dictionary learning to map the extracted image features to a low-dimensional space and aggregate them into a feature set. Compared with the prior art, this reduces the feature dimensionality and effectively reduces the computation time and the amount of data involved.
Brief description of the drawings
Fig. 1 is the flow chart of the invention;
Fig. 2 shows sample images used in the invention;
Fig. 3 compares the classification performance of the classifier of the invention with that of the HOG-feature-based human body detection method;
Fig. 4 shows simulation results of the method of the invention and the HOG-feature-based method on images with uneven illumination;
Fig. 5 shows simulation results of the method of the invention and the HOG-feature-based method on images with complex backgrounds.
Embodiment
The invention is further described below with reference to the accompanying drawings.
With reference to Fig. 1, the specific steps of the invention are as follows:
Step 1, select the training sample set images.
Utilize the bootstrapping operation to obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database.
The specific steps of the bootstrapping operation are as follows:
The first step, randomly choose m positive sample images and n negative sample images from the INRIA database, where 100≤m≤500, 100≤n≤800, and n≤m≤3n. Extract histogram-of-oriented-gradients (HOG) features from all chosen positive and negative sample images, and train a support vector machine (SVM) classifier on the extracted features to obtain a preliminary classifier.
The second step, repeatedly choose at random non-human natural images from the INRIA database. Scan each whole image with a window of the sample image size, moving 8 pixels at a time from left to right and 16 pixels at a time from top to bottom. Input the image in every scanning window to the preliminary classifier for detection; save the scanning window images the classifier misclassifies, and stop choosing non-human natural images once the number of misclassified scanning window images reaches a, where 200≤a≤500. From the misclassified scanning window images, randomly choose b images, where a/5≤b≤a/3, and combine them with the current negative sample images to form a new negative sample set.
The third step, extract HOG features from the m randomly chosen positive sample images and the new negative sample set, train the classifier, detect non-human natural images, and update the negative sample set.
The fourth step, repeat the third step until the final updated training sample set consists of 2416 positive sample images and 13500 negative sample images, each of size 128 × 64 pixels.
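The bootstrapping (hard-negative mining) loop above can be sketched as follows. This is an illustrative sketch only, with hypothetical helper names (`classifier`, `windows_of`): the classifier and window scanner are stand-ins for the preliminary HOG/SVM classifier and the window scanning method described in the patent.

```python
import random

def bootstrap_negatives(classifier, nonhuman_images, windows_of,
                        a=200, keep_fraction=0.25, seed=0):
    """Scan non-human images with a preliminary classifier and keep a
    random subset of its false alarms as new hard negatives.
    `classifier(win)` returns True when it (wrongly) reports a human;
    `windows_of(img)` yields the scanning windows of one image."""
    rng = random.Random(seed)
    false_alarms = []
    for img in nonhuman_images:
        for win in windows_of(img):
            if classifier(win):           # every hit here is a mistake
                false_alarms.append(win)
            if len(false_alarms) >= a:    # stop once `a` hard negatives found
                b = max(1, int(len(false_alarms) * keep_fraction))
                return rng.sample(false_alarms, b)   # keep roughly a/5..a/3
    b = max(1, int(len(false_alarms) * keep_fraction))
    return rng.sample(false_alarms, b) if false_alarms else []
```

The kept windows would then be appended to the negative set and the classifier retrained, repeating until the target sample counts are reached.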
Combine the obtained negative sample images with the negative sample set of the INRIA database to form a new negative sample set.
Combine the new negative sample set with the positive sample set of the INRIA database to form the human body training sample set.
In the embodiment of the invention, the final human body training sample set consists of 2416 positive samples and 13500 negative samples, the test sample set consists of 1132 positive samples and 4050 negative samples, and the size of each sample image is 128 × 64 pixels.
Fig. 2 shows part of the sample images used in the invention: Fig. 2(a) shows some of the positive sample images and Fig. 2(b) some of the negative sample images.
Step 2, extract the SURF feature points of the images.
Divide every image of the human body training sample set into grids of 8×8 pixels; sample each grid at scales of 16, 25, and 36 pixels, each sampling scale forming one sampling layer.
For each 8×8 pixel grid, after sampling each layer, compute the sum of squares of the horizontal and vertical gradients of the sampled points in the grid, and take the sampled point with the maximum gradient sum of squares as the fast robust feature (SURF) point of this grid on that sampling layer.
For every image of the human body training sample set, randomly choose 15 feature points from the SURF feature points of all grids on each sampling layer, as the SURF feature points of that image on that sampling layer.
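The per-grid "strongest gradient point" selection can be sketched as follows. This is a minimal illustration, not the patented SURF descriptor: for brevity the per-scale resampling is omitted and all layers share the full-resolution gradients, so the scale only labels the layer.

```python
import numpy as np

def layer_feature_points(image, cell=8, scales=(16, 25, 36)):
    """For each 8x8 cell, pick the pixel whose squared horizontal plus
    vertical gradient is largest; one point list per sampling layer."""
    gy, gx = np.gradient(image.astype(float))   # vertical, horizontal gradients
    energy = gx ** 2 + gy ** 2                  # gradient sum of squares
    h, w = image.shape
    layers = {}
    for s in scales:                            # one sampling layer per scale
        points = []
        for y in range(0, h - cell + 1, cell):
            for x in range(0, w - cell + 1, cell):
                block = energy[y:y + cell, x:x + cell]
                dy, dx = np.unravel_index(np.argmax(block), block.shape)
                points.append((y + dy, x + dx)) # strongest point of the cell
        layers[s] = points
    return layers
```

In the patent, 15 of these points per layer per image would then be drawn at random as that image's SURF feature points.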
Step 3, construct the initial basis vector of each layer.
Using the k-means clustering method, cluster all SURF feature points of all training sample images on each sampling layer into 450 cluster centers, obtaining a 450-dimensional visual vocabulary of the whole training sample set on that sampling layer, which forms the initial basis vector of the sampling layer.
The specific steps of the k-means clustering method are as follows:
The first step, for each sampling layer, randomly choose 450 SURF feature points from the SURF feature points of all sample images of the human body training sample set on that layer as the initial cluster centers of the layer, and take the data value of each initial cluster center as its cluster center value.
The second step, compute the Euclidean distance from every SURF feature point of the human body training sample set on the layer to each cluster center.
The third step, assign each SURF feature point of the layer to the class of its nearest cluster center.
The fourth step, judge whether the data mean of the SURF feature points of each class equals the cluster center value; if so, go to the fifth step; otherwise, take the data mean of each class's feature points as the new cluster center value and return to the second step.
The fifth step, save the 450 cluster center values and form them into a column vector, which is output as the initial basis vector of the whole human body training sample set on the sampling layer.
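The five steps above are Lloyd's k-means algorithm. A minimal sketch (illustrative, not the patented implementation; a small k is used here in place of 450):

```python
import numpy as np

def kmeans_vocabulary(points, k, iters=20, seed=0):
    """Cluster one layer's feature points with k-means; the k cluster
    centers, stacked, form the layer's initial basis."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    centers = pts[rng.choice(len(pts), size=k, replace=False)]  # step 1
    for _ in range(iters):
        # steps 2-3: assign each point to its nearest center (Euclidean)
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # step 4: recompute each class mean (keep old center if class empty)
        new = np.array([pts[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):    # means equal the centers: converged
            break
        centers = new
    return centers                       # step 5: the visual vocabulary
```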
Step 4, obtain the maximum kernel function feature of each sampling layer.
For the initial basis vector of each sampling layer, perform dictionary learning with the constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sampling layer.
The specific steps of the dictionary learning operation are as follows:
The first step, for each sampling layer, project the initial basis vector of the layer onto a 450-dimensional space, computing the projection of the initial basis vector by the following formula:

R = R_1 × [v_1, ..., v_j, ..., v_N]

where R denotes the projection of the initial basis vector of the sampling layer, R_1 denotes the initial basis vector of the sampling layer, v_j denotes the vector of projection coefficients of the j-th feature point extracted on the layer from all sample images of the human body training sample set, v_j = [v_1j, v_2j, ..., v_ij, ..., v_Mj]^T, v_ij denotes the projection coefficient of the j-th feature point extracted from the i-th sample image on the layer, M denotes the number of sample images of the human body training sample set, j = 1, 2, ..., N, and N denotes the number of feature points randomly chosen on the layer from each sample image of the training set.
The second step, construct an approximating function by the following formula, which approximates the projection of the initial basis vector of the sampling layer on the projection space:

f(r) = argmin ||r − R||

where r denotes the maximum kernel function feature on the sampling layer, R denotes the projection of the initial basis vector R_1 of the layer, ||·|| denotes the 2-norm, and argmin denotes minimization.
Substituting R = R_1 × [v_1, ..., v_j, ..., v_N] into the above formula and expanding r as r = [r_1, ..., r_j, ..., r_N] yields the 2-norm approximating function f(v, r) of the maximum kernel function feature r to the initial basis vector R_1:

f(v, r) = argmin Σ_{j=1..N} ||r_j − R_1 × v_j||²

where v denotes the low-dimensional projection coefficient vectors of all feature points extracted from all sample images of the human body training sample set, v = [v_1, ..., v_j, ..., v_N], v_j denotes the vector of projection coefficients of the j-th feature point extracted on the layer from all sample images, N denotes the number of feature points randomly chosen on the layer from each sample image, r_j denotes the maximum kernel feature vector of the j-th feature point extracted from all sample images, and R_1 denotes the initial basis vector of the sampling layer.
The third step, use the stochastic gradient descent method to solve the approximating function, iteratively updating the maximum kernel function feature on the sampling layer by the following formula, to form the low-dimensional image feature representation:

r(k+1) = r(k) − η × d/dr [ Σ_{j=1..N} ||r_j(k) − R_1 × v_j||² ]

where r(k+1) denotes the maximum kernel function feature on the sampling layer obtained at the (k+1)-th iteration, k denotes the iteration count, r(k) denotes the maximum kernel function feature obtained at the k-th iteration, η denotes the learning rate, a constant, d/dr[·] denotes the derivative of the bracketed formula with respect to r, r_j denotes the maximum kernel feature vector of the j-th feature point extracted on the layer from all sample images of the human body training sample set, R_1 denotes the initial basis vector on the layer, and R_1^T denotes the transpose of R_1. The number of iterations is set to 1000, and the r(1000) obtained after the iterations complete is taken as the final maximum kernel function feature on the sampling layer, j = 1, 2, ..., N, where N denotes the number of feature points randomly chosen on the layer from each sample image of the training set.
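The iterative update can be sketched as follows, under one plain reading of the approximating function as a least-squares reconstruction objective. This is an assumption-laden illustration: the scalars `c` stand in for the patent's per-feature-point projection coefficients, and the constrained CKSVD update itself is not reproduced.

```python
import numpy as np

def learn_max_kernel_feature(R1, c, eta=0.05, iters=1000, seed=0):
    """Gradient descent on f(r) = sum_j ||r_j - R1 * c_j||^2.
    R1: (d,) initial basis vector; c: (N,) illustrative projection
    coefficients, one per feature point. Returns r as a (d, N) array."""
    rng = np.random.default_rng(seed)
    target = np.outer(R1, c)            # column j is R1 * c_j
    r = rng.normal(size=target.shape)   # random start for the kernel feature
    for _ in range(iters):
        grad = 2.0 * (r - target)       # d f / d r, term by term
        r = r - eta * grad              # fixed learning rate, as in the patent
    return r
```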
Step 5, obtain the efficient match kernel feature of the image.
For each sampling layer, sort the element values of the layer's maximum kernel function feature in descending order and check whether exactly one element attains the maximum value; if so, output the maximum kernel function feature as the feature vector of the layer; otherwise, set the elements equal to the maximum value to zero and output the zeroed feature as the feature vector of the layer.
Compute a weighted sum of the feature vectors of all sampling layers to obtain the all-layer feature, and store it.
The weighted sum is computed as follows:

G* = Σ_{i=1..3} G_i × A_i

where G* denotes the all-layer feature, G_i denotes the feature vector of the i-th sampling layer, i = 1, 2, 3, A_i denotes the weight of the i-th sampling layer, A_i = w_i = 1/p_i, and p_i denotes the pixel size of the sampling scale of the i-th layer, p_i ∈ {16, 25, 36}.
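The weighted sum is straightforward to sketch; the weights 1/16, 1/25, 1/36 give smaller sampling scales more influence:

```python
def combine_layer_features(layer_features, scales=(16, 25, 36)):
    """Weighted sum of per-layer feature vectors with weights
    A_i = 1/p_i, p_i the pixel size of the layer's sampling scale."""
    weights = [1.0 / p for p in scales]
    dim = len(layer_features[0])
    combined = [0.0] * dim
    for feat, w in zip(layer_features, weights):
        for k in range(dim):
            combined[k] += w * feat[k]   # G* = sum_i A_i * G_i
    return combined
```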
Average each row element of the all-layer feature vectors, accumulate the counts of the averages on the horizontal axis to obtain the distribution of the row means, and select the features whose mean distribution is approximately Gaussian as the final efficient match kernel feature of the fast robust characteristics of the image.
Step 6, classification training.
Train a support vector machine (SVM) classifier on the extracted efficient match kernel features to obtain the detection classifier.
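The linear classifier training can be sketched as follows; this uses sub-gradient descent on the hinge loss as a minimal stand-in for the SVM the patent uses, with illustrative hyperparameters (not from the patent).

```python
import numpy as np

def train_linear_classifier(X, y, lr=0.01, epochs=200, lam=0.01, seed=0):
    """Hinge-loss sub-gradient descent. X: (n, d) feature rows;
    y: labels in {-1, +1}. Returns the weight vector and bias."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                      # point inside the margin
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:
                w -= lr * lam * w               # regularisation only
    return w, b

def classifier_score(w, b, x):
    """Signed score; used as the 'classifier score' in step 8."""
    return float(x @ w + b)
```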
Step 7, scan the input image.
Input an image to be detected, scan the whole image with the window scanning method to obtain a group of scanning window images, and input this group of scanning window images to the detection classifier.
The specific steps of window scanning are as follows:
The first step, take the region of the sample image size (of the human body training sample set) at the upper-left corner of the input detected image as the first scanning window, make it the current scanning window, and save the current scanning window image.
The second step, translate the current scanning window 8 pixels to the right or 16 pixels downward on the detected image to obtain a new scanning window, replace the current scanning window with the new one, and save the current scanning window image.
The third step, continue moving the current scanning window in this way, replacing the current window with the moved one, until the whole detected image has been scanned; save all scanning window images.
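The window scanning steps above can be sketched as an enumeration of window corners; the 128 × 64 window matches the sample image size, and the 8- and 16-pixel strides match the patent's horizontal and vertical moving units.

```python
def scan_windows(img_h, img_w, win_h=128, win_w=64, step_x=8, step_y=16):
    """Generate the (top, left) corner of every scanning window,
    moving 8 pixels rightward and 16 pixels downward across the image."""
    corners = []
    for y in range(0, img_h - win_h + 1, step_y):    # top-to-bottom moves
        for x in range(0, img_w - win_w + 1, step_x):  # left-to-right moves
            corners.append((y, x))
    return corners
```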
Step 8, detect the scanning windows.
8a) Use the detection classifier to judge whether each input scanning window image contains human body information; if none does, the detected image is classified as a non-human natural image; otherwise, among all scanning window images judged to contain human body information, find the one with the highest classifier score as the main window image.
8b) Among the remaining scanning window images containing human body information, take those whose overlap with the main window image exceeds 50% and merge them with the main window image by the window combination operation; save the combined window as one detection result and delete all images that participated in the combination.
The specific steps of window combination are as follows:
The first step, number all images needing window combination sequentially from 1.
The second step, take the proportion of each image's classifier score in the sum of the classifier scores of all images needing window combination as the weight of that image's boundary in the weighting.
The third step, weight each boundary of the images needing window combination by the following formula:

X = Σ_{i=1..N} (m_i / A) × x_i

where X denotes the row or column pixel position, on the detected image, of the weighted window boundary, x_1, x_2, ..., x_N denote the row or column pixel positions of the boundaries of the images participating in the window combination, m_1, m_2, ..., m_N denote the classifier scores of the images participating in the window combination, N denotes the number of images participating in the window combination, A denotes the sum of the classifier scores of the participating images, A = Σ_{i=1..N} m_i, and m_i denotes the classifier score of the i-th participating image.
The fourth step, form a window from the weighted boundaries.
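The score-weighted boundary fusion can be sketched as follows; each output coordinate is X = Σ_i (m_i / A) x_i applied to one side of the boxes.

```python
def combine_windows(boxes, scores):
    """Fuse overlapping detections into one window whose boundaries are
    the classifier-score-weighted average of the member boundaries.
    boxes: list of (top, left, bottom, right); scores: classifier scores."""
    total = float(sum(scores))           # A = sum of classifier scores
    fused = [0.0, 0.0, 0.0, 0.0]
    for box, m in zip(boxes, scores):
        for k in range(4):
            fused[k] += (m / total) * box[k]   # weight by score share m_i / A
    return tuple(fused)
```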
8c) Judge whether any scanning window images with human body information remain; if so, find the remaining image with the highest classifier score as the new main window image and go to step 8b); otherwise, go to step 9.
Step 9, output the detection result.
Mark all windows obtained by window combination on the detected image and output the marked image as the human body detection result of the detected image.
The effect of the present invention can be further illustrated by the following simulations:
1. Simulation experiment conditions:
The simulation experiments of the invention were run on Matlab 2009a; the simulation environment is an HP workstation under the Windows framework. All positive and negative sample images required by the experiments are taken from the INRIA database: the training sample set comprises 2416 positive samples and 13500 negative samples, the test sample set comprises 1132 positive samples and 4050 negative samples, and the positive and negative sample images are 128 × 64 pixels. Fig. 2 shows part of the sample images used in the invention, where Fig. 2(a) shows some of the positive sample images and Fig. 2(b) some of the negative sample images.
2. Simulation contents and analysis of results:
Simulation 1:
The accuracy obtained after the classifier is trained is one of the important indicators of classifier performance. To obtain a better-performing classifier, a large number of experiments were done on the choice of two parameters when extracting sample image features: the number of sampling layers of the sample images and the projection dimension of the low-dimensional projection of the initial basis vectors. The classifier accuracies obtained with different numbers of sampling layers and different projection dimensions were compared; the comparison results are shown in Table 1.
As can be seen from Table 1, for the same projection dimension, the classifier accuracy obtained with 3 sampling layers is higher than that obtained with 2 sampling layers; and for the same number of sampling layers, a higher projection dimension does not necessarily give a higher classifier accuracy. The data show that 3 sampling layers with a 450-dimensional projection of the initial basis vectors gives the highest classifier accuracy and the best classification performance.
Simulation 2:
The present invention and the human body detection method based on the histogram of oriented gradients (HOG) feature were each used to extract features from the human body training sample set and train a classifier, and the performance of the resulting classifiers was compared. The classifier performance comparison is shown in Fig. 3, which evaluates classifier performance with the receiver operating characteristic (ROC) curve relating the true positive rate (TPR) to the false positive rate (FPR). The closer an ROC curve lies to the upper-left corner, the better the corresponding classifier.
In Fig. 3 the horizontal axis is the false positive rate (FPR) and the vertical axis is the true positive rate (TPR). The curve marked with squares is the ROC curve of the true positive rate versus false positive rate of the classifier of the present invention, and the curve marked with crosses is the ROC curve of the classifier of the HOG-based human body detection method. As can be seen from Fig. 3, the ROC curve obtained by the present invention lies closer to the upper-left corner than that obtained by the HOG-based method, showing that the classification performance of the present invention is better than that of the HOG-based human body detection method.
Simulation 3:
The present invention and the HOG-based human body detection method were used to perform human body detection on natural images from the INRIA database; the detection results are shown in Fig. 4 and Fig. 5.
Fig. 4 shows an image with uneven illumination. Fig. 4(a) shows the human body detection result of the present invention; the white boxes mark the windows obtained by merging after the classifier of the present invention detected human body information in the image. Fig. 4(b) shows the human body detection result of the HOG-based method, with the white boxes likewise marking its merged detection windows. As can be seen from Fig. 4, under uneven illumination the method of the present invention greatly reduces the false alarm rate compared with the HOG-based method and detects all human body information in the image to be detected more accurately.
Fig. 5 shows an image with a complex background and partially occluded persons. Fig. 5(a) shows the human body detection result of the present invention; the white boxes mark the windows obtained by merging after the classifier detected human body information. Fig. 5(b) shows the human body detection result of the HOG-based method, with the white boxes marking its merged detection windows. As can be seen from Fig. 5, with a complex background and partial occlusion the method of the present invention marks human body information more accurately, the window sizes obtained after window merging are more suitable than those of the HOG-based method, and the human body detection accuracy is higher.
In summary, the method of the present invention can detect human bodies under uneven illumination, complex backgrounds, and partial occlusion, which shows that it is well suited to human body detection in natural images.
Claims (7)
1. An efficient matching kernel human body detection method based on the speeded-up robust feature (SURF), comprising two processes: obtaining a detection classifier, and detecting an image with the obtained classifier, the specific implementation steps being as follows:
First process: the specific steps for obtaining the detection classifier are as follows:
(1) Select the training sample set images:
1a) use the bootstrapping operation to obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database;
1b) combine the obtained negative sample images with the negative sample set of the INRIA database to form a new negative sample set;
1c) combine the images of the new negative sample set with the positive sample set of the INRIA database to form the human body training sample set;
(2) Extract the SURF feature points of the images:
2a) divide every image in the human body training sample set into grids of 8×8 pixels, and sample each grid at scales of 16, 25, and 36 pixels respectively, the sampling of each grid at each scale forming one sample layer;
2b) for each 8×8 pixel grid, after sampling each layer compute the sum of the squares of the horizontal-direction and vertical-direction gradients of the sampled points in the grid, and take the sampled point with the maximum gradient sum of squares as the speeded-up robust feature (SURF) feature point of the pixel grid on that sample layer;
2c) for every image in the human body training sample set, randomly choose 15 feature points from the SURF feature points of all pixel grids on each sample layer as the SURF feature points of that image of the human body training sample set on that sample layer;
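Step 2b) keeps, per 8×8 grid and per sample layer, the sampled point whose squared horizontal gradient plus squared vertical gradient is largest. The sketch below illustrates that selection on a single grayscale layer with NumPy, treating every grid pixel as a sampled point for simplicity (the patent's per-scale sampling pattern is not reproduced here):

```python
import numpy as np

def strongest_point_per_grid(image, grid=8):
    """For each grid x grid block, return the coordinates of the pixel whose
    squared horizontal plus squared vertical gradient is largest (step 2b)."""
    gy, gx = np.gradient(image.astype(float))      # vertical, horizontal gradients
    strength = gx ** 2 + gy ** 2                   # gradient sum of squares
    h, w = image.shape
    points = []
    for r in range(0, h - grid + 1, grid):
        for c in range(0, w - grid + 1, grid):
            block = strength[r:r + grid, c:c + grid]
            dr, dc = np.unravel_index(np.argmax(block), block.shape)
            points.append((r + dr, c + dc))
    return points
```

On a 128×64 sample image this yields one candidate point per 8×8 grid, from which step 2c) would draw its 15 random feature points per layer.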
(3) Construct the initial vector basis of each layer:
Using the k-means clustering method, cluster the SURF feature points of all images of the human body training sample set on each sample layer, with 450 cluster centers, obtaining a 450-dimensional visual vocabulary of the whole human body training sample set on the sample layer; this forms the initial basis vectors of the sample layer;
(4) Obtain the maximum kernel function feature of each sample layer:
For the initial basis vectors of each sample layer, perform dictionary learning with the constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sample layer;
(5) Obtain the efficient matching kernel feature of the image:
5a) for each sample layer, sort the element values of the maximum kernel function feature of the layer in descending order and judge whether the number of elements attaining the maximum value is 1; if so, output the maximum kernel function feature as the feature vector of the sample layer; otherwise, set the elements of the maximum kernel function feature whose values equal the maximum to zero, and output the zeroed maximum kernel function feature as the feature vector of the sample layer;
5b) compute the weighted sum of the feature vectors of all sample layers to obtain the all-scale feature, and store the all-scale feature;
5c) average every row element of the all-scale feature vectors, accumulate the counts of the averages at the corresponding points on the horizontal axis to obtain the distribution of the element averages of all rows of the all-scale feature vectors, and select the features whose element-average distribution resembles a Gaussian distribution as the final efficient matching kernel feature of the SURF characteristics of the image;
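Steps 5a) and 5b) can be sketched directly: ties at the maximum element value are zeroed out before a layer's vector is emitted, and the per-layer vectors are then combined by a weighted sum. The layer weights are assumed inputs here, since the patent does not specify their values at this point:

```python
import numpy as np

def layer_feature(max_kernel_feature):
    """Step 5a): if the maximum element value is attained by exactly one
    element, keep the vector unchanged; otherwise zero out all elements
    tied at the maximum before outputting the layer feature vector."""
    v = np.asarray(max_kernel_feature, dtype=float).copy()
    tied = v == v.max()
    if tied.sum() > 1:
        v[tied] = 0.0
    return v

def all_scale_feature(layer_vectors, weights):
    """Step 5b): weighted sum of the per-layer feature vectors."""
    return sum(w * layer_feature(v) for v, w in zip(layer_vectors, weights))
```

For example, `layer_feature([1, 3, 3, 2])` zeroes both tied maxima, while `layer_feature([1, 3, 2])` is returned unchanged.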
(6) Classification training:
Perform classification training on the extracted efficient matching kernel features of the SURF characteristics with the support vector machine (SVM) classifier, obtaining the detection classifier;
Second process: the specific steps for detecting an image with the obtained detection classifier are as follows:
(7) Scan the input image:
Input an image to be detected, scan the whole image with the window scanning method to obtain a group of scanning-window images, and input this group of scanning-window images to the detection classifier;
(8) Detect the scanning windows:
8a) use the detection classifier to judge whether the input scanning-window images contain human body information; if none contains human body information, classify the image to be detected as a non-human natural image; otherwise, among all scanning-window images judged to contain human body information, find the one with the highest detection classifier score and take it as the main window image;
8b) among the remaining scanning-window images containing human body information other than the main window image, perform the window combination operation on the main window image and every scanning-window image whose overlap with the main window image exceeds 50%; save the window obtained by the combination as one detection result, and delete all images that participated in the combination;
8c) judge whether any scanning-window images containing human body information remain; if so, find the remaining scanning-window image with the highest detection classifier score, take it as the main window image, and execute step 8b); otherwise, execute step (9);
(9) Output the detection result:
Mark all windows obtained by window combination on the image to be detected, and output the marked image as the human body detection result of the image to be detected.
2. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the specific steps of the bootstrapping operation in step 1a) are as follows:
The first step: randomly choose m positive sample images and n negative sample images from the INRIA database, where 100 ≤ m ≤ 500, 100 ≤ n ≤ 800, and n ≤ m ≤ 3n; extract features from all chosen positive and negative sample images with the histogram of oriented gradients (HOG) feature extraction method, and perform classification training on the extracted features with the support vector machine (SVM) classifier to obtain a preliminary classifier;
The second step: repeatedly choose at random non-human natural images from the INRIA database; scan each whole image with a scanning window of the sample image size, moving 8 pixels at a time from left to right and 16 pixels at a time from top to bottom; input the images in all scanning windows to the preliminary classifier for detection and save the misclassified scanning-window images; stop choosing non-human natural images once the number of misclassified scanning-window images reaches a, where 200 ≤ a ≤ 500; randomly choose b images from the misclassified scanning-window images, where a/5 ≤ b ≤ a/3, and combine them with the current negative sample images to form a new negative sample set;
The third step: perform HOG feature extraction on the m randomly chosen positive sample images and the new negative sample set, train the classifier, detect non-human natural images, and update the negative sample set;
The fourth step: repeat the third step until the final updated training sample set consists of 2,416 positive sample images and 13,500 negative sample images, each of size 128 × 64 pixels.
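The bootstrapping of claim 2 is hard-negative mining: train a preliminary classifier, scan non-human images, harvest the false positives, and fold a random subset back into the negative set. A schematic sketch follows, with the SVM training and window scanning replaced by caller-supplied stand-in functions (`train` and `scan_windows` are hypothetical placeholders, not the patent's components):

```python
import random

def bootstrap_negatives(train, scan_windows, initial_negatives, a=200):
    """One round of the bootstrapping loop of claim 2: train a preliminary
    classifier on the current negatives, scan windows of non-human images,
    collect misclassified windows (false positives) until `a` accumulate,
    then fold a random subset of b of them, a/5 <= b <= a/3, back into
    the negative set."""
    negatives = list(initial_negatives)
    classifier = train(negatives)
    hard = []
    while len(hard) < a:
        for window in scan_windows():
            if classifier(window):          # predicted "human" on a non-human image
                hard.append(window)
                if len(hard) >= a:
                    break
    b = random.randint(a // 5, a // 3)      # 1/5*a <= b <= 1/3*a
    negatives.extend(random.sample(hard, b))
    return negatives
```

Repeating this round, as the fourth step prescribes, grows the negative set with exactly the windows the current classifier gets wrong.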
3. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the specific steps of the k-means clustering method of step (3) are as follows:
The first step: for each sample layer, randomly choose 450 SURF feature points from the SURF feature points, on that sample layer, of all sample images of the human body training sample set as the initial cluster centers of the layer, and take the data value of each of the 450 initial cluster centers as the cluster center value of that center;
The second step: compute the Euclidean distance from every SURF feature point of the human body training sample set on the sample layer to each cluster center;
The third step: assign each SURF feature point of the sample layer to the class of the cluster center nearest to it;
The fourth step: judge whether the data mean of the SURF feature points of each class after assignment equals the cluster center value; if so, execute the fifth step; otherwise, take the data mean of the feature points of each class as the new cluster center value and return to the second step;
The fifth step: save the 450 cluster center values, form a column vector from them, and output this column vector as the initial basis vector of the whole human body training sample set on the sample layer.
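The five steps of claim 3 are plain Lloyd-style k-means: random initial centers drawn from the feature points, nearest-center assignment by Euclidean distance, and mean updates until the centers stop moving. A compact NumPy sketch (k = 2 in the example below; the patent uses k = 450):

```python
import numpy as np

def kmeans(points, k, rng=None):
    """K-means as in claim 3: initial centers drawn at random from the
    feature points, assignment to the nearest center by Euclidean distance,
    and center updates until the class means stop changing."""
    rng = np.random.default_rng(rng)
    points = np.asarray(points, dtype=float)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    while True:
        # distance of every point to every center, then nearest-center labels
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([points[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            return new_centers, labels
        centers = new_centers
```

The returned centers, stacked into a column vector, would play the role of the layer's initial basis vector.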
4. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the specific steps of the dictionary learning of step (4) are as follows:
The first step: for each sample layer, project the initial basis vectors of the layer onto a 450-dimensional space, computing the projection of the initial basis vectors of the sample layer by the following formula:
R = R1 × [v1, …, vj, …, vN]
where R denotes the projection of the initial basis vectors of the sample layer, R1 denotes the initial basis vectors of the sample layer, vj denotes the vector of projection coefficients of the j-th feature point extracted on the sample layer from all sample images of the human body training sample set, vj = [v1j, v2j, …, vij, …, vMj]^T, vij denotes the projection coefficient of the j-th feature point extracted on the sample layer from the i-th sample image of the human body training sample set, M denotes the number of sample images in the human body training sample set, and j = 1, 2, …, N, where N denotes the number of feature points chosen at random on the sample layer from each sample image of the human body training sample set;
The second step: construct an approximating function by the following formula to approximate, on the projection space, the projection of the initial basis vectors of the sample layer:
f(r) = argmin ||r − R||
where r denotes the maximum kernel function feature on the sample layer, R denotes the projection of the initial basis vectors R1 of the sample layer, || · || denotes the 2-norm, and argmin || · || denotes minimization;
The third step: evaluate the approximating function and update the maximum kernel function feature on the sample layer iteratively by the following formula, forming the low-dimensional feature representation of the image:
r(k+1) = r(k) − η · ∂/∂r ||rj − R1 · r(k)||²
where r(k+1) denotes the maximum kernel function feature on the sample layer obtained at the (k+1)-th iteration, k denotes the iteration count, r(k) denotes the maximum kernel function feature on the sample layer obtained at the k-th iteration, η denotes the learning rate, a constant, ∂/∂r denotes the derivative of the bracketed expression with respect to r, rj denotes the maximum kernel feature vector of the j-th feature point extracted on the sample layer from all sample images of the human body training sample set, R1 denotes the initial basis vectors on the sample layer, and R1ᵀ, the transpose of R1, appears in the derivative; the iteration count is set to 1000, and the r(1000) obtained when iteration completes is taken as the final maximum kernel function feature on the sample layer, j = 1, 2, …, N, where N denotes the number of feature points chosen at random on each sample layer from each sample image of the human body training sample set.
5. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the weighted summation of step 5b) is performed as follows:
G* = Σi Gi × Ai
where Gi denotes the feature vector of the i-th sample layer and Ai denotes its weight.
6. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the specific steps of the window scanning method of step (7) are as follows:
The first step: take the region, of the size of a human body training sample set image, at the upper-left corner of the input image to be detected as the first scanning window; take this scanning window as the current scanning window and save the current scanning-window image;
The second step: translate the current scanning window 8 pixels to the right, or move it down 16 pixels, on the image to be detected to obtain a new scanning window; replace the current scanning window with the new scanning window and save the current scanning-window image;
The third step: continue moving the current scanning window in the above manner, replacing the current scanning window with the moved window, until the whole image to be detected has been scanned; save all scanning-window images.
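The window scanning of claim 6 is a standard sliding window with an 8-pixel horizontal and a 16-pixel vertical stride. A minimal generator sketch (the 128×64 default window size is taken from the sample image size stated elsewhere in the document):

```python
def scan_windows(img_h, img_w, win_h=128, win_w=64, dx=8, dy=16):
    """Window scanning of claim 6: start at the upper-left corner and slide a
    sample-sized window 8 pixels at a time to the right and 16 pixels at a
    time downward, yielding the top-left corner of every window."""
    for top in range(0, img_h - win_h + 1, dy):
        for left in range(0, img_w - win_w + 1, dx):
            yield top, left
```

Each yielded (top, left) pair identifies one scanning-window image to be cropped and passed to the detection classifier.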
7. The efficient matching kernel human body detection method based on SURF according to claim 1, characterized in that the specific steps of the window combination operation of step 8b) are as follows:
The first step: number all images that need window combination sequentially starting from 1;
The second step: for each image that needs window combination, take the proportion of its detection classifier score in the sum of the detection classifier scores of all images needing window combination as the weight for weighting its image boundary;
The third step: weight every boundary of the images that need window combination by the following formula:
X = (1/A) · Σi=1..N mi · xi,  with A = Σi=1..N mi
where X denotes the row or column pixel value, on the image to be detected, of the window boundary obtained after weighting; x1, x2, …, xN denote the row or column pixel values, on the image to be detected, of the boundaries of the images participating in the window combination; m1, m2, …, mN denote the detection classifier scores of the corresponding images participating in the window combination; N denotes the number of images participating in the window combination; A denotes the sum of the detection classifier scores of the images participating in the window combination; i denotes the index of an image in the window combination; and mi denotes the detection classifier score of the i-th image participating in the window combination;
The fourth step: form a window from the weighted boundaries.
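The boundary weighting of claim 7 makes each coordinate of the merged window the classifier-score-weighted average of the corresponding coordinates of the participating windows. A minimal sketch; the (left, top, right, bottom) tuple format is an illustrative assumption:

```python
def merge_windows(boundaries, scores):
    """Claim 7 weighting: each boundary coordinate of the merged window is
    the detection-classifier-score-weighted average of the corresponding
    boundaries of the participating windows, X = (1/A) * sum_i m_i * x_i
    with A = sum_i m_i.  `boundaries` holds (left, top, right, bottom)
    pixel coordinates; `scores` holds the classifier scores m_i."""
    A = sum(scores)
    return tuple(sum(m * b[k] for m, b in zip(scores, boundaries)) / A
                 for k in range(4))
```

With equal scores the merged window is simply the coordinate-wise mean of the participating windows; higher-scoring windows pull the boundary toward themselves.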
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310405276.3A CN103455826B (en) | 2013-09-08 | 2013-09-08 | Efficient matching kernel body detection method based on rapid robustness characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103455826A true CN103455826A (en) | 2013-12-18 |
CN103455826B CN103455826B (en) | 2017-02-08 |
Family
ID=49738167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310405276.3A Expired - Fee Related CN103455826B (en) | 2013-09-08 | 2013-09-08 | Efficient matching kernel body detection method based on rapid robustness characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103455826B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103985102A (en) * | 2014-05-29 | 2014-08-13 | 宇龙计算机通信科技(深圳)有限公司 | Image processing method and system |
CN104573646A (en) * | 2014-12-29 | 2015-04-29 | 长安大学 | Detection method and system, based on laser radar and binocular camera, for pedestrian in front of vehicle |
CN105139390A (en) * | 2015-08-14 | 2015-12-09 | 四川大学 | Image processing method for detecting pulmonary tuberculosis focus in chest X-ray DR film |
CN106462772A (en) * | 2014-02-19 | 2017-02-22 | 河谷控股Ip有限责任公司 | Invariant-based dimensional reduction of object recognition features, systems and methods |
CN107025436A (en) * | 2017-03-13 | 2017-08-08 | 西安电子科技大学 | A kind of self refresh human intrusion detection method based on confidence level |
CN107945145A (en) * | 2017-11-17 | 2018-04-20 | 西安电子科技大学 | Infrared image fusion Enhancement Method based on gradient confidence Variation Model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663369A (en) * | 2012-04-20 | 2012-09-12 | 西安电子科技大学 | Human motion tracking method on basis of SURF (Speed Up Robust Feature) high efficiency matching kernel |
CN102810159A (en) * | 2012-06-14 | 2012-12-05 | 西安电子科技大学 | Human body detecting method based on SURF (Speed Up Robust Feature) efficient matching kernel |
US20130004018A1 (en) * | 2011-06-29 | 2013-01-03 | Postech Academy - Industry Foundation | Method and apparatus for detecting object using volumetric feature vector and 3d haar-like filters |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20170208 |
|