CN103455826B - Efficient matching kernel body detection method based on rapid robustness characteristics - Google Patents
- Publication number
- CN103455826B (application number CN201310405276.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- sample
- window
- feature
- human body
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides an efficient match kernel (EMK) human body detection method based on speeded-up robust features (SURF). The method mainly addresses the poor performance of traditional methods on images with cluttered backgrounds or uneven illumination. It comprises nine steps: (1) selecting the training sample set images; (2) extracting the SURF feature points of the images; (3) constructing the initial basis vectors of each layer; (4) obtaining the maximum kernel function feature of each sampling layer; (5) obtaining the efficient match kernel feature of the image; (6) training the classifier; (7) scanning the input image; (8) detecting the scanning windows; and (9) outputting the detection result. The local information of the images is extracted layer by layer for feature learning, the features are mapped to a low-dimensional space and aggregated into a feature set, and a linear classifier is then trained on this feature set to obtain a human body detection classifier. The method can be used in the field of image processing to accurately detect human bodies in natural images.
Description
Technical field
The invention belongs to the technical field of image processing and further relates, within the field of static human body detection, to an efficient match kernel (Efficient Match Kernel, EMK) human body detection method based on speeded-up robust features (SURF). The invention can be used to detect human bodies in still images for the purpose of identifying human targets.
Background art
Human body detection is the process of determining the position of human bodies in natural images. In recent years it has become a key technology in computer vision because of its value in applications such as intelligent surveillance, driver assistance systems, human motion capture, and pornographic image filtering. However, the diversity of human poses, cluttered backgrounds, clothing texture, illumination conditions, and self-occlusion make human body detection an extremely difficult problem. At present, methods for human body detection in still images fall into two broad classes: methods based on a human body model and methods based on learning.
The first class is human body detection based on a human body model. Such a method needs no learning database. It has an explicit human body model and identifies the human body according to the relationship between the parts constructed by the model and the body as a whole.
The patent application "A human body detection method" filed by Beijing Jiaotong University (application number CN201010218630.8, publication number CN101908150A) discloses a detection method based on a human body model. The method builds human detection templates with a certain degree of fuzziness from human samples of multiple body shapes and postures in order to determine candidate human body regions. It handles occlusion fairly well, can infer the pose of the human body, and improves the efficiency and precision of human detection. Its remaining shortcoming is that the matching algorithm is relatively complicated and computationally expensive, so good detection results are hard to achieve against complex backgrounds.
The second class is human body detection based on learning. Such a method learns a classifier from a set of training data by machine learning and then uses the classifier to classify and identify input windows.
The patent application "Real-time human body detection method based on the AdaBoost framework and head color" filed by Harbin Engineering University (application number CN201110104892.6, publication number CN102163281A) discloses a human body detection method that combines multi-scale histogram of oriented gradients (HOG) features with head color histogram features. While extracting HOG features, the method incorporates feature templates and adds a discriminating function based on head features, which raises the detection rate compared with traditional methods. Feature recognition is particularly good for images whose background varies little. Its remaining shortcoming is that detection results are disturbed when the background is cluttered or the illumination is uneven.
Content of the invention
It is an object of the invention to overcoming the shortcomings of above-mentioned prior art, a kind of height based on fast robust feature is proposed
Effect coupling core human body detecting method, using the human body detecting method based on study, by the local message of Multi-layer technology image, so
After carry out dictionary learning by Feature Mapping to lower dimensional space, assemble feature set, using linear classifier, feature set instructed
Practice, obtain the grader of a human detection, recycle this grader to carry out human detection to image to be detected.
For achieving the above object, the present invention includes obtaining detecting grader and using the grader being obtained, image being carried out
Two processes of detection, implement step as follows:
First process: the specific steps for obtaining the detection classifier are as follows:
(1) Select the training sample set images:
1a) using a bootstrapping operation, obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database;
1b) combine the negative sample images obtained with the negative sample set in the INRIA database to form a new negative sample set;
1c) combine the new negative sample set with the positive sample set in the INRIA database to form the human body training sample set.
(2) Extract the SURF feature points of the images:
2a) divide each image in the human body training sample set into grids of 8×8 pixels and sample each grid at scales of 16, 25, and 36 pixels, each sampling scale forming one grid-like sampling layer;
2b) for each 8×8 pixel grid, compute, in every sampling layer, the sum of the squares of the horizontal and vertical gradients at each sampling point in the grid, and take the sampling point with the largest sum of squared gradients as the speeded-up robust feature (SURF) feature point of that pixel grid in that sampling layer;
2c) for each image in the human body training sample set, randomly select 15 feature points from the SURF feature points of all pixel grids in each sampling layer, and use them as the SURF feature points of that image in that sampling layer.
(3) Construct the initial basis vectors of each layer:
Using the k-means clustering method, cluster the SURF feature points of all images of the human body training sample set in each sampling layer to form 450 cluster centers, giving a 450-dimensional visual vocabulary of the whole human body training sample set in that sampling layer, which constitutes the initial basis vectors of the sampling layer.
(4) Obtain the maximum kernel function feature of each sampling layer:
For the initial basis vectors of each sampling layer, perform dictionary learning using constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sampling layer.
(5) Obtain the efficient match kernel feature of the image:
5a) for each sampling layer, sort the element values of the maximum kernel function feature of the sampling layer in descending order and determine whether exactly one element has the maximum value; if so, output the maximum kernel function feature of the sampling layer as the feature vector of the sampling layer; otherwise, set to zero the elements of the maximum kernel function feature whose values equal the maximum, and output the zeroed maximum kernel function feature as the feature vector of the sampling layer;
5b) compute the weighted sum of the feature vectors of all sampling layers to obtain the all-scale feature, and store the all-scale feature;
5c) average the elements of each row of the all-scale feature vector, accumulate on the horizontal axis the number of means falling at each mean value to obtain the distribution of the row means of the all-scale feature vector, and select the features whose row means follow an approximately Gaussian distribution as the final efficient match kernel feature of the speeded-up robust features of the image.
(6) Classifier training:
Train a support vector machine (SVM) classifier on the extracted efficient match kernel features of the speeded-up robust features to obtain the detection classifier.
Second process: the specific steps for detecting an image using the obtained detection classifier are as follows:
(7) Scan the input image:
Input an image under test, scan the whole image with the window scanning method to obtain a set of scanning window images, and input this set of scanning window images to the detection classifier.
(8) Detect the scanning windows:
8a) use the detection classifier to determine whether the input scanning window images contain a human body; if none does, classify the image under test as a non-human natural image; otherwise, among all scanning window images judged to contain a human body, take the one with the highest detection classifier score as the main window image;
8b) among the remaining scanning window images containing a human body, merge those that overlap the main window image by more than 50% with the main window image by the window merging operation, save the merged window as one detection result, and delete all images that took part in the merge;
8c) determine whether any scanning window images containing a human body remain; if so, take the remaining scanning window image with the highest detection classifier score as the main window image and perform step 8b); otherwise, perform step (9).
(9) Output the detection result:
Mark all merged windows on the image under test and output the marked image as the human body detection result of the image under test.
Compared with the prior art, the invention has the following advantages:
First, the invention uses speeded-up robust features in the feature extraction stage of human detection. These features compute statistics over image regions from the variation of local image gradients and so form features of the whole image with statistical properties. This avoids the ambiguous representations produced by the edge-based and contour-based image representations of the prior art, so the invention obtains better detection results on images with cluttered backgrounds and uneven illumination.
Second, the invention extracts features from the image layer by layer in the feature extraction stage, making effective use of feature point information at different scales and avoiding the local matching errors caused by an overly small scale in the prior art, so the invention obtains better detection results.
Third, the invention applies a dictionary learning method to the extracted image features to map them to a low-dimensional space and aggregate them into a feature set. Compared with the prior art, this reduces the dimensionality of the image features and effectively reduces the computation time and the amount of computation for the image features.
Brief description of the drawings
Fig. 1 is the flow chart of the invention;
Fig. 2 shows sample images used in the invention;
Fig. 3 compares the classification performance of the classifier of the invention with that of the human body detection method based on histogram of oriented gradients (HOG) features;
Fig. 4 shows simulated human detection results on an image with uneven illumination for the method of the invention and the HOG-based human body detection method;
Fig. 5 shows simulated human detection results on an image with a complex background for the method of the invention and the HOG-based human body detection method.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
With reference to Fig. 1, the specific steps of the invention are as follows:
Step 1: select the training sample set images.
Using a bootstrapping operation, obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database.
The specific steps of the bootstrapping operation are as follows:
First, randomly select m positive sample images and n negative sample images from the INRIA database, where 100 ≤ m ≤ 500, 100 ≤ n ≤ 800, and n ≤ m ≤ 3n. Extract features from all selected positive and negative sample images using the histogram of oriented gradients (HOG) feature extraction method, and train a support vector machine (SVM) classifier on the extracted features to obtain the initial classifier.
Second, repeatedly select non-human natural images at random from the INRIA database. Scan each whole image with a scanning window of the sample image size, moving 8 pixels at a time from left to right and 16 pixels at a time from top to bottom. Input the images in all scanning windows to the initial classifier and save the scanning window images that the classifier misclassifies. When the number of misclassified scanning window images reaches a, where 200 ≤ a ≤ 500, stop selecting non-human natural images. Randomly pick b of the misclassified scanning window images, where a/5 ≤ b ≤ a/3, and combine them with the current negative sample images to form the new negative sample set.
Third, for the m randomly selected positive sample images and the new negative sample set, perform HOG feature extraction, train the classifier, detect on non-human natural images, and update the negative sample set.
Fourth, repeat the third step until the updated final training sample set consists of 2416 positive sample images and 13500 negative sample images, each of size 128 × 64 pixels.
Combine the negative sample images obtained with the negative sample set in the INRIA database to form a new negative sample set.
Combine the new negative sample set with the positive sample set in the INRIA database to form the human body training sample set.
In the embodiment of the invention, the final training sample set consists of 2416 positive samples and 13500 negative samples, the test sample set consists of 1132 positive samples and 4050 negative samples, and all sample images are 128 × 64 pixels.
Fig. 2 shows some of the sample images used in the invention: Fig. 2(a) shows some of the positive sample images and Fig. 2(b) shows some of the negative sample images.
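The second step of the bootstrapping operation (collecting misclassified windows from non-human images) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name, the `is_false_positive` hook standing in for the HOG+SVM classifier, and the reading of 128 × 64 as a window 64 pixels wide and 128 pixels high are all assumptions introduced for illustration.

```python
import random

WIN_W, WIN_H = 64, 128   # assumed width x height of a 128 x 64 sample window
STEP_X, STEP_Y = 8, 16   # horizontal and vertical scan steps from the text


def mine_hard_negatives(image_sizes, is_false_positive, a, b, seed=0):
    """Collect up to `a` misclassified windows, then randomly keep `b` of them.

    image_sizes: list of (width, height) for non-human images.
    is_false_positive(index, left, top): hypothetical hook that runs the
    current classifier on one window and returns True if it fires.
    """
    if not a / 5 <= b <= a / 3:
        raise ValueError("b must satisfy a/5 <= b <= a/3")
    rng = random.Random(seed)
    order = list(range(len(image_sizes)))
    rng.shuffle(order)  # images are visited in random order
    mistakes = []
    for index in order:
        width, height = image_sizes[index]
        for top in range(0, height - WIN_H + 1, STEP_Y):
            for left in range(0, width - WIN_W + 1, STEP_X):
                if is_false_positive(index, left, top):
                    mistakes.append((index, left, top))
                    if len(mistakes) == a:
                        # Enough mistakes collected: stop scanning images.
                        return rng.sample(mistakes, b)
    # Fewer than `a` mistakes were found in all images.
    return rng.sample(mistakes, min(b, len(mistakes)))
```

The returned windows would then be added to the current negative set before retraining, as in the third step. The sketch does not enforce the range 200 ≤ a ≤ 500 so that it can be tried on small inputs.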
Step 2: extract the SURF feature points of the images.
Divide each image in the human body training sample set into grids of 8×8 pixels and sample each grid at scales of 16, 25, and 36 pixels; each sampling scale forms one grid-like sampling layer.
For each 8×8 pixel grid, compute, in every sampling layer, the sum of the squares of the horizontal and vertical gradients at each sampling point in the grid, and take the sampling point with the largest sum of squared gradients as the SURF feature point of that pixel grid in that sampling layer.
For each image in the human body training sample set, randomly select 15 feature points from the SURF feature points of all pixel grids in each sampling layer, and use them as the SURF feature points of that image in that sampling layer.
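The per-grid selection rule of Step 2 can be illustrated with a short Python sketch for a single sampling layer. The central-difference gradient, the function names, and the representation of the image as a list of rows of gray values are assumptions; the text does not specify the gradient operator, and the multi-scale sampling is left out here.

```python
import random


def gradient_energy(img, x, y):
    """Sum of squared horizontal and vertical gradients at pixel (x, y)."""
    h, w = len(img), len(img[0])
    # Central differences, clamped at the image border (an assumption).
    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
    return gx * gx + gy * gy


def strongest_point_per_cell(img, cell=8):
    """For every cell x cell grid, keep the point with the largest energy."""
    h, w = len(img), len(img[0])
    points = []
    for top in range(0, h - cell + 1, cell):
        for left in range(0, w - cell + 1, cell):
            candidates = (
                (x, y)
                for y in range(top, top + cell)
                for x in range(left, left + cell)
            )
            best = max(candidates, key=lambda p: gradient_energy(img, p[0], p[1]))
            points.append(best)
    return points


def pick_feature_points(img, count=15, seed=0):
    """Randomly keep `count` of the per-grid feature points of one image."""
    points = strongest_point_per_cell(img)
    return random.Random(seed).sample(points, min(count, len(points)))
```

For a 128 × 64 sample image this yields 128 candidate points per layer, from which 15 are drawn at random.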
Step 3: construct the initial basis vectors of each layer.
Using the k-means clustering method, cluster all SURF feature points of all training sample images in each sampling layer to form 450 cluster centers, giving a 450-dimensional visual vocabulary of the whole training sample set in that sampling layer, which constitutes the initial basis vectors of the sampling layer.
The specific steps of the k-means clustering method are as follows:
First, for each sampling layer, randomly choose 450 SURF feature points from the SURF feature points of all sample images of the human body training sample set in that sampling layer as the initial cluster centers of the layer, and take the data value of each of the 450 initial cluster centers as its cluster center value.
Second, compute the Euclidean distance from every SURF feature point of the human body training sample set in the sampling layer to each cluster center.
Third, assign each SURF feature point of the sampling layer to the class of the cluster center nearest to it.
Fourth, determine whether the mean of the SURF feature points of each class after assignment equals the cluster center value of that class; if so, perform the fifth step; otherwise, take the mean of the feature points of each class as the new cluster center value and return to the second step.
Fifth, save the 450 cluster center values, form a column vector from them, and output this column vector as the initial basis vectors of the whole human body training sample set in the sampling layer.
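The five k-means steps above can be sketched in plain Python as follows. The function name, the `max_iter` safety cap, and the handling of empty clusters (the old center is kept) are assumptions added for illustration; the text itself iterates until the class means equal the center values and uses k = 450.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (sequences of floats) into k centers."""
    rng = random.Random(seed)
    # Step 1: pick k feature points at random as the initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Steps 2 and 3: assign each point to its nearest center (Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Step 4: recompute each center as the mean of its members.
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centers.append(centers[c])  # assumption: keep empty centers
        if new_centers == centers:
            break  # means equal the centers: converged
        centers = new_centers
    # Step 5: the centers form the visual vocabulary of the layer.
    return centers
```

Calling `kmeans(descriptors, 450)` on the descriptors of one sampling layer would give that layer's 450 cluster centers.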
Step 4: obtain the maximum kernel function feature of each sampling layer.
For the initial basis vectors of each sampling layer, perform dictionary learning using constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sampling layer.
The steps of the CKSVD dictionary learning are as follows:
First, for each sampling layer, project the initial basis vectors of the sampling layer onto a 450-dimensional space. The projection vector of the initial basis vectors of the sampling layer is computed by the following formula:
$$R = R_1 \times [v_1, \ldots, v_j, \ldots, v_N]$$
where $R$ denotes the projection vector of the initial basis vectors of the sampling layer, $R_1$ denotes the initial basis vectors of the sampling layer, $v_j$ denotes the vector of projection coefficients of the j-th feature point extracted in the sampling layer from all sample images of the human body training sample set, $v_j = [v_{1j}, v_{2j}, \ldots, v_{sj}, \ldots, v_{Mj}]^T$, $v_{sj}$ denotes the projection coefficient of the j-th feature point extracted in the sampling layer from the s-th sample image of the human body training sample set, $M$ denotes the number of sample images in the human body training sample set, $j = 1, 2, \ldots, N$, and $N$ denotes the number of feature points randomly selected in the sampling layer from each sample image of the human body training sample set.
Second, construct an approximating function, as follows, that approaches the projection vector of the initial basis vectors of the sampling layer in the projection space:
$$f(r) = \arg\min \lVert r - R \rVert$$
where $r$ denotes the maximum kernel function feature in the sampling layer, $R$ denotes the projection vector of the initial basis vectors of the sampling layer, $\lVert \cdot \rVert$ denotes the 2-norm, and $\arg\min \lVert \cdot \rVert$ denotes minimization.
Substituting $R = R_1 \times [v_1, \ldots, v_j, \ldots, v_N]$ into the above formula and expanding $r$ as $r = [r_1, \ldots, r_j, \ldots, r_N]$ gives the approximating function $f(v, r)$ of the maximum kernel function feature $r$ with respect to the initial basis vectors $R_1$:
$$f(v, r) = \arg\min \lVert [r_1, \ldots, r_j, \ldots, r_N] - R_1 \times [v_1, \ldots, v_j, \ldots, v_N] \rVert$$
where $v$ denotes the vector of low-dimensional projection coefficients of all feature points extracted from all sample images of the human body training sample set, $v = [v_1, \ldots, v_j, \ldots, v_N]$, $v_j$ denotes the vector of projection coefficients of the j-th feature point extracted in the sampling layer from all sample images of the human body training sample set, $N$ denotes the number of feature points randomly selected in the sampling layer from each sample image of the human body training sample set, $r_j$ denotes the maximum kernel feature vector of the j-th feature point extracted from all sample images of the human body training sample set, and $R_1$ denotes the initial basis vectors of the sampling layer.
Third, solve the approximating function by stochastic gradient descent, which gives the following formula for iteratively updating the maximum kernel function feature in the sampling layer and forms the low-dimensional image feature representation:
where $r(k+1)$ denotes the maximum kernel function feature in the sampling layer obtained after $k+1$ iterations, $k$ denotes the iteration count, $r(k)$ denotes the maximum kernel function feature in the sampling layer obtained after $k$ iterations, $\eta$ denotes the learning rate, which is a constant, $\partial / \partial r$ denotes the derivative with respect to $r$ of the expression in brackets, $r_j$ denotes the maximum kernel feature vector of the j-th feature point extracted in the sampling layer from all sample images of the human body training sample set, $R_1$ denotes the initial basis vectors in the sampling layer, and $R_1^T$ denotes the transpose of $R_1$. The number of iterations is set to 1000, and $r(1000)$, obtained when the iterations complete, is taken as the final maximum kernel function feature in the sampling layer; $j = 1, 2, \ldots, N$, where $N$ denotes the number of feature points randomly selected in the sampling layer from each sample image of the human body training sample set.
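The iterative update described in the third step is a gradient descent with a constant learning rate η and a fixed number of iterations (1000). The exact update formula is not reproduced in this text, so the Python sketch below only illustrates that scheme on an assumed least-squares objective $\lVert t - R_1 v \rVert^2$ minimized over the coefficients $v$. The objective, the choice of optimizing $v$, and all function names are assumptions, not the patent's formula.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def least_squares_gradient(basis, target):
    """Gradient of ||target - basis * v||^2 with respect to v (assumed objective)."""
    basis_t = transpose(basis)

    def gradient(v):
        residual = [t - p for t, p in zip(target, matvec(basis, v))]
        return [-2.0 * g for g in matvec(basis_t, residual)]

    return gradient


def descend(gradient, start, eta, iterations=1000):
    """Fixed-step gradient descent: x(k+1) = x(k) - eta * gradient(x(k))."""
    x = list(start)
    for _ in range(iterations):
        g = gradient(x)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x
```

For example, `descend(least_squares_gradient([[1.0, 0.0], [0.0, 2.0]], [3.0, 4.0]), [0.0, 0.0], eta=0.1)` converges to approximately `[3.0, 2.0]`. A constant step only converges when η is small enough for the basis at hand.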
Step 5: obtain the efficient match kernel feature of the image.
For each sampling layer, sort the element values of the maximum kernel function feature of the sampling layer in descending order and determine whether exactly one element has the maximum value. If so, output the maximum kernel function feature of the sampling layer as the feature vector of the sampling layer; otherwise, set to zero the elements of the maximum kernel function feature whose values equal the maximum, and output the zeroed maximum kernel function feature as the feature vector of the sampling layer.
Compute the weighted sum of the feature vectors of all sampling layers to obtain the all-scale feature, and store the all-scale feature.
The weighted sum is computed as follows:
$$G^* = \sum_{i=1}^{3} G_i \times A_i$$
where $G^*$ denotes the all-scale feature, $G_i$ denotes the feature vector of each sampling layer, $i = 1, 2, 3$, $A_i$ denotes the weight corresponding to each sampling layer, with $w_i = 1/p_i$, and $p_i$ denotes the pixel size of the sampling scale of each sampling layer, $p_i \in \{16, 25, 36\}$.
Average the elements of each row of the all-scale feature vector, accumulate on the horizontal axis the number of means falling at each mean value to obtain the distribution of the row means of the all-scale feature vector, and select the features whose row means follow an approximately Gaussian distribution as the final efficient match kernel feature of the speeded-up robust features of the image.
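The tie-zeroing rule and the weighted sum of Step 5 can be sketched in Python as follows. The formula relating $A_i$ to $w_i$ is not reproduced in this text, so the normalization $A_i = w_i / \sum_i w_i$ used here is an assumption, as are the function names. The selection of approximately Gaussian features is not covered.

```python
SCALES = (16, 25, 36)  # sampling scales p_i of the three layers, in pixels


def layer_feature(max_kernel_feature):
    """Keep the feature if its maximum is unique, else zero the tied maxima."""
    peak = max(max_kernel_feature)
    if max_kernel_feature.count(peak) == 1:
        return list(max_kernel_feature)
    return [0.0 if value == peak else value for value in max_kernel_feature]


def layer_weights(scales=SCALES):
    """Weights from w_i = 1 / p_i, normalized to sum to 1 (an assumption)."""
    raw = [1.0 / p for p in scales]
    total = sum(raw)
    return [w / total for w in raw]


def all_scale_feature(layer_features, scales=SCALES):
    """Weighted sum of the per-layer feature vectors, element by element."""
    weights = layer_weights(scales)
    length = len(layer_features[0])
    return [
        sum(a * g[d] for a, g in zip(weights, layer_features))
        for d in range(length)
    ]
```

With $w_i = 1/p_i$, the finest scale (16 pixels) receives the largest weight and the coarsest (36 pixels) the smallest.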
Step 6: classifier training.
Train a support vector machine (SVM) classifier on the extracted efficient match kernel features of the speeded-up robust features to obtain the detection classifier.
Step 7: scan the input image.
Input an image under test, scan the whole image with the window scanning method to obtain a set of scanning window images, and input this set of scanning window images to the detection classifier.
The specific steps of window scanning are as follows:
First, take the region in the upper-left corner of the input image under test, of the same size as a sample image in the human body training sample set, as the first scanning window; make it the current scanning window and save the current scanning window image.
Second, translate the current scanning window on the image under test 8 pixels to the right or 16 pixels downward to obtain a new scanning window, replace the current scanning window with the new one, and save the current scanning window image.
Third, continue moving the current scanning window in this way, replacing it each time with the moved window, until the whole image under test has been scanned, and save all scanning window images.
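The window scanning steps amount to enumerating window positions on a regular grid. A minimal Python sketch follows; the function name and the reading of the 128 × 64 sample size as 64 pixels wide by 128 pixels high are assumptions.

```python
def scan_windows(image_width, image_height, win_w=64, win_h=128,
                 step_x=8, step_y=16):
    """Yield (left, top, right, bottom) for every scanning window.

    Windows start at the upper-left corner and move 8 pixels to the right
    and 16 pixels downward, staying fully inside the image.
    """
    for top in range(0, image_height - win_h + 1, step_y):
        for left in range(0, image_width - win_w + 1, step_x):
            yield (left, top, left + win_w, top + win_h)
```

For a 128 × 256 image this produces 9 × 9 = 81 windows; an image smaller than the window produces none.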
Step 8: detect the scanning windows.
8a) Use the detection classifier to determine whether the input scanning window images contain a human body. If none does, classify the image under test as a non-human natural image; otherwise, among all scanning window images judged to contain a human body, take the one with the highest detection classifier score as the main window image.
8b) Among the remaining scanning window images containing a human body, merge those that overlap the main window image by more than 50% with the main window image by the window merging operation, save the merged window as one detection result, and delete all images that took part in the merge.
The specific steps of window merging are as follows:
First, number all images to be merged consecutively, starting from 1;
Second, for each image to be merged, take the proportion of its detection classifier score in the sum of the detection classifier scores of all images to be merged as the weight for weighting its boundaries;
Third, weight each boundary of the images to be merged using the following formula:
$$X = \frac{x_1 m_1 + x_2 m_2 + \cdots + x_M m_M}{A}$$
where $X$ denotes the row or column pixel coordinate, on the image under test, of the window boundary obtained after weighting, $x_1, x_2, \ldots, x_M$ denote the row or column pixel coordinates, on the image under test, of the boundaries of the images taking part in the merge, $m_1, m_2, \ldots, m_M$ denote the classifier scores of the images taking part in the merge, $M$ denotes the number of images taking part in the merge, $A$ denotes the sum of the classifier scores of the images taking part in the merge, $A = \sum_{g=1}^{M} m_g$, $g$ denotes the index of an image taking part in the merge, and $m_g$ denotes the classifier score of the g-th image taking part in the merge.
Fourth, form a window from the weighted boundaries.
8c) Determine whether any scanning window images containing a human body remain. If so, take the remaining scanning window image with the highest detection classifier score as the main window image and perform step 8b); otherwise, perform Step 9.
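Steps 8a) to 8c) describe a greedy, score-ordered merging of overlapping windows with score-weighted boundaries. The Python sketch below illustrates this under stated assumptions: boxes are (left, top, right, bottom) tuples, scores of windows classified as human are positive, and "overlap" is measured as the intersection area divided by the area of the candidate window. The text does not define the overlap measure, and all function names are hypothetical.

```python
def overlap_fraction(main, other):
    """Fraction of `other`'s area that is covered by `main`."""
    width = min(main[2], other[2]) - max(main[0], other[0])
    height = min(main[3], other[3]) - max(main[1], other[1])
    if width <= 0 or height <= 0:
        return 0.0
    area = (other[2] - other[0]) * (other[3] - other[1])
    return (width * height) / area


def merge_group(boxes, scores):
    """Score-weighted average of each boundary: X = sum(x_g * m_g) / A."""
    total = sum(scores)  # A, the sum of the classifier scores
    return tuple(
        sum(box[i] * s for box, s in zip(boxes, scores)) / total
        for i in range(4)
    )


def merge_detections(detections, threshold=0.5):
    """detections: list of (box, score) for windows classified as human."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    results = []
    while remaining:
        main = remaining[0]          # highest-scoring window left (8a, 8c)
        group, rest = [main], []
        for det in remaining[1:]:
            if overlap_fraction(main[0], det[0]) > threshold:
                group.append(det)    # merged with the main window (8b)
            else:
                rest.append(det)
        results.append(merge_group([d[0] for d in group], [d[1] for d in group]))
        remaining = rest             # merged windows are removed
    return results
```

Because all windows of a single-scale scan have the same size, the assumed overlap measure is symmetric in that setting.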
Step 9: output the detection result.
Mark all merged windows on the image under test and output the marked image as the human body detection result of the image under test.
The effect of the invention is further illustrated by the following simulations:
1. Simulation conditions:
The simulation experiments of the invention were programmed in Matlab 2009a and run on an HP workstation under Windows. The positive and negative sample images required for the experiments were taken from the INRIA database. The training sample set comprises 2416 positive samples and 13500 negative samples, and the test sample set comprises 1132 positive samples and 4050 negative samples; all positive and negative sample images are 128 × 64 pixels. Fig. 2 shows some of the sample images used in the invention: Fig. 2(a) shows some of the positive sample images and Fig. 2(b) shows some of the negative sample images.
2. Simulation content and analysis of results:
Simulation 1:
The accuracy obtained after classifier training is one of the important indicators of classifier performance. To obtain a classifier with good performance, a large number of experiments were carried out, when extracting the features of the sample images, on the choice of two parameters: the number of sampling layers applied to the sample images and the projection dimension used when projecting the initial basis vectors to a low-dimensional space. The accuracies of classifiers trained with different numbers of sampling layers and different projection dimensions were compared; the results are shown in Table 1.
As can be seen from Table 1, for the same projection dimension, the classifier accuracy with 3 sampling layers is higher than with 2 sampling layers; and for the same number of sampling layers, a higher projection dimension does not necessarily give a higher classifier accuracy. The data in the table show that the classifier accuracy is highest, and the classification performance best, when the sample images are sampled in 3 layers and the initial basis vectors are projected to 450 dimensions.
Emulation 2:
Using the present invention with based on histogram of gradients HOG feature human body detecting method, human body training sample set is carried out respectively
Feature extraction, trains grader, and the classifier performance obtaining is contrasted.Classifier performance contrast schematic diagram referring to the drawings 3,
Select in Fig. 3 by comparing kidney-Yang rate TPR (True Positive Rate) and false sun rate FPR (False Positives
Rates) recipient performance characteristic ROC (the Receiver Operating Characteristic) curve of relation is evaluating point
The performance of class device.ROC curve is more top to be inclined to left drift angle, and its corresponding grader is more outstanding.
The horizontal axis in Fig. 3 represents the false positive rate (FPR), and the vertical axis represents the true positive rate (TPR). The curve marked with squares in Fig. 3 is the ROC curve of the classifier of the present invention, and the curve marked with crosses is the ROC curve of the classifier of the HOG-based human body detection method. As can be seen from Fig. 3, the ROC curve obtained by the present invention lies closer to the top-left corner than the ROC curve obtained by the HOG-based human body detection method, which shows that the classification performance of the present invention is better than that of the HOG-based human body detection method.
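As an illustration of how ROC curves such as those in Fig. 3 can be obtained from classifier outputs, the following Python sketch sweeps a decision threshold over a list of classifier scores and records the FPR and TPR at each threshold. The function names `roc_points` and `area_under_curve` are hypothetical and not part of the invention; tied scores are not treated specially in this sketch.

```python
def roc_points(scores, labels):
    """Sweep a threshold over the scores and return (fpr, tpr) lists.

    scores: classifier outputs, higher meaning "more likely human".
    labels: 1 for a true human sample, 0 for a non-human sample.
    """
    # Accept samples one at a time, highest score first.
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for _, y in pairs:
        if y == 1:
            tp += 1  # a human sample accepted: true positive
        else:
            fp += 1  # a non-human sample accepted: false positive
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve; closer to 1 means closer to the top-left corner."""
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area
```

Plotting `tpr` against `fpr` for two classifiers on the same test set gives a comparison of the kind described for Fig. 3; the area under each curve is one way to summarize which curve lies closer to the top-left corner.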
Simulation 3:
The present invention and the HOG-based human body detection method were each used to perform human body detection on natural images from the INRIA database. The detection results are shown in Fig. 4 and Fig. 5.
Fig. 4 is an image with uneven illumination. Fig. 4(a) shows the human body detection result of the present invention; the white boxes in the figure are the merged windows obtained after the detection classifier of the present invention detects human body information in the image. Fig. 4(b) shows the human body detection result of the HOG-based human body detection method; the white boxes in the figure are the merged windows obtained after the detection classifier of that method detects human body information in the image. As can be seen from Fig. 4, under uneven illumination the method of the present invention, compared with the HOG-based human body detection method, greatly reduces the false alarm rate and detects all human body information in the image to be detected more accurately.
Fig. 5 is an image with a complex background and occluded persons. Fig. 5(a) shows the human body detection result of the present invention; the white boxes in the figure are the merged windows obtained after the detection classifier of the present invention detects human body information in the image. Fig. 5(b) shows the human body detection result of the HOG-based human body detection method; the white boxes in the figure are the merged windows obtained after the detection classifier of that method detects human body information in the image. As can be seen from Fig. 5, with a complex background and occluded persons, the method of the present invention marks human body information more accurately, and the windows obtained after window merging are more appropriately sized than those of the HOG-based human body detection method, giving higher human body detection accuracy.
In summary, the method of the present invention can detect the human body even under uneven illumination, complex backgrounds, and partial occlusion, which shows that the method is well suited to human body detection in natural images.
Claims (4)
1. An efficient matching kernel human body detection method based on speeded-up robust features (SURF), comprising two processes, obtaining a detection classifier and using the obtained classifier to detect images, implemented in the following steps:
First process, obtaining the detection classifier, with the following specific steps:
(1) Select training sample set images:
1a) Using a bootstrapping operation, obtain a sufficient number of negative sample images from the non-human natural images of the INRIA database;
1b) Combine the obtained negative sample images with the negative sample set in the INRIA database to form a new negative sample set;
1c) Combine the images of the new negative sample set with the positive sample set in the INRIA database to form the human body training sample set;
(2) Extract image SURF feature points:
2a) Divide each image in the human body training sample set into grids of 8*8 pixels, and sample each grid at scales of 16, 25, and 36 pixels respectively, each sampling scale forming a grid-like sampling layer;
2b) For each 8*8 pixel grid and each sampling layer, compute the sum of the squares of the horizontal gradient and the vertical gradient of each sampled point in the grid, and take the sampled point with the largest sum of squared gradients as the SURF feature point of that pixel grid in that sampling layer;
2c) For each image in the human body training sample set, randomly select 15 feature points from the SURF feature points of all pixel grids in each sampling layer, as the SURF feature points of that image in that sampling layer;
(3) Construct the initial basis vector of each layer:
Using the k-means clustering method, cluster the SURF feature points of all images in the human body training sample set in each sampling layer to form 450 cluster centers, obtaining the 450-dimensional visual vocabulary of the whole human body training sample set in that sampling layer, which constitutes the initial basis vector of the sampling layer;
The specific steps of the k-means clustering method are as follows:
First step, for each sampling layer, randomly choose 450 SURF feature points from the SURF feature points of all sample images of the human body training sample set in that sampling layer as the initial cluster centers of the layer, and take the data value of each of the 450 initial cluster centers as its cluster center value;
Second step, compute the Euclidean distance from every SURF feature point of the human body training sample set in the sampling layer to each cluster center;
Third step, assign each SURF feature point of the sampling layer to the class of its nearest cluster center;
Fourth step, judge whether, after assignment, the mean of the SURF feature points in each class equals the cluster center value; if so, execute the fifth step; otherwise, take the mean of the feature points of each class as the new cluster center value and return to the second step;
Fifth step, save the 450 cluster center values, form a column vector from them, and output this column vector as the initial basis vector of the whole human body training sample set in the sampling layer;
(4) Obtain the maximum kernel function feature of each sampling layer:
For the initial basis vector of each sampling layer, perform dictionary learning using constrained kernel singular value decomposition (CKSVD) to obtain the maximum kernel function feature of the sampling layer;
The specific steps of the dictionary learning are as follows:
First step, for each sampling layer, project the initial basis vector of the sampling layer onto a 450-dimensional space, computing the projection vector of the initial basis vector by the following formula:
$$R = R_1 \times [v_1, \ldots, v_j, \ldots, v_N]$$
where $R$ denotes the projection vector of the initial basis vector of the sampling layer, $R_1$ denotes the initial basis vector of the sampling layer, $v_j$ denotes the vector of projection coefficients of the $j$-th feature point extracted in the sampling layer from all sample images of the human body training sample set, $v_j = [v_{1j}, v_{2j}, \ldots, v_{sj}, \ldots, v_{Mj}]^T$, $v_{sj}$ denotes the projection coefficient of the $j$-th feature point extracted in the sampling layer from the $s$-th sample image of the human body training sample set, $M$ denotes the number of sample images in the human body training sample set, $j = 1, 2, \ldots, N$, and $N$ denotes the number of feature points randomly selected in the sampling layer from each sample image of the human body training sample set;
Second step, construct an approximating function according to the following formula, to approximate, in the projection space, the projection vector of the initial basis vector of the sampling layer:
$$F(r) = \arg\min \lVert r - R \rVert$$
where $r$ denotes the maximum kernel function feature in the sampling layer, $\lVert \cdot \rVert$ denotes the 2-norm, and $\arg\min \lVert \cdot \rVert$ denotes taking the minimum;
Third step, evaluate the approximating function to obtain the following formula, which iteratively updates the maximum kernel function feature in the sampling layer and constitutes the low-dimensional image feature representation:
where $r(k+1)$ denotes the maximum kernel function feature in the sampling layer obtained at iteration $k+1$, $k$ denotes the iteration count, $r(k)$ denotes the maximum kernel function feature in the sampling layer obtained at iteration $k$, $\eta$ denotes the learning rate, which is a constant, the derivative operator in the formula denotes differentiation with respect to $r$ of the expression in brackets, $r_j$ denotes the maximum kernel feature vector in the sampling layer of the $j$-th feature point extracted from all sample images of the human body training sample set, and $R_1^T$ denotes the transpose of the initial basis vector $R_1$ in the sampling layer; the number of iterations is set to 1000, and $r(1000)$ obtained after the iterations complete is taken as the final maximum kernel function feature in the sampling layer;
(5) Obtain the image efficient matching kernel feature:
5a) For each sampling layer, arrange the element values of the maximum kernel function feature of the sampling layer in descending order and judge whether the number of elements having the maximum value is 1; if so, output the maximum kernel function feature of the sampling layer as the feature vector of the sampling layer; otherwise, set to zero the elements of the maximum kernel function feature whose values equal the maximum, and output the zeroed maximum kernel function feature as the feature vector of the sampling layer;
5b) Compute the weighted sum of the feature vectors of all sampling layers to obtain the all-scale feature, and store the all-scale feature;
The weighted sum is computed as follows:
$$G^{*} = G_i \times A_i$$
where $G^{*}$ denotes the all-scale feature, $G_i$ denotes the feature vector of each sampling layer, $i = 1, 2, 3$, $A_i$ denotes the weight corresponding to each sampling layer, $w_i = 1/p_i$, $p_i$ denotes the pixel size of the sampling scale of each sampling layer, and $p_i = \{16, 25, 36\}$;
5c) Average the elements of each row of the all-scale feature vector, accumulate the count of averages at the corresponding average point on the abscissa to obtain the distribution of the row averages of the all-scale feature vector, and select the features whose row averages follow an approximately Gaussian distribution as the final SURF-based efficient matching kernel feature of the image;
(6) Classification training:
Using a support vector machine (SVM) classifier, perform classification training on the extracted SURF-based efficient matching kernel features to obtain the detection classifier;
Second process, detecting an image with the obtained detection classifier, with the following specific steps:
(7) Scan the input image:
Input an image to be detected, scan the whole image with the window scanning method to obtain a set of scanning window images, and input this set of scanning window images to the detection classifier;
(8) Detect the scanning windows:
8a) Use the detection classifier to judge whether the input scanning window images contain human body information; if no human body information is present, classify the image to be detected as a non-human natural image; otherwise, from all scanning window images judged to contain human body information, find the one with the highest detection classifier score as the main window image;
8b) Among the remaining scanning window images containing human body information other than the main window image, merge those that overlap the main window image by more than 50% with the main window image in a window merging operation, save the window obtained by merging as one detection result, and delete all images that took part in the merge;
8c) Judge whether any scanning window images containing human body information remain; if so, find the remaining scanning window image with the highest detection classifier score as the main window image and execute step 8b); otherwise, execute step (9);
(9) Output the detection results:
Mark all windows obtained by window merging on the image to be detected, and output the marked image as the human body detection result of the image to be detected.
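The grouping loop of steps 8a) to 8c) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: windows are assumed to be `(x0, y0, x1, y1)` tuples, and "overlap" is assumed to mean intersection area divided by the area of the smaller window, since the text does not define the overlap measure. The names `overlap_ratio` and `group_windows` are hypothetical.

```python
def overlap_ratio(a, b):
    """Intersection area over the smaller box area (an assumed overlap measure)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return (iw * ih) / min(area_a, area_b)


def group_windows(boxes, scores, threshold=0.5):
    """Group positive windows around successive main windows.

    Returns a list of groups; each group is a list of window indices
    whose first entry is the main (highest-scoring) window.
    """
    # Indices of the remaining windows, best classifier score first.
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    groups = []
    while remaining:
        main = remaining[0]  # step 8a) / 8c): highest remaining score
        group = [main] + [
            i for i in remaining[1:]
            if overlap_ratio(boxes[main], boxes[i]) > threshold  # step 8b)
        ]
        groups.append(group)
        # Delete every window that took part in this merge.
        remaining = [i for i in remaining if i not in group]
    return groups
```

Each returned group would then be passed to the window merging operation to produce one detection window.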
2. The efficient matching kernel human body detection method based on speeded-up robust features according to claim 1, characterized in that the bootstrapping operation of step 1a) comprises the following specific steps:
First step, randomly select m positive sample images and n negative sample images from the INRIA database, where 100≤m≤500, 100≤n≤800, and n≤m≤3n; extract features from all selected positive and negative sample images using the histogram of oriented gradients (HOG) feature extraction method, and perform classification training on the extracted features with a support vector machine (SVM) classifier to obtain an initial classifier;
Second step, repeatedly choose non-human natural images at random from the INRIA database; scan each whole non-human natural image with a scanning window of the sample image size, moving 8 pixels at a time from left to right and 16 pixels at a time from top to bottom; input the images in all scanning windows to the initial classifier for detection, and save the scanning window images that the classifier misclassifies; stop choosing non-human natural images once the number of misclassified scanning window images reaches a, where 200≤a≤500; from the misclassified scanning window images, randomly pick b images, where a/5≤b≤a/3, and combine them with the current negative sample images to form a new negative sample set;
Third step, for the m randomly selected positive sample images and the new negative sample set, perform HOG feature extraction, train a classifier, detect non-human natural images, and update the negative sample set;
Fourth step, repeat the third step until the final updated training sample set consists of 2416 positive sample images and 13500 negative sample images, each of size 128 × 64 pixels.
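The selection of misclassified windows in the second step of the bootstrapping operation can be sketched in Python as follows. The classifier is represented by a hypothetical callable `classify` that returns a true value when it reports a human body; because the windows come from non-human images, every such report is a misclassification. The function name, the `seed` parameter, and the choice b = a/4 (one value inside the stated range a/5 ≤ b ≤ a/3) are assumptions made for illustration.

```python
import random


def mine_hard_negatives(windows, classify, a=200, seed=0):
    """Collect misclassified windows from non-human images and sample b of them.

    windows:  iterable of scanning window images taken from non-human images.
    classify: hypothetical callable, true when the classifier reports a human.
    a:        number of misclassified windows to collect before stopping.
    """
    if not 200 <= a <= 500:
        raise ValueError("a is expected to satisfy 200 <= a <= 500")
    hard = []
    for window in windows:
        if classify(window):  # a false positive on a non-human image
            hard.append(window)
            if len(hard) == a:  # stop once a misclassified windows are saved
                break
    b = len(hard) // 4  # assumed choice; lies between a/5 and a/3
    rng = random.Random(seed)
    return rng.sample(hard, b)
```

The returned windows would be added to the current negative samples before the classifier is retrained in the third step.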
3. The efficient matching kernel human body detection method based on speeded-up robust features according to claim 1, characterized in that the window scanning method of step (7) comprises the following specific steps:
First step, take the region at the top-left corner of the input image to be detected, of the same size as a sample image in the human body training sample set, as the first scanning window; take this scanning window as the current scanning window and save the current scanning window image;
Second step, translate the current scanning window on the image to be detected 8 pixels to the right or 16 pixels downward to obtain a new scanning window, replace the current scanning window with the new scanning window, and save the current scanning window image;
Third step, keep moving the current scanning window as above, replacing the current scanning window with the moved window, until the whole image to be detected has been scanned, and save all scanning window images.
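The window scanning method above can be sketched in Python as follows. The sketch assumes a window 64 pixels wide and 128 pixels high (one reading of the 128 × 64 sample size given in claim 2) and returns window coordinates as `(x0, y0, x1, y1)` tuples rather than cropped images; the function name `scan_windows` and its parameters are hypothetical.

```python
def scan_windows(img_w, img_h, win_w=64, win_h=128, step_x=8, step_y=16):
    """Enumerate scanning windows row by row, starting at the top-left corner.

    The window moves step_x pixels to the right within a row and
    step_y pixels downward between rows, staying inside the image.
    """
    windows = []
    for y in range(0, img_h - win_h + 1, step_y):
        for x in range(0, img_w - win_w + 1, step_x):
            windows.append((x, y, x + win_w, y + win_h))
    return windows
```

An image smaller than the window yields an empty list, so a caller would need to handle that case (for example by resizing the image) before classification.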
4. The efficient matching kernel human body detection method based on speeded-up robust features according to claim 1, characterized in that the window merging operation of step 8b) comprises the following specific steps:
First step, number all images requiring window merging consecutively, starting from 1;
Second step, take the proportion of each such image's detection classifier score in the sum of the detection classifier scores of all images requiring window merging as the weight used to weight the image boundaries;
Third step, weight each boundary of the images requiring window merging using the following formula:
$$X = \frac{x_1 m_1 + x_2 m_2 + \cdots + x_E m_E}{A}$$
where $X$ denotes the pixel row or pixel column, on the image to be detected, of the window boundary obtained after weighting, $x_1, x_2, \ldots, x_E$ denote the pixel rows or pixel columns, on the image to be detected, of the boundaries of the images taking part in the window merging, $m_1, m_2, \ldots, m_E$ denote the detection classifier scores of the images taking part in the window merging, $E$ denotes the number of images taking part in the window merging, $A$ denotes the sum of the detection classifier scores of the images taking part in the window merging, $A = \sum_{g=1}^{E} m_g$, $g$ denotes the index of a window merging image, and $m_g$ denotes the detection classifier score of the $g$-th image taking part in the window merging;
Fourth step, form a window from the weighted boundaries.
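The window merging operation can be sketched in Python as a score-weighted average of each boundary, following the second step, where the weight of each image is its share of the total detection classifier score. The sketch assumes windows given as `(x0, y0, x1, y1)` tuples and a positive score sum; the function name `merge_windows` is hypothetical.

```python
def merge_windows(boxes, scores):
    """Merge windows into one by weighting each boundary with the classifier scores.

    boxes:  windows as (x0, y0, x1, y1) tuples (left, top, right, bottom).
    scores: detection classifier score of each window.
    """
    total = sum(scores)  # A: sum of the scores of the merged windows
    if total <= 0:
        raise ValueError("the score sum must be positive for the weights to be valid")
    # Each boundary is sum(score * coordinate) / total over the merged windows.
    return tuple(
        sum(m * box[k] for m, box in zip(scores, boxes)) / total
        for k in range(4)
    )
```

For example, merging `(0, 0, 64, 128)` with score 3.0 and `(8, 0, 72, 128)` with score 1.0 pulls the merged window toward the higher-scoring one, giving a left boundary at column 2 instead of the unweighted midpoint 4.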
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310405276.3A CN103455826B (en) | 2013-09-08 | 2013-09-08 | Efficient matching kernel body detection method based on rapid robustness characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103455826A CN103455826A (en) | 2013-12-18 |
CN103455826B true CN103455826B (en) | 2017-02-08 |
Family
ID=49738167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310405276.3A Active CN103455826B (en) | 2013-09-08 | 2013-09-08 | Efficient matching kernel body detection method based on rapid robustness characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103455826B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106462772B (en) * | 2014-02-19 | 2019-12-13 | 河谷控股Ip有限责任公司 | Invariant-based dimension reduction for object recognition features, systems and methods |
CN103985102A (en) * | 2014-05-29 | 2014-08-13 | 宇龙计算机通信科技(深圳)有限公司 | Image processing method and system |
CN104573646B (en) * | 2014-12-29 | 2017-12-12 | 长安大学 | Chinese herbaceous peony pedestrian detection method and system based on laser radar and binocular camera |
CN105139390A (en) * | 2015-08-14 | 2015-12-09 | 四川大学 | Image processing method for detecting pulmonary tuberculosis focus in chest X-ray DR film |
CN107025436B (en) * | 2017-03-13 | 2020-10-16 | 西安电子科技大学 | Reliability-based self-updating personnel intrusion detection method |
CN107945145A (en) * | 2017-11-17 | 2018-04-20 | 西安电子科技大学 | Infrared image fusion Enhancement Method based on gradient confidence Variation Model |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101233843B1 (en) * | 2011-06-29 | 2013-02-15 | 포항공과대학교 산학협력단 | Method and apparatus for object detection using volumetric feature vector and 3d haar-like filters |
CN102663369B (en) * | 2012-04-20 | 2013-11-20 | 西安电子科技大学 | Human motion tracking method on basis of SURF (Speed Up Robust Feature) high efficiency matching kernel |
CN102810159B (en) * | 2012-06-14 | 2014-10-29 | 西安电子科技大学 | Human body detecting method based on SURF (Speed Up Robust Feature) efficient matching kernel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |