CN106845387B

CN106845387B - Pedestrian detection method based on self-learning

Info

Publication number: CN106845387B
Application number: CN201710033677.9A
Authority: CN
Inventors: 施培蓓; 曹风云; 胡玉娟; 杨雪洁; 王璐; 钱言玉; 王筱薇倩; 张娜; 谢超; 吴友情
Original assignee: Hefei Normal University
Current assignee: Hefei Normal University
Priority date: 2017-01-18
Filing date: 2017-01-18
Publication date: 2020-04-24
Anticipated expiration: 2037-01-18
Also published as: CN106845387A

Abstract

The invention discloses a self-learning-based pedestrian detection method comprising the following steps: first, an AdaBoost-based cascade classifier is trained as an offline classifier, and at the same time a Gaussian mixture model is trained on a set of public pedestrian photos, with HOG (histogram of oriented gradients) features and position information used for feature coding; next, the offline classifier is applied with a low threshold to detect pedestrians in a specific scene and outputs a confidence score for each candidate object; candidates with high confidence scores are then selected as positive samples and candidates with low confidence scores as negative samples, and the candidate detection objects are re-represented using the Gaussian mixture model; finally, a discriminative pedestrian classifier is trained online using an SVM (support vector machine), the candidate objects are re-predicted, and probability estimates are output. The method addresses the inability of traditional pedestrian detection methods to adapt to a specific scene, advances pedestrian detection in specific scenes, and markedly improves the recognition rate compared with traditional pedestrian detection methods.

Description

Pedestrian detection method based on self-learning

Technical Field

The invention relates to a pedestrian detection method in the field of intelligent transportation, in particular to a self-learning pedestrian detection method based on a specific scene.

Background

Road traffic safety problems have seriously affected economic development and social construction, and reducing road traffic accidents and casualties is an important matter bearing on people's livelihood. Road traffic safety is influenced by many factors, such as pedestrians, vehicles, and roads. Because pedestrians are both the main participants in road traffic and its most vulnerable group, ensuring pedestrian safety is key to road traffic safety and is also an important task in the field of intelligent transportation systems.

Pedestrian detection is a core supporting technology of intelligent transportation systems and has a profound influence on guaranteeing pedestrian safety and reducing loss of life and property. In practical applications, a vehicle-mounted pedestrian detection system must be able to detect pedestrians in different scenes. Researchers and automobile manufacturers at home and abroad have recognized the economic value and research significance of pedestrian detection systems and have begun research on and application of automated driving, but technical defects and potential safety hazards remain.

Pedestrian detection methods fall into two main categories: image processing and machine learning. Classification-based machine learning is the pedestrian detection technology in common use at present. The core technologies of a classification-based pedestrian detection system are feature extraction and classifier design; the basic idea is to train a complete classifier on a large amount of training data and then detect pedestrians in test data. Classification-based pedestrian detection has been applied successfully in fixed scenes and is an offline training approach. If the training set and the test set come from different sources, the classification performance of the offline-trained classifier drops sharply in the new scene, because pedestrian appearance and posture vary greatly and because the scenes are mismatched in viewpoint, illumination, background, resolution, and similar factors. Experimental results for dozens of existing pedestrian classifiers show that when a classifier trained on the INRIA data set is used directly for pedestrian detection in other scenes, the miss rate increases by 20% to 50%. It is therefore difficult to train a general detector that suits all pedestrian scenes.

The classifier can be retrained for pedestrian detection in a specific scene, but the cost of re-labeling samples is difficult to bear. The classifier requires a large number of training samples and a long training time to ensure good performance in the detection phase. In addition, once training is complete the classifier parameters are fixed, which makes it difficult to adapt to the environment of a new scene. Designing a pedestrian detection classification method that adapts to the scene therefore has research and application value.

Disclosure of Invention

The purpose of the invention is as follows: to overcome the inability of the prior art to adapt to a specific task, a self-learning-based pedestrian detection method is provided that allows any offline-trained pedestrian classifier to adapt to a specific scene and achieve a good recognition rate.

The technical scheme is as follows: to achieve the above purpose, the invention provides a self-learning-based pedestrian detection method comprising the following steps: first, an AdaBoost-based cascade classifier is trained as an offline classifier, and at the same time a Gaussian mixture model is trained on a set of public pedestrian photos, with HOG (histogram of oriented gradients) features and position information used for feature coding; next, the offline classifier is applied with a low threshold to detect pedestrians in a specific scene and outputs a confidence score for each candidate object; candidates with high confidence scores are then selected as positive samples and candidates with low confidence scores as negative samples, and the candidate detection objects are re-represented using the Gaussian mixture model; finally, a discriminative pedestrian classifier is trained online using an SVM (support vector machine), the candidate objects are re-predicted, and probability estimates are output.

The offline classifier in the above step: training data come from any pedestrian data set, and feature extraction adopts LBP features;

the Gaussian mixture model: a Gaussian mixture model is trained offline on the INRIA data set, and its parameters are learned with the EM (expectation-maximization) algorithm; for each pedestrian sample, the HOG (histogram of oriented gradients) feature and position information of each image block are extracted from a multi-scale image to form the set of feature vectors of that pedestrian sample;

screening the candidate objects: setting a low threshold value of an off-line classifier, performing pedestrian detection on a specific scene by using the off-line classifier to obtain candidate detection objects and confidence scores thereof, setting two threshold values to screen all candidate objects, and obtaining a positive and negative sample set;

the online classifier: for each pedestrian sample, a new feature representation is generated using the Gaussian mixture model; an SVM classifier is then trained online; finally, the SVM classifier is used to detect the candidate objects of the specific scene again and output probability estimates.

Further, the offline classifier is trained using the cascade classifier in OpenCV, in two parts: data set preparation and running the training program. Data set preparation provides the training data: a positive sample set is created using opencv_createsamples, and a large number of negative sample pictures are prepared manually. Running the training program trains the cascade classifier: the feature type is set to LBP, and the cascade classifier is trained using opencv_traincascade.
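The two OpenCV command-line tools named above can be driven from a script. The following is a minimal sketch, not the inventors' actual invocation: it only assembles argument lists for the two tools, and every file name, sample count, and window size passed in is a hypothetical placeholder chosen for illustration.

```python
def createsamples_cmd(info_file, vec_file, num, width, height):
    """Argument list for opencv_createsamples (packs annotated positives into a .vec file)."""
    return ["opencv_createsamples", "-info", info_file, "-vec", vec_file,
            "-num", str(num), "-w", str(width), "-h", str(height)]


def traincascade_cmd(out_dir, vec_file, bg_file, num_pos, num_neg,
                     num_stages, width, height):
    """Argument list for opencv_traincascade with the feature type set to LBP."""
    return ["opencv_traincascade", "-data", out_dir, "-vec", vec_file,
            "-bg", bg_file, "-numPos", str(num_pos), "-numNeg", str(num_neg),
            "-numStages", str(num_stages), "-featureType", "LBP",
            "-w", str(width), "-h", str(height)]


# Hypothetical paths and sizes, for illustration only.
print(" ".join(createsamples_cmd("pos.txt", "pos.vec", 1000, 24, 48)))
print(" ".join(traincascade_cmd("cascade_out", "pos.vec", "neg.txt",
                                900, 2000, 15, 24, 48)))
```

Either list can be passed to `subprocess.run` on a machine where the OpenCV training tools are installed; the window size given to both tools has to match.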

Further, the gaussian mixture model includes two parts, namely feature coding and GMM training:

feature coding: for each pedestrian picture in the INRIA data set, a three-layer Gaussian pyramid is first constructed, and overlapping image blocks are then extracted from each layer of the pyramid. Assume that each pedestrian picture comprises $N$ image blocks $\{p_1,p_2,\dots,p_N\}$. The HOG feature $\mathrm{hog}_{p_i}$ of each image block and its location information $l_{p_i}=[x\ y]^T$ are extracted, and the feature of each image block is coded as $f_{p_i}=[\mathrm{hog}_{p_i}^T,\ l_{p_i}^T]^T$. All image blocks together constitute the pedestrian sample feature $f=\{f_{p_1},f_{p_2},\dots,f_{p_N}\}$.
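The feature coding step can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the block size (16), stride (8), and the halving of resolution between pyramid levels are assumed values, positions are taken in the coordinates of each pyramid level, and `describe` is a hypothetical callback standing in for a real HOG extractor.

```python
def block_origins(width, height, block=16, stride=8):
    """Top-left corners of the overlapping blocks that fit in a width x height image."""
    return [(x, y)
            for y in range(0, height - block + 1, stride)
            for x in range(0, width - block + 1, stride)]


def encode_sample(width, height, describe, levels=3, block=16, stride=8):
    """Return the block features f_pi = [hog_pi, x, y] of one pedestrian sample.

    describe(level, x, y) is a hypothetical callback returning the HOG
    descriptor (a sequence of floats) of the block at (x, y) on that level.
    """
    features = []
    for level in range(levels):
        # Assumption: each pyramid level halves the previous resolution.
        w, h = width >> level, height >> level
        for x, y in block_origins(w, h, block, stride):
            features.append(list(describe(level, x, y)) + [float(x), float(y)])
    return features


# Dummy 36-dimensional descriptor, used only to show the output shape.
sample = encode_sample(64, 128, lambda level, x, y: [0.0] * 36)
print(len(sample), len(sample[0]))
```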

GMM training: the offline-trained Gaussian mixture model can be expressed as

$$p(f;\Theta)=\sum_{k=1}^{K} w_k\,\mathcal{N}(f;\mu_k,\sigma_k^2 I)$$

where $K$ is the number of Gaussian mixture components, $I$ is the identity matrix, $w_k$ is the mixing weight of the $k$-th Gaussian component, $\mathcal{N}(f;\mu_k,\sigma_k^2 I)$ is a Gaussian distribution with mean $\mu_k$ and variance $\sigma_k^2 I$, and $f$ is the pedestrian sample feature from the feature coding part;

given the set of pedestrian sample features $\chi=\{f_1,f_2,\dots,f_M\}$ from the INRIA data set, the parameters $\Theta$ of the GMM are learned with the EM algorithm by maximizing the likelihood of the training-set features, where $\Theta$ is expressed as

$$\Theta=\{w_k,\mu_k,\sigma_k\}_{k=1}^{K}$$

The EM algorithm computes the solution by alternating two steps: the expectation (E) step computes the expected value of the log-likelihood, and the maximization (M) step updates the parameters to the values that maximize that expectation, as follows:

E step:

$$Q(\Theta,\Theta^{\mathrm{old}})=\sum_{i=1}^{M}\sum_{k=1}^{K}\gamma_{ik}\,\log\big(w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)\big)$$

where

$$\gamma_{ik}=\frac{w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)}{\sum_{j=1}^{K} w_j\,\mathcal{N}(f_i;\mu_j,\sigma_j^2 I)}$$

evaluated at the current parameter values $\Theta^{\mathrm{old}}$, is the posterior probability that the $k$-th Gaussian component generated feature $f_i$;

M step: update the parameters $\Theta$:

$$w_k=\frac{1}{M}\sum_{i=1}^{M}\gamma_{ik},\qquad \mu_k=\frac{\sum_{i=1}^{M}\gamma_{ik}\,f_i}{\sum_{i=1}^{M}\gamma_{ik}},\qquad \sigma_k^2=\frac{\sum_{i=1}^{M}\gamma_{ik}\,\lVert f_i-\mu_k\rVert^2}{d\sum_{i=1}^{M}\gamma_{ik}}$$

where $d$ is the dimension of the feature vector.
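The alternation of E and M steps can be illustrated with a small, dependency-free sketch. It is the textbook EM procedure for a mixture of isotropic Gaussians (variance of the form sigma squared times the identity), not code taken from the patent; the function names, the initial variance of 1.0, the iteration count, and the small constants that guard against division by zero are all assumptions made for this example.

```python
import math


def log_gauss(f, mu, var):
    """Log density of the isotropic Gaussian N(f; mu, var * I)."""
    d = len(f)
    sq = sum((a - b) ** 2 for a, b in zip(f, mu))
    return -0.5 * (d * math.log(2.0 * math.pi * var) + sq / var)


def em_spherical_gmm(data, means, iters=50, min_var=1e-6):
    """Fit mixing weights, means and variances; `means` holds the initial centres."""
    m, d, k = len(data), len(data[0]), len(means)
    weights = [1.0 / k] * k
    means = [list(mu) for mu in means]
    variances = [1.0] * k
    for _ in range(iters):
        # E step: posterior gamma[i][j] that component j generated data[i].
        gamma = []
        for f in data:
            logs = [math.log(weights[j]) + log_gauss(f, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)  # subtract the maximum for numerical stability
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            gamma.append([p / total for p in probs])
        # M step: re-estimate the parameters from the posteriors.
        for j in range(k):
            nj = sum(g[j] for g in gamma) + 1e-12  # guard against empty components
            weights[j] = nj / m
            means[j] = [sum(g[j] * f[t] for g, f in zip(gamma, data)) / nj
                        for t in range(d)]
            sq = sum(g[j] * sum((a - b) ** 2 for a, b in zip(f, means[j]))
                     for g, f in zip(gamma, data))
            variances[j] = max(sq / (d * nj), min_var)
    return weights, means, variances


# Toy data: two well-separated 2-D clusters.
offsets = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (0.1, 0.1), (-0.1, -0.1)]
data = [list(o) for o in offsets] + [[x + 5.0, y + 5.0] for x, y in offsets]
weights, means, variances = em_spherical_gmm(data, [(1.0, 1.0), (4.0, 4.0)], iters=20)
print(weights, means, variances)
```

In practice the data would be the HOG-plus-position block features rather than toy points, and a library implementation would normally replace this loop.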

Further, the specific method for screening the candidate objects is as follows: the offline-trained cascade classifier is used to detect pedestrians in the specific scene. To obtain more detection objects and ensure that all pedestrian images are captured, a low detection threshold is set, which also produces a large number of false alarms. Let all candidate objects in the specific scene be $T=\{t_i \mid i=1,\dots,M\}$, with each candidate outputting a corresponding confidence score, $S=\{s_i \mid i=1,\dots,M\}$. The confidence scores are sorted in descending order, and two thresholds $\lambda_h$ and $\lambda_l$ are set to mark the positive samples with high confidence scores and the negative samples with low confidence scores, respectively. $H$ and $N$ are defined as follows:

$$H=\{t_i : s_i>\lambda_h,\ i=1,\dots,M\}$$

$$N=\{t_i : s_i<\lambda_l,\ i=1,\dots,M\}$$

Balanced sets $H$ and $N$ are desired, so let $C=\min(|H|,|N|)$; then

$$H=\{t_1,\dots,t_C\}$$

$$N=\{t_{M-C+1},\dots,t_M\}$$

$$C<M/2$$
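The screening rule can be written compactly in code. The sketch below is an illustration, assuming the high threshold is not below the low threshold so that the two sets cannot overlap; the function name and the example scores are hypothetical.

```python
def screen_candidates(scores, lambda_h, lambda_l):
    """Return (positive_indices, negative_indices) into `scores`.

    Assumes lambda_h >= lambda_l, so no candidate falls into both sets.
    """
    # Sort candidate indices by confidence score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    high = [i for i in order if scores[i] > lambda_h]  # H: s_i > lambda_h
    low = [i for i in order if scores[i] < lambda_l]   # N: s_i < lambda_l
    c = min(len(high), len(low))                       # C = min(|H|, |N|)
    positives = high[:c]             # the C highest-scoring candidates
    negatives = low[len(low) - c:]   # the C lowest-scoring candidates
    return positives, negatives


# Hypothetical confidence scores for seven candidate detections.
scores = [0.9, 0.1, 0.8, 0.2, 0.5, 0.95, 0.05]
print(screen_candidates(scores, lambda_h=0.7, lambda_l=0.15))
```

With these example values three candidates exceed the high threshold but only two fall below the low one, so both sets are trimmed to two members.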

Further, the specific method of the online classifier is as follows: for the positive and negative sample sets $H$ and $N$, the positive and negative samples are first re-represented using the offline-trained Gaussian mixture model; a classifier is then trained online using LibSVM; finally, the online classifier outputs, for each candidate object $t_i$, a probability estimate that it is positive. The specific steps are as follows:

Step 1) Feature representation: the feature of a single pedestrian sample is expressed as $f=\{f_{p_1},f_{p_2},\dots,f_{p_N}\}$. The Gaussian mixture model is used for feature representation: each Gaussian component selects one corresponding feature $f_{g_k}$ from this feature set. For a single pedestrian sample picture, the $K$ Gaussian components therefore select $K$ features, and the single pedestrian sample feature is re-expressed as $[f_{g_1},f_{g_2},\dots,f_{g_K}]$. Only the HOG features are retained in the final pedestrian sample feature and the position information of the image blocks is removed, so the final pedestrian sample feature is expressed as $[\mathrm{hog}_{g_1},\mathrm{hog}_{g_2},\dots,\mathrm{hog}_{g_K}]$;
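The re-representation can be sketched as below. The surviving text does not state the rule by which a component picks its feature, so this example assumes each component selects the block feature with the highest likelihood under that component; for an isotropic Gaussian this is the block feature closest to the component mean. The function names and the two-dimensional toy descriptors are hypothetical.

```python
def squared_distance(f, mu):
    """Squared Euclidean distance between a block feature and a component mean."""
    return sum((a - b) ** 2 for a, b in zip(f, mu))


def rerepresent(block_features, means, pos_dims=2):
    """Concatenate, per Gaussian component, the HOG part of the block it selects.

    Assumption: component k selects the block feature nearest to its mean,
    which maximizes N(f; mu_k, sigma_k^2 * I) over the sample's blocks.
    """
    out = []
    for mu in means:
        best = min(block_features, key=lambda f: squared_distance(f, mu))
        out.extend(best[:-pos_dims])  # keep the HOG part, drop [x, y]
    return out


# Toy sample: three blocks, each [hog (2 values), x, y].
blocks = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 8.0, 0.0], [0.5, 0.5, 0.0, 8.0]]
means = [[0.0, 1.0, 8.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
print(rerepresent(blocks, means))
```

Because every sample is reduced to exactly K selected descriptors, samples with different numbers of blocks end up with feature vectors of the same length, which is what the SVM in the next step needs.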

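Step 2) Training the SVM classifier: the SVM classifier is represented as

$$g(x)=\operatorname{sign}\Big(\sum_{i}\alpha_i\,y_i\,\kappa(x_i,x)+b\Big)$$

where $x_i$ are the re-represented training samples, $y_i\in\{+1,-1\}$ are their labels, and $\alpha_i$ and $b$ are the learned coefficients and bias. The kernel function $\kappa$ is the Gaussian RBF kernel, given by

$$\kappa(x_i,x_j)=\exp\big(-\eta\,\lVert x_i-x_j\rVert^2\big)$$

with kernel parameter $\eta>0$.

Step 3) Prediction with the SVM classifier: the classifier trained in step 2) is used to predict all candidate objects $T=\{t_i \mid i=1,\dots,M\}$ and to output the probability estimates.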
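How a trained RBF-kernel SVM turns a candidate's feature vector into a probability can be sketched without any library. The code below evaluates the kernel expansion and then applies a Platt-style sigmoid, which is the mechanism LibSVM uses for its probability estimates. The support vectors, coefficients, kernel parameter `gamma`, and sigmoid parameters `a` and `b` are hypothetical placeholders; in a real run they come from training (for example `svm-train -t 2 -b 1` in LibSVM).

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_value(x, support_vectors, dual_coefs, bias, gamma):
    """sum_i alpha_i * y_i * K(x_i, x) + b, with dual_coefs[i] = alpha_i * y_i."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


def platt_probability(value, a, b):
    """Sigmoid 1 / (1 + exp(a * value + b)) mapping a decision value to P(positive)."""
    return 1.0 / (1.0 + math.exp(a * value + b))


# Hypothetical model with one positive and one negative support vector.
svs, coefs = [[0.0, 0.0], [3.0, 3.0]], [1.0, -1.0]
value = decision_value([0.2, 0.1], svs, coefs, bias=0.0, gamma=0.5)
print(value, platt_probability(value, a=-2.0, b=0.0))
```

Re-scoring every candidate this way yields the per-candidate probability estimates described in step 3.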

Advantageous effects: compared with the prior art, the self-learning-based pedestrian detection method provided by the invention solves the problem that traditional pedestrian detection methods cannot adapt to a specific scene. It adapts a general pedestrian classifier to a specific scene by re-representing the candidate objects of that scene through the offline-trained Gaussian mixture model, and it advances pedestrian detection in specific scenes.

Drawings

FIG. 1 is a schematic structural view of the present invention;

FIG. 2 is a flow diagram of offline classifier training;

FIG. 3 is a flow chart of Gaussian mixture model training;

FIG. 4 is a flow chart of candidate screening;

FIG. 5 is a flow chart of classifier training.

Detailed Description

The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.

Example 1:

As shown in FIG. 1, the invention provides a self-learning-based pedestrian detection method comprising the following steps: first, an AdaBoost-based cascade classifier is trained as an offline classifier, and at the same time a Gaussian mixture model is trained on a set of public pedestrian photos, with HOG (histogram of oriented gradients) features and position information used for feature coding; next, the offline classifier is applied with a low threshold to detect pedestrians in a specific scene and outputs a confidence score for each candidate object; candidates with high confidence scores are then selected as positive samples and candidates with low confidence scores as negative samples, and the candidate detection objects are re-represented using the Gaussian mixture model; finally, a discriminative pedestrian classifier is trained online using an SVM (support vector machine), the candidate objects are re-predicted, and probability estimates are output.

Example 2:

As shown in FIG. 2, the offline classifier is trained using the cascade classifier in OpenCV, in two parts: data set preparation and running the training program. Data set preparation provides the training data: a positive sample set is created using opencv_createsamples, and a large number of negative sample pictures are prepared manually. Running the training program trains the cascade classifier: because training and detection with LBP features are several times faster than with Haar features, the feature type is set to LBP, and the cascade classifier is trained using opencv_traincascade.

Example 3:

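As shown in FIG. 3, the Gaussian mixture model includes two parts, namely feature coding and GMM training. The offline-trained Gaussian mixture model is expressed as

$$p(f;\Theta)=\sum_{k=1}^{K} w_k\,\mathcal{N}(f;\mu_k,\sigma_k^2 I)$$

where $K$ is the number of Gaussian mixture components, $I$ is the identity matrix, $w_k$ is the mixing weight of the $k$-th Gaussian component, $\mathcal{N}(f;\mu_k,\sigma_k^2 I)$ is a Gaussian distribution with mean $\mu_k$ and variance $\sigma_k^2 I$, and $f$ is the pedestrian sample feature from the feature coding part;

E step:

$$Q(\Theta,\Theta^{\mathrm{old}})=\sum_{i=1}^{M}\sum_{k=1}^{K}\gamma_{ik}\,\log\big(w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)\big)$$

where

$$\gamma_{ik}=\frac{w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)}{\sum_{j=1}^{K} w_j\,\mathcal{N}(f_i;\mu_j,\sigma_j^2 I)}$$

evaluated at the current parameter values $\Theta^{\mathrm{old}}$, is the posterior probability that the $k$-th Gaussian component generated feature $f_i$;

M step: update the parameters $\Theta$:

$$w_k=\frac{1}{M}\sum_{i=1}^{M}\gamma_{ik},\qquad \mu_k=\frac{\sum_{i=1}^{M}\gamma_{ik}\,f_i}{\sum_{i=1}^{M}\gamma_{ik}},\qquad \sigma_k^2=\frac{\sum_{i=1}^{M}\gamma_{ik}\,\lVert f_i-\mu_k\rVert^2}{d\sum_{i=1}^{M}\gamma_{ik}}$$

where $d$ is the dimension of the feature vector.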

Example 4:

As shown in FIG. 4, the specific method for screening the candidate objects is as follows: the offline-trained cascade classifier is used to detect pedestrians in the specific scene. To obtain more detection objects and ensure that all pedestrian images are captured, a low detection threshold is set, which also produces a large number of false alarms. Let all candidate objects in the specific scene be $T=\{t_i \mid i=1,\dots,M\}$, with each candidate outputting a corresponding confidence score, $S=\{s_i \mid i=1,\dots,M\}$. The confidence scores are sorted in descending order, and two thresholds $\lambda_h$ and $\lambda_l$ are set to mark the positive samples with high confidence scores and the negative samples with low confidence scores, respectively. $H$ and $N$ are defined as follows:

$$H=\{t_i : s_i>\lambda_h,\ i=1,\dots,M\}$$

$$N=\{t_i : s_i<\lambda_l,\ i=1,\dots,M\}$$

Balanced sets $H$ and $N$ are desired, so let $C=\min(|H|,|N|)$; then

$$H=\{t_1,\dots,t_C\}$$

$$N=\{t_{M-C+1},\dots,t_M\}$$

$$C<M/2$$

Example 5:

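As shown in FIG. 5, the specific method of the online classifier is as follows: for the positive and negative sample sets $H$ and $N$, the positive and negative samples are first re-represented using the offline-trained Gaussian mixture model; a classifier is then trained online using LibSVM; finally, the online classifier outputs, for each candidate object $t_i$, a probability estimate that it is positive. The specific steps are as follows:

Step 1) Feature representation: each Gaussian component selects one corresponding feature $f_{g_k}$ from the feature set of the pedestrian sample;

Step 2) Training the SVM classifier: the SVM classifier is represented as

$$g(x)=\operatorname{sign}\Big(\sum_{i}\alpha_i\,y_i\,\kappa(x_i,x)+b\Big)$$

where $x_i$ are the re-represented training samples, $y_i\in\{+1,-1\}$ are their labels, and $\alpha_i$ and $b$ are the learned coefficients and bias. The kernel function $\kappa$ is the Gaussian RBF kernel, given by

$$\kappa(x_i,x_j)=\exp\big(-\eta\,\lVert x_i-x_j\rVert^2\big)$$

with kernel parameter $\eta>0$.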

Claims

1. The pedestrian detection method based on self-learning is characterized by comprising the following steps: the method comprises the following specific steps: firstly, training an AdaBoost-based cascade classifier as an offline classifier, simultaneously training a Gaussian mixture model by using a group of public pedestrian photos, adopting HOG (histogram of oriented gradient) features and position information for feature coding, then adopting a low-threshold offline classifier to detect pedestrians in a specific scene, outputting confidence scores of candidate objects, then selecting high confidence scores as positive samples, taking low confidence scores as negative samples, re-representing the candidate detection objects by using the Gaussian mixture model, finally online training a discriminative pedestrian classifier by using an SVM (support vector machine) classifier, re-predicting the candidate objects and outputting probability estimation;

screening candidate objects: setting a low threshold value of an off-line classifier, performing pedestrian detection on a specific scene by using the off-line classifier to obtain candidate detection objects and confidence scores thereof, setting two threshold values to screen all candidate objects, and obtaining a positive and negative sample set;

an online classifier: for each pedestrian sample, regenerating a new feature representation by using a Gaussian mixture model, then training an SVM classifier on line, finally detecting candidate objects of a specific scene again by using the SVM classifier, and outputting probability estimation;

the Gaussian mixture model comprises two parts of feature coding and GMM training, and specifically comprises the following steps:

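feature coding: for each pedestrian picture in the INRIA data set, a three-layer Gaussian pyramid is first constructed, and overlapping image blocks are then extracted from each layer of the pyramid; assume that each pedestrian picture comprises $N$ image blocks $\{p_1,p_2,\dots,p_N\}$; the HOG feature $\mathrm{hog}_{p_i}$ of each image block and its location information $l_{p_i}=[x\ y]^T$ are extracted, and the feature of each image block is coded as $f_{p_i}=[\mathrm{hog}_{p_i}^T,\ l_{p_i}^T]^T$; all image blocks together constitute the pedestrian sample feature $f=\{f_{p_1},f_{p_2},\dots,f_{p_N}\}$;

GMM training: the offline-trained Gaussian mixture model is expressed as

$$p(f;\Theta)=\sum_{k=1}^{K} w_k\,\mathcal{N}(f;\mu_k,\sigma_k^2 I)$$

where $K$ is the number of Gaussian mixture components, $I$ is the identity matrix, $w_k$ is the mixing weight of the $k$-th Gaussian component, $\mathcal{N}(f;\mu_k,\sigma_k^2 I)$ is a Gaussian distribution with mean $\mu_k$ and variance $\sigma_k^2 I$, and $f$ is the pedestrian sample feature from the feature coding part;

the EM algorithm computes the solution by alternating two steps: the expectation (E) step computes the expected value of the log-likelihood, and the maximization (M) step updates the parameters to the values that maximize that expectation, as follows:

E step:

$$Q(\Theta,\Theta^{\mathrm{old}})=\sum_{i=1}^{M}\sum_{k=1}^{K}\gamma_{ik}\,\log\big(w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)\big)$$

where

$$\gamma_{ik}=\frac{w_k\,\mathcal{N}(f_i;\mu_k,\sigma_k^2 I)}{\sum_{j=1}^{K} w_j\,\mathcal{N}(f_i;\mu_j,\sigma_j^2 I)}$$

evaluated at the current parameter values $\Theta^{\mathrm{old}}$, is the posterior probability that the $k$-th Gaussian component generated feature $f_i$;

M step: update the parameters $\Theta$:

$$w_k=\frac{1}{M}\sum_{i=1}^{M}\gamma_{ik},\qquad \mu_k=\frac{\sum_{i=1}^{M}\gamma_{ik}\,f_i}{\sum_{i=1}^{M}\gamma_{ik}},\qquad \sigma_k^2=\frac{\sum_{i=1}^{M}\gamma_{ik}\,\lVert f_i-\mu_k\rVert^2}{d\sum_{i=1}^{M}\gamma_{ik}}$$

where $d$ is the dimension of the feature vector, $i=1,\dots,M$, and $M$ is the number of pedestrian samples in the INRIA data set.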

2. The self-learning based pedestrian detection method according to claim 1, wherein: the offline classifier is trained using the cascade classifier in OpenCV, in two parts, namely data set preparation and running the training program; data set preparation provides the training data: a positive sample set is created using opencv_createsamples, and negative sample pictures are prepared manually; running the training program trains the cascade classifier: the feature type is set to LBP, and the cascade classifier is trained using opencv_traincascade.

3. The self-learning based pedestrian detection method according to claim 1, wherein: the specific method for screening the candidate objects is as follows: the offline-trained cascade classifier is used to detect pedestrians in the specific scene; to obtain more detection objects and ensure that all pedestrian images are captured, a low detection threshold is set, which also produces a large number of false alarms; let all candidate objects in the specific scene be $T=\{t_i \mid i=1,\dots,M\}$, with each candidate outputting a corresponding confidence score, $S=\{s_i \mid i=1,\dots,M\}$; the confidence scores are sorted in descending order, and two thresholds $\lambda_h$ and $\lambda_l$ are set to mark the positive samples with high confidence scores and the negative samples with low confidence scores, respectively; $H$ and $N$ are defined as follows:

$$H=\{t_i : s_i>\lambda_h,\ i=1,\dots,M\}$$

$$N=\{t_i : s_i<\lambda_l,\ i=1,\dots,M\}$$

balanced sets $H$ and $N$ are desired, so let $C=\min(|H|,|N|)$; then

$$H=\{t_1,\dots,t_C\}$$

$$N=\{t_{M-C+1},\dots,t_M\}$$

$$C<M/2.$$

4. The self-learning based pedestrian detection method according to claim 3, characterized in that: the specific method of the online classifier is as follows: for the positive and negative sample sets $H$ and $N$, the positive and negative samples are first re-represented using the offline-trained Gaussian mixture model; a classifier is then trained online using LibSVM; finally, the online classifier outputs, for each candidate object $t_i$, a probability estimate that it is positive; the specific steps are as follows:

Step 1) Feature representation: each Gaussian component selects one corresponding feature $f_{g_k}$ from the feature set of the pedestrian sample;

Step 2) Training the SVM classifier: the SVM classifier is represented as

$$g(x)=\operatorname{sign}\Big(\sum_{i}\alpha_i\,y_i\,\kappa(x_i,x)+b\Big)$$

where $x_i$ are the re-represented training samples, $y_i\in\{+1,-1\}$ are their labels, and $\alpha_i$ and $b$ are the learned coefficients and bias; the kernel function $\kappa$ is the Gaussian RBF kernel, given by

$$\kappa(x_i,x_j)=\exp\big(-\eta\,\lVert x_i-x_j\rVert^2\big)$$

with kernel parameter $\eta>0$.