Background
Far infrared imaging is particularly suited to detecting living objects ahead of a vehicle, such as pedestrians, based on temperature differences. At present, driver-assistance systems that detect pedestrians with vehicle-mounted far infrared cameras are becoming popular. The core component of such a system is usually a vehicle-mounted far infrared pedestrian classifier, and its performance directly determines the performance of the whole driver-assistance system. However, pedestrians on the road wear widely varying clothing (so the infrared radiation they emit varies widely), adopt varied postures, and appear against dynamically changing outdoor backgrounds, so designing a robust real-time far infrared pedestrian classifier is a challenging task. Nevertheless, such a classifier is particularly important for vehicle-mounted driver-assistance systems and has significant commercial and research value.
Because pedestrian detection is essentially a classification problem, pedestrian classification is usually realized with classifiers that excel at binary classification, such as the support vector machine and Adaboost. The performance of a far infrared pedestrian classifier depends directly on the designed features, so designing a robust real-time feature extraction method is particularly important.
Liu Qiang et al. (Robust and fast pedestrian detection method for far-infrared automotive driving assistance systems [J]. Infrared Physics & Technology, 2013, 60:288-299.) focused on the relatively information-rich edges of infrared pedestrians and improved the HOG feature with entropy weighting and an image pyramid, providing an entropy-weighted HOG feature with higher accuracy than the plain HOG feature. However, the improvement relies on the infrared pedestrian having fairly pronounced edges, and it is not significant when the temperature difference between the pedestrian and the background is small.
Miron Alina et al. (Intensity self similarity features for pedestrian detection in far-infrared images [C]// Intelligent Vehicles Symposium. IEEE, 2012.) noted that local areas of an infrared pedestrian, such as the left and right shoulders or the left and right thighs, image similarly, and proposed an intensity self-similarity (ISS) feature based on this internal self-similarity. The feature is designed specifically for far infrared pedestrian classification and achieves good descriptive power on a particular dataset. However, it does not extract gradient information from the infrared image, and its generalization ability is weak.
Hurney Patrick et al. (Night-time pedestrian classification with histograms of oriented gradients-local binary patterns vectors [J]. IET Intelligent Transport Systems, 2015, 9(1):75-85.) performed far infrared pedestrian feature extraction by combining HOG features with local binary pattern (LBP) features. The fusion improves on the classification accuracy of either feature alone. However, the HOG features are merely concatenated in series with the LBP features; the HOG feature itself is not improved.
The pedestrian detection method and system based on vehicle-mounted infrared video (Chinese patent grant No. CN108319906A, granted 2018-07-24) adopts integral maps and channel features in its feature extraction stage, extracting color and gradient features of the infrared video to realize pedestrian classification. However, in its use of gradient features, voting is performed with gradient magnitudes according to gradient orientation, just as in the conventional HOG feature; the distribution relationship of gradient magnitudes is not exploited for infrared pedestrian target feature extraction.
The pedestrian detection method for night-time far infrared images (Chinese patent grant No. CN105787456A, granted 2016-07-20) uses Haar features in its feature extraction stage and thus achieves a relatively high detection speed. However, Haar features extract only local brightness differences of the image; although integral-image computation makes them efficient, their ability to characterize infrared pedestrians is weak.
The infrared-based night-time pedestrian detection method for intelligent vehicles (Chinese patent grant No. CN105787456A, granted 2016-03-16) designs a multistage binary pattern feature to describe the region of interest in its feature extraction stage. Binary pattern features are good at describing the texture information of an image, but far infrared pedestrians naturally lack texture information, so the characterization ability of this feature extraction method for infrared pedestrians is limited.
In summary, although vehicle-mounted pedestrian classification based on far infrared imaging has achieved certain results, further improvement in both robustness and real-time performance is urgently needed to meet the requirements of practical applications.
Disclosure of Invention
The embodiments of the invention aim to provide a far infrared pedestrian training method based on a gradient-magnitude-distribution gradient orientation histogram, so as to solve the problems of existing vehicle-mounted pedestrian classification methods based on far infrared cameras: unsatisfactory recognition accuracy and difficulty in balancing real-time performance with robustness.
The far infrared pedestrian training method of the gradient-magnitude-distribution gradient orientation histogram improves the Histogram of Oriented Gradients (HOG) feature based on gradient magnitude distribution to obtain an improved coded HOG (CHOG) feature, and performs CHOG-based support vector machine training on training samples for vehicle-mounted far infrared pedestrian classification. Training the vehicle-mounted far infrared pedestrian classifier specifically comprises the following steps: step one, improving the HOG feature based on gradient magnitude distribution; step two, extracting CHOG features of training samples for vehicle-mounted far infrared pedestrian classification; step three, performing CHOG-based support vector machine training on the training samples. In step two, extracting training samples for vehicle-mounted far infrared pedestrian classification means analyzing vulnerable road user targets in road scenes according to the practical requirements of pedestrian-detection driver assistance with a vehicle-mounted far infrared camera, so as to collect the training samples; the specific vulnerable road user targets of the invention include two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians.
The CHOG feature refers to the improved feature designed in step one. In step three, the CHOG-based support vector machine training of the training samples means training with the improved CHOG feature designed in step one, separately for the four different categories of training samples (two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians), so as to obtain four linear support vector machine models. When the models are used for sample testing, a test sample is judged to be a non-pedestrian only if all four models classify it as a negative sample; otherwise it is judged to be a pedestrian.
Further, improving the HOG feature based on gradient magnitude distribution in step one means that, since the current HOG feature does not describe the distribution relationship of gradient magnitudes, the gradient magnitude of each pixel is binary-encoded during HOG computation, and the decimal value of the binary code is used as the current pixel value. The normalized histogram of the whole sample image is then computed, completing the description of the gradient magnitude distribution, and the resulting new feature is concatenated in series with the original HOG feature to obtain the improved coded HOG feature, namely the CHOG feature.
Compared with existing pedestrian classification techniques based on vehicle-mounted far infrared cameras, the far infrared pedestrian training method of the gradient-magnitude-distribution gradient orientation histogram has the following advantages and effects. On the basis of the traditional HOG feature, and aiming at its failure to describe the distribution of gradient magnitudes, the gradient magnitude of each pixel is binary-encoded during HOG computation and a statistical histogram completes the description of the gradient magnitude distribution; the resulting new feature is concatenated in series with the original HOG feature to obtain the improved CHOG feature, which remedies the insufficient representation ability of the traditional HOG feature for infrared pedestrian targets and lays a good foundation for the subsequent machine learning classification. Considering that pedestrian targets come in many types in driver-assistance pedestrian detection, the targets are trained separately as four categories (two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians); compared with training all pedestrian targets as one class, this effectively reduces the intra-class variance and, together with the improved CHOG feature, raises the accuracy of the classifier. The training method for the vehicle-mounted far infrared pedestrian classifier balances classification accuracy and real-time performance well, can be used in the core classifier design stage of a vehicle-mounted driver-assistance system, and can easily be migrated to fields such as pedestrian detection, recognition and tracking in video surveillance.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The principles of the invention will be further described with reference to the drawings and specific examples.
As shown in fig. 1, a far infrared pedestrian training method of gradient magnitude distribution gradient orientation histogram according to an embodiment of the present invention includes the following steps:
s101, improving HOG characteristics based on gradient amplitude distribution;
s102, extracting CHOG features of training samples for vehicle-mounted far infrared pedestrian classification;
s103, training the training samples by using a support vector machine based on CHOG;
In step S101, improving the HOG feature based on gradient magnitude distribution means that, since the current HOG feature does not describe the distribution relationship of gradient magnitudes, the gradient magnitude of each pixel is binary-encoded during HOG computation, and the decimal value of the binary code is used as the current pixel value. The normalized histogram of the whole sample image is then computed, completing the description of the gradient magnitude distribution, and the resulting new feature is concatenated in series with the original HOG feature to obtain the improved coded HOG feature, namely the CHOG feature.
In step S102, extracting training samples for vehicle-mounted far infrared pedestrian classification means analyzing vulnerable road user targets in Chinese road scenes according to the practical requirements of pedestrian-detection driver assistance with a vehicle-mounted far infrared camera, so as to collect the training samples; the specific vulnerable road user targets of the invention include two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians. The CHOG feature refers to the improved feature designed in step one.
In step S103, the CHOG-based support vector machine training of the training samples means training with the improved CHOG feature designed in step one, separately for the four different categories of training samples (two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians), so as to obtain four linear support vector machine models. When the models are used for sample testing, a test sample is judged to be a non-pedestrian only if all four models classify it as a negative sample; otherwise it is judged to be a pedestrian.
As shown in fig. 2, the far infrared pedestrian training method of the gradient amplitude distribution gradient orientation histogram in the embodiment of the invention mainly comprises a feature improvement module A, a feature extraction module B and a classifier training module C.
The feature improvement module A adds a description of the gradient magnitude distribution relationship to the HOG feature, improving it to obtain CHOG.
The feature extraction module B analyzes vulnerable road user targets in Chinese road scenes so as to collect the four classes of positive pedestrian samples and corresponding negative samples for vehicle-mounted far infrared pedestrian classification, and extracts their CHOG features.
The classifier training module C trains, offline, the support vectors and intercepts of linear support vector machines based on the new CHOG feature proposed by module A and the four classes of samples prepared by module B.
Specific examples of the invention:
the overall flow of the method is shown in fig. 1, and the main body of the method comprises three parts: 1. improving HOG characteristics based on the gradient magnitude distribution; 2. extracting CHOG characteristics of training samples for vehicle-mounted far infrared pedestrian classification; 3. and performing CHOG-based support vector machine training on the training samples.
1. Improving HOG features based on gradient magnitude distribution
The invention observes that the traditional HOG feature does not extract gradient magnitude distribution characteristics from the far infrared sample image, while the distribution of gradient magnitudes can simultaneously represent the texture and contour information of the whole sample image; therefore, a novel CHOG feature based on gradient magnitude distribution statistics is proposed to represent far infrared pedestrian samples and thereby improve the HOG feature.
The CHOG feature extraction takes a sample as input and mainly comprises the following three substeps: 1) extracting the traditional HOG feature; 2) extracting the feature coded_feature based on the statistical distribution of gradients; 3) concatenating and normalizing HOG with coded_feature to obtain the CHOG feature. These three substeps are described below.
1.1 extraction of traditional HOG characteristics
The specific steps for extracting the HOG feature are as follows: 1) Gamma correction; 2) computing the gradient magnitude and orientation of each pixel of the image; 3) obtaining the features of each block; 4) concatenating the features of all blocks in series to obtain the HOG feature.
1) Gamma correction
Firstly, the sample image is uniformly scaled to 64×32 pixels by a nearest-neighbor interpolation algorithm to obtain a scaled image f. Then f is normalized, i.e. its pixel values are converted to real numbers in [0, 1], giving f_N(x, y), as computed in formula (1). Next, the pixel values are pre-compensated to obtain f_G(x, y), with the pre-compensation gamma value set to 2, as computed in formula (2). Finally, inverse normalization is performed according to formula (3): the pre-compensated real values f_G(x, y) are transformed back to integer values in [0, 255], giving the Gamma-corrected image Img.

f_N(x, y) = f(x, y) / 255    (1)

f_G(x, y) = f_N(x, y)^(1/2)    (2)

Img(x, y) = f_G(x, y) × 256 − 0.5    (3)

where f(x, y) denotes the gray value of image f at (x, y), and Img(x, y) denotes the gray value of image Img at (x, y).
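A minimal sketch of the Gamma-correction step above, assuming normalization divides by 255 and pre-compensation raises to the power 1/γ with γ = 2 (standard forms consistent with the description):

```python
import numpy as np

def gamma_correct(img, gamma=2.0):
    """Gamma pre-compensation sketch: normalize to [0, 1],
    apply f_G = f_N ** (1 / gamma), then map back to [0, 255].
    Nearest-neighbour resizing to 64x32 is assumed done upstream."""
    f_n = img.astype(np.float64) / 255.0               # formula (1): normalize
    f_g = np.power(f_n, 1.0 / gamma)                   # formula (2): pre-compensate
    out = np.clip(np.round(f_g * 256 - 0.5), 0, 255)   # formula (3): back to [0, 255]
    return out.astype(np.uint8)

sample = np.array([[0, 64], [128, 255]], dtype=np.uint8)
corrected = gamma_correct(sample)
```

Note that with γ = 2 the pre-compensation brightens mid-tones, which suits low-contrast far infrared imagery.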
2) Calculating the gradient magnitude and direction of each pixel of the image
The x-direction gradient G_x(x, y) of pixel Img(x, y) is calculated according to formula (4); the y-direction gradient G_y(x, y) according to formula (5); the gradient orientation θ(x, y) of pixel Img(x, y) according to formula (6); and the gradient magnitude α(x, y) of pixel Img(x, y) according to formula (7).

G_x(x, y) = Img(x + 1, y) − Img(x − 1, y)    (4)

G_y(x, y) = Img(x, y + 1) − Img(x, y − 1)    (5)

θ(x, y) = arctan(G_y(x, y) / G_x(x, y))    (6)

α(x, y) = √(G_x(x, y)² + G_y(x, y)²)    (7)
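The gradient computation of this step can be sketched in Python; handling of image borders is not specified in the text, so they are simply left at zero here:

```python
import numpy as np

def gradients(img):
    """Central-difference gradients per formulas (4)-(7)."""
    img = img.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # (4): Img(x+1,y) - Img(x-1,y)
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # (5): Img(x,y+1) - Img(x,y-1)
    theta = np.degrees(np.arctan2(gy, gx))   # (6): gradient orientation
    alpha = np.hypot(gx, gy)                 # (7): gradient magnitude
    return theta, alpha

# brightness increasing along x: interior gradients are purely horizontal
ramp = np.tile(np.arange(8, dtype=np.float64), (8, 1))
theta, alpha = gradients(ramp)
```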
3) Obtaining characteristics of blocks
The resulting image is divided into cells of 8×8 pixels, and the gradient orientation histogram of each cell (with bins at 20° intervals) is computed to form the feature of that cell. Each 2×2 group of cells constitutes a block, and the HOG feature of a block is obtained by concatenating the features of all cells within it in series.
4) Obtaining HOG features
The features of all the blocks obtained in step 3) are concatenated in series to obtain the HOG feature.
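The cell and block construction of steps 3) and 4) can be illustrated with a simplified sketch; hard bin assignment is used and block normalization is omitted, so this approximates the described procedure rather than reproducing a full HOG implementation:

```python
import numpy as np

def cell_histograms(theta, alpha, cell=8, bins=9):
    """Magnitude-weighted orientation histograms per 8x8 cell, with
    20-degree bins over [0, 180); hard assignment is a simplification."""
    h, w = theta.shape
    theta = np.mod(theta, 180.0)                    # unsigned orientation
    hists = np.zeros((h // cell, w // cell, bins))
    for cy in range(h // cell):
        for cx in range(w // cell):
            t = theta[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell]
            a = alpha[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell]
            idx = np.minimum((t // 20).astype(int), bins - 1)
            for b in range(bins):                   # magnitude-weighted votes
                hists[cy, cx, b] = a[idx == b].sum()
    return hists

def block_features(hists):
    """Concatenate each 2x2 group of cells into one block feature."""
    ch, cw, bins = hists.shape
    blocks = [hists[y:y+2, x:x+2].ravel()
              for y in range(ch - 1) for x in range(cw - 1)]
    return np.concatenate(blocks)

# a 16x16 toy image gives 2x2 cells and hence a single 36-dim block
theta = np.zeros((16, 16))
alpha = np.ones((16, 16))
feat = block_features(cell_histograms(theta, alpha))
```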
1.2 Extracting the feature coded_feature based on gradient statistical distribution
The gradient magnitude α(x, y) of each pixel in the sample, obtained during HOG feature extraction, is taken as input. The specific implementation steps of the gradient statistical distribution feature proposed in this patent are: 1) global normalization of the gradient magnitudes; 2) gradient magnitude encoding; 3) computing the gradient magnitude histogram and normalizing the gradient magnitude distribution.
1) Global normalization of gradient amplitude
First, α(x, y) is L2-normalized according to formula (8), where α_i denotes the value of α at position (x, y) and ε is set to 0.01.

α̂_i = α_i / √(‖α‖₂² + ε²)    (8)
2) Gradient amplitude encoding
Centered at each α̂(x, y), the values in its 3×3-pixel neighborhood are discretized, taking 0.2 as the interval width (as shown in fig. 3(b)). The discretized values are then binary-encoded as follows: within the 3×3-pixel neighborhood, every pixel is compared with the center value, and is encoded as 0 when it is smaller than the center value and as 1 otherwise (as shown in fig. 3(d)). After the binary encoding, a binary string is read in counterclockwise order starting from the upper-left corner position (as shown in fig. 3(e)), and this binary number is converted to decimal, completing the computation of the coded value at position (x, y). This yields the encoded image.
3) Calculating a gradient amplitude histogram and normalizing the gradient amplitude distribution
The histogram of the encoded image over its 256 possible code values is computed and then normalized according to formula (8), giving the 256-dimensional coded_feature.
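The encoding and histogram steps can be sketched as follows; the exact counterclockwise neighbor ordering and the handling of border pixels are assumptions here, since fig. 3 is not reproduced:

```python
import numpy as np

def coded_feature(alpha, eps=0.01):
    """Sketch of the gradient-magnitude coding: L2-normalize the magnitude
    map (formula (8)), discretize in steps of 0.2, then compare each 3x3
    neighbourhood to its centre LBP-style (0 if smaller, else 1), reading
    the 8 bits from the upper-left corner; the precise bit order is an
    assumption. Returns the normalized 256-bin histogram (coded_feature)."""
    a = alpha / np.sqrt(np.sum(alpha ** 2) + eps ** 2)   # global L2 normalization
    d = np.floor(a / 0.2)                                # 0.2-wide intervals
    # neighbour offsets (dy, dx) starting at the upper-left corner
    offs = [(-1, -1), (0, -1), (1, -1), (1, 0),
            (1, 1), (0, 1), (-1, 1), (-1, 0)]
    h, w = d.shape
    code = np.zeros((h, w), dtype=np.int64)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            bits = [0 if d[y+dy, x+dx] < d[y, x] else 1 for dy, dx in offs]
            code[y, x] = int("".join(map(str, bits)), 2)  # binary -> decimal
    hist = np.bincount(code[1:-1, 1:-1].ravel(), minlength=256).astype(np.float64)
    return hist / np.sqrt(np.sum(hist ** 2) + eps ** 2)   # normalize per (8)

alpha = np.random.default_rng(0).random((16, 16))
cf = coded_feature(alpha)
```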
1.3 Serial normalization of HOG with the gradient-statistical-distribution feature coded_feature
The HOG feature and the coded_feature feature are concatenated in series and normalized to obtain the final CHOG feature.
2. CHOG features for extracting training samples for vehicle-mounted far infrared pedestrian classification
Data were collected with a vehicle-mounted far infrared camera in expressway, national road, urban and suburban scenes, yielding 100 hours of video, from which frames were randomly sampled to obtain 100,000 original infrared images. All two-wheeled vehicle riders, tricycle riders, walking pedestrians and running pedestrians appearing in the images were labeled manually, and the pedestrian targets of these four classes form the datasets Dataset1, Dataset2, Dataset3 and Dataset4 in turn. From 50,000 far infrared images containing no pedestrians, non-pedestrian samples were cropped with sliding windows of 32×64, 48×96 and 96×198 pixels at horizontal and vertical steps of 8 pixels, forming the dataset Dataset0. On this basis, the sample sets Dataset0 and Dataset1, Dataset0 and Dataset2, Dataset0 and Dataset3, Dataset0 and Dataset4 are combined in turn, and all samples are uniformly scaled to 48×96 pixels with a bilinear interpolation algorithm, giving four training datasets: Dataset1_0, Dataset2_0, Dataset3_0 and Dataset4_0. This completes the collection of training samples for vehicle-mounted far infrared pedestrian classification. Finally, the CHOG features of the four datasets are extracted, completing the extraction of CHOG features of the training samples for vehicle-mounted far infrared pedestrian classification.
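The sliding-window cropping described above can be sketched by enumerating window corners at 8-pixel strides; the 384×288 frame size used below is purely illustrative:

```python
def sliding_windows(img_h, img_w, win_w, win_h, step=8):
    """Top-left corners (x, y) of all sliding-window crops at the given
    stride; one call per window size (32x64, 48x96, 96x198 in the text)."""
    return [(x, y)
            for y in range(0, img_h - win_h + 1, step)
            for x in range(0, img_w - win_w + 1, step)]

# e.g. counting the 32x64 negative-sample crops on a hypothetical 384x288 frame
corners = sliding_windows(288, 384, 32, 64, step=8)
```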
3. CHOG-based support vector machine training of training samples
For the training samples in the datasets Dataset1_0, Dataset2_0, Dataset3_0 and Dataset4_0, a linear support vector machine model based on the CHOG feature is trained on each, so that four CHOG-based support vector machine classifiers, Classifier1, Classifier2, Classifier3 and Classifier4, are obtained in turn.
The linear support vector machine completes classification by finding a hyperplane that separates the training samples. The separation principle is to maximize the margin, which is realized mathematically by solving a convex quadratic programming problem; with a kernel, a support vector machine can also map the training samples from the original space to a higher-dimensional space in which they become linearly separable. When training the linear support vector machine, the support vector weights w and the intercept b are obtained by solving formula (9).

min_{w, b} (1/2)‖w‖²  subject to  y_i(wᵀx_i + b) ≥ 1, i = 1, …, N    (9)

where w is the decision weight obtained by training, b is a constant offset, y_i is the label of the i-th training sample, and x_i is the CHOG feature of the i-th training sample.
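A toy sketch of training one of the four linear models, using scikit-learn's LinearSVC as a stand-in solver for the margin problem of formula (9); the random features below merely stand in for real CHOG vectors:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy stand-in for one per-category training set: rows are feature
# vectors (CHOG in the patent; separable random data here).
rng = np.random.default_rng(1)
pos = rng.normal(loc=+2.0, size=(50, 10))   # pedestrian samples, label +1
neg = rng.normal(loc=-2.0, size=(50, 10))   # non-pedestrian samples, label -1
X = np.vstack([pos, neg])
y = np.array([1] * 50 + [-1] * 50)

clf = LinearSVC(C=1.0)                      # soft-margin linear SVM solver
clf.fit(X, y)
w, b = clf.coef_.ravel(), clf.intercept_[0] # trained weights and intercept
```

Repeating this once per category dataset yields the four models Classifier1 through Classifier4.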
When the classifiers are used to classify candidate regions, a candidate region is first uniformly scaled to 48×96 pixels with a bilinear interpolation algorithm and its CHOG feature is extracted; classification is then performed according to the decision function of the linear support vector machine shown in formula (10).

f(x) = Σ_i λ_i y_i K(x_i, x) + b    (10)

K(x_i, x) = x_iᵀ x    (11)

where K(x_i, x) is the linear kernel function, specifically defined as shown in formula (11); x_i is a support vector with Lagrange coefficient λ_i and label y_i; x is the CHOG feature vector of the candidate region; b is a constant offset; and f(x) is the response to the input vector x. If f(x) > 0, the candidate is classified as a pedestrian target; otherwise it is judged a negative sample. Only when Classifier1, Classifier2, Classifier3 and Classifier4 all classify the candidate as a negative sample (i.e. every f(x) ≤ 0) is it finally judged a non-pedestrian target, thereby completing the vehicle-mounted far infrared pedestrian classification.
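The four-classifier decision rule can be sketched directly; the (w, b) pairs below are hypothetical stand-ins for the trained Classifier1 through Classifier4 parameters:

```python
import numpy as np

def decide_pedestrian(x, models):
    """Combine the four per-category linear classifiers: the candidate is a
    non-pedestrian only when every decision value f(x) = w.x + b is
    non-positive; otherwise it is a pedestrian."""
    responses = [np.dot(w, x) + b for w, b in models]
    return any(f > 0 for f in responses)

# hypothetical (w, b) pairs for the four classifiers on 3-dim features
models = [(np.array([1.0, 0.0, 0.0]), -0.5),
          (np.array([0.0, 1.0, 0.0]), -0.5),
          (np.array([0.0, 0.0, 1.0]), -0.5),
          (np.array([1.0, 1.0, 1.0]), -3.0)]

is_ped = decide_pedestrian(np.array([0.9, 0.1, 0.1]), models)   # first model fires
is_neg = decide_pedestrian(np.array([0.1, 0.1, 0.1]), models)   # all non-positive
```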