WO2019196131A1

WO2019196131A1 - Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection

Info

Publication number: WO2019196131A1
Application number: PCT/CN2018/083480
Authority: WO
Inventors: 许瑞霖; 刘琼; 彭绍武; 吴继平
Original assignee: 广州飒特红外股份有限公司
Priority date: 2018-04-12
Filing date: 2018-04-18
Publication date: 2019-10-17
Also published as: CN108549864B; CN108549864A

Abstract

Disclosed are a method and apparatus for filtering regions of interest (RoIs) for vehicle-mounted thermal imaging pedestrian detection. The method for filtering RoIs refers to a method for filtering out non-pedestrian RoIs by designing three layers of cascaded filters: in the first layer, by means of calculating pedestrian pixel heights and aspect ratios of RoIs and setting corresponding threshold intervals, filtering out RoIs of abnormal sizes; in the second layer, respectively calculating a vertical distance between upper and lower boundaries of each of the RoIs one by one and a current image pavement reference, and calculating threshold values based on pixel heights of the RoIs, so as to filter out RoIs with abnormal locations; and in the third layer, searching for a possible pedestrian head region according to a luminance vertical projection difference curve of each of the RoIs, and comparing the degree of difference between Haar-like features of the head region and those of an adjacent background region, so as to filter out RoIs without pedestrian heads. By means of the method, on the premise that pedestrian detection accuracy is considered, the calculation overhead of pedestrian detection can be reduced, and the scene adaptability of a classifier can be improved. The apparatus for filtering RoIs comprises a filter for RoIs of abnormal sizes, a filter for RoIs with abnormal locations and a filter for RoIs without heads.

Description

Region of interest filtering method and device for on-board thermal imaging pedestrian detection

Technical field

The present invention relates to pedestrian detection and, more particularly, to a Regions of Interest (RoIs) filtering method and apparatus for on-board thermal imaging pedestrian detection.

Background technique

On-board thermal imaging pedestrian detection refers to using an infrared camera as the visual sensor to capture images or video of vehicle traffic scenes, using machine learning methods on a computer or embedded platform to identify all pedestrian targets present in the images or video, and marking the position of each pedestrian in the image with the coordinates of its minimum bounding rectangle.

This process consists of two key phases, RoI extraction and RoI classification, in which the main factors affecting computational overhead and accuracy are the number of extracted RoIs and the performance of the classifier used. To meet high recall requirements, the extraction phase usually produces a large number of RoIs. However, pedestrian targets are rare in the image: most RoIs contain only background information, and part of that background differs greatly from the characteristics of pedestrians. Running the classifier on all of these RoIs incurs a heavy computational overhead, so a method is needed that reduces the number of RoIs to be classified while preserving accuracy.

Compared with computers, in-vehicle embedded platforms have obvious computational performance bottlenecks. Many published pedestrian detection methods, especially those using deep learning algorithms, cannot be applied to such platforms, which affects the detection rate and real-time performance of practical applications. For example, the DM6437 vehicle platform produced by Texas Instruments is highly stable, but its processor is single-core with a maximum clock frequency of only 600 MHz, and a classifier based on "HOG features + linear SVM" takes about 3 milliseconds to process a single RoI, which is far below the computational performance of an ordinary computer. To bring pedestrian detection into practical use, a solution that balances computational overhead against detection performance must be found.

In the RoI extraction phase, some published methods screen the foreground areas in which pedestrians may exist according to the characteristics of the targets in the image. For example:

Prior Art 1: Ge J, Luo Y, Tei G. Real-Time Pedestrian Detection and Tracking at Nighttime for Driver-Assistance Systems [J]. IEEE Transactions on Intelligent Transportation Systems, 2009, 10(2): 283-298. Based on the observation that target pixels are brighter than the surrounding background on the same horizontal line, RoIs are extracted from the near-infrared image by calculating the upper and lower segmentation thresholds in the local neighborhood of each pixel.

Prior Art 2: Uijlings J R R, Sande K E A V D, Gevers T, et al. Selective Search for Object Recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171. A selective search method is proposed. The main idea is to divide the image into small similar regions in different color spaces of the visible light image, and then use a region merging algorithm to merge small regions with high similarity in color, texture and size into large regions.

Prior Art 3: Zitnick C L, Dollár P. Edge Boxes: Locating Object Proposals from Edges [C]//European Conference on Computer Vision. Springer, Cham, 2014: 391-405. The EdgeBox method is proposed to find the RoIs containing the complete object according to the relationship between the closed contour and the cross contour in the local area.

Compared with the sliding window method, the methods of prior art 1 to 3 reduce the number of RoIs significantly, but the number still threatens real-time performance: the method of prior art 2 obtains about 2000 RoIs per image on average, and the method of prior art 3 takes approximately 0.2 s to process a single image on a computer. Nevertheless, prior art 1 to 3 offers an idea worth borrowing, namely filtering out non-pedestrian RoIs in advance at relatively small computational cost, thereby reducing the number of RoIs to be detected.

In the RoI classification phase, positive and negative samples of sufficient quantity and quality are an effective way to improve classifier performance. Publicly available benchmark data sets for thermal imaging pedestrian detection are very scarce, and the present invention uses the SCUT Dataset published by the laboratory ( http://www2.scut.edu.cn/cv/scut_fir_pedestrian_dataset/ ). The data set covers traffic road scenes in Guangzhou. It contains 100 infrared thermal imaging videos with about 200,000 frames in total and about 400,000 annotated Ground-Truth entries, and it covers different pedestrian target types, such as "single walking pedestrian" and "single cycling pedestrian". Compared with other public thermal imaging pedestrian detection data sets such as the KAIST Dataset, it has advantages in the number of image frames, the type and quantity of Ground-Truth information, and the variety of road scenes.

In summary, although on-board thermal imaging pedestrian detection has achieved certain results, the trade-off between real-time performance and accuracy imposed by the computational bottleneck and classifier performance means that many methods cannot perform normally, or cannot be used at all. To meet the requirements of practical applications, further improvements in detection time and detection accuracy are urgently needed.

Summary of the invention

It is an object of the present invention to provide a RoIs filtering method and apparatus for on-board thermal imaging pedestrian detection, which aims to facilitate solving problems such as a decrease in accuracy caused by a computational performance bottleneck and failure to satisfy real-time performance. The present invention is achieved by the following technical solutions.

In order to achieve the above object, the present invention provides a Regions of Interest (RoIs) filtering method for on-board thermal imaging pedestrian detection, the method comprising: calculating the pedestrian pixel height and the RoI aspect ratio, setting corresponding threshold intervals, and filtering out RoIs of abnormal size; calculating, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filtering out RoIs that lack a pedestrian head.

According to another aspect of the present invention, filtering out the RoIs of abnormal size includes: calculating a threshold interval for the pixel height of pedestrian RoIs from the image focal length f, the pedestrian height height_target and the detection distance:

$$\mathit{height}_{\mathit{pixel}} \approx \mathit{height}_{\mathit{target}} \times f / \mathit{distance} \qquad (1)$$

where height_pixel is the pixel height of the pedestrian target in the image, whose values over the detection range give the threshold interval, height_target is the height of the pedestrian target, f is the image focal length, and distance is the detection distance;

obtaining the Gaussian distribution of the aspect ratio of pedestrian RoIs by statistical analysis, and selecting an appropriate confidence level to obtain the aspect ratio threshold interval; and evaluating each RoI to be detected, where RoIs that do not satisfy both interval conditions are RoIs of abnormal size and are removed.

According to another aspect of the present invention, filtering out the RoIs with abnormal positions includes: obtaining the road surface reference of the current image using a horizontal road surface hypothesis method; and calculating, for each RoI to be judged, the distance between its upper and lower boundaries and the road surface reference in the y-axis direction, where the y-axis direction is the vertical direction of the RoI, and calculating the threshold based on the pixel height RoI_h of the current RoI according to formula (2):

where α and β are scaling factors and ε is an offset noise factor;

and filtering out the RoIs to be detected whose distance result does not meet the threshold.

According to another aspect of the present invention, filtering out the RoIs that lack a pedestrian head includes: dividing the upper region of the current RoI into three parts in the horizontal direction using a pedestrian head adaptive positioning algorithm, the middle part being named the head region and the left and right parts being named the background regions; and estimating the degree of difference between the mean luminance of the head region and that of the background regions using a Haar-like feature-based method, and removing the RoIs that lack a head according to a preset threshold.

According to another aspect of the present invention, the pedestrian head adaptive positioning algorithm processes the upper region of the current RoI using the luminance vertical projection method to obtain the corresponding projection sequence; calculates the difference between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI; and, according to the vertical boundary matching strategy, searches the extreme points of the curve for a qualifying combination of left and right head boundaries, whose x-axis coordinates define the position of the head region, wherein the x-axis is the horizontal direction of the RoI.

In addition, the present invention provides a Regions of Interest (RoIs) filtering device for on-board thermal imaging pedestrian detection, the device comprising: an abnormal-size RoI filter, which calculates the pedestrian pixel height and the RoI aspect ratio, sets corresponding threshold intervals, and filters out RoIs of abnormal size; an abnormal-position RoI filter, which calculates, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculates a threshold based on the pixel height of the RoI, and filters out RoIs with abnormal positions; and a missing-head RoI filter, which searches for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, compares the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filters out RoIs that lack a pedestrian head.

In addition, the present invention provides a method for on-board thermal imaging pedestrian detection, the method comprising: extracting RoIs to be detected; filtering the RoIs, wherein the RoI filtering comprises the steps of: calculating the pedestrian pixel height and the RoI aspect ratio, setting corresponding threshold intervals, and filtering out RoIs of abnormal size; calculating, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filtering out RoIs that lack a pedestrian head; performing offline training of a classifier; and classifying the filtered RoIs using the trained classifier.

The invention provides a RoI filtering method for on-board thermal imaging pedestrian detection. Addressing the adverse effects of the computational bottleneck, it has the following advantages and effects compared with existing RoI filtering techniques for on-board thermal imaging pedestrian detection:

The invention proposes a RoI filtering method. By constructing a three-layer cascade filter that follows pedestrian characteristics and has low computational overhead, RoIs with abnormal size, abnormal position or a missing pedestrian head are filtered out first. This suppresses a large number of non-pedestrian RoIs, ensures that the remaining RoIs can meet real-time requirements when a higher-precision classifier is applied, and reduces the system false alarm rate.

DRAWINGS

The above and other aspects, features and advantages of the specific embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow chart showing a RoIs filtering method according to an embodiment of the present invention.

(a) of FIG. 2 shows the manually measured result for the pedestrian pixel height threshold interval, (b) of FIG. 2 shows the statistical result for the Ground-Truth aspect ratio interval, and (c) of FIG. 2 shows some sample results of the head adaptive positioning algorithm.

FIG. 3 is a block diagram showing a RoIs filtering device according to an embodiment of the present invention.

FIG. 4 is a flow chart showing a classifier training method in accordance with an embodiment of the present invention.

(a) of FIG. 5 shows an example of Y channel preprocessing of a YUV 4:2:2 format image, (b) of FIG. 5 shows a comparison of an original positive sample and an extended positive sample, and (c) of FIG. 5 shows some negative samples of vehicle interference heat sources.

FIG. 6 is a block diagram showing a classifier training apparatus according to an embodiment of the present invention.

FIG. 7 is a flow chart showing a pedestrian detection method according to an embodiment of the present invention.

FIG. 8 is a block diagram showing a pedestrian detecting apparatus according to an embodiment of the present invention.

Detailed description

The following description with reference to the accompanying drawings is provided to assist in understanding the embodiments of the present disclosure. It includes various specific details to assist understanding, but these are to be considered merely exemplary. Accordingly, it will be appreciated by those skilled in the art that various changes and modifications of the various embodiments described herein can be made without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.

The terms and words used in the following description and claims are not limited to their literal meanings, but are merely used by the inventor to enable a clear and consistent understanding of the present disclosure. The following description of the various embodiments of the present invention is therefore intended to be illustrative only and not to limit the present invention.

In the pedestrian detection process, the extraction stage obtains the bounding box of each RoI in which a target may be present and records, for each RoI, the x-axis coordinate RoI_x and the y-axis coordinate RoI_y of its upper-left corner, its width RoI_w and its height RoI_h. To meet high recall requirements, a large number of RoIs are usually obtained. If they are passed directly to the subsequent classifier, a hardware platform with a computational bottleneck (such as an in-vehicle embedded platform) can hardly meet real-time requirements. Manual observation shows that pedestrian targets are rare in the image and that most of the extracted RoIs are non-pedestrian RoIs, a considerable number of which are obviously non-pedestrian RoIs.

A pedestrian RoI is a RoI bounding box whose Intersection over Union (IoU) with a pedestrian Ground-Truth bounding box exceeds 50%, and a non-pedestrian RoI is a RoI bounding box whose IoU with the pedestrian Ground-Truth bounding boxes is less than 50%. An obviously non-pedestrian RoI is a RoI whose IoU with the pedestrian Ground-Truth bounding boxes is less than 30%; such RoIs are easily told apart by human vision and can be rejected by setting some simple filtering conditions. Here, a pedestrian Ground-Truth bounding box is the ground-truth bounding box annotation of a target whose type is single walking pedestrian or single cycling pedestrian.
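The IoU-based definitions above can be illustrated with a short sketch. The `Box` type and the function names are hypothetical and chosen only for this example; the 50% and 30% thresholds are the ones stated in the text, and treating an IoU of exactly 50% as non-pedestrian is an assumption, since the text does not cover that boundary case.

```python
from typing import Iterable, NamedTuple


class Box(NamedTuple):
    """Bounding box given by its upper-left corner, width and height (pixels)."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def label_roi(roi: Box, ground_truths: Iterable[Box]) -> str:
    """Label a RoI by its best IoU against the pedestrian Ground-Truth boxes."""
    best = max((iou(roi, gt) for gt in ground_truths), default=0.0)
    if best > 0.5:
        return "pedestrian"
    if best < 0.3:
        return "obviously non-pedestrian"
    # Assumption: the remaining band, including exactly 50%, is non-pedestrian.
    return "non-pedestrian"
```

A RoI with no overlap at all against any Ground-Truth box is labeled obviously non-pedestrian, which is the class the three filtering layers are meant to reject cheaply.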

Therefore, the main idea of the RoI filtering method of the embodiment of the present invention is to construct a three-layer cascade filter that follows pedestrian characteristics and preferentially filters out RoIs with abnormal size, abnormal position or a missing pedestrian head, thereby reducing the number of RoIs to be detected. The detailed flow chart is shown in FIG. 1.
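The cascade structure can be sketched as follows. This is only an illustration of the general idea of layered rejection, not the implementation of the embodiment: `cascade_filter` and the toy layers are hypothetical names, and each layer is assumed to be a function that returns True when a RoI passes.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def cascade_filter(rois: Iterable[T],
                   layers: Sequence[Callable[[T], bool]]) -> List[T]:
    """Keep only the RoIs that pass every layer, testing layers in order.

    all() stops at the first layer that rejects a RoI, so later (more
    expensive) layers never run on RoIs rejected by an earlier layer.
    """
    return [roi for roi in rois if all(layer(roi) for layer in layers)]


# Toy usage with (width, height) tuples standing in for RoIs.
def tall_enough(roi):
    return roi[1] >= 30


def plausible_ratio(roi):
    return 1.5 <= roi[1] / roi[0] <= 4


kept = cascade_filter([(42, 138), (100, 20), (60, 60)],
                      [tall_enough, plausible_ratio])
```

Ordering the layers from cheapest to most expensive is what keeps the average cost per RoI low, because most rejected RoIs are discarded by the first inexpensive test.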

In step 110, the RoIs of abnormal size are filtered out. Specifically, the RoIs of abnormal size are filtered out by calculating the pedestrian pixel height and the RoI aspect ratio and setting the corresponding threshold intervals. In more detail, this includes:

Step 111: Calculate a threshold interval of the pixel height of the pedestrian RoIs according to the image focal length and the pedestrian detection distance.

Specifically, based on practical experience, the pedestrian detection range is about 20 to 85 meters in front of the car. As shown in formula (1), the pixel height of a pedestrian target is calculated from the image focal length f, the pedestrian height height_target and the detection distance, and the pixel height threshold interval for this range is [30, 140].

$$\mathit{height}_{\mathit{pixel}} \approx \mathit{height}_{\mathit{target}} \times f / \mathit{distance} \qquad (1)$$

where height_pixel is the pixel height of the pedestrian target in the image; height_target is the height of the pedestrian target, set to about 1.7 meters in the experiments; f is the image focal length, which is 1554 in the SCUT Dataset; and distance is the detection distance.
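As a worked check of formula (1), the following sketch evaluates it at the two ends of the detection range. The function and parameter names are illustrative, and the focal length is assumed to be expressed in pixels, which the text does not state explicitly.

```python
def pixel_height(height_target_m: float, focal_length_px: float,
                 distance_m: float) -> float:
    """Formula (1): approximate pixel height of a target at a given distance."""
    return height_target_m * focal_length_px / distance_m


# Values from the text: 1.7 m pedestrian, f = 1554, range 20 m to 85 m.
near = pixel_height(1.7, 1554, 20)  # about 132.1 pixels
far = pixel_height(1.7, 1554, 85)   # about 31.1 pixels
```

Both values fall inside the stated threshold interval [30, 140], which is slightly wider than the computed range and so leaves some margin for pedestrians taller or shorter than 1.7 meters.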

(a) of FIG. 2 shows the manually measured result for the pedestrian pixel height threshold interval. Specifically, (a) of FIG. 2 shows images, taken by an infrared camera mounted on a stationary car, of a pedestrian target at distances of 20 meters and 85 meters on a flat road surface. The pedestrian bounding box in each of the two images was measured manually (the dotted line is the drawn pedestrian bounding box): at 20 meters the pedestrian is 138 pixels high and 42 pixels wide, and at 85 meters the pedestrian is 30 pixels high and 12 pixels wide. These values differ little from those calculated with formula (1), which shows that the calculation of formula (1) is valid.

Step 112: According to the statistical analysis method, obtain the Gaussian distribution of the pedestrian RoIs aspect ratio, and select an appropriate confidence level to obtain an aspect ratio threshold interval.

In published RoI extraction methods, such as those of the prior art above, the RoIs obtained from foreground regions vary greatly in aspect ratio (height to width), and the aspect ratios of many non-pedestrian RoIs differ greatly from actual human proportions. Based on this characteristic, the Gaussian distribution of the aspect ratio of pedestrian RoIs is obtained by statistical analysis, and an appropriate confidence level is selected to obtain the aspect ratio threshold interval [1.5, 4]. The statistical samples are taken from the pedestrian Ground-Truth information of the SCUT Dataset, with target annotation types "single walking pedestrian" and "single cycling pedestrian".

(b) of FIG. 2 shows the Ground-Truth aspect ratio result for 44 videos. Specifically, the aspect ratios of the Ground-Truth samples in 44 videos whose target types are "single walking pedestrian" and "single cycling pedestrian" are calculated and plotted as a histogram; that is, the Gaussian distribution of the pedestrian RoI aspect ratio is obtained by statistical analysis. The horizontal axis is the aspect ratio and the vertical axis is the number of samples. It can be seen that the aspect ratios of the samples lie approximately between 1 and 4. In this technical solution, an appropriate confidence level is selected to determine the aspect ratio threshold interval [1.5, 4].
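One common way to turn a fitted Gaussian and a confidence level into a threshold interval is to take the central interval of the distribution. The sketch below shows that approach under stated assumptions: the text does not say how its interval [1.5, 4] was derived, the function name is hypothetical, and the sample ratios are made-up illustrative values, not data from the SCUT Dataset.

```python
from statistics import NormalDist
from typing import Sequence, Tuple


def aspect_ratio_interval(ratios: Sequence[float],
                          confidence: float = 0.95) -> Tuple[float, float]:
    """Central interval of a Gaussian fitted to height/width ratios."""
    dist = NormalDist.from_samples(ratios)  # sample mean and standard deviation
    tail = (1.0 - confidence) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Illustrative values only (assumption), not measurements from the data set.
sample_ratios = [2.1, 2.5, 2.8, 3.0, 3.3, 2.6, 2.9, 3.4, 2.4, 2.7]
low, high = aspect_ratio_interval(sample_ratios)
```

Raising the confidence level widens the interval, which keeps more true pedestrians at the price of letting more non-pedestrian RoIs through to the later layers.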

Step 113: Evaluate each RoI to be detected; RoIs that do not satisfy both interval conditions are RoIs of abnormal size and are removed.
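Steps 111 to 113 amount to two interval checks per RoI, which can be sketched as follows. The intervals are the ones stated in the text; the function name is hypothetical, and treating both intervals as inclusive and the aspect ratio as height divided by width are assumptions (the second is consistent with the 138×42 and 30×12 pixel measurements above).

```python
HEIGHT_RANGE = (30, 140)    # pixel height interval from formula (1)
ASPECT_RANGE = (1.5, 4.0)   # height/width interval from the Gaussian statistics


def has_normal_size(roi_w: float, roi_h: float) -> bool:
    """First cascade layer: True if the RoI passes both size conditions."""
    if roi_w <= 0:
        return False
    aspect = roi_h / roi_w
    return (HEIGHT_RANGE[0] <= roi_h <= HEIGHT_RANGE[1]
            and ASPECT_RANGE[0] <= aspect <= ASPECT_RANGE[1])
```

For example, the two manually measured pedestrians, 42×138 pixels at 20 meters and 12×30 pixels at 85 meters, both pass, while a wide 100×50 box is rejected by the aspect ratio condition.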

Pedestrian targets in traffic scenes are subject to a strong positional constraint: whether walking or cycling, most pedestrians are on the road surface, so the centers of pedestrian targets are distributed along a horizontal band in the image. Based on this observation, RoIs with abnormal positions in the image are very likely to be obviously non-pedestrian RoIs.

In step 120, the RoIs with abnormal positions are filtered out. Specifically, for each RoI in turn, the vertical distance between its upper and lower boundaries and the road surface reference of the current image is calculated, a threshold based on the pixel height of the RoI is calculated, and the RoIs with abnormal positions are filtered out. In more detail, this includes:

Step 121: Acquire a current image road surface reference using a horizontal road surface hypothesis method.

Specifically, based on the photographing angle of the thermal imager, the horizontal road surface hypothesis method is used to obtain the y-axis coordinate Horizon_y of the road surface reference of the current image.

Step 122: For each RoI in turn, calculate the distance between its upper and lower boundaries and the road surface reference in the y-axis direction of the image, and set a threshold based on the pixel height of the current RoI.

For each RoI to be judged, the distance between its upper and lower boundaries and the road surface reference in the y-axis direction of the image is calculated, and the adaptive threshold based on the pixel height RoI_h of the current RoI is calculated according to formula (2).

where α and β are scaling factors and ε is an offset noise factor; the experimental settings are α=4, β=2 and ε=25;

Step 123: Filter out the RoIs to be detected whose spacing result does not meet the threshold.

The operation of step 122 is repeated one by one for the RoIs to be detected that satisfy the size requirement, and all RoIs with abnormal positions are filtered out.
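The structure of the position check can be sketched as below. Formula (2) is not reproduced in this text, so the sketch does not guess it: the threshold rule is passed in as a caller-supplied function of RoI_h. Everything else here is an assumption made for illustration, including the function names, the use of absolute distances, separate limits for the upper and lower boundaries, and reading "meets the threshold" as "does not exceed it".

```python
from typing import Callable, Tuple


def has_normal_position(roi_y: float, roi_h: float, horizon_y: float,
                        threshold_fn: Callable[[float], Tuple[float, float]]
                        ) -> bool:
    """Second cascade layer: compare boundary-to-reference gaps with limits.

    threshold_fn stands in for formula (2) and returns the limits for the
    upper and lower boundary gaps, given the RoI pixel height.
    """
    top_gap = abs(roi_y - horizon_y)             # upper boundary to reference
    bottom_gap = abs(roi_y + roi_h - horizon_y)  # lower boundary to reference
    top_limit, bottom_limit = threshold_fn(roi_h)
    return top_gap <= top_limit and bottom_gap <= bottom_limit


def demo_limits(roi_h: float) -> Tuple[float, float]:
    """Placeholder rule for the demonstration only; this is NOT formula (2)."""
    return roi_h, roi_h
```

With the placeholder rule and a reference at y = 230, a 60-pixel RoI whose top is at y = 200 straddles the reference and passes, while one whose top is at y = 10 lies far above it and is rejected.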

In step 130, the RoIs that lack a pedestrian head are filtered out. Specifically, a possible pedestrian head region is searched for according to the luminance vertical projection difference curve of each RoI, the degree of difference between the Haar-like features of the head region and those of the adjacent background region is compared, and the RoIs that lack a pedestrian head are filtered out.

The non-pedestrian RoIs obtained in extraction generally contain background interference heat sources from the traffic scene, such as roadside tree branches and uniform heat sources. Observation shows that the human head is rarely occluded by other objects and is usually exposed, so it often appears brighter than the adjacent background in the thermal image and has a relatively stable contour. Based on this, in more detail, filtering out the RoIs that lack a pedestrian head includes:

Step 131: Use the pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts in the horizontal direction; the middle part is named the head region and the left and right parts are named the background regions. The upper region of a RoI is the part extending along the y-axis from the upper boundary of the RoI down to 1/3 or 1/5 of its pixel height. The pedestrian head adaptive positioning algorithm processes the upper region of the current RoI using the luminance vertical projection method to obtain the corresponding projection sequence, and calculates the difference between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI. Then, according to the proposed vertical boundary matching strategy, the extreme points of the curve are searched for a qualifying combination of left and right head boundaries, whose x-axis coordinates define the position of the head region.

In more detail, the pedestrian head adaptive positioning algorithm is as follows:

(1) For the RoI being processed, the region extending along the y-axis from the upper boundary of the RoI to the position (RoI_y + α×RoI_h) is the upper region P_up of the RoI, and its height is denoted H; if RoI_h < 48, α = 1/3 is set, otherwise α = 1/5;

(2) Branch on the pixel height RoI_h of the current RoI: if RoI_h < 90, go to step (3); if RoI_h ≥ 90, go to step (8);

(3) Taking the upper-left corner of the RoI (RoI_x, RoI_y) as the coordinate origin, calculate the vertical projection sequence of P_up, V_N = {V(x), x = 0, 1, ..., RoI_w - 1}, according to formula (3), and calculate the luminance vertical projection difference curve V'_N = {V'(x), x = 0, 1, ..., RoI_w - 2} according to formula (4), where Y(x, y) is the luminance value at pixel (x, y);

(4) Owing to image noise and background heat sources, the projection difference curve V'_N may contain interference extreme points of small magnitude. The threshold T_diff is calculated according to formula (5), and the interference extreme points of V'_N are then filtered out according to formula (6) to obtain a new projection difference curve V'_T, where abs() is the absolute value function and α is a scaling factor, experimentally set to α = 0.5;

(5) Traverse the extreme points of the projection difference curve V'_T from left to right, and record the x-axis positions (X_edge_l, X_edge_r) of the left and right boundary pairs that satisfy the following rules:

A head boundary can only correspond to an extreme point of V'_T. By default, the head region is brighter than the background region, so the left boundary of the head corresponds to a positive extreme point of V'_T and the right boundary of the head corresponds to a negative extreme point of V'_T;

If a new possible left boundary is found, its corresponding right boundary is initially empty;

If a new right boundary is found while its corresponding left boundary is empty, this right boundary is background interference, because the left-to-right traversal always finds the left boundary of the head first;

If a pair of left and right boundaries (X_edge_l, X_edge_r) is matched, the corresponding head width W_head = X_edge_r - X_edge_l is calculated, and its plausibility is judged against the minimum head width threshold Min_head and the maximum threshold Max_head (experimental settings Min_head = RoI_w/8, Max_head = RoI_w/2): if Min_head ≤ W_head ≤ Max_head, the boundary pair is valid, so it is saved and the search continues for other right boundaries that may match the current X_edge_l; if W_head < Min_head, the current right boundary X_edge_r is invalid; if W_head > Max_head, the current left and right boundaries are both invalid, and matching the left boundary X_edge_l with any later right boundary is meaningless;

(6) If there are multiple qualifying boundary pairs, X_edge_N = {(X_edge_l1, X_edge_r1), (X_edge_l2, X_edge_r2), ..., (X_edge_ln, X_edge_rn)}, traverse them to find the optimal one: tentatively take (X_edge_l1, X_edge_r1) as the optimal pair and examine the next pair; if it has the same left boundary as the current optimal pair, compare the positions of the two right boundaries and update the optimal pair to the one with the larger value; if the left boundaries differ, calculate the vertical centerline position (along the x-axis) of each of the two pairs, compare their distances to the vertical centerline of the current RoI, and update the optimal pair to the one closer to the vertical centerline of the RoI (because the pedestrian head is more likely to lie at the center of the upper region P_up);

(7) If an optimal boundary pair (X_edge_l, X_edge_r) is found, calculate the distances between the pair and the left and right boundaries of the current RoI, and set the spacing threshold T_s = 0.2×RoI_w + 0.5; if either distance is less than T_s, the corresponding head region is too close to the left or right boundary of the RoI, which does not match the actual human body, and the boundary pair is invalid;

(8) If there is no qualifying boundary pair (X_edge_l, X_edge_r), the upper region P_up of the RoI is divided into three parts in the horizontal direction, and the resulting positions are taken as the left and right boundary pair (X_edge_l, X_edge_r).
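The projection steps of the algorithm, (1), (3) and (4), can be sketched as follows. Formulas (3) to (6) are not reproduced in this text, so their forms here are assumptions inferred from the prose: V(x) is taken as the column sum of luminance over the upper region, V'(x) as V(x+1) - V(x), T_diff as the scaling factor times the largest absolute difference, and filtering as setting sub-threshold entries to zero. The function names and the integer rounding of the region height are also illustrative.

```python
from typing import List, Sequence


def upper_region_height(roi_h: int) -> int:
    """Step (1): height H of the upper region (1/3 of RoI_h below 48, else 1/5)."""
    return max(1, roi_h // 3 if roi_h < 48 else roi_h // 5)


def projection_difference(upper: Sequence[Sequence[float]],
                          scale: float = 0.5) -> List[float]:
    """Steps (3)-(4): thresholded luminance vertical projection difference curve.

    `upper` holds the luminance rows of the upper region, indexed [y][x].
    """
    width = len(upper[0])
    # Assumed form of formula (3): sum each column over the rows.
    v = [sum(row[x] for row in upper) for x in range(width)]
    # Assumed form of formula (4): difference of adjacent projection values.
    dv = [v[x + 1] - v[x] for x in range(width - 1)]
    # Assumed forms of formulas (5)-(6): suppress small interference extremes.
    t_diff = scale * max((abs(d) for d in dv), default=0.0)
    return [d if abs(d) >= t_diff else 0.0 for d in dv]


# Toy upper region: a bright block (columns 2-4) on a dark background.
toy_upper = [[10, 10, 50, 50, 50, 12, 10],
             [10, 11, 52, 50, 49, 10, 10]]
curve = projection_difference(toy_upper)
```

On the toy region the surviving entries are a positive value where the bright block starts and a negative value where it ends, which is the sign convention the boundary rules in step (5) rely on.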

For the current RoI, the above pedestrian head adaptive positioning algorithm is used to obtain the left and right boundary pair (X_edge_l, X_edge_r) of the upper region P_up, and P_up is divided into three parts P_l, P_m and P_r in the horizontal direction.

Step 132: Evaluate the degree of difference in luminance mean of the head region and the background region using a Haar-like feature based method and compare it with a preset threshold.

Calculate the Haar-like feature value of P_up according to formula (7) and compare it with the threshold T_haar. If it is greater than the threshold, the head constraint condition is satisfied.

min(abs(avg_m - avg_l), abs(avg_m - avg_r))    Formula (7)

where min() is the minimum function, abs() is the absolute value function, and avg_l, avg_m, avg_r are the mean brightness values of P_l, P_m, P_r, respectively. The value of T_haar is set experimentally, in the range of 13 to 15.
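As an illustration of formula (7), the sketch below computes the feature value for an upper region stored as a list of rows of luminance values. The function names and the default `t_haar=14` (one value inside the stated 13 to 15 range) are assumptions made for this example, and it assumes 0 < x_l < x_r < width so that none of the three parts is empty.

```python
def mean_brightness(region):
    """Mean of all luminance values in a region given as a list of rows."""
    values = [v for row in region for v in row]
    return sum(values) / len(values)


def head_feature(p_up, x_l, x_r):
    """Formula (7): min(|avg_m - avg_l|, |avg_m - avg_r|).

    p_up: upper region of the RoI as rows of luminance values.
    x_l, x_r: column indices of the head's left and right boundaries.
    """
    p_l = [row[:x_l] for row in p_up]      # background left of the head
    p_m = [row[x_l:x_r] for row in p_up]   # candidate head region
    p_r = [row[x_r:] for row in p_up]      # background right of the head
    avg_l, avg_m, avg_r = (mean_brightness(p) for p in (p_l, p_m, p_r))
    return min(abs(avg_m - avg_l), abs(avg_m - avg_r))


def has_head(p_up, x_l, x_r, t_haar=14):
    """Head constraint: the feature value must exceed the threshold T_haar."""
    return head_feature(p_up, x_l, x_r) > t_haar
```

Taking the minimum of the two differences means the middle part must stand out from both neighbours, so a bright object touching one side of the RoI does not pass on its own.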

Step 133: Filter out the RoIs that lack a pedestrian head.

Steps 131 and 132 are performed one by one on the RoIs to be detected that satisfy the position feature requirement, and the RoIs lacking a pedestrian head are filtered out.

The above RoIs filtering method was tested on the DM6437 vehicle embedded platform, which has a computational bottleneck. The RoIs are extracted by the double-threshold segmentation method of prior art 1, and the average number of RoIs obtained in a single image is about 100. The RoIs filtering method described above reduces the number of RoIs by about half, with an average processing time within a few milliseconds. To test the head adaptive positioning algorithm, a total of 14,000 samples were extracted from the pedestrian Ground-Truth bounding boxes of the SCUT Dataset (target types: single walking pedestrian and single cycling pedestrian; occlusion label: unoccluded). Manual counting shows that the positioning of the head's left and right boundaries fails for only 1,162 samples, an accuracy of about 92%, which indicates that the proposed head positioning algorithm has high precision. Some example results are shown in (c) of FIG. 2, where the two white vertical lines added to each image correspond to the left and right boundary pair (X_edge_l, X_edge_r) of the pedestrian head obtained by the algorithm.

FIG. 3 is a block diagram showing a RoIs filtering device according to an embodiment of the present invention. The RoIs filter device 300 includes a size anomaly RoIs filter 310, a position abnormality RoIs filter 320, and a missing head RoIs filter 330.

The size anomaly RoIs filter 310 filters out RoIs of abnormal size. Specifically, the threshold interval of the pixel height of pedestrian RoIs is calculated from the image focal length and the pedestrian detection distance; the Gaussian distribution of the aspect ratio of pedestrian RoIs is obtained by statistical analysis, and an appropriate confidence level is selected to obtain the aspect ratio threshold interval; each RoI to be detected is then evaluated, and RoIs that do not satisfy both interval conditions are filtered out.

The position abnormality RoIs filter 320 filters out RoIs with abnormal positions. Specifically, the horizontal road surface hypothesis is used to obtain the road surface reference of the current image; for each RoI in turn, the spacing along the image y-axis between its upper and lower boundaries and the road surface reference is calculated, and a threshold based on the pixel height of the current RoI is set; the RoIs to be detected that do not satisfy the threshold are then filtered out.

The missing head RoIs filter 330 filters out the RoIs that lack a pedestrian head. Specifically, the pedestrian head adaptive positioning algorithm is applied to the current RoI to obtain the left and right boundary pair (X_edge_l, X_edge_r) of the upper region P_up, and P_up is divided into three parts P_l, P_m, P_r in the horizontal direction; the Haar-like feature value of P_up is calculated according to formula (7) above and compared with the threshold T_haar, and if it is greater than the threshold, the head constraint condition is satisfied. These operations are performed one by one on the RoIs to be detected that satisfy the position feature requirement, and the RoIs lacking a pedestrian head are filtered out.
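The two interval checks of the size filter might look like the sketch below. It reuses the pixel height interval [30, 140] stated elsewhere in this description; the aspect ratio mean and standard deviation in the usage line are placeholders rather than values from the source, the aspect ratio is assumed to be width divided by height, and z = 1.96 corresponds to a two-sided 95% confidence level of a Gaussian.

```python
def ratio_interval(mu, sigma, z=1.96):
    """Aspect ratio threshold interval from a fitted Gaussian (mu, sigma).

    z = 1.96 keeps about 95% of the distribution; another confidence
    level would use a different z value.
    """
    return (mu - z * sigma, mu + z * sigma)


def passes_size_filter(roi_w, roi_h, height_iv, ratio_iv):
    """Keep a RoI only if both its pixel height and aspect ratio are in range."""
    ratio = roi_w / roi_h  # assumption: aspect ratio = width / height
    height_ok = height_iv[0] <= roi_h <= height_iv[1]
    ratio_ok = ratio_iv[0] <= ratio <= ratio_iv[1]
    return height_ok and ratio_ok
```

With the placeholder statistics `ratio_interval(0.4, 0.1)`, a 20×60 RoI passes, while a 20×20 RoI fails the height check and a 100×60 RoI fails the aspect ratio check.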

In step 410, enhanced positive samples and enhanced negative samples are generated. Specifically, positive sample labeling information is combined with an equalization technique to generate the enhanced positive samples, and a clustering method is used to analyze the information distribution of non-pedestrian background image blocks and to assist in screening enhanced negative samples of different categories.

Since pedestrian targets are rare in traffic scenes, the number of positive samples available from published thermal imaging data sets is usually limited, and an image enhancement method is needed to generate new positive samples on this basis. Negative samples come from the non-pedestrian areas of the entire image and are not lacking in quantity; however, the traditional approach obtains negative samples with a grid-based random method, which often differs from the RoIs extraction method used in the actual detection process. As a result, the background information distributions of the two differ greatly, that is, the negative samples are poorly representative of the actual non-pedestrian RoIs.

The enhanced positive samples include the original positive samples and the extended positive samples. Generating the enhanced positive samples includes: taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, extracting the corresponding image block information according to the labeled pedestrian Ground-Truth bounding boxes and preset criteria to obtain the original positive samples; and processing the luminance information of the original positive samples one by one with the plateau histogram equalization method to obtain the extended positive samples. That is, the equalization method is used to enhance the contrast of the luminance information of the original positive samples, generating extended positive samples with similar thermal imaging characteristics, so as to constitute a sufficient number of enhanced positive samples. (b) of FIG. 5 shows a comparison of an original positive sample and an extended positive sample.

In more detail, the specific steps to generate an enhanced positive sample are as follows:

1 Taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, use the Caltech tool (http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) to extract the image block information corresponding to the pedestrian Ground-Truth bounding boxes, recorded as the temporary positive sample set Pos_temp;

2 From Pos_temp, select the original positive sample set Pos_p according to preset criteria. The specific criteria are: the target type is “single walking pedestrian” or “single cycling pedestrian”, the occlusion label is unoccluded, the frame sampling interval is 5, and the pixel height is within [30, 140]. The number of samples in Pos_p is recorded as PosNum_p;

3 For Pos_p, process the luminance information sample by sample with the plateau histogram equalization method to obtain the corresponding new sample image block information, and manually exclude the examples that are overexposed or have lost contours. The retained samples form the extended positive sample set Pos_e, the number of which is recorded as PosNum_e;

4 The Pos_p and Pos_e sample sets together constitute the enhanced positive sample set Pos of the classifier, as shown in equation (8), where PosNum_e ≤ PosNum_p.
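The description does not spell out the equalization step used in step 3. The sketch below shows one common formulation of plateau histogram equalization, in which histogram counts are clipped at a plateau value before the cumulative mapping is built. The function name, the flat-list image layout and the plateau value used in the example are assumptions for illustration only.

```python
def plateau_equalize(pixels, plateau, levels=256):
    """Plateau histogram equalization of a flat list of integer gray levels.

    pixels: non-empty list of integers in [0, levels - 1].
    plateau: upper limit applied to every histogram bin before equalizing,
             which keeps large uniform backgrounds from dominating.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Clip each bin at the plateau value.
    clipped = [min(count, plateau) for count in hist]

    # Cumulative sum of the clipped histogram.
    cdf = []
    total = 0
    for count in clipped:
        total += count
        cdf.append(total)

    # Map each gray level through the normalized cumulative histogram.
    return [(levels - 1) * cdf[v] // total for v in pixels]
```

For example, `plateau_equalize([0, 0, 0, 0, 1, 2], plateau=2)` returns `[127, 127, 127, 127, 191, 255]`: the dominant level 0 is clipped from four counts to two, so the two rarer levels receive a larger share of the output range than plain histogram equalization would give them.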

Generating the enhanced negative samples includes: extracting the original negative samples from the complete images of the data set using the RoIs extraction method of the detection process, and using K-means clustering and uniform random selection to ensure that the enhanced negative samples obtained by screening cover more representative background information in appropriate proportions.

Specifically, image block information is extracted from the complete images of the pedestrian detection data set SCUT Dataset using the RoIs extraction method of the detection process; examples whose IOU with a pedestrian Ground-Truth bounding box is higher than 30%, and examples judged to be abnormal in size (for example, by the above-mentioned RoIs filtering method), are removed, and the retained image blocks are recorded as source negative samples. The source negative samples are clustered with the K-means method, and image blocks are selected uniformly at random from the clustering results in the calculated proportions to form the enhanced negative samples. Further, negative samples containing vehicle interference heat sources are added according to the clustering results, increasing the proportion of such background information in the enhanced negative samples.

In more detail, the specific steps to generate an enhanced negative sample are as follows:

1 Taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, extract the RoIs in all complete images of the data set using the RoIs extraction method of the detection process;

2 Judge the obtained RoIs one by one, and exclude the examples whose IOU with a pedestrian Ground-Truth bounding box is higher than 30% and the examples judged to be abnormal in size (for example, by the RoIs filtering method);

3 Extract the corresponding image block information for the RoIs that satisfy the preset requirements to form the source negative sample set Neg_temp, the number of which is recorded as NegNum_temp;

4 Cluster Neg_temp into n classes with the K-means clustering method (for example, n = 100 in the experiments). Denote the number of enhanced positive samples Pos as PosNum and the number of enhanced negative samples Neg as NegNum, and set NegNum = PosNum × 4. On this basis, image block information is selected uniformly at random from the clustering results in proportion, specifically: if the i-th class currently contains Num_i samples, (Num_i × NegNum / NegNum_temp) negative samples are selected from it by the uniform random method;

5 Perform the operation of step 4 on the clustering results one by one until the number NegNum is reached, forming the enhanced negative sample set Neg;

6 From the n classes produced by K-means clustering, manually select the classes that contain negative samples of vehicle interference heat sources, randomly select negative samples from them and add them to Neg, increasing the proportion of such background information in Neg.
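Two parts of this procedure lend themselves to a short sketch: the IOU test used to exclude RoIs that overlap a pedestrian, and the proportional uniform random selection from each cluster. The code below assumes boxes given as (x1, y1, x2, y2) and takes the K-means class assignment as an input rather than computing it; all function names are hypothetical, and rounding the per-class quota to the nearest integer is a choice made for this example.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def keep_as_negative(roi, gt_boxes, max_iou=0.3):
    """A RoI stays a negative candidate only if no pedestrian box overlaps it by more than 30%."""
    return all(iou(roi, gt) <= max_iou for gt in gt_boxes)


def sample_negatives(clusters, neg_num, rng):
    """Draw about Num_i * NegNum / NegNum_temp samples from each class.

    clusters: one list of samples per K-means class.
    neg_num: target size NegNum of the enhanced negative set.
    rng: a random.Random instance, passed in so runs are repeatable.
    """
    neg_num_temp = sum(len(members) for members in clusters)
    selected = []
    for members in clusters:
        quota = round(len(members) * neg_num / neg_num_temp)
        quota = min(quota, len(members))  # never request more than the class holds
        selected.extend(rng.sample(members, quota))
    return selected
```

Sampling in proportion to class size preserves the background distribution found by clustering; step 6 then deliberately raises the share of the vehicle heat source classes.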

Next, in step 420, the generated enhanced positive samples and enhanced negative samples are pre-processed. The positive and negative samples are preprocessed by adjusting the brightness and boundary information. Preprocessing the generated positive and negative samples can improve the quality of the sample data and ultimately improve the performance of the classifier.

The preprocessing operations applied to the enhanced positive and negative samples in the present invention include: pixel Y channel extraction, boundary scaling adjustment, and gamma correction. Specifically, the pixel Y channel extraction method converts the enhanced positive and negative samples into a single-channel image format with low computational overhead; the boundary scaling strategy adjusts the boundary coordinate data of the positive and negative samples, reducing the information difference between the training samples and the RoIs actually extracted; further, the gamma correction method processes the enhanced positive and negative samples, improving the dynamic range and stretching the contrast of the sample Y channel information.

These operations have the following advantages: (1) Taking the YUV 4:2:2 format as an example of the image input by the camera, each point (x, y) contains two channels, "Y, U" or "Y, V". Compared with the U and V channels, which represent chromaticity, the Y channel, which represents luminance, carries the complete thermal imaging information; therefore, the pixel Y channel extraction method is used to convert the enhanced positive and negative samples into a single-channel image format with low computational overhead. (2) With the RoIs extraction method, the RoIs obtained from the foreground area usually have boundaries that fit the pedestrian contour tightly or with very small spacing, whereas the pedestrian Ground-Truth bounding boxes of most data sets include background information with a certain spacing near the boundary. This increases the information difference between the training samples and the RoIs extracted in actual detection; therefore, boundary scaling adjustment must be performed on the enhanced positive and negative samples to reduce this difference. (3) Processing the enhanced positive and negative samples with the gamma correction method improves the dynamic range and stretches the contrast of the sample Y channel information.

In more detail, the specific steps for pre-processing enhanced positive and negative samples are as follows:

1 For the current sample image block, extract the corresponding Y channel information point by point according to the arrangement format of the pixel channel information; then arrange the Y channel information in order into new sample data according to the position (x, y) of each point. (a) of FIG. 5 shows the Y channel preprocessing process using a YUV 4:2:2 format image as an example: above the arrow is the YUV 4:2:2 format image before processing (each pixel contains one Y channel and one U (or V) channel), and below the arrow is the processed Y channel image (each pixel contains only one Y channel);

2 Perform the operation of step 1 sample by sample on the enhanced positive samples Pos and the enhanced negative samples Neg;

3 Examine the RoIs extraction method used in the actual detection process. If its RoI boundaries fit the pedestrian contour tightly or with very small spacing, which does not match the data set, perform boundary scaling on the enhanced positive and negative samples Pos and Neg sample by sample, specifically: with respect to the center of gravity of the current sample image block, move each of the four boundaries of the image block m pixels toward the center of gravity, where the empirical value of m, obtained by experiments, is in the range of 3 to 5;

4 Process the Y channel information of the current sample image block point by point with the gamma correction method, with the gamma parameter experimentally set to γ = 0.5;

5 Perform the operation of step 4 sample by sample on the enhanced positive samples Pos and the enhanced negative samples Neg, obtaining the new enhanced positive and negative samples Pos' and Neg'.
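The three preprocessing operations can be sketched as below. The byte packing is an assumption: the code treats the YUV 4:2:2 buffer as YUYV (Y0 U Y1 V ...), so the Y value of each pixel sits at an even byte offset; a UYVY source would need the odd offsets instead. The function names are hypothetical, and the gamma mapping uses the usual form out = 255 × (in / 255)^γ.

```python
def extract_y_yuyv(raw, width, height):
    """Return the Y channel as rows of integers from a packed YUYV buffer.

    raw: bytes-like object with 2 bytes per pixel (assumed order Y0 U Y1 V).
    """
    return [[raw[2 * (y * width + x)] for x in range(width)]
            for y in range(height)]


def shrink_borders(img, m):
    """Move all four borders m pixels toward the block center (m is 3 to 5 in the text).

    The image block must be larger than 2 * m in both dimensions.
    """
    return [row[m:len(row) - m] for row in img[m:len(img) - m]]


def gamma_correct(img, gamma=0.5):
    """Point-wise gamma correction of 8-bit luminance values."""
    return [[round(255 * (v / 255) ** gamma) for v in row] for row in img]
```

With γ = 0.5 the mapping brightens dark and mid-range values (64 maps to 128) while leaving 0 and 255 fixed, which is the contrast stretch referred to above.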

In step 430, the pre-processed enhanced positive and negative samples are divided into training sets and the classifiers are trained. The sample scale division criteria for the far, middle and near distances are obtained by clustering the pre-processed enhanced positive samples. Accordingly, the pre-processed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suitable for classifying far-, middle- and near-distance pedestrian targets, respectively.

The present invention defines the pixel height threshold interval of a pedestrian target as [30, 140], corresponding to the farthest and nearest pedestrian targets in the real scene. However, the pedestrian information at these two extreme distances differs greatly, resulting in high intra-class variation in the obtained positive samples. If only one classifier is trained, the detection performance will be degraded.

Dividing the pre-processed enhanced positive and negative training set and training the classifiers includes: analyzing the pre-processed enhanced positive samples with the clustering method, with the number of classes set to k = 3, to obtain the sample scale division criteria for the far, middle and near distances based on pixel height, thereby subdividing the enhanced positive and negative samples into three independent training sets; and training three classifiers (classifier_f, classifier_m, classifier_n) suitable for classifying far-, middle- and near-distance pedestrian targets, respectively. To screen difficult negative samples, the obtained classifier is used to detect the negative samples not used for training, the false alarm cases are selected as difficult negative samples and added to the corresponding training set, and the classifier is retrained. This process is repeated until the preset number of training rounds is reached. (c) of FIG. 5 shows some negative samples of automobile interference heat sources.

In more detail, the specific steps of dividing the pre-processed enhanced positive and negative training set and training the classifier include:

1 Define the four boundaries (Range_l, Range_s, Range_m, Range_r) of the three consecutive far, middle and near distance intervals, and obtain these boundary values with the K-means clustering method. The specific operation is: the actual detection distance interval [20, 85], set experimentally, is divided into several parts at intervals of 5 meters, and the target pixel height corresponding to each part is calculated according to formula (1); the samples with the corresponding pixel heights are selected from the positive samples for cluster analysis; using the K-means clustering method with the number of classes set to k = 3, the four boundary values based on pixel height are obtained as Range_l = 30, Range_s = 48, Range_m = 90, Range_r = 140;

Height_pixel ≈ height_target × f / distance    Formula (1)

2 Let the pixel height of the current sample image block Sample be Sample_h. If Range_l ≤ Sample_h < Range_s, Sample is assigned to the far-distance training set; if Range_s ≤ Sample_h < Range_m, it is assigned to the middle-distance training set; if Range_m ≤ Sample_h ≤ Range_r, it is assigned to the near-distance training set;

3 Perform the operation of step 2 sample by sample on the enhanced positive and negative samples Pos' and Neg' to obtain three training sets;

4 Using the three independent training sets obtained, train three classifiers suitable for classifying far-, middle- and near-distance pedestrian targets, respectively. During the iterative process, to screen difficult negative samples, the obtained classifier is used to detect the source negative samples not used for training, the false alarm samples are selected as difficult negative samples and added to the corresponding training set, and the classifier is retrained, until the preset number of training rounds is reached.
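Formula (1) and the assignment rule of step 2 translate directly into code. In the sketch below the four boundary values are the ones reported in step 1; the function names, the set labels and the focal length and target height used in the example are illustrative assumptions.

```python
# Boundary values reported in the text (pixel heights).
RANGE_L, RANGE_S, RANGE_M, RANGE_R = 30, 48, 90, 140


def pixel_height(height_target, focal_px, distance):
    """Formula (1): Height_pixel ~ height_target * f / distance.

    height_target and distance share one unit (e.g. meters);
    focal_px is the focal length expressed in pixels.
    """
    return height_target * focal_px / distance


def distance_bin(sample_h):
    """Assign a sample to the far, middle or near training set by pixel height."""
    if RANGE_L <= sample_h < RANGE_S:
        return "far"
    if RANGE_S <= sample_h < RANGE_M:
        return "middle"
    if RANGE_M <= sample_h <= RANGE_R:
        return "near"
    return None  # outside [30, 140]: not used for training
```

As a purely illustrative check, a 1.7 m target seen through a hypothetical 1500-pixel focal length at 85 m gives `pixel_height(1.7, 1500, 85)` of about 30 pixels, which `distance_bin` places in the far set.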

FIG. 6 is a block diagram showing a classifier training apparatus according to an embodiment of the present invention. The classifier training device 600 includes an enhanced positive and negative sample generation module 610, an enhanced positive and negative sample preprocessing module 620, and a training set partitioning and classifier training module 630.

The enhanced positive and negative sample generation module 610 generates enhanced positive samples and enhanced negative samples. Specifically, positive sample labeling information is combined with an equalization technique to generate the enhanced positive samples, and a clustering method is used to analyze the information distribution of non-pedestrian background image blocks and to assist in screening enhanced negative samples of different categories.

The enhanced positive samples include the original positive samples and the extended positive samples. Generating the enhanced positive samples includes: taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, extracting the corresponding image block information according to the labeled pedestrian Ground-Truth bounding boxes and preset criteria to obtain the original positive samples; and processing the luminance information of the original positive samples one by one with the plateau histogram equalization method to obtain the extended positive samples. That is, the equalization method is used to enhance the contrast of the luminance information of the original positive samples, generating extended positive samples with similar thermal imaging characteristics, so as to constitute a sufficient number of enhanced positive samples.

Generating the enhanced negative samples includes: extracting image block information from the complete images of the pedestrian detection data set SCUT Dataset using the RoIs extraction method of the detection process; removing the examples whose IOU with a pedestrian Ground-Truth bounding box is higher than 30% and the examples judged to be abnormal in size (for example, by the aforementioned RoIs filtering method), and recording the retained image blocks as source negative samples; clustering the source negative samples with the K-means method, and selecting image blocks uniformly at random from the clustering results in the calculated proportions to form the enhanced negative samples; and further, adding negative samples containing vehicle interference heat sources according to the clustering results, increasing the proportion of such background information in the enhanced negative samples.

The enhanced positive and negative sample preprocessing module 620 preprocesses the enhanced positive samples and enhanced negative samples generated by the enhanced positive and negative sample generation module 610. The samples are preprocessed by adjusting their brightness and boundary information. Preprocessing the generated positive and negative samples improves the quality of the sample data and ultimately the performance of the classifier.

The training set partitioning and classifier training module 630 divides the enhanced positive and negative samples pre-processed by the enhanced positive and negative sample preprocessing module 620 into training sets and iteratively trains the classifiers. The sample scale division criteria for the far, middle and near distances are obtained by clustering the positive samples. Accordingly, the pre-processed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suitable for classifying far-, middle- and near-distance pedestrian targets, respectively.

Dividing the pre-processed enhanced positive and negative training set and training the classifiers includes: analyzing the pre-processed enhanced positive samples with the clustering method, with the number of classes set to k = 3, to obtain the sample scale division criteria for the far, middle and near distances based on pixel height, thereby subdividing the enhanced positive and negative samples into three independent training sets; and training three classifiers (classifier_f, classifier_m, classifier_n) suitable for classifying far-, middle- and near-distance pedestrian targets, respectively. To screen difficult negative samples, the obtained classifier is used to detect the negative samples not used for training, the false alarm cases are selected as difficult negative samples and added to the corresponding training set, and the classifier is retrained. This process is repeated until the preset number of training rounds is reached.

The original positive samples obtained by applying the enhanced positive sample generation method to the data set SCUT Dataset include: 26,000 positive samples in the far distance interval, 18,800 positive samples in the middle distance interval, and 9,700 positive samples in the near distance interval. After the extended positive samples are generated, the resulting enhanced positive samples satisfy the classifier's requirement for the number of positive samples.
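The iterative screening of difficult negative samples can be sketched as a generic loop. Nothing here is specific to the HOG and linear SVM classifier used later in the tests: `train_fn` is a hypothetical callable that returns a predicate (True means "pedestrian"), and stopping early when a round finds no false alarms is an assumption added for this example; the text itself only states a preset number of training rounds.

```python
def train_with_hard_negatives(train_fn, positives, negatives, reserve, rounds):
    """Retrain a classifier with false alarms found among unused negatives.

    train_fn(pos, neg): returns a callable classifier(sample) -> bool.
    reserve: source negative samples that were not used for training.
    rounds: preset maximum number of retraining rounds.
    """
    negatives = list(negatives)
    reserve = list(reserve)
    classifier = train_fn(positives, negatives)
    for _ in range(rounds):
        # Run the current classifier once on every unused negative sample.
        flagged = [(sample, classifier(sample)) for sample in reserve]
        hard = [sample for sample, is_ped in flagged if is_ped]
        if not hard:
            break  # assumption: nothing new to learn from in this round
        # False alarms become difficult negatives of the training set.
        negatives.extend(hard)
        reserve = [sample for sample, is_ped in flagged if not is_ped]
        classifier = train_fn(positives, negatives)
    return classifier, negatives
```

In the scheme described above this loop would run once per distance interval, producing classifier_f, classifier_m and classifier_n from their respective training sets.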

At step 710, the RoIs to be detected are extracted.

At step 720, the RoIs are filtered. The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; calculating, RoI by RoI, the vertical spacing between the upper and lower boundaries and the road surface reference of the current image, and calculating a threshold based on the RoI pixel height, so as to filter out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, and comparing the degree of difference between the Haar-like features of the head region and the adjacent background region, so as to filter out RoIs lacking a pedestrian head. These steps have been described in detail above and are not repeated here.

At step 730, the classifier is trained offline. The classifier training method includes: combining positive sample labeling information with an equalization technique to generate enhanced positive samples, and using a clustering method to analyze the information distribution of non-pedestrian background image blocks and assist in screening enhanced negative samples of different categories; preprocessing the enhanced positive and negative samples by adjusting their brightness and boundary information; and obtaining the sample scale division criteria for the far, middle and near distances by clustering the pre-processed enhanced positive samples, dividing the pre-processed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suitable for classifying far-, middle- and near-distance pedestrian targets, respectively. These steps have been described in detail above and are not repeated here.

At step 740, the filtered RoIs are classified and detected using the classifier that has completed the training.
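Steps 710 to 740 form a simple online pipeline once the classifiers have been trained offline. The sketch below shows only the control flow; every callable passed in (`extract_rois`, the entries of `filters` and `classifiers`, and `bin_of`) is a hypothetical stand-in for the corresponding method described in this document.

```python
def detect_pedestrians(image, extract_rois, filters, classifiers, bin_of):
    """Extract RoIs, run the filter cascade, then classify the survivors.

    extract_rois(image): returns the RoIs to be detected (step 710).
    filters: ordered predicates (size, position, head); True means keep (step 720).
    classifiers: dict mapping a distance label to a trained classifier (step 730).
    bin_of(roi): returns the distance label used to pick a classifier.
    """
    detections = []
    for roi in extract_rois(image):
        # all() stops at the first failing filter, so the cheap early
        # layers spare the later ones from doing any work on rejected RoIs.
        if not all(keep(roi) for keep in filters):
            continue
        classifier = classifiers.get(bin_of(roi))
        if classifier is not None and classifier(roi):  # step 740
            detections.append(roi)
    return detections
```

Ordering the filters from cheapest to most expensive matches the cascade idea: most non-pedestrian RoIs are discarded before the head check or the classifier runs.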

FIG. 8 is a block diagram showing a pedestrian detecting apparatus according to an embodiment of the present invention. The pedestrian detection device 800 includes a RoIs extraction module 810, a RoIs filtering module 820, a classifier training module 830, and a classification detection module 840.

The RoIs extraction module 810 extracts the RoIs to be detected.

The RoIs filtering module 820 filters the RoIs. The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; calculating, RoI by RoI, the vertical spacing between the upper and lower boundaries and the road surface reference of the current image, and calculating a threshold based on the RoI pixel height, so as to filter out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, and comparing the degree of difference between the Haar-like features of the head region and the adjacent background region, so as to filter out RoIs lacking a pedestrian head. These steps have been described in detail above and are not repeated here.

The classifier training module 830 performs offline training of the classifier. The classifier training method includes: combining positive sample labeling information with an equalization technique to generate enhanced positive samples, and using a clustering method to analyze the information distribution of non-pedestrian background image blocks and assist in screening enhanced negative samples of different categories; preprocessing the enhanced positive and negative samples by adjusting their brightness and boundary information; and obtaining the sample scale division criteria for the far, middle and near distances by clustering the pre-processed enhanced positive samples, dividing the pre-processed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suitable for classifying far-, middle- and near-distance pedestrian targets, respectively. These steps have been described in detail above and are not repeated here.

The classification detection module 840 performs classification detection on the filtered RoIs using the trained classifiers.

Addressing the adverse effects of the computational bottleneck and sample quality problems, the on-board thermal imaging pedestrian detection method provided by the invention has the following advantages and effects compared with existing on-board thermal imaging pedestrian detection technology:

1. The classifier training method and the RoIs filtering method proposed by the present invention form a "front-to-back cooperation" relationship. That is, in the on-board thermal imaging pedestrian detection process, the RoIs filtering method is first applied online to the RoIs obtained in the extraction stage to identify and remove non-pedestrian RoIs; the retained RoIs are then assigned to the corresponding one of the three classifiers, trained offline with the classifier training method for the far, middle and near distances, for fine detection.

2. The present invention proposes a RoIs filtering method. By constructing a three-layer cascade filter that conforms to the characteristics of pedestrians and has low computational overhead, RoIs with abnormal size, abnormal position or a missing pedestrian head are filtered out first, and a large number of non-pedestrian RoIs are suppressed. This ensures that the remaining RoIs to be detected can meet real-time requirements when the higher-precision classifier detection is performed, and at the same time reduces the system false alarm rate.

3. The present invention proposes a classifier training method that focuses on improving the number, distribution and quality of the training samples. By using the equalization method to enhance image contrast, extended samples with similar thermal imaging characteristics are generated from the original positive samples, constituting a sufficient number of enhanced positive samples; by using the clustering method to analyze the types of background information in the source negative samples, the obtained enhanced negative samples cover more representative background information in appropriate proportions; by preprocessing the enhanced positive and negative samples, the sample quality is improved; and by using the clustering method to obtain the division criteria for the enhanced positive and negative training sets, the intra-class variation of the samples is reduced. The classifier training method improves the scene adaptability of the classifier, and because it works at the sample level, it adds little computational overhead to the system and better meets practical application requirements.

Performance testing and evaluation of the method of the present invention is performed in an actual road pedestrian detection environment. The complete thermal imaging pedestrian detection apparatus for testing includes: the RoIs extraction method of the prior art 1, the RoIs filtering method proposed by the present invention, the classifier training method proposed by the present invention, and the classifier type based on "HOG feature and linear SVM" , Kalman tracking method. The hardware platform used for testing refers to the vehicle with the pedestrian detection system installed, which uses the NV628 infrared thermal imager produced by Guangzhou Biotech Co., Ltd. and the DM6437 embedded platform produced by Texas Instruments.

The test plan selects several road sections in Guangzhou and uses the vehicle to perform static and dynamic tests of the actual effect. The tests were run at night under cloudy conditions, with an ambient temperature of about 27 °C and a relative humidity of about 90%. The evaluation indices are obtained by manually reviewing the saved detection video and counting the number of effective pedestrians, the number of accurately detected pedestrians, and the number of false alarm individuals, from which the detection rate is calculated. Here, an effective pedestrian is a pedestrian target that appears for at least 1 second in the detection video, which has a frame rate of 25 frames per second; pedestrian targets include front, back and side walking postures, as well as longitudinal riding postures on bicycles, electric vehicles and motorcycles. The number of false alarm individuals is the number of false detections that occur within a given test section; when a false alarm individual or region persists in the picture, it is counted once. The detection rate is the ratio of the number of accurately detected pedestrians to the number of effective pedestrians.
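As an illustration of the evaluation indices defined above, the following minimal Python sketch expresses the effective-pedestrian criterion and the detection rate. The function names and the example counts are hypothetical and are not taken from the test data.

```python
FRAME_RATE = 25  # frames per second, as stated for the detection video


def is_effective_pedestrian(frames_visible: int, frame_rate: int = FRAME_RATE) -> bool:
    """A target counts as effective if it is visible for at least 1 second."""
    return frames_visible >= frame_rate


def detection_rate(num_detected: int, num_effective: int) -> float:
    """Ratio of accurately detected pedestrians to effective pedestrians."""
    if num_effective <= 0:
        raise ValueError("num_effective must be positive")
    return num_detected / num_effective


# Hypothetical counts, for illustration only.
print(is_effective_pedestrian(24))  # False: visible for less than 1 second
print(is_effective_pedestrian(25))  # True
print(detection_rate(3, 4))         # 0.75
```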

For the static test, three ordinary paved road sections with a straight-line length of more than 200 meters were selected in the Guangzhou Free Trade Zone. The test vehicle was parked at an appropriate position, and multiple upright walking pedestrians were randomly distributed within 15-70 meters in front of the vehicle. The results were collected and tallied using a computer, as shown in Table 1.

According to the static test results in Table 1, the thermal imaging pedestrian detection system using the method of the present invention performs well when the test vehicle is stationary: in the static tests on these sections, the detection rate of effective pedestrians is 100%, and the number of false alarm individuals is zero.

Table 1 Statistics of static test results

For the dynamic test, six ordinary paved road sections covering suburban, urban, and high-speed scenes in Guangzhou were selected, and the vehicle was driven at 10-80 km/h for a 10-minute field test on each section, for a total of 60 minutes. The results were collected and tallied using a computer, as shown in Table 2.

Table 2 Dynamic test result statistics

According to the dynamic test results in Table 2, the detection performance of the thermal imaging pedestrian detection system is lower when the test vehicle is moving than in the static test. The analysis attributes this to the following: during driving, interfering background heat sources such as road vehicles and trees are more complex, and a number of pedestrian targets are occluded; at the same time, owing to the characteristics of thermal imaging, the brightness and contrast of the captured images change continually during driving. These factors affect the results of the dynamic test. In the dynamic tests on these sections, the average detection rate reaches 75.63%, the average number of false alarm individuals is 10, and the detection speed of the pedestrian detection system basically meets the real-time requirement.

The above is a detailed description of the present invention in connection with specific embodiments, but the specific implementation of the present invention is not to be considered limited to these embodiments. Numerous modifications, changes, and substitutions may be made to the embodiments of the present invention without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A region of interest (RoI) filtering method for on-board thermal imaging pedestrian detection, characterized in that the method comprises:

filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the aspect ratio of the RoIs and setting corresponding threshold intervals;

calculating, RoI by RoI, the vertical distances between the upper and lower boundaries of each RoI and the current image road surface reference, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs with abnormal positions; and

searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filtering out RoIs that lack a pedestrian head.
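By way of non-limiting illustration, the cascade structure of the three filtering layers can be sketched in Python as follows. The `RoI` type, the `cascade` function, and the three stand-in predicates are hypothetical names and criteria introduced only for this sketch; they are not the criteria of the claimed layers.

```python
from typing import Callable, Iterable, List, NamedTuple


class RoI(NamedTuple):
    x: int  # left edge in pixels
    y: int  # top edge in pixels (y grows downwards)
    w: int  # width in pixels
    h: int  # height in pixels


RoIFilter = Callable[[RoI], bool]  # returns True if the RoI should be kept


def cascade(rois: Iterable[RoI], filters: List[RoIFilter]) -> List[RoI]:
    """Keep only RoIs accepted by every layer.

    all() short-circuits, so later (more expensive) layers are skipped
    as soon as an earlier layer rejects the RoI.
    """
    return [roi for roi in rois if all(f(roi) for f in filters)]


# Hypothetical stand-ins for the three layers, for illustration only.
def size_ok(roi: RoI) -> bool:
    return 20 <= roi.h <= 100 and roi.w < roi.h


def position_ok(roi: RoI) -> bool:
    return roi.y + roi.h >= 120  # lower boundary reaches an assumed road line


def head_ok(roi: RoI) -> bool:
    return True  # placeholder: a real layer would inspect image content


candidates = [RoI(50, 90, 20, 50), RoI(10, 10, 80, 30), RoI(200, 20, 15, 40)]
print(cascade(candidates, [size_ok, position_ok, head_ok]))
```

Ordering the layers from cheapest to most expensive keeps the average cost per RoI low, since most non-pedestrian RoIs are rejected before the image content is examined.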
2. The region of interest filtering method according to claim 1, wherein filtering out the RoIs of abnormal size comprises:

calculating the threshold interval of the pedestrian RoI pixel height from the image focal length $f$, the pedestrian target height, and the detection distance:

$$\text{height}_{\text{pixel}} \approx \text{height}_{\text{target}} \times \frac{f}{\text{distance}} \qquad (1)$$

where $\text{height}_{\text{pixel}}$ is the threshold interval of the pixel height of the pedestrian RoIs, $\text{height}_{\text{target}}$ is the height of the pedestrian target, $f$ is the image focal length, and $\text{distance}$ is the detection distance;

obtaining the Gaussian distribution of the pedestrian RoI aspect ratio by a statistical analysis method, and selecting an appropriate confidence level to obtain the aspect ratio threshold interval; and

evaluating each RoI to be tested, the RoIs that do not satisfy both interval conditions being RoIs of abnormal size, and removing these abnormally sized RoIs.
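The size check described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the focal length is taken to be expressed in pixels, formula (1) is evaluated at the extremes of an assumed pedestrian height range and detection distance range, and the aspect ratio interval is taken as the mean plus or minus k standard deviations of sample ratios. All numeric values, the function names, and the choice k = 1.96 are hypothetical.

```python
from statistics import mean, stdev


def pixel_height_interval(f_px, height_range_m, distance_range_m):
    """Apply formula (1) at the extremes of target height and distance."""
    h_min, h_max = height_range_m
    d_min, d_max = distance_range_m
    # Shortest pedestrian at the farthest distance, tallest at the nearest.
    return (h_min * f_px / d_max, h_max * f_px / d_min)


def aspect_ratio_interval(ratios, k=1.96):
    """mean +/- k*std of sample width/height ratios (k is an assumed choice)."""
    mu, sigma = mean(ratios), stdev(ratios)
    return (mu - k * sigma, mu + k * sigma)


def size_is_normal(w, h, height_iv, ratio_iv):
    """True if both the pixel height and the aspect ratio lie in their intervals."""
    if h <= 0:
        return False
    return height_iv[0] <= h <= height_iv[1] and ratio_iv[0] <= w / h <= ratio_iv[1]


# Hypothetical parameters, for illustration only.
height_iv = pixel_height_interval(500.0, (1.5, 2.0), (15.0, 70.0))
ratio_iv = aspect_ratio_interval([0.35, 0.40, 0.45, 0.40, 0.38, 0.42])

print(size_is_normal(16, 40, height_iv, ratio_iv))  # plausible pedestrian size
print(size_is_normal(40, 40, height_iv, ratio_iv))  # too wide for its height
print(size_is_normal(2, 5, height_iv, ratio_iv))    # too small
```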
3. The region of interest filtering method according to claim 1, wherein filtering out the RoIs with abnormal positions comprises:

obtaining the current image road surface reference using the horizontal road surface hypothesis method;

for the RoIs to be judged, calculating, RoI by RoI, the distances in the y-axis direction between the upper and lower boundaries of the RoI and the road surface reference, the y-axis direction being the vertical direction of the RoI, and calculating the threshold based on the current RoI pixel height $RoI_h$ according to formula (2):

where α and β are scaling factors and ε is an offset noise factor; and

filtering out the RoIs to be detected whose results do not meet the threshold.
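The general shape of such a position check can be sketched in Python as follows. Formula (2) is not reproduced in this text, so the linear thresholds used here are an assumed stand-in and not the formula of the invention. The road surface reference is assumed to be a single image row `y_ref`, and the function name and all parameter values are hypothetical.

```python
def position_is_normal(y_top, y_bottom, y_ref, alpha=1.0, beta=1.0, eps=2.0):
    """Check the vertical placement of an RoI against a road reference row.

    ASSUMPTION: the thresholds alpha*h + eps and beta*h + eps are an
    illustrative stand-in for formula (2), which is not available here.
    """
    roi_h = y_bottom - y_top
    if roi_h <= 0:
        return False
    d_top = abs(y_top - y_ref)        # distance of the upper boundary to the reference
    d_bottom = abs(y_bottom - y_ref)  # distance of the lower boundary to the reference
    return d_top <= alpha * roi_h + eps and d_bottom <= beta * roi_h + eps


# Hypothetical RoIs with the reference at image row 120.
print(position_is_normal(100, 160, 120))  # RoI straddles the reference
print(position_is_normal(10, 30, 120))    # small RoI far above the reference
```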
4. The region of interest filtering method according to claim 1, wherein filtering out the RoIs that lack a pedestrian head comprises:

using a pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts in the horizontal direction, the middle part being designated the head region and the left and right parts being designated the background regions; and

using a Haar-like feature-based method to estimate the degree of difference in mean luminance between the head region and the background regions, and removing the RoIs that lack a head according to a preset threshold.
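A three-rectangle, Haar-like comparison of this kind can be sketched in Python as follows. The sketch assumes that the head boundaries `x_left` and `x_right` have already been located, that the upper RoI region is given as a list of rows of luminance values, and that pedestrians appear brighter than the background. The function names and the threshold value are hypothetical.

```python
def region_mean(rows, x0, x1):
    """Mean luminance of columns x0..x1-1 over all rows; None if the span is empty."""
    values = [v for row in rows for v in row[x0:x1]]
    return sum(values) / len(values) if values else None


def has_head(upper_rows, x_left, x_right, threshold):
    """Compare the middle (head) part with the left and right (background) parts."""
    width = len(upper_rows[0])
    head = region_mean(upper_rows, x_left, x_right)
    left = region_mean(upper_rows, 0, x_left)
    right = region_mean(upper_rows, x_right, width)
    backgrounds = [m for m in (left, right) if m is not None]
    if head is None or not backgrounds:
        return False
    background = sum(backgrounds) / len(backgrounds)
    # ASSUMPTION: a head is warmer, hence brighter, than its surroundings.
    return head - background >= threshold


# Hypothetical 3x6 luminance patches.
bright_centre = [[10, 10, 200, 200, 10, 10]] * 3
uniform = [[50, 50, 50, 50, 50, 50]] * 3
print(has_head(bright_centre, 2, 4, threshold=50))  # True
print(has_head(uniform, 2, 4, threshold=50))        # False
```

In an embedded implementation the region means would typically be read from an integral image so that each comparison costs a fixed number of lookups; the plain sums above are used only to keep the sketch short.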
5. The region of interest filtering method according to claim 4, wherein the pedestrian head adaptive positioning algorithm processes the upper region of the current RoI by a vertical projection method to obtain a corresponding sequence of projection results; calculates the differences between adjacent data in the sequence to obtain the vertical projection difference curve of the current RoI; and, according to a vertical boundary matching strategy, searches the extreme points of the curve for qualifying combinations of left and right head-region boundaries, the corresponding x-axis coordinates defining the position of the head region, where the x-axis is the horizontal direction of the RoI.
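The projection and difference steps can be sketched in Python as follows. The boundary matching shown here is a simplified stand-in, since the matching strategy is not detailed in this text: it pairs the strongest rise in the difference curve with the strongest fall to its right. The function names and the `min_width` parameter are hypothetical.

```python
def vertical_projection(rows):
    """Sum luminance down each column of the upper RoI region."""
    return [sum(col) for col in zip(*rows)]


def projection_difference(proj):
    """Differences between adjacent values of the projection sequence."""
    return [b - a for a, b in zip(proj, proj[1:])]


def locate_head(rows, min_width=2):
    """Return (x_left, x_right) of a candidate head region, or None.

    ASSUMPTION: simplified matching that takes the largest positive
    difference as the left boundary and the largest negative difference
    to its right as the right boundary (x_right is exclusive).
    """
    diff = projection_difference(vertical_projection(rows))
    if not diff:
        return None
    i = max(range(len(diff)), key=diff.__getitem__)
    if diff[i] <= 0 or i + 1 >= len(diff):
        return None
    j = min(range(i + 1, len(diff)), key=diff.__getitem__)
    if diff[j] >= 0:
        return None
    x_left, x_right = i + 1, j + 1
    return (x_left, x_right) if x_right - x_left >= min_width else None


# Hypothetical 3x6 luminance patch with a bright centre.
patch = [[10, 10, 200, 200, 10, 10]] * 3
print(vertical_projection(patch))  # [30, 30, 600, 600, 30, 30]
print(locate_head(patch))          # (2, 4)
```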
6. A region of interest (RoI) filtering device for on-board thermal imaging pedestrian detection, characterized in that the device comprises:

an abnormal-size RoI filter, which filters out RoIs of abnormal size by calculating the pedestrian pixel height and the RoI aspect ratio and setting corresponding threshold intervals;

an abnormal-position RoI filter, which calculates, RoI by RoI, the vertical distances between the upper and lower boundaries of each RoI and the current image road surface reference, calculates a threshold based on the pixel height of the RoI, and filters out RoIs with abnormal positions; and

a missing-head RoI filter, which searches for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, compares the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filters out RoIs that lack a pedestrian head.
7. The region of interest filtering device according to claim 6, wherein the abnormal-size RoI filter calculates a threshold interval of the pedestrian RoI pixel height according to the image focal length and the pedestrian detection distance; obtains the Gaussian distribution of the pedestrian RoI aspect ratio by a statistical analysis method and selects an appropriate confidence level to obtain the aspect ratio threshold interval; and evaluates each RoI to be detected, the RoIs that do not satisfy both interval conditions being RoIs of abnormal size, and removes these abnormally sized RoIs.
8. The region of interest filtering device according to claim 6, wherein the abnormal-position RoI filter obtains the current image road surface reference using the horizontal road surface hypothesis method; calculates, RoI by RoI, the distances in the y-axis direction between the upper and lower boundaries of the RoI and the road surface reference, the y-axis direction being the vertical direction of the RoI; calculates the threshold based on the current RoI pixel height; and filters out the RoIs to be detected whose results do not meet the threshold.
9. The region of interest filtering device according to claim 6, wherein the missing-head RoI filter uses a pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts in the horizontal direction, the middle part being designated the head region and the left and right parts being designated the background regions; and uses a Haar-like feature-based method to estimate the degree of difference in mean luminance between the head region and the background regions, and removes the RoIs that lack a head according to a preset threshold.
10. A pedestrian detection method for on-board thermal imaging, characterized in that the method comprises:

extracting the RoIs to be detected;

filtering the RoIs, wherein the RoI filtering comprises the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoI aspect ratio and setting corresponding threshold intervals; calculating, RoI by RoI, the vertical distances between the upper and lower boundaries of each RoI and the current image road surface reference, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, and filtering out RoIs that lack a pedestrian head;

training a classifier offline; and

classifying and detecting the filtered RoIs using the trained classifier.