WO2019196131A1 - Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection - Google Patents

Info

Publication number
WO2019196131A1
Authority
WO
WIPO (PCT)
Prior art keywords
rois
pedestrian
head
filtering
region
Prior art date
Application number
PCT/CN2018/083480
Other languages
French (fr)
Chinese (zh)
Inventor
许瑞霖
刘琼
彭绍武
吴继平
Original Assignee
广州飒特红外股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州飒特红外股份有限公司
Publication of WO2019196131A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143 Sensing or illuminating at different wavelengths
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • The present invention relates to pedestrian detection and, more particularly, to a Regions of Interest (RoIs) filtering method and apparatus for on-board thermal imaging pedestrian detection.
  • RoIs: Regions of Interest
  • On-board thermal imaging pedestrian detection uses an infrared camera as the visual sensor to capture images or video of traffic scenes, applies machine learning on a computer or embedded platform to identify every pedestrian target present, and marks the position of each pedestrian in the image with the coordinates of its minimum bounding rectangle.
  • This process consists of two key phases, RoIs extraction and RoIs classification; the main factors affecting computational overhead and accuracy are the number of extracted RoIs and the performance of the classifier used.
  • To meet a high recall requirement, the RoIs extraction phase usually produces a large number of RoIs.
  • Pedestrian targets, however, are rare in an image: most RoIs contain only background, and part of that background differs greatly from pedestrian characteristics. Running the classifier on all of these RoIs is computationally expensive, so a method is needed that reduces the number of RoIs to be detected while preserving accuracy.
  • Prior Art 1: Ge J, Luo Y, Tei G. Real-Time Pedestrian Detection and Tracking at Nighttime for Driver-Assistance Systems [J]. IEEE Transactions on Intelligent Transportation Systems, 2009, 10(2): 283-298.
  • In Prior Art 1, the RoIs are extracted from the near-infrared image by calculating the upper and lower limits of the segmentation threshold in the local neighborhood of each pixel.
  • Prior Art 2: Uijlings J R R, Sande K E A V D, Gevers T, et al. Selective Search for Object Recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
  • Prior Art 2 proposes a selective search method. The main idea is to divide the visible light image into small similar regions using different color spaces, and then to merge highly similar small regions into larger ones according to color, texture, and size using a region merging algorithm.
  • The number of RoIs obtained by the methods of Prior Art 1-3 is significantly reduced, but still threatens real-time performance.
  • The Prior Art 2 method obtains an average of about 2000 RoIs in a single image.
  • The Prior Art 3 method takes approximately 0.2 s to process a single image on a computer.
  • Prior Art 1-3 nevertheless offer an idea worth adopting: filter out non-pedestrian RoIs in advance with a relatively cheap computation, thereby reducing the number of RoIs to be detected.
  • Currently available thermal imaging pedestrian detection benchmark data sets are very scarce, and the present invention uses the laboratory-published SCUT Dataset ( http://www2.scut.edu.cn/cv/scut_fir_pedestrian_dataset/ ).
  • The dataset covers traffic road scenes in Guangzhou and contains 100 infrared thermal imaging videos.
  • The total number of frames is about 200,000.
  • The number of annotated Ground-Truth entries is about 400,000, covering different pedestrian target types such as "single walking pedestrian" and "single cycling pedestrian".
  • Compared with the KAIST Dataset, it has advantages in the number of image frames, the types and quantity of Ground-Truth information, the variety of road scenes, and so on.
  • the present invention is achieved by the following technical solutions.
  • A Regions of Interest (RoIs) filtering method for on-board thermal imaging pedestrian detection, comprising: calculating the pedestrian pixel height and the RoIs aspect ratio, setting the corresponding threshold intervals, and filtering out RoIs of abnormal size; calculating, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs of abnormal position; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs missing a pedestrian head.
  • Filtering out the RoIs of abnormal size includes: calculating the threshold interval of the pixel height of pedestrian RoIs from the image focal length f, the pedestrian height height_target, and the detection distance parameter distance, where:
  • height_pixel is the threshold interval of the pixel height of the pedestrian RoIs;
  • height_target is the height of the pedestrian target;
  • f is the image focal length;
  • distance is the detection distance.
  • The Gaussian distribution of the pedestrian RoIs aspect ratio is obtained, and an appropriate confidence level is selected to obtain the aspect ratio threshold interval; each RoI to be detected is then evaluated, and RoIs that do not satisfy both interval conditions are treated as abnormally sized and removed.
  • Filtering out the RoIs of abnormal position includes: obtaining the road surface reference of the current image using the horizontal road surface hypothesis method; and, for each RoI to be judged, calculating the distance in the y-axis direction between its upper and lower boundaries and the road surface reference,
  • where the y-axis direction is the vertical direction of the RoIs, and
  • the threshold based on the pixel height RoI_h of the current RoI is calculated according to formula (2):
  • ⁇ and ⁇ are scaling factors and ⁇ is an offset noise factor
  • Filtering out RoIs missing a pedestrian head includes: dividing the upper region of the current RoI into three parts in the horizontal direction using a pedestrian head adaptive positioning algorithm, the middle part being named the head region and the left and right parts the background regions; and using a Haar-like feature-based method to estimate the degree of difference between the mean luminance of the head region and that of the background regions, and removing RoIs missing a head according to a preset threshold.
  • The pedestrian head adaptive positioning algorithm processes the upper region of the current RoI with the luminance vertical projection method to obtain a projection result sequence; calculates the differences between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI; and, according to the vertical boundary matching strategy, finds a qualifying combination of left and right head boundaries among the extreme points of the curve, whose x-axis coordinates define the position of the head region, wherein the x-axis is the horizontal direction of the RoIs.
  • The present invention provides a Regions of Interest (RoIs) filtering device for on-board thermal imaging pedestrian detection, the device comprising: a size anomaly RoIs filter, which calculates the pedestrian pixel height and the RoIs aspect ratio, determines the corresponding threshold intervals, and filters out RoIs of abnormal size; a position abnormality RoIs filter, which calculates, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculates a threshold based on the pixel height of the RoI, and filters out RoIs of abnormal position; and a missing head RoIs filter, which searches for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, compares the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filters out RoIs missing a pedestrian head.
  • The present invention provides a method for on-board thermal imaging pedestrian detection, the method comprising: extracting RoIs to be detected; filtering the RoIs, wherein the RoIs filtering comprises the steps of calculating the pedestrian pixel height and the RoIs aspect ratio, determining the corresponding threshold intervals, and filtering out RoIs of abnormal size; calculating, RoI by RoI, the vertical distance between the upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the pixel height of the RoI, and filtering out RoIs of abnormal position; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs missing a pedestrian head; training a classifier offline; and classifying the filtered RoIs using the trained classifier.
  • The invention provides a RoIs filtering method for on-board thermal imaging pedestrian detection. Compared with existing RoIs filtering techniques for on-board thermal imaging pedestrian detection, and with respect to the adverse effects of the computational bottleneck, it has the following advantages and effects:
  • The invention proposes a RoIs filtering method. By constructing a three-layer cascade filter that conforms to pedestrian characteristics and has low computational overhead, RoIs with abnormal size, abnormal position, or a missing pedestrian head are filtered out first, so that a large number of non-pedestrian RoIs are suppressed. This ensures that the remaining RoIs can meet real-time requirements when passed to a higher-precision classifier, and reduces the false alarm rate of the system.
  • FIG. 1 is a flow chart showing a RoIs filtering method according to an embodiment of the present invention.
  • Fig. 2(a) shows the manually measured result for the pedestrian pixel height threshold interval.
  • Fig. 2(b) shows the Ground-Truth aspect ratio interval statistics.
  • Fig. 2(c) shows sample results of the head adaptive positioning algorithm.
  • FIG. 3 is a block diagram showing a RoIs filtering device according to an embodiment of the present invention.
  • FIG. 4 is a flow chart showing a classifier training method in accordance with an embodiment of the present invention.
  • (a) of FIG. 5 shows an example of Y channel preprocessing of a YUV 4:2:2 format image.
  • (b) of FIG. 5 shows a comparison of an original positive sample and an extended positive sample.
  • (c) of FIG. 5 shows some negative samples of vehicle interference heat sources.
  • FIG. 6 is a block diagram showing a classifier training apparatus according to an embodiment of the present invention.
  • FIG. 7 is a flow chart showing a pedestrian detection method according to an embodiment of the present invention.
  • FIG. 8 is a block diagram showing a pedestrian detecting apparatus according to an embodiment of the present invention.
  • FIG. 1 is a flow chart showing a RoIs filtering method according to an embodiment of the present invention.
  • The extraction stage obtains the RoIs bounding box information of possible target areas and records, for each RoI, the x-axis coordinate RoI_x and y-axis coordinate RoI_y of its upper left corner, its width RoI_w, and its height RoI_h.
  • A large number of RoIs are usually obtained.
  • If the subsequent classifier detection stage is run on them directly, a hardware platform with a computational bottleneck (such as an in-vehicle embedded platform) can hardly meet real-time requirements.
  • Pedestrian targets are rare in an image, and most of the extracted RoIs are non-pedestrian RoIs, quite a few of which are apparently non-pedestrian RoIs.
  • A pedestrian RoI is a RoIs bounding box whose Intersection over Union (IoU) with a pedestrian Ground-Truth bounding box exceeds 50%, and a non-pedestrian RoI is a RoIs bounding box whose IoU with the pedestrian Ground-Truth bounding boxes is less than 50%.
  • An apparently non-pedestrian RoI is one whose IoU with the pedestrian Ground-Truth bounding boxes is less than 30%; such RoIs are easy to tell apart by eye and can be separated out by setting some simple filtering conditions.
  • The pedestrian Ground-Truth bounding box refers to the real bounding box annotation of targets whose type is single walking pedestrian or single cycling pedestrian.
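  • The IoU-based labelling described above can be illustrated with a minimal sketch. It assumes boxes are given as (x, y, w, h) tuples with a top-left origin, matching the RoI_x, RoI_y, RoI_w, RoI_h fields; the function names and label strings are hypothetical, and an IoU of exactly 50% is treated as non-pedestrian because the text leaves that case open.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x, y, w, h) boxes with top-left origin."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def label_roi(roi, ground_truth_boxes, pedestrian_iou=0.5, apparent_iou=0.3):
    """Label a RoI by its best IoU against all pedestrian Ground-Truth boxes."""
    best = max((iou(roi, gt) for gt in ground_truth_boxes), default=0.0)
    if best > pedestrian_iou:
        return "pedestrian"
    if best < apparent_iou:
        return "apparently non-pedestrian"  # easy to rule out with simple filters
    return "non-pedestrian"
```

  • In this sketch the 30% cut-off only marks the subset of non-pedestrian RoIs that the cheap filters are meant to remove; RoIs between 30% and 50% are still non-pedestrian but are left to the classifier.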
  • The main idea of the RoIs filtering method of the embodiment of the present invention is to construct a three-layer cascade filter conforming to pedestrian characteristics, which first filters out RoIs with abnormal size, abnormal position, or a missing pedestrian head, thereby reducing the number of RoIs to be detected. The detailed flow chart is shown in FIG. 1.
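  • The cascade structure can be sketched generically as follows. This is only an illustration of chaining cheap tests so that each stage sees fewer RoIs than the one before; the function name, the stage names, and the toy predicates in the usage lines are assumptions, not the tests defined in this document.

```python
def cascade_filter(rois, stages):
    """Apply (name, keep_predicate) stages in order.

    Returns the surviving RoIs and the number of RoIs each stage removed.
    """
    remaining = list(rois)
    removed = {}
    for name, keep in stages:
        kept = [roi for roi in remaining if keep(roi)]
        removed[name] = len(remaining) - len(kept)
        remaining = kept  # later stages only see survivors
    return remaining, removed


# Toy usage with placeholder predicates on (x, y, w, h) tuples.
stages = [
    ("size", lambda roi: roi[3] >= 30),
    ("position", lambda roi: roi[1] < 300),
]
survivors, removed = cascade_filter(
    [(0, 0, 10, 20), (0, 100, 20, 60), (0, 400, 20, 60)], stages
)
```

  • Ordering the stages from cheapest to most expensive means the head test, which has to read pixel data, runs only on RoIs that already passed the purely geometric tests.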
  • In step 110, the abnormally sized RoIs are filtered out.
  • The RoIs of abnormal size are filtered out by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals. In more detail, this includes:
  • Step 111: Calculate the threshold interval of the pixel height of pedestrian RoIs according to the image focal length and the pedestrian detection distance.
  • The pedestrian detection range is about 20 to 85 meters in front of the car.
  • From the image focal length f, the pedestrian height height_target, and the detection distance parameter distance, the pixel height threshold interval of pedestrian targets in this range is calculated to be [30, 140],
  • where height_pixel is the pixel height of the pedestrian target in the image;
  • height_target is the height of the pedestrian target, set to about 1.7 meters in the experiments;
  • f is the image focal length, whose value in the SCUT Dataset is 1554;
  • and distance is the detection distance.
  • (a) of FIG. 2 shows the manually measured result for the pedestrian pixel height threshold interval.
  • The two images in (a) of FIG. 2 were taken by an infrared camera mounted on a stationary car, with a pedestrian target standing on a flat road surface at distances of 20 meters and 85 meters.
  • The pedestrian bounding boxes in the two images (drawn as dotted lines) were measured manually: at 20 meters the pedestrian is 138 pixels high and 42 pixels wide, and at 85 meters the pedestrian is 30 pixels high and 12 pixels wide. These values differ only slightly from those calculated with formula (1), which shows that the calculation of formula (1) is effective.
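  • Formula (1) itself is not reproduced in this text. A minimal sketch, assuming it is the usual pinhole projection height_pixel = height_target × f / distance with f expressed in pixels, reproduces the numbers quoted above reasonably well; the function names are hypothetical.

```python
def pixel_height(height_target_m, focal_length_px, distance_m):
    """Projected height in pixels under a pinhole model (assumed form of formula (1))."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return height_target_m * focal_length_px / distance_m


def pixel_height_interval(height_target_m, focal_length_px, near_m, far_m):
    """Pixel height range (min, max) for targets between near_m and far_m."""
    return (
        pixel_height(height_target_m, focal_length_px, far_m),   # farthest: smallest
        pixel_height(height_target_m, focal_length_px, near_m),  # nearest: largest
    )


# Values from the text: 1.7 m pedestrian, f = 1554, range 20 m to 85 m.
low, high = pixel_height_interval(1.7, 1554, 20, 85)
```

  • Under this assumption the interval comes out at roughly 31 to 132 pixels, close to the manually measured 30 and 138 pixels and consistent with the stated threshold interval [30, 140].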
  • Step 112: Using statistical analysis, obtain the Gaussian distribution of the pedestrian RoIs aspect ratio and select an appropriate confidence level to obtain the aspect ratio threshold interval.
  • RoIs obtained from foreground regions vary greatly in aspect ratio, and the aspect ratio of many non-pedestrian RoIs differs greatly from actual human proportions. Based on this, the Gaussian distribution of the pedestrian RoIs aspect ratio is obtained by statistical analysis, and an appropriate confidence level is selected to obtain the aspect ratio threshold interval [1.5, 4]. The statistical samples come from the pedestrian Ground-Truth information of the SCUT Dataset, with target annotation types "single walking pedestrian" and "single cycling pedestrian".
  • (b) of FIG. 2 shows the Ground-Truth aspect ratio statistics for 44 videos, for the target types "single walking pedestrian" and "single cycling pedestrian".
  • For the Ground-Truth samples of these 44 videos, the aspect ratio of each sample is calculated and plotted as a histogram; that is, the Gaussian distribution of the pedestrian RoIs aspect ratio is obtained using statistical analysis.
  • The horizontal axis of the plot is the aspect ratio and the vertical axis is the number of samples. The aspect ratio of the samples lies approximately between 1 and 4.
  • Finally, an appropriate confidence level is selected to determine the aspect ratio threshold interval [1.5, 4].
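  • One way to turn a fitted Gaussian and a confidence level into a threshold interval is the symmetric interval mean ± z·sigma. The sketch below assumes that reading; the confidence level actually used is not stated in the text, and the function name and sample values are illustrative only.

```python
from statistics import NormalDist, mean, stdev


def aspect_ratio_interval(ratios, confidence=0.95):
    """Symmetric interval mean +/- z * sigma of a Gaussian fitted to the ratios."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    mu = mean(ratios)
    sigma = stdev(ratios)  # sample standard deviation, needs at least 2 values
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided quantile
    return mu - z * sigma, mu + z * sigma


# Illustrative height/width ratios, not data from the SCUT Dataset.
low, high = aspect_ratio_interval([2.0, 2.5, 3.0], confidence=0.95)
```

  • A higher confidence level widens the interval and keeps more true pedestrians at the cost of letting more background RoIs through, which is the trade-off behind choosing an "appropriate" level.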
  • Step 113: Evaluate each RoI to be detected; RoIs that do not satisfy both interval conditions are RoIs of abnormal size and are removed.
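  • A minimal sketch of the size test in step 113, using the two intervals quoted in the text. It assumes RoIs are (x, y, w, h) tuples and that "aspect ratio" means height divided by width, which is consistent with the measured 138 × 42 and 30 × 12 pixel boxes; the constant and function names are hypothetical.

```python
HEIGHT_INTERVAL = (30, 140)    # pixel height interval from the text
ASPECT_INTERVAL = (1.5, 4.0)   # height/width interval from the text


def is_size_normal(roi):
    """True if the RoI satisfies both the height and the aspect ratio interval."""
    _, _, width, height = roi
    if width <= 0 or height <= 0:
        return False  # degenerate boxes are never pedestrians
    height_ok = HEIGHT_INTERVAL[0] <= height <= HEIGHT_INTERVAL[1]
    aspect_ok = ASPECT_INTERVAL[0] <= height / width <= ASPECT_INTERVAL[1]
    return height_ok and aspect_ok


def filter_abnormal_size(rois):
    """Keep only RoIs that pass the size test."""
    return [roi for roi in rois if is_size_normal(roi)]
```

  • Both checks use only the stored width and height, so this stage never touches pixel data.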
  • Pedestrian targets in traffic scenes are subject to a strong positional constraint: whether walking or cycling, most of them are on the road surface, so the centers of pedestrian targets in the image are distributed along a horizontal band. Based on this experience, RoIs with abnormal positions in the image are very likely to be apparently non-pedestrian RoIs.
  • In step 120, the abnormally located RoIs are filtered out. Specifically, the vertical distance between the upper and lower boundaries and the road surface reference of the current image is calculated RoI by RoI, a threshold based on the pixel height of the RoI is calculated, and the RoIs with abnormal positions are filtered out.
  • Step 121: Acquire the road surface reference of the current image using the horizontal road surface hypothesis method.
  • The horizontal road surface hypothesis method is used to obtain the y-axis coordinate Horizon_y of the road surface reference of the current image.
  • Step 122: Calculate, RoI by RoI, the distance in the y-axis direction of the image between the upper and lower boundaries and the road surface reference, and set a threshold based on the pixel height of the current RoI.
  • The distance in the y-axis direction of the image between the upper and lower boundaries of the RoI and the road surface reference is calculated, and the adaptive threshold based on the pixel height RoI_h of the current RoI is calculated according to formula (2).
  • Step 123: Filter out the RoIs to be detected whose distances do not meet the threshold.
  • The operation of step 122 is repeated one by one for the RoIs to be detected that satisfy the size feature requirement, and all RoIs with abnormal positions are filtered out.
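  • Formula (2) is not reproduced in this text, so the following is only a hypothetical illustration of the kind of test steps 121 to 123 describe. It assumes the distances are |RoI_y − Horizon_y| and |RoI_y + RoI_h − Horizon_y| and that each is compared with a threshold of the form scale × RoI_h + offset; the parameter names alpha, beta, and eps and their default values are invented for the example.

```python
def is_position_normal(roi, horizon_y, alpha=1.0, beta=1.0, eps=5.0):
    """Hypothetical position test against a road surface reference line.

    alpha and beta scale the RoI height for the upper and lower boundary,
    eps is an additive tolerance; none of these values come from the text.
    """
    _, y, _, height = roi
    top_gap = abs(y - horizon_y)              # upper boundary to reference
    bottom_gap = abs(y + height - horizon_y)  # lower boundary to reference
    return top_gap <= alpha * height + eps and bottom_gap <= beta * height + eps


def filter_abnormal_position(rois, horizon_y, **params):
    """Keep only RoIs whose boundaries stay close enough to the reference line."""
    return [roi for roi in rois if is_position_normal(roi, horizon_y, **params)]
```

  • Tying the threshold to RoI_h makes the tolerance adaptive: tall boxes, which correspond to nearby pedestrians, may extend further from the reference line than small, distant ones.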
  • In step 130, the RoIs missing a pedestrian head are filtered out. Specifically, a possible pedestrian head region is searched for according to the luminance vertical projection difference curve of each RoI, the degree of difference between the Haar-like features of the head region and of the adjacent background regions is compared, and the RoIs missing a pedestrian head are filtered out.
  • The non-pedestrian RoIs obtained during extraction generally contain background interference heat sources of traffic scenes, such as roadside tree branches and uniform heat sources. It has been observed that the human head is rarely occluded by other objects and is usually exposed, so in thermal images it is often brighter than the adjacent background and has a relatively stable contour. Based on this, in more detail, filtering out the RoIs missing a pedestrian head includes:
  • Step 131: Use the pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts in the horizontal direction; the middle part is named the head region and the left and right parts are named the background regions. The upper region of a RoI is the part extending along the y-axis from the upper boundary of the RoI down to 1/3 or 1/5 of its pixel height.
  • The pedestrian head adaptive positioning algorithm processes the upper region of the current RoI with the luminance vertical projection method to obtain a projection result sequence, and calculates the differences between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI. Then, according to the proposed vertical boundary matching strategy, qualifying combinations of left and right head boundaries are searched for among the extreme points of the curve, and the corresponding x-axis coordinates define the position of the head region.
  • The pedestrian head adaptive positioning algorithm is as follows:
  • The head position boundaries correspond only to extreme points of V'_T.
  • The head region is brighter than the background region, so the left boundary of the head corresponds to a positive extreme point of V'_T and the right boundary of the head corresponds to a negative extreme point of V'_T;
  • a right boundary found before any left boundary is treated as background interference, because the left-to-right traversal must find the left boundary of the head first;
  • Let the candidate boundary pairs be X_edge_N = {(X_edge_l1, X_edge_r1), (X_edge_l2, X_edge_r2), ..., (X_edge_ln, X_edge_rn)}, then traverse these pairs to find the optimal one: tentatively take (X_edge_l1, X_edge_r1) as the optimal combination; examine the next pair; if it has the same left boundary as the current optimal combination, compare the two right boundaries and keep the pair with the larger value; if the left boundaries differ, calculate the vertical centerline position (along the x-axis) of each pair, compare its distance to the vertical centerline of the current RoI, and keep the pair closer to the vertical centerline of the RoI as the optimal combination (because the pedestrian head is more likely to be at the center of the upper region P_up);
  • The pedestrian head adaptive positioning algorithm above yields the left and right boundary pair (X_edge_l, X_edge_r) of the upper region P_up, and P_up is divided in the horizontal direction into three parts P_l, P_m, P_r.
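  • The steps above can be sketched in plain Python. This is a simplified reading, not the complete algorithm: the upper region P_up is assumed to be a list of equal-length rows of luminance values, extreme points are taken as local maxima or minima of the difference curve without any additional thresholding, and all function names are hypothetical.

```python
def vertical_projection(region):
    """Sum the luminance of each column of the region."""
    return [sum(column) for column in zip(*region)]


def difference_curve(projection):
    """Differences between adjacent projection values."""
    return [b - a for a, b in zip(projection, projection[1:])]


def boundary_candidates(diff):
    """Columns where the head may start (positive peaks) or end (negative peaks)."""
    lefts, rights = [], []
    for i, value in enumerate(diff):
        prev_value = diff[i - 1] if i > 0 else 0
        next_value = diff[i + 1] if i + 1 < len(diff) else 0
        if value > 0 and value >= prev_value and value >= next_value:
            lefts.append(i + 1)  # brightness rises between columns i and i + 1
        elif value < 0 and value <= prev_value and value <= next_value:
            rights.append(i)     # brightness falls after column i
    return lefts, rights


def locate_head(region):
    """Return (left_column, right_column) of the head, inclusive, or None."""
    if not region or not region[0]:
        return None
    lefts, rights = boundary_candidates(difference_curve(vertical_projection(region)))
    # A right boundary lying before a left boundary is ignored as interference.
    pairs = [(left, right) for left in lefts for right in rights if right >= left]
    if not pairs:
        return None
    roi_centre = (len(region[0]) - 1) / 2
    best = pairs[0]
    for left, right in pairs[1:]:
        if left == best[0]:
            if right > best[1]:  # same left boundary: prefer the larger right
                best = (left, right)
        elif abs((left + right) / 2 - roi_centre) < abs((best[0] + best[1]) / 2 - roi_centre):
            best = (left, right)  # different left: prefer the more central pair
    return best
```

  • For a region whose columns 3 to 5 are bright and the rest dark, the difference curve has one positive and one negative peak, and the function returns (3, 5).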
  • Step 132: Evaluate the degree of difference between the mean luminance of the head region and that of the background regions using a Haar-like feature-based method, and compare it with a preset threshold,
  • where min() is the minimum function;
  • abs() is the absolute value function;
  • avg_l, avg_m, avg_r are the mean luminance values of P_l, P_m, P_r, respectively;
  • and the value of T_haar is set experimentally, in the range 13 to 15.
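  • The formula for the Haar-like value is not reproduced in this text. One reading consistent with the listed definitions is min(abs(avg_m − avg_l), abs(avg_m − avg_r)) compared against T_haar; the sketch below assumes that form, uses 14 as an illustrative threshold inside the stated 13 to 15 range, and uses hypothetical function names.

```python
def mean_luminance(region, first_col, last_col):
    """Mean of all pixels in columns first_col..last_col (inclusive), or None if empty."""
    values = [row[x] for row in region for x in range(first_col, last_col + 1)]
    return sum(values) / len(values) if values else None


def haar_difference(region, left, right):
    """Assumed Haar-like value: smaller of the head-vs-left and head-vs-right contrasts."""
    width = len(region[0])
    avg_l = mean_luminance(region, 0, left - 1)
    avg_m = mean_luminance(region, left, right)
    avg_r = mean_luminance(region, right + 1, width - 1)
    if avg_l is None or avg_m is None or avg_r is None:
        return None  # the head touches the RoI edge, so no background on one side
    return min(abs(avg_m - avg_l), abs(avg_m - avg_r))


def has_head(region, left, right, t_haar=14.0):
    """True if the head region stands out from both background regions."""
    value = haar_difference(region, left, right)
    return value is not None and value > t_haar
```

  • Taking the minimum of the two contrasts requires the head to differ from the background on both sides, so a bright object that merely borders a dark area on one side does not pass.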
  • Step 133: Filter out the RoIs missing a pedestrian head.
  • Steps 131 and 132 are performed one by one for the RoIs to be detected that satisfy the position feature requirement, and the RoIs missing a pedestrian head are filtered out.
  • When the RoIs are extracted by the double threshold segmentation method of Prior Art 1, the average number of RoIs obtained in a single image is about 100.
  • With the RoIs filtering method described above, the number of RoIs can be reduced by about half, and the average time taken is within a few milliseconds.
  • From the pedestrian Ground-Truth bounding boxes of the SCUT Dataset whose target type is single walking pedestrian or single cycling pedestrian and whose occlusion label is unoccluded, a total of 14,000 samples were extracted to test the head adaptive positioning algorithm. Manual counting found only 1162 samples in which the left and right boundaries of the head were located incorrectly, an accuracy of about 92%.
  • The proposed head positioning algorithm therefore has high precision. Some examples are shown in (c) of FIG. 2, which gives sample results of the head adaptive positioning algorithm; the two white vertical lines added in each image correspond to the left and right boundary pair (X_edge_l, X_edge_r) of the pedestrian head obtained by the algorithm.
  • FIG. 3 is a block diagram showing a RoIs filtering device according to an embodiment of the present invention.
  • The RoIs filtering device 300 includes a size anomaly RoIs filter 310, a position abnormality RoIs filter 320, and a missing head RoIs filter 330.
  • The size anomaly RoIs filter 310 filters out RoIs of abnormal size. Specifically, it calculates the threshold interval of the pixel height of pedestrian RoIs from the image focal length and the pedestrian detection distance; obtains the Gaussian distribution of the pedestrian RoIs aspect ratio by statistical analysis and selects an appropriate confidence level to obtain the aspect ratio threshold interval; and then evaluates each RoI to be detected and filters out those that do not satisfy both interval conditions.
  • The position abnormality RoIs filter 320 filters out RoIs with abnormal positions. Specifically, it obtains the road surface reference of the current image using the horizontal road surface hypothesis method; calculates, RoI by RoI, the distance in the y-axis direction of the image between the upper and lower boundaries and the road surface reference; sets a threshold based on the pixel height of the current RoI; and then filters out the RoIs to be detected whose distances do not meet the threshold.
  • The missing head RoIs filter 330 filters out the RoIs missing a pedestrian head.
  • Specifically, the pedestrian head adaptive positioning algorithm is applied to the current RoI to obtain the left and right boundary pair (X_edge_l, X_edge_r) of the upper region P_up, and P_up is divided in the horizontal direction into three parts P_l, P_m, P_r. The Haar-like feature value of P_up is calculated according to formula (7) above and compared with the threshold T_haar; if the value is greater than the threshold, the head constraint is satisfied. These operations are performed one by one on the RoIs to be detected that satisfy the position feature requirement, and the RoIs missing a pedestrian head are filtered out.
  • FIG. 4 is a flow chart showing a classifier training method in accordance with an embodiment of the present invention.
  • Enhanced positive samples and enhanced negative samples are generated. Specifically, positive-sample annotation information is combined with an equalization technique to generate the enhanced positive samples, and a clustering method is used to analyze the information distribution of non-pedestrian background image blocks and to assist in screening enhanced negative samples of different categories.
  • The number of positive samples available from published thermal imaging data sets is usually limited, so an image enhancement method is needed to generate new positive samples on this basis. Negative samples are non-pedestrian areas of the entire image and are therefore not short in quantity; however, the traditional method obtains negative samples by random grid sampling, which often differs from the RoIs extraction method used in the actual detection process. The background information distributions of the two therefore differ greatly, that is, such negative samples are poorly representative of the actual non-pedestrian RoIs.
  • the enhanced positive samples include the original positive samples and the extended positive samples.
  • Generating the enhanced positive samples includes: taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, and extracting the corresponding image blocks according to the annotated pedestrian Ground-Truth bounding boxes and preset criteria to obtain the original positive samples.
  • The plateau histogram equalization method is then applied to the luminance information of the original positive samples one by one to obtain the extended positive samples. That is, the equalization method enhances the contrast of the luminance information of the original positive samples and generates extended positive samples with similar thermal imaging characteristics, so that a sufficient number of enhanced positive samples is obtained.
  • FIG. 5(b) shows a comparison of an original positive sample and an extended positive sample.
  • The original positive sample set Pos_p is screened according to preset criteria. The specific criteria are: the target type is "single walking pedestrian" or "single cycling pedestrian", the occlusion label is unoccluded, the frame sampling interval is 5, and the pixel height is within [30, 140]. The number of samples in Pos_p is recorded as PosNum_p;
  • The plateau histogram equalization method is applied to the luminance information sample by sample to obtain the corresponding new image blocks; samples that are overexposed or have lost their contours are manually excluded, and the retained samples form the extended positive sample set Pos_e, whose number is recorded as PosNum_e;
  • The sample sets Pos_p and Pos_e together constitute the enhanced positive sample set Pos of the classifier, as shown in equation (8), where PosNum_e ≤ PosNum_p.
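  • Plateau histogram equalization can be sketched as follows: the histogram is clipped at a plateau value before the cumulative distribution is built, which limits how strongly a dominant background level stretches the grey-level mapping. This is a generic textbook form, not necessarily the exact variant used by the inventors; the `plateau` threshold is a hypothetical tuning parameter.

```python
def plateau_equalize(pixels, plateau, levels=256):
    """pixels: flat list of integer luminance values in [0, levels - 1].
    plateau: clipping value applied to every histogram bin (hypothetical)."""
    if not pixels:
        return []
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    clipped = [min(count, plateau) for count in hist]  # plateau clipping
    total = sum(clipped)
    lut, running = [], 0
    for count in clipped:
        running += count                                # clipped cumulative sum
        lut.append((levels - 1) * running // total)     # map to output range
    return [lut[p] for p in pixels]
```

  • In the small example used to check the sketch, a block dominated by one grey level (90 pixels at 100, 5 at 101, 5 at 102) is mapped to 85, 170 and 255 with a plateau of 5, so the two rare levels are spread across the output range instead of being compressed near the top.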
  • Generating the enhanced negative samples includes: extracting the original negative samples from the complete images of the data set using the RoIs extraction method of the detection process, and using K-means clustering and uniform random selection to ensure that the screened enhanced negative samples cover more representative background information in appropriate proportions.
  • Image blocks are extracted from the complete images of the pedestrian detection data set SCUT Dataset using the RoIs extraction method of the detection process. Blocks whose IoU with a pedestrian Ground-Truth bounding box exceeds 30%, and blocks judged to be abnormal in size (for example, by the RoIs filtering method described above), are removed; the retained image blocks are recorded as source negative samples. The source negative samples are clustered with the K-means method, and image blocks are selected uniformly at random from the clustering result according to a calculated ratio to form the enhanced negative samples. Further, negative samples containing vehicle interference heat sources are added according to the clustering result, increasing the proportion of this kind of background information in the enhanced negative samples.
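  • The screening of source negative samples and the per-cluster selection can be sketched as below. The clustering itself is assumed to have been done already (any K-means implementation that returns one label per sample would do); the box layout `(x1, y1, x2, y2)`, the `quota` dictionary and the fixed random seed are illustrative assumptions.

```python
import random

def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def source_negatives(candidates, gt_boxes, max_iou=0.3):
    """Drop candidate blocks overlapping any ground-truth box by more than max_iou."""
    return [c for c in candidates
            if all(iou(c, g) <= max_iou for g in gt_boxes)]

def sample_per_cluster(labels, quota, seed=0):
    """labels[i]: cluster id of sample i. quota: cluster id -> number to draw.
    Returns sorted indices drawn uniformly at random within each cluster."""
    rng = random.Random(seed)
    by_cluster = {}
    for idx, lab in enumerate(labels):
        by_cluster.setdefault(lab, []).append(idx)
    chosen = []
    for lab, members in sorted(by_cluster.items()):
        k = min(quota.get(lab, 0), len(members))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)
```

  • Raising the quota of the cluster that contains vehicle heat sources is one simple way to increase the share of that background type in the final negative set.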
  • In step 420, the generated enhanced positive samples and enhanced negative samples are pre-processed.
  • the positive and negative samples are preprocessed by adjusting the brightness and boundary information. Preprocessing the generated positive and negative samples can improve the quality of the sample data and ultimately improve the performance of the classifier.
  • The preprocessing operations applied to the enhanced positive and negative samples in the present invention include pixel Y channel extraction, boundary scaling adjustment, and gamma correction.
  • The pixel Y channel extraction method converts the enhanced positive and negative samples into a single-channel image format with low computational overhead; the boundary scaling strategy adjusts the boundary coordinates of the positive and negative samples, reducing the information difference between the training samples and the RoIs actually extracted; further, the gamma correction method is applied to the enhanced positive and negative samples to improve the dynamic range of the sample Y channel information and stretch its contrast.
  • (1) Taking the YUV 4:2:2 format as an example, each point (x, y) contains two channels, either "Y, U" or "Y, V". Compared with the U and V channels, which represent chrominance, the Y channel representing luminance carries the complete thermal imaging information; the pixel Y channel extraction method is therefore used to convert the enhanced positive and negative samples into a single-channel image format with low computational overhead.
  • (2) In RoIs obtained from the foreground area, the pedestrian contour usually fits the RoI boundary tightly or leaves only a very small gap, whereas the pedestrian Ground-Truth bounding boxes of most data sets include a margin of background information near the boundary. This increases the information difference between the training samples and the RoIs extracted in actual detection; boundary scaling adjustment of the enhanced positive and negative samples is therefore needed to reduce this difference.
  • (3) Processing the enhanced positive and negative samples with the gamma correction method improves the dynamic range of the sample Y channel information and stretches its contrast.
  • FIG. 5(a) illustrates the Y channel preprocessing using a YUV 4:2:2 image as an example: above the arrow is the YUV 4:2:2 image before processing (each pixel contains one Y channel and one U or V channel), and below the arrow is the processed Y channel image (each pixel contains only Y channel information).
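  • A minimal sketch of the Y channel extraction is given below. It assumes a packed YUV 4:2:2 buffer in YUYV byte order (Y0 U Y1 V ...), which is one common layout; the actual byte order of the camera output is not stated in this description, and the function name is hypothetical.

```python
def extract_y(yuyv, width, height):
    """Return the luminance plane of a packed YUYV frame as a list of rows.
    Assumes 2 bytes per pixel with Y at even byte offsets."""
    expected = width * height * 2
    if len(yuyv) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(yuyv)}")
    y = yuyv[0::2]  # every second byte is a Y sample in YUYV order
    return [list(y[r * width:(r + 1) * width]) for r in range(height)]
```

  • For a UYVY buffer the only change would be to start the slice at offset 1 instead of 0.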
  • Boundary scaling is applied sample by sample to the enhanced positive and negative samples Pos and Neg. The specific method is: taking the center of gravity of the current sample image block as the reference, each of the four boundaries of the image block is moved m pixels toward the center of gravity; experiments give an empirical value of m in the range of 3 to 5;
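  • The boundary adjustment and the gamma correction can be sketched together as follows. The sketch treats the center of gravity of a rectangular block as its geometric center, so moving each boundary m pixels toward it amounts to shrinking the box by m on every side. The `gamma` value is a hypothetical parameter, since no value is given in this description.

```python
def shrink_box(box, m):
    """Move each boundary of (x1, y1, x2, y2) m pixels toward the box center."""
    x1, y1, x2, y2 = box
    if x2 - x1 <= 2 * m or y2 - y1 <= 2 * m:
        raise ValueError("box too small to shrink by m pixels per side")
    return (x1 + m, y1 + m, x2 - m, y2 - m)

def gamma_correct(y_plane, gamma):
    """Apply out = 255 * (in / 255) ** gamma to an 8-bit luminance plane."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[lut[v] for v in row] for row in y_plane]
```

  • A gamma below 1 brightens mid-tones and a gamma above 1 darkens them; black and white are left unchanged in both cases.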
  • In step 430, the pre-processed enhanced positive and negative training set is divided and the classifiers are trained.
  • The sample scale division criteria for the long, medium and close distances are obtained by clustering the pre-processed enhanced positive samples. According to these criteria, the pre-processed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suitable for classifying long-, medium- and close-range pedestrian targets respectively.
  • The present invention defines the pixel height threshold interval of a pedestrian target as [30, 140], corresponding to the farthest and nearest pedestrian targets in the real scene.
  • The pedestrian information at these two extreme distances differs greatly, which leads to high intra-class variation in the obtained positive samples; if only one classifier were trained, detection performance would degrade.
  • Therefore, the sample scale division criteria for the long, medium and close distances are obtained by clustering the pre-processed enhanced positive samples, and the enhanced positive and negative samples are subdivided into three independent training sets, on which three classifiers (classifier_f, classifier_m, classifier_n) suitable for classifying long-, medium- and close-range pedestrian targets are trained respectively.
  • The obtained classifiers are used to detect the negative samples that were not used for training; the false alarms are selected as hard negative samples and added to the corresponding training set, and the classifier is retrained. This process is repeated until the preset number of training rounds is reached.
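  • The retraining loop with hard negative samples can be sketched in a classifier-agnostic way. The `train` and `predict` callables stand in for whatever classifier is used (for example HOG features with a linear SVM); they, and the early exit when no false alarms remain, are assumptions of this sketch.

```python
def mine_hard_negatives(train, predict, positives, negatives, pool, rounds):
    """train(positives, negatives) -> model
    predict(model, sample) -> True if the sample is classified as pedestrian
    pool: negative samples not yet used for training."""
    negatives = list(negatives)
    pool = list(pool)
    model = train(positives, negatives)
    for _ in range(rounds):
        hard, rest = [], []
        for sample in pool:
            # A pooled negative classified as pedestrian is a false alarm.
            (hard if predict(model, sample) else rest).append(sample)
        if not hard:
            break                       # no false alarms left to learn from
        negatives.extend(hard)          # add hard negatives to the training set
        pool = rest
        model = train(positives, negatives)
    return model, negatives
```

  • In the full method this loop would be run once per distance range, each with its own training set and classifier.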
  • FIG. 5(c) shows some of the negative samples containing vehicle interference heat sources.
  • The specific steps of dividing the pre-processed enhanced positive and negative training set and training the classifiers include:
  • Here, height_pixel is the pixel height of the pedestrian target in the image; height_target is the physical height of the pedestrian target, set to about 1.7 meters in the experiments; f is the image focal length, whose value in the SCUT Dataset is 1554; and distance is the detection distance.
  • Let the pixel height of the current sample image block Sample be Sample_h. If Sample_h lies between Range_l and Range_s, the Sample is assigned to the long-distance training set; if it lies between Range_s and Range_m, to the medium-distance training set; and if it lies between Range_m and Range_r, to the close-range training set;
  • The obtained classifiers are used to detect the source negative samples that were not used for training; the false alarms are selected as hard negative samples and added to the corresponding training set, and the classifier is retrained, until the preset number of training rounds is reached.
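  • The division by pixel height can be sketched as below. The sketch assumes the usual pinhole relation height_pixel = height_target × f / distance between the quantities defined above, and it assumes half-open intervals so that every height falls into exactly one set; the boundary values passed for Range_s and Range_m in any usage are hypothetical, since the clustered values are not stated here.

```python
def distance_for_height(height_pixel, target_height_m=1.7, focal_px=1554):
    """Invert the pinhole relation to get the detection distance in meters."""
    return target_height_m * focal_px / height_pixel

def assign_training_set(sample_h, range_l, range_s, range_m, range_r):
    """Return 'far', 'mid' or 'near' for a sample of pixel height sample_h,
    or None when the height lies outside [range_l, range_r]."""
    if range_l <= sample_h < range_s:
        return "far"     # small pixel height: long-distance target
    if range_s <= sample_h < range_m:
        return "mid"
    if range_m <= sample_h <= range_r:
        return "near"    # large pixel height: close-range target
    return None
```

  • With the [30, 140] interval, the inverse relation places the farthest targets at roughly 88 m and the nearest at roughly 19 m.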
  • FIG. 6 is a block diagram showing a classifier training apparatus according to an embodiment of the present invention.
  • the classifier training device 600 includes an enhanced positive and negative sample generation module 610, an enhanced positive and negative sample preprocessing module 620, and a training set partitioning and classifier training module 630.
  • The enhanced positive and negative sample generation module 610 generates enhanced positive samples and enhanced negative samples. Specifically, positive-sample annotation information is combined with an equalization technique to generate the enhanced positive samples, and a clustering method is used to analyze the information distribution of non-pedestrian background image blocks and to assist in screening enhanced negative samples of different categories.
  • the enhanced positive samples include the original positive samples and the extended positive samples.
  • Generating the enhanced positive samples includes: taking the thermal imaging pedestrian detection data set SCUT Dataset as the source, and extracting the corresponding image blocks according to the annotated pedestrian Ground-Truth bounding boxes and preset criteria to obtain the original positive samples.
  • The plateau histogram equalization method is then applied to the luminance information of the original positive samples one by one to obtain the extended positive samples. That is, the equalization method enhances the contrast of the luminance information of the original positive samples and generates extended positive samples with similar thermal imaging characteristics, so that a sufficient number of enhanced positive samples is obtained.
  • Generating the enhanced negative samples includes: extracting image blocks from the complete images of the pedestrian detection data set SCUT Dataset using the RoIs extraction method of the detection process; removing blocks whose IoU with a pedestrian Ground-Truth bounding box exceeds 30% and blocks judged to be abnormal in size (for example, by the RoIs filtering method described above), and recording the retained image blocks as source negative samples; clustering the source negative samples with the K-means method, and selecting image blocks uniformly at random from the clustering result according to a calculated ratio to form the enhanced negative samples. Further, negative samples containing vehicle interference heat sources are added according to the clustering result, increasing the proportion of this kind of background information in the enhanced negative samples.
  • The enhanced positive and negative sample preprocessing module 620 pre-processes the enhanced positive samples and enhanced negative samples generated by the enhanced positive and negative sample generation module 610.
  • the positive and negative samples are preprocessed by adjusting the brightness and boundary information. Preprocessing the generated positive and negative samples can improve the quality of the sample data and ultimately improve the performance of the classifier.
  • The preprocessing operations applied to the enhanced positive and negative samples in the present invention include pixel Y channel extraction, boundary scaling adjustment, and gamma correction.
  • The pixel Y channel extraction method converts the enhanced positive and negative samples into a single-channel image format with low computational overhead; the boundary scaling strategy adjusts the boundary coordinates of the positive and negative samples, reducing the information difference between the training samples and the RoIs actually extracted; further, the gamma correction method is applied to the enhanced positive and negative samples to improve the dynamic range of the sample Y channel information and stretch its contrast.
  • (1) Taking the YUV 4:2:2 format as an example, each point (x, y) contains two channels, either "Y, U" or "Y, V". Compared with the U and V channels, which represent chrominance, the Y channel representing luminance carries the complete thermal imaging information; the pixel Y channel extraction method is therefore used to convert the enhanced positive and negative samples into a single-channel image format with low computational overhead.
  • (2) In RoIs obtained from the foreground area, the pedestrian contour usually fits the RoI boundary tightly or leaves only a very small gap, whereas the pedestrian Ground-Truth bounding boxes of most data sets include a margin of background information near the boundary. This increases the information difference between the training samples and the RoIs extracted in actual detection; boundary scaling adjustment of the enhanced positive and negative samples is therefore needed to reduce this difference.
  • (3) Processing the enhanced positive and negative samples with the gamma correction method improves the dynamic range of the sample Y channel information and stretches its contrast.
  • The training set partitioning and classifier training module 630 divides the enhanced positive and negative training set pre-processed by the enhanced positive and negative sample preprocessing module 620 and iteratively trains the classifiers. The sample scale division criteria for the long, medium and close distances are obtained by clustering the positive samples; according to these criteria, the pre-processed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suitable for classifying long-, medium- and close-range pedestrian targets respectively.
  • The present invention defines the pixel height threshold interval of a pedestrian target as [30, 140], corresponding to the farthest and nearest pedestrian targets in the real scene.
  • The pedestrian information at these two extreme distances differs greatly, which leads to high intra-class variation in the obtained positive samples; if only one classifier were trained, detection performance would degrade.
  • Therefore, the sample scale division criteria for the long, medium and close distances are obtained by clustering the pre-processed enhanced positive samples, and the enhanced positive and negative samples are subdivided into three independent training sets, on which three classifiers (classifier_f, classifier_m, classifier_n) suitable for classifying long-, medium- and close-range pedestrian targets are trained respectively.
  • The obtained classifiers are used to detect the negative samples that were not used for training; the false alarms are selected as hard negative samples and added to the corresponding training set, and the classifier is retrained. This process is repeated until the preset number of training rounds is reached.
  • The original positive samples obtained by applying the enhanced positive sample generation method to the SCUT Dataset comprise 26,000 positive samples in the long-distance interval, 18,800 in the medium-distance interval, and 9,700 in the close-range interval.
  • After the extended positive samples are generated, the resulting enhanced positive samples can satisfy the classifier's requirement on the number of positive samples.
  • FIG. 7 is a flow chart showing a pedestrian detection method according to an embodiment of the present invention.
  • the RoIs to be detected are extracted.
  • The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; calculating, RoI by RoI, the vertical distances between the upper and lower boundaries and the road surface reference of the current image, and calculating thresholds based on the RoI pixel height, so as to filter out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, and comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, so as to filter out RoIs missing a pedestrian head. A more detailed description has been given above and is not repeated here.
  • the classifier is trained offline.
  • The classifier training method includes: combining positive-sample annotation information with an equalization technique to generate enhanced positive samples, and using a clustering method to analyze the information distribution of non-pedestrian background image blocks and assist in screening enhanced negative samples of different categories; pre-processing the enhanced positive and negative samples by adjusting brightness and boundary information; and obtaining the sample scale division criteria for the long, medium and close distances by clustering the pre-processed enhanced positive samples, dividing the pre-processed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suitable for classifying long-, medium- and close-range pedestrian targets respectively. A more detailed description has been given above and is not repeated here.
  • the filtered RoIs are classified and detected using the classifier that has completed the training.
  • FIG. 8 is a block diagram showing a pedestrian detecting apparatus according to an embodiment of the present invention.
  • the pedestrian detection device 800 includes a RoIs extraction module 810, a RoIs filtering module 820, a classifier training module 830, and a classification detection module 840.
  • the RoIs extraction module 810 extracts the RoIs to be detected.
  • the RoIs filtering module 820 filters the RoIs.
  • The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; calculating, RoI by RoI, the vertical distances between the upper and lower boundaries and the road surface reference of the current image, and calculating thresholds based on the RoI pixel height, so as to filter out RoIs with abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, and comparing the degree of difference between the Haar-like features of the head region and those of the adjacent background region, so as to filter out RoIs missing a pedestrian head. A more detailed description has been given above and is not repeated here.
  • the classifier offline training module 830 performs offline training on the classifier.
  • The classifier training method includes: combining positive-sample annotation information with an equalization technique to generate enhanced positive samples, and using a clustering method to analyze the information distribution of non-pedestrian background image blocks and assist in screening enhanced negative samples of different categories; pre-processing the enhanced positive and negative samples by adjusting brightness and boundary information; and obtaining the sample scale division criteria for the long, medium and close distances by clustering the pre-processed enhanced positive samples, dividing the pre-processed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suitable for classifying long-, medium- and close-range pedestrian targets respectively. A more detailed description has been given above and is not repeated here.
  • The classification detection module 840 performs classification detection on the filtered RoIs using the trained classifiers.
  • The classifier training method and the RoIs filtering method proposed by the present invention can cooperate in a "front-to-back" manner. In the vehicle-mounted thermal imaging pedestrian detection process, the RoIs filtering method is first applied online to the RoIs produced by the extraction stage to identify and remove non-pedestrian RoIs; the three classifiers for long, medium and close distances, trained offline with the classifier training method, are then used for fine detection, with each retained RoI assigned to the corresponding classifier.
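  • The cooperation described above can be sketched as a small pipeline: a cascade of cheap filters followed by routing each surviving RoI to the classifier for its distance range. The filter and classifier callables, the `(x, y, w, h)` RoI layout and the two height boundaries are placeholders introduced for illustration.

```python
def detect(rois, filters, classifiers, boundaries):
    """rois: iterable of (x, y, w, h).
    filters: callables roi -> bool, ordered from cheapest to most expensive.
    classifiers: dict with keys 'far', 'mid', 'near' mapping to roi -> bool.
    boundaries: (range_s, range_m) pixel-height split points (hypothetical)."""
    range_s, range_m = boundaries
    detections = []
    for roi in rois:
        # all() stops at the first failing filter, giving the cascade behavior.
        if not all(f(roi) for f in filters):
            continue
        h = roi[3]
        key = "far" if h < range_s else "mid" if h < range_m else "near"
        if classifiers[key](roi):
            detections.append(roi)
    return detections
```

  • Ordering the filters from cheapest to most expensive means that most non-pedestrian RoIs are rejected before any costly feature is computed.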
  • The present invention proposes a RoIs filtering method. By designing a three-layer cascade filter that conforms to pedestrian characteristics and has low computational overhead, RoIs with abnormal size, abnormal position or a missing pedestrian head can be filtered out first, suppressing a large number of non-pedestrian RoIs. The remaining RoIs to be detected can then meet real-time requirements when passed to the higher-precision classifier, and the system false alarm rate is reduced at the same time.
  • The present invention proposes a classifier training method that focuses on improving the number, distribution and quality of the training samples. By using the equalization method to enhance image contrast, extended samples with similar thermal imaging characteristics can be generated from the original positive samples, giving a sufficient number of enhanced positive samples. By using the clustering method to analyze the types of background information in the source negative samples, the resulting enhanced negative samples cover more representative background information in appropriate proportions. Pre-processing the positive and negative samples improves sample quality, and using the clustering method to obtain the division criteria for the enhanced positive and negative training set reduces the intra-class variation of the samples.
  • The classifier training method can improve the scene adaptability of the classifier; because it works at the sample level, it adds little system computational overhead and can better meet practical application requirements.
  • The complete thermal imaging pedestrian detection apparatus used for testing includes: the RoIs extraction method of prior art 1, the RoIs filtering method proposed by the present invention, the classifier training method proposed by the present invention, a classifier of the "HOG feature and linear SVM" type, and a Kalman tracking method.
  • the hardware platform used for testing refers to the vehicle with the pedestrian detection system installed, which uses the NV628 infrared thermal imager produced by Guangzhou Biotech Co., Ltd. and the DM6437 embedded platform produced by Texas Instruments.
  • the test plan specifically selects several sections of the road environment in Guangzhou, and uses the vehicles to perform static and dynamic tests of actual effects.
  • the test environment is cloudy at night, the ambient temperature is about 27 ° C, and the relative humidity is about 90%.
  • The evaluation indices are set as follows: the saved detection videos are processed by manual counting to obtain the number of effective pedestrians, the number of accurately detected pedestrians, the number of false alarm individuals, and the detection rate.
  • Effective pedestrians are pedestrian targets that appear for at least 1 second in the detection video, whose frame rate is 25 frames per second. Pedestrian targets include front, back and side walking postures, as well as the postures of people riding bicycles, electric vehicles and motorcycles in the longitudinal direction. The number of false alarm individuals is the number of erroneous detections that occur within a given test section; a false alarm individual or region that persists in the picture is counted once. The detection rate is the ratio of the number of accurately detected pedestrians to the number of effective pedestrians.
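  • The two definitions above translate directly into the following small helpers; the function names are hypothetical and the counts themselves would come from the manual statistics.

```python
def is_effective(frames_visible, fps=25, min_seconds=1.0):
    """A pedestrian counts as effective if visible for at least min_seconds."""
    return frames_visible >= fps * min_seconds

def detection_rate(num_detected, num_effective):
    """Ratio of accurately detected pedestrians to effective pedestrians."""
    if num_effective <= 0:
        raise ValueError("no effective pedestrians in this test section")
    return num_detected / num_effective
```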
  • The thermal imaging pedestrian detection system using the method of the present invention performs well when the test vehicle is stationary: in the static tests on the test sections, the detection rate of effective pedestrians is 100% and the number of false alarm individuals is zero.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Traffic Control Systems (AREA)

Abstract

Disclosed are a method and apparatus for filtering regions of interest (RoIs) for vehicle-mounted thermal imaging pedestrian detection. The method for filtering RoIs refers to a method for filtering out non-pedestrian RoIs by designing three layers of cascaded filters: in the first layer, by means of calculating pedestrian pixel heights and aspect ratios of RoIs and setting corresponding threshold intervals, filtering out RoIs of abnormal sizes; in the second layer, respectively calculating a vertical distance between upper and lower boundaries of each of the RoIs one by one and a current image pavement reference, and calculating threshold values based on pixel heights of the RoIs, so as to filter out RoIs with abnormal locations; and in the third layer, searching for a possible pedestrian head region according to a luminance vertical projection difference curve of each of the RoIs, and comparing the degree of difference between Haar-like features of the head region and those of an adjacent background region, so as to filter out RoIs without pedestrian heads. By means of the method, on the premise that pedestrian detection accuracy is considered, the calculation overhead of pedestrian detection can be reduced, and the scene adaptability of a classifier can be improved. The apparatus for filtering RoIs comprises a filter for RoIs of abnormal sizes, a filter for RoIs with abnormal locations and a filter for RoIs without heads.

Description

Region of interest filtering method and apparatus for vehicle-mounted thermal imaging pedestrian detection
Technical field
The present invention relates to pedestrian detection and, more particularly, to a Regions of Interest (RoIs) filtering method and apparatus for vehicle-mounted thermal imaging pedestrian detection.
Background art
Vehicle-mounted thermal imaging pedestrian detection refers to using an infrared thermal imager as the visual sensor to capture images or video of the traffic scene from the vehicle, applying methods such as machine learning on a computer or embedded platform to identify all pedestrian targets in the images or video, and marking the position of each pedestrian in the image with the coordinates of its minimum bounding rectangle.
This process consists of two key stages, RoIs extraction and RoIs classification detection; the main factors affecting computational overhead and accuracy are the number of extracted RoIs and the performance of the classifier used. In the RoIs extraction stage, a relatively large number of RoIs is usually obtained in order to meet the requirement of a high recall rate. However, pedestrian targets are rare in an image: most RoIs contain only background information, and some of these background regions differ greatly from pedestrians in their features. Running the classifier on all of these RoIs leads to an unacceptable computational overhead, so a method is needed that reduces the number of RoIs to be detected while maintaining accuracy.
Compared with computers, vehicle-mounted embedded platforms have a clear computing performance bottleneck. Many published pedestrian detection methods, especially those using deep learning algorithms, cannot be applied on such platforms, which limits the detection rate and real-time performance achievable in practice. For example, the DM6437 vehicle platform produced by Texas Instruments is highly stable, but its processor is single-core with a maximum clock frequency of only 600 MHz; a classifier based on "HOG feature + linear SVM" takes about 3 milliseconds to process a single RoI, far slower than an ordinary computer. Bringing pedestrian detection into practical use therefore requires a solution that balances computational overhead against detection performance.
In the RoIs extraction stage, some published methods screen for foreground areas where pedestrians may exist according to the characteristic regularities of targets in the image. For example:
Prior Art 1: Ge J, Luo Y, Tei G. Real-Time Pedestrian Detection and Tracking at Nighttime for Driver-Assistance Systems[J]. IEEE Transactions on Intelligent Transportation Systems, 2009, 10(2): 283-298. Based on the empirical observation that pedestrian pixels are brighter than the surrounding background on the same horizontal line, this method extracts RoIs from near-infrared images by computing upper and lower segmentation thresholds in the local neighborhood of each pixel.
Prior Art 2: Uijlings J R R, Sande K E A V D, Gevers T, et al. Selective Search for Object Recognition[J]. International Journal of Computer Vision, 2013, 104(2): 154-171. This work proposes the selective search method. Its main idea is to first segment a visible-light image into small similar regions in several color spaces, and then use a region-merging algorithm to merge small regions of high similarity into larger ones according to color, texture, size, and other cues.
Prior Art 3: Zitnick C L, Dollár P. Edge Boxes: Locating Object Proposals from Edges[C]//European Conference on Computer Vision. Springer, Cham, 2014: 391-405. This work proposes the EdgeBox method, which finds RoIs containing whole objects from the relationship between closed contours and crossing contours within a local region.
Compared with the sliding-window approach, the methods of Prior Art 1-3 reduce the number of RoIs by a clear margin, yet they still threaten real-time performance: the method of Prior Art 2 yields about 2000 RoIs per image on average, and the method of Prior Art 3 takes about 0.2 s per image on a computer. Nevertheless, Prior Art 1-3 offer an idea worth borrowing: use a comparatively cheap computation to filter out some non-pedestrian RoIs in advance, thereby reducing the number of RoIs to be detected.
In the RoI classification stage, positive and negative samples of adequate quantity and quality are an effective way to improve classifier performance. Publicly available benchmark datasets for thermal-imaging pedestrian detection are scarce, so the present invention uses the SCUT Dataset released by the laboratory (http://www2.scut.edu.cn/cv/scut_fir_pedestrian_dataset/). The dataset covers road traffic scenes in Guangzhou and contains 100 infrared thermal-imaging videos with about 200,000 frames in total and about 400,000 annotated pedestrian ground-truth boxes, with several pedestrian target types such as "single walking pedestrian" and "single cycling pedestrian". Compared with other public thermal-imaging pedestrian detection datasets such as the KAIST Dataset, it has advantages in the number of frames, the types and number of ground-truth annotations, and the variety of road scenes.
In summary, although in-vehicle thermal-imaging pedestrian detection has made progress, the trade-off between real-time performance and accuracy imposed by the computing bottleneck and the classifier means that many methods underperform or cannot be used at all. To meet the requirements of practical applications, further improvements in detection time and detection accuracy are urgently needed.
Summary of the Invention
The object of the present invention is to provide an RoI filtering method and apparatus for in-vehicle thermal-imaging pedestrian detection, with the aim of mitigating the loss of accuracy and the failure to meet real-time requirements caused by the computing bottleneck. The present invention is achieved by the following technical solutions.
To achieve the above object, the present invention provides a method for filtering Regions of Interest (RoIs) for in-vehicle thermal-imaging pedestrian detection. The method comprises: filtering out RoIs of abnormal size by computing the pedestrian pixel height and the RoI height-to-width ratio and setting corresponding threshold intervals; for each RoI in turn, computing the vertical distances between its upper and lower boundaries and the road reference of the current image, computing a threshold based on the RoI pixel height, and filtering out RoIs of abnormal position; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head.
According to another aspect of the present invention, filtering out RoIs of abnormal size comprises: computing the threshold interval of the pixel height of pedestrian RoIs from the image focal length f, the pedestrian height height_target, and the detection distance distance:
height_pixel ≈ height_target × f / distance           Formula (1)
where height_pixel is the pixel height of a pedestrian RoI (evaluated at the limits of the detection distance to obtain the threshold interval), height_target is the height of the pedestrian target, f is the image focal length, and distance is the detection distance;
obtaining the Gaussian distribution of the height-to-width ratio of pedestrian RoIs by statistical analysis and choosing a suitable confidence level to obtain the height-to-width ratio threshold interval; and evaluating each RoI to be detected, treating RoIs that do not satisfy the two interval conditions as RoIs of abnormal size, and removing them.
According to another aspect of the present invention, filtering out RoIs of abnormal position comprises: obtaining the road reference of the current image using the flat-road assumption; for each RoI to be judged in turn, computing the distances in the y-axis direction between its upper and lower boundaries and the road reference, where the y-axis direction is the vertical direction of the RoI, and computing a threshold based on the pixel height RoI_h of the current RoI according to formula (2):
[Formula (2): equation image PCTCN2018083480-appb-000001, not reproduced in this text]
where α and β are scaling factors and ε is an offset noise factor;
and filtering out the RoIs to be detected whose distances do not satisfy the threshold.
According to another aspect of the present invention, filtering out RoIs that lack a pedestrian head comprises: using a pedestrian head adaptive localization algorithm to divide the upper region of the current RoI into three parts along the horizontal direction, the middle part being called the head region and the left and right parts being called the background regions; and using a method based on Haar-like features to evaluate the degree of difference between the mean luminance of the head region and of the background regions, and removing RoIs that lack a head according to a preset threshold.
According to another aspect of the present invention, the pedestrian head adaptive localization algorithm processes the upper region of the current RoI by luminance vertical projection to obtain the corresponding projection sequence; computes the differences between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI; and, according to a vertical boundary matching strategy, searches the extreme points of the curve for qualifying combinations of left and right head boundaries, whose x-axis coordinates define the position of the head region, where the x-axis is the horizontal direction of the RoI.
In addition, the present invention provides an apparatus for filtering Regions of Interest (RoIs) for in-vehicle thermal-imaging pedestrian detection. The apparatus comprises: an abnormal-size RoI filter, which filters out RoIs of abnormal size by computing the pedestrian pixel height and the RoI height-to-width ratio and setting corresponding threshold intervals; an abnormal-position RoI filter, which, for each RoI in turn, computes the vertical distances between its upper and lower boundaries and the road reference of the current image, computes a threshold based on the RoI pixel height, and filters out RoIs of abnormal position; and a missing-head RoI filter, which searches for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, compares the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filters out RoIs that lack a pedestrian head.
Furthermore, the present invention provides an in-vehicle thermal-imaging pedestrian detection method. The method comprises: extracting the RoIs to be detected; filtering the RoIs, where the RoI filtering comprises the steps of filtering out RoIs of abnormal size by computing the pedestrian pixel height and the RoI height-to-width ratio and setting corresponding threshold intervals, then, for each RoI in turn, computing the vertical distances between its upper and lower boundaries and the road reference of the current image, computing a threshold based on the RoI pixel height, and filtering out RoIs of abnormal position, and then searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head; training a classifier offline; and classifying the filtered RoIs with the trained classifier.
The RoI filtering method for in-vehicle thermal-imaging pedestrian detection provided by the present invention addresses the adverse effects of the computing bottleneck and, compared with existing RoI filtering techniques for in-vehicle thermal-imaging pedestrian detection, has the following advantages and effects:
The proposed RoI filtering method constructs a three-layer cascade filter that follows the regular characteristics of pedestrians and has low computational cost. It filters out, ahead of classification, RoIs of abnormal size, RoIs of abnormal position, and RoIs that lack a pedestrian head, so that a large number of non-pedestrian RoIs are suppressed. This allows the remaining RoIs to meet real-time requirements when they pass through the more accurate classifier stage, and it also lowers the false-alarm rate of the system.
Brief Description of the Drawings
The above and other aspects, features, and advantages of specific embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of an RoI filtering method according to an embodiment of the present invention.
Fig. 2(a) shows manually measured results for the pedestrian pixel height threshold interval, Fig. 2(b) shows the statistics of the ground-truth height-to-width ratio interval, and Fig. 2(c) shows some example results of the head adaptive localization algorithm.
Fig. 3 is a block diagram of an RoI filtering apparatus according to an embodiment of the present invention.
Fig. 4 is a flowchart of a classifier training method according to an embodiment of the present invention.
Fig. 5(a) shows an example of Y-channel preprocessing of a YUV 4:2:2 image, Fig. 5(b) compares original positive samples with extended positive samples, and Fig. 5(c) shows some hard negative samples of interfering heat sources from cars.
Fig. 6 is a block diagram of a classifier training apparatus according to an embodiment of the present invention.
Fig. 7 is a flowchart of a pedestrian detection method according to an embodiment of the present invention.
Fig. 8 is a block diagram of a pedestrian detection apparatus according to an embodiment of the present invention.
Detailed Description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of the various embodiments of the present disclosure as defined by the claims and their equivalents. It includes various specific details to assist in that understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. In addition, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
The terms and words used in the following description and claims are not limited to their literal meanings, but are used by the inventors merely to enable a clear and consistent understanding of the present disclosure. It should therefore be apparent to those skilled in the art that the following description of the various embodiments of the present disclosure is provided for illustration only and not for the purpose of limiting the present disclosure as defined by the appended claims and their equivalents.
Fig. 1 is a flowchart of an RoI filtering method according to an embodiment of the present invention.
In pedestrian detection, the extraction stage yields RoI bounding boxes for the regions where targets may be present. For each RoI it records the x-coordinate of the top-left corner RoI_x, the y-coordinate of the top-left corner RoI_y, the width RoI_w, and the height RoI_h. To meet a high-recall requirement, a large number of RoIs is usually produced; if they were passed directly to the subsequent classifier stage, a hardware platform with a computing bottleneck (such as an in-vehicle embedded platform) could hardly meet real-time requirements. Manual inspection shows that pedestrian targets are rare in the images: most extracted RoIs are non-pedestrian RoIs, and a considerable share of these are obviously non-pedestrian.
A pedestrian RoI is an RoI bounding box whose overlap (IoU, Intersection over Union) with a pedestrian ground-truth bounding box exceeds 50%; a non-pedestrian RoI is an RoI bounding box whose IoU with the pedestrian ground-truth bounding boxes is below 50%. An obviously non-pedestrian RoI is an RoI whose IoU with the pedestrian ground-truth bounding boxes is below 30%, whose content a human observer can easily recognize as background, and which can be separated by a few simple filtering conditions. Here, a pedestrian ground-truth bounding box is an annotated true bounding box whose target type is "single walking pedestrian" or "single cycling pedestrian".
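The IoU-based definitions above can be illustrated with a minimal sketch. It assumes boxes are stored as (x, y, w, h) tuples with a top-left origin, matching the RoI_x, RoI_y, RoI_w, RoI_h notation; the function names `iou` and `label_roi` are hypothetical, and treating an IoU of exactly 50% as non-pedestrian is an assumption, since the text does not cover that boundary case.

```python
def iou(a, b):
    """Intersection over Union of two (x, y, w, h) boxes, top-left origin."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def label_roi(roi, ground_truths):
    """Label an RoI by its best IoU against the pedestrian ground-truth boxes."""
    best = max((iou(roi, gt) for gt in ground_truths), default=0.0)
    # Assumption: exactly 0.5 counts as non-pedestrian.
    return "pedestrian" if best > 0.5 else "non-pedestrian"
```

A label of this kind is only available offline, where ground truth exists; the filters described next work without it.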
The main idea of the RoI filtering method of this embodiment is therefore to construct a three-layer cascade filter that follows the regular characteristics of pedestrians and filters out, ahead of classification, RoIs of abnormal size, RoIs of abnormal position, and RoIs that lack a pedestrian head, thereby reducing the number of RoIs to be detected. The detailed flowchart is shown in Fig. 1.
In step 110, RoIs of abnormal size are filtered out. Specifically, the pedestrian pixel height and the RoI height-to-width ratio are computed, corresponding threshold intervals are set, and RoIs of abnormal size are removed. In more detail, this comprises:
Step 111: compute the threshold interval of the pixel height of pedestrian RoIs from the image focal length and the pedestrian detection distance.
Specifically, based on experience, the pedestrian detection range is roughly the area 20 to 85 meters in front of the car. Using formula (1) with the image focal length f, the pedestrian height height_target, and the detection distance distance, the pixel height threshold interval of pedestrian targets within this range is obtained as [30, 140].
height_pixel ≈ height_target × f / distance           Formula (1)
where height_pixel is the pixel height of the pedestrian target in the image; height_target is the height of the pedestrian target, set to about 1.7 meters in the experiments; f is the image focal length, which is 1554 for the SCUT Dataset; and distance is the detection distance.
Fig. 2(a) shows the manually measured results for the pedestrian pixel height threshold interval. Specifically, Fig. 2(a) consists of images taken by the infrared thermal camera mounted on the car with the pedestrian target 20 meters and 85 meters away on a flat road, the car being stationary. Measuring the pedestrian bounding boxes in the two images by hand (the dashed lines are the drawn bounding boxes) gives a pedestrian pixel height of 138 pixels and width of 42 pixels at 20 meters, and a pixel height of 30 pixels and width of 12 pixels at 85 meters. These measurements differ only slightly from the values calculated with formula (1), which shows that the calculation of formula (1) is valid.
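As a worked check of formula (1), the sketch below evaluates it at the two ends of the stated detection range with the values given in the text (height 1.7 m, f = 1554, distances 20 m and 85 m). The function name `pixel_height` is hypothetical, and treating f as a focal length expressed in pixels is an assumption.

```python
def pixel_height(height_target_m, focal_px, distance_m):
    """Formula (1): approximate pixel height of a target at a given distance."""
    return height_target_m * focal_px / distance_m


# Values taken from the text: 1.7 m pedestrian, f = 1554, range 20 m to 85 m.
near = pixel_height(1.7, 1554, 20)   # about 132.09 pixels
far = pixel_height(1.7, 1554, 85)    # about 31.08 pixels
print(round(near, 2), round(far, 2))
```

The computed values (about 132 and 31 pixels) sit close to the manual measurements of 138 and 30 pixels and inside the [30, 140] interval adopted in the text.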
Step 112: obtain the Gaussian distribution of the height-to-width ratio of pedestrian RoIs by statistical analysis, and choose a suitable confidence level to obtain the height-to-width ratio threshold interval.
In currently published RoI extraction methods, such as Prior Art 1 and 2, the RoIs obtained from foreground regions vary widely in height-to-width ratio. Many obviously non-pedestrian RoIs have height-to-width ratios that differ considerably from those of an actual human body. Based on this property, statistical analysis is used to obtain the Gaussian distribution of the height-to-width ratio of pedestrian RoIs, and a suitable confidence level is chosen, giving a height-to-width ratio threshold interval of [1.5, 4]. The samples used for the statistics come from the pedestrian ground-truth annotations of the SCUT Dataset with the target types "single walking pedestrian" and "single cycling pedestrian".
Fig. 2(b) shows the statistics of the ground-truth height-to-width ratio for the target types "single walking pedestrian" and "single cycling pedestrian" in 44 videos. Specifically, Fig. 2(b) is the histogram of the height-to-width ratios of these ground-truth samples, that is, the Gaussian distribution of the pedestrian RoI height-to-width ratio obtained by statistical analysis. The horizontal axis is the height-to-width ratio and the vertical axis is the number of samples; the ratios fall roughly in the range 1 to 4. In this technical solution, a suitable confidence level is chosen and the height-to-width ratio threshold interval is set to [1.5, 4].
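One way to turn a fitted Gaussian and a confidence level into an interval is sketched below. This is an illustration only: the text does not state the confidence level, the fitting procedure, or whether the interval is symmetric about the mean, so the central-interval construction here is an assumption and is not claimed to reproduce [1.5, 4]. The function name `aspect_ratio_interval` and the sample values are hypothetical.

```python
from statistics import NormalDist


def aspect_ratio_interval(ratios, confidence):
    """Central interval of a Gaussian fitted to height-to-width ratios.

    Assumption: the interval is the central `confidence` mass of the fit.
    """
    dist = NormalDist.from_samples(ratios)   # needs at least two samples
    tail = (1.0 - confidence) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical ratios standing in for ground-truth height / width values.
samples = [2.0, 2.4, 2.8, 3.0, 2.6, 2.2]
low, high = aspect_ratio_interval(samples, 0.95)
```

In practice the interval would be fitted to the ground-truth annotations described above rather than to a handful of values.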
Step 113: evaluate each RoI to be detected; RoIs that do not satisfy the two interval conditions are RoIs of abnormal size and are removed.
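Steps 111 to 113 can be combined into a short filter, sketched here with the two intervals stated in the text. It assumes (x, y, w, h) boxes and reads "the two interval conditions" as requiring both to hold for an RoI to be kept, which is an interpretation; the names `is_size_normal` and `filter_abnormal_size` are hypothetical.

```python
HEIGHT_RANGE = (30, 140)    # pixel-height interval from step 111
ASPECT_RANGE = (1.5, 4.0)   # height-to-width ratio interval from step 112


def is_size_normal(roi):
    """True if an (x, y, w, h) RoI satisfies both size conditions."""
    _, _, w, h = roi
    if w <= 0 or h <= 0:
        return False  # degenerate box
    height_ok = HEIGHT_RANGE[0] <= h <= HEIGHT_RANGE[1]
    aspect_ok = ASPECT_RANGE[0] <= h / w <= ASPECT_RANGE[1]
    return height_ok and aspect_ok


def filter_abnormal_size(rois):
    """Keep only RoIs of normal size, preserving their order."""
    return [roi for roi in rois if is_size_normal(roi)]
```

Both checks cost a few comparisons and one division per RoI, which is why this stage can run first in the cascade.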
Pedestrian targets in traffic scenes are subject to a strong position constraint: whether walking or cycling, the vast majority are on the road surface, so the centers of pedestrian targets in the image are distributed in a horizontal band. Based on this experience, RoIs at abnormal positions in the image are very likely to be obviously non-pedestrian RoIs.
In step 120, RoIs of abnormal position are filtered out. Specifically, for each RoI in turn, the vertical distances between its upper and lower boundaries and the road reference of the current image are computed, a threshold based on the RoI pixel height is computed, and RoIs of abnormal position are removed. In more detail, this comprises:
Step 121: obtain the road reference of the current image using the flat-road assumption.
Specifically, based on the shooting angle of the thermal camera, the flat-road assumption is used to obtain the y-axis coordinate Horizon_y of the road reference of the current image.
Step 122: for each RoI in turn, compute the distances in the image y-axis direction between its upper and lower boundaries and the road reference, and set a threshold based on the pixel height of the current RoI.
For each RoI to be judged, compute the distances in the image y-axis direction between the upper and lower boundaries of the RoI and the road reference, and compute an adaptive threshold based on the pixel height RoI_h of the current RoI according to formula (2).
[Formula (2): equation image PCTCN2018083480-appb-000002, not reproduced in this text]
where α and β are scaling factors and ε is an offset noise factor; the experimentally chosen values are α = 4, β = 2, and ε = 25.
Step 123: filter out the RoIs to be detected whose distances do not satisfy the threshold.
Repeat the operation of step 122 for each RoI to be detected that satisfies the size requirements, and filter out all RoIs of abnormal position.
In step 130, RoIs that lack a pedestrian head are filtered out. Specifically, a possible pedestrian head region is searched for according to the luminance vertical projection difference curve of each RoI, the degree of difference between the Haar-like features of the head region and of the adjacent background regions is compared, and RoIs that lack a pedestrian head are removed.
The non-pedestrian RoIs produced by the extraction stage generally contain interfering background heat sources from the traffic scene, such as roadside tree trunks and branches or uniform heat sources. Observation shows that the human head is rarely occluded by other objects and is usually exposed, so in thermal images it tends to be brighter than the adjacent background and to have a fairly stable contour. On this basis, and in more detail, filtering out RoIs that lack a pedestrian head comprises:
Step 131: use the pedestrian head adaptive localization algorithm to divide the upper region of the current RoI into three parts along the horizontal direction; the middle part is called the head region, and the left and right parts are called the background regions. The upper region of an RoI is the partial region extending along the y-axis from the upper boundary of the RoI down to 1/3 or 1/5 of its pixel height. The pedestrian head adaptive localization algorithm processes the upper region of the current RoI by luminance vertical projection to obtain the corresponding projection sequence, and computes the differences between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI. Then, according to the proposed vertical boundary matching strategy, it searches the extreme points of the curve for qualifying combinations of left and right head boundaries, whose x-axis coordinates define the position of the head region.
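The projection and difference operations of step 131 can be sketched as follows. Formulas (3) and (4) appear only as image references in this text, so the exact forms used here are assumptions: the projection is taken as the column-wise sum of luminance over the upper region, and the difference as V(x+1) − V(x), which makes a dark-to-bright transition (a left head edge) positive. The 1/3 versus 1/5 rule follows step ① of the detailed algorithm; flooring to an integer and the function names are hypothetical choices.

```python
def upper_region_height(roi_h):
    """Height H of the upper region: RoI_h/3 if RoI_h < 48, else RoI_h/5.

    Assumption: the result is floored to a whole number of pixel rows.
    """
    return max(1, roi_h // 3 if roi_h < 48 else roi_h // 5)


def vertical_projection(region):
    """Column-wise luminance sum of a region given as a list of pixel rows."""
    return [sum(column) for column in zip(*region)]


def projection_difference(projection):
    """Differences between adjacent projection values (one element shorter)."""
    return [projection[i + 1] - projection[i] for i in range(len(projection) - 1)]


# Tiny 2x5 region: a bright two-column "head" on a dark background.
region = [[10, 10, 200, 200, 10],
          [10, 10, 180, 220, 10]]
v = vertical_projection(region)        # [20, 20, 380, 420, 20]
dv = projection_difference(v)          # [0, 360, 40, -400]
```

In the toy example the large positive value marks the left edge of the bright block and the large negative value marks its right edge, which is the pattern the boundary matching strategy looks for.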
In more detail, the pedestrian head adaptive localization algorithm is as follows:
① For the RoI being processed, define the partial region extending along the y-axis from the upper boundary of the RoI to the position (RoI_y + α × RoI_h) as the upper region P_up, and denote the height of this region by H; set α = 1/3 when RoI_h < 48, and α = 1/5 otherwise;
② Check the pixel height RoI_h of the current RoI: if RoI_h < 90, go to step ③; if RoI_h ≥ 90, go to step ⑧;
③ Taking the top-left corner (RoI_x, RoI_y) of this RoI as the coordinate origin, compute the luminance vertical projection sequence of P_up, V_N = {V(x), x = 0, 1, …, RoI_w − 1}, according to formula (3), and compute the luminance vertical projection difference curve V'_N = {V'(x), x = 0, 1, …, RoI_w − 2} according to formula (4), where Y(x, y) is the luminance value at pixel (x, y);
[Formula (3): equation image PCTCN2018083480-appb-000003, not reproduced in this text]
[Formula (4): equation image PCTCN2018083480-appb-000004, not reproduced in this text]
④ Because of image noise and background heat sources, the projection difference curve V'_N may contain spurious extreme points of small magnitude. Compute the threshold T_diff according to formula (5), then filter the spurious extrema out of V'_N according to formula (6) to obtain a new projection difference curve V'_T, where abs() is the absolute-value function and α is a scaling factor, set experimentally to α = 0.5;
[Formula (5): equation image PCTCN2018083480-appb-000005, not reproduced in this text]
[Formula (6): equation image PCTCN2018083480-appb-000006, not reproduced in this text]
⑤ Traverse the extreme points of the projection difference curve V'_T from left to right and record the x-axis positions (X_edge_l, X_edge_r) of the left/right boundary pairs that satisfy the following rules:
Head boundaries correspond only to extreme points of V'_T. By default the head region is brighter than the background regions, so the left head boundary corresponds to a positive extreme point of V'_T and the right head boundary corresponds to a negative extreme point of V'_T;
If a new candidate left boundary is found, its corresponding right boundary is first set to empty;
If a new right boundary is found while its corresponding left boundary is empty, this right boundary is background interference, because a left-to-right traversal must find the left head boundary first;
If a left/right boundary pair (X_edge_l, X_edge_r) is matched, compute the corresponding head width W_head = X_edge_r − X_edge_l and judge whether W_head is plausible against the minimum head-width threshold Min_head and the maximum threshold Max_head (set experimentally to Min_head = RoI_w / 8 and Max_head = RoI_w / 2): if Min_head ≤ W_head ≤ Max_head, the pair is valid, so save it and keep looking for other right boundaries that may match the current X_edge_l; if W_head < Min_head, the current right boundary X_edge_r is invalid; if W_head > Max_head, both the current left and right boundaries are invalid, because matching the left boundary X_edge_l with any later right boundary would be pointless;
⑥如果存在多个符合条件的头部左右边界对组合X_edge N={(X_edge l1,X_edge r1),(X_edge l2,X_edge r2),…,(X_edge ln,X_edge rn)},则遍历这些边界对组合寻找其中的最优项:暂定(X_edge l1,X_edge r1)为最优组合;查看下一边界对组合,如果该组合和已知最优组合的左边界相同,则比较两者的右边界位置,值更大的较优,更新最优组合;如果两者左边界不同,则计算两组数据的两条竖直中心线位置(沿x轴方向的位置数值),随后分别和当前RoIs竖直中心线进行间距比较,与RoIs竖直中心线更近的较优,更新最优组合(因为行人头部更有可能在RoIs上层区域P up的居中位置); 6 If there are multiple eligible left and right boundary pairs, X_edge N = {(X_edge l1 , X_edge r1 ), (X_edge l2 , X_edge r2 ), ..., (X_edge ln , X_edge rn )}, then traverse these boundary pairs Combine to find the optimal term: tentative (X_edge l1 , X_edge r1 ) is the optimal combination; view the next boundary pair combination, if the combination and the known optimal combination have the same left boundary, compare the right boundary of the two Position, the value is larger, update the optimal combination; if the left boundary is different, calculate the two vertical center line positions (position values along the x-axis direction) of the two sets of data, and then respectively and the current RoIs Straight centerline for spacing comparison, closer to the vertical centerline of RoIs, update the optimal combination (because the pedestrian head is more likely to be in the center of the upper layer of the RoIs P up );
⑦如果找到左右边界对的最佳组合(X_edge l,X_edge r),则分别计算该组合与当前RoIs左右边界的间距数值,设定间距阈值T s=0.2×RoI w+0.5,如果其中一个间距结果小于阈值T s,则说明对应的头部区域过于靠近RoIs左右边界,不符合实际人体情况,此边界对无效; 7 If the best combination of left and right boundary pairs (X_edge l , X_edge r ) is found, calculate the distance between the combination and the current RoIs left and right boundaries, and set the spacing threshold T s =0.2×RoI w +0.5 if one of the spacings If the result is less than the threshold value T s , it indicates that the corresponding head region is too close to the left and right boundaries of the RoIs, and does not conform to the actual human body condition, and the boundary pair is invalid;
⑧如果不存在符合条件的左右边界对(X_edge l,X_edge r),则将此RoIs上层区域P up沿水平方向均分为三部分,得到的位置数据即为左右边界对(X_edge l,X_edge r)。 8 If there are no left and right boundary pairs (X_edge l , X_edge r ) that meet the conditions, then the upper layer P Up of the RoIs is divided into three parts in the horizontal direction, and the obtained position data is the left and right boundary pairs (X_edge l , X_edge r ).
对当前RoIs使用上述行人头部自适应定位算法,得到上层区域P up的头部左右边界对(X_edge l,X_edge r),沿水平方向将P up划分为三部分P l、P m、P rFor the current RoIs, the above-mentioned pedestrian head adaptive positioning algorithm is used to obtain the left and right boundary pairs (X_edge l , X_edge r ) of the upper region P up , and the P up is divided into three parts P l , P m , P r in the horizontal direction. .
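Steps ⑤ to ⑧ can be illustrated with the following minimal Python sketch. It is not the implementation of the embodiment: the function names, the representation of extreme points as (x, sign) tuples with x measured from the RoI's left edge, and the choice to fall back to step ⑧ when step ⑦ rejects the best pair are all assumptions made for illustration.

```python
def candidate_pairs(extrema, roi_w):
    """Step 5: pair positive (left) and negative (right) extrema of V'_T.

    `extrema` is a hypothetical list of (x, sign) tuples sorted by x, where
    sign > 0 marks a positive extreme point and sign < 0 a negative one.
    """
    min_head, max_head = roi_w / 8, roi_w / 2
    pairs, left = [], None
    for x, sign in extrema:
        if sign > 0:
            left = x                      # new candidate left boundary, right reset
        elif left is not None:            # a right boundary with no left is ignored
            width = x - left
            if width > max_head:
                left = None               # left and right are both invalid
            elif width >= min_head:
                pairs.append((left, x))   # valid pair; keep searching with same left
            # width < min_head: only this right boundary is discarded
    return pairs


def best_pair(pairs, roi_w):
    """Step 6: pick the best pair among the valid candidates."""
    if not pairs:
        return None
    center = roi_w / 2
    best = pairs[0]
    for cand in pairs[1:]:
        if cand[0] == best[0]:
            if cand[1] > best[1]:         # same left boundary: larger right wins
                best = cand
        elif abs(sum(cand) / 2 - center) < abs(sum(best) / 2 - center):
            best = cand                   # centerline closer to the RoI centerline
    return best


def locate_head(extrema, roi_w):
    """Steps 5 to 8: return (X_edge_l, X_edge_r) for one RoI."""
    best = best_pair(candidate_pairs(extrema, roi_w), roi_w)
    if best is not None:
        t_s = 0.2 * roi_w + 0.5           # step 7: margin to the RoI borders
        if best[0] < t_s or roi_w - best[1] < t_s:
            best = None
    if best is None:                      # step 8 (assumed fallback): equal thirds
        best = (roi_w / 3, 2 * roi_w / 3)
    return best
```

For example, with RoI_w = 40 and extreme points at x = 5 (negative), 12 (positive), 14, 26 and 38 (all negative), only the pair (12, 26) survives the width and margin checks.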
Step 132: Use a method based on Haar-like features to evaluate how much the mean brightness of the head region differs from that of the background region, and compare the result with a preset threshold.
The Haar-like feature value of P_up is calculated according to Formula (7) and compared with the threshold T_haar; if it is greater than the threshold, the head constraint is satisfied.
min(abs(avg_m - avg_l), abs(avg_m - avg_r))        Formula (7)
Here min() is the minimum function, abs() is the absolute-value function, and avg_l, avg_m and avg_r are the mean brightness values of P_l, P_m and P_r respectively. T_haar is set experimentally in the range 13 to 15.
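Formula (7) can be sketched in a few lines of Python. The sketch assumes that P_up is given as a list of pixel rows of brightness values and that the boundary pair is given as integer column indices; the function names and the default threshold of 14 (a value inside the 13 to 15 range stated above) are illustrative choices.

```python
from statistics import fmean


def region_mean(rows, x0, x1):
    """Mean brightness of columns x0..x1-1 over all rows."""
    return fmean(v for row in rows for v in row[x0:x1])


def haar_head_score(p_up, x_edge_l, x_edge_r):
    """Formula (7): min(|avg_m - avg_l|, |avg_m - avg_r|)."""
    width = len(p_up[0])
    avg_l = region_mean(p_up, 0, x_edge_l)
    avg_m = region_mean(p_up, x_edge_l, x_edge_r)
    avg_r = region_mean(p_up, x_edge_r, width)
    return min(abs(avg_m - avg_l), abs(avg_m - avg_r))


def has_head(p_up, x_edge_l, x_edge_r, t_haar=14.0):
    """True if the head constraint holds (score greater than T_haar)."""
    return haar_head_score(p_up, x_edge_l, x_edge_r) > t_haar
```

Taking the minimum of the two differences means the middle part must stand out from both sides, so a bright region that merges with the background on one side does not pass.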
Step 133: Filter out the RoIs that lack a pedestrian head.
Steps 131 and 132 are applied one by one to the candidate RoIs that satisfy the position-feature requirement, and the RoIs lacking a pedestrian head are filtered out.
The above RoIs filtering method was tested on a DM6437 vehicle-mounted embedded platform, which is computationally constrained. When RoIs are extracted with the dual-threshold segmentation method of prior art 1, about 100 RoIs are obtained per image on average. After applying the above RoIs filtering method, the number of RoIs is reduced by about half, and the average time cost is within a few milliseconds. For the pedestrian Ground-Truth bounding boxes annotated in the SCUT Dataset (target types: single walking pedestrian and single cycling pedestrian; occlusion label: not occluded), a total of 14,000 examples were extracted to test the adaptive head localization algorithm. Manual counting shows that the head left/right boundaries were located incorrectly in only 1,162 of them, an accuracy of about 92%, so the proposed head localization algorithm has high precision. Some examples are shown in (c) of FIG. 2, where the two white vertical lines added to each image correspond to the pedestrian head boundary pair (X_edge_l, X_edge_r) obtained by the algorithm.
FIG. 3 is a block diagram showing an RoIs filtering device according to an embodiment of the present invention. The RoIs filtering device 300 includes a size-anomaly RoIs filter 310, a position-anomaly RoIs filter 320 and a missing-head RoIs filter 330.
The size-anomaly RoIs filter 310 filters out RoIs of abnormal size. Specifically, the threshold interval for the pixel height of pedestrian RoIs is calculated from the image focal length and the pedestrian detection distance; the Gaussian distribution of the height-to-width ratio of pedestrian RoIs is obtained by statistical analysis, and a suitable confidence level is selected to obtain the threshold interval for the height-to-width ratio; each candidate RoI is then evaluated, and RoIs that do not satisfy both interval conditions are filtered out.
The position-anomaly RoIs filter 320 filters out RoIs at abnormal positions. Specifically, the road-surface reference of the current image is obtained under the horizontal road surface assumption; for each RoI, the distances between its upper and lower boundaries and the road-surface reference along the image y-axis are calculated, and thresholds based on the pixel height of the current RoI are set; candidate RoIs whose distances do not satisfy the thresholds are then filtered out.
The missing-head RoIs filter 330 filters out RoIs that lack a pedestrian head. Specifically, the adaptive pedestrian-head localization algorithm is applied to the current RoI to obtain the head boundary pair (X_edge_l, X_edge_r) of the upper region P_up, which divides P_up along the horizontal direction into three parts P_l, P_m and P_r; the Haar-like feature value of P_up is calculated according to Formula (7) above and compared with the threshold T_haar, and the head constraint is satisfied if the value is greater than the threshold. These operations are applied one by one to the candidate RoIs that satisfy the position-feature requirement, and the RoIs lacking a pedestrian head are filtered out.
FIG. 4 is a flowchart showing a classifier training method according to an embodiment of the present invention.
In step 410, enhanced positive samples and enhanced negative samples are generated. Specifically, enhanced positive samples are generated by combining the positive-sample annotation information with an equalization technique, and a clustering method is used to analyze the information distribution of non-pedestrian background image blocks, which assists in selecting enhanced negative samples of different categories.
Because pedestrian targets are rare in traffic scenes, the number of positive samples obtainable from published thermal imaging datasets is usually limited, and new positive samples need to be generated from them by image enhancement. Negative samples are extracted from the non-pedestrian regions of whole images, so they are comparatively plentiful. However, the conventional practice is to obtain negative samples by a grid-based random method, whereas the RoIs extraction method used in actual detection is usually different. The background information distributions represented by the two therefore differ considerably; that is, the negative samples are not sufficiently representative of the actual non-pedestrian RoIs.
The enhanced positive samples comprise original positive samples and extended positive samples. Generating the enhanced positive samples includes: taking the thermal imaging pedestrian detection dataset SCUT Dataset as the source, extracting the corresponding image block information according to the annotated pedestrian Ground-Truth bounding boxes and preset criteria to obtain the original positive samples; and processing the brightness information of the original positive samples one by one with the plateau histogram equalization method to obtain the extended positive samples. In other words, the equalization method enhances the contrast of the brightness information of the original positive samples and generates extended positive samples with similar thermal imaging characteristics, so that a sufficient number of enhanced positive samples is obtained. (b) of FIG. 5 shows a comparison of original and extended positive samples.
In more detail, the specific steps for generating the enhanced positive samples are as follows:
① Taking the thermal imaging pedestrian detection dataset SCUT Dataset as the source, use the Caltech tools (http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) to extract the image block information corresponding to the pedestrian Ground-Truth bounding boxes, recorded as the temporary positive sample set Pos_temp;
② Select the original positive sample set Pos_p from Pos_temp according to preset criteria, namely: the target type (target) is "single walking pedestrian" or "single cycling pedestrian", the occlusion label (label) is not occluded, the frame interval is 5, and the pixel height lies in [30, 140]. The number of samples in Pos_p is recorded as PosNum_p;
③ Apply the plateau histogram equalization method to the brightness information of each sample in Pos_p to obtain the corresponding new sample image block information, manually exclude the cases that are overexposed or have lost their contours, and record the retained samples as the extended positive sample set Pos_e, whose size is recorded as PosNum_e;
④ The sample sets Pos_p and Pos_e together form the enhanced positive samples Pos of the classifier, as shown in Formula (8), where PosNum_e ≤ PosNum_p.
[Formula (8): equation image PCTCN2018083480-appb-000007, not reproduced in the text]
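The plateau histogram equalization used in step ③ can be sketched as follows. The text does not state the plateau value, so the `plateau` parameter is an assumption, as are the function name and the use of a flat list of 8-bit brightness values. The idea is to clip each histogram bin at the plateau before accumulating, so that a large uniform background does not consume most of the output gray levels.

```python
def plateau_equalize(pixels, plateau, levels=256):
    """Plateau histogram equalization of a flat list of integer gray values.

    `plateau` caps each histogram bin before the cumulative sum is taken;
    its value is a tuning parameter chosen here for illustration only.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    clipped = [min(h, plateau) for h in hist]   # limit dominant bins
    total = sum(clipped)
    if total == 0:                              # empty input
        return []
    lut, acc = [], 0
    for h in clipped:
        acc += h
        lut.append((levels - 1) * acc // total)  # cumulative mapping to 0..levels-1
    return [lut[v] for v in pixels]
```

With a very large plateau the function reduces to ordinary histogram equalization; with a plateau of 1 every occupied gray level receives an equal share of the output range.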
Generating the enhanced negative samples includes: extracting original negative samples from the complete images of the dataset with the RoIs extraction method used in the detection process, and using K-means clustering and uniform random selection to ensure that the selected enhanced negative samples cover more representative background information in suitable proportions.
Specifically, image block information is extracted from the complete images of the pedestrian detection dataset SCUT Dataset with the RoIs extraction method used in the detection process; cases whose IoU with a pedestrian Ground-Truth bounding box is higher than 30% and which are judged to be of abnormal size (for example, by the aforementioned RoIs filtering method) are removed, and the retained image blocks are recorded as source negative samples. The source negative samples are clustered with the K-means method, and image blocks are selected uniformly at random from the clustering results according to the calculated proportions to form the enhanced negative samples. Further, negative samples containing interfering vehicle heat sources are added according to the clustering results, increasing the proportion of this kind of background information in the enhanced negative samples.
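The overlap test in the paragraph above can be sketched as follows. Boxes are assumed to be (x1, y1, x2, y2) tuples, and `size_ok` is a hypothetical callable standing in for the size-anomaly check. The sketch drops an RoI if it overlaps a pedestrian box too much or if it fails the size check; this is one reading of the text, whose wording could also be taken to require both conditions together.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def keep_as_negative(roi, gt_boxes, size_ok, max_iou=0.30):
    """Keep an RoI as a source negative sample.

    Assumed reading: the RoI must pass the size check and must not overlap
    any pedestrian Ground-Truth box by more than `max_iou`.
    """
    return size_ok(roi) and all(iou(roi, g) <= max_iou for g in gt_boxes)
```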
In more detail, the specific steps for generating the enhanced negative samples are as follows:
① Taking the thermal imaging pedestrian detection dataset SCUT Dataset as the source, extract RoIs information from all complete images of the dataset with the RoIs extraction method used in the detection process;
② Examine the obtained RoIs one by one, and exclude the cases whose IoU with a pedestrian Ground-Truth bounding box is higher than 30% and which are judged to be of abnormal size (for example, by the aforementioned RoIs filtering method);
③ Extract the corresponding image block information for the RoIs that satisfy the preset requirements to form the source negative sample set Neg_temp, whose size is recorded as NegNum_temp;
④ Partition Neg_temp into n classes with the K-means clustering method (for example, n = 100 in the experiments). Let PosNum be the number of enhanced positive samples Pos and NegNum the number of enhanced negative samples Neg, and set NegNum = PosNum × 4. According to this target, image block information is selected at random from the clustering results in proportion to cluster size: if the current i-th class contains Num_i samples, (Num_i × NegNum / NegNum_temp) negative samples are selected from it by uniform random sampling;
⑤ Applying the operation of ④ to each cluster in turn yields NegNum samples, which form the enhanced negative samples Neg;
⑥ From the n classes produced by K-means clustering, manually pick the classes that contain negative samples of interfering vehicle heat sources, and randomly select a certain proportion of their negative samples to add to Neg, increasing the proportion of this kind of background information in Neg.
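The proportional selection of steps ④ and ⑤ can be sketched as below, assuming the clustering has already been done and each cluster is available as a list of samples. The function name, the fixed seed and the rounding of fractional quotas are illustrative choices; because of rounding, the total may differ slightly from NegNum when the quotas are not whole numbers.

```python
import random


def sample_negatives(clusters, neg_num, seed=0):
    """Draw about `neg_num` samples, cluster i contributing
    Num_i * NegNum / NegNum_temp samples chosen uniformly at random."""
    rng = random.Random(seed)
    neg_num_temp = sum(len(members) for members in clusters)
    if neg_num_temp == 0:
        return []
    picked = []
    for members in clusters:
        quota = round(len(members) * neg_num / neg_num_temp)
        quota = min(quota, len(members))          # cannot draw more than exist
        picked.extend(rng.sample(members, quota))  # sampling without replacement
    return picked
```

With PosNum enhanced positive samples, the call would use `neg_num = 4 * PosNum`, following the setting NegNum = PosNum × 4 above.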
Next, in step 420, the generated enhanced positive samples and enhanced negative samples are preprocessed by adjusting their brightness and boundary information. Preprocessing the generated enhanced positive and negative samples improves the quality of the sample data and ultimately improves classifier performance.
The preprocessing operations applied to the enhanced positive and negative samples in the present invention include pixel Y-channel extraction, boundary scaling adjustment and gamma correction. Specifically, the pixel Y-channel extraction method converts the enhanced positive and negative samples into a single-channel image format with low computational cost; a boundary scaling strategy adjusts the boundary coordinate data of the enhanced positive and negative samples, reducing the information difference between the training samples and the RoIs actually extracted; further, the gamma correction method is applied to the enhanced positive and negative samples to increase the dynamic range and stretch the contrast of the Y-channel information.
These operations have the following advantages. (1) For images input from the thermal imager, taking the YUV 4:2:2 format as an example, each point (x, y) carries two channels of information, either "Y, U" or "Y, V". Compared with the U and V channels, which represent chrominance, the Y channel, which represents luminance, contains the complete thermal imaging information; the pixel Y-channel extraction method is therefore used to convert the enhanced positive and negative samples into a single-channel image format with low computational cost. (2) In RoIs extraction methods, the RoIs obtained from foreground regions often have the pedestrian contour touching the RoI boundary or lying very close to it, whereas the pedestrian Ground-Truth bounding boxes of most datasets leave a margin of background information near the boundary. This increases the information difference between the training samples and the RoIs extracted in actual detection, so the boundaries of the enhanced positive and negative samples need to be scaled to reduce this difference. (3) Processing the enhanced positive and negative samples with gamma correction increases the dynamic range and stretches the contrast of the Y-channel information.
In more detail, the specific steps for preprocessing the enhanced positive and negative samples are as follows:
① For the current sample image block, extract the corresponding Y-channel information point by point according to the layout of its per-pixel channel information, and then arrange the Y-channel information in order of the point positions (x, y) to form the new sample data. (a) of FIG. 5 illustrates this Y-channel preprocessing for a YUV 4:2:2 image: above the arrow is a YUV 4:2:2 image before processing (each pixel contains one Y channel and one U or V channel), and below the arrow is the resulting Y-channel image (each pixel contains only the Y channel).
② Apply the operation of ① to every sample of the enhanced positive samples Pos and the enhanced negative samples Neg;
③ Examine the RoIs extraction method used in actual detection. If the pedestrian contour touches the RoI boundary or lies very close to it, which does not match the dataset, apply boundary scaling to every sample of Pos and Neg. The operation is: with respect to the center of gravity of the current sample image block, move each of its four boundaries toward the center of gravity by m pixels; experiments give an empirical value of m in the range 3 to 5;
④ Apply gamma correction point by point to the Y-channel information of the current sample image block, with the gamma parameter set experimentally to γ = 0.5;
⑤ Apply the operation of ④ to every sample of the enhanced positive samples Pos and the enhanced negative samples Neg to obtain the new enhanced positive and negative samples Pos' and Neg'.
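The three preprocessing operations can be sketched in plain Python as follows. Several details are assumptions rather than statements of the text: the packed byte order of the YUV 4:2:2 data is taken to be Y0 U Y1 V (other orders such as U Y V Y exist), boundary scaling is realized as a symmetric crop of m pixels per side, and gamma correction uses the common normalized form out = 255 × (in / 255)^γ on 8-bit values.

```python
def extract_y(yuyv, width, height):
    """Keep only the Y bytes of a packed YUV 4:2:2 buffer (assumed Y0 U Y1 V order)."""
    y = yuyv[0::2]                       # every second byte is a luminance value
    return [list(y[r * width:(r + 1) * width]) for r in range(height)]


def shrink_border(img, m):
    """Move all four boundaries inward by m pixels (the text suggests m in 3..5)."""
    if len(img) <= 2 * m or len(img[0]) <= 2 * m:
        raise ValueError("image block too small for this margin")
    return [row[m:len(row) - m] for row in img[m:len(img) - m]]


def gamma_correct(img, gamma=0.5):
    """Point-wise gamma correction of 8-bit values through a lookup table."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[lut[v] for v in row] for row in img]


def preprocess(yuyv, width, height, m=3, gamma=0.5):
    """Y extraction, then boundary scaling, then gamma correction."""
    return gamma_correct(shrink_border(extract_y(yuyv, width, height), m), gamma)
```

With γ = 0.5 dark values are lifted more than bright ones; for instance an input of 64 maps to 128 while 255 stays at 255.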
In step 430, the preprocessed enhanced positive and negative samples are partitioned into training sets and the classifiers are trained. By clustering the preprocessed enhanced positive samples, a sample-scale partition standard for three distances (far, middle and near) is obtained; according to this standard, the preprocessed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suited to classifying far-, middle- and near-distance pedestrian targets respectively.
The present invention defines the pixel-height threshold interval of pedestrian targets as [30, 140], corresponding to the farthest and nearest pedestrian targets in real scenes. However, pedestrian information at these two extreme distances differs greatly, so the resulting enhanced positive samples have high intra-class variation, and training only one classifier would degrade detection performance.
Partitioning the preprocessed enhanced positive and negative samples into training sets and training the classifiers includes: analyzing the preprocessed enhanced positive samples with a clustering method, with the number of classes set to k = 3, to obtain a pixel-height-based sample-scale partition standard for the far, middle and near distances, and thereby subdividing the enhanced positive and negative samples into three independent training sets; and training three classifiers (classifier_f, classifier_m, classifier_n) suited to classifying far-, middle- and near-distance pedestrian targets respectively. To select hard negative samples, each trained classifier is run on the source negative samples not used for training, the false alarms among them are selected as hard negative samples and added to the corresponding training set, and the classifier is retrained; this process is repeated until the preset number of training iterations is reached. (c) of FIG. 5 shows some hard negative samples of interfering vehicle heat sources.
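The hard-negative selection loop described above can be sketched generically. The classifier type is not specified at this point in the text, so the sketch takes a hypothetical `train` callable that returns a prediction function (True meaning "pedestrian"). Stopping early when a round finds no false alarms is an added convenience, not something the text prescribes.

```python
def bootstrap_hard_negatives(train, positives, negatives, unused_pool, rounds):
    """Retrain a classifier with false alarms mined from unused negatives.

    `train(pos, neg)` is a hypothetical callable returning `predict(sample)`,
    which yields True when the sample is classified as a pedestrian.
    """
    negatives = list(negatives)
    pool = list(unused_pool)
    predict = train(positives, negatives)
    for _ in range(rounds):
        flags = [predict(s) for s in pool]
        hard = [s for s, f in zip(pool, flags) if f]      # false alarms
        if not hard:
            break                                         # nothing left to mine
        negatives.extend(hard)
        pool = [s for s, f in zip(pool, flags) if not f]  # keep the rest unused
        predict = train(positives, negatives)             # retrain on enlarged set
    return predict, negatives
```

In the scheme above this loop would be run once per distance range, each time with that range's own training set.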
In more detail, the specific steps for partitioning the preprocessed enhanced positive and negative samples into training sets and training the classifiers include:
① Define the four boundaries of the three consecutive distance ranges (far, middle, near) as (Range_l, Range_s, Range_m, Range_r), and obtain their values with the K-means clustering method. The operation is: divide the actual detection distance interval [20, 85] into several parts at 5-meter intervals, and calculate the pedestrian target pixel height corresponding to each part according to Formula (1); select from the positive samples the examples with the corresponding pixel heights for cluster analysis; apply K-means clustering with the number of classes set to k = 3, which gives the four pixel-height boundary values Range_l = 30, Range_s = 48, Range_m = 90 and Range_r = 140;
height_pixel ≈ height_target × f / distance          Formula (1)
Here height_pixel is the pixel height of the pedestrian target in the image; height_target is the body height of the pedestrian target, set experimentally to about 1.7 meters; f is the image focal length, whose value for the SCUT Dataset is 1554; and distance is the detection distance.
② Let Sample_h be the pixel height of the current sample image block Sample. If Range_l ≤ Sample_h < Range_s, assign Sample to the far-distance training set; if Range_s ≤ Sample_h < Range_m, assign it to the middle-distance training set; if Range_m ≤ Sample_h ≤ Range_r, assign it to the near-distance training set;
③ Apply the operation of ② to every sample of the enhanced positive and negative samples Pos' and Neg' to obtain the three training sets;
④ Using the three independent training sets, train three classifiers suited to classifying far-, middle- and near-distance pedestrian targets respectively. During the iterations, to select hard negative samples, each trained classifier is run on the source negative samples not used for training, the false alarms among them are selected as hard negative samples and added to the corresponding training set, and the classifier is retrained; this process is repeated until the preset number of training iterations is reached.
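Formula (1) and the assignment rule of step ② can be sketched as follows, using the values stated above (target height 1.7 m, f = 1554, and the boundaries 30, 48, 90 and 140). The function names and the string labels are illustrative.

```python
RANGE_L, RANGE_S, RANGE_M, RANGE_R = 30, 48, 90, 140   # pixel-height boundaries


def pixel_height(distance_m, target_height_m=1.7, focal=1554.0):
    """Formula (1): approximate pixel height of a pedestrian at a given distance."""
    return target_height_m * focal / distance_m


def distance_bin(sample_h):
    """Step 2: map a sample's pixel height to a training set, or None if out of range."""
    if RANGE_L <= sample_h < RANGE_S:
        return "far"
    if RANGE_S <= sample_h < RANGE_M:
        return "mid"
    if RANGE_M <= sample_h <= RANGE_R:
        return "near"
    return None
```

Under Formula (1), a 1.7 m pedestrian at 85 m is about 31 pixels tall and at 20 m about 132 pixels tall, which is consistent with the [30, 140] pixel-height interval defined earlier.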
图6是示出根据本发明的实施例的分类器训练装置的框图。分类器训练装置600包括增强正负样本生成模块610,增强正负样本预处理模块620和训练集划分与分类器训练模块630。FIG. 6 is a block diagram showing a classifier training apparatus according to an embodiment of the present invention. The classifier training device 600 includes an enhanced positive and negative sample generation module 610, an enhanced positive and negative sample preprocessing module 620, and a training set partitioning and classifier training module 630.
增强正负样本生成模块610生成增强正样本和增强负样本。具体地,结合正样本标注信息和均衡化技术生成增强正样本,使用聚类方法分析非行人背景图像块的信息分布,辅助筛选不同类别的增强负样本。The enhanced positive and negative sample generation module 610 generates an enhanced positive sample and an enhanced negative sample. Specifically, the positive sample labeling information and the equalization technique are combined to generate an enhanced positive sample, the clustering method is used to analyze the information distribution of the non-pedestrian background image block, and the enhanced negative samples of different categories are assisted.
增强正样本包括原始正样本和扩展正样本。生成增强正样本包括:以热成像行人检测数据集SCUT Dataset为来源,根据标注的行人Ground-Truth边界框和预设指标提取对应图像块信息,获得原始正样本。使用平台直方图均衡化方法对原始正样本的亮度信息逐一进行处理,得到扩展正样本。即,使用均衡化方法增强原始正样本亮度信息的对比度,生成类似热成像特性的扩展正样本,以此构成足够数量的增强正样本。The enhanced positive samples include the original positive samples and the extended positive samples. Generating the enhanced positive sample includes: taking the thermal imaging pedestrian detection data set SCUT Dataset as a source, extracting the corresponding image block information according to the marked pedestrian Ground-Truth bounding box and the preset index, and obtaining the original positive sample. The platform histogram equalization method is used to process the luminance information of the original positive samples one by one to obtain an extended positive sample. That is, an equalization method is used to enhance the contrast of the original positive sample luminance information, and an extended positive sample similar to the thermal imaging characteristic is generated to constitute a sufficient number of enhanced positive samples.
生成增强负样本包括:使用检测过程对应的RoIs提取方法在行人检测数据集SCUT Dataset的完整图像中提取图像块信息,去除其中与行人Ground-Truth边界框的IOU高于30%且被判断为尺寸异常(例如,被前述RoIs过滤方法判断为尺寸异常)的个例,保留的图像块记为源负样本;使用K-mean方法对源负样本进行聚类,根据计算得到的比例在聚类结果中均匀随机选取图像块,构成增强负样本;进一步地,根据聚类结果增加包含汽车干扰热源的负样本,提高此类背景信息在增强负样本中的比例。Generating the enhanced negative sample includes extracting the image block information from the complete image of the pedestrian detection data set SCUT Dataset using the RoIs extraction method corresponding to the detection process, and removing the IOU from the pedestrian Ground-Truth bounding box by more than 30% and being judged as the size For an example of an abnormality (for example, a size abnormality determined by the aforementioned RoIs filtering method), the retained image block is recorded as a source negative sample; the source negative sample is clustered using the K-mean method, and the clustered result is obtained according to the calculated ratio. The image block is randomly selected to form an enhanced negative sample; further, a negative sample containing the vehicle interference heat source is added according to the clustering result, and the proportion of such background information in the enhanced negative sample is improved.
The enhanced positive and negative sample preprocessing module 620 preprocesses the enhanced positive samples and enhanced negative samples generated by the enhanced positive and negative sample generation module. The preprocessing adjusts brightness and boundary information. Preprocessing the generated samples improves the quality of the sample data and ultimately the performance of the classifier.
The preprocessing operations applied to the enhanced positive and negative samples in the present invention include pixel Y-channel extraction, boundary scaling adjustment, and gamma correction. Specifically, the pixel Y-channel extraction method converts the enhanced positive and negative samples into a single-channel image format with low computational overhead; a boundary scaling strategy adjusts the boundary coordinates of the samples to reduce the information difference between the training samples and the RoIs actually extracted; further, gamma correction is applied to the samples to increase the dynamic range of the Y-channel information and stretch the contrast.
This has the following advantages: (1) Taking the YUV 4:2:2 format as an example of the image input by the thermal imager, each point (x, y) carries two channels of information, either "Y, U" or "Y, V"; compared with the U and V channels, which represent chrominance, the Y channel, which represents luminance, carries the complete thermal imaging information; the pixel Y-channel extraction method is therefore used to convert the enhanced positive and negative samples into a single-channel image format with low computational overhead. (2) In the RoIs extraction method, RoIs obtained from foreground regions usually have the pedestrian contour touching the RoI boundary or separated from it by too small a margin, whereas the pedestrian Ground-Truth bounding boxes of most datasets leave a margin of background near the boundary, which increases the information difference between the training samples and the RoIs extracted in actual detection; boundary scaling adjustment of the enhanced positive and negative samples is therefore needed to reduce this difference. (3) Processing the enhanced positive and negative samples with gamma correction increases the dynamic range of the Y-channel information and stretches the contrast.
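Two of the preprocessing operations, Y-channel extraction and gamma correction, can be sketched with NumPy. This is a minimal sketch under stated assumptions: the packed byte order is assumed to be YUYV (luminance in every even byte), the gamma value is a placeholder, and the function names are hypothetical. The source does not specify any of these.

```python
import numpy as np


def extract_y_from_yuyv(frame_bytes, width, height):
    """Return the luminance plane of a packed YUV 4:2:2 frame.

    Assumes YUYV byte order: each pixel occupies two bytes, the first being Y
    and the second being either U or V.
    """
    buf = np.frombuffer(frame_bytes, dtype=np.uint8).reshape(height, width, 2)
    return buf[:, :, 0].copy()


def gamma_correct(y, gamma=0.8):
    """Apply out = 255 * (in / 255) ** gamma to an 8-bit single-channel image."""
    norm = y.astype(np.float32) / 255.0
    out = np.round(255.0 * np.power(norm, gamma))
    return np.clip(out, 0, 255).astype(np.uint8)
```

With a gamma below 1 the dark end of the range is expanded, and a gamma above 1 expands the bright end. Which direction suits a given thermal imager is an empirical choice.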
The training set division and classifier training module 630 divides the enhanced positive and negative samples preprocessed by the preprocessing module into training sets and iteratively trains the classifiers. By clustering the positive samples, sample scale division criteria for three distances (far, middle and near) are obtained; accordingly, the preprocessed enhanced positive and negative samples are divided into three training sets, which are used to train three classifiers suited to classifying far-, middle- and near-distance pedestrian targets, respectively.
The present invention defines the pixel height threshold interval of pedestrian targets as [30, 140], corresponding to the farthest and nearest pedestrian targets in real scenes. However, pedestrian information at these two extreme distances differs greatly, so the resulting enhanced positive samples have high intra-class variation, and training only a single classifier would degrade detection performance.
Dividing the preprocessed enhanced positive and negative sample training set and training the classifiers includes: analyzing the preprocessed enhanced positive samples with a clustering method, with the number of clusters set to k = 3, to obtain sample scale division criteria for far, middle and near distances based on pixel height, thereby subdividing the enhanced positive and negative samples into three independent training sets; training three classifiers (classifier_f, classifier_m, classifier_n) suited to classifying far-, middle- and near-distance pedestrian targets, respectively; and, to select hard negative samples, using the trained classifiers to detect the source negative samples not used for training, selecting the false alarms among them as hard negative samples, adding them to the corresponding training set and retraining the classifier, and repeating this process until the preset number of training iterations is reached.
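The height-based division into far, middle and near training sets can be sketched as a one-dimensional k-means over sample pixel heights with k = 3. This is a hedged sketch only: the source says a clustering method is used but does not give its details, so the quantile initialisation, the midpoint boundaries and the names `height_boundaries` and `assign_range` are assumptions.

```python
import numpy as np


def height_boundaries(heights, k=3, iters=50):
    """Cluster pixel heights with 1-D k-means and return the k-1 split points."""
    h = np.sort(np.asarray(heights, dtype=float))
    # Deterministic initialisation: evenly spaced quantiles of the data.
    centers = np.quantile(h, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(h[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            h[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    centers = np.sort(centers)
    # A boundary is placed midway between neighbouring cluster centres.
    return (centers[:-1] + centers[1:]) / 2.0


def assign_range(height, bounds):
    """Map a pixel height to 'far', 'mid' or 'near' (small height means far)."""
    names = ("far", "mid", "near")
    return names[int(np.searchsorted(bounds, height))]
```

At detection time the same boundaries would route each retained RoI to the classifier trained on its height range.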
The original positive samples obtained on the SCUT Dataset with the enhanced positive sample generation method comprise about 26,000 positive samples in the far-distance interval, about 18,800 in the middle-distance interval, and about 9,700 in the near-distance interval; combined with the generated extended positive samples, the resulting enhanced positive samples satisfy the classifier's requirement on the number of positive samples.
FIG. 7 is a flowchart showing a pedestrian detection method according to an embodiment of the present invention.
At step 710, the RoIs to be detected are extracted.
At step 720, the RoIs are filtered. The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; for each RoI in turn, calculating the vertical distances between its upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the RoI pixel height, and filtering out RoIs at abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head. These steps have been described in detail above and are not repeated here.
At step 730, the classifiers are trained offline. The classifier training method includes: generating enhanced positive samples by combining positive sample annotation information with an equalization technique, and analyzing the information distribution of non-pedestrian background image patches with a clustering method to help select enhanced negative samples of different categories; preprocessing the enhanced positive and negative samples by adjusting brightness and boundary information; and obtaining sample scale division criteria for far, middle and near distances by clustering the preprocessed enhanced positive samples, dividing the preprocessed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suited to classifying far-, middle- and near-distance pedestrian targets, respectively. These steps have been described in detail above and are not repeated here.
At step 740, the filtered RoIs are classified and detected using the trained classifiers.
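The size-filtering stage of step 720 can be sketched as follows, using the pinhole relation height_pixel ≈ height_target × f / distance given in claim 2 and the [30, 140] pixel height interval stated in the description. The aspect-ratio interval below is a placeholder, because the source derives it from a Gaussian fit whose parameters are not given. The function names and the focal length expressed in pixels are likewise assumptions.

```python
def pixel_height(height_target_m, focal_px, distance_m):
    """Approximate pixel height of a target: height_target * f / distance."""
    return height_target_m * focal_px / distance_m


def is_size_normal(roi_w, roi_h, height_range=(30, 140), aspect_range=(1.5, 4.0)):
    """Keep an RoI only if its pixel height and its height/width ratio are in range.

    height_range follows the [30, 140] interval in the description;
    aspect_range is a hypothetical placeholder, not a value from the source.
    """
    if roi_w <= 0 or roi_h <= 0:
        return False
    aspect = roi_h / roi_w
    height_ok = height_range[0] <= roi_h <= height_range[1]
    aspect_ok = aspect_range[0] <= aspect <= aspect_range[1]
    return height_ok and aspect_ok
```

Evaluating `pixel_height` at the farthest and nearest detection distances of interest gives the two ends of the height interval for a given camera.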
FIG. 8 is a block diagram showing a pedestrian detection apparatus according to an embodiment of the present invention. The pedestrian detection apparatus 800 includes a RoIs extraction module 810, a RoIs filtering module 820, a classifier offline training module 830, and a classification detection module 840.
The RoIs extraction module 810 extracts the RoIs to be detected.
The RoIs filtering module 820 filters the RoIs. The RoIs filtering includes the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; for each RoI in turn, calculating the vertical distances between its upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the RoI pixel height, and filtering out RoIs at abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head. These steps have been described in detail above and are not repeated here.
The classifier offline training module 830 trains the classifiers offline. The classifier training method includes: generating enhanced positive samples by combining positive sample annotation information with an equalization technique, and analyzing the information distribution of non-pedestrian background image patches with a clustering method to help select enhanced negative samples of different categories; preprocessing the enhanced positive and negative samples by adjusting brightness and boundary information; and obtaining sample scale division criteria for far, middle and near distances by clustering the preprocessed enhanced positive samples, dividing the preprocessed enhanced positive and negative samples into three training sets accordingly, and training three classifiers suited to classifying far-, middle- and near-distance pedestrian targets, respectively. These steps have been described in detail above and are not repeated here.
The classification detection module 840 classifies and detects the filtered RoIs using the trained classifiers.
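The missing-head filter performed by the RoIs filtering module can be illustrated with a simplified sketch. It sums the luminance of the upper part of an RoI column by column (the vertical projection), differences neighbouring sums, takes the strongest rise and the strongest fall as the left and right head boundaries, and compares the mean brightness of the head region with that of the flanking background. The upper-region fraction, the contrast threshold, the use of only the two strongest extrema, and the function names are all assumptions; the boundary matching strategy and the Haar-like evaluation in the source may differ.

```python
import numpy as np


def locate_head(roi_y, upper_frac=0.25):
    """Return (left, right) column bounds of a candidate head, or None."""
    rows = max(1, int(roi_y.shape[0] * upper_frac))
    upper = roi_y[:rows].astype(np.float32)
    projection = upper.sum(axis=0)      # one luminance sum per column
    diff = np.diff(projection)          # difference of adjacent columns
    left = int(np.argmax(diff)) + 1     # first column after the strongest rise
    right = int(np.argmin(diff)) + 1    # exclusive end at the strongest fall
    return (left, right) if left < right else None


def has_head(roi_y, min_contrast=20.0, upper_frac=0.25):
    """True if the head region is brighter than its side background by min_contrast."""
    span = locate_head(roi_y, upper_frac)
    if span is None:
        return False
    left, right = span
    rows = max(1, int(roi_y.shape[0] * upper_frac))
    upper = roi_y[:rows].astype(np.float32)
    head = upper[:, left:right]
    background = np.concatenate([upper[:, :left], upper[:, right:]], axis=1)
    if background.size == 0:
        return False
    return float(head.mean() - background.mean()) >= min_contrast
```

The comparison of a central rectangle against its two flanks is the simplest three-rectangle Haar-like pattern, which is why a mean-difference test is used here.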
Compared with existing vehicle-mounted thermal imaging pedestrian detection techniques, the vehicle-mounted thermal imaging pedestrian detection method provided by the present invention addresses the adverse effects of computational bottlenecks and sample quality problems, and has the following advantages and effects:
1. The classifier training method and the RoIs filtering method proposed by the present invention form a "front-and-back cooperation" relationship: during vehicle-mounted thermal imaging pedestrian detection, the RoIs obtained in the extraction stage are first passed through the RoIs filtering method, which identifies and removes non-pedestrian RoIs online; the three classifiers suited to far, middle and near distances, trained offline with the classifier training method, are then applied, with each retained RoI assigned by pixel height to the corresponding classifier for fine detection.
2. The present invention proposes a RoIs filtering method that constructs a three-layer cascade filter which follows the regularities of pedestrian features and has low computational overhead. The filter first removes RoIs of abnormal size, RoIs at abnormal positions, and RoIs that lack a pedestrian head, so that a large number of non-pedestrian RoIs are suppressed. This ensures that the remaining RoIs can meet the real-time requirement when they pass through the more accurate classifier detection stage, and it also reduces the system false alarm rate.
3. The present invention proposes a classifier training method that focuses on improving the quantity, distribution and quality of the training samples. By enhancing image contrast with an equalization method, extended positive samples with similar thermal imaging characteristics can be generated from the original positive samples, yielding a sufficient number of enhanced positive samples. By analyzing the types of background information in the source negative samples with a clustering method, the resulting enhanced negative samples cover more representative background information in suitable proportions. Adjusting the enhanced positive and negative samples with the preprocessing methods improves sample quality. Obtaining the division criteria for the enhanced positive and negative sample training set with a clustering method reduces the intra-class variation of the samples. The classifier training method improves the scene adaptability of the classifier, and because the improvements are made at the sample level, the added system computational overhead is small, which suits practical applications.
The performance of the method of the present invention was tested and evaluated in a real road pedestrian detection environment. The complete thermal imaging pedestrian detection apparatus used for testing comprises: the RoIs extraction method of prior art 1, the RoIs filtering method proposed by the present invention, the classifier training method proposed by the present invention, a classifier of the "HOG features and linear SVM" type, and a Kalman tracking method. The hardware platform used for testing is a vehicle fitted with the pedestrian detection system, using an NV628 infrared thermal imager produced by Guangzhou SAT (广州飒特公司) and a DM6437 embedded platform produced by Texas Instruments.
The test plan was to select several road sections in Guangzhou and use the vehicle for static and dynamic tests of real-world performance. The tests were carried out at night in cloudy weather, with an ambient temperature of about 27 °C and a relative humidity of about 90%. The evaluation metrics were defined as follows: the saved detection videos were processed by manual counting to record the number of valid pedestrians, the number of correctly detected pedestrians and the number of false alarm individuals, and the detection rate was calculated. A valid pedestrian is a pedestrian target that is present for at least 1 second in the detection video, which has a frame rate of 25 frames per second. Pedestrian targets include people walking seen from the front, back and side, and people riding bicycles, electric bicycles and motorcycles along the direction of travel. The number of false alarm individuals is the number of false detections within a given test section; a false alarm individual or region that persists in the picture is counted once. The detection rate is the ratio of the number of correctly detected pedestrians to the number of valid pedestrians.
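The evaluation metrics defined above reduce to two small calculations, sketched here. The function names are hypothetical; the 25 frames per second frame rate and the 1 second minimum presence come from the text.

```python
FPS = 25  # frame rate of the detection video stated in the test plan


def is_valid_pedestrian(frames_visible, fps=FPS, min_seconds=1.0):
    """A pedestrian counts as valid if present for at least min_seconds."""
    return frames_visible >= fps * min_seconds


def detection_rate(num_detected, num_valid):
    """Detection rate in percent: correctly detected / valid pedestrians."""
    if num_valid <= 0:
        raise ValueError("num_valid must be positive")
    return 100.0 * num_detected / num_valid
```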
For the static test, three ordinary paved road sections with a straight-line length of more than 200 meters were selected in the Guangzhou Free Trade Zone. The test vehicle was parked at a suitable position, several upright walking pedestrians were distributed at random within 15 to 70 meters directly in front of the vehicle, and the results were collected and counted with a computer, as shown in Table 1.
The static test results in Table 1 show that, with the test vehicle stationary, the thermal imaging pedestrian detection system using the proposed method performs well: in the detailed static tests on these road sections, the detection rate for valid pedestrians is 100% and the number of false alarm individuals is 0.
Table 1 Statistics of static test results
[Table 1 is provided as an image: Figure PCTCN2018083480-appb-000008]
For the dynamic test, six ordinary paved roads covering suburban, urban and expressway scenes in Guangzhou were selected. The vehicle was driven at 10 to 80 km/h for a 10-minute field test on each section, giving a total test time of 60 minutes, and the results were collected and counted with a computer, as shown in Table 2.
Table 2 Statistics of dynamic test results
[Table 2 is provided as an image: Figure PCTCN2018083480-appb-000009]
The dynamic test results in Table 2 show that, compared with the static test, the detection performance of the thermal imaging pedestrian detection system decreases when the test vehicle is moving. The analysis attributes this to more complex background interference heat sources while driving, such as more road vehicles and trees, and to more frequent occlusion of pedestrian targets. In addition, owing to the characteristics of thermal imaging, the brightness and contrast of the captured images can change at any time while the vehicle is moving. These factors affect the dynamic test results. In the detailed dynamic tests on these road sections, the average detection rate reaches 75.63% and the average number of false alarm individuals is 10, while the detection speed of the pedestrian detection system basically meets the real-time requirement.
The above is a detailed description of the present invention in connection with specific embodiments, but the specific implementation of the present invention is not to be regarded as limited to this description. A person of ordinary skill in the art to which the present invention belongs may make adjustments, modifications, substitutions and/or variations to these embodiments without departing from the principle and spirit of the present invention. The scope of protection of the present invention is defined by the appended claims and their equivalents.

Claims (10)

  1. A regions of interest (RoIs) filtering method for vehicle-mounted thermal imaging pedestrian detection, characterized in that the method comprises:
    filtering out RoIs of abnormal size by calculating the pixel height of pedestrians and the aspect ratio of the RoIs and setting the corresponding threshold intervals;
    for each RoI in turn, calculating the vertical distances between its upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the RoI pixel height, and filtering out RoIs at abnormal positions; and
    searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head.
  2. The region of interest filtering method according to claim 1, characterized in that filtering out RoIs of abnormal size comprises:
    calculating the threshold interval of the pedestrian RoIs pixel height from the image focal length $f$, the pedestrian height $height_{target}$ and the detection distance $distance$:
    $height_{pixel} \approx height_{target} \times f / distance$         Formula (1)
    where $height_{pixel}$ is the threshold interval of the pedestrian RoIs pixel height, $height_{target}$ is the height of the pedestrian target, $f$ is the image focal length, and $distance$ is the detection distance;
    obtaining the Gaussian distribution of the pedestrian RoIs aspect ratio by statistical analysis, and selecting a suitable confidence level to obtain the aspect ratio threshold interval; and
    evaluating each RoI to be detected, treating RoIs that do not satisfy both interval conditions as RoIs of abnormal size, and removing them.
  3. The region of interest filtering method according to claim 1, characterized in that filtering out RoIs at abnormal positions comprises:
    obtaining the road surface reference of the current image using a horizontal road surface assumption;
    for the RoIs to be judged, calculating for each RoI in turn the distances in the y-axis direction between its upper and lower boundaries and the road surface reference, the y-axis direction being the vertical direction of the RoI, and calculating a threshold based on the pixel height $RoI_h$ of the current RoI according to Formula (2):
    [Formula (2) is provided as an image: Figure PCTCN2018083480-appb-100001]
    where α and β are scaling factors and ε is an offset noise factor; and
    filtering out the RoIs to be detected whose distance results do not satisfy the threshold.
  4. The region of interest filtering method according to claim 1, characterized in that filtering out RoIs that lack a pedestrian head comprises:
    using a pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts along the horizontal direction, the middle part being named the head region and the left and right parts being named the background regions; and
    using a method based on Haar-like features to evaluate the degree of difference between the mean brightness of the head region and of the background regions, and removing RoIs that lack a head according to a preset threshold.
  5. The region of interest filtering method according to claim 4, characterized in that the pedestrian head adaptive positioning algorithm processes the upper region of the current RoI with a luminance vertical projection method to obtain the corresponding sequence of projection results; calculates the differences between adjacent values in the sequence to obtain the luminance vertical projection difference curve of the current RoI; and, according to a vertical boundary matching strategy, searches the extreme points of the curve for a qualifying pair of left and right head region boundaries, whose x-axis coordinates define the position of the head region, the x-axis being the horizontal direction of the RoI.
  6. A Regions of Interest (RoIs) filtering apparatus for vehicle-mounted thermal imaging pedestrian detection, characterized in that the apparatus comprises:
    an abnormal-size RoIs filter, which filters out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals;
    an abnormal-position RoIs filter, which for each RoI in turn calculates the vertical distances between its upper and lower boundaries and the road surface reference of the current image, calculates a threshold based on the RoI pixel height, and filters out RoIs at abnormal positions; and
    a missing-head RoIs filter, which searches for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, compares the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filters out RoIs that lack a pedestrian head.
  7. The region of interest filtering apparatus according to claim 6, characterized in that the abnormal-size RoIs filter calculates the threshold interval of the pedestrian RoIs pixel height from the image focal length and the pedestrian detection distance; obtains the Gaussian distribution of the pedestrian RoIs aspect ratio by statistical analysis and selects a suitable confidence level to obtain the aspect ratio threshold interval; and evaluates each RoI to be detected, treats RoIs that do not satisfy both interval conditions as RoIs of abnormal size, and removes them.
  8. The region of interest filtering apparatus according to claim 6, characterized in that the abnormal-position RoIs filter obtains the road surface reference of the current image using a horizontal road surface assumption; for each RoI in turn calculates the distances in the y-axis direction between its upper and lower boundaries and the road surface reference, the y-axis direction being the vertical direction of the RoI, and calculates a threshold based on the pixel height of the current RoI; and filters out the RoIs to be detected whose distance results do not satisfy the threshold.
  9. The region of interest filtering apparatus according to claim 6, characterized in that the missing-head RoIs filter uses a pedestrian head adaptive positioning algorithm to divide the upper region of the current RoI into three parts along the horizontal direction, the middle part being named the head region and the left and right parts being named the background regions; and uses a method based on Haar-like features to evaluate the degree of difference between the mean brightness of the head region and of the background regions, and removes RoIs that lack a head according to a preset threshold.
  10. A pedestrian detection method for vehicle-mounted thermal imaging, characterized in that the method comprises:
    extracting the RoIs to be detected;
    filtering the RoIs, wherein the RoIs filtering comprises the steps of: filtering out RoIs of abnormal size by calculating the pedestrian pixel height and the RoIs aspect ratio and setting the corresponding threshold intervals; for each RoI in turn, calculating the vertical distances between its upper and lower boundaries and the road surface reference of the current image, calculating a threshold based on the RoI pixel height, and filtering out RoIs at abnormal positions; and searching for a possible pedestrian head region according to the luminance vertical projection difference curve of each RoI, comparing the degree of difference between the Haar-like features of the head region and of the adjacent background regions, and filtering out RoIs that lack a pedestrian head;
    training the classifiers offline; and
    classifying and detecting the filtered RoIs using the trained classifiers.
PCT/CN2018/083480 2018-04-12 2018-04-18 Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection WO2019196131A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810324429.4 2018-04-12
CN201810324429.4A CN108549864B (en) 2018-04-12 2018-04-12 Vehicle-mounted thermal imaging pedestrian detection-oriented region-of-interest filtering method and device

Publications (1)

Publication Number Publication Date
WO2019196131A1 true WO2019196131A1 (en) 2019-10-17

Family

ID=63514702

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083480 WO2019196131A1 (en) 2018-04-12 2018-04-18 Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection

Country Status (2)

Country Link
CN (1) CN108549864B (en)
WO (1) WO2019196131A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553252A (en) * 2020-04-24 2020-08-18 福建农林大学 Road pedestrian automatic identification and positioning method based on deep learning and U-V parallax algorithm
CN114113218A (en) * 2021-11-24 2022-03-01 北京理工大学 Residual glue detection method and system
CN116337868A (en) * 2023-02-28 2023-06-27 靖江安通电子设备有限公司 Surface defect detection method and detection system
CN117095009A (en) * 2023-10-20 2023-11-21 山东绿康装饰材料有限公司 PVC decorative plate defect detection method based on image processing
CN117788871A (en) * 2023-12-26 2024-03-29 海南言发高科技有限公司 Vehicle-mounted weighing management method and platform based on artificial intelligence

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564030A (en) * 2018-04-12 2018-09-21 广州飒特红外股份有限公司 Classifier training method and apparatus towards vehicle-mounted thermal imaging pedestrian detection
CN109389073A (en) * 2018-09-29 2019-02-26 北京工业大学 The method and device of detection pedestrian area is determined by vehicle-mounted camera
CN109671090B (en) * 2018-11-12 2024-01-09 深圳佑驾创新科技股份有限公司 Far infrared ray-based image processing method, device, equipment and storage medium
CN109784176B (en) * 2018-12-15 2023-05-23 华南理工大学 Vehicle-mounted thermal imaging pedestrian detection ROIs extraction method and device
CN109784216B (en) * 2018-12-28 2023-06-20 华南理工大学 Vehicle-mounted thermal imaging pedestrian detection ROIs extraction method based on probability map
CN110298905A (en) * 2019-07-02 2019-10-01 麦克奥迪(厦门)医疗诊断系统有限公司 Method and apparatus for generating digital slices based on biological sample slices
CN110335285B (en) * 2019-07-08 2022-04-26 中国科学院自动化研究所 SAR image target marking method, system and device based on sparse representation
CN110765877B (en) * 2019-09-20 2022-09-06 南京理工大学 Pedestrian detection method and system based on thermal imager and binocular camera
CN110737276B (en) * 2019-11-06 2023-03-31 达闼机器人股份有限公司 Early warning method, patrol robot and computer readable storage medium
CN111368704B (en) * 2020-02-29 2023-05-23 华南理工大学 Vehicle-mounted thermal imaging pedestrian detection ROIs extraction method based on head characteristic points
CN114519696B (en) * 2021-12-31 2022-11-29 扬州盛强薄膜材料有限公司 PVC heat shrinkage film detection method and system based on optical intelligence
CN114943805B (en) * 2022-06-01 2023-08-01 北京精英路通科技有限公司 Parking shielding determination method, device, equipment, storage medium and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130020151A (en) * 2011-08-19 2013-02-27 주식회사 만도 Vehicle detection device and method
CN103279741A (en) * 2013-05-20 2013-09-04 大连理工大学 Pedestrian early warning system based on vehicle-mounted infrared image and working method thereof
CN105404857A (en) * 2015-11-04 2016-03-16 北京联合大学 Infrared-based night intelligent vehicle front pedestrian detection method
CN105426852A (en) * 2015-11-23 2016-03-23 天津津航技术物理研究所 Method for identifying pedestrians by vehicle-mounted monocular long-wave infrared camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, GUOHUA: "Research on Key Techniques of Far-infrared Pedestrian Detection Based on Onboard Monocular Camera", ELECTRONIC TECHNOLOGY & INFORMATION SCIENCE , CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, 15 May 2017 (2017-05-15), pages 52; 62 - 67, ISSN: 1674-022X *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553252A (en) * 2020-04-24 2020-08-18 福建农林大学 Road pedestrian automatic identification and positioning method based on deep learning and U-V parallax algorithm
CN111553252B (en) * 2020-04-24 2022-06-07 福建农林大学 Road pedestrian automatic identification and positioning method based on deep learning and U-V parallax algorithm
CN114113218A (en) * 2021-11-24 2022-03-01 北京理工大学 Residual glue detection method and system
CN114113218B (en) * 2021-11-24 2023-09-26 北京理工大学 Residual glue detection method and system
CN116337868A (en) * 2023-02-28 2023-06-27 靖江安通电子设备有限公司 Surface defect detection method and detection system
CN116337868B (en) * 2023-02-28 2023-09-19 靖江安通电子设备有限公司 Surface defect detection method and detection system
CN117095009A (en) * 2023-10-20 2023-11-21 山东绿康装饰材料有限公司 PVC decorative plate defect detection method based on image processing
CN117095009B (en) * 2023-10-20 2024-01-12 山东绿康装饰材料有限公司 PVC decorative plate defect detection method based on image processing
CN117788871A (en) * 2023-12-26 2024-03-29 海南言发高科技有限公司 Vehicle-mounted weighing management method and platform based on artificial intelligence

Also Published As

Publication number Publication date
CN108549864B (en) 2020-04-10
CN108549864A (en) 2018-09-18

Similar Documents

Publication Publication Date Title
WO2019196130A1 (en) Classifier training method and device for vehicle-mounted thermal imaging pedestrian detection
WO2019196131A1 (en) Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection
WO2019169816A1 (en) Deep neural network for fine recognition of vehicle attributes, and training method thereof
US9477892B2 (en) Efficient method of offline training a special-type parked vehicle detector for video-based on-street parking occupancy detection systems
Giannoukos et al. Operator context scanning to support high segmentation rates for real time license plate recognition
CN104778444B (en) Appearance feature analysis method for vehicle images in road scenes
CN102509098B (en) Fisheye image vehicle identification method
CN110263712B (en) Coarse and fine pedestrian detection method based on region candidates
WO2022027931A1 (en) Video image-based foreground detection method for vehicle in motion
CN107256633B (en) Vehicle type classification method based on monocular camera three-dimensional estimation
CN111489330B (en) Weak and small target detection method based on multi-source information fusion
Kim et al. Autonomous vehicle detection system using visible and infrared camera
WO2024037408A1 (en) Underground coal mine pedestrian detection method based on image fusion and feature enhancement
CN111915583A (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
Su et al. A new local-main-gradient-orientation HOG and contour differences based algorithm for object classification
Ghahremannezhad et al. Robust road region extraction in video under various illumination and weather conditions
Wang et al. Real-time vehicle classification based on eigenface
CN108009480A (en) Image human behavior detection method based on feature recognition
Piniarski et al. Efficient pedestrian detection with enhanced object segmentation in far IR night vision
Lee An accident detection system on highway through CCTV with Calogero-Moser system
Qadar et al. A comparative study of nighttime object detection with datasets from Australia and China
CN109784176B (en) Vehicle-mounted thermal imaging pedestrian detection ROIs extraction method and device
Zhao et al. Research on vehicle detection and vehicle type recognition under cloud computer vision
Karthiprem et al. Recognizing the moving vehicle while driving on Indian roads
Kiro et al. Road Lane Line Detection using Machine Learning

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18914564

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18914564

Country of ref document: EP

Kind code of ref document: A1