CN110837769B - Image processing and deep learning embedded far infrared pedestrian detection method - Google Patents


Info

Publication number
CN110837769B
Authority
CN
China
Prior art keywords
pedestrian
candidate region
local
threshold
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910745838.6A
Other languages
Chinese (zh)
Other versions
CN110837769A (en)
Inventor
郑永森
王国华
李进业
周殿清
周伟滨
林琳
李卓思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshan Sanzhuo Intelligent Technology Co ltd
Original Assignee
Zhongshan Sanzhuo Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshan Sanzhuo Intelligent Technology Co ltd filed Critical Zhongshan Sanzhuo Intelligent Technology Co ltd
Priority to CN201910745838.6A priority Critical patent/CN110837769B/en
Publication of CN110837769A publication Critical patent/CN110837769A/en
Application granted granted Critical
Publication of CN110837769B publication Critical patent/CN110837769B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses an embedded far infrared pedestrian detection method combining image processing and deep learning. A fast local dual-threshold algorithm and a local sliding-window technique are used to obtain pedestrian candidate regions; a classifier that jointly combines an Alexnet network and a VGGnet deep network classifies the candidate regions to produce pedestrian detection frames; on this basis, the fast local dual-threshold segmentation result is used as the observation value for Kalman tracking of the detection results. The system comprises: a candidate region generation module that obtains pedestrian candidate regions with the fast local dual-threshold and local sliding-window techniques; a candidate region classification module that classifies the candidate regions with the classifier jointly combining the Alexnet and VGGnet deep networks; an offline training module that trains the Alexnet and VGGnet network weights based on a support vector machine; and a pedestrian tracking module that performs Kalman tracking of the detection results using the fast local dual-threshold segmentation results as observation values. The method balances pedestrian detection accuracy and real-time performance.

Description

Image processing and deep learning embedded far infrared pedestrian detection method
Technical Field
The invention belongs to the fields of computer vision and pattern recognition, image processing, and computer-vision-based driver-assistance systems, and particularly relates to an embedded far infrared pedestrian detection method combining image processing and deep learning.
Background
In everyday driving, a driver's field of view and visibility are easily impaired at night, in bad weather, and under strong or rapidly changing light. If sensor devices can improve the driver's field of view and visibility and detect pedestrians on the road, traffic accidents can be effectively prevented. Research on vehicle-mounted far infrared pedestrian detection algorithms is key to achieving this. Because far infrared imaging works on temperature differences, it effectively suppresses the effects of night, severe weather, and strong light; research on vehicle-mounted pedestrian detection based on thermal imaging is therefore key to safeguarding road pedestrians during driving, and has great research and social value.
Wang Xiaolei (Infrared pedestrian detection research [J]. Journal of North China University (Natural Science Edition), 2019, 40(1): 73-80) obtains segmentation results through a selective search algorithm, merges them using prior knowledge to obtain candidate regions, and on that basis applies an Adaboost classifier based on integral channel features to realize far infrared pedestrian detection. Although the method runs in real time, it extracts infrared pedestrian features with traditional feature-extraction methods rather than deep learning, so its accuracy is low.
Dan Yongbiao et al (infrared pedestrian detection method based on aggregate channel characteristics [ J ]. Infrared, 2018, v.39 (05): 44-50.) in the classification stage, far infrared pedestrian detection was achieved using an Adaboost classifier. Because only one classifier is used for completing detection, higher precision is difficult to achieve in complex and various vehicle-mounted outdoor scenes. The invention proposes to use multiple classifiers for joint decision, and the weights of the classifiers are not determined manually, but are learned by a support vector machine.
Wang Dianwei (Improved YOLOv3 pedestrian detection algorithm for infrared video images [J]. Journal of Xi'an University of Posts and Telecommunications, 2018, 23(04): 52-56) improves the end-to-end deep detection network YOLOv3 by dimensional cluster analysis of target candidate frames on an infrared image dataset, adjustment of the classification-network pre-training process, and multi-scale network training, obtaining higher accuracy. However, YOLOv3 still localizes pedestrians imprecisely and detects distant targets poorly; the method therefore performs badly on distant pedestrian targets at high vehicle speeds and estimates pedestrian-to-vehicle distance inaccurately.
The patent "Infrared pedestrian detection method based on image-block deep learning features" (Chinese patent publication number CN106096561A, published November 9, 2016) slides over positive and negative samples of an infrared pedestrian dataset to extract small image blocks, clusters them, and trains a convolutional neural network for each block class, yielding a group of convolutional neural networks. At test time, this network group classifies candidate regions to complete infrared pedestrian detection. Although accurate, the method is computationally expensive because the network group contains multiple deep networks, making real-time operation difficult on embedded hardware.
The patent "Night pedestrian detection method based on infrared pedestrian brightness statistical features" (Chinese patent publication number CN104778453A, published July 15, 2015) constructs a brightness-histogram feature with discriminative voting-interval division, concatenates it with the histogram-of-oriented-gradients feature to form the final descriptor, and classifies candidate regions with Adaboost combined with decision trees. Although the algorithm runs in real time, its accuracy is poor because deep learning is not used for feature extraction.
In summary, although research on thermal-imaging-based vehicle-mounted pedestrian detection has made progress, practical applications urgently require further improvements in detection accuracy and real-time performance, and algorithms need to run in embedded systems rather than as simulations on personal computers.
Disclosure of Invention
The embodiments of the invention aim to provide an embedded far infrared pedestrian detection method combining image processing and deep learning, to address three problems of existing vehicle-mounted far infrared pedestrian detection methods: recognition accuracy that does not meet practical requirements, real-time performance that needs further improvement, and algorithms that normally do not run on embedded devices.
The embedded far infrared pedestrian detection method combining image processing and deep learning obtains pedestrian candidate regions with a fast local dual-threshold algorithm and a local sliding-window technique, then jointly classifies the candidate regions with a deep-learning dual classifier whose weights are learned by a support vector machine, and performs Kalman tracking of the detection results using the segmentation result as the observation value, thereby completing pedestrian detection. The method specifically comprises the following steps:
step one, obtaining a pedestrian candidate area by utilizing a rapid local double-threshold and local sliding window technology;
step two, carrying out joint classification of the candidate regions with the deep-learning dual classifier based on support-vector-machine-learned weights;
step three, carrying out Kalman tracking on the detection result by taking the segmentation result as an observation value;
the method is characterized in that the step one is that the selective search algorithm is combined with a local sliding window technology to obtain a preliminary candidate region, and then the selective search algorithm performs local sliding window on the basis of the preliminary candidate region so as to obtain a final candidate region, so that the defect that the current selective search algorithm cannot obtain all pedestrian candidate regions in various scenes is overcome; the local sliding window technology refers to that the sitting angular coordinates of each rectangular frame obtained by selective search are respectively according to 10 multiplied by 20 pixels by taking the left upper angular coordinates as the sitting angular coordinates of the sliding window 2 24×48 pixels 2 32×64 pixels 2 The local window size of 48 x 96 pixels is windowed to obtain the final infrared pedestrian candidate region. The method is characterized in that the deep learning double classifier joint classification in the second step refers to classification of candidate areas by an Alexnet network and a VGGnet network through weight joint; the learning weights based on the support vector machine refer to weights occupied by an Alexnet network and a VGGnet network respectively, which are obtained by learning the support vector machine.
The method is further characterized in that the segmentation result in step three refers to the result of the local adaptive dual-threshold segmentation obtained in step one; using the segmentation result as the observation value for Kalman tracking of the detection results means that the observation value required by the Kalman tracking algorithm is provided by the segmentation result.
Compared with existing pedestrian detection techniques based on vehicle-mounted far infrared cameras, the method has the following advantages and effects. Candidate regions are obtained by applying four-scale local sliding windows to the local dual-threshold segmentation result, which compensates for the shortcomings of current local dual-threshold segmentation of infrared images and yields higher-quality pedestrian candidate regions. Joint classification of the candidate regions with the deep-learning dual classifier, whose weights are learned by a support vector machine, fully exploits the complementary strengths of the different classifiers' feature extraction and classification, giving a more robust joint decision than existing single-classifier, single-feature methods; the weights of the two deep networks in the joint decision are obtained by support vector machine learning. Furthermore, in the tracking stage, the segmentation result obtained by the fast local dual threshold serves as the observation value for far infrared pedestrian tracking, which significantly improves tracking accuracy. In addition, the system runs in real time in an embedded system under various outdoor traffic scenes; tests in real scenes and varied weather show high accuracy that meets the requirements of practical application.
Drawings
FIG. 1 is a flow chart of the embedded far infrared pedestrian detection method for image processing and deep learning according to an embodiment of the present invention;
FIG. 2 is a structural schematic of the embedded far infrared pedestrian detection system for image processing and deep learning according to an embodiment of the present invention;
in the figure: A. a candidate region generation module; B. a candidate region classification training module; C. a pedestrian tracking module; D. and the classifier offline training module.
FIG. 3 is a diagram of an embodiment of a deep learning dual classifier structure based on support vector machine learning weights provided by an embodiment of the present invention;
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The principles of the invention will be further described with reference to the drawings and specific examples.
As shown in fig. 1, an embedded far infrared pedestrian detection method for image processing and deep learning according to an embodiment of the present invention includes the following steps:
s101, obtaining a pedestrian candidate region by utilizing a rapid local double-threshold and local sliding window technology;
s102, carrying out joint classification on candidate areas by adopting a deep learning double classifier based on a learning weight of a support vector machine;
s103, carrying out Kalman tracking on the detection result by taking the segmentation result as an observation value;
In step S101, after a preliminary candidate region is obtained with the fast local dual-threshold algorithm, a local sliding window is applied on its basis to obtain the final candidate region, compensating for the current fast local dual-threshold algorithm's inability to obtain all pedestrian candidate regions in varied scenes. The fast local dual-threshold algorithm computes a high threshold and a low threshold from the 24 pixels on the same horizontal line as each pixel to segment the image, and obtains the preliminary pedestrian candidate regions with a 4-connected-component labeling algorithm. The local sliding-window technique takes the top-left corner coordinate of each preliminary rectangular frame as the top-left corner of sliding windows of four sizes, 10×20, 24×48, 32×64, and 48×96 pixels, to obtain the final infrared pedestrian candidate regions.
Step S102, the deep learning double classifier joint classification refers to classifying candidate areas through weight combination by an Alexnet network and a VGGnet network; the learning weights based on the support vector machine refer to weights occupied by an Alexnet network and a VGGnet network respectively, which are obtained by learning the support vector machine.
The segmentation result in step S103 refers to the result of the local adaptive dual-threshold segmentation in step S101; using the segmentation result as the observation value for Kalman tracking of the detection results means that the observation value required by the Kalman tracking algorithm is provided by the segmentation result.
As shown in fig. 2, the embedded far infrared pedestrian detection system for image processing and deep learning in the embodiment of the invention mainly comprises a candidate region generation module A, a candidate region classification module B, a pedestrian tracking module C, and a classifier offline training module D.
The candidate region generation module A quickly and accurately acquires the pedestrian candidate regions by combining the fast local dual-threshold segmentation algorithm with the local sliding-window technique.
The candidate region classification module B, connected with the candidate region generation module A and the classifier offline training module D, performs online joint classification of the candidate regions using the two deep-learning classifiers and the decision weights obtained offline.
And the pedestrian tracking module C is used for tracking pedestrian targets obtained according to the deep learning classification by taking a segmentation result obtained by a local double-threshold algorithm as an observation value, so that a detection frame for pedestrians is more stable.
And the classifier offline training module D is used for collecting samples, offline training Alexnet and VGGnet deep learning network classifiers and offline determining weights of the two classifiers when in joint decision.
Specific examples of the invention:
the overall flow of the method is shown in fig. 1, and the main body of the method comprises three parts: 1. obtaining a pedestrian candidate region by utilizing a rapid local double-threshold and local sliding window technology; 2. carrying out joint classification on the candidate areas by adopting a deep learning double classifier based on the learning weight of the support vector machine; 3. and taking the segmentation result as an observation value to carry out Kalman tracking on the detection result. All algorithms of the present invention are implemented in the Nvidia JetsonTX2 embedded computer of inflight corporation.
1. Obtaining pedestrian candidate areas by using a rapid local double-threshold and local sliding window technology
The candidate region generation method first obtains low-precision candidate regions with the fast local dual-threshold algorithm specialized for far infrared pedestrian segmentation, then obtains the final far infrared pedestrian candidate regions from the top-left corner coordinates of all low-precision regions with the local sliding-window technique. The candidate region generation stage thus comprises two main steps. First: run the fast local dual-threshold algorithm on the original infrared image to obtain low-precision candidate regions. Second: obtain the infrared pedestrian candidate regions from the top-left corner coordinates of the low-precision regions with the local sliding-window technique.
1.1 Running the fast local dual-threshold segmentation algorithm on the original infrared image to obtain low-precision candidate regions
The fast local dual-threshold segmentation algorithm exploits the fact that, on the same horizontal line, pixels inside infrared pedestrians have a higher mean value than the surrounding pixels, and segments infrared pedestrians as follows: the original infrared image is taken as input, the image is segmented into a binary image, and the 4-connected components of the binary image are the low-precision candidate regions. Segmentation proceeds per pixel: for each pixel of the image (except the leftmost and rightmost 12 pixels), two segmentation thresholds are computed dynamically, a low threshold T_L from formula (1) and a high threshold T_H from formula (2):
T_H(i, j) = T_L(i, j) + θ (2)
If the pixel value is lower than T_L, the pixel is segmented as background; if it is higher than T_H, the pixel is segmented as foreground; pixels whose values fall within [T_L, T_H] are segmented as foreground. Here T_L(i, j) is the low threshold of the current pixel (i, j), T_H(i, j) is its high threshold, L is the width of the horizontal line through the current pixel, and θ is 8.
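As an illustrative sketch of the segmentation step above: formula (1) is not reproduced in this text, so the low threshold is assumed here to be the mean of the 24 same-row neighbours plus θ, and the treatment of values inside [T_L, T_H] follows the rule stated above; names and the exact form of (1) are assumptions, not the patented formula.

```python
THETA = 8    # offset between low and high thresholds (theta in formula (2))
HALF = 12    # 12 neighbours on each side -> 24 row pixels per threshold

def local_dual_threshold(img):
    """Per-pixel dual-threshold segmentation of a grayscale image
    (given as a list of rows). Returns a binary map of the same size."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        row = img[i]
        for j in range(HALF, w - HALF):   # leftmost/rightmost 12 pixels skipped
            neigh = row[j - HALF:j] + row[j + 1:j + 1 + HALF]
            t_low = sum(neigh) / len(neigh) + THETA   # assumed form of formula (1)
            t_high = t_low + THETA                    # formula (2)
            v = row[j]
            if v < t_low:
                out[i][j] = 0             # background
            elif v > t_high:
                out[i][j] = 1             # foreground
            else:
                out[i][j] = 1             # in-band pixels kept as foreground
    return out
```

A 4-connected-component labeling pass over the returned binary map would then yield the low-precision candidate regions.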
1.2 acquiring an infrared pedestrian candidate region by utilizing a local sliding window technology according to the left upper corner coordinates of the low-precision candidate region
In the invention, the candidate regions obtained by the fast local dual-threshold segmentation algorithm are preliminary regions of low precision. On their basis, the invention applies a local sliding window to obtain the final candidate regions, compensating for the algorithm's inability to obtain all pedestrian candidate regions in varied scenes. Specifically, the top-left corner coordinate of each preliminary rectangular frame is taken as the top-left corner of local windows of four sizes, 10×20, 24×48, 32×64, and 48×96 pixels, yielding the final infrared pedestrian candidate regions in preparation for subsequent candidate-region feature extraction.
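The four-scale window generation can be sketched as follows; the function name and the clipping of windows that fall outside the image are illustrative assumptions.

```python
# The four window sizes (width, height) in pixels stated in the text.
SCALES = [(10, 20), (24, 48), (32, 64), (48, 96)]

def candidate_windows(top_lefts, img_w, img_h):
    """For each preliminary region's top-left corner (x, y), emit the four
    fixed-size candidate windows anchored at that corner, discarding any
    window that would extend beyond the image (assumed boundary handling)."""
    boxes = []
    for (x, y) in top_lefts:
        for (w, h) in SCALES:
            if x + w <= img_w and y + h <= img_h:
                boxes.append((x, y, w, h))
    return boxes
```

Each returned box is then cropped from the infrared frame and passed to the dual classifier.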
2. The candidate areas are subjected to joint classification by adopting a deep learning dual classifier based on the learning weight of the support vector machine
Joint classification with the deep-learning dual classifier based on support-vector-machine-learned weights comprises two parts: training-sample preparation with offline training of the dual classifier, and support-vector-machine learning of the dual classifier's decision weights with joint online detection.
2.1 training sample preparation and Dual classifier offline training
1) Training sample preparation
Data covering expressway, national road, urban, and suburban scenes were collected with a vehicle-mounted far infrared camera, yielding 300 hours of video, from which frames were randomly sampled. From 1,000,000 original infrared images, all pedestrians appearing were manually annotated to obtain the positive samples; the positive samples from 500,000 annotated images form dataset Dataset1, and those from the other 500,000 form Dataset2. From 100,000 far infrared images containing no pedestrians, non-pedestrian samples were obtained with the candidate-region method of step one, i.e., the fast dual-threshold segmentation algorithm and local sliding-window technique, forming the non-pedestrian dataset Dataset3. All pedestrian images of Dataset1 together with all non-pedestrian images of Dataset3 form Dataset4; all pedestrian images of Dataset2 together with all non-pedestrian images of Dataset3 form Dataset5.
2) Offline training of double classifiers
The dual classifier of the invention comprises an Alexnet deep convolutional neural network and a VGGnet deep convolutional neural network. On Dataset4, the Alexnet and VGGnet networks already trained on the ImageNet dataset were each fine-tuned. The hyperparameters were set as follows: (1) optimizer: the adaptive optimization algorithm Adam; (2) learning rate: 0.01; (3) batch size: 32; (4) images: single-channel grayscale; (5) no dropout; (6) data augmentation of the original images: translation and horizontal flipping; (7) input images scaled to 224×224 with bilinear interpolation. The VGGnet of the invention is the VGG19 network; the specific network structure is shown in Table 1.
TABLE 1 VGGnet network Structure diagram (VGG 19-net)
Where "conv" denotes a convolution operation, "relu" a ReLU (rectified linear unit) activation layer, "fc" a fully connected operation, and "prob" the softmax classifier output.
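Table 1 itself is not reproduced in this text. The published VGG19 configuration, which the table presumably follows, can be written out as below; the final fc layer sized for the two-class pedestrian task is an assumption.

```python
# Standard VGG19 layer sequence ("conv3-N" = 3x3 convolution with N channels);
# the fc-2 output for pedestrian / non-pedestrian is assumed, since the
# original ImageNet VGG19 ends in fc-1000.
VGG19_LAYERS = (
    ["conv3-64"] * 2 + ["maxpool"]
    + ["conv3-128"] * 2 + ["maxpool"]
    + ["conv3-256"] * 4 + ["maxpool"]
    + ["conv3-512"] * 4 + ["maxpool"]
    + ["conv3-512"] * 4 + ["maxpool"]
    + ["fc-4096", "fc-4096", "fc-2", "prob(softmax)"]
)

def count_weight_layers(layers):
    """Count the layers with learnable weights (conv + fc); VGG19 owes its
    name to having 19 of them."""
    return sum(1 for layer in layers if layer.startswith(("conv", "fc")))
```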
Table 2 Alexnet network structure diagram.
2.2 support vector machine learning decision weights for Dual classifiers and Dual classifier Joint Online detection
The classifier results of the two classifiers Alexnet and VGGnet are combined for classification to finish classification of all candidate areas, and fusion is carried out in a weighting mode. The specific weight value obtaining method is obtained through learning of a nonlinear support vector machine. More specifically, for any sample S of Dataset5, classification is performed using a trained Alexnet classifier, assuming that the output Score of the classifier is Score 1 The method comprises the steps of carrying out a first treatment on the surface of the Classifying with a trained VGGnet classifier, assuming that the output Score of the classifier is Score 2 . Will (Score) 1 ,Score 2 ) Forming new characteristics, representing the new characteristics of the sample S, training a linear support vector machine classifier together with the original labels of the sample S, thereby obtaining decision weights w when the double-classifier Alexnet and VGGnet classifier are used for joint classification 1 And w 2 And a bias b. And (3) completing joint classification of the candidate areas according to a formula (3).
Score = w₁ × Score₁ + w₂ × Score₂ + b (3)
where Score is the final output of the double-classifier joint classification: when Score > 0, the joint classification result is pedestrian; otherwise it is non-pedestrian.
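Formula (3) and its decision rule can be sketched as follows; the weight values shown are hypothetical stand-ins for whatever the linear support vector machine actually learns:

```python
def joint_classify(score1, score2, w1, w2, b):
    """Formula (3): fused decision of the Alexnet/VGGnet double classifier.
    Returns True ('pedestrian') when the fused score is positive."""
    score = w1 * score1 + w2 * score2 + b
    return score > 0

# Hypothetical weights standing in for the SVM-learned values w1, w2, b.
W1, W2, B = 0.6, 0.5, -0.55
```

At run time, each candidate region is scored by both networks and the pair of scores is passed through this single linear decision.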
3. Carrying out Kalman tracking on the detection result by taking the segmentation result as an observation value
The Kalman tracking algorithm corrects the predicted estimate of the state variables with observation data to obtain the optimal estimate of the state variables. When it is used for multi-target pedestrian tracking, it can directly give the possible position of each pedestrian in the next frame; by matching the pedestrian targets of the previous frame against the image at the predicted position, the detection position of each pedestrian in the next frame can be located, compensating for possible missed detections. Considering that the Kalman observation value has a large influence on tracking accuracy, and that the local double-threshold segmentation algorithm of the invention generally yields an accurate segmentation result, the invention proposes using the segmentation result as the observation value of the conventional Kalman algorithm, so as to obtain a more accurate Kalman prediction. Specifically, the center position of a pedestrian target obtained through multi-frame verification (a candidate region detected as a pedestrian in three consecutive frames is regarded as a pedestrian target), together with the height and width of its detection frame, is tracked, so that the state vector of the pedestrian is expressed as formula (4).
Xₜ = (xₜ, yₜ, hₜ, wₜ, Δxₜ, Δyₜ, Δhₜ, Δwₜ)ᵀ (4)
where (xₜ, yₜ) are the center coordinates of the pedestrian detection frame in frame t, and (hₜ, wₜ) are the height and width of the detection frame in frame t; (Δxₜ, Δyₜ) is the change in the center point of the detection frame, and (Δhₜ, Δwₜ) is the change in the height and width of the detection frame. Since the frame rate of the video is 25 frames per second, the motion of a pedestrian's rectangular frame between two adjacent frames can be regarded as uniform motion; the Kalman state transition matrix Ω is expressed as formula (5), and the system measurement matrix H as formula (6).
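Formulas (5) and (6) are referenced but not reproduced in this text. Under the uniform-motion assumption stated above, Ω would take the standard constant-velocity form and H would select the four observed components; the sketch below is that standard form, not necessarily the patent's exact matrices:

```python
N = 8  # state (x, y, h, w, dx, dy, dh, dw), as in formula (4)

# Constant-velocity transition matrix: identity plus coupling of each
# position/size component to its delta (x' = x + dx, etc.).
Omega = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
for i in range(4):
    Omega[i][i + 4] = 1.0

# Measurement matrix: the segmentation observation supplies only
# (x, y, h, w), so H selects the first four state components.
H = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(4)]

def predict_mean(state):
    """One Kalman prediction step for the state mean: X_pred = Omega @ X."""
    return [sum(Omega[i][j] * state[j] for j in range(N)) for i in range(N)]
```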
The invention uses the rapid double-threshold segmentation result as the observation value of the conventional Kalman algorithm. To find the observation value corresponding to a detection result, matching is performed with the nearest-neighbor method of formula (7). When no match can be made according to formula (7), the Kalman predicted value is used directly as the observation value to complete the update of the Kalman tracker.
|x₁ − x₂| < T₁ && |y₁ − y₂| < T₂ && |w₁ − w₂| < T₁ && |h₁ − h₂| < T₂ (7)
where w₁ and h₁ are the width and height of a detection-frame rectangle whose center point is (x₁, y₁); w₂ and h₂ are the width and height of a rectangle from the rapid local double-threshold segmentation result, whose center point is (x₂, y₂); and T₁ and T₂ (both set to 7) are the nearest-neighbor distance thresholds in the transverse and longitudinal directions, respectively.
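The matching and fallback logic of this step can be sketched as follows. The published formula (7) writes T₁ in all four comparisons while the text distinguishes the transverse threshold T₁ from the longitudinal threshold T₂; the sketch follows the text (with both thresholds equal to 7 the two readings coincide). Function names and the (x, y, w, h) tuple layout are assumptions:

```python
def nn_match(det, seg, t1=7, t2=7):
    """Formula (7): a detection box matches a segmentation box when their
    centres, widths and heights differ by less than the transverse (t1)
    and longitudinal (t2) thresholds.  Boxes are (x, y, w, h) tuples."""
    x1, y1, w1, h1 = det
    x2, y2, w2, h2 = seg
    return (abs(x1 - x2) < t1 and abs(y1 - y2) < t2 and
            abs(w1 - w2) < t1 and abs(h1 - h2) < t2)

def pick_observation(det, seg_boxes, t1=7, t2=7):
    """Return the first matching segmentation box as the Kalman observation;
    fall back to the detection itself (the predicted value) when none match."""
    for seg in seg_boxes:
        if nn_match(det, seg, t1, t2):
            return seg
    return det
```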

Claims (2)

1. The embedded far infrared pedestrian detection method for image processing and deep learning is characterized in that pedestrian candidate regions are obtained using a rapid local double-threshold and local sliding window technique, the candidate regions are then jointly classified by a deep learning double classifier based on support-vector-machine-learned weights, and Kalman tracking of the detection result is carried out with the segmentation result as the observation value, thereby completing pedestrian detection; the method specifically comprises: step one, obtaining pedestrian candidate regions using the rapid local double-threshold and local sliding window technique; step two, jointly classifying the candidate regions with the deep learning double classifier based on support-vector-machine-learned weights; step three, carrying out Kalman tracking on the detection result with the segmentation result as the observation value; in step one, the rapid local double-threshold and local sliding window technique means that after the rapid local double-threshold algorithm obtains a preliminary candidate region, a local sliding window is applied on the basis of the preliminary candidate region to obtain the final candidate region, overcoming the defect that the current rapid local double-threshold algorithm cannot obtain all pedestrian candidate regions in all scenes; the rapid local double-threshold algorithm means that a high threshold and a low threshold are computed from the nearest 24 pixels on the same horizontal line as each pixel, thereby realizing image segmentation, and a preliminary pedestrian candidate region is obtained with a 4-connected-region labeling algorithm; the local sliding window technique means that, at the upper-left corner coordinates of each rectangular frame obtained by selective search, windows of 10×20, 24×48, 32×64 and 48×96 pixels are slid to obtain the final infrared pedestrian candidate region; in step two, deep learning double-classifier joint classification means that an Alexnet network and a VGGnet network classify the candidate regions through a weighted combination; the support-vector-machine-learned weights are the respective weights of the Alexnet network and the VGGnet network obtained by support vector machine learning.
2. The embedded far infrared pedestrian detection method for image processing and deep learning according to claim 1, characterized in that the segmentation result in step three is the segmentation result obtained by the local adaptive double-threshold segmentation in step one; carrying out Kalman tracking on the detection result with the segmentation result as the observation value means that the observation value required by the Kalman tracking algorithm is provided by the segmentation result.
CN201910745838.6A 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method Active CN110837769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910745838.6A CN110837769B (en) 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method


Publications (2)

Publication Number Publication Date
CN110837769A CN110837769A (en) 2020-02-25
CN110837769B (en) 2023-08-29

Family

ID=69573984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910745838.6A Active CN110837769B (en) 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method

Country Status (1)

Country Link
CN (1) CN110837769B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626334B (en) * 2020-04-28 2023-07-14 东风汽车集团有限公司 Key control target selection method for vehicle-mounted advanced auxiliary driving system
CN111814755A (en) * 2020-08-18 2020-10-23 深延科技(北京)有限公司 Multi-frame image pedestrian detection method and device for night motion scene
CN114255373B (en) * 2021-12-27 2024-02-02 中国电信股份有限公司 Sequence anomaly detection method, device, electronic equipment and readable medium

Citations (9)

Publication number Priority date Publication date Assignee Title
CN103198332A (en) * 2012-12-14 2013-07-10 华南理工大学 Real-time robust far infrared vehicle-mounted pedestrian detection method
CN104091171A (en) * 2014-07-04 2014-10-08 华南理工大学 Vehicle-mounted far infrared pedestrian detection system and method based on local features
CN106096561A (en) * 2016-06-16 2016-11-09 重庆邮电大学 Infrared pedestrian detection method based on image block degree of depth learning characteristic
CN106156401A (en) * 2016-06-07 2016-11-23 西北工业大学 Data-driven system state model on-line identification methods based on many assembled classifiers
CN107563347A (en) * 2017-09-20 2018-01-09 南京行者易智能交通科技有限公司 A kind of passenger flow counting method and apparatus based on TOF camera
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN108460336A (en) * 2018-01-29 2018-08-28 南京邮电大学 A kind of pedestrian detection method based on deep learning
US10108867B1 (en) * 2017-04-25 2018-10-23 Uber Technologies, Inc. Image-based pedestrian detection
CN109886245A (en) * 2019-03-02 2019-06-14 山东大学 A kind of pedestrian detection recognition methods based on deep learning cascade neural network

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
AU2003295318A1 (en) * 2002-06-14 2004-04-19 Honda Giken Kogyo Kabushiki Kaisha Pedestrian detection and tracking with night vision
KR101543105B1 (en) * 2013-12-09 2015-08-07 현대자동차주식회사 Method And Device for Recognizing a Pedestrian and Vehicle supporting the same
WO2016094330A2 (en) * 2014-12-08 2016-06-16 20/20 Genesystems, Inc Methods and machine learning systems for predicting the liklihood or risk of having cancer



Similar Documents

Publication Publication Date Title
CN111209810B (en) Boundary frame segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time through visible light and infrared images
CN110175576B (en) Driving vehicle visual detection method combining laser point cloud data
CN109447018B (en) Road environment visual perception method based on improved Faster R-CNN
CN110443827B (en) Unmanned aerial vehicle video single-target long-term tracking method based on improved twin network
CN106096561B (en) Infrared pedestrian detection method based on image block deep learning features
CN105046196B (en) Front truck information of vehicles structuring output method based on concatenated convolutional neutral net
CN110837769B (en) Image processing and deep learning embedded far infrared pedestrian detection method
CN106919902B (en) Vehicle identification and track tracking method based on CNN
CN111368830B (en) License plate detection and recognition method based on multi-video frame information and kernel correlation filtering algorithm
CN110287826B (en) Video target detection method based on attention mechanism
CN105989334B (en) Road detection method based on monocular vision
CN111340855A (en) Road moving target detection method based on track prediction
CN111160212B (en) Improved tracking learning detection system and method based on YOLOv3-Tiny
CN110781744A (en) Small-scale pedestrian detection method based on multi-level feature fusion
CN111695514A (en) Vehicle detection method in foggy days based on deep learning
Xiao et al. Real-time object detection algorithm of autonomous vehicles based on improved yolov5s
CN109800714A (en) A kind of ship detecting system and method based on artificial intelligence
CN111915583A (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN116434159A (en) Traffic flow statistics method based on improved YOLO V7 and Deep-Sort
Tarchoun et al. Hand-Crafted Features vs Deep Learning for Pedestrian Detection in Moving Camera.
Ghahremannezhad et al. Automatic road detection in traffic videos
CN113221739B (en) Monocular vision-based vehicle distance measuring method
CN114743126A (en) Lane line sign segmentation method based on graph attention machine mechanism network
CN109215059A (en) Local data&#39;s correlating method of moving vehicle tracking in a kind of video of taking photo by plane
CN117036412A (en) Twin network infrared pedestrian target tracking method integrating deformable convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230803

Address after: One of the fourth floors, No. 107, Jucheng Avenue East, Xiaolan Town, Zhongshan City, Guangdong Province, 528400

Applicant after: Zhongshan sanzhuo Intelligent Technology Co.,Ltd.

Address before: Unit C403A, No. 205 Changfu Road, Tianhe District, Guangzhou City, Guangdong Province, 510000 (for office use only) (not intended for use as a factory building)

Applicant before: Guangzhou Sanmu Intelligent Technology Co.,Ltd.

GR01 Patent grant