CN110837769A - Embedded far infrared pedestrian detection method based on image processing and deep learning - Google Patents



Publication number
CN110837769A
Authority
CN
China
Prior art keywords
pedestrian
local
threshold
candidate region
dual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910745838.6A
Other languages
Chinese (zh)
Other versions
CN110837769B (en)
Inventor
郑永森
王国华
李进业
周殿清
周伟滨
林琳
李卓思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshan Sanzhuo Intelligent Technology Co ltd
Original Assignee
Guangzhou Sanmu Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Sanmu Intelligent Technology Co Ltd filed Critical Guangzhou Sanmu Intelligent Technology Co Ltd
Priority to CN201910745838.6A priority Critical patent/CN110837769B/en
Publication of CN110837769A publication Critical patent/CN110837769A/en
Application granted granted Critical
Publication of CN110837769B publication Critical patent/CN110837769B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an embedded far-infrared pedestrian detection method based on image processing and deep learning. The method obtains pedestrian candidate regions with a fast local dual-threshold and local sliding window technique; classifies the candidate regions with a classifier based on the joint classification of an Alexnet network and a VGGnet deep network to obtain pedestrian detection frames; and, on this basis, performs Kalman tracking of the detection results with the fast local dual-threshold segmentation result as the observation value. The system comprises: a candidate region generation module that obtains pedestrian candidate regions with the fast local dual-threshold and local sliding window technique; a candidate region classification module that classifies the candidate regions with a classifier based on Alexnet and VGGnet deep-network joint classification; an offline training module for training the Alexnet and VGGnet deep networks and for learning the Alexnet and VGGnet network weights with a support vector machine; and a pedestrian tracking module that performs Kalman tracking of the detection results with the fast local dual-threshold segmentation result as the observation value. The method balances the accuracy and real-time performance of pedestrian detection, can run on embedded hardware, and can be used to realize a driver-assistance system based on night-time pedestrian detection with a vehicle-mounted camera.

Description

Embedded far infrared pedestrian detection method based on image processing and deep learning
Technical Field
The invention belongs to the field of driver-assistance systems based on computer vision, pattern recognition, and image processing, and particularly relates to an embedded far-infrared pedestrian detection method based on image processing and deep learning.
Background
During everyday driving, a driver's field of view and visibility are easily impaired at night, in bad weather, and under strong or rapidly changing light. If a sensor device can improve the driver's field of view and visibility and detect pedestrians on the road, traffic accidents can be effectively prevented. Research on vehicle-mounted far-infrared pedestrian detection algorithms is the key to achieving these effects. Because far infrared images by temperature difference, it works effectively at night, in severe weather, and under strong light; research on thermal-imaging-based vehicle-mounted pedestrian detection is therefore key to safeguarding pedestrians on the road during driving, and has great research and social value.
One prior study (Infrared pedestrian detection research based on candidate region enumeration [J]. Journal of Huaibei Normal University (Natural Science Edition), 2019, 40(1):73-80) obtained a segmentation result with a selective search algorithm, merged the segments using prior knowledge to obtain candidate regions, and, on that basis, applied an Adaboost classifier over integral channel features to realize far-infrared pedestrian detection. Although the method achieves good real-time performance, it extracts infrared pedestrian features with a traditional feature-extraction method rather than deep learning, so its accuracy is low.
Shi Yongbiao et al. (An infrared pedestrian detection method based on aggregate channel features [J]. Infrared, 2018, 39(05):44-50) use an Adaboost classifier in the classification stage to realize far-infrared pedestrian detection. Because only one classifier completes the detection, high accuracy is difficult to achieve in the complex and varied outdoor scenes of vehicle-mounted use. The invention instead proposes joint decision by multiple classifiers, with the weight of each classifier not set manually but learned by a support vector machine.
Wang et al. (An improved YOLOv3 pedestrian detection algorithm for infrared video images [J]. Journal of Xi'an University of Posts and Telecommunications, 2018, 23(04):52-56) improve the end-to-end deep target detection network YOLOv3 by dimension-clustering analysis of the target candidate frames of an infrared image dataset, adjusting the classification-network pre-training process, and multi-scale network training, thereby obtaining higher accuracy. However, YOLOv3's inherent drawbacks of inaccurate pedestrian localization and low accuracy on distant targets remain difficult to avoid. The method therefore detects distant pedestrian targets poorly at high vehicle speed, and estimates the pedestrian-vehicle distance with low accuracy.
The patent "An infrared pedestrian detection method based on image-block deep learning features" (Chinese patent grant publication No. CN106096561A, grant publication date: November 9, 2016) extracts small image blocks by sliding over the positive and negative samples of an infrared pedestrian dataset, clusters the small blocks, and trains a convolutional neural network for each block class, thereby obtaining a convolutional neural network group. During testing, the obtained network group classifies the candidate regions to complete infrared pedestrian detection. Although the method is accurate, the network group contains several deep networks, so the computational cost is high and real-time performance is difficult to guarantee on embedded hardware.
The patent "A night pedestrian detection method based on statistical brightness features of infrared pedestrians" (Chinese patent grant publication No. CN104778453A, grant publication date: July 15, 2015) constructs a brightness-histogram feature with discriminative voting-interval division, concatenates it with histogram-of-oriented-gradients features to form the final descriptor, and classifies candidate regions with Adaboost combined with decision trees to complete pedestrian detection. Although the algorithm has good real-time performance, its accuracy is poor because features are not extracted with deep learning.
In summary, although research on thermal-imaging-based vehicle-mounted pedestrian detection has achieved certain results, further improvements in detection accuracy and real-time performance are urgently needed to meet practical requirements, and an algorithm implemented on an embedded system rather than simulated on a personal computer is needed.
Disclosure of Invention
The embodiment of the invention aims to provide an embedded far-infrared pedestrian detection method based on image processing and deep learning, so as to solve the problems that the recognition accuracy of existing pedestrian detection methods based on a vehicle-mounted far-infrared camera does not meet practical requirements, that real-time performance needs further improvement, and that such algorithms do not usually run on embedded devices.
The embedded far-infrared pedestrian detection method obtains pedestrian candidate regions with a fast local dual-threshold and local sliding window technique, then jointly classifies the candidate regions with a deep learning dual classifier whose weights are learned by a support vector machine, and performs Kalman tracking of the detection results with the segmentation result as the observation value to complete pedestrian detection. It specifically comprises the following steps:
Step one, acquiring pedestrian candidate regions with the fast local dual-threshold and local sliding window technique;
Step two, jointly classifying the candidate regions with the deep learning dual classifier based on support-vector-machine-learned weights;
Step three, performing Kalman tracking of the detection results with the segmentation result as the observation value.
further, the embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, characterized in that the selective search algorithm in step one is combined with a local sliding window technique, and after the selective search algorithm obtains a preliminary candidate region, local sliding window is performed on the basis of the preliminary candidate region, so as to obtain a final candidate region, thereby making up for the defect that the current selective search algorithm cannot obtain all pedestrian candidate regions in various scenes; the local sliding window technology refers to that the sitting corner coordinate of each rectangular frame obtained by selective search is respectively 10 multiplied by 20 pixels by taking the upper left corner coordinate as the sitting corner coordinate of the sliding window224 x 48 pixels232 x 64 pixels2The local window size of 48 × 96 pixels is subjected to sliding window to obtain a final infrared pedestrian candidate region.
Further, in the embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, the deep learning dual-classifier joint classification in step two classifies the candidate regions through a weighted combination of an Alexnet network and a VGGnet network; the support-vector-machine-learned weights means that the weights of the Alexnet network and the VGGnet network are obtained by support vector machine learning.
Further, in the embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, the segmentation result in step three is the segmentation result obtained by the local adaptive dual-threshold segmentation in step one; performing Kalman tracking of the detection results with the segmentation result as the observation value means that the observation value required by the Kalman tracking algorithm is provided by the segmentation result.
Compared with existing pedestrian detection technology based on a vehicle-mounted far-infrared camera, the vehicle-mounted far-infrared pedestrian detection method of the invention has the following advantages and effects. The candidate regions are obtained by running local sliding windows of four scales on the local dual-threshold segmentation result, which overcomes the shortcomings of current local dual-threshold segmentation of infrared images and yields higher-quality pedestrian candidate regions. The deep learning dual classifier with support-vector-machine-learned weights is designed for joint classification of the candidate regions; compared with existing single-classifier, single-feature methods, it can fully exploit the respective advantages of different classifiers in feature extraction and classification and obtain a more robust classification result through joint decision, and the weights of the two deep networks in the joint decision are themselves obtained by learning. Furthermore, in the tracking stage, the invention proposes using the segmentation result obtained by the fast local dual threshold as the observation value for far-infrared pedestrian tracking, which markedly improves tracking accuracy. In addition, the system runs in real time on an embedded system in various outdoor traffic scenes, and tests in real scenes and various weather conditions show that it is accurate and meets the requirements of practical application.
Drawings
Fig. 1 is a flowchart of the embedded far-infrared pedestrian detection method based on image processing and deep learning according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an embedded far-infrared pedestrian detection method based on image processing and deep learning according to an embodiment of the present invention;
in the figure: A. a candidate region generation module; B. a candidate region classification training module; C. a pedestrian tracking module; D. and a classifier offline training module.
FIG. 3 is a structural diagram of the deep learning dual classifier based on support-vector-machine-learned weights according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The application of the principles of the present invention will be further described with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, an embedded far infrared pedestrian detection method with image processing and deep learning according to an embodiment of the present invention includes the following steps:
s101, acquiring a pedestrian candidate area by using a rapid local double threshold and local sliding window technology;
s102, performing combined classification on the candidate regions by adopting a deep learning double classifier based on the learning weight of the support vector machine;
s103, performing Kalman tracking on the detection result by taking the segmentation result as an observation value;
In step S101, the fast local dual-threshold and local sliding window technique means that, after the fast local dual-threshold algorithm obtains preliminary candidate regions, local sliding windows are run on them to obtain the final candidate regions, making up for the defect that the current fast local dual-threshold algorithm cannot obtain all pedestrian candidate regions in varied scenes. The fast local dual-threshold algorithm computes a high threshold and a low threshold for each pixel from the 24 pixels on the same horizontal line, realizing image segmentation, and obtains the preliminary pedestrian candidate regions through a 4-connected region labeling algorithm. The local sliding window technique means that, for each rectangular frame obtained, its top-left corner coordinate is taken as the top-left corner of the sliding window, and local windows of 10×20, 24×48, 32×64, and 48×96 pixels are slid to obtain the final infrared pedestrian candidate regions.
The deep learning dual-classifier joint classification in step S102 classifies the candidate regions through a weighted combination of the Alexnet network and the VGGnet network; the support-vector-machine-learned weights means that the weights of the Alexnet network and the VGGnet network are obtained by support vector machine learning.
The segmentation result in step S103 is the segmentation result obtained by the local adaptive dual-threshold segmentation in step S101; performing Kalman tracking of the detection results with the segmentation result as the observation value means that the observation value required by the Kalman tracking algorithm is provided by the segmentation result.
As shown in fig. 2, an embedded far-infrared pedestrian detection method of image processing and deep learning according to an embodiment of the present invention mainly includes a candidate region generation module a; a candidate region classification training module B; a pedestrian tracking module C; and a classifier off-line training module D.
The candidate region generation module A combines the fast local dual-threshold segmentation algorithm with the local sliding window technique to acquire pedestrian candidate regions quickly and accurately.
The candidate region classification module B is connected with the candidate region generation module A and the classifier offline training module D, and performs online joint classification of the candidate regions with the deep-learning-based dual classifiers and the learned decision weights.
The pedestrian tracking module C tracks the pedestrian targets obtained by deep learning classification, taking the segmentation result of the local dual-threshold algorithm as the observation value, so that the pedestrian detection frames are more stable.
The classifier offline training module D collects samples, trains the Alexnet and VGGnet deep learning network classifiers offline, and determines offline the weights of the two classifiers in the joint decision process.
The specific embodiment of the invention:
the overall flow of the method of the invention is shown in figure 1, and the main body of the method of the invention comprises three parts: 1. acquiring a pedestrian candidate area by using a rapid local double threshold and local sliding window technology; 2. performing joint classification on the candidate regions by adopting a deep learning double classifier based on the learning weight of the support vector machine; 3. and taking the segmentation result as an observation value to carry out Kalman tracking on the detection result. All algorithms of the present invention are implemented in Nvidia Jetson TX2 embedded computer, engida.
1. Pedestrian candidate area acquisition by using fast local dual-threshold and local sliding window technology
The candidate region generation method first obtains low-precision candidate regions with a fast local dual-threshold algorithm specialized for far-infrared pedestrian segmentation, and then, using the top-left corner coordinates of all the low-precision candidate regions, obtains the final far-infrared pedestrian candidate regions with a local sliding window technique. Through these two main steps, pedestrian candidate regions are acquired with the fast local dual-threshold and local sliding window technique. The candidate region generation stage therefore comprises two steps. First: execute the fast local dual-threshold algorithm on the original infrared image to obtain low-precision candidate regions. Second: acquire the infrared pedestrian candidate regions with the local sliding window technique, using the top-left corner coordinates of the low-precision candidate regions.
1.1 Executing the fast local dual-threshold segmentation algorithm on the original infrared image to obtain low-precision candidate regions
The fast local dual-threshold segmentation algorithm exploits the fact that pixels inside an infrared pedestrian are brighter than the average of the surrounding pixels on the same horizontal line, and thereby segments the infrared pedestrians. The specific execution steps are: take the original infrared image as input and segment it to obtain a binary image; each 4-connected region of the binary image is a low-precision candidate region. The specific steps for performing the image segmentation are as follows: for each pixel of the image (except the 12 leftmost and 12 rightmost pixels of each row), two segmentation thresholds are computed dynamically according to equations (1) and (2): equation (1) gives the low threshold T_L and equation (2) gives the high threshold T_H. If the pixel value of the current pixel is below T_L, the pixel is labeled background; if above T_H, the pixel is labeled foreground; if the pixel value lies in [T_L, T_H], the label is decided by the segmentation result of the pixel to its left: when the left neighbour is foreground the current pixel is also labeled foreground, otherwise it is labeled background.
T_L(i, j) = (1/(2L)) × Σ_{k=1}^{L} [ I(i, j−k) + I(i, j+k) ]  (1)
T_H(i, j) = T_L(i, j) + θ  (2)
where T_L(i, j) is the low threshold of the current pixel (i, j), T_H(i, j) is the high threshold of the current pixel (i, j), I(i, j) is the gray value at (i, j), L is the half-width of the window along the same horizontal line (L = 12, i.e. 24 neighbouring pixels), and θ has a value of 8.
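The row-wise dual-threshold rule of equations (1) and (2), together with the 4-connected region labeling that turns the binary mask into preliminary boxes, can be sketched as follows. This is a minimal NumPy sketch, not the patent's implementation: the function names, the blob-to-bounding-box conversion, and the interpretation of T_L as the mean of the 2L horizontal neighbours are assumptions consistent with the surrounding description.

```python
import numpy as np
from collections import deque

def local_dual_threshold(img, L=12, theta=8):
    """Per-pixel dual-threshold segmentation of a grayscale far-infrared image.
    T_L is the mean of the L pixels left and L pixels right of the current
    pixel on the same row (cf. equation (1)); T_H = T_L + theta (equation (2))."""
    img = img.astype(np.float32)
    h, w = img.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        row = img[i]
        for j in range(L, w - L):              # skip the 12 leftmost/rightmost pixels
            t_low = (row[j - L:j].sum() + row[j + 1:j + L + 1].sum()) / (2 * L)
            t_high = t_low + theta
            if row[j] > t_high:
                mask[i, j] = 1                 # foreground
            elif row[j] >= t_low:
                mask[i, j] = mask[i, j - 1]    # inherit the left neighbour's label
            # below t_low: stays background (0)
    return mask

def connected_regions(mask):
    """Bounding boxes (x, y, w, h) of 4-connected foreground regions (BFS flood fill)."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                q = deque([(i, j)]); seen[i, j] = True
                y0, y1, x0, x1 = i, i, j, j
                while q:
                    y, x = q.popleft()
                    y0, y1 = min(y0, y), max(y1, y)
                    x0, x1 = min(x0, x), max(x1, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes
```

On a uniform background, warm pixels well above their horizontal neighbourhood mean are marked foreground and grouped into one box per blob.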
1.2 Acquiring the infrared pedestrian candidate regions with the local sliding window technique, using the top-left corner coordinates of the low-precision candidate regions
The candidate regions obtained by the fast local dual-threshold segmentation algorithm are preliminary, low-precision candidate regions. On this basis, the invention runs local sliding windows over them to obtain the final candidate regions, making up for the defect that the current fast local dual-threshold segmentation algorithm cannot obtain all pedestrian candidate regions in varied scenes. Specifically, for each rectangular frame obtained, its top-left corner coordinate is taken as the top-left corner of the sliding window, and local windows of 10×20, 24×48, 32×64, and 48×96 pixels are slid to obtain the final infrared pedestrian candidate regions, in preparation for subsequent candidate-region feature extraction.
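The expansion of each preliminary box into fixed-size local windows can be sketched as below. The four window sizes come from the text; the sliding stride and the ±stride local search range around the top-left corner are illustrative assumptions, since the patent does not state them here.

```python
def local_sliding_windows(boxes, img_w, img_h,
                          sizes=((10, 20), (24, 48), (32, 64), (48, 96)),
                          stride=4):
    """Expand each preliminary box (x, y, w, h) into candidate windows of the
    four fixed pedestrian sizes, sliding locally around its top-left corner.
    stride and the +/-stride search range are illustrative assumptions."""
    candidates = []
    for (x, y, _, _) in boxes:
        for (w, h) in sizes:
            for dx in range(-stride, stride + 1, stride):
                for dy in range(-stride, stride + 1, stride):
                    nx, ny = x + dx, y + dy
                    # keep only windows fully inside the image
                    if nx >= 0 and ny >= 0 and nx + w <= img_w and ny + h <= img_h:
                        candidates.append((nx, ny, w, h))
    return candidates
```

Each surviving window is then passed on to the classification stage as one candidate region.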
2. Joint classification of candidate regions with the deep learning dual classifier based on support-vector-machine-learned weights
The deep learning dual classifier with support-vector-machine-learned weights comprises two parts: (1) training-sample preparation and offline training of the dual classifiers; (2) support-vector-machine learning of the decision weights and joint online detection of the dual classifiers.
2.1 Training sample preparation and dual-classifier offline training
1) Training sample preparation
Data from expressway, national-road, urban, and suburban scenes were collected with a vehicle-mounted far-infrared camera, yielding some 300 hours of video, from which pictures were obtained by random sampling. In total 1 million original infrared images were obtained and all pedestrians appearing in them were manually labeled; the positive samples from 500,000 of the labeled images constitute dataset Dataset1, and the positive samples from the other 500,000 labeled images constitute dataset Dataset2. From 100,000 far-infrared images without pedestrians, non-pedestrian samples were obtained by the candidate region acquisition method of step one of this patent, i.e. with the fast dual-threshold segmentation algorithm and the local sliding window technique, forming the non-pedestrian dataset Dataset3. All pedestrian pictures in Dataset1 together with all non-pedestrian pictures of Dataset3 form dataset Dataset4; all pedestrian pictures in Dataset2 together with all non-pedestrian pictures of Dataset3 form dataset Dataset5.
2) Dual classifier offline training
The dual classifiers of this patent are the Alexnet deep convolutional neural network and the VGGnet deep convolutional neural network. Starting from Alexnet and VGGnet networks pre-trained on the ImageNet dataset, each was fine-tuned on Dataset4. The hyper-parameters are set as follows: (1) the optimizer is the adaptive optimization algorithm Adam; (2) the learning rate is 0.01; (3) the batch size is 32; (4) the images are single-channel grayscale; (5) dropout is not used; (6) data augmentation of the original pictures comprises translation and left-right flipping; (7) input images are scaled to 224 × 224 with a bilinear interpolation algorithm. The VGGnet of the invention is the VGG19 network; the specific network structure is shown in Table 1.
TABLE 1 VGGnet network Structure (VGG19-net)
Wherein "conv" represents convolution operation, "relu" represents linear rectification function as activation function, "fc" is full-connection operation, "prob" is classification layer of function with softmax as classifier.
Table 2 Alexnet network architecture diagram.
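The preprocessing of hyper-parameter items (6) and (7) — translation and left-right-flip augmentation plus bilinear rescaling to 224 x 224 — can be sketched as follows; the translation offsets dx and dy are illustrative assumptions.

```python
import numpy as np

def bilinear_resize(img, out_h=224, out_w=224):
    """Scale a single-channel image to out_h x out_w by bilinear
    interpolation (hyper-parameter item (7))."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical weights, shape (out_h, 1)
    wx = (xs - x0)[None, :]   # horizontal weights, shape (1, out_w)
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def augment(img, dx=2, dy=2):
    """Augmentation of item (6): the original, a translated copy
    (offsets dx/dy are assumed values) and a left-right flipped copy."""
    shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    flipped = img[:, ::-1]
    return [img, shifted, flipped]
```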
2.2 Support vector machine learning of decision weights for the dual classifiers, and dual-classifier joint online detection
The dual classifiers Alexnet and VGGnet jointly classify all candidate regions, and the results of the two classifiers are fused by weighting. The specific weights are obtained by training a support vector machine. More specifically, any sample S of Dataset5 is classified with the trained Alexnet classifier, whose output score is denoted Score1, and with the trained VGGnet classifier, whose output score is denoted Score2. The pair (Score1, Score2) forms a new feature representing sample S and, together with the original label of S, is used to train a linear support vector machine classifier, which yields the decision weights w1 and w2 and the bias b for the joint classification of the dual classifiers Alexnet and VGGnet. The joint classification of a candidate region is completed according to equation (3).
Score = w1 × Score1 + w2 × Score2 + b (3)
where Score is the final output of the dual-classifier joint classification: when Score is greater than 0 the joint classification result is pedestrian, otherwise it is non-pedestrian.
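The weight-learning step and the decision rule of equation (3) can be sketched as follows. The score distributions here are synthetic stand-ins for the actual Dataset5 classifier outputs.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Hypothetical (Score1, Score2) pairs for 200 samples: pedestrians
# (label 1) tend to score high under both networks, non-pedestrians low.
ped = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
non = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))
X = np.vstack([ped, non])            # columns: (Score1, Score2)
y = np.array([1] * 100 + [0] * 100)  # original labels

# Linear SVM on the two-dimensional score feature yields the
# decision weights w1, w2 and bias b of equation (3).
svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
w1, w2 = svm.coef_[0]
b = svm.intercept_[0]

def joint_classify(score1, score2):
    """Equation (3): pedestrian iff w1*Score1 + w2*Score2 + b > 0."""
    return w1 * score1 + w2 * score2 + b > 0
```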
3. Taking the segmentation result as an observation value to carry out Kalman tracking on the detection result
The Kalman tracking algorithm corrects the prediction of the state variable with observation data to obtain an optimal estimate of the state variable. When it is used for multi-target pedestrian tracking, it directly gives the positions where each pedestrian may appear in the next frame; by similarity matching between the pedestrian target of the previous frame and the image at the predicted position, the detection position of the pedestrian in the next frame can be located, compensating for possible missed detections. Since the Kalman observation strongly influences the accuracy of the tracker, and the local dual-threshold segmentation algorithm generally yields an accurate segmentation result, this method supplies the segmentation result as the observation of the conventional Kalman algorithm so as to obtain a more accurate Kalman prediction. Specifically, the center position of the pedestrian target and the height and width of the detection frame that pass multi-frame verification (a candidate region is detected as a pedestrian target in three consecutive frames) are tracked, so the state vector of a pedestrian is expressed as formula (4).
Xt = (xt, yt, ht, wt, Δxt, Δyt, Δht, Δwt)T (4)
where (xt, yt) are the coordinates of the center of the pedestrian detection frame in the t-th frame, and (ht, wt) are the height and width of the pedestrian detection frame of the t-th frame; (Δxt, Δyt) is the change of the center point of the detection frame and (Δht, Δwt) the change of its height and width. Because the frame rate of the video is 25 frames per second, the motion of the rectangular frames of a pedestrian in two adjacent frames can be regarded as uniform motion; the Kalman state transition matrix Ω is expressed as formula (5), and the system measurement matrix H as formula (6).
Ω = [[I4, I4], [04, I4]] (5)

H = [I4, 04] (6)

where I4 denotes the 4 × 4 identity matrix and 04 the 4 × 4 zero matrix.
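A minimal sketch of one predict/update cycle of this constant-velocity model, with the segmentation box supplying the observation, might look as follows; the noise covariances Q and R are assumed values, not taken from the patent.

```python
import numpy as np

I4, Z4 = np.eye(4), np.zeros((4, 4))
Omega = np.block([[I4, I4], [Z4, I4]])  # constant-velocity transition
H = np.block([I4, Z4])                  # measurement matrix

def kalman_step(x, P, z, Q=None, R=None):
    """One predict/update cycle on the 8-d pedestrian state of formula (4).

    z is the observation (cx, cy, h, w) taken from the fast dual-threshold
    segmentation result; Q and R are assumed noise covariances."""
    Q = np.eye(8) * 1e-2 if Q is None else Q
    R = np.eye(4) * 1e-1 if R is None else R
    # Predict the next state from the uniform-motion model.
    x_pred = Omega @ x
    P_pred = Omega @ P @ Omega.T + Q
    # Update with the segmentation observation.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(8) - K @ H) @ P_pred
    return x_new, P_new
```

When no matching segmentation box is found (see the nearest-neighbor criterion below in the text), the caller can pass z = H @ (Omega @ x), i.e. the prediction itself, as the observation.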
This invention uses the fast dual-threshold segmentation result as the observation of the conventional Kalman algorithm and, in order to find the observation corresponding to each detection result, matches them with the nearest-neighbor criterion of formula (7). When the Kalman tracker cannot be matched according to formula (7), the Kalman prediction is taken directly as the observation to complete the update of the Kalman tracker.
|x1 - x2| < T1 && |y1 - y2| < T2 && |w1 - w2| < T1 && |h1 - h2| < T2 (7)
where w1, h1 are the width and height of a detection frame rectangle whose center point is (x1, y1); w2, h2 are the width and height of a fast local dual-threshold segmentation rectangle whose center point is (x2, y2); and T1 and T2 (both taking the value 7) are the nearest-neighbor distance thresholds in the horizontal and vertical directions, respectively.
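The nearest-neighbor criterion of formula (7) can be sketched as follows; boxes are given as (cx, cy, w, h) tuples, and returning None signals the caller to fall back on the Kalman prediction as the observation.

```python
def match_observation(det, segs, T1=7, T2=7):
    """Find the segmentation rectangle matching a detection, formula (7).

    det and each element of segs are (cx, cy, w, h); T1/T2 are the
    horizontal/vertical nearest-neighbor thresholds (7 pixels in the text).
    Returns the first matching segmentation box, or None if no box
    satisfies the criterion."""
    x1, y1, w1, h1 = det
    for seg in segs:
        x2, y2, w2, h2 = seg
        if (abs(x1 - x2) < T1 and abs(y1 - y2) < T2
                and abs(w1 - w2) < T1 and abs(h1 - h2) < T2):
            return seg
    return None
```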

Claims (4)

1. An embedded far-infrared pedestrian detection method based on image processing and deep learning, characterized in that a fast local dual-threshold and local sliding window technique is used to obtain pedestrian candidate regions, a deep-learning dual classifier whose weights are learned by a support vector machine then jointly classifies the candidate regions, and the segmentation result is used as the observation for Kalman tracking of the detection result to complete pedestrian detection, specifically comprising the following steps:
acquiring a pedestrian candidate area by using a rapid local double threshold and local sliding window technology;
step two, performing combined classification on the candidate regions by adopting a deep learning double classifier based on the learning weight of the support vector machine;
and step three, taking the segmentation result as an observation value to perform Kalman tracking on the detection result.
2. The embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, wherein using the fast local dual-threshold and local sliding window technique in step one means that, after the fast local dual-threshold algorithm obtains preliminary candidate regions, a local sliding window is applied on the basis of the preliminary candidate regions to obtain the final candidate regions, remedying the defect that the fast local dual-threshold algorithm alone cannot obtain all pedestrian candidate regions in various scenes; the fast local dual-threshold algorithm means that for each pixel a high threshold and a low threshold are calculated from its 24 nearest neighboring pixels on the same horizontal line so as to realize image segmentation, and preliminary pedestrian candidate regions are obtained by a 4-connected region labeling algorithm; the local sliding window technique means that, taking the upper-left corner coordinate of each preliminarily obtained rectangular frame as the starting coordinate of the sliding window, local windows of sizes 10 × 20, 24 × 48, 32 × 64 and 48 × 96 pixels are slid to obtain the final infrared pedestrian candidate regions.
3. The embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, wherein the deep-learning dual-classifier joint classification in step two means that the candidate regions are classified by an Alexnet network and a VGGnet network combined by weights; the learning of weights by the support vector machine means that the weights assigned to the Alexnet network and the VGGnet network are obtained by support vector machine learning.
4. The embedded far-infrared pedestrian detection method based on image processing and deep learning according to claim 1, wherein the segmentation result used in step three is the result of the local adaptive dual-threshold segmentation of step one; taking the segmentation result as the observation for Kalman tracking of the detection result means that the observation required by the Kalman tracking algorithm is provided by the segmentation result.
CN201910745838.6A 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method Active CN110837769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910745838.6A CN110837769B (en) 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910745838.6A CN110837769B (en) 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method

Publications (2)

Publication Number Publication Date
CN110837769A true CN110837769A (en) 2020-02-25
CN110837769B CN110837769B (en) 2023-08-29

Family

ID=69573984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910745838.6A Active CN110837769B (en) 2019-08-13 2019-08-13 Image processing and deep learning embedded far infrared pedestrian detection method

Country Status (1)

Country Link
CN (1) CN110837769B (en)


Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060177097A1 (en) * 2002-06-14 2006-08-10 Kikuo Fujimura Pedestrian detection and tracking with night vision
CN103198332A (en) * 2012-12-14 2013-07-10 华南理工大学 Real-time robust far infrared vehicle-mounted pedestrian detection method
CN104091171A (en) * 2014-07-04 2014-10-08 华南理工大学 Vehicle-mounted far infrared pedestrian detection system and method based on local features
US20150161796A1 (en) * 2013-12-09 2015-06-11 Hyundai Motor Company Method and device for recognizing pedestrian and vehicle supporting the same
CN106096561A (en) * 2016-06-16 2016-11-09 重庆邮电大学 Infrared pedestrian detection method based on image block degree of depth learning characteristic
CN106156401A (en) * 2016-06-07 2016-11-23 西北工业大学 Data-driven system state model on-line identification methods based on many assembled classifiers
CN107563347A (en) * 2017-09-20 2018-01-09 南京行者易智能交通科技有限公司 A kind of passenger flow counting method and apparatus based on TOF camera
US20180068083A1 (en) * 2014-12-08 2018-03-08 20/20 Gene Systems, Inc. Methods and machine learning systems for predicting the likelihood or risk of having cancer
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN108460336A (en) * 2018-01-29 2018-08-28 南京邮电大学 A kind of pedestrian detection method based on deep learning
US10108867B1 (en) * 2017-04-25 2018-10-23 Uber Technologies, Inc. Image-based pedestrian detection
CN109886245A (en) * 2019-03-02 2019-06-14 山东大学 A kind of pedestrian detection recognition methods based on deep learning cascade neural network


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626334A (en) * 2020-04-28 2020-09-04 东风汽车集团有限公司 Key control target selection method of vehicle-mounted advanced auxiliary driving system
CN111814755A (en) * 2020-08-18 2020-10-23 深延科技(北京)有限公司 Multi-frame image pedestrian detection method and device for night motion scene
CN114255373A (en) * 2021-12-27 2022-03-29 中国电信股份有限公司 Sequence anomaly detection method and device, electronic equipment and readable medium
CN114255373B (en) * 2021-12-27 2024-02-02 中国电信股份有限公司 Sequence anomaly detection method, device, electronic equipment and readable medium

Also Published As

Publication number Publication date
CN110837769B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN110175576B (en) Driving vehicle visual detection method combining laser point cloud data
WO2019196130A1 (en) Classifier training method and device for vehicle-mounted thermal imaging pedestrian detection
WO2019196131A1 (en) Method and apparatus for filtering regions of interest for vehicle-mounted thermal imaging pedestrian detection
CN110837769B (en) Image processing and deep learning embedded far infrared pedestrian detection method
US9626599B2 (en) Reconfigurable clear path detection system
CN106919902B (en) Vehicle identification and track tracking method based on CNN
CN105989334B (en) Road detection method based on monocular vision
CN106023257A (en) Target tracking method based on rotor UAV platform
CN111340855A (en) Road moving target detection method based on track prediction
CN115311241B (en) Underground coal mine pedestrian detection method based on image fusion and feature enhancement
CN116434159A (en) Traffic flow statistics method based on improved YOLO V7 and Deep-Sort
CN111915583A (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN115601717B (en) Deep learning-based traffic offence behavior classification detection method and SoC chip
He et al. A novel multi-source vehicle detection algorithm based on deep learning
Tarchoun et al. Hand-Crafted Features vs Deep Learning for Pedestrian Detection in Moving Camera.
CN113538585B (en) High-precision multi-target intelligent identification, positioning and tracking method and system based on unmanned aerial vehicle
CN109934096B (en) Automatic driving visual perception optimization method based on characteristic time sequence correlation
CN114743126A (en) Lane line sign segmentation method based on graph attention machine mechanism network
CN112347967B (en) Pedestrian detection method fusing motion information in complex scene
CN117036412A (en) Twin network infrared pedestrian target tracking method integrating deformable convolution
CN116934820A (en) Cross-attention-based multi-size window Transformer network cloth image registration method and system
Phu et al. Traffic sign recognition system using feature points
Zhang et al. IQ-STAN: Image quality guided spatio-temporal attention network for license plate recognition
CN115457420A (en) Low-contrast vehicle weight detection method based on unmanned aerial vehicle shooting at night
Wang et al. Vehicle recognition based on saliency detection and color histogram

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230803

Address after: One of the fourth floors, No. 107, Jucheng Avenue East, Xiaolan Town, Zhongshan City, Guangdong Province, 528400

Applicant after: Zhongshan sanzhuo Intelligent Technology Co.,Ltd.

Address before: Unit C403A, No. 205 Changfu Road, Tianhe District, Guangzhou City, Guangdong Province, 510000 (for office use only) (not intended for use as a factory building)

Applicant before: Guangzhou Sanmu Intelligent Technology Co.,Ltd.

GR01 Patent grant