CN112907616B - Pedestrian detection method based on thermal imaging background filtering - Google Patents

Pedestrian detection method based on thermal imaging background filtering

Info

Publication number
CN112907616B
Authority
CN
China
Prior art keywords
thermal imaging
image
background
pedestrian detection
frame
Prior art date
Legal status
Active
Application number
CN202110460457.0A
Other languages
Chinese (zh)
Other versions
CN112907616A (en)
Inventor
张森林
卢晨
刘妹琴
郑荣濠
董山玲
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202110460457.0A
Publication of CN112907616A
Application granted
Publication of CN112907616B
Active legal status (current)
Anticipated expiration legal status

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/194 — Image analysis; Segmentation; Edge detection involving foreground-background segmentation
    • G06T 5/40 — Image enhancement or restoration using histogram techniques
    • G06T 7/136 — Image analysis; Segmentation; Edge detection involving thresholding
    • G06T 2207/10048 — Image acquisition modality: Infrared image
    • G06T 2207/20081 — Special algorithmic details: Training; Learning
    • G06T 2207/30196 — Subject of image: Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a pedestrian detection method based on thermal imaging background filtering, which comprises the following steps: first, histogram equalization is applied to the original thermal image acquired by a thermal infrared camera; a suitable threshold is then set for threshold segmentation to obtain preliminary candidate regions for pedestrian detection; meanwhile, based on a Gaussian mixture model, the foreground and background are separated using the relation between consecutive frames to obtain a background subtraction image; the composite image obtained by combining the two is then fed into the subsequent improved Faster R-CNN framework to complete pedestrian detection. The invention solves the temperature-drift problem of thermal camera imaging through normalization, filters the background using threshold segmentation and background subtraction, fully exploits the characteristics of thermal images, and improves the accuracy of pedestrian detection in low-light and no-light environments.

Description

Pedestrian detection method based on thermal imaging background filtering
Technical Field
The invention relates to a pedestrian detection method based on thermal imaging background filtering, and belongs to the field of target detection in image processing.
Background
Vision is the most direct and dominant way for living beings to obtain environmental information, and the amount of information it provides is very rich, so the processing of visual information plays a crucial role in environmental perception. Vision-based target detection is currently a research hotspot in the field of computer vision.
In recent years, with the development of artificial intelligence, deep learning, and related fields, visual target detection has advanced rapidly. Unlike traditional target detection methods based on hand-crafted feature extraction, deep-learning-based methods extract deep image information through deep neural networks and are trained on massive data, which greatly improves both the accuracy and the speed of target detection.
Pedestrian detection is an important component of target detection. It uses computer technology to judge whether pedestrians are present in an image or video and to locate them in the image. Pedestrian detection has important applications in fields such as autonomous driving, unmanned aerial vehicles, and surveillance. Currently mainstream pedestrian detection methods include global detection, part-based detection, motion-based detection, and multi-camera stereo vision detection.
Target detection based on visible-light images has received extensive attention and research because of its low equipment cost and wide range of applications. However, visible-light images are very susceptible to environmental influences: appearance changes, occlusion, and changes in illumination all have a great impact on visible-light target detection. The advent of infrared thermal cameras offers a way to address these problems. Thermal images have a distinct advantage over visible-light images: an object is represented by its temperature and radiated heat, which means thermal images can be used both day and night. In addition, thermal images eliminate the effect of color and illumination changes on object appearance. With the remarkable development of thermal sensors in recent years, much research has been devoted to pedestrian detection and tracking in thermal images.
Disclosure of Invention
The invention aims to overcome the shortcomings of visible-light pedestrian detection methods under low-light and no-light conditions. The invention provides a pedestrian detection method based on thermal imaging background filtering. The method uses a thermal imaging sensor to obtain a thermal image of the environment, and improves pedestrian detection accuracy through a background-filtering preprocessing method and a pedestrian detection model based on improved Faster R-CNN.
The invention adopts the following specific technical scheme:
a pedestrian detection method based on thermal imaging background filtering comprises the following steps:
s1: firstly, processing a thermal imaging image acquired by a thermal imaging camera by using a histogram equalization method, so as to solve the problems of deviation and drift of the thermal imaging image and obtain a histogram equalization enhanced image;
s2: based on a Gaussian mixture model, separating foreground and background from the histogram equalization enhanced image obtained after the processing of S1 according to the relation between the previous frame and the next frame to obtain a binary background subtraction image;
s3: performing double-threshold segmentation on a thermal imaging image acquired by a thermal imaging camera by using upper and lower thresholds of imaging of pedestrians in the thermal imaging image to obtain a binary threshold segmentation image after segmentation of the pedestrians and a background;
s4: superposing the binary background subtraction image obtained in the step S2 and the binary threshold segmentation image obtained in the step S3 to obtain a binary background filtering image distinguishing the foreground from the background, and performing background removal on the histogram equalization enhanced image obtained in the step S1 by using the binary background filtering image to obtain a background-filtered image containing only the foreground;
s5: inputting the background-filtered image obtained in S4 into a pre-constructed and trained pedestrian detection network based on improved Faster R-CNN for human body candidate region extraction and pedestrian detection; in the pedestrian detection process, feature extraction is first performed on the background-filtered image by a convolutional neural network to obtain a feature map, then target suggestion boxes of three proportions, corresponding respectively to a head, a half body and a whole human body, are extracted from the feature map by an improved RPN network, then the target suggestion boxes are projected onto the feature map to obtain the corresponding feature matrices, each feature matrix is passed in turn through an ROI pooling layer and a fully connected layer to obtain the category probability and bounding-box regression parameters, and finally the intersection relations between the three proportions of target suggestion boxes are combined, taking the head as the reference, to obtain the final thermal imaging pedestrian detection result.
Preferably, the specific implementation method in S1 is:
converting the thermal imaging image into a thermal imaging gray image, then counting to obtain a cumulative normalized histogram, and then mapping the thermal imaging gray image pixel by pixel according to a mapping relation to form a histogram equalization enhanced image, wherein the mapping relation is as follows:
p′_i = min{x} + s_i · (max{x} − min{x})
in the formula: p′_i represents the equalized gray value in the histogram equalization enhanced image to which the pixel with gray value i in the thermal imaging grayscale image is mapped; s_i is the cumulative histogram probability of the pixel with gray level i in the thermal imaging grayscale image, obtained from the cumulative normalized histogram; min{x} represents the minimum gray value in the thermal imaging grayscale image, and max{x} represents the maximum gray value in the thermal imaging grayscale image.
Preferably, the specific implementation method of S2 is as follows:
s21: training a Gaussian mixture model by using a plurality of the histogram equalization enhanced images; during training, a basic Gaussian mixture matrix is first initialized with the first frame of enhanced image, then the enhanced images are input frame by frame, and each newly added pixel is compared with the means of the existing Gaussian mixture model: if the pixel lies within three times the variance of a mean, the matrix coefficients are updated; otherwise a new Gaussian distribution is created;
s22: and matching the histogram equalization enhanced image to be segmented pixel by adopting a Gaussian mixture model obtained in the step S21, and if one pixel value can be matched with one Gaussian mixture matrix, considering the pixel as a background, otherwise, considering the pixel as a foreground.
Preferably, the specific implementation method of S3 is as follows:
s31: calibrating a thermal imaging camera for acquiring a thermal imaging image, and determining upper and lower thresholds of pedestrian imaging in the thermal imaging camera;
s32: pixels between the upper threshold value and the lower threshold value in the thermal imaging map are regarded as a pedestrian area, and the rest pixels are regarded as a background area.
Preferably, the specific implementation method of S4 is as follows:
s41: adding the binary background subtraction image obtained in the step S2 and the binary threshold segmentation image obtained in the step S3 to obtain a binary background filtered image with a foreground pixel value of 1 and a background pixel value of 0;
s42: and multiplying the binary background filtering image and the histogram equalization enhanced image obtained in the step S1 by pixel points one by one to obtain a final background filtering image.
Preferably, in S5, the pedestrian detection network based on improved Faster R-CNN includes a convolutional neural network, an improved RPN network, an ROI pooling layer, and a fully connected layer, wherein the thermal imaging pedestrian detection result is obtained as follows:
s51: inputting the background filtered image into a convolutional neural network to obtain a corresponding characteristic map;
s52: feeding the feature map obtained in the step S51 into the improved RPN network to extract target suggestion boxes where targets may possibly exist; for each position in the image, initializing 9 possible candidate boxes from the combination of three sizes of areas and three aspect ratios; the minimum ratio corresponds to the target suggestion box of a human head, the middle ratio corresponds to the target suggestion box of a half body, the half body being an upper half body or a lower half body, and the maximum ratio corresponds to the target suggestion box of a whole human body;
s53: projecting the target suggestion boxes obtained in the step S52 onto the feature map obtained in the step S51 to obtain the corresponding feature matrices, scaling each feature matrix to 7 × 7 through an ROI pooling layer, and then flattening and sending the feature matrices to a fully connected layer to obtain the final class probabilities and bounding-box regression parameters;
s54: taking the human head target box with the highest confidence as the reference: for each human head target box, if a half-body target box or a whole-body target box intersects it, the intersecting boxes are merged together into one human body; if no other target box intersects it, the head is regarded as the head of a person whose body is occluded, and the head target box is taken as the final target box; and for a whole-body target box, if it does not intersect any head target box, it is judged to be a false detection and discarded.
Further, the areas of the three sizes are 128×128, 256×256, and 384×384, respectively.
Further, the ratios of the three sizes are 1:1, 1:2, and 1:3, respectively.
Further, whether two target boxes intersect is judged by the intersection over union between them.
Preferably, the pedestrian detection network based on improved Faster R-CNN is trained in advance using labeled thermal imaging datasets.
The invention solves the temperature-drift problem of thermal camera imaging through histogram equalization, filters the background using threshold segmentation and background subtraction, fully exploits the characteristics of thermal images, and improves pedestrian detection accuracy in low-light and no-light environments.
Drawings
FIG. 1 is an overall flow chart of a thermal imaging background filtering based pedestrian detection algorithm as disclosed in the present invention.
FIG. 2 is a diagram of the neural network architecture of the improved Faster R-CNN.
Fig. 3 is a thermal imaging diagram used as an example.
Fig. 4 is a histogram equalization enhanced image obtained after the histogram equalization method processing.
Fig. 5 is a binary background subtraction image obtained based on a gaussian mixture model.
Fig. 6 is a binarized dual-threshold-segmented image obtained by dual-threshold segmentation.
Fig. 7 is a background-filtered image obtained based on a binary background subtraction image and a binary threshold segmentation image.
Fig. 8 is the head target suggestion box, half-body target suggestion box, and whole-body target suggestion box obtained by the improved RPN network.
Fig. 9 is the final target detection result.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and the detailed description. The technical features of the embodiments of the present invention can be combined correspondingly without mutual conflict.
In a preferred embodiment of the present invention, a pedestrian detection method based on thermal imaging background filtering is implemented with the open-source deep learning framework PyTorch. As shown in fig. 1, the pedestrian detection method based on background filtering disclosed by the invention comprises two parts: background filtering of the thermal image, and construction, training and detection of the deep learning model. The specific implementation process is as follows:
First, background filtering of the thermal image
A thermal camera is first used to acquire a segment of thermal video, which consists of a series of successive thermal image frames. For the image at time t, as shown in fig. 1, histogram equalization is first applied as part of background filtering to mitigate the offset and drift problems of the thermal image; the specific implementation process is as follows:
1) Histogram equalization is performed on the thermal image on which target detection is to be carried out.
First, the original thermal image is converted into a thermal grayscale image, denoted {x}, and the proportion of pixels at each gray level in the image is counted, giving the occurrence probability of a pixel with gray level i in the image as follows:
p_x(i) = n_i / n
in the formula: n_i represents the number of pixels with gray value i, and n represents the number of all pixels in the image;
p_x(i) obtained as described above is the histogram of the grayscale image.
Then, the cumulative normalized histogram of the thermographic image is obtained by accumulation:
s_k = Σ_{i=0}^{k} p_x(i)
in the formula: s_k is the cumulative histogram probability of the pixels with gray level k;
finally, mapping the thermal imaging gray level image pixel by pixel according to a mapping relation to form a histogram equalization enhanced image, wherein the mapping relation is as follows:
p′_i = min{x} + s_i · (max{x} − min{x})
in the formula: p′_i represents the equalized gray value in the histogram equalization enhanced image to which the pixel with gray value i in the thermal imaging grayscale image is mapped, i.e. its pixel value in the histogram equalization enhanced image; s_i is the cumulative histogram probability of the pixel with gray level i in the thermal imaging grayscale image, obtained from the cumulative normalized histogram; min{x} represents the minimum gray value in the thermal imaging grayscale image, and max{x} represents the maximum gray value in the thermal imaging grayscale image.
Taking fig. 3 as an example, after the above operation is performed on the original image, each pixel may be mapped to a new pixel, and a histogram equalization enhanced image may be obtained, as shown in fig. 4.
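As a concrete illustration of the mapping above, the following NumPy sketch applies the same histogram equalization to an 8-bit thermal grayscale image; the function name and the 256-gray-level assumption are illustrative and not specified by the patent.

```python
import numpy as np

def equalize_thermal(gray: np.ndarray) -> np.ndarray:
    """Apply p'_i = min{x} + s_i * (max{x} - min{x}) to an 8-bit grayscale image."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / gray.size                       # p_x(i): probability of gray level i
    s = np.cumsum(p)                           # s_k: cumulative normalized histogram
    lo, hi = int(gray.min()), int(gray.max())  # min{x}, max{x}
    mapping = np.round(lo + s * (hi - lo)).astype(np.uint8)
    return mapping[gray]                       # pixel-wise lookup of the new gray values
```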
2) Based on a Gaussian mixture model, the foreground and background are separated according to the relation between consecutive frames of the histogram equalization enhanced images obtained after step 1), yielding a background subtraction image. In this process, the previous t frames of histogram equalization enhanced images are used to train the Gaussian mixture model, and the specific value of t can be adjusted as needed. The training process of the Gaussian mixture model is as follows:
firstly, a first frame of enhanced image is used for initializing a basic Gaussian mixture matrix, and a Gaussian mixture model is established for each pixel point on an image at the moment t:
P(X_t) = Σ_{i=1}^{k} w_{i,t} · η(X_t; μ_{i,t}, σ²_{i,t})
in the formula: X_t is the pixel value of the pixel point at time t; k is the number of Gaussian distribution functions; w_{i,t}, μ_{i,t} and σ²_{i,t} respectively represent the weight coefficient, the mean and the variance corresponding to the i-th Gaussian model; and η(·) is the Gaussian density function.
Then, subsequent enhanced images are input frame by frame, and each newly added pixel is compared with the means of the existing Gaussian mixture model: if the pixel lies within three times the variance of a mean, the matrix coefficients are updated; otherwise a new Gaussian distribution is created. The model update formulas are as follows:
w_{i,t} = (1 + α) · w_{i,t−1}
μ_{i,t} = ρ · μ_{i,t−1} + (1 − ρ) · X_t
in the formula: α is the model weight update coefficient, and ρ is the model mean update coefficient.
and finally, carrying out background pixel matching on the subsequent histogram equalization enhanced image to be segmented by adopting the mixed Gaussian model obtained in the previous step. If a pixel value can match one of the gaussian mixture matrices, the pixel is considered as a background and is recorded as 0, otherwise, the pixel is considered as a foreground and is recorded as 1, and thus the final binary background subtraction image is shown in fig. 5.
3) Double-threshold segmentation is performed on the thermal image acquired by the thermal camera, using the upper and lower thresholds of pedestrian imaging in the thermal image, to obtain a threshold segmentation image separating pedestrians from the background.
Before segmentation, the thermal camera needs to be calibrated to determine the upper and lower bounds of pedestrian imaging for that camera; the upper and lower thresholds are denoted T_u and T_d, respectively. Based on these two thresholds, the image can be divided into two parts: the first part, with pixel values greater than or equal to T_d and less than or equal to T_u, is marked as 1; the second part, with pixel values smaller than T_d or greater than T_u, is marked as 0. The formula is as follows:
f(x, y) = 1 if T_d ≤ p(x, y) ≤ T_u, and f(x, y) = 0 otherwise
in the formula: p(x, y) represents the pixel value of the point (x, y) in the image, and f(x, y) represents the resulting binary threshold segmentation image, shown in fig. 6 for this embodiment.
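A minimal NumPy sketch of this dual-threshold rule; T_d and T_u must come from the camera calibration described above and are left as parameters here.

```python
import numpy as np

def dual_threshold(thermal: np.ndarray, t_d: int, t_u: int) -> np.ndarray:
    """f(x, y) = 1 where T_d <= p(x, y) <= T_u, otherwise 0."""
    return ((thermal >= t_d) & (thermal <= t_u)).astype(np.uint8)
```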
4) The binary background subtraction image and the binary threshold segmentation image are added to obtain a binary background filtering image distinguishing the foreground from the background, and this binary background filtering image is multiplied pixel by pixel with the histogram equalization enhanced image, thereby removing the background of the histogram equalization enhanced image and obtaining a background-filtered image containing only the foreground, as shown in fig. 7.
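A sketch of this fusion step under one reading of the text: the sum of the two binary masks is clipped back to {0, 1} (i.e. a pixel is foreground if either mask marks it) and then multiplied pixel-wise with the equalized image; the function and argument names are illustrative, and the patent does not spell out how overlapping foreground values are handled.

```python
import numpy as np

def background_filter(equalized: np.ndarray,
                      subtraction_mask: np.ndarray,
                      threshold_mask: np.ndarray) -> np.ndarray:
    """Fuse the two binary masks and keep only the foreground of the equalized image."""
    fused = np.clip(subtraction_mask + threshold_mask, 0, 1)  # foreground = 1, background = 0
    return equalized * fused                                  # pixel-wise multiplication
```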
Second, construction, training and detection of the pedestrian detection network based on improved Faster R-CNN
In this part, the background-filtered image is fed into the trained improved Faster R-CNN framework for inference, realizing human body candidate region extraction and pedestrian detection, and finally obtaining a pedestrian detection result based on the thermal image under low-light or no-light conditions. The structure of the pedestrian detection network based on improved Faster R-CNN is shown in FIG. 2:
First, the background-filtered image obtained in the previous step is input into a convolutional neural network to obtain a feature map of the image. In this embodiment, the convolutional neural network may be a ResNet-101 network.
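A minimal PyTorch sketch of this feature-extraction step, using a torchvision ResNet-101 truncated before its classification head as the backbone; the input resolution, the three-channel replication of the single-channel thermal image, and the use of untrained weights are illustrative assumptions.

```python
import torch
import torchvision

# ResNet-101 backbone with the average-pooling and classification layers removed.
backbone = torch.nn.Sequential(
    *list(torchvision.models.resnet101(weights=None).children())[:-2]
)
backbone.eval()

with torch.no_grad():
    # A background-filtered frame replicated to 3 channels (shape: N x C x H x W).
    img = torch.rand(1, 3, 480, 640)
    feature_map = backbone(img)   # shape: (1, 2048, 15, 20), i.e. H/32 x W/32
```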
Then, the feature map of the image is fed into an improved RPN (region proposal network) to extract candidate regions where targets may exist. Compared with an ordinary RPN network, the improved RPN network extracts target suggestion boxes of three proportions from the feature map, corresponding respectively to the human head, the half body (which can be the upper half or the lower half), and the whole human body; the three proportions are determined according to the detection target. In this example, for each location in the image, 9 possible candidate boxes are initialized from the combination of three sizes (128×128, 256×256, 384×384) and three ratios (1:1, 1:2, 1:3): 128×128 (1:1), 128×256 (1:2), 128×384 (1:3), 256×256 (1:1), 256×512 (1:2), 256×768 (1:3), 384×384 (1:1), 384×768 (1:2), 384×1152 (1:3). The minimum ratio of 1:1 corresponds to the target suggestion box of the human head, the intermediate ratio of 1:2 corresponds to the target suggestion box of the half body, and the maximum ratio of 1:3 corresponds to the target suggestion box of the whole human body; these boxes are subsequently combined to form a complete human body, as shown in fig. 8.
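A short sketch of how this anchor set could be enumerated; it simply reproduces the nine width × height combinations listed above and is not code from the patent.

```python
from itertools import product

BASES = (128, 256, 384)   # the three anchor sizes
RATIOS = (1, 2, 3)        # 1:1 head, 1:2 half body, 1:3 whole body

def anchor_shapes():
    """Enumerate the nine (width, height) anchor shapes listed above,
    e.g. (128, 128), (128, 256), ..., (384, 1152)."""
    return [(b, b * r) for b, r in product(BASES, RATIOS)]
```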
Then, the target suggestion boxes are projected onto the feature map to obtain the corresponding feature matrices; each feature matrix is scaled to 7 × 7 through an ROI pooling layer in turn, then flattened and fed into the fully connected layers to obtain the class probabilities and bounding-box regression parameters.
Finally, the intersection relations among the three proportions of target suggestion boxes are combined, taking the human head as the reference, to obtain the final thermal imaging pedestrian detection result. Taking the head as the reference means that the head target box with the highest confidence is used as the anchor: for each head target box, if a half-body target box or a whole-body target box intersects it, the boxes are merged together into one human body; if no other target box intersects it, the head is regarded as the head of a person whose body is occluded, and the head target box is taken as the final target box; and for a whole-body target box, if it intersects no head target box, it is judged to be a false detection and discarded.
Specifically, whether or not two target boxes intersect is determined by the intersection over union between them. Each target box is represented by its lower-left corner and upper-right corner coordinates: the head detection box is denoted D_head(x_{head-bl}, y_{head-bl}, x_{head-ur}, y_{head-ur}), the half-body target box is denoted D_half(x_{half-bl}, y_{half-bl}, x_{half-ur}, y_{half-ur}), and the whole-body target box is denoted D_body(x_{body-bl}, y_{body-bl}, x_{body-ur}, y_{body-ur}). Because the head has the most distinctive imaging characteristics in the thermal image, the detected head target boxes have the highest confidence. For each head target box, if a half-body or whole-body target box intersects it, i.e. the Intersection over Union (IoU) between the target boxes is greater than zero, they are merged into one human body; if no other detection box intersects it, it is regarded as the head of a person whose body is occluded, and the final target box is the head target box itself. For a whole-body target box, if it intersects no head box, it is judged to be a false detection and the target box is discarded.
IoU_{1-2} = area(D_1 ∩ D_2) / area(D_1 ∪ D_2)
In the formula: IoU_{1-2} represents the intersection over union between target boxes 1 and 2; D_1 represents target box 1, with coordinates (x_{1-bl}, y_{1-bl}, x_{1-ur}, y_{1-ur}); and D_2 represents target box 2, with coordinates (x_{2-bl}, y_{2-bl}, x_{2-ur}, y_{2-ur}).
The combined overall pedestrian target box D_people has coordinates (x_{p-bl}, y_{p-bl}, x_{p-ur}, y_{p-ur}), where:
x_{p-bl} = min(x_{1-bl}, x_{2-bl})
y_{p-bl} = min(y_{1-bl}, y_{2-bl})
x_{p-ur} = max(x_{1-ur}, x_{2-ur})
y_{p-ur} = max(y_{1-ur}, y_{2-ur})
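A compact sketch of the IoU test and the enclosing-box merge described above, with each box given as an (x_bl, y_bl, x_ur, y_ur) tuple; the helper names are illustrative.

```python
def iou(b1, b2):
    """Intersection over union of two boxes given as (x_bl, y_bl, x_ur, y_ur)."""
    ix = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    iy = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = ix * iy
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter)

def merge(head_box, body_box):
    """Enclosing pedestrian box D_people of a head box and a half-/whole-body box."""
    return (min(head_box[0], body_box[0]), min(head_box[1], body_box[1]),
            max(head_box[2], body_box[2]), max(head_box[3], body_box[3]))

# A body box is merged with a head box only when iou(head_box, body_box) > 0.
```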
the final pedestrian detection result is shown in fig. 9.
In addition, before being used for actual detection, the pedestrian detection network based on improved Faster R-CNN needs to be trained in advance with a labeled thermal imaging dataset; the training method itself belongs to the prior art. In this embodiment, the training process can adopt the following specific implementation:
1. Initialize the parameters of the front-end convolutional layers with an ImageNet pre-trained classification model, and train the RPN network;
2. Train the classification and bounding-box regression network using the obtained target suggestion boxes;
3. Fine-tune the RPN network using the trained front-end convolutional layers;
4. Fine-tune the classification and bounding-box regression network using the trained front-end convolutional layers;
5. The RPN network and the classification and bounding-box regression network share the trained front-end convolutional layers, forming the complete network model.
The above-described embodiments are merely preferred embodiments of the present invention, which should not be construed as limiting the invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, the technical scheme obtained by adopting the mode of equivalent replacement or equivalent transformation is within the protection scope of the invention.

Claims (10)

1. A pedestrian detection method based on thermal imaging background filtering is characterized by comprising the following steps:
s1: firstly, processing a thermal imaging image acquired by a thermal imaging camera by using a histogram equalization method, so as to solve the problems of deviation and drift of the thermal imaging image and obtain a histogram equalization enhanced image;
s2: based on a Gaussian mixture model, separating foreground and background from the histogram equalization enhanced image obtained after the processing of S1 according to the relation between the front and rear frames to obtain a binary background subtraction image;
s3: performing double-threshold segmentation on a thermal imaging image acquired by a thermal imaging camera by using upper and lower thresholds of imaging of pedestrians in the thermal imaging image to obtain a binary threshold segmentation image after segmentation of the pedestrians and a background;
s4: superposing the binary background subtraction image obtained in the step S2 and the binary threshold segmentation image obtained in the step S3 to obtain a binary background filtering image distinguishing the foreground from the background, and performing background removal on the histogram equalization enhanced image obtained in the step S1 by using the binary background filtering image to obtain a background-filtered image containing only the foreground;
s5: inputting the background-filtered image obtained in S4 into a pre-constructed and trained pedestrian detection network based on improved Faster R-CNN for human body candidate region extraction and pedestrian detection; in the pedestrian detection process, feature extraction is first performed on the background-filtered image by a convolutional neural network to obtain a feature map, then target suggestion boxes of three proportions, corresponding respectively to a head, a half body and a whole human body, are extracted from the feature map by an improved RPN network, then the target suggestion boxes are projected onto the feature map to obtain the corresponding feature matrices, each feature matrix is passed in turn through an ROI pooling layer and a fully connected layer to obtain the category probability and bounding-box regression parameters, and finally the intersection relations between the three proportions of target suggestion boxes are combined, taking the head as the reference, to obtain the final thermal imaging pedestrian detection result.
2. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein the specific implementation method in S1 is:
converting the thermal imaging image into a thermal imaging gray image, then counting to obtain a cumulative normalized histogram, and then mapping the thermal imaging gray image pixel by pixel according to a mapping relation to form a histogram equalization enhanced image, wherein the mapping relation is as follows:
p′_i = min{x} + s_i · (max{x} − min{x})
in the formula: p′_i represents the equalized gray value in the histogram equalization enhanced image to which the pixel with gray value i in the thermal imaging grayscale image is mapped; s_i is the cumulative histogram probability of the pixel with gray level i in the thermal imaging grayscale image, obtained from the cumulative normalized histogram; min{x} represents the minimum gray value in the thermal imaging grayscale image, and max{x} represents the maximum gray value in the thermal imaging grayscale image.
3. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein the specific implementation method of S2 is as follows:
s21: training a Gaussian mixture model by using a plurality of the histogram equalization enhanced images; during training, a basic Gaussian mixture matrix is first initialized with the first frame of enhanced image, then the enhanced images are input frame by frame, and each newly added pixel is compared with the means of the existing Gaussian mixture model: if the pixel lies within three times the variance of a mean, the matrix coefficients are updated; otherwise a new Gaussian distribution is created;
s22: and matching the histogram equalization enhanced image to be segmented pixel by adopting a Gaussian mixture model obtained in the step S21, and if one pixel value can be matched with one Gaussian mixture matrix, considering the pixel as a background, otherwise, considering the pixel as a foreground.
4. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein the specific implementation method of S3 is as follows:
s31: calibrating a thermal imaging camera for acquiring a thermal imaging image, and determining upper and lower thresholds of pedestrian imaging in the thermal imaging camera;
s32: pixels between the upper threshold value and the lower threshold value in the thermal imaging map are regarded as a pedestrian area, and the rest pixels are regarded as a background area.
5. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein the specific implementation method of S4 is as follows:
s41: adding the binary background subtraction image obtained in the step S2 and the binary threshold segmentation image obtained in the step S3 to obtain a binary background filtered image with a foreground pixel value of 1 and a background pixel value of 0;
s42: and multiplying the binary background filtering image and the histogram equalization enhanced image obtained in the step S1 by pixel points one by one to obtain a final background filtering image.
6. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein in S5, the pedestrian detection network based on improved Faster R-CNN comprises a convolutional neural network, an improved RPN network, an ROI pooling layer and a fully connected layer, wherein the thermal imaging pedestrian detection result is obtained as follows:
s51: inputting the background filtered image into a convolutional neural network to obtain a corresponding characteristic map;
s52: feeding the feature map obtained in the step S51 into the improved RPN network to extract target suggestion boxes where targets may possibly exist; for each position in the image, initializing 9 possible candidate boxes from the combination of three sizes of areas and three aspect ratios; the minimum ratio corresponds to the target suggestion box of a human head, the middle ratio corresponds to the target suggestion box of a half body, the half body being an upper half body or a lower half body, and the maximum ratio corresponds to the target suggestion box of a whole human body;
s53: projecting the target suggestion frames obtained in the step S52 on the feature map obtained in the step S51 to obtain corresponding feature matrixes, scaling each feature matrix to 7 × 7 through a ROIPooling layer, flattening and sending the feature matrixes into a full connection layer to obtain final class probability and regression parameters of the boundary frame;
s54: taking the human head target box with the highest confidence as the reference: for each human head target box, if a half-body target box or a whole-body target box intersects it, the intersecting boxes are merged together into one human body; if no other target box intersects it, the head is regarded as the head of a person whose body is occluded, and the head target box is taken as the final target box; and for a whole-body target box, if it does not intersect any head target box, it is judged to be a false detection and discarded.
7. The method of claim 6, wherein the areas of the three sizes are 128×128, 256×256, and 384×384, respectively.
8. The pedestrian detection method based on thermal imaging background filtering according to claim 6, wherein the ratios of the three sizes are 1:1, 1:2, and 1:3, respectively.
9. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 6, wherein whether two target boxes intersect is determined by the intersection over union between them.
10. The pedestrian detection method based on thermal imaging background filtering according to claim 1, wherein the pedestrian detection network based on improved Faster R-CNN is trained in advance using labeled thermal imaging data sets.
CN202110460457.0A 2021-04-27 2021-04-27 Pedestrian detection method based on thermal imaging background filtering Active CN112907616B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110460457.0A CN112907616B (en) 2021-04-27 2021-04-27 Pedestrian detection method based on thermal imaging background filtering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110460457.0A CN112907616B (en) 2021-04-27 2021-04-27 Pedestrian detection method based on thermal imaging background filtering

Publications (2)

Publication Number Publication Date
CN112907616A (en) 2021-06-04
CN112907616B (en) 2022-05-03

Family

ID=76108934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110460457.0A Active CN112907616B (en) 2021-04-27 2021-04-27 Pedestrian detection method based on thermal imaging background filtering

Country Status (1)

Country Link
CN (1) CN112907616B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2264643A1 (en) * 2009-06-19 2010-12-22 Universidad de Castilla-La Mancha Surveillance system and method by thermal camera
CN106504274A (en) * 2016-10-10 2017-03-15 广东技术师范学院 A kind of visual tracking method and system based under infrared camera
CN108710838A (en) * 2018-05-08 2018-10-26 河南工程学院 Thermal infrared facial image recognition method under a kind of overnight sight
KR20180125278A (en) * 2017-05-15 2018-11-23 한국전자통신연구원 Apparatus and method for detecting pedestrian
CN110490877A (en) * 2019-07-04 2019-11-22 西安理工大学 Binocular stereo image based on Graph Cuts is to Target Segmentation method
CN110717393A (en) * 2019-09-06 2020-01-21 北京富吉瑞光电科技有限公司 Forest fire automatic detection method and system based on infrared panoramic system
CN111046880A (en) * 2019-11-28 2020-04-21 中国船舶重工集团公司第七一七研究所 Infrared target image segmentation method and system, electronic device and storage medium
CN111340765A (en) * 2020-02-20 2020-06-26 南京邮电大学 Thermal infrared image reflection detection method based on background separation
CN111461036A (en) * 2020-04-07 2020-07-28 武汉大学 Real-time pedestrian detection method using background modeling enhanced data
CN112200764A (en) * 2020-09-02 2021-01-08 重庆邮电大学 Photovoltaic power station hot spot detection and positioning method based on thermal infrared image
CN112529065A (en) * 2020-12-04 2021-03-19 浙江工业大学 Target detection method based on feature alignment and key point auxiliary excitation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10719727B2 (en) * 2014-10-01 2020-07-21 Apple Inc. Method and system for determining at least one property related to at least part of a real environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zuhaib Ahmed Shaikh et al., "Automatic annotation of pedestrians in thermal images using background/foreground segmentation for training deep neural networks," 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2021-01-05, full text. *
Wu Di, "Research on pedestrian detection algorithms based on infrared images," China Master's Theses Full-text Database (Information Science and Technology), No. 6, 2019-06-30, full text. *

Also Published As

Publication number Publication date
CN112907616A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN111274976B (en) Lane detection method and system based on multi-level fusion of vision and laser radar
CN111209810B (en) Boundary frame segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time through visible light and infrared images
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
Maddalena et al. Stopped object detection by learning foreground model in videos
CN105930868B (en) A kind of low resolution airport target detection method based on stratification enhancing study
Kong et al. General road detection from a single image
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
EP3499414B1 (en) Lightweight 3d vision camera with intelligent segmentation engine for machine vision and auto identification
CN111784747B (en) Multi-target vehicle tracking system and method based on key point detection and correction
CN109685045B (en) Moving target video tracking method and system
Zin et al. Fusion of infrared and visible images for robust person detection
CN109919026B (en) Surface unmanned ship local path planning method
CN109272455A (en) Based on the Weakly supervised image defogging method for generating confrontation network
CN104766065B (en) Robustness foreground detection method based on various visual angles study
WO2016165064A1 (en) Robust foreground detection method based on multi-view learning
CN113158943A (en) Cross-domain infrared target detection method
Naufal et al. Preprocessed mask RCNN for parking space detection in smart parking systems
CN111582074A (en) Monitoring video leaf occlusion detection method based on scene depth information perception
Huerta et al. Exploiting multiple cues in motion segmentation based on background subtraction
CN109784216B (en) Vehicle-mounted thermal imaging pedestrian detection Rois extraction method based on probability map
CN115116132B (en) Human behavior analysis method for depth perception in Internet of things edge service environment
Surkutlawar et al. Shadow suppression using rgb and hsv color space in moving object detection
Lu et al. A cross-scale and illumination invariance-based model for robust object detection in traffic surveillance scenarios
Nosheen et al. Efficient Vehicle Detection and Tracking using Blob Detection and Kernelized Filter
CN107103301B (en) Method and system for matching discriminant color regions with maximum video target space-time stability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant