CN114202646A

CN114202646A - Infrared image smoking detection method and system based on deep learning

Info

Publication number: CN114202646A
Application number: CN202111438613.XA
Authority: CN
Inventors: 王建; 谷湘煜; 罗国庆; 吴有江; 罗贞兰; 张华�; 赵皓; 张静; 李林静; 饶爽; 刘桂华
Original assignee: School Of Information Engineering Southwest University Of Science And Technology; Shenzhen Launch Digital Technology Co Ltd
Current assignee: School Of Information Engineering Southwest University Of Science And Technology; Shenzhen Launch Digital Technology Co Ltd
Priority date: 2021-11-26
Filing date: 2021-11-26
Publication date: 2022-03-18

Abstract

The invention discloses an infrared image smoking detection method and system based on deep learning, relates to the technical field of image recognition, and solves the technical problems of unstable smoke point detection precision and poor positioning effect in existing infrared scenes. The method comprises the steps of hardware environment setup, data acquisition, data preprocessing and data set preparation, smoke point feature extraction, smoke point feature fusion, smoke point detection and smoke point alarm. Adaptive image scaling and padding reduces the calculation cost and the hardware requirements for deployment, the recognition precision and real-time performance of the model are improved, and smoking recognition and detection in an infrared scene are effectively realized. The system can be deployed in factories, workshops and other workplaces to detect smoking behavior effectively and stably.

Description

Infrared image smoking detection method and system based on deep learning

Technical Field

The invention relates to the technical field of image recognition, and in particular to an infrared image smoking detection method and system based on deep learning.

Background

China is the only country with a complete industrial system in the world at present, and factory safety protection is an important guarantee in industrial economic development; according to incomplete statistics, fire is one of the most frequent and serious problems in factory safety accidents, and smoking behavior is one of the main factors causing the factory fire; therefore, smoking detection in a factory environment is an important means to ensure safe production in a factory; smoking detection methods are mainly divided into a traditional method, a detection method based on smoke identification and a method based on a convolutional neural network:

1. in 2010, Wu Pin et al extract the characteristics between the ignited cigarette and its smoker by a characteristic extraction method based on a color histogram, realize the tasks of segmenting the cigarette and tracking background pixels by combining a Gaussian mixture model technology, and detect cigarette targets with various sizes, colors and shapes; in 2011, scholars such as Wu Wen-C and the like propose a system for detecting smoking behaviors through human faces; firstly, positioning a mouth of a person by using an existing face detection scheme, then carrying out white balance processing on a face image to reduce misjudgment caused by light, simultaneously realizing color segmentation and noise elimination in an HSV color system, and judging whether smoking behaviors exist or not by carrying out binarization form expansion and area connection conditions after erosion on an object, namely the range between cigarettes and the mouth, but the face detection effect is poor due to the fact that the face posture is not correct and the image background is complex; in 2013, the scholars of Wang super and the like propose a smoking gesture recognition algorithm of 'recognition after segmentation' aiming at the specific behavior characteristic of a smoking gesture, firstly segment a gesture area in a mode of combining motion information of a video sequence and skin color detection, and then detect the smoking gesture by extracting gesture features and adopting a support vector machine.

Considering that the color characteristics are greatly influenced by environmental illumination, the size of a cigarette target is small, the difficulty in extracting colors from a complex background exists, and suspected cigarette objects influence the judgment of cigarette identification, misjudgment is easily caused, smoking gestures are various, and the accuracy rate of smoking behavior detection through smoking gestures is not high, so researchers try to perform smoking detection through smoke generated by smoking.

2. Detection methods based on smoke identification: in 2013, Dinghongjie studied smoking behavior detection based on smoke identification in indoor places, segmenting suspected cigarette smoke regions by combining background subtraction with a color model, extracting cigarette smoke features from smoke primitive blocks and spatio-temporal characteristics, and finally identifying cigarette smoke with an SVM (support vector machine) to detect smoking; in 2014, su xiang proposed a smoking behavior detection algorithm using multiple smoke features, which first determines the spatial region of interest from the relative position of the hand and face, and then classifies the suspected smoke region according to the dynamic characteristics of cigarette smoke; in 2016, to address the problem that cigarette smoke is thin and hard to distinguish from the background, an Blackman used background modeling based on a Gaussian mixture model to update the background of the monitored area in real time, extracted the foreground target in each video frame by background subtraction, and identified cigarette smoke with a classifier after a preliminary check of the geometric characteristics of the region of interest, which greatly improved the accuracy of smoking behavior detection; in 2018, Liu Yuan Ding et al analyzed and extracted HOG features and texture features of cigarette smoke and applied a feature fusion algorithm to recognize the smoke emitted during smoking; in the same year, chenqueshi, Huanghaitao and others proposed a smoking behavior detection algorithm that first identifies cigarettes in the located face image according to the number of white pixels, then identifies smoke in the face region according to changes in the energy values of the face region image during detection, and finally judges smoking behavior from the identified cigarettes and smoke. Because cigarette smoke has low concentration, diffuses easily and has indistinct edges, it is difficult to detect as a target, and smoke-based detection performs poorly. With the continuous development of deep learning, the robustness of neural network models has gradually improved; whereas traditional methods are limited in feature extraction, deep learning can better extract the semantic information of smoking behavior and improve the accuracy of smoking detection.

3. The method based on the convolutional neural network comprises the following steps: in 2017, smoking detection of drivers, Artan Y et al proposed a method for detecting smoking behavior of drivers using Near Infrared (NIR) surveillance camera images, first locating a front windshield and a driver head region in sequence using a target detection technique based on deep learning, and then performing a dual-window (local) anomaly detector in a local region to determine white hot spots on the NIR image due to high temperature, thereby determining smoking behavior of drivers; the method is only applied to local narrow space, and detection is not attempted to be carried out by expanding the method to a larger indoor space; in 2019, Zhao R et al propose an indoor smoking behavior detection algorithm based on a YOLOv3-tiny deep learning network, a k-means clustering algorithm is used for obtaining a prior frame of a cigarette target, a small target detection layer is added on the basis of an original YOLOv3-tiny network, the actual application requirements are effectively met, a new way is provided for assisting indoor monitoring, and an optimizable space also exists; in 2020, Chen Ruilon et al propose a real-time smoking detection method based on deep learning, the model uses a convolutional neural network to process the video stream input shot by a camera, and positions the cigarette ends through the processes of image feature extraction, feature fusion, target classification, target positioning and the like, so as to judge smoking behaviors, the common target detection algorithm has a not ideal detection effect on small targets, and the detection speed needs to be improved; through a series of designed convolutional neural network modules, the calculation amount of a model is reduced, the deduction speed is increased, the real-time requirement is met, and the accuracy rate of small target (cigarette end) detection is improved. 
The method optimizes the structure of the neural network for many times and has good performance in the aspect of feature extraction. Therefore, the smoking detection based on deep learning can greatly improve the detection precision, enhance the effective rate of fire early warning and provide effective guarantee for social economy and life safety of people.

The smoking process is accompanied by temperature changes in the infrared image: the smoke point takes the hottest position of the cigarette end as its center and radiates outward, forming an approximately circular color temperature spot whose color fades from dark to light from the inside out. Moreover, under the influence of factors such as scene depth, seasonal temperature differences and day and night illumination, the appearance of the smoke point is diverse, so detecting it with traditional methods is severely limited. Deep learning has a stronger learning ability and can estimate the regularities of the samples from data, which improves the robustness of the network model and enables smoke point detection with high accuracy; a detection method based on deep learning is therefore a reasonable way to detect smoke points. From the above, it can be seen that detecting smoke points in an infrared environment can effectively determine whether smoking behavior exists. However, compared with visible light images, infrared images often have poor resolution, low contrast, a low signal-to-noise ratio, a blurred visual effect and monotonous information, so effective and stable target detection is hard to achieve in an infrared scene.

In summary, the prior art has the following drawbacks: 1) target feature information in the infrared image is not abundant and lacks context information, and the target detection precision of the infrared image is poor; 2) the area of the smoke point in the sample is small, only a few pixel points exist sometimes, and the small target detection positioning effect is poor; 3) the traditional method has large limitation, and when the smoke point is influenced by environmental factors and is too large in class spacing caused by different presentation forms in a sample, the traditional method cannot effectively detect the smoke point.

Disclosure of Invention

The object of the invention is as follows: to solve the above technical problems, the invention provides an infrared image smoking detection method and system based on deep learning.

The technical scheme adopted by the invention is as follows: an infrared image smoking detection method based on deep learning comprises the following steps:

step one: hardware environment setup, namely installing an infrared visible light binocular camera according to the installation standard for security cameras, and electrically connecting the infrared visible light binocular camera to the terminal processor;

step two: data acquisition, namely performing video shooting on a simulation workplace through an infrared visible binocular camera, acquiring smoking behavior videos of three simulation real smoking scenes of a single person, two persons and a plurality of persons, simultaneously acquiring a visible light video and an infrared video, and acquiring frames at the same time starting point and the same frequency to obtain a visible light initial sample and an infrared initial sample;

step three: data preprocessing, namely performing an image fusion processing operation on the visible light initial sample and the infrared initial sample to obtain a fused sample;

step four: data set preparation, namely screening the fused samples to obtain 2700 pictures and labeling them, wherein each sample is a dual-light fusion image with a size of 640 × 480 × 3, and dividing the 2700 labeled samples into a training set, a test set and a verification set in a ratio of 8:1:1;

step five: smoke point feature extraction, namely adopting a Yolov5 network model and performing Mosaic enhancement on the input samples; after adaptive picture scaling and padding, the size of the input sample becomes 640 × 640 × 3; the sample is input into the backbone network, where the Focus structure slices it into 320 × 320 × 12, and one convolution operation with 64 convolution kernels produces a feature map; the backbone network then completes smoke point feature extraction;

step six: smoke point feature fusion, wherein Yolov5 processes the smoke point features extracted by the backbone network by adopting an FPN + PAN structure;

step seven: smoke point detection, namely screening the target frames through non-maximum suppression (NMS), retaining the prediction frame with the highest confidence, completing the smoke point detection process, and judging that a person is smoking in any image in which a smoke point is detected;

step eight: smoke point alarm, wherein the smoke point detection model is deployed in the terminal processor, the video shot by the infrared visible light binocular camera is transmitted to the terminal processor through Ethernet, the shooting range of the camera is monitored for smoke points in real time, and once a smoke point is detected, an instruction is sent to the alarm device through a serial port to complete the alarm.
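The Focus slicing in step five can be illustrated with a small sketch. It assumes the padded input is 640 × 640 × 3, which is consistent with the 80 × 80, 40 × 40 and 20 × 20 feature maps described later, and it rearranges every 2 × 2 pixel block into the channel axis so that the spatial size halves and the channel count quadruples (640 × 640 × 3 becomes 320 × 320 × 12). The function names and the channel ordering are illustrative assumptions, not details given in the patent.

```python
def focus_slice(image):
    """Rearrange an H x W x C image into an (H/2) x (W/2) x (4C) tensor.

    `image` is a nested list indexed as image[row][col][channel];
    H and W must be even. No pixel information is lost.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    sliced = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            # Concatenate the four pixels of each 2x2 block along the channel
            # axis. Assumed order: (even row, even col), (odd row, even col),
            # (even row, odd col), (odd row, odd col).
            row.append(image[r][c] + image[r + 1][c]
                       + image[r][c + 1] + image[r + 1][c + 1])
        sliced.append(row)
    return sliced


def shape(tensor):
    """Return (height, width, channels) of a nested-list tensor."""
    return len(tensor), len(tensor[0]), len(tensor[0][0])
```

Applied to a 4 × 4 × 3 toy image, `shape(focus_slice(img))` gives `(2, 2, 12)`; the same rule maps 640 × 640 × 3 to 320 × 320 × 12 before the 64-kernel convolution.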

The working principle of the invention is as follows: taking an infrared camera and a visible light camera as the hardware basis, the method acquires smoking behavior images through the infrared and visible light binocular camera, builds a smoke point sample data set of dual-light fusion images of infrared and visible light images, and supplements the features of the smoke point by exploiting the fact that the dual-light fusion image expresses image features better; combined with a smoking behavior detection method based on deep learning, the semantic features and positioning features of the samples are enriched through the FPN + PAN structure; adaptive image scaling and padding reduces the calculation cost and the hardware requirements for deployment, improves the recognition precision and real-time performance of the model, and effectively realizes smoking recognition and detection in an infrared scene; the system can be deployed in factories, workshops and other workplaces for effective and stable smoking behavior detection.
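The adaptive scaling and padding mentioned above can be sketched as a small calculation. The sketch assumes the usual aspect-preserving resize followed by symmetric padding to a square target; the function name `letterbox_params`, the default target of 640 and the optional `minimal` mode (padding only up to a multiple of an assumed network stride of 32) are illustrative assumptions, since the patent does not spell out the exact padding rule.

```python
def letterbox_params(width, height, target=640, minimal=False, stride=32):
    """Compute scale, resized size and padding for aspect-preserving scaling.

    Returns (scale, (new_w, new_h), (left, top, right, bottom)).
    """
    # Scale so that the longer side fits the target without distortion.
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_w, pad_h = target - new_w, target - new_h
    if minimal:
        # Assumed variant: pad only to the next multiple of the stride,
        # which leaves fewer padded pixels for the network to process.
        pad_w, pad_h = pad_w % stride, pad_h % stride
    left, top = pad_w // 2, pad_h // 2
    return scale, (new_w, new_h), (left, top, pad_w - left, pad_h - top)
```

For a 640 × 480 sample the scale is 1 and 80 rows of padding are added above and below to reach 640 × 640. In the assumed `minimal` mode no padding is needed at all, because 480 is already a multiple of 32, which is one way such scaling can lower the calculation cost.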

In step one, the infrared visible light binocular camera is installed indoors at a height of 2.5 m above the ground, and the range of human activity for the samples is marked within a detection range of 2 m horizontally from the camera.

In step two, the initial samples are shot within a horizontal distance of two meters from the camera, and the target performs smoking behavior accompanied by normal conversation, limb movement and walking back and forth.

Step three, the image fusion processing operation comprises the following specific steps:

3.1, performing edge extraction on the visible light sample by using a Scharr operator to obtain a smoking behavior characteristic diagram under the visible light sample;

3.2, fusing the visible light feature image $I_{RGB}$ with the infrared image $I_R$ through the weight operator $\Phi$ to obtain a dual-light fusion image; the specific formula is: $I = I_R + \Phi I_{RGB}$. The sample after image fusion not only preserves the feature values of smoking behavior in the infrared image, but also supplements the original infrared image with smoking behavior features from the visible light feature image, so that the contrast and contours of the final output sample are more distinct, which facilitates feature expression.
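Steps 3.1 and 3.2 can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that the patent does not state: the images are single-channel and already registered, the Scharr response is taken as the gradient magnitude, the weight operator Φ is a scalar (0.5 here), and the fused result is clipped to the 8-bit range. The function names are hypothetical.

```python
# Standard 3x3 Scharr kernels for horizontal and vertical gradients.
SCHARR_X = ((-3, 0, 3), (-10, 0, 10), (-3, 0, 3))
SCHARR_Y = ((-3, -10, -3), (0, 0, 0), (3, 10, 3))


def scharr_edges(gray):
    """Gradient magnitude of a 2-D grayscale image, clipped to 0..255.

    `gray` is a list of rows; border pixels are left at 0 for simplicity.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += SCHARR_X[dy][dx] * pixel
                    gy += SCHARR_Y[dy][dx] * pixel
            edges[y][x] = min(255, round((gx * gx + gy * gy) ** 0.5))
    return edges


def fuse_dual_light(infrared, visible_edges, phi=0.5):
    """Apply I = I_R + phi * I_RGB per pixel, clipped to 255 (phi is assumed)."""
    return [
        [min(255, round(ir + phi * edge)) for ir, edge in zip(ir_row, edge_row)]
        for ir_row, edge_row in zip(infrared, visible_edges)
    ]
```

With a visible-light image containing a vertical step edge, the interior edge pixels receive a strong response and raise the corresponding infrared pixels, while flat regions leave the infrared values unchanged.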

In step five, Mosaic enhancement of the input samples splices four samples by random arrangement, random scaling and random cropping, which enriches the background of the smoke point samples and improves small-target detection precision; in this way the data of four pictures is fed into the computation in a single step, reducing the calculation cost.
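A simplified sketch of the Mosaic splice is given below. It covers random arrangement and random cropping on single-channel images; random scaling and the corresponding shift of bounding-box labels are omitted for brevity. The function name `mosaic`, the split-point range and the requirement that each source image be at least as large as the canvas are assumptions made for illustration.

```python
import random


def mosaic(images, size, rng=None):
    """Splice four 2-D images (each at least size x size) into one canvas.

    Returns the size x size canvas and the random split point (cx, cy).
    """
    if len(images) != 4:
        raise ValueError("mosaic needs exactly four images")
    rng = rng or random.Random()
    order = list(images)
    rng.shuffle(order)  # random arrangement of the four samples
    # Random split point; the range keeps every quadrant non-empty.
    cx = rng.randint(size // 4, 3 * size // 4)
    cy = rng.randint(size // 4, 3 * size // 4)
    # (x0, y0, x1, y1) of the four quadrants on the canvas.
    quadrants = [(0, 0, cx, cy), (cx, 0, size, cy),
                 (0, cy, cx, size), (cx, cy, size, size)]
    canvas = [[0] * size for _ in range(size)]
    for img, (x0, y0, x1, y1) in zip(order, quadrants):
        w, h = x1 - x0, y1 - y0
        # Random crop of the quadrant's size from the source image.
        ox = rng.randint(0, len(img[0]) - w)
        oy = rng.randint(0, len(img) - h)
        for dy in range(h):
            canvas[y0 + dy][x0:x1] = img[oy + dy][ox:ox + w]
    return canvas, (cx, cy)
```

Each call yields one training canvas that mixes the backgrounds of four samples, so a single forward pass sees four scenes at once.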

In step six, the FPN + PAN structure processes the smoke point features extracted by the backbone network, with the following specific steps:

6.1, firstly, extracting a sample through an FPN structure to obtain feature maps with three sizes of 80 × 80, 40 × 40 and 20 × 20, upsampling the feature map with the size of 20 × 20 to 40 × 40, fusing the feature map with the original medium-size feature map, then, performing double upsampling on the fused feature map again, fusing the fused feature map with the original large-size feature map to obtain a predicted 80 × 80 feature map, and transmitting and fusing high-layer information in an upsampling mode from top to bottom;

6.2, then passing the feature maps of three sizes extracted by the FPN structure through the PAN: the fused 80 × 80 feature map is down-sampled to 40 × 40 and fused with the medium-size feature map in the FPN structure, and the fused feature map is then down-sampled again to 20 × 20 and fused with the small-size feature map in the FPN structure, so that positioning features are transmitted from bottom to top.
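The data flow of steps 6.1 and 6.2 can be traced with a toy sketch on single-channel maps. Nearest-neighbour upsampling, stride-2 subsampling and element-wise addition are stand-ins chosen for brevity; an actual detection network would typically use learned convolutions and channel concatenation for these operations. All function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fmap):
    """Stride-2 subsampling, a stand-in for a stride-2 convolution."""
    return [row[::2] for row in fmap[::2]]


def fuse_maps(a, b):
    """Element-wise sum of two equally sized maps, a stand-in for fusion."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fpn_pan(large, medium, small):
    """Run the top-down (FPN) and bottom-up (PAN) passes on 80/40/20 maps."""
    # Top-down: high-level information flows from 20x20 up to 80x80.
    p_medium = fuse_maps(upsample2x(small), medium)
    p_large = fuse_maps(upsample2x(p_medium), large)
    # Bottom-up: positioning information flows from 80x80 back down to 20x20.
    n_medium = fuse_maps(downsample2x(p_large), p_medium)
    n_small = fuse_maps(downsample2x(n_medium), small)
    return p_large, n_medium, n_small
```

The three returned maps keep the sizes 80 × 80, 40 × 40 and 20 × 20, and each one now carries contributions from every scale.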

An infrared image smoking detection system based on deep learning, comprising:

the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring image data through workplace monitoring equipment and defining the image data as a source image for acquisition and storage;

the fusion module is used for carrying out image fusion processing operation on the collected visible light initial sample and the collected infrared initial sample to obtain a fused sample;

and the data set making module is used for making a preset training set, a preset testing set and a preset verification set.

The smoke point feature extraction module is used for comparing the features of the fused samples one by one against preset training samples, determining the region of the smoke point features and extracting the smoke point features;

the smoke point detection module is used for screening the smoke point feature extraction frames through non-maximum suppression (NMS), retaining the prediction frame with the highest confidence, completing the smoke point detection process and obtaining the result of whether smoking behavior exists;

and the smoking alarm module sends an instruction to the alarm equipment according to the result detected by the smoke point detection module to complete alarm.

The monitoring equipment is an infrared visible light binocular camera.
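The 8:1:1 division handled by the data set making module (step four of the method) can be sketched as follows. The shuffle, the fixed seed and the file naming are assumptions for illustration; the patent only specifies the ratio and the total of 2700 labeled samples.

```python
import random


def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle and split samples into training, test and verification subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_test = len(items) * ratios[1] // total
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]  # the remainder forms the verification set
    return train, test, val
```

For 2700 labeled samples this yields 2160 training, 270 test and 270 verification samples.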

In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:

according to the invention, the visible light and the infrared image are fused by using an image fusion technology, so that the fused image contains richer characteristic information, and the accuracy and reliability after fusion are improved;

according to the invention, Yolov5 is used as a detection network, the detection precision of the small target is improved through Mosaic enhancement, and the calculation cost is reduced through self-adaptive picture scaling, so that the deployment is facilitated;

the invention combines the FPN + PAN feature processing mode, the semantic features are transmitted from top to bottom, the positioning features are transmitted from bottom to top, and the two phases are combined to finish feature aggregation, thereby improving the detection precision.

Drawings

The invention will now be described, by way of example, with reference to the accompanying drawings, in which:

FIG. 1 is a flow chart of the operation of the smoking detection system of the present invention;

FIG. 2 is a flow chart of a smoke point detection algorithm of the present invention;

FIG. 3 is a block diagram of a smoke point detection network model of the present invention;

FIG. 4 is a diagram illustrating the fusion effect of visible light and infrared images;

FIG. 5 is a graph of the effect of smoking detection according to the present invention;

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention, generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1

As shown in fig. 1-5, in the present embodiment, an infrared image smoking detection method based on deep learning is provided, where smoking behavior images are collected by an infrared and visible light binocular camera, a smoke point sample data set of an infrared image and a visible light image dual-light fusion image is made, and the characteristics of smoke points are supplemented by using the characteristics that the dual-light fusion image can better express the advantages of image data characteristics; by utilizing the thought of deep learning and training the network model, the semantic features of the smoke points are fully extracted, the problems of unstable smoke point detection precision and poor positioning effect under an infrared scene are solved, and the method comprises the following specific steps:

step one: hardware environment setup; first, according to the installation standard for security cameras and provided that the target field of view to be monitored is covered, the infrared visible light binocular camera is installed indoors at a height of 2.5 m above the ground, and the range of human movement is marked within a detection range of 2 m horizontally from the camera;

step two: data acquisition, namely performing video shooting on a simulation workplace through a binocular camera to enable a human target to perform smoking behavior within a range of two meters from the horizontal distance of the camera and accompany with behaviors such as normal communication, limb movement, back-and-forth walking and the like, acquiring smoking behavior videos of three simulation real smoking scenes of one person, two persons and multiple persons, simultaneously acquiring a visible light video and an infrared video, and taking frames at the same time starting point and the same frequency to obtain a visible light initial sample and an infrared initial sample;

step three: data preprocessing, namely performing an image fusion processing operation on the visible light sample and the infrared sample, in order to integrate the information of the same smoking behavior captured at the same moment by the infrared image and the visible light image; edge extraction is performed on the visible light sample using the Scharr operator to obtain a smoking behavior feature map under the visible light sample; the visible light feature image $I_{RGB}$ is then fused with the infrared image $I_R$ through the weight operator $\Phi$ to obtain a dual-light fusion image; the specific formula is: $I = I_R + \Phi I_{RGB}$; the sample after image fusion not only preserves the feature values of smoking behavior in the infrared image, but also supplements the original infrared image with smoking behavior features from the visible light feature image, so that the contrast and contours of the final output sample are more distinct, which facilitates feature expression;

step four: data set preparation, wherein the fused samples are screened to obtain 2700 pictures, which are labeled; each sample is a dual-light fusion image with a size of 640 × 480 × 3, and the 2700 labeled samples are divided into a training set, a test set and a verification set in a ratio of 8:1:1;

step five: smoke point feature extraction, adopting a Yolov5 network model, a lightweight model with high detection speed and excellent detection accuracy. Firstly, Mosaic enhancement is performed on the input samples: four samples are spliced by random arrangement, random scaling and random cropping to enrich the background of the smoke point samples and improve small-target detection precision, and in this way the data of four pictures is fed into the computation in a single step, reducing the calculation cost; after adaptive picture scaling and padding, the size of the input sample becomes 640 × 640 × 3; the sample is input into the backbone network, where the Focus structure slices it into 320 × 320 × 12, a feature map is obtained through an operation with 64 convolution kernels, and the backbone network completes feature extraction;

step six: smoke point feature fusion, wherein Yolov5 processes the features extracted by the backbone network by adopting an FPN (Feature Pyramid Network) + PAN (Path Aggregation Network) structure;

firstly, extracting a sample through an FPN structure to obtain feature maps with three sizes of 80 × 80, 40 × 40 and 20 × 20, upsampling the feature map with the size of 20 × 20 to 40 × 40, fusing the feature map with an original medium-size feature map, then performing double upsampling on the fused feature map again, fusing the fused feature map with an original large-size feature map to obtain a predicted 80 × 80 feature map, and transmitting and fusing high-level information in an upsampling mode from top to bottom;

then, the feature maps of three sizes extracted by the FPN are subjected to PAN, the feature map with the size of 80 multiplied by 80 after fusion is down-sampled to 40 multiplied by 40, the feature map is fused with the feature map with the middle size in the FPN structure, then the feature map after fusion is down-sampled again to 20 multiplied by 20, the feature map with the small size in the FPN structure is fused, and the positioning feature is transmitted from bottom to top;

step seven: smoke point detection; in the post-processing of smoke point detection, the target frames are screened by non-maximum suppression (NMS), the prediction frame with the highest confidence is retained, the smoke point detection process is completed, and it is judged that a person is smoking in any image in which a smoke point is detected.
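The screening in step seven can be sketched as a greedy non-maximum suppression over candidate frames. The box format `(x1, y1, x2, y2)`, the IoU threshold of 0.45 and the confidence threshold of 0.25 are assumed values for illustration; the patent does not specify them, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.45):
    """Greedy NMS over (box, confidence) pairs, highest confidence first."""
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a frame only if it does not overlap a stronger kept frame.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, conf))
    return kept


def smoking_detected(detections, conf_threshold=0.25, iou_threshold=0.45):
    """Report smoking if any smoke point frame survives the screening."""
    confident = [d for d in detections if d[1] >= conf_threshold]
    return len(nms(confident, iou_threshold)) > 0
```

Two heavily overlapping frames around the same smoke point collapse to the one with the higher confidence, while a distant frame is kept as a separate smoke point.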

With the Yolov5 detection method, both the detection precision and the detection frame rate are clearly superior to those of the Yolov1 and Faster RCNN detection methods; the comparison of smoking detection methods is as follows:

Detection method	Detection accuracy	Detection frame rate (fps)
Yolov1	67.95%	43
Faster RCNN	78.43%	76
Yolov5	83.56%	96

Claims

1. An infrared image smoking detection method based on deep learning is characterized by comprising the following steps:

step two: data acquisition, namely performing video shooting on a simulation workplace through an infrared visible binocular camera, acquiring smoking behavior videos of three simulation real smoking scenes of a single person, two persons and a plurality of persons, simultaneously acquiring a visible light video and an infrared video, and taking frames at the same time starting point and the same frequency to obtain a visible light initial sample and an infrared initial sample;

step four: data set preparation, namely screening the fused samples to obtain 2700 pictures and labeling them, wherein each sample is a dual-light fusion image with a size of 640 × 480 × 3, and the 2700 labeled samples are divided into a training set, a test set and a verification set in a ratio of 8:1:1;

step seven: detecting smoke points, screening the target frames through non-maximum value suppression NMS, reserving the prediction frame with the highest confidence coefficient, completing the smoke point detection process, and judging that people smoke in the image with the smoke points;

step eight: smoke point alarm, wherein the smoke point detection model is deployed in the terminal processor, the video shot by the infrared visible light binocular camera is transmitted to the terminal processor through Ethernet, the shooting range of the camera is monitored for smoke points in real time, and once a smoke point is detected, an instruction is sent to the alarm device through a serial port to complete the alarm.

2. The infrared image smoking detection method based on deep learning as claimed in claim 1, wherein the infrared visible light binocular camera of step one is installed indoors at a height of 2.5 m above the ground, and the range of human movement for the samples is marked within a detection range of 2 m horizontally from the camera.

3. The infrared image smoking detection method based on deep learning of claim 2, wherein in step two the initial samples are shot within a horizontal distance of two meters from the camera, and the target performs smoking behavior accompanied by normal conversation, limb movement and walking back and forth.

4. The infrared image smoking detection method based on deep learning according to claim 1, wherein the image fusion processing operation of step three specifically comprises the following steps:

3.2, fusing the visible light feature image $I_{RGB}$ with the infrared image $I_R$ through the weight operator $\Phi$ to obtain a dual-light fusion image; the specific formula is: $I = I_R + \Phi I_{RGB}$. The sample after image fusion not only preserves the feature values of smoking behavior in the infrared image, but also supplements the original infrared image with smoking behavior features from the visible light feature image, so that the contrast and contours of the final output sample are more distinct, which facilitates feature expression.

5. The infrared image smoking detection method based on deep learning of claim 1, wherein the Mosaic enhancement of the input samples in the fifth step is to splice four samples in a random arrangement, random scaling and random cutting manner, so that the background of the smoke point sample is richer, the small target detection precision is improved, and the data input of four pictures is completed in the calculation process at one time by the method, thereby reducing the calculation cost.

6. The infrared image smoking detection method based on deep learning of claim 1, wherein the FPN + PAN structure is used to process the smoke point features extracted by the backbone network, with the following specific steps:

6.2 and then the feature maps of three sizes extracted by the FPN are passed through the PAN, the feature map of 80 × 80 size after fusion is down-sampled to 40 × 40, and then fused with the feature map of medium size in the FPN structure, and then the feature map after fusion is down-sampled again to 20 × 20, and then fused with the feature map of small size in the FPN structure, and the positioning feature is transmitted from bottom to top.

7. An infrared image smoking detection system based on deep learning is characterized by comprising:

the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring image data through a workplace monitoring device and defining the image data as a source image to be acquired and stored;

the smoke point detection module is used for screening the smoke point feature extraction frames through non-maximum suppression (NMS), retaining the prediction frame with the highest confidence, completing the smoke point detection process and obtaining the result of whether smoking behavior exists;

8. The infrared image smoking detection system based on deep learning of claim 7, further comprising a data set creation module for creating a preset training set, a test set and a verification set.

9. The infrared image smoking detection system based on deep learning of claim 7, wherein the monitoring device is an infrared visible binocular camera.