CN116912770A - Public place smoking detection method based on improved YOLOv8 - Google Patents
- Publication number
- CN116912770A (application number CN202310848825.8A)
- Authority
- CN
- China
- Prior art keywords
- yolov8
- smoking
- improved
- model
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a public place smoking detection method based on an improved YOLOv8, used to detect smoking behaviors in mall surveillance video, where smoking behaviors include holding a cigarette in the mouth and exhaling its smoke, and holding a lit cigarette in the hand; the improved YOLOv8 model addresses problems such as a high false detection rate and low accuracy. The method comprises the following steps: S1, acquire smoking images to form a data set, label the images with LabelImg, and divide the data set; S2, adopt the YOLOv8 model and add a small-target detection layer; S3, improve the YOLOv8 backbone network by replacing Darknet53 with the lightweight network MobileNetV3; S4, introduce the CBAM attention mechanism into the Neck of the YOLOv8 model; S5, optimize the loss function by replacing CIoU with EIoU; S6, input the images of a preset test set into the trained improved YOLOv8 model and detect each image to obtain the target detection results. By improving the YOLOv8 model, the method improves accuracy, detection speed, and precision.
Description
Technical Field
The invention relates to a smoking detection method, in particular to a public place smoking detection method based on an improved YOLOv8, and belongs to the technical field of computer vision.
Background
Target detection is an important application direction in computer vision. It detects semantic objects of specific categories in images and videos, and is widely applied in face recognition, intelligent transportation, medical diagnosis, security monitoring, and other fields. Smoking detection is a key research topic in security monitoring; its goal is to quickly and accurately identify and locate smoking, record the detection results, and help issue timely smoking alarms. Experience shows that the harm caused by smoking in public places is considerable: it not only affects the health of others but can also cause fires, air pollution, and other problems, making it an urgent public health issue, so research on smoking detection methods has great practical significance. Because smoking detection is a small-target detection task, cigarettes occupy only a small proportion of the image, their color is hard to distinguish from the surrounding environment, and they are difficult to capture, leading to high missed-detection and false-detection rates.
Traditional smoking detection methods include smoke-sensor detection based on smoke concentration, biological detection based on sampled indicators, and detection based on visual monitoring systems; these methods suffer from low accuracy and long detection times.
Disclosure of Invention
To solve these problems, the invention provides a public place smoking detection method based on an improved YOLOv8. By applying and improving the YOLOv8 algorithm, which performs object identification and localization with a deep neural network, the method addresses the high false detection rate and low accuracy of smoking-behavior detection in mall surveillance video.
The technical solution of the invention is realized as follows:
a public place smoking detection method based on improved YOLOv8 comprises the following steps:
s1, acquiring a smoking image to form a data set, and dividing the data set by using Labellmg label images, wherein the smoking image refers to an image for detecting and labeling smoking behaviors in a monitoring environment by a monitoring video, and the smoking behaviors comprise holding cigarettes on the mouth and exhaling the smoke of the cigarettes, and holding the lighted cigarettes in the hands;
s2, adopting a YOLOv8 model, adding a small target detection layer, continuing the operations of a C2f module, up-sampling and the like on the original Neck module, and carrying out feature fusion convolution on the obtained feature map and a first layer feature map of a backbone network to obtain a feature map with the size of 160 x 160; and carrying out feature fusion convolution on the obtained feature map and a second layer feature map of the backbone network so as to obtain 80 x 80 feature maps. Because the size of the cigarette is too small, the sampling multiple of the YOLOv8 is large, the characteristic information of the cigarette is difficult to identify by the original characteristic image, and the improved YOLOv8 is added with a small target detection layer with the size of 160 x 160, so that the problem that the cigarette is difficult to detect is solved, the false detection rate is reduced, and the detection effect is improved;
s3, improving a YOLOv8 backbone network, replacing a Darknet53 network with a lightweight network MobileNet V3, specifically replacing C2f and a part Cnov in the YOLOv8 with Bneck, and adopting a depth separable convolution to reduce the calculated amount by a lightweight YOLOv8 model; the inverse residual structure is adopted, the original 1×1 convolution dimension reduction, 3×3 convolution and 1×1 convolution dimension increase are changed into 1×1 convolution dimension increase, 3×3 convolution and 1×1 convolution dimension reduction are changed into a structure with wider middle and narrower two ends, the channel number is improved, and the calculated amount is reduced; the attention mechanism SE is introduced, a large weight is given to important channels, a small weight is given to unimportant channels, and the task processing efficiency and accuracy are improved; the calculated amount of the lightweight YOLOv8 model is smaller, and the accuracy is higher;
s4, introducing an attention mechanism CBAM into a Neck of the YOLOv8 model, calculating attention force diagrams from two different dimensions of a channel and a space in sequence by a CBAM module, after introducing the attention mechanism CBAM into the Neck, focusing on cigarette information, ignoring other useless information, covering the characteristics to more positions of the cigarettes, and improving the accuracy;
s5, optimizing a Loss function, replacing the CIoU with the Loss function of the EIoU, wherein certain ambiguity exists in the aspect ratio of the CIoU, the EIoU respectively calculates a wide difference value and a high difference value to replace the aspect ratio on the basis of the CIoU, and meanwhile, focal-Loss is introduced to solve the problem of sample unbalance;
s6, inputting the images into a trained improved YOLOv8 model based on a preset test set, and detecting each image to obtain a target detection result.
Compared with the prior art, the invention has the following characteristics:
1. a small-target detection layer is added, addressing the high false detection rate caused by the small size of cigarettes;
2. the YOLOv8 backbone network is improved: the lightweight network MobileNetV3 replaces Darknet53, reducing parameters and computation;
3. the CBAM attention mechanism is introduced, assigning greater weight to cigarette-related information and improving accuracy;
4. smoking behavior in surveillance video is monitored in real time;
5. the loss function is improved: EIoU replaces CIoU, and the sample-imbalance problem is addressed.
Drawings
The invention is illustrated in the following two figures.
FIG. 1 is a diagram of the overall network architecture of YOLOv8 of the present invention;
fig. 2 is a diagram of the overall network architecture of the improved YOLOv8 of the present invention.
Detailed Description
As shown in figs. 1 and 2, the public place smoking detection method based on the improved YOLOv8 comprises the following steps:
s1, acquiring a smoking image to form a data set, and dividing the data set by using Labellmg label images, wherein the smoking image refers to an image for detecting and labeling smoking behaviors in a monitoring environment by a monitoring video, and the smoking behaviors comprise holding cigarettes on the mouth and exhaling the smoke of the cigarettes, and holding the lighted cigarettes in the hands;
s2, adopting a YOLOv8 model, adding a small target detection layer, continuing the operations of a C2f module, up-sampling and the like on the original Neck module, and carrying out feature fusion convolution on the obtained feature map and a first layer feature map of a backbone network to obtain a feature map with the size of 160 x 160; and carrying out feature fusion convolution on the obtained feature map and a second layer feature map of the backbone network so as to obtain 80 x 80 feature maps. Because the size of the cigarette is too small, the sampling multiple of the YOLOv8 is large, the characteristic information of the cigarette is difficult to identify by the original characteristic image, and the improved YOLOv8 is added with a small target detection layer with the size of 160 x 160, so that the problem that the cigarette is difficult to detect is solved, the false detection rate is reduced, and the detection effect is improved;
s3, improving a YOLOv8 backbone network, replacing a Darknet53 network with a lightweight network MobileNet V3, specifically replacing C2f and a part Cnov in the YOLOv8 with Bneck, and adopting a depth separable convolution to reduce the calculated amount by a lightweight YOLOv8 model; the inverse residual structure is adopted, the original 1×1 convolution dimension reduction, 3×3 convolution and 1×1 convolution dimension increase are changed into 1×1 convolution dimension increase, 3×3 convolution and 1×1 convolution dimension reduction are changed into a structure with wider middle and narrower two ends, the channel number is improved, and the calculated amount is reduced; the attention mechanism SE is introduced, a large weight is given to important channels, a small weight is given to unimportant channels, and the task processing efficiency and accuracy are improved; the calculated amount of the lightweight YOLOv8 model is smaller, and the accuracy is higher;
s4, introducing an attention mechanism CBAM into a Neck of the YOLOv8 model, calculating attention force diagrams from two different dimensions of a channel and a space in sequence by a CBAM module, after introducing the attention mechanism CBAM into the Neck, focusing on cigarette information, ignoring other useless information, covering the characteristics to more positions of the cigarettes, and improving the accuracy;
s5, optimizing a Loss function, replacing the CIoU with the Loss function of the EIoU, wherein certain ambiguity exists in the aspect ratio of the CIoU, the EIoU respectively calculates a wide difference value and a high difference value to replace the aspect ratio on the basis of the CIoU, and meanwhile, focal-Loss is introduced to solve the problem of sample unbalance;
s6, inputting the images into a trained improved YOLOv8 model based on a preset test set, and detecting each image to obtain a target detection result.
Claims (1)
1. The public place smoking detection method based on the improved YOLOv8 is characterized by comprising the following steps of:
s1, acquiring a smoking image to form a data set, and dividing the data set by using Labellmg label images, wherein the smoking image refers to an image for detecting and labeling smoking behaviors in a monitoring environment by a monitoring video, and the smoking behaviors comprise holding cigarettes on the mouth and exhaling the smoke of the cigarettes, and holding the lighted cigarettes in the hands;
s2, adopting a YOLOv8 model, adding a small target detection layer, continuing the operations of a C2f module, up-sampling and the like on the original Neck module, and carrying out feature fusion convolution on the obtained feature map and a first layer feature map of a backbone network to obtain a feature map with the size of 160 x 160; and carrying out feature fusion convolution on the obtained feature map and a second layer feature map of the backbone network so as to obtain 80 x 80 feature maps. Because the size of the cigarette is too small, the sampling multiple of the YOLOv8 is large, the characteristic information of the cigarette is difficult to identify by the original characteristic image, and the improved YOLOv8 is added with a small target detection layer with the size of 160 x 160, so that the problem that the cigarette is difficult to detect is solved, the false detection rate is reduced, and the detection effect is improved;
s3, improving a YOLOv8 backbone network, replacing a Darknet53 network with a lightweight network MobileNet V3, specifically replacing C2f and a part Cnov in the YOLOv8 with Bneck, and adopting a depth separable convolution to reduce the calculated amount by a lightweight YOLOv8 model; the inverse residual structure is adopted, the original 1×1 convolution dimension reduction, 3×3 convolution and 1×1 convolution dimension increase are changed into 1×1 convolution dimension increase, 3×3 convolution and 1×1 convolution dimension reduction are changed into a structure with wider middle and narrower two ends, the channel number is improved, and the calculated amount is reduced; the attention mechanism SE is introduced, a large weight is given to important channels, a small weight is given to unimportant channels, and the task processing efficiency and accuracy are improved; the calculated amount of the lightweight YOLOv8 model is smaller, and the accuracy is higher;
s4, introducing an attention mechanism CBAM into a Neck of the YOLOv8 model, calculating attention force diagrams from two different dimensions of a channel and a space in sequence by a CBAM module, after introducing the attention mechanism CBAM into the Neck, focusing on cigarette information, ignoring other useless information, covering the characteristics to more positions of the cigarettes, and improving the accuracy;
s5, optimizing a Loss function, replacing the CIoU with the Loss function of the EIoU, wherein certain ambiguity exists in the aspect ratio of the CIoU, the EIoU respectively calculates a wide difference value and a high difference value to replace the aspect ratio on the basis of the CIoU, and meanwhile, focal-Loss is introduced to solve the problem of sample unbalance;
s6, inputting the images into a trained improved YOLOv8 model based on a preset test set, and detecting each image to obtain a target detection result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310848825.8A CN116912770A (en) | 2023-07-10 | 2023-07-10 | Public place smoking detection method based on improved YOLOv8 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310848825.8A CN116912770A (en) | 2023-07-10 | 2023-07-10 | Public place smoking detection method based on improved YOLOv8 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116912770A true CN116912770A (en) | 2023-10-20 |
Family
ID=88366048
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310848825.8A Pending CN116912770A (en) | 2023-07-10 | 2023-07-10 | Public place smoking detection method based on improved YOLOv8 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116912770A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117237741A (en) * | 2023-11-08 | 2023-12-15 | 烟台持久钟表有限公司 | Campus dangerous behavior detection method, system, device and storage medium |
CN117237741B (en) * | 2023-11-08 | 2024-02-13 | 烟台持久钟表有限公司 | Campus dangerous behavior detection method, system, device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111400547B (en) | Human-computer cooperation video anomaly detection method | |
CN110147763B (en) | Video semantic segmentation method based on convolutional neural network | |
CN111726586A (en) | Production system operation standard monitoring and reminding system | |
CN107729363B (en) | Bird population identification analysis method based on GoogLeNet network model | |
CN111814661A (en) | Human behavior identification method based on residual error-recurrent neural network | |
CN109034092A (en) | Accident detection method for monitoring system | |
CN116912770A (en) | Public place smoking detection method based on improved YOLOv8 | |
CN108229407A (en) | A kind of behavioral value method and system in video analysis | |
CN110133049A (en) | Tea grades fast non-destructive detection method based on electronic nose and machine vision | |
CN111144321B (en) | Concentration detection method, device, equipment and storage medium | |
CN112132009A (en) | Classroom behavior analysis method and system and electronic equipment | |
CN111145222A (en) | Fire detection method combining smoke movement trend and textural features | |
CN113963399A (en) | Personnel trajectory retrieval method and device based on multi-algorithm fusion application | |
CN116206112A (en) | Remote sensing image semantic segmentation method based on multi-scale feature fusion and SAM | |
CN109117774A (en) | A kind of multi-angle video method for detecting abnormality based on sparse coding | |
CN114241423A (en) | Intelligent detection method and system for river floaters | |
CN110991341A (en) | Method and device for detecting face image | |
CN111191498A (en) | Behavior recognition method and related product | |
CN112712008A (en) | Water environment early warning judgment method based on 3D convolutional neural network | |
CN115083229B (en) | Intelligent recognition and warning system of flight training equipment based on AI visual recognition | |
CN115953832A (en) | Semantic decoupling-based combined action recognition method of self-attention model | |
CN114550032A (en) | Video smoke detection method of end-to-end three-dimensional convolution target detection network | |
CN114005054A (en) | AI intelligence system of grading | |
CN113327236A (en) | Identification method and system of novel coronavirus antibody rapid detection reagent | |
CN116665016B (en) | Single-frame infrared dim target detection method based on improved YOLOv5 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||