CN116912770A - Public place smoking detection method based on improved YOLOv8

Public place smoking detection method based on improved YOLOv8

Info

Publication number
CN116912770A
CN116912770A
Authority
CN
China
Prior art keywords
yolov8
smoking
improved
model
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310848825.8A
Other languages
Chinese (zh)
Inventor
刘丽娟
张澳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Jiaotong University
Original Assignee
Dalian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Jiaotong University filed Critical Dalian Jiaotong University
Priority to CN202310848825.8A
Publication of CN116912770A
Legal status: Pending

Classifications

    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06N 3/045 Combinations of networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/806 Fusion of extracted features, i.e. combining data at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 Image or video recognition or understanding using neural networks
    • G06V 20/44 Event detection (scene-specific elements in video content)
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • Y02P 90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a public place smoking detection method based on an improved YOLOv8 model, used to detect smoking behavior in shopping-mall surveillance video, where smoking behavior includes holding a cigarette in the mouth and exhaling smoke, and holding a lit cigarette in the hand; the improved YOLOv8 model addresses problems such as a high false detection rate and low accuracy. The method comprises the following steps: S1, acquiring smoking images to form a data set, annotating the images with LabelImg, and dividing the data set; S2, adopting the YOLOv8 model and adding a small-target detection layer; S3, improving the YOLOv8 backbone network by replacing Darknet53 with the lightweight network MobileNetV3; S4, introducing the CBAM attention mechanism into the Neck of the YOLOv8 model; S5, optimizing the loss function by replacing CIoU with EIoU; S6, inputting images from a preset test set into the trained improved YOLOv8 model and detecting each image to obtain the target detection results. By improving the YOLOv8 model, the method increases detection accuracy, speed, and precision.

Description

Public place smoking detection method based on improved YOLOv8
Technical Field
The invention relates to a smoking detection method, in particular to a public place smoking detection method based on an improved YOLOv8, and belongs to the technical field of computer vision.
Background
Object detection is an important application direction in the field of computer vision. It is used to detect semantic objects of specific categories in images and videos and is widely applied in face recognition, intelligent transportation, medical diagnosis, security monitoring, and other fields. Smoking detection is a key research topic in security monitoring; its goal is to rapidly and accurately identify and locate smoking, record the detection results, and help issue a smoking alarm in time. Practice has shown that the harm caused by smoking in public places is considerable: it not only affects the health of others but can also lead to fires, air pollution, and other problems, making it an urgent public health issue, so research on smoking detection methods has great practical significance. Because smoking detection is a small-object detection task, the cigarette occupies only a small proportion of the image, its color is difficult to distinguish from that of the surrounding environment, and it is hard to capture, which leads to both a high missed detection rate and a high false detection rate.
Traditional smoking detection methods include smoke-sensor detection based on smoke concentration, biological detection based on sampled analysis indexes, and detection based on visual monitoring systems; these methods have low accuracy and are time-consuming.
Disclosure of Invention
To solve these problems, the invention provides a public place smoking detection method based on an improved YOLOv8. By applying and improving the YOLOv8 algorithm, which performs object recognition and localization with a deep neural network, the method addresses the high false detection rate and low accuracy of smoking detection in shopping-mall surveillance video.
The technical solution of the invention is realized as follows:
A public place smoking detection method based on improved YOLOv8 comprises the following steps:
s1, acquiring a smoking image to form a data set, and dividing the data set by using Labellmg label images, wherein the smoking image refers to an image for detecting and labeling smoking behaviors in a monitoring environment by a monitoring video, and the smoking behaviors comprise holding cigarettes on the mouth and exhaling the smoke of the cigarettes, and holding the lighted cigarettes in the hands;
s2, adopting a YOLOv8 model, adding a small target detection layer, continuing the operations of a C2f module, up-sampling and the like on the original Neck module, and carrying out feature fusion convolution on the obtained feature map and a first layer feature map of a backbone network to obtain a feature map with the size of 160 x 160; and carrying out feature fusion convolution on the obtained feature map and a second layer feature map of the backbone network so as to obtain 80 x 80 feature maps. Because the size of the cigarette is too small, the sampling multiple of the YOLOv8 is large, the characteristic information of the cigarette is difficult to identify by the original characteristic image, and the improved YOLOv8 is added with a small target detection layer with the size of 160 x 160, so that the problem that the cigarette is difficult to detect is solved, the false detection rate is reduced, and the detection effect is improved;
s3, improving a YOLOv8 backbone network, replacing a Darknet53 network with a lightweight network MobileNet V3, specifically replacing C2f and a part Cnov in the YOLOv8 with Bneck, and adopting a depth separable convolution to reduce the calculated amount by a lightweight YOLOv8 model; the inverse residual structure is adopted, the original 1×1 convolution dimension reduction, 3×3 convolution and 1×1 convolution dimension increase are changed into 1×1 convolution dimension increase, 3×3 convolution and 1×1 convolution dimension reduction are changed into a structure with wider middle and narrower two ends, the channel number is improved, and the calculated amount is reduced; the attention mechanism SE is introduced, a large weight is given to important channels, a small weight is given to unimportant channels, and the task processing efficiency and accuracy are improved; the calculated amount of the lightweight YOLOv8 model is smaller, and the accuracy is higher;
s4, introducing an attention mechanism CBAM into a Neck of the YOLOv8 model, calculating attention force diagrams from two different dimensions of a channel and a space in sequence by a CBAM module, after introducing the attention mechanism CBAM into the Neck, focusing on cigarette information, ignoring other useless information, covering the characteristics to more positions of the cigarettes, and improving the accuracy;
s5, optimizing a Loss function, replacing the CIoU with the Loss function of the EIoU, wherein certain ambiguity exists in the aspect ratio of the CIoU, the EIoU respectively calculates a wide difference value and a high difference value to replace the aspect ratio on the basis of the CIoU, and meanwhile, focal-Loss is introduced to solve the problem of sample unbalance;
s6, inputting the images into a trained improved YOLOv8 model based on a preset test set, and detecting each image to obtain a target detection result.
Compared with the prior art, the invention has the following features:
1. A small-target detection layer is added, addressing the high false detection rate caused by the small size of cigarettes;
2. The YOLOv8 backbone network is improved by replacing Darknet53 with the lightweight network MobileNetV3, reducing the number of parameters and the computational cost;
3. The CBAM attention mechanism is introduced, giving more weight to cigarette-related information and improving accuracy;
4. Smoking behavior in surveillance video is monitored in real time;
5. The loss function is improved by replacing CIoU with EIoU, and the problem of sample imbalance is addressed.
Drawings
The invention is illustrated by two drawings.
FIG. 1 is the overall network architecture diagram of YOLOv8 of the present invention;
FIG. 2 is the overall network architecture diagram of the improved YOLOv8 of the present invention.
Detailed Description
A public place smoking detection method based on improved YOLOv8, as shown in fig. 1 and 2, comprises the following steps (each step is further illustrated by a non-limiting sketch after the step listing):
S1, acquiring smoking images to form a data set, annotating the images with LabelImg, and dividing the data set, wherein a smoking image refers to a surveillance-video frame in which smoking behavior in the monitored environment is detected and labeled, and smoking behavior comprises holding a cigarette in the mouth and exhaling smoke, and holding a lit cigarette in the hand;
S2, adopting the YOLOv8 model and adding a small-target detection layer: in the original Neck module, the C2f module, upsampling, and related operations are continued, and the resulting feature map is fused by feature-fusion convolution with the first-layer feature map of the backbone network to obtain a feature map of size 160×160; the obtained feature map is then fused by feature-fusion convolution with the second-layer feature map of the backbone network to obtain an 80×80 feature map. Because cigarettes are very small and the downsampling factor of YOLOv8 is large, the original feature maps can hardly capture cigarette feature information; the improved YOLOv8 therefore adds a 160×160 small-target detection layer, which alleviates the difficulty of detecting cigarettes, reduces the false detection rate, and improves the detection effect;
S3, improving the YOLOv8 backbone network by replacing the Darknet53 network with the lightweight network MobileNetV3, specifically replacing the C2f modules and part of the Conv modules in YOLOv8 with Bneck modules; the lightweight YOLOv8 model adopts depthwise separable convolution to reduce the computational cost; an inverted residual structure is adopted, changing the original sequence of 1×1 convolution for dimension reduction, 3×3 convolution, and 1×1 convolution for dimension increase into 1×1 convolution for dimension increase, 3×3 convolution, and 1×1 convolution for dimension reduction, i.e. a structure that is wider in the middle and narrower at both ends, which raises the number of channels while reducing the computational cost; the SE attention mechanism is introduced, assigning large weights to important channels and small weights to unimportant channels, improving task-processing efficiency and accuracy; the lightweight YOLOv8 model thus has a lower computational cost and higher accuracy;
S4, introducing the CBAM attention mechanism into the Neck of the YOLOv8 model, where the CBAM module computes attention maps sequentially along two different dimensions, channel and spatial; after CBAM is introduced into the Neck, the model focuses on cigarette information and ignores other useless information, the features cover more cigarette locations, and accuracy is improved;
S5, optimizing the loss function by replacing the CIoU loss with the EIoU loss: the aspect-ratio term of CIoU is somewhat ambiguous, so EIoU, building on CIoU, computes the width difference and the height difference separately instead of the aspect ratio; Focal Loss is also introduced to address the problem of sample imbalance;
S6, inputting images from a preset test set into the trained improved YOLOv8 model and detecting each image to obtain the target detection results.

Claims (1)

1. A public place smoking detection method based on improved YOLOv8, characterized by comprising the following steps:
S1, acquiring smoking images to form a data set, annotating the images with LabelImg, and dividing the data set, wherein a smoking image refers to a surveillance-video frame in which smoking behavior in the monitored environment is detected and labeled, and smoking behavior comprises holding a cigarette in the mouth and exhaling smoke, and holding a lit cigarette in the hand;
S2, adopting the YOLOv8 model and adding a small-target detection layer: in the original Neck module, the C2f module, upsampling, and related operations are continued, and the resulting feature map is fused by feature-fusion convolution with the first-layer feature map of the backbone network to obtain a feature map of size 160×160; the obtained feature map is then fused by feature-fusion convolution with the second-layer feature map of the backbone network to obtain an 80×80 feature map. Because cigarettes are very small and the downsampling factor of YOLOv8 is large, the original feature maps can hardly capture cigarette feature information; the improved YOLOv8 therefore adds a 160×160 small-target detection layer, which alleviates the difficulty of detecting cigarettes, reduces the false detection rate, and improves the detection effect;
S3, improving the YOLOv8 backbone network by replacing the Darknet53 network with the lightweight network MobileNetV3, specifically replacing the C2f modules and part of the Conv modules in YOLOv8 with Bneck modules; the lightweight YOLOv8 model adopts depthwise separable convolution to reduce the computational cost; an inverted residual structure is adopted, changing the original sequence of 1×1 convolution for dimension reduction, 3×3 convolution, and 1×1 convolution for dimension increase into 1×1 convolution for dimension increase, 3×3 convolution, and 1×1 convolution for dimension reduction, i.e. a structure that is wider in the middle and narrower at both ends, which raises the number of channels while reducing the computational cost; the SE attention mechanism is introduced, assigning large weights to important channels and small weights to unimportant channels, improving task-processing efficiency and accuracy; the lightweight YOLOv8 model thus has a lower computational cost and higher accuracy;
S4, introducing the CBAM attention mechanism into the Neck of the YOLOv8 model, where the CBAM module computes attention maps sequentially along two different dimensions, channel and spatial; after CBAM is introduced into the Neck, the model focuses on cigarette information and ignores other useless information, the features cover more cigarette locations, and accuracy is improved;
S5, optimizing the loss function by replacing the CIoU loss with the EIoU loss: the aspect-ratio term of CIoU is somewhat ambiguous, so EIoU, building on CIoU, computes the width difference and the height difference separately instead of the aspect ratio; Focal Loss is also introduced to address the problem of sample imbalance;
S6, inputting images from a preset test set into the trained improved YOLOv8 model and detecting each image to obtain the target detection results.
CN202310848825.8A, filed 2023-07-10: Public place smoking detection method based on improved YOLOv8, published as CN116912770A (pending)

Priority Applications (1)

Application number: CN202310848825.8A (CN116912770A); priority date: 2023-07-10; filing date: 2023-07-10; title: Public place smoking detection method based on improved YOLOv8

Publications (1)

Publication number: CN116912770A; publication date: 2023-10-20

Family

ID=88366048

Family Applications (1)

CN202310848825.8A, filed 2023-07-10: Public place smoking detection method based on improved YOLOv8 (CN116912770A, pending)

Country Status (1)

CN: CN116912770A

Cited By (2)

* Cited by examiner, † Cited by third party

CN117237741A * (priority date 2023-11-08, published 2023-12-15, 烟台持久钟表有限公司): Campus dangerous behavior detection method, system, device and storage medium
CN117237741B * (priority date 2023-11-08, published 2024-02-13, 烟台持久钟表有限公司): Campus dangerous behavior detection method, system, device and storage medium

Similar Documents

Publication Publication Date Title
CN111400547B (en) Human-computer cooperation video anomaly detection method
CN110147763B (en) Video semantic segmentation method based on convolutional neural network
CN111726586A (en) Production system operation standard monitoring and reminding system
CN107729363B (en) Bird population identification analysis method based on GoogLeNet network model
CN111814661A (en) Human behavior identification method based on residual error-recurrent neural network
CN109034092A (en) Accident detection method for monitoring system
CN116912770A (en) Public place smoking detection method based on improved YOLOv8
CN108229407A (en) A kind of behavioral value method and system in video analysis
CN110133049A (en) Tea grades fast non-destructive detection method based on electronic nose and machine vision
CN111144321B (en) Concentration detection method, device, equipment and storage medium
CN112132009A (en) Classroom behavior analysis method and system and electronic equipment
CN111145222A (en) Fire detection method combining smoke movement trend and textural features
CN113963399A (en) Personnel trajectory retrieval method and device based on multi-algorithm fusion application
CN116206112A (en) Remote sensing image semantic segmentation method based on multi-scale feature fusion and SAM
CN109117774A (en) A kind of multi-angle video method for detecting abnormality based on sparse coding
CN114241423A (en) Intelligent detection method and system for river floaters
CN110991341A (en) Method and device for detecting face image
CN111191498A (en) Behavior recognition method and related product
CN112712008A (en) Water environment early warning judgment method based on 3D convolutional neural network
CN115083229B (en) Intelligent recognition and warning system of flight training equipment based on AI visual recognition
CN115953832A (en) Semantic decoupling-based combined action recognition method of self-attention model
CN114550032A (en) Video smoke detection method of end-to-end three-dimensional convolution target detection network
CN114005054A (en) AI intelligence system of grading
CN113327236A (en) Identification method and system of novel coronavirus antibody rapid detection reagent
CN116665016B (en) Single-frame infrared dim target detection method based on improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination