CN112818913B - Real-time smoking calling identification method - Google Patents


Info

Publication number: CN112818913B
Authority
CN
China
Prior art keywords: smoking, calling, real-time, pedestrian
Prior art date
Legal status: Active
Application number: CN202110207092.0A
Other languages: Chinese (zh)
Other versions: CN112818913A
Inventors: 张全, 赵磊, 彭博, 周文俊, 张伟, 涂然
Current Assignee: Southwest Petroleum University
Original Assignee: Southwest Petroleum University
Application filed by Southwest Petroleum University
Priority application: CN202110207092.0A
Application publication: CN112818913A
Grant publication: CN112818913B

Classifications

    • G06V 40/20: Recognition of biometric, human-related or animal-related patterns in image or video data; movements or behaviour, e.g. gesture recognition
    • G06F 18/214: Pattern recognition; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/23213: Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/253: Pattern recognition; fusion techniques of extracted features
    • G06V 20/52: Scenes; context or environment of the image; surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 2201/07: Indexing scheme relating to image or video recognition or understanding; target detection
    • Y02P 90/30: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation; computing systems specially adapted for manufacturing

Abstract

The invention discloses a real-time smoking and calling identification method comprising the following steps: S1: establishing a real-time smoking and calling identification model; S2: locating the pedestrian area in a surveillance video of the target scene with a multi-target tracking algorithm; S3: performing real-time smoking and calling identification on the pedestrian area with the model. The invention constructs the backbone network from Se-Res2Block basic modules, which fuses more features and raises detection speed. To address the low identification precision on small targets, the resolution of the input image is increased and the SPP and ASFF modules are introduced, strengthening context interaction and improving small-target precision. Most existing smoking and calling detection methods rely on single-frame detection and suffer a high false detection rate; the invention introduces multi-frame information through a multi-target tracking algorithm and computes the IOU between the pedestrian area and the rectangular boxes of the mobile phone and cigarette, which reduces the false detection rate and yields higher robustness.

Description

Real-time smoking calling identification method
Technical Field
The invention relates to the technical field of target detection in computer vision, in particular to a real-time smoking and calling identification method.
Background
Smoking and calling identification plays an important role in fields such as gas stations and chemical engineering. In practical applications, existing smoking and calling recognition algorithms have the following disadvantages: (1) smoking and calling detection typically employs target detection algorithms such as SSD and RCNN, but these algorithms demand substantial GPU resources, increasing deployment cost; (2) mobile phones and cigarettes are small targets and are difficult to detect; (3) single-frame detection yields a high false detection rate and low robustness.
Disclosure of Invention
In view of the above problems, the present invention is directed to a real-time smoking and phone call recognition method.
The technical scheme of the invention is as follows:
a real-time smoking and calling identification method comprises the following steps:
s1: establishing a real-time smoking and calling identification model;
s2: positioning a pedestrian area by utilizing a multi-target tracking algorithm according to a monitoring video of a target scene;
s3: and performing smoking and calling real-time identification on the pedestrian area according to the real-time smoking and calling identification model.
Preferably, in step S1, the establishing of the real-time smoking and calling recognition model specifically includes the following sub-steps:
s11: collecting smoking and calling picture data to obtain a data set, labeling objects in the data set, and dividing the data set into a training set, a verification set and a test set, wherein the objects comprise pedestrians, mobile phones and cigarettes;
s12: establishing an improved YOLOV3 model, which specifically comprises the following substeps:
s121: constructing a lightweight backbone network by taking Se-Res2Block as a basic module;
s122: introducing an SPP module behind the lightweight backbone network, and adjusting the resolution of an input image;
s123: adding an ASFF module to obtain the improved Yolov3 model;
s13: generating anchors by using the data in the training set through a kmeans clustering algorithm, wherein the number of the anchors required to be generated is 9;
s14: training and verifying the improved Yolov3 model by taking data of a training set and a verification set as input of the improved Yolov3 model;
s15: taking data of a test set as input of the improved Yolov3 model, and testing the accuracy of the improved Yolov3 model; and when the accuracy reaches a target threshold value, obtaining the real-time smoking and calling identification model.
Preferably, in step S11, when labeling the object in the data set, labeling is performed using yolomark.
Preferably, in step S11, the training set, the validation set and the test set are divided in a ratio of 8:1:1.
Preferably, in step S121, the Se-Res2Block module consists of two parts, namely a Res2Block module and a channel attention mechanism, with the channel attention mechanism arranged behind the Res2Block module.
Preferably, the lightweight backbone network comprises, connected in sequence: two 3 × 3 convolutions with stride 2; a bottleneck layer followed by three Se-Res2Block modules with 36 convolution kernels; a bottleneck layer followed by three Se-Res2Block modules with 72 convolution kernels; and a bottleneck layer followed by three Se-Res2Block modules with 144 convolution kernels.
Preferably, in step S122, introducing the SPP module behind the lightweight backbone network specifically comprises: downsampling the output of the upper layer through three maximum pooling kernels with stride 1 and sizes 5 × 5, 9 × 9 and 13 × 13, and fusing the three downsampled outputs with the output of the upper layer by concatenation, obtaining receptive fields of different sizes.
Preferably, in step S122, when the resolution of the input image is adjusted, the resolution of the input image is adjusted to 512 × 512.
Preferably, step S123 specifically comprises: adding the ASFF fusion mode on the basis of YOLOV3. YOLOV3 outputs at three scales; ASFF assigns a weight parameter to the feature layer of each scale, the three weight parameters summing to 1. During fusion, the feature layers of different scales are adjusted to the same size through upsampling or downsampling, and each feature layer is multiplied by its respective weight parameter to form the output.
Preferably, in step S14, when training the improved YOLOV3 model, data enhancement is performed on the data in the training set, specifically, data enhancement is performed by changing one or more of an angle, a contrast and a brightness of a picture of the data set.
Preferably, in step S2, when the pedestrian area is located, a deepsort multi-target tracking algorithm is used for locating.
Preferably, in step S3, when performing real-time smoking and calling identification on the pedestrian area: when the confidence output by the real-time smoking and calling identification model exceeds the threshold of 0.5, a valid cigarette or mobile phone is considered detected, and the smoking and calling judgment is then carried out, the specific judgment method comprising:
determining a pedestrian head area;
if the detection result is the mobile phone and the IOU values of the mobile phone area and the pedestrian head area are larger than 0.08, the calling behavior is considered to exist;
if the detection result is a cigarette and the IOU values of the cigarette area and the pedestrian head area are larger than 0, the smoking behavior is considered to exist;
recording the judgment result of each frame for each pedestrian: if the judgment in the current frame is that smoking or calling behaviour exists, the score is 1 × 0.2; if the judgment is normal, the score is 1 × 0; this gives the instantaneous judgment result of the current frame;
adding the instantaneous judgment results of the same pedestrian in 5 continuous frames, if the calculation result is more than 0.5, considering that the pedestrian has smoking or calling behaviors in the period of time, otherwise, judging that the pedestrian is normal.
Preferably, when the head area of the pedestrian is determined, the pedestrian is divided according to the proportion of the head to the body, wherein the proportion of the head to the body is 1.
The invention has the beneficial effects that:
in the real-time smoking and calling identification model, the backbone network is constructed from Se-Res2Block basic modules, which fuses more features at higher speed; the resolution of the input picture is increased and the SPP and ASFF modules are introduced, which better fuses context information and improves the detection precision of small target objects; most existing smoking and calling methods adopt single-frame detection and have a high false detection rate, whereas the invention introduces multi-frame information through a multi-target tracking algorithm and calculates the IOU between the pedestrian area and the rectangular boxes of the mobile phone and cigarette, reducing the false detection rate and giving higher robustness.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a real-time smoking and phone call recognition method according to the present invention;
FIG. 2 is a schematic diagram of a Se-Res2Block base module according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating sizes of convolution kernels and a connection manner adopted by an SPP module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an ASFF module fusion in accordance with an embodiment of the present invention;
FIG. 5 is a schematic diagram of an improved YOLOV3 detection network according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating the results of cigarette testing in accordance with one embodiment of the improved YOLOV3 network of the present invention;
FIG. 7 is a diagram illustrating a cigarette inspection result of an embodiment of an original Yolov3 network;
fig. 8 is a schematic diagram illustrating a mobile phone detection result of an embodiment of the improved YOLOV3 network according to the present invention;
fig. 9 is a schematic diagram of a mobile phone detection result of an embodiment of an original YOLOV3 network;
fig. 10 is a schematic diagram of a cigarette, a cell phone area and a pedestrian head area IOU according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
It should be noted that, in the present application, the embodiments and the technical features of the embodiments may be combined with each other without conflict.
It is noted that, unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
As shown in fig. 1-10, a real-time smoking and calling identification method includes the following steps:
s1: establishing a real-time smoking and calling identification model, which specifically comprises the following substeps:
s11: the method comprises the steps of collecting smoking and calling picture data to obtain a data set, marking objects in the data set, and dividing the data set into a training set, a verification set and a test set, wherein the objects comprise pedestrians, mobile phones and cigarettes.
In a specific embodiment, smoking and calling behaviors are simulated under a monitoring camera, then a monitoring video is collected, a picture is captured from the obtained video every 24 frames, useless pictures are manually removed, and effective pictures are used as a data set.
In a specific embodiment, when the objects in the data set are labeled, yolomark is used for labeling. It should be noted that, in addition to the labeling by using the method in the present embodiment, other labeling methods in the prior art may also be used for labeling.
In a specific embodiment, the training set, the validation set and the test set are divided in a ratio of 8:1:1. It should be noted that the division ratio may be adjusted according to the amount of data in the data set; besides the ratio used in this embodiment, other division ratios such as 6:2:2 may also be used.
S12: establishing an improved YOLOV3 model, which specifically comprises the following substeps:
s121: constructing a lightweight backbone network with Se-Res2Block as the basic module. The Se-Res2Block module consists of a Res2Block module and a channel attention mechanism, the channel attention mechanism being arranged behind the Res2Block module. The Res2Block module first passes the input features through a bottleneck layer and then splits them into four parts X1, X2, X3 and X4. The X2 part extracts features with a grouped convolution of kernel size 3 and group number 2, and is then fused with the X3 part by addition; the X3 and X4 parts fuse features in the same way. Finally, the X1 part and the feature-extracted X2, X3 and X4 parts are fused by concatenation; the fused features pass through the channel attention mechanism to extract useful features, and a bottleneck layer reduces dimensionality at the end. The channel attention mechanism compresses the features by average pooling, then reduces the dimension to 1/r through an FC layer (r is taken as 16 in a specific embodiment), finally adjusts the weight of each channel with a logistic function, and multiplies the obtained weights with the corresponding input features. The lightweight backbone network comprises, connected in sequence: two 3 × 3 convolutions with stride 2; a bottleneck layer followed by three Se-Res2Block modules with 36 convolution kernels; a bottleneck layer followed by three Se-Res2Block modules with 72 convolution kernels; and a bottleneck layer followed by three Se-Res2Block modules with 144 convolution kernels.
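The channel attention step described above (global average pooling, an FC layer reducing to 1/r, a logistic gate, then channel-wise rescaling) can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the patented implementation: the function name `se_attention` and the explicit weight matrices `w1`/`w2` are made up for the example, and a real network learns these weights and operates on tensors.

```python
import math

def se_attention(features, w1, w2):
    """Squeeze-and-excitation channel attention sketch.

    features: list of C channels, each a 2-D list (H x W).
    w1: C x (C/r) weight matrix of the reducing FC layer.
    w2: (C/r) x C weight matrix of the expanding FC layer.
    """
    c = len(features)
    # Squeeze: one scalar per channel via global average pooling.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Excitation, part 1: FC down to C/r units with ReLU.
    hidden = [max(0.0, sum(z[i] * w1[i][j] for i in range(c)))
              for j in range(len(w1[0]))]
    # Excitation, part 2: FC back up to C units with a logistic (sigmoid) gate.
    scores = [1.0 / (1.0 + math.exp(-sum(hidden[k] * w2[k][j]
                                         for k in range(len(hidden)))))
              for j in range(c)]
    # Scale: multiply every value of each channel by that channel's weight.
    scaled = [[[v * scores[ch] for v in row] for row in features[ch]]
              for ch in range(c)]
    return scaled, scores
```

With C = 4 channels and r = 2 the weight matrices are 4 × 2 and 2 × 4, and every channel weight lands strictly inside (0, 1) because of the sigmoid.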
S122: and introducing an SPP module behind the lightweight backbone network, and adjusting the resolution of the input image.
In a specific embodiment, the introduction of the SPP module behind the lightweight backbone network is specifically: an SPP module is connected behind the lightweight backbone network; the output of the upper layer is downsampled through three maximum pooling kernels with stride 1 and sizes 5 × 5, 9 × 9 and 13 × 13, and the three downsampled outputs are concatenated with the output of the upper layer to realize multi-feature fusion and obtain receptive fields of different sizes.
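The SPP step above, stride-1 max pooling at 5 × 5, 9 × 9 and 13 × 13 followed by concatenation with the input, can be illustrated with a minimal pure-Python sketch. The helper names `max_pool_same` and `spp` are assumptions, and clipping the pooling window at the border stands in for the implicit padding a deep learning framework would apply.

```python
def max_pool_same(x, k):
    """Stride-1 max pooling over a 2-D list, output same size as input
    (the k x k window is clipped at the borders instead of zero-padded)."""
    h, w = len(x), len(x[0])
    p = k // 2
    return [[max(x[a][b]
                 for a in range(max(0, i - p), min(h, i + p + 1))
                 for b in range(max(0, j - p), min(w, j + p + 1)))
             for j in range(w)]
            for i in range(h)]

def spp(channels, kernels=(5, 9, 13)):
    """Concatenate the input channels with their pooled copies at each
    kernel size, realizing the multi-feature fusion of the SPP module."""
    fused = [list(map(list, ch)) for ch in channels]   # original features first
    for k in kernels:
        fused.extend(max_pool_same(ch, k) for ch in channels)
    return fused
```

Because the stride is 1 and the window is clipped, each pooled map keeps the spatial size of its input, which is exactly what makes the concatenation possible.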
In a specific embodiment, when the resolution of the input image is adjusted, if the resolution of the input image is smaller, the resolution of the input image is increased to 512 × 512, so that the identification precision of small objects can be improved; if the resolution of the input image is large, the resolution of the input image is reduced to 512 × 512, which can reduce the amount of calculation and increase the calculation speed. It should be noted that the resolution of 512 × 512 is a preferable resolution in this embodiment, and in actual application, other resolutions may be adopted according to the recognition accuracy requirement and the calculation requirement.
S123: adding an ASFF module to obtain the improved YOLOV3 model. Specifically, the ASFF fusion mode is added on the basis of YOLOV3. YOLOV3 outputs features at three scales, level1, level2 and level3, denoted X^1, X^2 and X^3 respectively; a weight parameter α^l, β^l, γ^l is designed for the feature layer of each scale l, and the three weight parameters sum to 1. Because an addition mode is adopted, the feature maps must be the same size during fusion: the feature layers of different scales are adjusted to the same size through upsampling or downsampling, and each feature layer is multiplied by its respective weight parameter to form the output. This can be expressed by the following formula:

y_{ij}^{l} = α_{ij}^{l} · x_{ij}^{1→l} + β_{ij}^{l} · x_{ij}^{2→l} + γ_{ij}^{l} · x_{ij}^{3→l},   l = 1, 2, 3

where x_{ij}^{n→l} is the feature at position (i, j) of level n after resizing to the resolution of level l. The weight parameters α, β and γ are obtained from the level1-level3 feature maps by 1 × 1 convolution, and after concatenation they are constrained to the range [0, 1] by a softmax, so that α_{ij}^{l} + β_{ij}^{l} + γ_{ij}^{l} = 1.
S13: and generating anchors by using the data in the training set through a kmeans clustering algorithm, wherein the number of the anchors required to be generated is 9.
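The anchor-generation step can be sketched as k-means over box shapes with 1 - IOU as the distance, as is conventional for YOLO-family models. The deterministic initialisation from the first k boxes and the mean update below are simplifying assumptions; the patent does not specify them.

```python
def iou_wh(b1, b2):
    """IOU of two (w, h) box shapes aligned at a common corner."""
    inter = min(b1[0], b2[0]) * min(b1[1], b2[1])
    union = b1[0] * b1[1] + b2[0] * b2[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=20):
    """k-means on labelled box shapes using 1 - IOU as the distance."""
    centroids = [list(b) for b in boxes[:k]]           # naive deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:                                # assign to nearest centroid
            best = min(range(k), key=lambda c: 1 - iou_wh(b, centroids[c]))
            clusters[best].append(b)
        for c, members in enumerate(clusters):         # update as the mean box
            if members:
                centroids[c] = [sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)]
    return sorted(centroids, key=lambda c: c[0] * c[1])
```

In the real pipeline k = 9 anchors are generated from the training-set labels and split across the three output scales; the tiny two-cluster example below is only for illustration.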
S14: taking data of a training set and a verification set as input of an improved YOLOV3 model, and training and verifying the improved YOLOV3 model;
in a specific embodiment, when the improved YOLOV3 model is trained, data enhancement is performed on the data in the training set, specifically by changing one or more of the angle, contrast and brightness of the pictures in the data set. It should be noted that training with data enhancement increases the sample size, enhances the diversity of the data set, and thereby improves the accuracy and robustness of the model; however, it is not an essential technical means, and data enhancement may be omitted when the data set is already sufficiently large and diverse.
S15: taking data of a test set as input of the improved Yolov3 model, and testing the accuracy of the improved Yolov3 model; and when the accuracy reaches a target threshold value, obtaining the real-time smoking and calling identification model.
In a specific embodiment, the target threshold is 80%. It should be noted that the target threshold is determined according to the user's requirement for accuracy; besides the target threshold of this embodiment, other target thresholds such as 85%, 90% and 95% may also be used.
In a specific embodiment, the detection performance of the improved YOLOV3 model of the present invention and the original YOLOV3 model on cell phones and cigarettes was verified on a test data set, and the test results are shown in fig. 6-9. In this embodiment, the precision of the improved YOLOV3 model of the present invention is 90.81%, and the precision of the original YOLOV3 model is 75.6%. Meanwhile, the experimental result shows that the improved YOLOV3 model can detect the mobile phone and the cigarette more easily in the same picture, and the detection effect on the mobile phone and the cigarette is obvious.
S2: locating the pedestrian area with a multi-target tracking algorithm according to the surveillance video of the target scene. In a specific embodiment, the deepsort multi-target tracking algorithm is adopted for locating the pedestrian area. It should be noted that the multi-target tracking algorithm serves mainly to introduce multi-frame information; besides the deepsort algorithm adopted in this embodiment, other back-end tracking optimization algorithms such as SORT, which combine Kalman filtering with Hungarian/KM matching, or multithreaded single-target tracking algorithms such as KCF, may also be adopted.
S3: performing real-time smoking and calling identification on the pedestrian area according to the real-time smoking and calling identification model, specifically as follows: when the confidence output by the real-time smoking and calling identification model exceeds the threshold of 0.5, a valid cigarette or mobile phone is considered detected, and the smoking and calling judgment is then carried out. The specific judgment method is as follows:
determining a pedestrian head area; in a specific embodiment, when determining the head area of the pedestrian, the head area is divided according to the ratio of the head to the body, wherein the ratio of the head to the body is 1. It should be noted that, in addition to the determination of the head area of the pedestrian by using the proportional division of the head and the body in the embodiment, other prior art techniques may be used to determine the head area of the pedestrian.
If the detection result is the mobile phone and the IOU values of the mobile phone area and the pedestrian head area are larger than 0.08, the calling behavior is considered to exist;
if the detection result is a cigarette and the IOU values of the cigarette area and the pedestrian head area are larger than 0, the smoking behavior is considered to exist;
recording the judgment result of each frame for each pedestrian: if the judgment in the current frame is that smoking or calling behaviour exists, the score is 1 × 0.2; if the judgment is normal, the score is 1 × 0; this gives the instantaneous judgment result of the current frame;
adding the instantaneous judgment results of the same pedestrian in 5 continuous frames, if the calculation result is more than 0.5, considering that the pedestrian has smoking or calling behaviors in the period of time, otherwise, judging that the pedestrian is normal.
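The judgment procedure above, an IOU test between the detected object box and the pedestrian head area followed by a five-frame vote in which each positive frame contributes 1 × 0.2, can be sketched as follows. The function names `iou` and `smoking_or_calling` are assumptions, and scanning every window of 5 consecutive frames is one plausible reading of the rule.

```python
def iou(a, b):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def smoking_or_calling(frame_flags):
    """Temporal vote: a positive frame scores 1 * 0.2, a normal frame 1 * 0;
    the behaviour is confirmed when any 5 consecutive frames sum to more
    than 0.5, i.e. at least 3 positive detections out of 5."""
    scores = [0.2 if f else 0.0 for f in frame_flags]
    return any(sum(scores[i:i + 5]) > 0.5
               for i in range(len(scores) - 4))
```

For a calling decision the phone box must overlap the head area with IOU above 0.08; for smoking any positive overlap with the cigarette box suffices, which matches the much smaller size of a cigarette relative to a phone.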
The pedestrian area is located with the multi-target tracking algorithm, and combining multi-frame information reduces the false detection rate. Target detection alone is single-frame detection, in which small targets such as cigarettes and mobile phones are easily misidentified; introducing the multi-target tracking algorithm, and with it the information of consecutive frames, greatly reduces the false detection rate and improves the robustness of the model. In addition, the IOU is calculated between the pedestrian head position and the positions of the cigarette and mobile phone, and a threshold is set to screen the result, so the false detection rate is low.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. A real-time smoking and calling identification method is characterized by comprising the following steps:
s1: the method for establishing the real-time smoking and calling identification model specifically comprises the following substeps:
s11: collecting smoking and calling picture data to obtain a data set, labeling objects in the data set, and dividing the data set into a training set, a verification set and a test set, wherein the objects comprise pedestrians, mobile phones and cigarettes;
s12: establishing an improved YOLOV3 model, which specifically comprises the following substeps:
s121: constructing a lightweight backbone network with Se-Res2Block as the basic module, wherein the lightweight backbone network comprises, connected in sequence: two 3 × 3 convolutions with stride 2; a bottleneck layer followed by three Se-Res2Block modules with 36 convolution kernels; a bottleneck layer followed by three Se-Res2Block modules with 72 convolution kernels; and a bottleneck layer followed by three Se-Res2Block modules with 144 convolution kernels;
s122: introducing an SPP module behind the lightweight backbone network and adjusting the resolution of the input image; the introduction of the SPP module behind the lightweight backbone network specifically comprises: downsampling the output of the upper layer through three maximum pooling kernels with stride 1 and sizes 5 × 5, 9 × 9 and 13 × 13, and fusing the three downsampled outputs with the output of the upper layer by concatenation to obtain receptive fields of different sizes;
s123: adding an ASFF module to obtain the improved Yolov3 model;
s13: generating anchors by using the data in the training set through a kmeans clustering algorithm, wherein the number of the anchors required to be generated is 9;
s14: training and verifying the improved Yolov3 model by taking data of a training set and a verification set as input of the improved Yolov3 model;
s15: taking data of a test set as input of the improved Yolov3 model, and testing the accuracy of the improved Yolov3 model; when the accuracy reaches a target threshold value, obtaining the real-time smoking and calling identification model;
s2: positioning a pedestrian area by utilizing a multi-target tracking algorithm according to a monitoring video of a target scene;
s3: according to the real-time smoking and calling identification model, smoking and calling real-time identification is carried out on the pedestrian area, and the smoking and calling real-time identification result of the pedestrian is determined according to the instantaneous judgment addition result of continuous 5 frames of the same pedestrian;
when performing real-time smoking and calling identification on the pedestrian area: when the confidence output by the real-time smoking and calling identification model exceeds the threshold of 0.5, a valid cigarette or mobile phone is considered detected, and the smoking and calling judgment is carried out, the specific judgment method being as follows:
determining a pedestrian head area;
if the detection result is the mobile phone and the IOU values of the mobile phone area and the pedestrian head area are larger than 0.08, the calling behavior is considered to exist;
if the detection result is a cigarette and the IOU values of the cigarette area and the pedestrian head area are larger than 0, the smoking behavior is considered to exist;
recording the judgment result of each frame for each pedestrian: if the judgment in the current frame is that smoking or calling behaviour exists, the score is 1 × 0.2; if the judgment is normal, the score is 1 × 0; this gives the instantaneous judgment result of the current frame;
adding the instantaneous judgment results of the same pedestrian in 5 continuous frames, if the calculation result is more than 0.5, considering that the pedestrian has smoking or calling behaviors in the period of time, otherwise, judging that the pedestrian is normal.
2. The method according to claim 1, wherein in step S11, yolomark is used for labeling the objects in the data set.
3. The real-time smoking call recognition method according to claim 1, wherein in step S121, the Se-Res2Block module is composed of two parts, namely a Res2Block module and a channel attention mechanism, and the channel attention mechanism is arranged behind the Res2Block module.
4. The real-time smoking calling identification method according to claim 1, wherein the step S123 specifically comprises: increasing a fusion mode of ASFF on the basis of YOLOV3, wherein the YOLOV3 is output in three scales, the ASFF designs a weight parameter for the feature layer in each scale, the sum of the three weight parameters is 1, the feature layers in different scales are adjusted to be the same in size through up-sampling or down-sampling during fusion, and each feature layer is multiplied by the respective weight parameter to serve as output.
5. The method of claim 1, wherein in step S14, when the improved YOLOV3 model is trained, data enhancement is performed on data in the training set, specifically by changing one or more of an angle, a contrast, and a brightness of a picture of the data set.
6. The real-time smoking and calling identification method of claim 1, wherein in step S2, the pedestrian region is located using the DeepSort multi-target tracking algorithm.
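Claim 6 names the DeepSort tracker. As a rough stand-in for its data-association step only (the real algorithm additionally uses Kalman-filter motion prediction and appearance embeddings), a greedy IOU matcher can illustrate how detections are linked to existing pedestrian tracks:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    if inter == 0:
        return 0.0
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def greedy_match(tracks, detections, iou_thresh=0.3):
    """tracks: {track_id: box}; detections: list of boxes.

    Returns {detection_index: track_id} for the best-overlapping pairs.
    """
    assignments, used = {}, set()
    for di, det in enumerate(detections):
        best_id, best_iou = None, iou_thresh
        for tid, tbox in tracks.items():
            if tid in used:
                continue
            v = iou(det, tbox)
            if v > best_iou:
                best_id, best_iou = tid, v
        if best_id is not None:
            assignments[di] = best_id
            used.add(best_id)
    return assignments
```

Per-track identity is what makes the 5-frame temporal rule of claim 1 well-defined: the instantaneous scores being summed must all belong to the same pedestrian.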
CN202110207092.0A 2021-02-24 2021-02-24 Real-time smoking calling identification method Active CN112818913B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110207092.0A CN112818913B (en) 2021-02-24 2021-02-24 Real-time smoking calling identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110207092.0A CN112818913B (en) 2021-02-24 2021-02-24 Real-time smoking calling identification method

Publications (2)

Publication Number Publication Date
CN112818913A CN112818913A (en) 2021-05-18
CN112818913B true CN112818913B (en) 2023-04-07

Family

ID=75865407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110207092.0A Active CN112818913B (en) 2021-02-24 2021-02-24 Real-time smoking calling identification method

Country Status (1)

Country Link
CN (1) CN112818913B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591662A (en) * 2021-07-24 2021-11-02 深圳市铁越电气有限公司 Method, system and storage medium for recognizing smoking calling behavior
CN115880683B (en) * 2023-03-02 2023-05-16 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) Urban waterlogging ponding intelligent water level detection method based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807429A (en) * 2019-10-23 2020-02-18 西安科技大学 Construction safety detection method and system based on tiny-YOLOv3
CN111368696A (en) * 2020-02-28 2020-07-03 淮阴工学院 Dangerous chemical transport vehicle illegal driving behavior detection method and system based on visual cooperation
CN112052815A (en) * 2020-09-14 2020-12-08 北京易华录信息技术股份有限公司 Behavior detection method and device and electronic equipment

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020596B (en) * 2012-12-05 2016-06-22 华北电力大学 A kind of based on Human bodys' response method abnormal in the power generation of block models
CN105913022A (en) * 2016-04-11 2016-08-31 深圳市飞瑞斯科技有限公司 Handheld calling state determining method and handheld calling state determining system based on video analysis
CN108960216A (en) * 2018-09-21 2018-12-07 浙江中正智能科技有限公司 A kind of detection of dynamic human face and recognition methods
US11308325B2 (en) * 2018-10-16 2022-04-19 Duke University Systems and methods for predicting real-time behavioral risks using everyday images
CN110633643A (en) * 2019-08-15 2019-12-31 青岛文达通科技股份有限公司 Abnormal behavior detection method and system for smart community
CN110723621B (en) * 2019-10-11 2021-09-17 浙江新再灵科技股份有限公司 Device and method for detecting smoking in elevator car based on deep neural network
CN111222449B (en) * 2020-01-02 2023-04-11 上海中安电子信息科技有限公司 Driver behavior detection method based on fixed camera image
CN111898514B (en) * 2020-07-24 2022-10-18 燕山大学 Multi-target visual supervision method based on target detection and action recognition
CN112115775A (en) * 2020-08-07 2020-12-22 北京工业大学 Smoking behavior detection method based on computer vision in monitoring scene
CN112257643A (en) * 2020-10-30 2021-01-22 天津天地伟业智能安全防范科技有限公司 Smoking behavior and calling behavior identification method based on video streaming

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807429A (en) * 2019-10-23 2020-02-18 西安科技大学 Construction safety detection method and system based on tiny-YOLOv3
CN111368696A (en) * 2020-02-28 2020-07-03 淮阴工学院 Dangerous chemical transport vehicle illegal driving behavior detection method and system based on visual cooperation
CN112052815A (en) * 2020-09-14 2020-12-08 北京易华录信息技术股份有限公司 Behavior detection method and device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Driver Hand Motion Detection Method Based on Pose Estimation; Liu Tangbo et al.; Journal of Signal Processing (《信号处理》), Issue 12, pp. 136-143 *

Also Published As

Publication number Publication date
CN112818913A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN110929560B (en) Video semi-automatic target labeling method integrating target detection and tracking
CN112818913B (en) Real-time smoking calling identification method
CN109977782B (en) Cross-store operation behavior detection method based on target position information reasoning
CN102831402B (en) Sparse coding and visual saliency-based method for detecting airport through infrared remote sensing image
CN109815863B (en) Smoke and fire detection method and system based on deep learning and image recognition
CN108388879A (en) Mesh object detection method, device and storage medium
CN109858547A (en) A kind of object detection method and device based on BSSD
CN111222396A (en) All-weather multispectral pedestrian detection method
CN111160249A (en) Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN113762209A (en) Multi-scale parallel feature fusion road sign detection method based on YOLO
CN108647695A (en) Soft image conspicuousness detection method based on covariance convolutional neural networks
WO2023083280A1 (en) Scene text recognition method and device
CN110781917B (en) Method and device for detecting repeated image, electronic equipment and readable storage medium
CN111274942A (en) Traffic cone identification method and device based on cascade network
CN111175318A (en) Screen scratch fragmentation detection method and equipment
CN113723377A (en) Traffic sign detection method based on LD-SSD network
CN111539456B (en) Target identification method and device
CN111881984A (en) Target detection method and device based on deep learning
CN115115973A (en) Weak and small target detection method based on multiple receptive fields and depth characteristics
CN115719463A (en) Smoke and fire detection method based on super-resolution reconstruction and adaptive extrusion excitation
CN111553337A (en) Hyperspectral multi-target detection method based on improved anchor frame
CN113095445B (en) Target identification method and device
CN111476314B (en) Fuzzy video detection method integrating optical flow algorithm and deep learning
CN111881914A (en) License plate character segmentation method and system based on self-learning threshold
CN113111888B (en) Picture discrimination method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant