CN114170677A - Network model training method and equipment for detecting smoking behavior - Google Patents

Network model training method and equipment for detecting smoking behavior Download PDF

Info

Publication number
CN114170677A
Authority
CN
China
Prior art keywords
network model
image
data set
smoking
training data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111342312.7A
Other languages
Chinese (zh)
Inventor
杨之乐
杨猛
郭媛君
王尧
冯伟
吴承科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111342312.7A priority Critical patent/CN114170677A/en
Priority to PCT/CN2021/138039 priority patent/WO2023082407A1/en
Publication of CN114170677A publication Critical patent/CN114170677A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a network model training method and device for detecting smoking behavior. The network model training method comprises the following steps: establishing an initial network model; acquiring a training data set related to smoking behavior, and labeling the training data set to obtain labeling information for the training data set; inputting the training data set and its labeling information into a one-stage network model to generate a detection frame for each image in the training data set; inputting each image in the training data set and its detection frame into the two-stage network model to generate a judgment result about smoking behavior in each image; and training the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior. In this way, the network model training method obtains a network model for detecting smoking behavior that overcomes the limitations of the traditional method and achieves good accuracy.

Description

Network model training method and equipment for detecting smoking behavior
Technical Field
The application relates to the technical field of image recognition, in particular to a network model training method and equipment for detecting smoking behaviors.
Background
Smoking is one of the ten prohibitions on construction sites; if smoking behavior is not controlled, the possibility of fire increases, potentially causing serious property loss and casualties. At present, smoking behavior on construction sites is mostly controlled through smoke detectors and manual monitoring. However, a smoke sensor is limited by space: in an outdoor scene the space is very large and the detection accuracy of the smoke sensor decreases, while manual monitoring wastes manpower and cannot achieve real-time detection. Therefore, detecting smoking behavior on construction sites accurately and in real time is an urgent problem.
The traditional method for detecting unsafe worker behavior relies on manual inspection, whose accuracy depends entirely on the attention of the inspecting personnel; as the coverage area of construction sites grows larger and larger, relying on manpower to detect whether workers on a site exhibit unsafe behavior becomes increasingly difficult.
Disclosure of Invention
The application provides a network model training method and equipment for detecting smoking behaviors.
The application provides a network model training method for detecting smoking behaviors, which comprises the following steps:
establishing an initial network model, wherein the initial network model comprises a one-stage network model and a two-stage network model;
acquiring a training data set related to the smoking behavior, and labeling the training data set to obtain labeling information related to the training data set;
inputting the training data set and the labeling information thereof into the one-stage network model to generate a detection frame of each image in the training data set;
inputting each image and the detection frame thereof in the training data set into the two-stage network model to generate a judgment result about smoking behavior in each image in the training data set;
and training the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior.
Wherein said obtaining a training data set relating to said smoking behaviour comprises:
under different external conditions, carrying out image acquisition on smoking behaviors in a preset scene to obtain a smoking behavior image;
under different external conditions, carrying out image acquisition on the non-smoking behavior in a preset scene to obtain a non-smoking behavior image;
establishing the training data set based on the smoking behavior image and the non-smoking behavior image.
Wherein the labeling of the training data set comprises:
labeling a worker detection box and a worker skeletal point in each image in the training dataset;
wherein the worker skeletal points comprise basic skeletal points and hand skeletal points.
Wherein the inputting the training data set and its labeling information into the one-stage network model comprises:
normalizing all images in the training data set;
and inputting the normalized image and the labeling information thereof into the one-stage network model.
The loss function of the first-stage network model comprises a detection frame classification loss function and a detection frame coordinate loss function;
the detection frame classification loss function is used for learning the relation between the confidence coefficient of the human classification in the prediction detection frame and the confidence coefficient of the human classification in the labeling detection frame, and the detection frame coordinate loss function is used for learning the relation between the coordinate position of the prediction detection frame and the coordinate position of the labeling detection frame.
Before each image and its detection box in the training data set are input into the two-stage network model, the network model training method further includes:
screening the detection frames in the image by adopting a preset algorithm;
and fusing the screened detection frame with the corresponding image.
Wherein the generating of the determination result about smoking behavior in each image in the training dataset comprises:
acquiring a pre-marked real smoking action;
inputting each image and the detection frame thereof in the training data set into the two-stage network model to obtain a preset smoking action in the detection frame of each image;
calculating the attitude distance between the real smoking action and the preset smoking action;
and when the gesture distance is smaller than a preset threshold value, confirming that the smoking behavior exists in the range of the detection frame of the image.
Wherein, the calculating the gesture distance between the real smoking action and the preset smoking action comprises:
acquiring the number of joints of the real smoking action matched with the preset smoking action;
calculating the space distance of the same joint in the real smoking action and the preset smoking action;
and calculating the posture distance between the real smoking action and the preset smoking action according to the matched joint number and the space distance of the same joint.
Wherein, the network model training method further comprises:
acquiring a real-time monitoring image;
inputting the real-time monitoring image into the preset network model to obtain a judgment result output by the preset network model;
and confirming whether the smoking behavior exists in the real-time monitoring image based on the judgment result.
The application also provides a terminal device comprising a memory and a processor, wherein the memory is coupled to the processor;
the memory is used for storing program data, and the processor is used for executing the program data to realize the network model training method.
The present application also provides a computer storage medium for storing program data which, when executed by a processor, is used to implement the network model training method described above.
The beneficial effect of this application is: the terminal device establishes an initial network model; acquires a training data set related to smoking behavior and labels it to obtain labeling information for the training data set; inputs the training data set and its labeling information into a one-stage network model to generate a detection frame for each image in the training data set; inputs each image in the training data set and its detection frame into the two-stage network model to generate a judgment result about smoking behavior in each image; and trains the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior. In this way, the network model training method obtains a network model for detecting smoking behavior that overcomes the limitations of the traditional method and achieves good accuracy.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts. Wherein:
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a network model training method for detecting smoking behavior provided herein;
FIG. 2 is a block diagram of an embodiment of an overall network model provided herein;
fig. 3 is a schematic structural diagram of an embodiment of a terminal device provided in the present application;
fig. 4 is a schematic structural diagram of an embodiment of a terminal device provided in the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a computer storage medium provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To solve the above technical problems, the application discloses a deep learning-based two-stage detection method for unsafe behaviors of construction site workers, which is suitable for worker behavior detection on construction sites in major cities. The method is based on a deep learning framework, has good autonomous learning capability, fault tolerance and generalization over the various external factors that are difficult to quantify, and has good extensibility and high prediction precision. As bans such as the comprehensive prohibition of worker smoking become increasingly widespread on construction sites, the method can realize automatic monitoring and alarming of unsafe worker behaviors, reduce site management costs, and improve the safety and reliability of construction.
Referring to fig. 1 in detail, fig. 1 is a schematic flowchart of an embodiment of a network model training method for detecting smoking behavior provided in the present application.
The network model training method is applied to a terminal device. The terminal device may be a server, or a system in which a server and a mobile terminal cooperate with each other. Accordingly, the various parts included in the terminal device, such as its units, sub-units, modules and sub-modules, may all be disposed in the server, or may be disposed in the server and the mobile terminal respectively.
Further, the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules, for example, software or software modules for providing distributed servers, or as a single software or software module, and is not limited herein. In some possible implementations, the network model training method of the embodiments of the present application may be implemented by a processor calling computer readable instructions stored in a memory.
Specifically, as shown in fig. 1, the network model training method in the embodiment of the present application specifically includes the following steps:
step S11: and establishing an initial network model, wherein the initial network model comprises a one-stage network model and a two-stage network model.
In the embodiment of the present application, the terminal device establishes an initial network model as shown in fig. 2, where the initial network model includes a one-stage network model and a two-stage network model, the one-stage network model is responsible for generating a worker detection box, and the two-stage network model is responsible for detecting the behavior of a worker.
Specifically, the one-stage network model may be based on Faster R-CNN, and the two-stage network model may be based on RMPE (Regional Multi-Person Pose Estimation). In other embodiments, other possible network models may be used to build the initial network model.
Step S12: and acquiring a training data set related to smoking behavior, and labeling the training data set to obtain labeling information related to the training data set.
In the embodiment of the application, the terminal device collects images of workers on the construction site and establishes a training data set. Annotators use a labeling tool to mark the specific position of each worker in the images, namely labeling the worker detection frame, the worker's basic skeleton points, and, in finer detail, the hand skeleton points. The basic skeleton points comprise trunk skeleton points, hand skeleton points, leg skeleton points, foot skeleton points and the like; the hand skeleton points comprise the shoulder, elbow, wrist and palm skeleton points plus the five finger skeleton points, nine in total. Further, annotators need to review each image in the training data set and label whether the behavior of the worker in the image belongs to smoking behavior.
Specifically, when collecting images for the training data set of worker smoking behavior in the construction site scene, images containing smoking behavior at different scales as well as non-smoking behavior need to be collected under different lighting conditions, so as to enrich the content of the training data set.
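The patent names the annotation fields for each training image — the worker detection frame, the basic skeleton points, the nine hand-related skeleton points, and the smoking label — but does not specify a concrete storage format. A hypothetical Python record illustrating one possible schema (the file name, coordinates, and point names are invented for illustration):

```python
# Hypothetical annotation record for one training image. Field names and
# values are illustrative only; the patent does not define a file format.
annotation = {
    "image": "site_cam01_000123.jpg",          # assumed file name
    "worker_box": [412, 96, 588, 540],         # worker detection frame [x1, y1, x2, y2]
    "basic_skeleton": {                        # a few of the basic skeleton points
        "trunk": [500, 220],
        "left_leg": [470, 430],
    },
    "hand_skeleton": {                         # the nine hand-related skeleton points
        "shoulder": [455, 180], "elbow": [430, 260], "wrist": [440, 320],
        "palm": [448, 335], "finger_1": [452, 345], "finger_2": [456, 347],
        "finger_3": [460, 348], "finger_4": [463, 346], "finger_5": [466, 342],
    },
    "smoking": True,                           # whether the image shows smoking behavior
}
```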
Step S13: and inputting the training data set and the labeling information thereof into the one-stage network model to generate a detection frame of each image in the training data set.
In the embodiment of the present application, as shown in fig. 2, the terminal device performs convolution on all images in the training data set through the convolutional layers to obtain a feature map of each image. The terminal device then normalizes each image feature map and inputs the normalized feature map, together with the manually marked labeling information, into the one-stage network model to obtain the prediction detection frames that the one-stage network model generates on the feature map.
Specifically, the terminal device generates a plurality of candidate regions on the image feature map through a candidate region network, and then selects a region of interest related to smoking behavior through a classifier to generate a prediction detection frame.
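The normalization step mentioned above is not specified in detail in the text. A minimal sketch, assuming per-channel standardization of pixel values (a common choice, but a hypothetical one here):

```python
import numpy as np

def normalize_image(img):
    """Scale pixel values to [0, 1], then standardize each channel to zero
    mean and unit variance. This is one common normalization; the patent
    does not state which scheme is used."""
    img = img.astype(np.float32) / 255.0
    mean = img.mean(axis=(0, 1), keepdims=True)   # per-channel mean
    std = img.std(axis=(0, 1), keepdims=True) + 1e-8
    return (img - mean) / std
```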
Further, the loss function set for the one-stage network model in the embodiment of the present application is defined as follows:

L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)

where i is the index of a prediction detection frame, i.e., an anchor frame generated by the one-stage network model, and p_i is the confidence that there is a worker in the i-th prediction detection frame. The true label p_i* = 1 indicates that a worker is present in the prediction detection frame, and the true label p_i* = 0 indicates that there is no worker in the prediction detection frame, only background. t_i* is the real position coordinate of the i-th labeled frame, and t_i is the position coordinate of the i-th prediction detection frame, comprising the four parameters [t_x, t_y, t_w, t_h]. N_cls is the adjustment parameter of the detection frame classification loss function, N_reg is the adjustment parameter of the detection frame coordinate loss function, and λ is a coefficient balancing the two terms. Therefore, the loss function of the one-stage network model is composed of a detection frame classification loss function and a detection frame coordinate loss function.

In the detection frame regression, the four parameters [t_x, t_y, t_w, t_h] are defined as follows:

t_x = (x - x_a)/w_a,  t_y = (y - y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)

where [x, y, w, h] are the center coordinates, width and height of the prediction detection frame, [x_a, y_a, w_a, h_a] are the center coordinates, width and height of the labeled detection frame, and the four parameters [t_x, t_y, t_w, t_h] characterize the offset between the prediction detection frame and the labeled detection frame.
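The offset parameterization above can be computed directly. A small sketch, with both boxes given as (center_x, center_y, width, height):

```python
import math

def encode_box(pred, ref):
    """Compute the offsets [t_x, t_y, t_w, t_h] between a predicted box and
    a reference (labeled) box, both as (center_x, center_y, width, height)."""
    x, y, w, h = pred
    xa, ya, wa, ha = ref
    return [(x - xa) / wa,        # t_x: x-offset, scaled by reference width
            (y - ya) / ha,        # t_y: y-offset, scaled by reference height
            math.log(w / wa),     # t_w: log width ratio
            math.log(h / ha)]     # t_h: log height ratio
```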
Before inputting the prediction detection frames into the two-stage network model, the terminal device may further screen the prediction detection frames generated by the one-stage network model using the NMS (Non-Maximum Suppression) algorithm, fuse the screened frames with the image feature map, and input the fusion result into the two-stage network model.

The NMS algorithm in the embodiment of the present application proceeds as follows. The terminal device takes as input an image feature map, all of its prediction detection frames B = {b_1, …, b_N}, and the confidence S = {s_1, …, s_N} corresponding to each prediction detection frame. The following steps are executed in a loop until all prediction detection frames have been traversed: obtain the prediction detection frame with the highest confidence in B, then calculate the intersection-over-union between this frame and each of the other prediction detection frames; when the intersection-over-union is greater than a preset threshold N_t, delete that prediction detection frame and its confidence; when the intersection-over-union is less than or equal to N_t, retain that prediction detection frame and its confidence.

After all the prediction detection frames have been traversed based on the NMS algorithm, the retained prediction detection frames are fused with the image feature map, and the fusion result is input into the two-stage network model. The two-stage network model mainly re-screens the prediction detection frames generated by the one-stage network model through a pose analysis method to generate the final result.
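The greedy NMS screening loop described above can be sketched as follows (boxes as (x1, y1, x2, y2) corners; the threshold plays the role of N_t):

```python
def nms(boxes, scores, iou_threshold):
    """Greedy Non-Maximum Suppression: repeatedly keep the highest-scoring
    box and delete remaining boxes whose IoU with it exceeds the threshold.
    Returns the indices of the retained boxes."""
    def iou(a, b):
        # intersection rectangle
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest-confidence remaining box
        keep.append(best)
        # delete boxes overlapping the kept box beyond the threshold
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```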
Step S14: and inputting each image and the detection box thereof in the training data set into the two-stage network model to generate a judgment result about smoking behavior in each image in the training data set.
In the embodiment of the application, the terminal device defines a smoking action with m joints in the two-stage network model as

P = {(k^1, c^1), …, (k^m, c^m)}

where k^j and c^j are the position coordinate and the confidence of the j-th joint, respectively.

According to the prediction detection frames generated in step S13, the two-stage network model identifies the worker smoking action P_j in each prediction detection frame. Assuming a smoking action P_i whose detection frame is B_i, a soft matching function may be defined for calculating the number of joints matching between the two poses of smoking action P_i and smoking action P_j:

KSim(P_i, P_j | σ_1) = Σ_n tanh(c_i^n / σ_1) · tanh(c_j^n / σ_1), if joint k_j^n falls within the frame B(k_i^n); 0 otherwise

where σ_1 is a hyperparameter of the soft matching function.

Further, the spatial distance between each identical joint in the two poses of smoking action P_i and smoking action P_j is formulated as:

HSim(P_i, P_j | σ_2) = Σ_n exp(-(k_i^n - k_j^n)² / σ_2)

where σ_2 is a hyperparameter of the spatial distance formula.

Finally, the pose distance function is obtained:

d_pose(P_i, P_j | Λ) = KSim(P_i, P_j | σ_1) + λ·HSim(P_i, P_j | σ_2)

where Λ = {σ_1, σ_2, λ}. Therefore, the two-stage network model is responsible for generating the worker smoking action P_j and calculating the pose distance between the worker smoking action P_j and the labeled smoking action P_i; when the distance is smaller than a preset threshold, the action can be judged to be a smoking action.
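The pose distance d_pose = KSim + λ·HSim above can be sketched as follows. The joint-matching test uses a simple coordinate tolerance as a stand-in for the frame B(k_i^n) in the RMPE formulation, and σ_1, σ_2, λ are the hyperparameters from the text:

```python
import math

def pose_distance(pose_a, pose_b, sigma1, sigma2, lam, box_tol):
    """Sketch of d_pose(P_i, P_j | Λ) = KSim + λ·HSim. Each pose is a list of
    (x, y, confidence) joints in the same order; box_tol is a simplified
    stand-in for the matching frame B(k_i^n)."""
    ksim = 0.0
    hsim = 0.0
    for (xa, ya, ca), (xb, yb, cb) in zip(pose_a, pose_b):
        dist2 = (xa - xb) ** 2 + (ya - yb) ** 2
        # KSim term counts softly matched joints (only when joints are close)
        if abs(xa - xb) <= box_tol and abs(ya - yb) <= box_tol:
            ksim += math.tanh(ca / sigma1) * math.tanh(cb / sigma1)
        # HSim term measures the spatial proximity of identical joints
        hsim += math.exp(-dist2 / sigma2)
    return ksim + lam * hsim
```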
Step S15: training the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior.

In this embodiment of the application, the terminal device may train the one-stage network model and the two-stage network model based on their respective loss functions, so as to train the overall network model shown in fig. 2 and finally obtain the preset network model for detecting smoking behavior.
Further, after a preset network model for detecting smoking behavior is trained, the terminal equipment can acquire a real-time monitoring image; inputting the real-time monitoring image into the preset network model to obtain a judgment result output by the preset network model; and confirming whether the smoking behavior exists in the real-time monitoring image based on the judgment result.
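The deployment step above — feeding a real-time monitoring image to the trained model and thresholding the pose distance — can be sketched with a stand-in model callable (the real model would be the trained two-stage network; the return shape assumed here is hypothetical):

```python
def detect_smoking(frame, model, threshold):
    """Run the preset model on a monitoring frame and flag smoking behavior.
    `model` is a stand-in callable returning (worker_box, pose_distance)
    pairs; a distance below the threshold is judged as smoking."""
    results = model(frame)
    return [box for box, dist in results if dist < threshold]
```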
In the embodiment of the application, the terminal device establishes an initial network model; acquires a training data set related to smoking behavior and labels it to obtain labeling information for the training data set; inputs the training data set and its labeling information into a one-stage network model to generate a detection frame for each image in the training data set; inputs each image in the training data set and its detection frame into the two-stage network model to generate a judgment result about smoking behavior in each image; and trains the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior. In this way, the network model training method obtains a network model for detecting smoking behavior that overcomes the limitations of the traditional method and achieves good accuracy.
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
To implement the network model training method of the foregoing embodiment, the present application further provides a terminal device, and specifically refer to fig. 3, where fig. 3 is a schematic structural diagram of an embodiment of the terminal device provided in the present application.
The terminal device 300 of the embodiment of the present application includes a model building module 31, a data obtaining module 32, a behavior judging module 33, and a network training module 34; wherein the content of the first and second substances,
a model building module 31 is configured to build an initial network model, where the initial network model includes a one-stage network model and a two-stage network model.
And the data acquisition module 32 is configured to acquire a training data set related to the smoking behavior, and label the training data set to obtain labeled information related to the training data set.
A behavior judging module 33, configured to input the training data set and the label information thereof into the one-stage network model, and generate a detection frame for each image in the training data set; and inputting each image and the detection frame thereof in the training data set into the two-stage network model to generate a judgment result about smoking behavior in each image in the training data set.
And the network training module 34 is configured to train the first-stage network model and the second-stage network model based on the determination result to obtain a preset network model for detecting smoking behavior.
To implement the network model training method of the foregoing embodiment, the present application further provides another terminal device, and specifically refer to fig. 4, where fig. 4 is a schematic structural diagram of another embodiment of the terminal device provided in the present application.
The terminal device 400 of the embodiment of the present application includes a memory 41 and a processor 42, wherein the memory 41 and the processor 42 are coupled.
The memory 41 is used for storing program data, and the processor 42 is used for executing the program data to implement the network model training method described in the above embodiments.
In the present embodiment, the processor 42 may also be referred to as a CPU (Central Processing Unit). The processor 42 may be an integrated circuit chip having signal processing capabilities. The processor 42 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor 42 may be any conventional processor or the like.
The present application further provides a computer storage medium, as shown in fig. 5, the computer storage medium 500 is used for storing program data 51, and the program data 51 is used for implementing the network model training method according to the above embodiment when being executed by a processor.
The present application further provides a computer program product, wherein the computer program product includes a computer program operable to cause a computer to execute the network model training method according to the embodiment of the present application. The computer program product may be a software installation package.
The network model training method according to the above embodiment of the present application may be stored in a device, for example, a computer-readable storage medium, when the network model training method is implemented in the form of a software functional unit and sold or used as an independent product. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (11)

1. A network model training method for detecting smoking behavior is characterized by comprising the following steps:
establishing an initial network model, wherein the initial network model comprises a one-stage network model and a two-stage network model;
acquiring a training data set related to the smoking behavior, and labeling the training data set to obtain labeling information related to the training data set;
inputting the training data set and the labeling information thereof into the one-stage network model to generate a detection frame of each image in the training data set;
inputting each image and the detection frame thereof in the training data set into the two-stage network model to generate a judgment result about smoking behavior in each image in the training data set;
and training the one-stage network model and the two-stage network model based on the judgment result to obtain a preset network model for detecting smoking behavior.
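The two-part pipeline of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: `one_stage` and `two_stage` are assumed callables standing in for the two trained models, where the first maps an image plus its labeling information to candidate detection frames and the second maps an image and a frame to a smoking-behavior judgment.

```python
def train_step(one_stage, two_stage, images, labels):
    """One hypothetical pass through the two-stage pipeline of claim 1.

    one_stage(image, label) -> list of detection frames (step of claim 1, part 3)
    two_stage(image, frame) -> smoking-behavior judgment (step of claim 1, part 4)
    Both callables are assumptions; the claim does not fix their architectures.
    """
    judgments = []
    for image, label in zip(images, labels):
        frames = one_stage(image, label)               # generate detection frames
        for frame in frames:
            judgments.append(two_stage(image, frame))  # judge smoking behavior per frame
    # In the full method, both stages would then be updated from these
    # judgments via their respective loss functions.
    return judgments
```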
2. The network model training method of claim 1,
the obtaining a training data set regarding the smoking behavior comprises:
acquiring images of smoking behavior in a preset scene under different external conditions to obtain smoking behavior images;
acquiring images of non-smoking behavior in the preset scene under different external conditions to obtain non-smoking behavior images;
establishing the training data set based on the smoking behavior image and the non-smoking behavior image.
3. The network model training method of claim 1,
the labeling of the training data set includes:
labeling a worker detection frame and worker skeletal points in each image in the training data set;
wherein the worker skeletal points comprise basic skeletal points and hand skeletal points.
4. The network model training method of claim 3,
the inputting the training data set and the labeling information thereof into the one-stage network model includes:
normalizing all images in the training data set;
and inputting the normalized image and the labeling information thereof into the one-stage network model.
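The normalization step of claim 4 might look like the sketch below. The target size and the [0, 1] value range are assumptions; the claim only states that all images are normalized before entering the one-stage network model. The resize uses plain index sampling to keep the sketch dependency-free.

```python
import numpy as np

def normalize_images(images, size=(416, 416)):
    """Rescale each image to a fixed size and map pixel values to [0, 1].

    `size` is an assumed hyperparameter; the claim does not specify one.
    """
    normalized = []
    for img in images:
        img = img.astype(np.float32) / 255.0   # pixel values into [0, 1]
        h, w = img.shape[:2]
        # nearest-neighbour resize via row/column index sampling
        rows = (np.arange(size[0]) * h // size[0]).clip(0, h - 1)
        cols = (np.arange(size[1]) * w // size[1]).clip(0, w - 1)
        normalized.append(img[rows][:, cols])
    return normalized
```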
5. The network model training method according to claim 1 or 4,
the loss function of the one-stage network model comprises a detection frame classification loss function and a detection frame coordinate loss function;
wherein the detection frame classification loss function is used for learning the relation between the confidence of the human classification in the predicted detection frame and the confidence of the human classification in the labeled detection frame, and the detection frame coordinate loss function is used for learning the relation between the coordinate position of the predicted detection frame and the coordinate position of the labeled detection frame.
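A composite loss of this shape could be written as below. The claim names the two terms but not their exact form, so binary cross-entropy for the classification confidence, squared error for the frame coordinates, and the `coord_weight` balance factor are all assumptions.

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy between predicted confidence p and label y."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def one_stage_loss(pred_conf, true_conf, pred_box, true_box, coord_weight=5.0):
    """Hypothetical one-stage loss: classification term + weighted coordinate term.

    The exact loss forms and `coord_weight` are assumptions, not taken
    from the patent.
    """
    cls_loss = bce(np.asarray(pred_conf, float), np.asarray(true_conf, float)).mean()
    coord_loss = ((np.asarray(pred_box, float) - np.asarray(true_box, float)) ** 2).mean()
    return cls_loss + coord_weight * coord_loss
```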
6. The network model training method of claim 1,
before each image in the training data set and its detection frame are input into the two-stage network model, the network model training method further comprises:
screening the detection frames in each image by adopting a preset algorithm;
and fusing each screened detection frame with the corresponding image.
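The "preset algorithm" for screening overlapping detection frames is not named in the claim; non-maximum suppression (NMS) is a plausible choice and is sketched below, together with a simple fusion step that crops the frame region from the image. Both are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) detection frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def screen_frames(frames, scores, iou_thresh=0.5):
    """Non-maximum suppression as one assumed 'preset algorithm'."""
    order = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:  # keep a frame only if it does not overlap a better one
        if all(iou(frames[i], frames[j]) < iou_thresh for j in kept):
            kept.append(i)
    return [frames[i] for i in kept]

def fuse(image, frame):
    """Fuse a screened frame with its image by cropping the frame region.

    Assumes `image` supports 2D slicing (e.g. a NumPy array, H x W).
    """
    x1, y1, x2, y2 = frame
    return image[y1:y2, x1:x2]
```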
7. The network model training method of claim 1,
the generating of the judgment result about smoking behavior in each image in the training data set includes:
acquiring a pre-marked real smoking action;
inputting each image and the detection frame thereof in the training data set into the two-stage network model to obtain a preset smoking action in the detection frame of each image;
calculating the pose distance between the real smoking action and the preset smoking action;
and when the pose distance is smaller than a preset threshold value, confirming that smoking behavior exists within the detection frame of the image.
8. The network model training method of claim 7,
the calculating of the pose distance between the real smoking action and the preset smoking action comprises:
acquiring the number of joints in which the real smoking action matches the preset smoking action;
calculating the spatial distance of the same joint between the real smoking action and the preset smoking action;
and calculating the pose distance between the real smoking action and the preset smoking action according to the number of matched joints and the spatial distance of each same joint.
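The pose-distance computation of claims 7 and 8 can be sketched as follows. Averaging the per-joint Euclidean distance over the matched joints is one plausible reading of "according to the matched joint number and the spatial distance"; the patent does not fix the exact combination rule, so this formulation is an assumption.

```python
import math

def pose_distance(real_joints, preset_joints):
    """Pose distance between a real and a preset smoking action.

    Both inputs are assumed to be dicts mapping joint name -> (x, y).
    The mean Euclidean distance over matched joints is an assumed rule.
    """
    matched = set(real_joints) & set(preset_joints)  # joints present in both actions
    if not matched:
        return float("inf")                          # nothing to compare
    total = sum(math.dist(real_joints[j], preset_joints[j]) for j in matched)
    return total / len(matched)

def is_smoking(real_joints, preset_joints, threshold=0.5):
    """Claim 7's decision: smoking is confirmed when the pose distance
    falls below a preset threshold (the value here is an assumption)."""
    return pose_distance(real_joints, preset_joints) < threshold
```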
9. The network model training method of claim 1,
the network model training method further comprises the following steps:
acquiring a real-time monitoring image;
inputting the real-time monitoring image into the preset network model to obtain a judgment result output by the preset network model;
and confirming whether the smoking behavior exists in the real-time monitoring image based on the judgment result.
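Applying the trained model to a real-time monitoring stream, as in claim 9, reduces to relaying each image's judgment into an alert. `preset_model` below is an assumed callable returning a boolean judgment per image; the sketch only shows the control flow, not the model itself.

```python
def monitor(images, preset_model):
    """Run the trained preset model over real-time monitoring images.

    Returns the indices of images in which smoking behavior is confirmed.
    `preset_model` is an assumed image -> bool callable.
    """
    alerts = []
    for idx, image in enumerate(images):
        if preset_model(image):   # judgment result output by the preset model
            alerts.append(idx)    # smoking behavior confirmed in this image
    return alerts
```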
10. A terminal device, comprising a memory and a processor, wherein the memory is coupled to the processor;
wherein the memory is configured to store program data and the processor is configured to execute the program data to implement the network model training method of any one of claims 1-9.
11. A computer storage medium for storing program data which, when executed by a processor, implements the network model training method of any one of claims 1-9.
CN202111342312.7A 2021-11-12 2021-11-12 Network model training method and equipment for detecting smoking behavior Pending CN114170677A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111342312.7A CN114170677A (en) 2021-11-12 2021-11-12 Network model training method and equipment for detecting smoking behavior
PCT/CN2021/138039 WO2023082407A1 (en) 2021-11-12 2021-12-14 Network model training method for detecting smoking behavior and device thereof


Publications (1)

Publication Number Publication Date
CN114170677A true CN114170677A (en) 2022-03-11

Family

ID=80479448


Country Status (2)

Country Link
CN (1) CN114170677A (en)
WO (1) WO2023082407A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071624A (en) * 2023-01-28 2023-05-05 南京云创大数据科技股份有限公司 Smoking detection data labeling method based on active learning

Family Cites Families (5)

CN112052815B (en) * 2020-09-14 2024-02-20 北京易华录信息技术股份有限公司 Behavior detection method and device and electronic equipment
CN112395978B (en) * 2020-11-17 2024-05-03 平安科技(深圳)有限公司 Behavior detection method, behavior detection device and computer readable storage medium
CN112464895B (en) * 2020-12-14 2023-09-01 深圳市优必选科技股份有限公司 Gesture recognition model training method and device, gesture recognition method and terminal equipment
CN113076903A (en) * 2021-04-14 2021-07-06 上海云从企业发展有限公司 Target behavior detection method and system, computer equipment and machine readable medium
CN113392706A (en) * 2021-05-13 2021-09-14 上海湃道智能科技有限公司 Device and method for detecting smoking and using mobile phone behaviors


Also Published As

Publication number Publication date
WO2023082407A1 (en) 2023-05-19

Similar Documents

Publication Publication Date Title
CN110807429B (en) Construction safety detection method and system based on tiny-YOLOv3
WO2021051601A1 (en) Method and system for selecting detection box using mask r-cnn, and electronic device and storage medium
CN108256404B (en) Pedestrian detection method and device
CN109670441A (en) A kind of realization safety cap wearing knows method for distinguishing, system, terminal and computer readable storage medium
CN110781836A (en) Human body recognition method and device, computer equipment and storage medium
CN111738258A (en) Pointer instrument reading identification method based on robot inspection
CN114359974B (en) Human body posture detection method and device and storage medium
WO2021082112A1 (en) Neural network training method, skeleton diagram construction method, and abnormal behavior monitoring method and system
CN108960145A (en) Facial image detection method, device, storage medium and electronic equipment
CN113177968A (en) Target tracking method and device, electronic equipment and storage medium
CN111783716A (en) Pedestrian detection method, system and device based on attitude information
CN114170677A (en) Network model training method and equipment for detecting smoking behavior
B Nair et al. Machine vision based flood monitoring system using deep learning techniques and fuzzy logic on crowdsourced image data
CN114373162A (en) Dangerous area personnel intrusion detection method and system for transformer substation video monitoring
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN117475253A (en) Model training method and device, electronic equipment and storage medium
CN113591885A (en) Target detection model training method, device and computer storage medium
Anjum et al. A pull-reporting approach for floor opening detection using deep-learning on embedded devices
CN116959099A (en) Abnormal behavior identification method based on space-time diagram convolutional neural network
CN116597501A (en) Video analysis algorithm and edge device
CN115862138A (en) Personnel tumbling behavior detection method, device, equipment and storage medium
Sun et al. Automatic building age prediction from street view images
Dai et al. Trajectory outlier detection based on dbscan and velocity entropy
CN110969209B (en) Stranger identification method and device, electronic equipment and storage medium
CN110363162B (en) Deep learning target detection method for focusing key region

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination