CN116977904A - Yolov 5-based rapid large-scene-identification multi-man-made garment detection method - Google Patents


Info

Publication number
CN116977904A
Authority
CN
China
Prior art keywords
detection
data
model
worker
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311023456.5A
Other languages
Chinese (zh)
Inventor
陶茜茜
赵静
梁鸿
宋贞耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Dinghong Safety Technology Co ltd
Original Assignee
Shandong Dinghong Safety Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Dinghong Safety Technology Co ltd filed Critical Shandong Dinghong Safety Technology Co ltd
Priority to CN202311023456.5A priority Critical patent/CN116977904A/en
Publication of CN116977904A publication Critical patent/CN116977904A/en
Pending legal-status Critical Current


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/40 — Scenes; Scene-specific elements in video content
    • G06V20/41 — Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 — Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/40 — Scenes; Scene-specific elements in video content
    • G06V20/46 — Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The invention discloses a YOLOv5-based method for rapidly detecting, in large scenes, whether multiple workers are wearing work clothes, comprising the following steps: extracting image frames from a video stream captured by a fixed camera; in view of the complexity of the oil extraction operation site, annotating the foreground in each image with LabelImg software to construct a target detection dataset of the oil extraction operation site, and randomly dividing the dataset into a training set and a test set at a ratio of 8:2; adjusting the network parameters according to the training results and selecting the model with the best training result as the final model; feeding image frames extracted from the oilfield on-site video stream into the trained, knowledge-distilled YOLOv5 student network to identify foreground information and obtain detection data; performing secondary detection on regions detected as workers; and judging, against a preset label threshold, whether a worker is not wearing work clothes, storing the detection and judgment information, and raising an alarm when the scene image shows a worker not wearing work clothes. The invention compresses the network model with knowledge distillation and extracts multi-scale features of the image, overcoming the low efficiency of traditional manual methods, and can rapidly and accurately identify workers not wearing work clothes in on-site images.

Description

YOLOv5-based rapid large-scene multi-worker work-clothes detection method
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a YOLOv5-based method for rapidly detecting whether multiple workers in a large scene are wearing work clothes.
Background
Petroleum operation sites are complex scenes in which personnel face elevated operational risk. To ensure their personal safety, petroleum workers must wear work clothes correctly. Round-the-clock manual monitoring of surveillance feeds is costly in manpower and material resources and inefficient. It is therefore of great significance to use intelligent techniques to automatically analyze whether workers on the operation site are wearing work clothes incorrectly.
With continuous upgrades in computer hardware and breakthroughs in artificial intelligence, techniques such as machine learning and deep learning have become increasingly widespread. In the field of oilfield safety production, computer vision has achieved remarkable results: techniques such as object detection, object tracking, and pose estimation analyze on-site video in real time and automatically flag incorrect wearing of work clothes, effectively replacing the traditional manual monitoring mode and improving on-site safety and supervision efficiency.
Disclosure of Invention
The invention aims to automatically raise alarms for workers not wearing work clothes by means of object detection, while using knowledge distillation to improve model inference speed and the efficiency of work-clothes detection at existing oil extraction operation sites. A YOLOv5-based method for rapidly detecting work clothes on multiple workers in large scenes is provided, which extracts features automatically, improves detection efficiency, and reduces manpower and material costs.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
(1) Collecting video stream data from the oilfield site, extracting frames to obtain image data, and building a dataset of workers not wearing work clothes;
(2) Constructing a YOLOv5-based teacher model and student model for large-scene multi-worker work-clothes target detection under the PyTorch framework;
(3) Selecting training samples to train the work-clothes detection teacher model, and distilling the teacher model's backbone network and outputs into the student model;
(4) Extracting frames from the video stream to be detected and feeding them into the trained large-scene multi-worker work-clothes detection model to obtain detection data;
(5) Performing secondary detection on the data detected as workers, based on the obtained detection data of consecutive frames;
(6) Determining whether each worker is wearing work clothes according to the set label threshold, and storing the relevant detection and judgment information.
The invention is further improved in that the specific implementation steps of the step (1) are as follows:
(101) Extracting frames from the video stream data of the oilfield operation site to obtain image data;
(102) In view of the complexity of the oil extraction site, annotating the various work clothes, work shoes, and similar items in the image data with LabelImg software to obtain images and corresponding label files.
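The frame-extraction and dataset-split logic of step (1) can be sketched as follows. This is a minimal sketch under stated assumptions: the one-frame-per-second sampling interval, the fixed random seed, and the function names are illustrative and not taken from the patent; only the 8:2 split ratio comes from the text.

```python
# Sketch of step (1): choose which video frames to keep, then split the
# resulting annotated image set 8:2 into train/test (ratio from the text).
import random


def sample_frame_indices(total_frames, fps, seconds_per_sample=1.0):
    """Keep one frame every `seconds_per_sample` seconds of video."""
    step = max(1, int(fps * seconds_per_sample))
    return list(range(0, total_frames, step))


def split_dataset(items, train_ratio=0.8, seed=0):
    """Randomly split annotated images into train/test at the given ratio."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

In practice the frame indices would drive a video reader (e.g. OpenCV) and each kept frame would be annotated in LabelImg before splitting.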
the invention is further improved in that the specific implementation steps of the step (2) are as follows:
(201) A teacher network, which uses ResNet152 as its backbone network;
(202) A student network, which uses MobileNetV3 as its backbone network;
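The teacher/student pairing of steps (201)-(202) can be sketched as follows. Reproducing full ResNet-class and MobileNetV3 backbones is beyond a short example, so this sketch uses tiny stand-in CNNs (an assumption, not the patent's networks) that only illustrate the structural idea: a wide teacher and a narrow student, each emitting the three multi-scale feature maps a YOLOv5 neck consumes.

```python
# Hedged sketch of step (2): a wide teacher backbone and a narrow student
# backbone with matching multi-scale outputs, suitable for distillation.
import torch
import torch.nn as nn


def conv_block(cin, cout, stride=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.SiLU(inplace=True),
    )


class Backbone(nn.Module):
    """Stand-in backbone that emits three multi-scale feature maps."""

    def __init__(self, widths):
        super().__init__()
        self.stem = conv_block(3, widths[0], stride=2)
        self.stages = nn.ModuleList(
            conv_block(widths[i], widths[i + 1], stride=2)
            for i in range(len(widths) - 1)
        )

    def forward(self, x):
        x = self.stem(x)
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats[-3:]  # three coarsest scales, as a YOLOv5 neck expects


# Wide teacher (ResNet-class capacity), narrow student (MobileNet-class).
teacher = Backbone([64, 128, 256, 512, 1024])
student = Backbone([16, 32, 64, 128, 256])
```

The width lists are placeholders; in the full method the teacher and student would be the named backbones, with 1x1 adapter layers matching student channels to teacher channels for feature distillation.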
the invention is further improved in that the specific implementation step of the step (3) comprises the following steps:
(301) Setting the training parameters of the network: the maximum number of iterations is set to 200; the learning rate is initialized to 0.001, reduced to 0.0001 at epoch 10 and to 0.00001 at epoch 50;
(302) Training the teacher model with the dataset of workers not wearing work clothes;
(303) Adjusting the network parameters according to the training results, and distilling the student model's backbone network against the teacher's backbone network and the student network's outputs against the teacher model's outputs.
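The schedule in step (301) and the two-part distillation of step (303) can be sketched as follows. The combined loss (MSE on backbone feature maps plus temperature-softened KL divergence on outputs) is one common reading of distilling both backbones and outputs; the temperature, weighting factor, and placeholder student module are assumptions, not values from the patent.

```python
# Sketch of step (3): distillation loss and the stated LR schedule.
import torch
import torch.nn as nn
import torch.nn.functional as F


def distillation_loss(s_feat, t_feat, s_logits, t_logits, T=4.0, alpha=0.5):
    """Feature distillation (MSE between backbone feature maps, assuming
    shapes already matched by an adapter) plus response distillation
    (KL divergence on temperature-softened outputs)."""
    feat_loss = F.mse_loss(s_feat, t_feat)
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients weakened by the temperature
    return alpha * feat_loss + (1 - alpha) * kd_loss


# LR schedule from step (301): start at 1e-3, drop to 1e-4 at epoch 10
# and to 1e-5 at epoch 50, for at most 200 epochs.
student = nn.Linear(8, 4)  # placeholder for the YOLOv5 student network
opt = torch.optim.SGD(student.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[10, 50], gamma=0.1)
```

Each training epoch would compute `distillation_loss` (typically alongside the ordinary detection loss), back-propagate, step the optimizer, and then step the scheduler once.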
The invention is further improved in that the specific implementation steps of the step (5) are as follows:
(501) Performing secondary detection on the data detected as workers;
(502) Judging the brightness of the secondarily detected data; if the average brightness is below the threshold L=100, applying a gamma transform to the data to raise its brightness;
(503) Setting a threshold T=0.7 and judging that a worker is not wearing work clothes when the confidence of the secondarily detected "not wearing work clothes" label exceeds T.
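Steps (501)-(503) can be sketched as below. The brightness threshold L=100 and label threshold T=0.7 are taken from the text; the gamma value and function names are illustrative assumptions, and the second detector pass is stubbed out by a confidence argument.

```python
# Sketch of step (5): brightness-gated secondary check on worker crops.
import numpy as np


def gamma_correct(img, gamma=0.5):
    """Brighten an 8-bit image with a gamma transform (gamma < 1 brightens)."""
    norm = img.astype(np.float32) / 255.0
    return (np.power(norm, gamma) * 255.0).astype(np.uint8)


def secondary_check(crop, confidence, lum_threshold=100, conf_threshold=0.7):
    """Return True when the crop is judged 'not wearing work clothes'."""
    if crop.mean() < lum_threshold:  # step (502): low average brightness
        crop = gamma_correct(crop)   # brighten before re-detection
    # In the full pipeline the brightened crop would be re-run through the
    # detector; here `confidence` stands in for that second detection score.
    return confidence > conf_threshold  # step (503)
```

The gamma transform only changes what the second detection pass sees; the final decision rests on the re-detected label confidence against T.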
The YOLOv5-based method for rapidly detecting work clothes on multiple workers in large scenes has the following beneficial effects: a unified task, easy training, and convenient optimization; given the inefficiency of traditional manual monitoring, the method automatically judges in real time, using a computer and the cameras of the oilfield operation site, whether workers are not wearing work clothes, while knowledge distillation reduces the model size and improves inference speed on terminal devices. Compared with manually checking cameras to confirm and report violations, the method is faster and more accurate and saves substantial manpower and material costs.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic block diagram of the YOLOv5-based large-scene multi-worker work-clothes detection method in embodiment 1 of the present invention.
Fig. 2 is a flowchart of the YOLOv5-based large-scene multi-worker work-clothes detection method in embodiment 2 of the present invention.
Fig. 3 is a schematic diagram of the network structure of the YOLOv5-based rapid large-scene multi-worker work-clothes detection method according to embodiment 3 of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements throughout or elements having like or similar functionality. The embodiments described below by way of the drawings are exemplary only and should not be construed as limiting the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or groups thereof.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
In order that the invention may be readily understood, a further description of the invention will be rendered by reference to specific embodiments that are illustrated in the appended drawings and are not to be construed as limiting embodiments of the invention.
It will be appreciated by those skilled in the art that the drawings are merely schematic representations of examples and that the elements of the drawings are not necessarily required to practice the invention.
Example 1
As shown in fig. 1, embodiment 1 of the present invention provides a YOLOv5-based framework for rapid large-scene multi-worker work-clothes detection, which includes:
the image processing module, used for extracting frames from the real-time video data of the oilfield site to obtain image data;
the network module, used for analyzing the obtained images with the work-clothes detection model, extracting the position coordinates of regions where work clothes are not worn, and outputting prediction information;
and the judgment and storage module, used for judging from the detection information whether a not-wearing-work-clothes violation has occurred and storing the detection and judgment information.
In this embodiment 1, the network model includes a backbone network, a neck network, a classification network, and a training optimization unit.
The backbone network is used for extracting multi-scale features of the input image with the pre-constructed backbone of the rapid large-scene multi-worker work-clothes detection model;
the neck network is used for fusing feature maps across different scales, improving the accuracy and robustness of the model;
the classification network is used for classifying and regressing, with a loss function, the image features and aggregated information of regions suspected of lacking work clothes, obtaining prediction information;
the training optimization unit adjusts parameters of the network according to training results, and selects a model with the optimal training results as a final model.
In this embodiment 1, after the large-scene multi-worker work-clothes detection model outputs detection information, the judgment and storage module determines whether a not-wearing-work-clothes violation has occurred, and stores the position information and judgment information of the violation.
Example 2
Fig. 2 shows the YOLOv5-based rapid large-scene multi-worker work-clothes detection method, whose specific operation steps are:
(1) Collecting video stream data from the oilfield site, extracting frames to obtain image data, and building a dataset of workers not wearing work clothes;
(2) Constructing YOLOv5-based teacher and student models for multi-worker work-clothes detection under the PyTorch framework;
(3) Training and optimizing the target detection model with the constructed dataset of workers not wearing work clothes, and distilling the student model's backbone network against the trained teacher backbone;
(4) Detecting the on-site video stream with the trained student target detection model;
(5) Performing secondary detection on workers according to the obtained detection data;
(6) Judging, against the set threshold, whether any worker violates the requirement to wear work clothes.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes only the preferred embodiments of the present disclosure and is not intended to limit it; those skilled in the art may make various modifications and changes to the present disclosure. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure shall fall within its protection scope.

Claims (7)

1. A YOLOv5-based method for rapid large-scene multi-worker work-clothes detection, characterized by comprising the following steps:
(1) Building a target detection dataset of the oil extraction operation site: first extracting frames from the video data to obtain image data, then annotating the image data with LabelImg software to build the oil extraction site operation dataset, and finally randomly dividing the dataset into a training set and a test set at a fixed ratio;
(2) Constructing a YOLOv5-based teacher model and student model for large-scene multi-worker work-clothes target detection under the PyTorch framework;
(3) Selecting training samples to train and optimize the large-scene multi-worker work-clothes target detection teacher model, and distilling the student model's backbone network against the teacher model's backbone network and the student network's outputs against the teacher model's outputs;
(4) Extracting frames from the video stream to be detected and feeding them into the trained large-scene multi-worker work-clothes detection model to obtain detection data;
(5) Performing secondary detection on the data detected as workers, based on the obtained detection data of consecutive frames;
(6) Determining whether each worker is wearing work clothes according to the set label threshold, and storing the relevant detection and judgment information.
2. The YOLOv5-based large-scene multi-worker work-clothes detection method of claim 1, wherein creating the oil extraction operation site target detection dataset comprises:
extracting frames from the video stream data of the oilfield operation site to obtain image data; because the foreground information of the oil extraction site is complex, annotating the workers, hydraulic tongs, tong frames, and similar items in the image data with LabelImg software to obtain images and corresponding label files, and randomly dividing them into a training set and a test set at a ratio of 8:2.
3. The YOLOv5-based large-scene multi-worker work-clothes detection method of claim 2, wherein the YOLOv5-based large-scene multi-worker work-clothes detection model comprises:
(a) a backbone network, used for extracting multi-scale features from the input image;
(b) a neck network, used for processing the extracted features and fusing features of different scales;
(c) and a classification network, used for outputting the detected category and position information.
4. The YOLOv5-based large-scene multi-worker work-clothes detection method according to claim 3, wherein selecting training samples to train and optimize the large-scene multi-worker work-clothes detection model comprises:
(a) setting the training parameters of the network;
(b) training the YOLOv5 large-scene multi-worker work-clothes target detection model with the oil extraction operation site dataset;
(c) and adjusting the network parameters according to the training results and selecting the model with the best training result as the final model.
5. The YOLOv5-based large-scene multi-worker work-clothes detection method according to claim 2, wherein the large-scene multi-worker work-clothes detection data are: the bounding boxes and coordinates of workers.
6. The YOLOv5-based large-scene multi-worker work-clothes detection method of claim 5, wherein the not-wearing-work-clothes violation data are the behavior categories of each video segment.
7. The YOLOv5-based large-scene multi-worker work-clothes detection method according to claim 6, wherein the specific steps of judging whether a worker is not wearing work clothes are:
(a) setting label thresholds T1 and T2 and a brightness threshold L;
(b) performing a brightness judgment on the acquired worker data; when L is greater than 100 and the work-clothes label confidence exceeds 0.5, judging that the worker is wearing work clothes; otherwise, judging that the worker is not wearing work clothes;
(c) and storing and visualizing the real-time analysis result of whether each worker is wearing work clothes, for supervisory personnel to review and handle.
CN202311023456.5A 2023-08-15 2023-08-15 Yolov 5-based rapid large-scene-identification multi-man-made garment detection method Pending CN116977904A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311023456.5A CN116977904A (en) 2023-08-15 2023-08-15 Yolov 5-based rapid large-scene-identification multi-man-made garment detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311023456.5A CN116977904A (en) 2023-08-15 2023-08-15 Yolov 5-based rapid large-scene-identification multi-man-made garment detection method

Publications (1)

Publication Number Publication Date
CN116977904A (en) 2023-10-31

Family

ID=88474944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311023456.5A Pending CN116977904A (en) 2023-08-15 2023-08-15 Yolov 5-based rapid large-scene-identification multi-man-made garment detection method

Country Status (1)

Country Link
CN (1) CN116977904A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117710374A (en) * 2024-02-05 2024-03-15 中海油田服务股份有限公司 Method, device, equipment and medium for detecting running and leaking based on deep learning


Similar Documents

Publication Publication Date Title
CN111881730A (en) Wearing detection method for on-site safety helmet of thermal power plant
CN113553977B (en) Improved YOLO V5-based safety helmet detection method and system
CN113516076B (en) Attention mechanism improvement-based lightweight YOLO v4 safety protection detection method
CN113553979B (en) Safety clothing detection method and system based on improved YOLO V5
CN102332094B (en) Semi-supervised online study face detection method
CN103839065A (en) Extraction method for dynamic crowd gathering characteristics
CN101916365A (en) Intelligent video identifying method for cheat in test
CN113642474A (en) Hazardous area personnel monitoring method based on YOLOV5
CN113688709B (en) Intelligent detection method, system, terminal and medium for wearing safety helmet
CN116977904A (en) Yolov 5-based rapid large-scene-identification multi-man-made garment detection method
CN112541393A (en) Transformer substation personnel detection method and device based on deep learning
CN114662208B (en) Construction visualization system and method based on Bim technology
CN115410119A (en) Violent movement detection method and system based on adaptive generation of training samples
CN113837154B (en) Open set filtering system and method based on multitask assistance
CN113128412B (en) Fire trend prediction method based on deep learning and fire monitoring video
CN116311081B (en) Medical laboratory monitoring image analysis method and system based on image recognition
CN113191273A (en) Oil field well site video target detection and identification method and system based on neural network
CN104537392A (en) Object detection method based on distinguishing semantic component learning
CN112560880A (en) Object classification method, object classification apparatus, and computer-readable storage medium
CN114694090A (en) Campus abnormal behavior detection method based on improved PBAS algorithm and YOLOv5
CA3012927A1 (en) Counting objects in images based on approximate locations
CN115273009A (en) Road crack detection method and system based on deep learning
CN114005089A (en) Multi-scene construction safety helmet and reflective clothes detection method
CN114387564A (en) Head-knocking engine-off pumping-stopping detection method based on YOLOv5
CN113392927A (en) Animal target detection method based on single-order deep neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination