CN117496131A - Electric power operation site safety behavior identification method and system - Google Patents

Electric power operation site safety behavior identification method and system

Info

Publication number
CN117496131A
Authority
CN
China
Prior art keywords
model
detr
operation site
power operation
safety behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311839596.XA
Other languages
Chinese (zh)
Other versions
CN117496131B (en
Inventor
郭鹏天
王晓辉
谈元鹏
陈勇
李黎
王勇
刘晗
徐康
陈霞
梁栋
张纪伟
张若冰
邱镇
卢大玮
周飞
张国梁
王博
宋明黎
宋杰
王万国
袁弘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Smart Grid Research Institute Co ltd
Zhejiang University ZJU
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Smart Grid Research Institute Co ltd
Zhejiang University ZJU
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd, Zhejiang University ZJU, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, China Electric Power Research Institute Co Ltd CEPRI, Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN202311839596.XA
Publication of CN117496131A
Application granted
Publication of CN117496131B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a system for identifying safety behaviors at an electric power operation site, and relates to the technical field of computer vision and target detection. The method comprises: constructing a DETR model and pre-training it on a general data set; constructing a power industry data set and training the model again on it to obtain a pre-trained DETR model; introducing an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and fine-tuning the adjusted model on an electric power operation site safety behavior recognition data set to obtain the final DETR+Adapter-based electric power operation site safety behavior recognition model; and inputting the picture to be detected into the DETR+Adapter-based model and outputting a recognition result. Compared with other algorithms, the model provided by the invention has stronger generalization performance and higher detection precision.

Description

Electric power operation site safety behavior identification method and system
Technical Field
The invention relates to the technical field of computer vision and target detection, in particular to a method and a system for identifying safety behaviors of an electric power operation site.
Background
With rapid economic growth and accelerating urbanization, the amount of power equipment has increased quickly and the power grid has kept expanding, so the number and scale of operation sites for electric power construction, technical upgrading, relocation and overhaul have also grown substantially. Electric power operations involve heavy workloads over a wide area and commonly face high-risk factors such as harsh construction environments, electric shock hazards, cross operations at height and large-scale hoisting. When an enterprise's construction and safety management capabilities are weak, personal injury accidents occur easily. At an electric power operation site, an accident causes not only economic loss but also casualties. Traditional manual supervision wastes resources and is inefficient, and the development of artificial intelligence has made automatic safety detection possible. Early applications of artificial intelligence relied mainly on traditional machine learning algorithms such as support vector machines, decision trees and random forests. These algorithms perform well in feature extraction and model training, but their ability to handle complex nonlinear patterns and large-scale data is limited, so they cannot meet the requirements of safety behavior recognition at electric power operation sites. Deep learning algorithms such as convolutional neural networks (CNN), recurrent neural networks (RNN) and long short-term memory networks (LSTM) automatically extract high-level features from raw data through multi-layer neuron structures, which reduces the dependence on manual feature engineering and greatly improves the accuracy and robustness of behavior recognition at electric power operation sites.
Safety behavior identification at an electric power operation site is a target detection task in deep learning. The current mainstream target detection algorithms include the R-CNN series and the YOLO series. The R-CNN series is among the earliest target detection algorithms; its idea is to generate candidate regions with a Region Proposal Network (RPN), extract features from and classify each candidate region, and finally refine the target box with a regression algorithm. Such algorithms are accurate but relatively slow. The YOLO series is another popular family of target detection algorithms; its core idea is to convert target detection into a regression problem that directly predicts target boxes and classes. Such algorithms offer fast detection and high accuracy, but perform worse than the R-CNN series on small targets. In 2017, Google proposed the Transformer architecture, originally a neural network architecture for natural language processing (NLP) tasks such as machine translation, question answering and language modeling. Owing to its strong modeling capability and parallel processing, the Transformer has also been applied to vision in recent years. In 2020, the Google Brain team proposed the ViT model for image classification, which replaces the traditional CNN structure with a Transformer structure and uses a self-attention mechanism to relate different parts of an image, so as to better handle images with different aspect ratios. The ViT model brought new ideas and methods to computer vision and achieved remarkable results.
In 2020, the Facebook AI Research team proposed the DETR (Detection Transformer) algorithm, which processes the input sequence with a Transformer encoder and relates the different elements of the sequence through a self-attention mechanism. DETR can be trained and run end to end, which avoids the coupling between target localization and target classification.
Although the DETR algorithm performs well on many computer vision tasks, applying it to safety behavior identification at electric power operation sites raises challenges. In particular, training a DETR model relies on a large amount of high-quality annotated data, because DETR uses a Transformer as its main feature extraction and target detection component, and a Transformer network needs a large amount of training data to learn effective representations and relationships. However, high-quality training data for safety behavior identification at electric power operation sites is difficult and costly to obtain, and defect-type samples are especially rare in actual business scenarios.
For the sample scarcity (small-sample) problem encountered in actual scenarios, one feasible approach is pre-training followed by fine-tuning: the model is first pre-trained on a large amount of existing annotated data, so that it learns image information such as edges, colors and textures and acquires a certain feature extraction and representation capability, and is then fine-tuned on a small number of task-specific samples so that it adapts to the specific task. Traditional fine-tuning mainly adjusts the weights of the pre-trained model. It is simple to implement, but because it changes the original weights, the model may suffer catastrophic forgetting.
Disclosure of Invention
Therefore, the embodiments of the invention provide a method and a system for identifying safety behaviors at an electric power operation site, to solve the following problems in the prior art: training a DETR model depends on a large amount of high-quality annotated data, and, when samples are scarce in actual scenarios, traditional fine-tuning adjusts the weights of the pre-trained model, which may cause catastrophic forgetting.
In order to solve the above problems, an embodiment of the present invention provides a method for identifying safety behavior of an electric power operation site, the method including:
S1: constructing a DETR-based electric power operation site safety behavior recognition model and pre-training the model on a general data set, wherein the model comprises a backbone network, a position encoding, an encoding layer, a decoding layer and a target detection head;
S2: constructing a power industry data set covering the power transmission line, power distribution and infrastructure fields, training the DETR-based model again on the power industry data set without changing its structure, and freezing the parameters of the backbone network and the position encoding during training, to obtain a pre-trained DETR model;
S3: introducing an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and fine-tuning the adjusted model on an electric power operation site safety behavior recognition data set, wherein all parameters of the backbone network, the position encoding, the encoding layer and the decoding layer are frozen and only the parameters of the Adapter module and the target detection head are trained, to obtain after training the final DETR+Adapter-based electric power operation site safety behavior recognition model;
S4: inputting the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model and outputting a recognition result.
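The freezing schedule of the three training stages can be summarized in a small table. The following is only a minimal sketch: the component names are hypothetical labels chosen for illustration, and treating every component as trainable during the first stage is an assumption, since the text names no frozen parts for that stage.

```python
# Hypothetical component labels for the parts named in the training steps.
BASE = {"backbone", "position_encoding", "encoder", "decoder", "detection_head"}

STAGES = {
    # Stage 1: pre-training on a general data set. No frozen parts are named
    # in the text, so all components are ASSUMED trainable here.
    "S1": {"present": BASE, "frozen": set()},
    # Stage 2: training again on the power industry data set with the
    # backbone and the position encoding frozen.
    "S2": {"present": BASE, "frozen": {"backbone", "position_encoding"}},
    # Stage 3: Adapter modules are added; everything except the Adapter
    # modules and the detection head is frozen.
    "S3": {
        "present": BASE | {"adapter"},
        "frozen": {"backbone", "position_encoding", "encoder", "decoder"},
    },
}


def trainable_components(stage):
    """Return the sorted names of components updated during a stage."""
    spec = STAGES[stage]
    return sorted(spec["present"] - spec["frozen"])


if __name__ == "__main__":
    for stage in ("S1", "S2", "S3"):
        print(stage, trainable_components(stage))
```

In a deep learning framework, the same table would typically drive which parameter groups receive gradient updates in each stage.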
Preferably, the DETR-based electric power operation site safety behavior recognition model comprises a backbone network, a position encoding, an encoding layer, a decoding layer and a target detection head;
the backbone network is used for extracting image features;
the position encoding is used for embedding two-dimensional coordinate information into the feature map so as to preserve spatial context;
the encoding layer is used for receiving the feature map output by the backbone network together with the position encoding information, and adopts a Transformer architecture to learn the relations between features through a self-attention mechanism, so that the model can capture dependencies between targets over a global range;
the decoding layer adopts a Transformer architecture and is used for receiving the output of the encoding layer and a fixed number of target query blocks;
the target detection head consists of a feedforward neural network and is used for receiving the output of the decoding layer and outputting the target class probability and the target position information.
Preferably, the position code is calculated using sine and cosine functions, according to the following formula:
$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
where $pos$ denotes the position, $i$ denotes the encoding dimension, and $d_{model}$ denotes the hidden layer dimension of the model.
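As an illustration of a sine-cosine position code, the sketch below assumes the standard Transformer sinusoidal form with base 10000, in which even indices carry the sine and odd indices the cosine. The two-dimensional variant, which encodes the row and column coordinates separately and concatenates them, is an assumption modeled on common DETR implementations and is not stated in the text; the function names are hypothetical.

```python
import math


def position_code_1d(pos, d_model, base=10000.0):
    """Sinusoidal code for one coordinate: sin at index 2i, cos at index 2i + 1."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    code = []
    for i in range(d_model // 2):
        angle = pos / (base ** (2 * i / d_model))
        code.append(math.sin(angle))  # index 2i
        code.append(math.cos(angle))  # index 2i + 1
    return code


def position_code_2d(x, y, d_model):
    """ASSUMED 2-D variant: half of the channels encode y, the other half x."""
    if d_model % 4 != 0:
        raise ValueError("d_model must be divisible by 4")
    half = d_model // 2
    return position_code_1d(y, half) + position_code_1d(x, half)


if __name__ == "__main__":
    print(position_code_1d(0, 8))
    print(len(position_code_2d(3, 5, 16)))
```

Because each feature-map cell gets a distinct vector that depends only on its coordinates, the code can be added to the backbone features without any trainable parameters.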
Preferably, the target query block does not contain any valid information at initialization, and functions as an input to the decoder, learning and predicting the class and bounding box of each target by interacting with the output of the encoding layer.
Preferably, the structure of the Adapter module is as follows:
the Adapter module consists of two feedforward neural network layers and a nonlinear activation function layer. The first feedforward neural network layer takes the output of a Transformer block as input and projects the original input dimension d down to m; the number of parameters of the Adapter module is limited by controlling the size of m. In the output stage, the second feedforward neural network layer restores the input dimension by projecting m back to d, and the result is the output of the Adapter module.
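The bottleneck shape of the Adapter module can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the ReLU activation, the bias terms and the random initialization range are assumptions, since the text only specifies "a nonlinear activation function layer" and the d to m to d projections. No skip connection is included because the text does not mention one.

```python
import random


def make_linear(n_in, n_out, rng):
    """Create a dense layer as (weights, bias); the init range is arbitrary."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(x, layer):
    """Apply a dense layer to a single feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """ASSUMED nonlinearity; the text does not name the activation function."""
    return [max(0.0, v) for v in x]


def adapter_forward(x, down, up):
    """Project d -> m, apply the nonlinearity, then project m -> d."""
    return linear(relu(linear(x, down)), up)


def adapter_param_count(d, m):
    """Weights plus biases of the down-projection and the up-projection."""
    return (d * m + m) + (m * d + d)


if __name__ == "__main__":
    rng = random.Random(0)
    d, m = 8, 2
    down, up = make_linear(d, m, rng), make_linear(m, d, rng)
    out = adapter_forward([1.0] * d, down, up)
    print(len(out), adapter_param_count(d, m))
```

With biases included, the parameter count grows as 2dm + d + m, so choosing m much smaller than d keeps the number of trainable parameters small relative to a full d-by-d layer.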
Preferably, the Transformer structure is:
the input passes through the multi-head attention layer and a feedforward neural network layer for feature extraction, and the extracted features are added to the original input and normalized; the normalized features then pass through two feedforward neural network layers, the output of these two layers is added to the normalized features, and the output features are obtained after a further normalization.
Preferably, introducing an Adapter module into the Transformer structure specifically comprises:
embedding an Adapter module after the feedforward neural network layer and after the two feedforward neural network layers of the Transformer structure.
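The order of operations in the block, with and without Adapter modules, can be traced with identity layers that only record when they run. This is a sketch under stated assumptions: the layer names are hypothetical, and placing each Adapter before the residual addition is an assumption, since the text only says the Adapter follows the feedforward layers.

```python
def add(a, b):
    """Element-wise residual addition of two feature vectors."""
    return [u + v for u, v in zip(a, b)]


def make_tracing_layers(trace):
    """Build identity layers that append their (hypothetical) name to trace."""
    def layer(name):
        def run(x):
            trace.append(name)
            return list(x)
        return run

    names = ("attention", "ffn", "adapter_1", "norm_1",
             "ffn_a", "ffn_b", "adapter_2", "norm_2")
    return {name: layer(name) for name in names}


def transformer_block(x, layers, use_adapter=False):
    """Attention + FFN, add and normalize, two FFN layers, add and normalize."""
    h = layers["ffn"](layers["attention"](x))
    if use_adapter:
        h = layers["adapter_1"](h)  # ASSUMED to sit before the residual add
    h = layers["norm_1"](add(x, h))
    g = layers["ffn_b"](layers["ffn_a"](h))
    if use_adapter:
        g = layers["adapter_2"](g)  # ASSUMED to sit before the residual add
    return layers["norm_2"](add(h, g))


if __name__ == "__main__":
    trace = []
    transformer_block([1.0, 2.0], make_tracing_layers(trace), use_adapter=True)
    print(trace)
```

Swapping the identity layers for real attention, feedforward, normalization and Adapter callables leaves the control flow unchanged, which is why the Adapter can be added without altering the frozen layers.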
The embodiment of the invention also provides a system for identifying safety behaviors at an electric power operation site, which is used for implementing the above method and specifically comprises:
a DETR model training module based on a general data set, used for constructing a DETR-based electric power operation site safety behavior recognition model and pre-training the model on a general data set, wherein the model comprises a backbone network, a position encoding, an encoding layer, a decoding layer and a target detection head;
a DETR model training module based on a power industry data set, used for constructing a power industry data set covering the power transmission line, power distribution and infrastructure fields, and training the DETR-based model again on the power industry data set without changing its structure, wherein the parameters of the backbone network and the position encoding are frozen during training, to obtain a pre-trained DETR model;
a fine-tuning module, used for introducing an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model and fine-tuning the adjusted model on an electric power operation site safety behavior recognition data set, wherein all parameters of the backbone network, the position encoding, the encoding layer and the decoding layer are frozen and only the parameters of the Adapter module and the target detection head are trained, to obtain after training the final DETR+Adapter-based electric power operation site safety behavior recognition model;
an identification module, used for inputting the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model and outputting a recognition result.
The embodiment of the invention also provides electronic equipment, which comprises a processor, a memory and a bus system, wherein the processor and the memory are connected through the bus system, the memory is used for storing instructions, and the processor is used for executing the instructions stored by the memory so as to realize the method for identifying the safety behavior of the electric power operation site.
The embodiment of the invention also provides a computer storage medium which stores a computer software product, wherein the computer software product comprises a plurality of instructions for enabling a piece of computer equipment to execute the electric power operation site safety behavior identification method.
From the above technical scheme, the invention has the following advantages:
The embodiments of the invention provide a method and a system for identifying safety behaviors at an electric power operation site. First, a DETR model is built and pre-trained on a general data set, so that it fully learns general visual features and its generalization performance improves. Second, a power industry data set covering the power transmission line, power distribution and infrastructure fields is constructed, and the model is trained again on this data set without changing its structure, with the parameters of the backbone network and the position encoding frozen. Training focuses on the encoding layer and the decoding layer of the DETR model, so that they learn context information specific to the power industry and gain the ability to extract features from power industry data, which improves performance on domain-specific tasks. Then, an Adapter module is introduced into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and the adjusted model is fine-tuned on an electric power operation site safety behavior recognition data set; in this process all parameters of the backbone network, the position encoding, the encoding layer and the decoding layer are frozen, and only the parameters of the Adapter module and the target detection head are trained, yielding the final DETR+Adapter-based electric power operation site safety behavior recognition model. Finally, the picture to be detected is input into the DETR+Adapter-based model, and a recognition result is output.
Compared with other algorithms, the method achieves higher recognition precision. Because the DETR architecture is adopted, training is end to end, the whole training process is simpler, and the model can be transferred quickly.
Drawings
To describe the embodiments of the invention or the technical solutions of the prior art more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The drawings are schematic and should not be construed as limiting the invention in any way, and a person skilled in the art can obtain other drawings from them without inventive effort. Wherein:
FIG. 1 is a flow chart of a method for identifying safety behavior of an electric power operation site provided in an embodiment;
FIG. 2 is a flow chart of the construction of a DETR-based electric power operation site safety behavior recognition model in an embodiment;
FIG. 3 is a schematic diagram of the basic structure of an Adapter module according to an embodiment;
FIG. 4 (a) is a schematic diagram of a Transformer structure in an embodiment, and FIG. 4 (b) is a schematic diagram of a Transformer+Adapter structure in an embodiment;
FIG. 5 is a sample image of an electric power operation site safety behavior recognition scenario in an embodiment;
FIG. 6 is a schematic diagram of the results of an ablation experiment under the condition of 100 samples in the example;
FIG. 7 is a schematic diagram of the results of an ablation experiment under the condition of 300 samples in the example;
FIG. 8 is a schematic diagram showing the results of comparative experiments under the condition of 100 samples in the example;
FIG. 9 is a schematic diagram showing the results of comparative experiments under the condition of 300 samples in the example;
FIG. 10 is a block diagram of an electric power operation site safety behavior recognition system provided in an embodiment.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a method for identifying safety behavior of an electric power operation site, where the method includes:
S1: constructing a DETR-based electric power operation site safety behavior recognition model and pre-training the model on a general data set, wherein the model comprises a backbone network, a position encoding, an encoding layer, a decoding layer and a target detection head;
S2: constructing a power industry data set covering the power transmission line, power distribution and infrastructure fields, training the DETR-based model again on the power industry data set without changing its structure, and freezing the parameters of the backbone network and the position encoding during training, to obtain a pre-trained DETR model;
S3: introducing an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and fine-tuning the adjusted model on an electric power operation site safety behavior recognition data set, wherein all parameters of the backbone network, the position encoding, the encoding layer and the decoding layer are frozen and only the parameters of the Adapter module and the target detection head are trained, to obtain after training the final DETR+Adapter-based electric power operation site safety behavior recognition model;
S4: inputting the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model and outputting a recognition result.
According to the above technical scheme, the invention provides an electric power operation site safety behavior recognition method. First, a DETR model is built and pre-trained on a general data set, so that it fully learns general visual features and its generalization performance improves. Second, a power industry data set covering the power transmission line, power distribution and infrastructure fields is constructed, and the model is trained again on this data set without changing its structure, with the parameters of the backbone network and the position encoding frozen. Training focuses on the encoding layer and the decoding layer of the DETR model, so that they learn context information specific to the power industry and gain the ability to extract features from power industry data, which improves performance on domain-specific tasks. Then, an Adapter module is introduced into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and the adjusted model is fine-tuned on an electric power operation site safety behavior recognition data set; in this process all parameters of the backbone network, the position encoding, the encoding layer and the decoding layer are frozen, and only the parameters of the Adapter module and the target detection head are trained, yielding the final DETR+Adapter-based electric power operation site safety behavior recognition model. Finally, the picture to be detected is input into the DETR+Adapter-based model, and a recognition result is output.
Compared with other algorithms, the method achieves higher recognition precision. Because the DETR architecture is adopted, training is end to end, the whole training process is simpler, and the model can be transferred quickly.
In this embodiment, in step S1, a DETR-based electric power job site safety behavior recognition model is constructed, and the recognition model includes a backbone network, a position encoding layer, an encoding layer, a decoding layer, and a target detection head.
Further, the present invention uses a generic data set to pretrain the DETR model. The target detection field has a plurality of public data sets, wherein the COCO data set comprises 123287 images, covers 80 target categories, has rich category characteristic information, is favorable for the model to learn more scenes and target information, and is the target detection data set most commonly used in the industry, so that the invention adopts the COCO data set for pre-training, fully learns general visual characteristics and improves the generalization performance of the model.
Further, the construction flow of the DETR-based electric power operation site safety behavior recognition model is shown in fig. 2. The backbone network is the basic part of the DETR model and is used for extracting image features; the invention adopts ResNet50, which is based on a convolutional neural network structure, as the backbone network to convert the input image into a group of feature maps, which contain high-level semantic information of the original image and provide the basis for the subsequent target detection task. The position code is used for embedding two-dimensional coordinate information into the feature map so as to preserve spatial context and help the model better understand the relative positions of targets in the image.
In the formula, the symbols denote, in order, the position, the encoding dimension, and the hidden layer dimension of the model.
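As a hedged illustration only, the sketch below assumes the standard sinusoidal encoding used by Transformer models, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)), which is consistent with the three quantities named above (position, encoding dimension, hidden layer dimension). The constant 10000, the function names and the way the x and y coordinates are combined are assumptions and are not taken from this document.

```python
import math


def sinusoidal_encoding(pos, d_model, base=10000.0):
    """Return the d_model-dimensional sinusoidal encoding of one position.

    Assumed form: PE(pos, 2i) = sin(pos / base^(2i/d_model)),
                  PE(pos, 2i+1) = cos(pos / base^(2i/d_model)).
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    encoding = []
    for i in range(d_model // 2):
        angle = pos / (base ** (2 * i / d_model))
        encoding.append(math.sin(angle))  # channel 2i
        encoding.append(math.cos(angle))  # channel 2i + 1
    return encoding


def encode_xy(x, y, d_model):
    """Assumed 2-D variant: half of the channels encode y, the other half x.

    d_model must be divisible by 4 so that each half is even.
    """
    half = d_model // 2
    return sinusoidal_encoding(y, half) + sinusoidal_encoding(x, half)


print(sinusoidal_encoding(0, 4))   # position 0 gives sin(0) = 0 and cos(0) = 1
print(len(encode_xy(3, 5, 8)))
```

Because every channel is a bounded sine or cosine of the coordinate, such an encoding adds position information to the feature map without introducing trainable parameters, which is compatible with keeping the position encoding frozen in the later training steps.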
The encoding layer receives the feature map output by the backbone network together with the position encoding information and adopts a Transformer architecture, learning the relations between features through a self-attention mechanism so that the model can capture dependencies between targets over the global range, which improves detection performance. The decoding layer also adopts a Transformer architecture and receives the output of the encoding layer and a fixed number of target query blocks (object queries). The target query blocks contain no valid information at initialization; they serve as the input of the decoder and learn to predict the class and bounding box of each target by interacting with the output of the encoding layer. The detection head consists of a feedforward neural network (Feed Forward Network, FFN) and receives the output of the decoding layer and outputs the target class probability and the target position information. For each target query block of the decoding layer, the target class probability and target position information are obtained through the FFN, and the target detection frames are finally output after the confidence is judged.
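The confidence judgment at the end of this step can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes that each target query block yields one vector of class scores whose last entry is a "no object" class (as in the published DETR design), that a softmax turns the scores into probabilities, and that a fixed threshold of 0.5 is used. The function names, the threshold and the box format are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_detections(query_logits, query_boxes, threshold=0.5):
    """Keep the queries whose best real class reaches the confidence threshold.

    Assumption: the last logit of every query is the "no object" class.
    Returns a list of (class_index, confidence, box) tuples.
    """
    detections = []
    for logits, box in zip(query_logits, query_boxes):
        probs = softmax(logits)[:-1]      # drop the "no object" slot
        score = max(probs)
        if score >= threshold:
            detections.append((probs.index(score), score, box))
    return detections


# Two queries, two real classes plus "no object"; boxes are (cx, cy, w, h).
logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
boxes = [(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.1, 0.1)]
print(select_detections(logits, boxes))
```

In this example the first query is kept as class 0 and the second is discarded, because almost all of its probability mass falls on the assumed "no object" class.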
In this embodiment, in step S2, a power industry data set covering the power transmission line, power distribution and infrastructure fields is constructed, and the DETR-based electric power operation site safety behavior recognition model is trained again on the power industry data set without changing its structure. During training, the parameters of the backbone network and the position encoding are frozen, and a pre-trained DETR model is obtained. This approach first reduces the number of trainable parameters and speeds up training, and second preserves the model's ability to extract general features and reduces the risk of overfitting. The encoding layer and the decoding layer of the DETR model are mainly trained, so that they learn context information specific to the power industry, gain the ability to extract features from power industry data, and improve the performance of the model on domain-specific tasks.
In this embodiment, in step S3, based on the pre-trained DETR model, an Adapter module is introduced into the Transformer structure of its encoding layer and decoding layer. After the model structure has been adjusted, fine-tuning is performed on an electric power operation site safety behavior recognition data set; in this process all parameters of the backbone network, the position encoding layer, the encoding layer and the decoding layer are frozen, and only the parameters of the Adapter module and the target detection head are trained. After training is completed, the final DETR+Adapter-based electric power operation site safety behavior recognition model is obtained.
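The freezing schedule of steps S1 to S3 can be summarized in a small table of flags. The sketch below is illustrative only: the module names and the `STAGES` layout are hypothetical, and two points are assumptions rather than statements of this document, namely that nothing is frozen during the general pre-training of S1 and that the target detection head remains trainable in S2.

```python
BASE_MODULES = ["backbone", "position_encoding", "encoder", "decoder",
                "detection_head"]

# Which modules exist and which are frozen in each training step.
STAGES = {
    # S1: pre-training on the general data set (assumed: nothing frozen).
    "S1": {"modules": BASE_MODULES, "frozen": set()},
    # S2: training on the power industry data set.
    "S2": {"modules": BASE_MODULES,
           "frozen": {"backbone", "position_encoding"}},
    # S3: Adapter fine-tuning on the safety behavior recognition data set.
    "S3": {"modules": BASE_MODULES + ["adapter"],
           "frozen": {"backbone", "position_encoding", "encoder", "decoder"}},
}


def trainable_modules(stage):
    """Return the modules whose parameters are updated in the given step."""
    spec = STAGES[stage]
    return [name for name in spec["modules"] if name not in spec["frozen"]]


for stage in STAGES:
    print(stage, trainable_modules(stage))
```

In a deep learning framework this table would typically be applied by disabling gradient updates for the parameters of the frozen modules before each training step begins; for example, in PyTorch a parameter is excluded from training by setting its `requires_grad` attribute to `False`.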
The common approach to model fine-tuning is to directly retrain the pre-trained model and update its weight parameters, which generally takes one of two forms: full fine-tuning and partial fine-tuning. Full fine-tuning consumes a large amount of training resources because all parameters of the model are updated; partial fine-tuning trains faster, but its effect depends on which part of the model is selected for updating. In addition, changing the weights of the pre-trained model may drive the model in an unpredictable direction and cause problems such as catastrophic forgetting.
In order to solve the above problems, the invention provides a model fine-tuning method based on an Adapter module. The Adapter module is a lightweight structure consisting of two feedforward neural network layers and a nonlinear activation function layer. The first feedforward neural network layer takes the output of a Transformer block as input and projects the original input dimension d (high-dimensional features) down to m (low-dimensional features); the number of parameters of the Adapter module is limited by controlling the size of m, and in general m ≪ d. In the output stage, the second feedforward neural network layer restores the input dimension by projecting m back to d, and the result serves as the output of the Adapter module. The basic structure is shown in fig. 3. The core idea of the Adapter is to add only a small number of new parameters to the original model structure, which reduces the number of trainable parameters and the computational cost of transfer learning and lowers the training time and memory requirement.
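The bottleneck structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: ReLU is assumed as the nonlinear activation, the weights are randomly initialized, and the sizes d = 256 and m = 16 are hypothetical values chosen only to show the effect of m ≪ d on the parameter count.

```python
import random


def linear(x, weight, bias):
    """Compute y = W x + b for a single feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def relu(x):
    """Assumed nonlinear activation function."""
    return [max(0.0, v) for v in x]


class Adapter:
    """Down-projection d -> m, nonlinearity, up-projection m -> d."""

    def __init__(self, d, m, seed=0):
        rng = random.Random(seed)
        self.w_down = [[rng.gauss(0.0, 0.01) for _ in range(d)] for _ in range(m)]
        self.b_down = [0.0] * m
        self.w_up = [[rng.gauss(0.0, 0.01) for _ in range(m)] for _ in range(d)]
        self.b_up = [0.0] * d

    def forward(self, x):
        hidden = relu(linear(x, self.w_down, self.b_down))  # d -> m
        return linear(hidden, self.w_up, self.b_up)          # m -> d

    def num_parameters(self):
        m, d = len(self.w_down), len(self.w_up)
        return 2 * d * m + d + m   # two weight matrices plus two bias vectors


adapter = Adapter(d=256, m=16)
print(len(adapter.forward([1.0] * 256)), adapter.num_parameters())
```

With these hypothetical sizes the Adapter holds 2·256·16 + 256 + 16 = 8464 parameters, compared with 256·256 + 256 = 65792 for a single full d-to-d linear layer, which illustrates how the choice of m bounds the number of parameters trained in step S3.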
Further, based on the pre-trained DETR model, an Adapter module is introduced into the Transformer structure of its encoding layer and decoding layer. As shown in fig. 4, the modification inserts new linear layers into the Transformer layer, thereby enabling the model to adapt to new tasks. The parameters of these linear layers are trained on the target task, while the parameters of the original model remain unchanged.
As shown in fig. 4 (a), the Transformer structure is as follows: the input passes through the multi-head attention layer and a feedforward neural network layer for feature extraction, and the result is added to the input and normalized; the normalized features then enter the two feedforward neural network layers, the features extracted by these two layers are added to the normalized features, and the output features are obtained after a second normalization. As shown in fig. 4 (b), the Transformer+Adapter structure embeds one Adapter module after the feedforward neural network layer and another after the two feedforward neural network layers of the Transformer structure.
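The order of operations in the Transformer+Adapter structure can be sketched as a short data-flow function. The sketch reflects one reading of the description above (each Adapter is applied before the corresponding residual addition); the layer names are hypothetical, and identity functions stand in for the real sub-layers so that only the call order and the two residual additions are shown.

```python
def transformer_adapter_block(x, layers):
    """Apply one Transformer+Adapter block; `layers` maps names to callables."""
    h = layers["attention"](x)            # multi-head attention
    h = layers["feedforward"](h)          # single feedforward layer
    h = layers["adapter_1"](h)            # first Adapter
    x = layers["norm_1"]([a + b for a, b in zip(x, h)])     # residual + norm
    h = layers["two_layer_ffn"](x)        # two feedforward layers
    h = layers["adapter_2"](h)            # second Adapter
    return layers["norm_2"]([a + b for a, b in zip(x, h)])  # residual + norm


calls = []


def traced(name):
    """Build an identity stand-in that records when it is called."""
    def layer(values):
        calls.append(name)
        return values
    return layer


names = ["attention", "feedforward", "adapter_1", "norm_1",
         "two_layer_ffn", "adapter_2", "norm_2"]
out = transformer_adapter_block([1.0, 2.0], {n: traced(n) for n in names})
print(calls)
print(out)
```

With identity stand-ins each residual addition doubles the features, so the input [1.0, 2.0] becomes [4.0, 8.0]; in the real block only the two Adapter callables would carry parameters that are trained in step S3.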
In this embodiment, in step S4, the picture to be detected is input into the DETR+Adapter-based electric power operation site safety behavior recognition model, and the recognition result is output.
To demonstrate the advantages of the method of the present invention, the following description is provided in connection with specific experiments.
The detection samples are shown in fig. 5. The service scene targeted is electric power safety behavior recognition, which comprises three tasks: safety helmet wearing recognition, safety belt wearing recognition, and short-sleeve and shorts recognition. Specifically, safety helmet wearing recognition detects whether a worker wears a safety helmet and has two classes, helmet worn and helmet not worn; safety belt wearing recognition detects whether a worker wears the safety belt correctly and has three classes, safety belt worn correctly, safety belt worn incorrectly and no safety belt worn; short-sleeve and shorts recognition detects whether a worker is dressed in short sleeves or shorts and has two classes, short sleeves or shorts and no short sleeves or shorts.
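The three tasks and their classes can be written down as a small label table. The sketch below is illustrative only: the identifiers are hypothetical, and merging the seven classes into a single flat label map for one detection head is an assumption, since this document does not state how the class indices are organized.

```python
# Hypothetical identifiers for the three tasks and their classes.
SAFETY_TASKS = {
    "safety_helmet": ["helmet_worn", "helmet_not_worn"],
    "safety_belt": ["belt_worn_correctly", "belt_worn_incorrectly",
                    "belt_not_worn"],
    "clothing": ["short_sleeves_or_shorts", "no_short_sleeves_or_shorts"],
}


def build_label_map(tasks):
    """Assign consecutive class indices across all tasks (assumed layout)."""
    labels = [label for names in tasks.values() for label in names]
    return {label: index for index, label in enumerate(labels)}


LABEL_MAP = build_label_map(SAFETY_TASKS)
print(len(LABEL_MAP), LABEL_MAP)
```

Under this assumed layout the detection head would predict seven target classes, two for the safety helmet task, three for the safety belt task and two for the clothing task.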
(1) Evaluation index
There are numerous evaluation indexes in the target detection field. Common ones include precision (Precision), recall (Recall), the harmonic mean of the two (F1-score), average precision (Average Precision, AP), intersection over union (Intersection over Union, IoU), mean average precision (Mean Average Precision, MAP), false detection rate (False Positive Rate, FPR), missed detection rate (False Negative Rate, FNR), and the like. In the electric power safety supervision scene, targets of different classes need to be detected accurately. MAP is the mean of the APs of all classes; it evaluates the performance of the model on the overall detection task, reflects the generalization ability of the model across classes, and is a good indicator of comprehensive performance. In addition, in the electric power safety supervision scene, false detections and missed detections can lead to equipment damage and potential safety hazards and can cause serious safety accidents. Therefore, the invention selects three indexes for evaluation: mean average precision (MAP), false detection rate (FPR) and missed detection rate (FNR).
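The three selected indexes, together with IoU, can be computed as sketched below. The formulas are the textbook definitions FPR = FP / (FP + TN), FNR = FN / (FN + TP) and MAP = mean of the per-class AP values; how the experiments of this document count true negatives for a detection task, and which IoU threshold decides whether a detection is correct, are not stated here, so the counts and AP values in the example are purely hypothetical.

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN); returns 0.0 when there are no negatives."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def false_negative_rate(fn, tp):
    """FNR = FN / (FN + TP); returns 0.0 when there are no positives."""
    return fn / (fn + tp) if (fn + tp) else 0.0


def mean_average_precision(ap_per_class):
    """MAP is the mean of the AP values of all classes."""
    return sum(ap_per_class.values()) / len(ap_per_class)


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Hypothetical counts and AP values, for illustration only.
print(false_positive_rate(fp=5, tn=95))
print(false_negative_rate(fn=2, tp=8))
print(mean_average_precision({"helmet_worn": 0.5, "helmet_not_worn": 1.0}))
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```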
(2) Experimental environment
The experimental equipment used by the invention is one GPU server containing 4 Tesla A100 GPU cards; the memory is DDR4 2933 32G 16, and the hard disk capacity is 2.4TB.
(3) Data set construction
A data set of 400 samples in total is constructed, covering the three scenes of safety helmet wearing recognition, short-sleeve and shorts recognition and safety belt wearing recognition. By random sampling, 100 samples are selected as the test set and the remaining 300 samples are used as the training set. To better show the performance of the proposed algorithm under small-sample conditions, 100 pictures are randomly sampled again from the 300 training samples to form a comparison training set. In the end there are two training sets, of 100 and 300 samples, and one test set of 100 samples.
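The split described above can be reproduced with two rounds of random sampling. The sketch below is a minimal illustration: the sample identifiers, the function name and the fixed random seed are hypothetical, and the seed is used only to make the example repeatable.

```python
import random


def split_dataset(sample_ids, test_size=100, small_train_size=100, seed=0):
    """Split samples into a test set, a full training set and a small subset."""
    rng = random.Random(seed)
    test_ids = set(rng.sample(sample_ids, test_size))
    train_full = [s for s in sample_ids if s not in test_ids]
    # The comparison training set is drawn from the full training set only.
    train_small = rng.sample(train_full, small_train_size)
    return sorted(test_ids), train_full, train_small


ids = list(range(400))
test_ids, train_300, train_100 = split_dataset(ids)
print(len(test_ids), len(train_300), len(train_100))
```

Drawing the 100-sample training set from the 300-sample training set, rather than from all 400 samples, keeps both training sets disjoint from the shared test set, so results for the two sample sizes are measured on the same 100 test pictures.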
(4) Experimental results
1) Ablation experiments
In order to analyze the contribution of each step of the model training method to model performance, an ablation experiment is carried out. It covers a DETR model without pre-training, a DETR-1 model pre-trained with the general data set, a DETR-2 model pre-trained with the industry data set, and a DETR+Adapter model with the Adapter embedded. Experimental analysis is carried out with training sample sizes of 100 and 300, and the experimental results are shown in fig. 6 and 7.
The summarized results are shown in table 1 (ablation experimental results). They show that pre-training clearly improves the accuracy of the model, and that adding the Adapter module slightly improves its performance further.
TABLE 1
2) Mainstream target detection algorithm contrast
The invention also carries out comparison experiments with the mainstream target detection algorithms YOLOv5 and fast RCNN, and also carries out experimental analysis under the condition that the training sample size is 100 and 300, and the experimental results are shown in fig. 8 and 9.
The summarized concrete implementation results are shown in table 2 (the comparison result of the invention and the main stream target detection algorithm), and the results prove that the model precision of the DETR+adapter provided by the invention is improved to a certain extent under the condition of a small sample, and because the DETR model architecture is adopted, the end-to-end training can be carried out, the whole training process is simpler and more convenient, and the rapid migration of the model can be realized.
TABLE 2
Example II
As shown in fig. 10, the present invention provides a system for identifying safety behavior of an electric power operation site, where the system is configured to implement the above-mentioned method for identifying safety behavior of an electric power operation site, and specifically includes:
the DETR model training module 10 based on the general data set is configured to construct a DETR-based electric power operation site safety behavior recognition model and pre-train the model with the general data set, where the DETR-based electric power operation site safety behavior recognition model includes a backbone network, a position encoding layer, an encoding layer, a decoding layer, and a target detection head;
the DETR model training module 20 based on the electric power industry data set is configured to construct an electric power industry data set composed of electric power transmission lines, power distribution and infrastructure fields, train the electric power operation site safety behavior recognition model based on the DETR based on the electric power industry data set again on the basis of not changing the electric power operation site safety behavior recognition model structure based on the DETR, and freeze parameters of a backbone network and position codes during training to obtain a pre-trained DETR model;
the fine tuning module 30 is configured to introduce an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, perform fine-tuning on an electric power operation site safety behavior recognition data set after the model structure has been adjusted, freeze all parameters of the backbone network, the position encoding layer, the encoding layer and the decoding layer in the process, train only the parameters of the Adapter module and the target detection head, and obtain the final DETR+Adapter-based electric power operation site safety behavior recognition model after training is completed;
the recognition module 40 is configured to input the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model, and output a recognition result.
This embodiment of the electric power operation site safety behavior recognition system is used to implement the foregoing electric power operation site safety behavior recognition method, so its details can be found in the method embodiment above. For example, the DETR model training module 10 based on the general data set, the DETR model training module 20 based on the power industry data set, the fine tuning module 30 and the recognition module 40 implement steps S1, S2, S3 and S4 of the method, respectively; to avoid redundancy, the description is not repeated here.
Example III
The embodiment of the invention also provides electronic equipment, which comprises a processor, a memory and a bus system, wherein the processor and the memory are connected through the bus system, the memory is used for storing instructions, and the processor is used for executing the instructions stored by the memory so as to realize the method for identifying the safety behavior of the electric power operation site.
Example IV
The embodiment of the invention also provides a computer storage medium which stores a computer software product, wherein the computer software product comprises a plurality of instructions for enabling a piece of computer equipment to execute the electric power operation site safety behavior identification method.
It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations and modifications will be apparent to those of ordinary skill in the art in light of the foregoing description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom still fall within the scope of protection of the present invention.

Claims (10)

1. A method for identifying safety behavior of an electric power operation site, comprising:
S1: constructing a DETR-based electric power operation site safety behavior recognition model, and pre-training the model with a general data set, wherein the DETR-based electric power operation site safety behavior recognition model comprises a backbone network, a position encoding layer, an encoding layer, a decoding layer and a target detection head;
S2: constructing a power industry data set covering the power transmission line, power distribution and infrastructure fields, training the DETR-based electric power operation site safety behavior recognition model again on the power industry data set without changing its structure, and freezing the parameters of the backbone network and the position encoding during training to obtain a pre-trained DETR model;
S3: based on the pre-trained DETR model, introducing an Adapter module into the Transformer structure of its encoding layer and decoding layer, performing fine-tuning on an electric power operation site safety behavior recognition data set after the model structure has been adjusted, freezing all parameters of the backbone network, the position encoding layer, the encoding layer and the decoding layer in the process, training only the parameters of the Adapter module and the target detection head, and obtaining the final DETR+Adapter-based electric power operation site safety behavior recognition model after training is completed;
S4: inputting the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model, and outputting a recognition result.
2. The method for identifying the safety behavior of the electric power operation site according to claim 1, wherein the electric power operation site safety behavior identification model based on the DETR comprises a backbone network, a position code, a coding layer, a decoding layer and a target detection head;
the backbone network is used for extracting image characteristics;
the position codes are used for embedding two-dimensional coordinate information into the feature map so as to preserve space context;
the coding layer is used for receiving the feature map and the position coding information output by the backbone network, and adopts a Transformer architecture to learn the relations between features through a self-attention mechanism, so that the model can capture the dependencies between targets over the global range;
the decoding layer adopts a Transformer architecture and is used for receiving the output of the coding layer and a fixed number of target query blocks;
the detection head consists of a feedforward neural network and is used for receiving the output of the decoding layer and outputting the target class probability and the target position information.
3. The method for identifying safety behavior of an electric power operation site according to claim 2, wherein the position code is calculated by using a cosine function, and the calculation formula is as follows:
in the formula, the symbols denote, in order, the position, the encoding dimension, and the hidden layer dimension of the model.
4. The method of claim 2, wherein the target query block does not contain any valid information at initialization, and functions as an input to a decoder, learning and predicting the class and bounding box of each target by interacting with the output of the encoding layer.
5. The method for identifying safety behavior of an electric power operation site according to claim 1, wherein the Adapter module has a structure as follows:
the Adapter module consists of two feedforward neural network layers and a nonlinear activation function layer, wherein the first feedforward neural network layer takes the output of a Transformer block as input and projects the original input dimension d to m, and the number of parameters of the Adapter module is limited by controlling the size of m; in the output stage, the second feedforward neural network layer restores the input dimension by projecting m back to d, and the result serves as the output of the Adapter module.
6. The method for identifying safety behavior of an electric power operation site according to claim 1, wherein the Transformer structure is:
the input passes through the multi-head attention layer and a feedforward neural network layer for feature extraction, and the result is added to the input and normalized; the normalized features then enter the two feedforward neural network layers, the features extracted by the two feedforward neural network layers are added to the normalized features, and the output features are obtained after normalization.
7. The method for identifying safety behavior of an electric power operation site according to claim 1, wherein introducing an Adapter module into the Transformer structure specifically comprises:
embedding an Adapter module after the feedforward neural network layer and after the two feedforward neural network layers of the Transformer structure.
8. An electric power operation site safety behavior recognition system, characterized in that the system is used for implementing the electric power operation site safety behavior recognition method according to any one of claims 1 to 7, and specifically comprises:
a DETR model training module based on a general data set, used for constructing a DETR-based electric power operation site safety behavior recognition model and pre-training the model with the general data set, wherein the DETR-based electric power operation site safety behavior recognition model comprises a backbone network, a position encoding layer, an encoding layer, a decoding layer and a target detection head;
a DETR model training module based on a power industry data set, used for constructing a power industry data set covering the power transmission line, power distribution and infrastructure fields, and training the DETR-based electric power operation site safety behavior recognition model again on the power industry data set without changing its structure, wherein the parameters of the backbone network and the position encoding are frozen during training to obtain a pre-trained DETR model;
a fine tuning module, used for introducing an Adapter module into the Transformer structure of the encoding layer and the decoding layer of the pre-trained DETR model, and performing fine-tuning on an electric power operation site safety behavior recognition data set after the model structure has been adjusted, wherein all parameters of the backbone network, the position encoding layer, the encoding layer and the decoding layer are frozen in the process, only the parameters of the Adapter module and the target detection head are trained, and the final DETR+Adapter-based electric power operation site safety behavior recognition model is obtained after training is completed;
a recognition module, used for inputting the picture to be detected into the DETR+Adapter-based electric power operation site safety behavior recognition model and outputting a recognition result.
9. An electronic device comprising a processor, a memory and a bus system, the processor and the memory being connected by the bus system, the memory being configured to store instructions, the processor being configured to execute the instructions stored by the memory to implement the method for identifying a safety behavior of an electrical work site according to any one of claims 1 to 7.
10. A computer storage medium storing a computer software product comprising instructions for causing a computer device to perform the method of identifying safety behavior of an electric power job site according to any one of claims 1 to 7.
CN202311839596.XA 2023-12-29 2023-12-29 Electric power operation site safety behavior identification method and system Active CN117496131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311839596.XA CN117496131B (en) 2023-12-29 2023-12-29 Electric power operation site safety behavior identification method and system

Publications (2)

Publication Number Publication Date
CN117496131A true CN117496131A (en) 2024-02-02
CN117496131B CN117496131B (en) 2024-05-10

Family

ID=89680344

Country Status (1)

Country Link
CN (1) CN117496131B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023019636A1 (en) * 2021-08-18 2023-02-23 浙江工商大学 Defect point identification method based on deep learning network
WO2023116507A1 (en) * 2021-12-22 2023-06-29 北京沃东天骏信息技术有限公司 Target detection model training method and apparatus, and target detection method and apparatus
CN116186171A (en) * 2022-12-19 2023-05-30 中国人民解放军战略支援部队信息工程大学 Continuous relation extraction method and system based on multi-head self-attention mechanism adapter
CN117076983A (en) * 2023-08-10 2023-11-17 中国移动通信集团广东有限公司 Transmission outer line resource identification detection method, device, equipment and storage medium
CN117253191A (en) * 2023-10-09 2023-12-19 上海工程技术大学 Safety helmet wearing detection method based on DETR model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHE CHEN等: "VISION TRANSFORMER ADAPTER FOR DENSE PREDICTIONS", 《ARXIV》, 13 February 2023 (2023-02-13) *
周丽娟等: "视觉Transformer 识别任务研究综述", 《中国图象图形学报》, vol. 28, no. 10, 31 October 2023 (2023-10-31) *
李铭;郑苏生;姚磊岳;: "基于HOG+SVM实现对任意物体的检测", 现代信息科技, no. 24, 25 December 2019 (2019-12-25) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant