CN115410111A - A helmet detection method based on structural reparameterization for edge devices - Google Patents
A helmet detection method based on structural reparameterization for edge devices
- Publication number
- CN115410111A (application CN202210843903.0A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- basic
- rep
- safety helmet
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention relates to a safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps: 1) constructing a dense reparameterization module; 2) constructing a standard YOLOv3-tiny model and a training data set suited to helmet detection; 3) restructuring and training the standard YOLOv3-tiny model; 4) equivalently converting the trained restructured model into an inference model and performing helmet detection. Compared with the prior art, the invention offers high real-time performance, high accuracy, and strong generalization ability; it avoids gradient vanishing and gradient explosion, reduces feature redundancy, and improves the network's learning ability.
Description
Technical Field
The invention relates to the technical fields of deep-neural-network structural reparameterization and object detection, and in particular to a safety-helmet detection method for edge devices based on structural reparameterization.
Background Art
Engineering, construction, and similar industries are typically labor-intensive, with complex working environments and frequent safety accidents. Fatal head trauma caused by objects falling from height is a common accident type in the construction industry. As an effective piece of protective equipment, the safety helmet can absorb the impact energy of falling objects and reduce head injuries, and it is therefore widely used on construction sites. In site safety management, real-time and accurate supervision of helmet wearing is an extremely important task. Because site safety management requires detection equipment that is small and easily moved, embedded edge-computing devices are commonly used, which has left helmet detection algorithms lacking in real-time performance and accuracy.
Much work at home and abroad has studied automatic helmet recognition. Dalal et al. first proposed automatic helmet detection by extracting gradient-histogram features; Feng Jie et al. used an AdaBoost classifier to detect the helmet position and judged whether a helmet was worn from the positional relationship between the person and the helmet; Hu Tian et al., building on an analysis of wavelet transforms and deep learning for helmet recognition, proposed a neural-network helmet recognition model; Liu Xiaohui et al. located faces using skin-color detection and then used a support vector machine (SVM) to recognize helmets; Liu Yunbo et al. judged whether a helmet was worn from the distribution of pixel chroma values in the upper third of a moving target. Although these methods achieve fairly accurate helmet recognition in specific scenarios, they still suffer from demanding environmental requirements, poor real-time performance, weak generalization ability, and complicated operation.
In recent years, with deepening research into and widespread application of deep learning, deep-network object detection methods represented by the YOLOv3 (You Only Look Once v3) model offer both good real-time performance and high accuracy. However, their complex and varied convolutional structures fragment the network, increase its complexity, introduce high feature redundancy, lower memory-access efficiency, and reduce flexibility, which severely hinders the deployment of YOLOv3-based helmet detection on edge devices with weak compute and little memory.
Summary of the Invention
The purpose of the present invention is to overcome the above defects of the prior art by providing a safety-helmet detection method for edge devices based on structural reparameterization.
The purpose of the present invention can be achieved through the following technical solutions:
A safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps:
1) Construct a dense reparameterization module.
2) Construct a standard YOLOv3-tiny model and a training data set suited to helmet detection.
3) Restructure and train the standard YOLOv3-tiny model.
4) Equivalently convert the trained restructured model into an inference model and perform helmet detection.
Step 1) specifically comprises the following steps:
11) Define the basic unit.
12) Construct the transform structure.
13) Construct the learning structure.
14) Construct the dense reparameterization module DR-Block: the DR-Block consists of one transform structure cascaded with one learning structure. The transform structure is written F_trans(x; C_rep_in) and the learning structure F_learn(x_learn; 2×C_rep_in, C_rep_out, K_rep, S_rep), where x_learn = F_trans(x; C_rep_in); the dense reparameterization module DR-Block is then denoted F_rep(x; C_rep_in, C_rep_out, K_rep, S_rep).
In step 11), the basic unit consists of one convolutional layer cascaded with one batch-normalization layer and is denoted F_basic(x; C_basic_in, C_basic_out, K_basic, S_basic), where x is the input, C_basic_in the number of input channels, C_basic_out the number of output channels, K_basic the convolution kernel size, and S_basic the stride. Specifically, the convolutional layer has C_basic_in input channels, C_basic_out output channels, and kernel size K_basic, and the batch-normalization layer acts on C_basic_out channels.
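To make the notation concrete, the basic unit maps directly onto two standard PyTorch layers. The following minimal sketch is illustrative rather than the patent's code; the class and argument names are our own, and "same" padding of K_basic // 2 is assumed:

```python
import torch.nn as nn

class BasicUnit(nn.Module):
    """One convolution cascaded with one batch-normalization layer,
    i.e. F_basic(x; c_in, c_out, k, s)."""
    def __init__(self, c_in, c_out, k, s):
        super().__init__()
        # bias omitted: the following BN layer absorbs any constant offset
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=k, stride=s,
                              padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        return self.bn(self.conv(x))
```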
Step 12) is specifically:
First, four basic units are cascaded to form a deep cascaded network structure that over-parameterizes the network; then skip connections are added between every pair of basic units to integrate features from different levels into a model ensemble; finally, the outputs of all the basic units are spliced together. The result is denoted F_trans(x; C_trans_in).
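A hedged sketch of this transform structure, reusing the BasicUnit above. The patent fixes the output width at 2×C_trans_in but leaves the per-unit channel widths to Fig. 2, so giving each of the four units C_trans_in/2 output channels (assuming C_trans_in is even) and feeding every unit the concatenation of the input and all earlier unit outputs is one plausible reading, not a verbatim specification:

```python
import torch
import torch.nn as nn

class DenseTransform(nn.Module):
    """Four cascaded 1x1 basic units with dense skip connections;
    the four unit outputs are spliced into a 2*c_in-channel tensor."""
    def __init__(self, c_in, n_units=4):
        super().__init__()
        c_unit = c_in // 2          # assumed split: 4 units * c_in/2 = 2*c_in
        self.units = nn.ModuleList()
        c_cur = c_in
        for _ in range(n_units):
            self.units.append(BasicUnit(c_cur, c_unit, k=1, s=1))
            c_cur += c_unit         # later units also see this unit's output

    def forward(self, x):
        feats, outs = [x], []
        for unit in self.units:
            y = unit(torch.cat(feats, dim=1))  # skip connections via concat
            feats.append(y)
            outs.append(y)
        return torch.cat(outs, dim=1)          # splice all unit outputs
```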
In step 13), the learning structure consists of one basic unit and is denoted F_learn(x; C_learn_in, C_learn_out, K_learn, S_learn).
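With the basic unit, transform structure, and learning structure sketched above, the DR-Block of step 14) is simply their cascade; again an illustrative sketch rather than the patent's code:

```python
import torch.nn as nn

class DRBlock(nn.Module):
    """F_rep(x; c_in, c_out, k, s): a transform structure cascaded with
    a learning structure that sees 2*c_in input channels."""
    def __init__(self, c_in, c_out, k, s):
        super().__init__()
        self.trans = DenseTransform(c_in)              # F_trans(x; c_in)
        self.learn = BasicUnit(2 * c_in, c_out, k, s)  # F_learn(x_learn; ...)

    def forward(self, x):
        return self.learn(self.trans(x))
```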
Step 2) specifically comprises the following steps:
21) Collect and annotate a helmet detection data set from construction sites and web images, and apply standard data preprocessing.
22) Build a standard YOLOv3-tiny model and set the number of detection classes to 2: person not wearing a helmet and person wearing a helmet.
Step 3) specifically comprises the following steps:
31) Replace every non-1×1 convolutional layer in the standard YOLOv3-tiny model, together with its cascaded batch-normalization layer, with a dense reparameterization module DR-Block; the restructured model is denoted DR-Net. A convolution of the standard YOLOv3-tiny model with C_in input channels, C_out output channels, kernel size K (K≠1), and stride S is replaced by F_rep(x; C_in, C_out, K, S).
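As an illustration of step 31), the swap can be done by walking the module tree. This sketch assumes the DRBlock class above and glosses over removing the batch-normalization layer that follows each replaced convolution inside YOLOv3-tiny's sequential blocks:

```python
import torch.nn as nn

def rebuild_as_dr_net(module):
    """Recursively replace every non-1x1 conv with a DR-Block of the
    same signature (removal of the trailing BN layer is elided)."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d) and child.kernel_size != (1, 1):
            setattr(module, name,
                    DRBlock(child.in_channels, child.out_channels,
                            child.kernel_size[0], child.stride[0]))
        else:
            rebuild_as_dr_net(child)   # recurse into submodules
    return module
```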
32) Train the restructured model DR-Net on the helmet detection data set using the standard YOLOv3-tiny training parameters and training strategy.
Step 4) specifically comprises the following steps:
41) Convert each basic unit into a single convolutional layer. Let w_conv denote the weight of the convolution in the basic unit, and let μ, σ, γ, and β denote the mean, standard deviation, scaling coefficient, and shift coefficient of the batch-normalization layer; the reconstructed single convolutional layer F′_basic then has weight w′ = (γ/σ)·w_conv and bias b′ = β − γμ/σ.
42) Convert each convolution that uses a skip connection into a single convolutional layer. The converted single convolutional layer F′_skip_connect has weight w′ = concat([w_prev, w]) and bias b′ = concat([b_prev, b]), where concat denotes the concatenation operation, w_prev and b_prev are the weight and bias of the equivalent convolution producing the skip input, and w and b are the weight and bias of the convolution being converted.
43) Convert two cascaded convolutions into a single convolutional layer.
44) Convert the transform structure into a single 1×1 convolutional layer: following step 41), convert all convolutions in the transform structure into single convolutional layers; then, following step 42), restructure all skip connections in turn into single convolutional layers, finally obtaining a single convolutional layer F′_trans equivalent to the entire transform structure.
45) Convert the DR-Block into a single convolutional layer.
46) Convert every DR-Block in DR-Net into a single convolutional layer following step 45), thereby obtaining the inference-stage structure required for deployment.
47) Extract on-site image frames, feed them into the model obtained in step 46) for helmet detection, and output the detection results.
Step 43) is specifically:
First transpose the first and second dimensions of the weight w_1 of the first convolution, denoting the result w_1′; the reconstructed single convolutional layer F′_cascade then has weight w′ = Conv2d(w_2, w_1′) and bias b′ = b_2 + (b_1 × w_2), where Conv2d denotes the two-dimensional convolution operation, b_1 is the bias of the first convolution, and w_2 and b_2 are the weight and bias of the second convolution.
Step 45) is specifically:
First convert the transform structure into a single convolutional layer F′_trans following step 44); then convert the cascade of F′_trans and the learning structure into a single convolutional layer F′_rep following step 43). F′_rep is the single-convolutional-layer structure of the reparameterized DR-Block.
Compared with the prior art, the present invention has the following advantages:
1. Addressing the insufficient real-time performance of existing deep-learning-based helmet detection algorithms, the invention uses the YOLOv3-tiny network for training and inference, achieving high real-time performance.
2. Because the YOLOv3-tiny network trades depth for real-time speed, its accuracy is limited. The invention uses network reparameterization to decouple the training stage from the inference stage: during training, a complex structure is used for model learning to raise accuracy; during inference, the complex structure is equivalently converted back into the simple structure of the original YOLOv3-tiny, restoring the network's real-time performance.
3. With limited training data, model generalization is insufficient. By adding a deep cascaded structure to the reparameterization module, the invention over-parameterizes the network and increases model capacity, producing an implicit regularization effect that improves the generalization ability of the original model.
4. Deep networks with many layers suffer from vanishing and exploding gradients during backpropagation, making training slow and convergence difficult. Existing methods neither substantially change the network depth nor apply extra processing to the backpropagated gradients. Through multiple cascaded batch-normalization layers, the invention applies multi-level distribution adjustment to the backpropagated gradients, better preserving the effectiveness of gradient flow and, to a certain extent, avoiding gradient vanishing and gradient explosion.
5. Deep networks carry many parameters and substantial redundancy, which weakens their learning ability. Existing methods mainly strengthen the expressiveness of a single convolutional layer by designing multi-branch structures of different scales and complexities, but this introduces heavy feature redundancy and limits the attainable gain. Through dense connections, the present method integrates features from different levels into a model ensemble, greatly reducing feature redundancy, strengthening the network's learning ability, and further improving the accuracy of the original model.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 shows the DR-Block structure and its reparameterization transformation.
Detailed Description of the Embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment.
Embodiment
As shown in Fig. 1, the present invention provides a safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps.
First, the dense reparameterization module is constructed as follows.
S1. Define the basic unit. Its parameters are: input x, number of input channels C_basic_in, number of output channels C_basic_out, convolution kernel size K_basic, and stride S_basic. The basic unit comprises one convolutional layer cascaded with one batch-normalization layer, where the convolutional layer has C_basic_in input channels, C_basic_out output channels, and kernel size K_basic, and the batch-normalization layer acts on C_basic_out channels. The basic unit is denoted F_basic(x; C_basic_in, C_basic_out, K_basic, S_basic).
S2. On the basis of S1, construct the transform structure. Its parameters are: input x, number of input channels C_trans_in, and number of output channels 2×C_trans_in. First, four basic units are cascaded to form a deep cascaded network structure, which over-parameterizes the network, introduces an implicit regularization effect, and increases model capacity; the multiple batch-normalization layers introduced in this way allow the reparameterization module to effectively adjust the backpropagated gradients, avoiding gradient vanishing and gradient explosion. Then, skip connections are added between every pair of basic units to integrate features from different levels into a model ensemble and reduce feature redundancy. Finally, the outputs of all the basic units are spliced together. The transform structure is denoted F_trans(x; C_trans_in).
S3. On the basis of S1, define the learning structure, which extracts the important features and enlarges the receptive field. Its parameters are: input x, number of input channels C_learn_in, number of output channels C_learn_out, kernel size K_learn, and stride S_learn. The structure comprises one basic unit with parameters F_learn(x; C_learn_in, C_learn_out, K_learn, S_learn).
S4. On the basis of S2 and S3, construct the densely reparameterized block, denoted DR-Block. Its parameters are: input x, number of input channels C_rep_in, number of output channels C_rep_out, kernel size K_rep, and stride S_rep. The DR-Block comprises one transform structure cascaded with one learning structure, where the transform structure is F_trans(x; C_rep_in) and the learning structure is F_learn(x_learn; 2×C_rep_in, C_rep_out, K_rep, S_rep) with x_learn = F_trans(x; C_rep_in). The DR-Block is denoted F_rep(x; C_rep_in, C_rep_out, K_rep, S_rep).
Second, construct a YOLOv3-tiny model and training data set suited to helmet detection.
S5. Collect and annotate a helmet detection data set from construction sites and web images, and apply standard data preprocessing.
S6. Build a standard YOLOv3-tiny model and set the number of detection classes to 2 (two classes: person not wearing a helmet and person wearing a helmet).
Third, restructure and train the original YOLOv3-tiny model as follows.
S7. Given the network model built in S6, replace every non-1×1 convolutional layer in the original model, together with its cascaded batch-normalization layer, with a DR-Block; the restructured model is denoted DR-Net. A convolution whose original parameters are C_in input channels, C_out output channels, kernel size K (K≠1), and stride S is replaced by F_rep(x; C_in, C_out, K, S).
S8. Using the data set prepared in S5 and the standard YOLOv3-tiny training parameters and training strategy, train the model restructured in S7.
Finally, equivalently convert the trained restructured model into an inference model and perform detection as follows.
S9. Convert each basic unit into a single convolutional layer. Let w_conv denote the weight of the convolution in the basic unit, and let μ, σ, γ, and β denote the mean, standard deviation, scaling coefficient, and shift coefficient of the batch-normalization layer; the reconstructed single convolutional layer F′_basic then has weight w′ = (γ/σ)·w_conv and bias b′ = β − γμ/σ.
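S9 is the standard conv-BN folding; a minimal sketch in PyTorch terms, using BatchNorm2d's running statistics with σ = sqrt(running_var + ε):

```python
import torch

def fuse_conv_bn(w_conv, bn):
    """Fold a BatchNorm2d into the preceding convolution and return
    the weight and bias of the equivalent single conv (step S9)."""
    sigma = torch.sqrt(bn.running_var + bn.eps)        # per-channel std
    gamma, beta, mu = bn.weight, bn.bias, bn.running_mean
    w = w_conv * (gamma / sigma).reshape(-1, 1, 1, 1)  # w' = (gamma/sigma)*w
    b = beta - gamma * mu / sigma                      # b' = beta - gamma*mu/sigma
    return w, b
```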
S10. Based on S9, convert each convolution that uses a skip connection into a single convolutional layer. The input of this structure can be expressed as the original input x passed through an equivalent convolution F_prev, whose weight and bias are denoted w_prev and b_prev; the weight and bias of the convolution within the structure are denoted w and b. The converted single convolutional layer F′_skip_connect then has weight w′ = concat([w_prev, w]) and bias b′ = concat([b_prev, b]), where concat denotes the concatenation operation.
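Because the skip connection here is a channel-wise concatenation of two convolutions applied to the same input, the fusion is exact weight stacking. A sketch, assuming both branches share input channels and kernel size:

```python
import torch

def fuse_skip_concat(w_prev, b_prev, w, b):
    """Step S10: collapse concat(conv_prev(x), conv(x)) into one conv
    by stacking kernels and biases along the output-channel axis."""
    return torch.cat([w_prev, w], dim=0), torch.cat([b_prev, b], dim=0)
```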
S11. Based on S10, convert two cascaded convolutions into a single convolutional layer. Denote the weight and bias of the first convolution w_1 and b_1, and those of the second convolution w_2 and b_2. First transpose the first and second dimensions of w_1, denoting the result w_1′; the reconstructed single convolutional layer F′_cascade then has weight w′ = Conv2d(w_2, w_1′) and bias b′ = b_2 + (b_1 × w_2), where Conv2d denotes the two-dimensional convolution operation.
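S11 is the known trick of merging a 1×1 convolution followed by a K×K convolution: treating w_2 as an "input batch" and convolving it with the transposed w_1 yields the merged kernel. A sketch, assuming the first convolution is 1×1 and ignoring the border effects of padding, which full reparameterization derivations handle by padding with b_1:

```python
import torch
import torch.nn.functional as F

def fuse_cascade(w1, b1, w2, b2):
    """Step S11: merge conv1 (1x1) followed by conv2 into one conv."""
    w = F.conv2d(w2, w1.permute(1, 0, 2, 3))  # w' = Conv2d(w2, w1')
    b = b2 + w2.sum(dim=(2, 3)) @ b1          # b' = b2 + (b1 pushed through w2)
    return w, b
```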
S12. Based on S11, convert the transform structure into a single 1×1 convolutional layer. First, following S9, convert all convolutions in the transform structure into single convolutional layers; then, following S10, restructure all skip connections in turn into single convolutional layers; the result is a single convolutional layer F′_trans equivalent to the entire transform structure.
S13. Based on S12, convert the DR-Block into a single convolutional layer. First convert the transform structure into a single convolutional layer F′_trans following S12; then convert the cascade of F′_trans and the learning structure into a single convolutional layer F′_rep following S11. F′_rep is the single-convolutional-layer structure of the reparameterized DR-Block.
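Chaining the three rules gives the whole S13 conversion; in this sketch, fuse_transform is an assumed helper that applies S9 and S10 across the four units of the transform structure to produce the single 1×1 kernel of S12:

```python
def fuse_dr_block(block):
    """Collapse one DR-Block into the weight/bias of a single conv."""
    w_t, b_t = fuse_transform(block.trans)         # S12 (assumed helper)
    w_l, b_l = fuse_conv_bn(block.learn.conv.weight,
                            block.learn.bn)        # S9 on the learning unit
    return fuse_cascade(w_t, b_t, w_l, b_l)        # S11: 1x1 -> KxK merge
```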
S14. Finally, convert every DR-Block in DR-Net into a single convolutional layer following S13, thereby obtaining the inference-stage structure required for deployment.
S15. Extract on-site image frames, feed them into the model obtained in S14 for helmet detection, and output the detection results.
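A hedged usage sketch for S14 and S15, where dr_net is the trained model from S8; convert_to_deploy, preprocess, and decode_detections are assumed helpers standing in for the fusion, letterboxing, and YOLO-decoding steps, not an API defined by the patent:

```python
import cv2
import torch

deploy_model = convert_to_deploy(dr_net).eval()      # S14: all DR-Blocks fused
cap = cv2.VideoCapture("rtsp://site-camera/stream")  # hypothetical source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    with torch.no_grad():
        pred = deploy_model(preprocess(frame))       # S15: detect on each frame
    for box, cls in decode_detections(pred):         # cls 0: no helmet, 1: helmet
        print(box, cls)                              # or draw boxes / raise alert
```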
Table 1 shows the experimental results of the present invention on the helmet detection data set. The results show that, in a plug-and-play manner, the invention improves the accuracy of helmet detection on edge devices without changing the network structure of the original method and without adding any extra inference overhead.
Table 1: Improvement in helmet-detection accuracy achieved by the present invention
Because the DR-Block is plug-and-play, the present invention can be applied to a wide range of tasks, promoting efficient deployment and real-time computation of deep networks on edge devices.
In summary, to improve the performance of helmet detection algorithms on edge devices, the present invention designs a dense linear composite structure that replaces the convolutions of the helmet detection algorithm during the training stage and is restructured into the simple convolutions of the original model during the inference stage, achieving a performance gain while keeping the inference speed unchanged.
The present invention first constructs the dense reparameterization structure (DR-Block). Through its deep cascaded structure, the DR-Block over-parameterizes the network and increases model capacity, producing an implicit regularization effect that improves the generalization of the original model; through its cascaded batch-normalization layers, it applies multi-level distribution adjustment to the backpropagated gradients, better preserving the effectiveness of gradient flow and, to a certain extent, avoiding gradient vanishing and gradient explosion; and through dense connections, it achieves model integration of features from different levels, greatly reducing feature redundancy and improving the learning ability of the original model. The invention then builds the highly real-time YOLOv3-tiny network and adapts it to the helmet detection task. Next, the training and inference structures of the YOLOv3-tiny network are decoupled: during training, DR-Blocks replace the original convolutions and the network is trained, improving its performance; finally, the DR-Blocks are equivalently converted into the simple convolutions of the original YOLOv3-tiny network, achieving a performance gain without extra inference overhead, and the model is deployed for the helmet detection task.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210843903.0A | 2022-07-18 | 2022-07-18 | A helmet detection method based on structural reparameterization for edge devices |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210843903.0A | 2022-07-18 | 2022-07-18 | A helmet detection method based on structural reparameterization for edge devices |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115410111A (en) | 2022-11-29 |
Family
ID=84158328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210843903.0A | A helmet detection method based on structural reparameterization for edge devices | 2022-07-18 | 2022-07-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115410111A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116206188A (en) * | 2023-05-04 | 2023-06-02 | 浪潮电子信息产业股份有限公司 | Image recognition method, system, equipment and storage medium |
CN117789153A (en) * | 2024-02-26 | 2024-03-29 | 浙江驿公里智能科技有限公司 | Automobile oil tank outer cover positioning system and method based on computer vision |
CN117789153B (en) * | 2024-02-26 | 2024-05-03 | 浙江驿公里智能科技有限公司 | Automobile oil tank outer cover positioning system and method based on computer vision |
Similar Documents
Publication | Title
---|---
CN115410111A | A helmet detection method based on structural reparameterization for edge devices
CN105678231A | Pedestrian image detection method based on sparse coding and neural network
CN112200090A | Hyperspectral image classification method based on cross-grouping space-spectral feature enhancement network
CN102915453B | Real-time feedback and update vehicle detection method
CN105023014A | Method for extracting tower target in unmanned aerial vehicle routing inspection power transmission line image
CN110210620A | A kind of channel pruning method for deep neural network
CN116563645A | A Model Compression Method for Object Detection Based on Joint Iterative Pruning and Knowledge Distillation
CN118090211A | A fault diagnosis method for elevator traction machine bearing based on time-frequency feature fusion
CN111597929A | Group Behavior Recognition Method Based on Channel Information Fusion and Group Relationship Spatial Structured Modeling
CN104680192B | A kind of electric power image classification method based on deep learning
CN114998145A | Low-illumination image enhancement method based on multi-scale and context learning network
CN117994655A | Bridge disease detection system and method based on improved Yolov s model
CN113436198A | Remote sensing image semantic segmentation method for collaborative image super-resolution reconstruction
CN105224943A | Based on the image swift nature method for expressing of multi thread normalization non-negative sparse coding device
CN110728352A | Large-scale image classification method based on deep convolutional neural network
CN105718858B | A kind of pedestrian recognition method based on positive and negative broad sense maximum pond
CN105447468A | Color image over-complete block feature extraction method
CN111144456B | A Deep Model Compression Method Based on Intrinsic Feature Migration
Yian et al. | Improved deeplabv3+ network segmentation method for urban road scenes
CN115115990A | Behavior recognition method based on infrared surveillance video under complex lighting conditions
CN116342553A | ConvNext-yolov7-based building site environment detection method
CN115690833A | A Pedestrian Re-Identification Method Based on Deep Active Learning and Model Compression
CN111008986B | A remote sensing image segmentation method based on multi-task semi-convolution
CN114387539A | Behavior identification method based on SimAM attention mechanism
Kang et al. | Mushroom Image Classification Based on Improved MobileNetV2
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination