CN115410111A - A helmet detection method based on structural reparameterization for edge devices - Google Patents
A helmet detection method based on structural reparameterization for edge devices
- Publication number
- CN115410111A (application CN202210843903.0A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- basic
- rep
- safety helmet
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention relates to a safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps: 1) constructing a dense reparameterization module; 2) constructing a standard YOLOv3-tiny model and a training data set suited to helmet detection; 3) restructuring and training the standard YOLOv3-tiny model; 4) equivalently converting the trained restructured model into an inference model and performing helmet detection. Compared with the prior art, the invention offers high real-time performance, high accuracy, and strong generalization ability; it avoids gradient vanishing and gradient explosion, reduces feature redundancy, and improves the network's learning ability.
Description
Technical Field
The invention relates to the technical fields of deep-neural-network structural reparameterization and object detection, and in particular to a safety-helmet detection method for edge devices based on structural reparameterization.
Background Art
Engineering, construction, and similar industries are typically labor-intensive, with complex working environments and frequent safety accidents. Fatal head trauma caused by objects falling from height is a common accident type in the construction industry. As an effective piece of protective equipment, the safety helmet can absorb the impact energy of falling objects and reduce head injuries, and it is therefore widely used on construction sites. In site safety management, real-time and accurate supervision of helmet wearing is an extremely important task. Because site safety management requires detection equipment that is small and easily moved, embedded edge-computing devices are commonly used, which has left helmet detection algorithms lacking in real-time performance and accuracy.
Much work at home and abroad has studied automatic helmet recognition. Dalal et al. first proposed automatic helmet detection by extracting gradient-histogram features; Feng Jie et al. used an AdaBoost classifier to detect the helmet position and judged whether a helmet was worn from the positional relationship between the person and the helmet; Hu Tian et al., building on an analysis of wavelet transforms and deep learning for helmet recognition, proposed a neural-network helmet recognition model; Liu Xiaohui et al. located faces using skin-color detection and then used a support vector machine (SVM) to recognize helmets; Liu Yunbo et al. judged whether a helmet was worn from the distribution of pixel chroma values in the upper third of a moving target. Although these methods achieve fairly accurate helmet recognition in specific scenarios, they still suffer from demanding environmental requirements, poor real-time performance, weak generalization ability, and complicated operation.
In recent years, with deepening research into and widespread application of deep learning, deep-network object detection methods represented by the YOLOv3 (You Only Look Once v3) model offer both good real-time performance and high accuracy. However, their complex and varied convolutional structures fragment the network, increase its complexity, introduce high feature redundancy, lower memory-access efficiency, and reduce flexibility, which severely hinders the deployment of YOLOv3-based helmet detection on edge devices with weak compute and little memory.
Summary of the Invention
The purpose of the present invention is to overcome the above defects of the prior art by providing a safety-helmet detection method for edge devices based on structural reparameterization.
The purpose of the present invention can be achieved through the following technical solutions:
A safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps:
1) Construct a dense reparameterization module.
2) Construct a standard YOLOv3-tiny model and a training data set suited to helmet detection.
3) Restructure and train the standard YOLOv3-tiny model.
4) Equivalently convert the trained restructured model into an inference model and perform helmet detection.
Step 1) specifically comprises the following steps:
11) Define the basic unit.
12) Construct the transform structure.
13) Construct the learning structure.
14) Construct the dense reparameterization module DR-Block: the DR-Block consists of one transform structure cascaded with one learning structure. The transform structure is written F_trans(x; C_rep_in) and the learning structure F_learn(x_learn; 2×C_rep_in, C_rep_out, K_rep, S_rep), where x_learn = F_trans(x; C_rep_in); the dense reparameterization module DR-Block is then denoted F_rep(x; C_rep_in, C_rep_out, K_rep, S_rep).
In step 11), the basic unit consists of one convolutional layer cascaded with one batch-normalization layer and is denoted F_basic(x; C_basic_in, C_basic_out, K_basic, S_basic), where x is the input, C_basic_in the number of input channels, C_basic_out the number of output channels, K_basic the convolution kernel size, and S_basic the stride. Specifically, the convolutional layer has C_basic_in input channels, C_basic_out output channels, and kernel size K_basic, and the batch-normalization layer acts on C_basic_out channels.
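To make the notation concrete, the basic unit maps directly onto two standard PyTorch layers. The following minimal sketch is illustrative rather than the patent's code; the class and argument names are our own, and "same" padding of K_basic // 2 is assumed:

```python
import torch.nn as nn

class BasicUnit(nn.Module):
    """One convolution cascaded with one batch-normalization layer,
    i.e. F_basic(x; c_in, c_out, k, s)."""
    def __init__(self, c_in, c_out, k, s):
        super().__init__()
        # bias omitted: the following BN layer absorbs any constant offset
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=k, stride=s,
                              padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        return self.bn(self.conv(x))
```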
Step 12) is specifically:
First, four basic units are cascaded to form a deep cascaded network structure that over-parameterizes the network; then skip connections are added between every pair of basic units to integrate features from different levels into a model ensemble; finally, the outputs of all the basic units are spliced together. The result is denoted F_trans(x; C_trans_in).
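A hedged sketch of this transform structure, reusing the BasicUnit above. The patent fixes the output width at 2×C_trans_in but leaves the per-unit channel widths to Fig. 2, so giving each of the four units C_trans_in/2 output channels (assuming C_trans_in is even) and feeding every unit the concatenation of the input and all earlier unit outputs is one plausible reading, not a verbatim specification:

```python
import torch
import torch.nn as nn

class DenseTransform(nn.Module):
    """Four cascaded 1x1 basic units with dense skip connections;
    the four unit outputs are spliced into a 2*c_in-channel tensor."""
    def __init__(self, c_in, n_units=4):
        super().__init__()
        c_unit = c_in // 2          # assumed split: 4 units * c_in/2 = 2*c_in
        self.units = nn.ModuleList()
        c_cur = c_in
        for _ in range(n_units):
            self.units.append(BasicUnit(c_cur, c_unit, k=1, s=1))
            c_cur += c_unit         # later units also see this unit's output

    def forward(self, x):
        feats, outs = [x], []
        for unit in self.units:
            y = unit(torch.cat(feats, dim=1))  # skip connections via concat
            feats.append(y)
            outs.append(y)
        return torch.cat(outs, dim=1)          # splice all unit outputs
```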
In step 13), the learning structure consists of one basic unit and is denoted F_learn(x; C_learn_in, C_learn_out, K_learn, S_learn).
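With the basic unit, transform structure, and learning structure sketched above, the DR-Block of step 14) is simply their cascade; again an illustrative sketch rather than the patent's code:

```python
import torch.nn as nn

class DRBlock(nn.Module):
    """F_rep(x; c_in, c_out, k, s): a transform structure cascaded with
    a learning structure that sees 2*c_in input channels."""
    def __init__(self, c_in, c_out, k, s):
        super().__init__()
        self.trans = DenseTransform(c_in)              # F_trans(x; c_in)
        self.learn = BasicUnit(2 * c_in, c_out, k, s)  # F_learn(x_learn; ...)

    def forward(self, x):
        return self.learn(self.trans(x))
```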
Step 2) specifically comprises the following steps:
21) Collect and annotate a helmet detection data set from construction sites and web images, and apply standard data preprocessing.
22) Build a standard YOLOv3-tiny model and set the number of detection classes to 2: person not wearing a helmet and person wearing a helmet.
Step 3) specifically comprises the following steps:
31) Replace every non-1×1 convolutional layer in the standard YOLOv3-tiny model, together with its cascaded batch-normalization layer, with a dense reparameterization module DR-Block; the restructured model is denoted DR-Net. A convolution of the standard YOLOv3-tiny model with C_in input channels, C_out output channels, kernel size K (K≠1), and stride S is replaced by F_rep(x; C_in, C_out, K, S).
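As an illustration of step 31), the swap can be done by walking the module tree. This sketch assumes the DRBlock class above and glosses over removing the batch-normalization layer that follows each replaced convolution inside YOLOv3-tiny's sequential blocks:

```python
import torch.nn as nn

def rebuild_as_dr_net(module):
    """Recursively replace every non-1x1 conv with a DR-Block of the
    same signature (removal of the trailing BN layer is elided)."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d) and child.kernel_size != (1, 1):
            setattr(module, name,
                    DRBlock(child.in_channels, child.out_channels,
                            child.kernel_size[0], child.stride[0]))
        else:
            rebuild_as_dr_net(child)   # recurse into submodules
    return module
```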
32) Train the restructured model DR-Net on the helmet detection data set using the standard YOLOv3-tiny training parameters and training strategy.
Step 4) specifically comprises the following steps:
41) Convert each basic unit into a single convolutional layer. Let w_conv denote the weight of the convolution in the basic unit, and let μ, σ, γ, and β denote the mean, standard deviation, scaling coefficient, and shift coefficient of the batch-normalization layer; the reconstructed single convolutional layer F′_basic then has weight w′ = (γ/σ)·w_conv and bias b′ = β − γμ/σ.
42) Convert each convolution that uses a skip connection into a single convolutional layer. The converted single convolutional layer F′_skip_connect has weight w′ = concat([w_prev, w]) and bias b′ = concat([b_prev, b]), where concat denotes the concatenation operation, w_prev and b_prev are the weight and bias of the equivalent convolution producing the skip input, and w and b are the weight and bias of the convolution being converted.
43) Convert two cascaded convolutions into a single convolutional layer.
44) Convert the transform structure into a single 1×1 convolutional layer: following step 41), convert all convolutions in the transform structure into single convolutional layers; then, following step 42), restructure all skip connections in turn into single convolutional layers, finally obtaining a single convolutional layer F′_trans equivalent to the entire transform structure.
45) Convert the DR-Block into a single convolutional layer.
46) Convert every DR-Block in DR-Net into a single convolutional layer following step 45), thereby obtaining the inference-stage structure required for deployment.
47) Extract on-site image frames, feed them into the model obtained in step 46) for helmet detection, and output the detection results.
Step 43) is specifically:
First transpose the first and second dimensions of the weight w_1 of the first convolution, denoting the result w_1′; the reconstructed single convolutional layer F′_cascade then has weight w′ = Conv2d(w_2, w_1′) and bias b′ = b_2 + (b_1 × w_2), where Conv2d denotes the two-dimensional convolution operation, b_1 is the bias of the first convolution, and w_2 and b_2 are the weight and bias of the second convolution.
Step 45) is specifically:
First convert the transform structure into a single convolutional layer F′_trans following step 44); then convert the cascade of F′_trans and the learning structure into a single convolutional layer F′_rep following step 43). F′_rep is the single-convolutional-layer structure of the reparameterized DR-Block.
Compared with the prior art, the present invention has the following advantages:
1. Addressing the insufficient real-time performance of existing deep-learning-based helmet detection algorithms, the invention uses the YOLOv3-tiny network for training and inference, achieving high real-time performance.
2. Because the YOLOv3-tiny network trades depth for real-time speed, its accuracy is limited. The invention uses network reparameterization to decouple the training stage from the inference stage: during training, a complex structure is used for model learning to raise accuracy; during inference, the complex structure is equivalently converted back into the simple structure of the original YOLOv3-tiny, restoring the network's real-time performance.
3. With limited training data, model generalization is insufficient. By adding a deep cascaded structure to the reparameterization module, the invention over-parameterizes the network and increases model capacity, producing an implicit regularization effect that improves the generalization ability of the original model.
4. Deep networks with many layers suffer from vanishing and exploding gradients during backpropagation, making training slow and convergence difficult. Existing methods neither substantially change the network depth nor apply extra processing to the backpropagated gradients. Through multiple cascaded batch-normalization layers, the invention applies multi-level distribution adjustment to the backpropagated gradients, better preserving the effectiveness of gradient flow and, to a certain extent, avoiding gradient vanishing and gradient explosion.
5. Deep networks carry many parameters and substantial redundancy, which weakens their learning ability. Existing methods mainly strengthen the expressiveness of a single convolutional layer by designing multi-branch structures of different scales and complexities, but this introduces heavy feature redundancy and limits the attainable gain. Through dense connections, the present method integrates features from different levels into a model ensemble, greatly reducing feature redundancy, strengthening the network's learning ability, and further improving the accuracy of the original model.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 shows the DR-Block structure and its reparameterization transformation.
Detailed Description of the Embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment.
Embodiment
As shown in Fig. 1, the present invention provides a safety-helmet detection method for edge devices based on structural reparameterization, comprising the following steps.
First, the dense reparameterization module is constructed as follows.
S1. Define the basic unit. Its parameters are: input x, number of input channels C_basic_in, number of output channels C_basic_out, convolution kernel size K_basic, and stride S_basic. The basic unit comprises one convolutional layer cascaded with one batch-normalization layer, where the convolutional layer has C_basic_in input channels, C_basic_out output channels, and kernel size K_basic, and the batch-normalization layer acts on C_basic_out channels. The basic unit is denoted F_basic(x; C_basic_in, C_basic_out, K_basic, S_basic).
S2. On the basis of S1, construct the transform structure. Its parameters are: input x, number of input channels C_trans_in, and number of output channels 2×C_trans_in. First, four basic units are cascaded to form a deep cascaded network structure, which over-parameterizes the network, introduces an implicit regularization effect, and increases model capacity; the multiple batch-normalization layers introduced in this way allow the reparameterization module to effectively adjust the backpropagated gradients, avoiding gradient vanishing and gradient explosion. Then, skip connections are added between every pair of basic units to integrate features from different levels into a model ensemble and reduce feature redundancy. Finally, the outputs of all the basic units are spliced together. The transform structure is denoted F_trans(x; C_trans_in).
S3. On the basis of S1, define the learning structure, which extracts the important features and enlarges the receptive field. Its parameters are: input x, number of input channels C_learn_in, number of output channels C_learn_out, kernel size K_learn, and stride S_learn. The structure comprises one basic unit with parameters F_learn(x; C_learn_in, C_learn_out, K_learn, S_learn).
S4. On the basis of S2 and S3, construct the densely reparameterized block, denoted DR-Block. Its parameters are: input x, number of input channels C_rep_in, number of output channels C_rep_out, kernel size K_rep, and stride S_rep. The DR-Block comprises one transform structure cascaded with one learning structure, where the transform structure is F_trans(x; C_rep_in) and the learning structure is F_learn(x_learn; 2×C_rep_in, C_rep_out, K_rep, S_rep) with x_learn = F_trans(x; C_rep_in). The DR-Block is denoted F_rep(x; C_rep_in, C_rep_out, K_rep, S_rep).
Second, construct a YOLOv3-tiny model and training data set suited to helmet detection.
S5. Collect and annotate a helmet detection data set from construction sites and web images, and apply standard data preprocessing.
S6. Build a standard YOLOv3-tiny model and set the number of detection classes to 2 (two classes: person not wearing a helmet and person wearing a helmet).
Third, restructure and train the original YOLOv3-tiny model as follows.
S7. Given the network model built in S6, replace every non-1×1 convolutional layer in the original model, together with its cascaded batch-normalization layer, with a DR-Block; the restructured model is denoted DR-Net. A convolution whose original parameters are C_in input channels, C_out output channels, kernel size K (K≠1), and stride S is replaced by F_rep(x; C_in, C_out, K, S).
S8. Using the data set prepared in S5 and the standard YOLOv3-tiny training parameters and training strategy, train the model restructured in S7.
Finally, equivalently convert the trained restructured model into an inference model and perform detection as follows.
S9. Convert each basic unit into a single convolutional layer. Let w_conv denote the weight of the convolution in the basic unit, and let μ, σ, γ, and β denote the mean, standard deviation, scaling coefficient, and shift coefficient of the batch-normalization layer; the reconstructed single convolutional layer F′_basic then has weight w′ = (γ/σ)·w_conv and bias b′ = β − γμ/σ.
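S9 is the standard conv-BN folding; a minimal sketch in PyTorch terms, using BatchNorm2d's running statistics with σ = sqrt(running_var + ε):

```python
import torch

def fuse_conv_bn(w_conv, bn):
    """Fold a BatchNorm2d into the preceding convolution and return
    the weight and bias of the equivalent single conv (step S9)."""
    sigma = torch.sqrt(bn.running_var + bn.eps)        # per-channel std
    gamma, beta, mu = bn.weight, bn.bias, bn.running_mean
    w = w_conv * (gamma / sigma).reshape(-1, 1, 1, 1)  # w' = (gamma/sigma)*w
    b = beta - gamma * mu / sigma                      # b' = beta - gamma*mu/sigma
    return w, b
```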
S10. Based on S9, convert each convolution that uses a skip connection into a single convolutional layer. The input of this structure can be expressed as the original input x passed through an equivalent convolution F_prev, whose weight and bias are denoted w_prev and b_prev; the weight and bias of the convolution within the structure are denoted w and b. The converted single convolutional layer F′_skip_connect then has weight w′ = concat([w_prev, w]) and bias b′ = concat([b_prev, b]), where concat denotes the concatenation operation.
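Because the skip connection here is a channel-wise concatenation of two convolutions applied to the same input, the fusion is exact weight stacking. A sketch, assuming both branches share input channels and kernel size:

```python
import torch

def fuse_skip_concat(w_prev, b_prev, w, b):
    """Step S10: collapse concat(conv_prev(x), conv(x)) into one conv
    by stacking kernels and biases along the output-channel axis."""
    return torch.cat([w_prev, w], dim=0), torch.cat([b_prev, b], dim=0)
```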
S11. Based on S10, convert two cascaded convolutions into a single convolutional layer. Denote the weight and bias of the first convolution w_1 and b_1, and those of the second convolution w_2 and b_2. First transpose the first and second dimensions of w_1, denoting the result w_1′; the reconstructed single convolutional layer F′_cascade then has weight w′ = Conv2d(w_2, w_1′) and bias b′ = b_2 + (b_1 × w_2), where Conv2d denotes the two-dimensional convolution operation.
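S11 is the known trick of merging a 1×1 convolution followed by a K×K convolution: treating w_2 as an "input batch" and convolving it with the transposed w_1 yields the merged kernel. A sketch, assuming the first convolution is 1×1 and ignoring the border effects of padding, which full reparameterization derivations handle by padding with b_1:

```python
import torch
import torch.nn.functional as F

def fuse_cascade(w1, b1, w2, b2):
    """Step S11: merge conv1 (1x1) followed by conv2 into one conv."""
    w = F.conv2d(w2, w1.permute(1, 0, 2, 3))  # w' = Conv2d(w2, w1')
    b = b2 + w2.sum(dim=(2, 3)) @ b1          # b' = b2 + (b1 pushed through w2)
    return w, b
```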
S12. Based on S11, convert the transform structure into a single 1×1 convolutional layer. First, following S9, convert all convolutions in the transform structure into single convolutional layers; then, following S10, restructure all skip connections in turn into single convolutional layers; the result is a single convolutional layer F′_trans equivalent to the entire transform structure.
S13. Based on S12, convert the DR-Block into a single convolutional layer. First convert the transform structure into a single convolutional layer F′_trans following S12; then convert the cascade of F′_trans and the learning structure into a single convolutional layer F′_rep following S11. F′_rep is the single-convolutional-layer structure of the reparameterized DR-Block.
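Chaining the three rules gives the whole S13 conversion; in this sketch, fuse_transform is an assumed helper that applies S9 and S10 across the four units of the transform structure to produce the single 1×1 kernel of S12:

```python
def fuse_dr_block(block):
    """Collapse one DR-Block into the weight/bias of a single conv."""
    w_t, b_t = fuse_transform(block.trans)         # S12 (assumed helper)
    w_l, b_l = fuse_conv_bn(block.learn.conv.weight,
                            block.learn.bn)        # S9 on the learning unit
    return fuse_cascade(w_t, b_t, w_l, b_l)        # S11: 1x1 -> KxK merge
```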
S14. Finally, convert every DR-Block in DR-Net into a single convolutional layer following S13, thereby obtaining the inference-stage structure required for deployment.
S15. Extract on-site image frames, feed them into the model obtained in S14 for helmet detection, and output the detection results.
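A hedged usage sketch for S14 and S15, where dr_net is the trained model from S8; convert_to_deploy, preprocess, and decode_detections are assumed helpers standing in for the fusion, letterboxing, and YOLO-decoding steps, not an API defined by the patent:

```python
import cv2
import torch

deploy_model = convert_to_deploy(dr_net).eval()      # S14: all DR-Blocks fused
cap = cv2.VideoCapture("rtsp://site-camera/stream")  # hypothetical source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    with torch.no_grad():
        pred = deploy_model(preprocess(frame))       # S15: detect on each frame
    for box, cls in decode_detections(pred):         # cls 0: no helmet, 1: helmet
        print(box, cls)                              # or draw boxes / raise alert
```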
Table 1 shows the experimental results of the present invention on the helmet detection data set. The results show that, in a plug-and-play manner, the invention improves the accuracy of helmet detection on edge devices without changing the network structure of the original method and without adding any extra inference overhead.
Table 1: Improvement in helmet-detection accuracy achieved by the present invention
Because the DR-Block is plug-and-play, the present invention can be applied to a wide range of tasks, promoting efficient deployment and real-time computation of deep networks on edge devices.
In summary, to improve the performance of helmet detection algorithms on edge devices, the present invention designs a dense linear composite structure that replaces the convolutions of the helmet detection algorithm during the training stage and is restructured into the simple convolutions of the original model during the inference stage, achieving a performance gain while keeping the inference speed unchanged.
The present invention first constructs the dense reparameterization structure (DR-Block). Through its deep cascaded structure, the DR-Block over-parameterizes the network and increases model capacity, producing an implicit regularization effect that improves the generalization of the original model; through its cascaded batch-normalization layers, it applies multi-level distribution adjustment to the backpropagated gradients, better preserving the effectiveness of gradient flow and, to a certain extent, avoiding gradient vanishing and gradient explosion; and through dense connections, it achieves model integration of features from different levels, greatly reducing feature redundancy and improving the learning ability of the original model. The invention then builds the highly real-time YOLOv3-tiny network and adapts it to the helmet detection task. Next, the training and inference structures of the YOLOv3-tiny network are decoupled: during training, DR-Blocks replace the original convolutions and the network is trained, improving its performance; finally, the DR-Blocks are equivalently converted into the simple convolutions of the original YOLOv3-tiny network, achieving a performance gain without extra inference overhead, and the model is deployed for the helmet detection task.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210843903.0A | 2022-07-18 | 2022-07-18 | A helmet detection method based on structural reparameterization for edge devices |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210843903.0A | 2022-07-18 | 2022-07-18 | A helmet detection method based on structural reparameterization for edge devices |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115410111A (en) | 2022-11-29 |
Family
ID=84158328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210843903.0A | A helmet detection method based on structural reparameterization for edge devices | 2022-07-18 | 2022-07-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115410111A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116206188A (en) * | 2023-05-04 | 2023-06-02 | 浪潮电子信息产业股份有限公司 | Image recognition method, system, equipment and storage medium |
CN117789153A (en) * | 2024-02-26 | 2024-03-29 | 浙江驿公里智能科技有限公司 | Automobile oil tank outer cover positioning system and method based on computer vision |
CN117789153B (en) * | 2024-02-26 | 2024-05-03 | 浙江驿公里智能科技有限公司 | Automobile oil tank outer cover positioning system and method based on computer vision |
Similar Documents
Publication | Title
---|---
CN115410111A | A helmet detection method based on structural reparameterization for edge devices
CN105678231A | Pedestrian image detection method based on sparse coding and neural network
CN112200090A | Hyperspectral image classification method based on cross-grouping space-spectral feature enhancement network
CN102915453B | Real-time feedback and update vehicle detection method
CN105023014A | Method for extracting tower target in unmanned aerial vehicle routing inspection power transmission line image
CN110210620A | A kind of channel pruning method for deep neural network
CN116563645A | A Model Compression Method for Object Detection Based on Joint Iterative Pruning and Knowledge Distillation
CN118090211A | A fault diagnosis method for elevator traction machine bearing based on time-frequency feature fusion
CN111597929A | Group Behavior Recognition Method Based on Channel Information Fusion and Group Relationship Spatial Structured Modeling
CN104680192B | A kind of electric power image classification method based on deep learning
CN114998145A | Low-illumination image enhancement method based on multi-scale and context learning network
CN117994655A | Bridge disease detection system and method based on improved Yolov s model
CN113436198A | Remote sensing image semantic segmentation method for collaborative image super-resolution reconstruction
CN105224943A | Based on the image swift nature method for expressing of multi thread normalization non-negative sparse coding device
CN110728352A | Large-scale image classification method based on deep convolutional neural network
CN105718858B | A kind of pedestrian recognition method based on positive and negative broad sense maximum pond
CN105447468A | Color image over-complete block feature extraction method
CN111144456B | A Deep Model Compression Method Based on Intrinsic Feature Migration
Yian et al. | Improved deeplabv3+ network segmentation method for urban road scenes
CN115115990A | Behavior recognition method based on infrared surveillance video under complex lighting conditions
CN116342553A | ConvNext-yolov7-based building site environment detection method
CN115690833A | A Pedestrian Re-Identification Method Based on Deep Active Learning and Model Compression
CN111008986B | A remote sensing image segmentation method based on multi-task semi-convolution
CN114387539A | Behavior identification method based on SimAM attention mechanism
Kang et al. | Mushroom Image Classification Based on Improved MobileNetV2
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination