CN118036476A - Precast concrete crack detection model, method, system and readable medium - Google Patents

Info

Publication number
CN118036476A
Authority
CN
China
Prior art keywords
unit
module
model
precast concrete
input end
Prior art date
Legal status
Granted
Application number
CN202410432015.9A
Other languages
Chinese (zh)
Other versions
CN118036476B (en)
Inventor
蒋庆
梁雨
周满旭
李赛
叶冠廷
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202410432015.9A
Publication of CN118036476A
Application granted
Publication of CN118036476B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 - Computer-aided design [CAD]
    • G06F 30/20 - Design optimisation, verification or simulation
    • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]

Abstract

The invention relates to the technical fields of precast concrete surface inspection and computer technology, and in particular to a precast concrete crack detection model, method, system and readable medium. In the training method of the precast concrete crack detection model, a YOLOv5 model is first optimized and then trained by machine learning. In the invention, an LW module is provided to splice three paths of data. One path is the data input to the LW module; letting this input take part in the splicing preserves a larger feature map, which benefits the detection of small targets. Another path is obtained by applying average pooling and max pooling to the LW module's input and then superimposing the results along the dimension axis: average pooling helps extract overall features, while max pooling helps extract the most prominent features, and combining the two ensures comprehensive feature extraction from the data, which helps improve detection accuracy.

Description

Precast concrete crack detection model, method, system and readable medium

Technical Field

The invention relates to the fields of precast concrete surface inspection and computer technology, and in particular to a precast concrete crack detection model, method, system and readable medium.

Background Art

Quality control plays an increasingly important role in manufacturing, and surface defects directly affect product quality. A large share of building construction cost goes to rework caused by material defects. Cracks are a very common type of surface defect in prefabricated components, so accurate crack detection is of great importance for improving product quality and reducing construction cost.

At present, the quality of prefabricated components is mainly assessed by manual inspection, by detection methods based on image-processing algorithms, and by deep-learning methods. Traditional manual inspection has a low sampling rate, its accuracy is strongly affected by subjective factors such as the inspector's experience and fatigue, and it is labor-intensive, inefficient and poorly suited to real-time use. Image-processing-algorithm methods mainly detect cracks against a background of uniform material and texture, and at present they cannot detect cracks directly in color images. Existing deep-learning detection methods generally compress the image to be inspected to a smaller size to satisfy the input requirements of the neural network and then feed it into a pre-trained model to obtain a prediction; when such algorithms process high-resolution crack images, excessive compression distorts the image and leads to a high false-detection rate.

Summary of the Invention

To overcome the defects of the prior art described above, in which crack detection for precast concrete components is either inefficient or inaccurate, the invention proposes a training method for a precast concrete crack detection model that can produce an efficient and accurate detection model.

The training method of a precast concrete crack detection model proposed by the invention comprises the following steps:

First, a base model and learning samples are obtained. The learning samples are pairs of associated crack photographs and crack annotation images. The base model is built on the YOLOv5 model by replacing the connected Upsample unit and Concat unit with an LW module. The LW module first applies pooling to the data that feeds the Upsample unit in the YOLOv5 model, and then performs a three-way splice of the pooled data with the two inputs of the combination module formed by the Upsample unit and the Concat unit in the YOLOv5 model before producing its output.

The base model is then made to learn from the learning samples so as to iterate the model parameters until convergence. The converged base model is used as the precast concrete crack detection model; its input is a photographed image and its output is a crack annotation image.

Preferably, the LW module comprises an average pooling layer, a max pooling layer, a dimension superposition unit and a fifth Concat unit. The fifth Concat unit has three input ends and the LW module has two input ends. In the combination module formed by the Upsample unit and the Concat unit in the YOLOv5 model, the input end of the Upsample unit is denoted the first input end of the combination module, and the input end of the Concat unit that connects to the second C3 module is denoted the second input end of the combination module. The first input end of the LW module replaces the first input end of the combination module, and the second input end of the LW module replaces the second input end of the combination module.

The first input end of the LW module is connected to the input end of the average pooling layer, the input end of the max pooling layer and the first input end of the fifth Concat unit; the outputs of the average pooling layer and the max pooling layer are both connected to the input of the dimension superposition unit; the output of the dimension superposition unit is connected to the second input end of the fifth Concat unit; and the third input end of the fifth Concat unit is connected to the second input end of the LW module. The output of the fifth Concat unit serves as the output of the LW module.

Preferably, the modules of identical structure in the YOLOv5 model are numbered sequentially along the direction of data flow. On the basis of the YOLOv5 model, the base model further replaces both the first C3 module and the fourth C3 module along the data-flow direction with MMT modules. An MMT module comprises a Bottleneck unit, a MultiAttenCat unit and a tenth Conv unit connected in sequence; the input end of the MMT module is connected to the input ends of both the Bottleneck unit and the MultiAttenCat unit, and the output end of the tenth Conv unit serves as the output end of the MMT module.

Preferably, on the basis of the YOLOv5 model, the base model further replaces both the fourth Conv unit and the fifth Conv unit with RepVGG units.

Preferably, the iteration of the model parameters comprises the following steps:

St1: divide the learning samples into a training set and a test set;

St2: draw a number of samples from the training set as training samples, and let the base model learn from the training samples so as to iterate the model parameters;

St3: draw a number of samples from the test set as test samples, and let the base model predict on the test samples and output crack annotation images;

St4: compute the loss of the base model on the test samples and judge whether the base model has converged; if not, return to step St2; if so, fix the base model.

Preferably, in St4, the loss of the base model is a cross-entropy loss or a mean-squared-error loss.
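
As an illustration of the St1-St4 loop, the following is a minimal PyTorch-style sketch. The 70/30 split ratio, batch size, optimizer, convergence test on the change of the test loss and the plain cross-entropy loss are assumptions made for the sketch; a real YOLOv5-style detector would use its own composite detection loss, and the patent does not fix these details.

```python
# Minimal sketch of the St1-St4 training loop (illustrative assumptions, see above).
import torch
from torch.utils.data import DataLoader, random_split

def train_until_convergence(base_model, dataset, tol=1e-3, max_rounds=300):
    # St1: split the learning samples into a training set and a test set
    n_train = int(0.7 * len(dataset))
    train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
    train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=16)

    optimizer = torch.optim.SGD(base_model.parameters(), lr=1e-2, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()   # or torch.nn.MSELoss(), as in St4
    prev_loss = float("inf")

    for _ in range(max_rounds):
        # St2: learn from the training samples to iterate the parameters
        base_model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss_fn(base_model(images), labels).backward()
            optimizer.step()

        # St3/St4: predict on the test samples and compute the loss
        base_model.eval()
        with torch.no_grad():
            test_loss = sum(loss_fn(base_model(x), y).item() for x, y in test_loader)

        if abs(prev_loss - test_loss) < tol:   # converged: fix the model
            break
        prev_loss = test_loss                  # otherwise return to St2
    return base_model
```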

The precast concrete crack detection method proposed by the invention first obtains a detection model with the above training method of the precast concrete crack detection model; the target to be inspected is then photographed to obtain a target image; the target image is then fed into the detection model, and the detection model outputs a crack annotation image in which the crack type is annotated.

Preferably, the crack types include transverse cracks, longitudinal cracks and fatigue cracks.
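
A minimal inference sketch of this detection method is given below; the function name, the file paths and the 512x512 resize are illustrative assumptions, and it presumes the trained model was serialized as a whole module.

```python
# Minimal inference sketch: photograph in, prediction out (illustrative only).
import torch
from PIL import Image
from torchvision import transforms

def detect_cracks(weights_path: str, photo_path: str):
    model = torch.load(weights_path, map_location="cpu")   # trained detection model
    model.eval()

    to_tensor = transforms.Compose([
        transforms.Resize((512, 512)),     # dataset images are 512x512 crops
        transforms.ToTensor(),
    ])
    image = to_tensor(Image.open(photo_path).convert("RGB")).unsqueeze(0)

    with torch.no_grad():
        prediction = model(image)          # e.g. boxes/labels for D00, D10, D20
    return prediction
```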

The precast concrete crack detection system proposed by the invention comprises a memory and a processor. A computer program is stored in the memory, the processor is connected to the memory, and the processor executes the computer program to implement the above training method of the precast concrete crack detection model.

The readable medium proposed by the invention stores a computer program which, when executed, implements the above training method of the precast concrete crack detection model. The advantages of the invention are as follows:

(1) In the invention, an LW module is provided to splice three paths of data. One path is the data input to the LW module; letting this input take part in the splicing preserves a larger feature map, which benefits the detection of small targets. Another path is obtained by applying average pooling and max pooling to the LW module's input and then superimposing the results along the dimension axis: average pooling helps extract overall features, max pooling helps extract the most prominent features, and combining the two ensures comprehensive feature extraction, which helps improve detection accuracy.

(2) The MMT module improves data-processing speed by replacing the C3 modules of the original YOLOv5 with a lightweight network. The RepVGG module lightens the network by reusing a simple convolutional structure, equivalently expressing the computation of a convolutional layer as a weighted sum of several small convolutional blocks. In short, the MMT module optimizes the network structure, the RepVGG module lightens the network, and the LW module raises the detection accuracy.

(3) The detection model proposed by the invention can detect different types of cracks on precast concrete components. Using it for surface inspection of precast concrete components greatly improves the detection accuracy for the various crack types and raises detection efficiency.

Brief Description of the Drawings

FIG. 1 is a flow chart of the method for constructing the first detection model;

FIG. 2 shows the structure of the conventional YOLOv5 network;

FIG. 3 shows the network structure of the first detection model;

FIG. 4 shows the network structure of the LW module;

FIG. 5 shows the network structure of the MMT module;

FIG. 6 shows the network structure of the second detection model;

FIG. 7 shows the network structure of the third detection model;

FIG. 8 shows images of the three concrete cracking modes;

FIG. 9 shows the loss curve of the first detection model;

FIG. 10 shows the loss curve of the second detection model;

FIG. 11 shows the loss curve of the third detection model;

FIG. 12 compares the P-R curves of several models;

FIG. 13 shows the accuracy confusion matrix of the first detection model;

FIG. 14 shows the accuracy confusion matrix of the second detection model;

FIG. 15 shows the accuracy confusion matrix of the third detection model.

Detailed Description of the Embodiments

The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. Based on these embodiments, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the invention.

The conventional YOLOv5 model comprises a backbone network (backbone), a neck network (neck) and a head network (head).

Referring to FIG. 2, the backbone network comprises, connected in sequence, a first Conv unit, a second Conv unit, a first C3 module, a third Conv unit, a second C3 module, a fourth Conv unit, a third C3 module, a fifth Conv unit, a fourth C3 module and an SPPF module; the input end of the first Conv unit serves as the input end of the backbone network and also as the input end of the YOLOv5 model.

The neck network comprises, connected in sequence, a sixth Conv unit, a first Upsample unit, a first Concat unit, a fifth C3 module, a seventh Conv unit, a second Upsample unit, a second Concat unit, a sixth C3 module, an eighth Conv unit, a third Concat unit, a seventh C3 module, a ninth Conv unit, a fourth Concat unit and an eighth C3 module.

The output end of the SPPF module is connected to the input end of the sixth Conv unit; the input end of the first Concat unit is also connected to the output end of the third C3 module; the input end of the second Concat unit is also connected to the output end of the second C3 module; the input end of the third Concat unit is also connected to the output end of the seventh Conv unit; and the input end of the fourth Concat unit is also connected to the output end of the sixth Conv unit.

The head network comprises a first Detect unit, a second Detect unit and a third Detect unit. The input end of the first Detect unit is connected to the output end of the sixth C3 module, the input end of the second Detect unit is connected to the output end of the seventh C3 module, and the input end of the third Detect unit is connected to the output end of the eighth C3 module.

The first Detect unit outputs a first detection result, the second Detect unit outputs a second detection result, and the third Detect unit outputs a third detection result.

In practical applications, one or more of the first, second and third detection results can be selected as the output of the YOLOv5 model as needed.

The data format H×P×Q denotes an H-channel representation of a feature map of size P×Q. In this embodiment, H takes the values 2^h, 2^(h+1), 2^(h+2), 2^(h+3), 2^(h+4), n, 2n, i and 4i; P takes the values x, x/2, x/4, x/8, x/16, p, p/2, 2p and j; Q takes the values y, y/2, y/4, y/8, y/16, q, q/2, 2q and k; h, x, y, n, p, q, i, j and k are all preset values.

In the YOLOv5 model, the first Conv unit convolves the input image and outputs data 2^h×x×y. The data 2^h×x×y is converted by the second Conv unit into data 2^(h+1)×(x/2)×(y/2), which is processed in turn by the first C3 module and the third Conv unit; the third Conv unit outputs data 2^(h+2)×(x/4)×(y/4). The data 2^(h+2)×(x/4)×(y/4) is processed in turn by the second C3 module and the fourth Conv unit; the fourth Conv unit outputs data 2^(h+3)×(x/8)×(y/8). The data 2^(h+3)×(x/8)×(y/8) is processed in turn by the third C3 module and the fifth Conv unit; the fifth Conv unit outputs data 2^(h+4)×(x/16)×(y/16). The data 2^(h+4)×(x/16)×(y/16) is processed in turn by the fourth C3 module, the SPPF module and the sixth Conv unit, which outputs data 2^(h+3)×(x/16)×(y/16).
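
As a worked illustration of the shape bookkeeping above, the short sketch below prints the backbone feature-map sizes stage by stage; the concrete values h = 5 and x = y = 512 are illustrative choices, not values taken from the patent.

```python
# Print the backbone feature-map shapes described above for sample values.
h, x, y = 5, 512, 512   # illustrative preset values

stages = ["first Conv", "second Conv", "third Conv", "fourth Conv", "fifth Conv"]
for t, name in enumerate(stages):
    channels = 2 ** (h + t)                 # 2^h, 2^(h+1), ..., 2^(h+4)
    width, height = x // 2 ** t, y // 2 ** t  # x*y, (x/2)*(y/2), ..., (x/16)*(y/16)
    print(f"{name}: {channels} x {width} x {height}")
```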

The data 2^(h+3)×(x/16)×(y/16) output by the sixth Conv unit is converted by the first Upsample unit into data 2^(h+3)×(x/8)×(y/8); the first Concat unit concatenates, along the channel dimension, the data 2^(h+3)×(x/8)×(y/8) output by the third C3 module and the data 2^(h+3)×(x/8)×(y/8) output by the first Upsample unit.

The data 2^(h+4)×(x/8)×(y/8) output by the first Concat unit is converted by the fifth C3 module into data 2^(h+3)×(x/8)×(y/8) and fed into the seventh Conv unit; the seventh Conv unit converts its input into data 2^(h+2)×(x/8)×(y/8) and feeds it into the second Upsample unit.

The second Concat unit concatenates, along the channel dimension, the data 2^(h+2)×(x/4)×(y/4) output by the second C3 module and the data 2^(h+2)×(x/4)×(y/4) output by the second Upsample unit. The data 2^(h+3)×(x/4)×(y/4) output by the second Concat unit is converted by the sixth C3 module into data 2^(h+2)×(x/4)×(y/4) and fed into the eighth Conv unit; the eighth Conv unit converts its input into data 2^(h+2)×(x/8)×(y/8) and feeds it into the third Concat unit.

The third Concat unit concatenates the data 2^(h+2)×(x/8)×(y/8) output by the seventh Conv unit and the data 2^(h+2)×(x/8)×(y/8) output by the eighth Conv unit into data 2^(h+3)×(x/8)×(y/8); the data 2^(h+3)×(x/8)×(y/8) output by the third Concat unit is processed in turn by the seventh C3 module and the ninth Conv unit, and the ninth Conv unit outputs data 2^(h+3)×(x/16)×(y/16).

The fourth Concat unit concatenates the data 2^(h+3)×(x/16)×(y/16) output by the sixth Conv unit and the data 2^(h+3)×(x/16)×(y/16) output by the ninth Conv unit into data 2^(h+4)×(x/16)×(y/16), and the eighth C3 module processes the data 2^(h+4)×(x/16)×(y/16) output by the fourth Concat unit.

The precast concrete crack detection model proposed by the invention, referred to simply as the detection model, is used to detect cracks on the surface of precast concrete. The input of the detection model is a photographed image of precast concrete, and its output is a crack annotation image, i.e. an image in which the cracks are annotated.

First Detection Model

Referring to FIG. 1, FIG. 3 and FIG. 4, the first detection model proposed by the invention is obtained by improving the conventional YOLOv5 model; the improvement comprises the following steps S1-S2.

S1. Construct the LW module. The LW module comprises an average pooling layer (mean-pooling), a max pooling layer (max-pooling), a dimension superposition unit and a fifth Concat unit. The fifth Concat unit has three input ends and the LW module has two input ends. The first input end of the LW module is connected to the input end of the average pooling layer, the input end of the max pooling layer and the first input end of the fifth Concat unit; the outputs of the average pooling layer and the max pooling layer are both connected to the input of the dimension superposition unit; the output of the dimension superposition unit is connected to the second input end of the fifth Concat unit; and the third input end of the fifth Concat unit is connected to the second input end of the LW module. The output of the fifth Concat unit serves as the output of the LW module.

In this way, the data n×p×q entering the first input end of the LW module passes through average pooling and max pooling separately and is then dimension-superimposed; the superimposed data n×(p/2)×(q/2) and the data entering the two input ends of the LW module are spliced along the dimension axis by the fifth Concat unit to form the LW module's output data 2n×2p×2q. Within the LW module, the fifth Concat unit performs a weighted splice of its three input paths, and the parameter weights W1, W2 and W3 on the three channels are all preset values.
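
The following PyTorch sketch shows one possible reading of the LW module. Several details are assumptions rather than statements of the patent: the 2x2 pooling window, element-wise addition as the dimension superposition, bilinear upsampling to bring all paths to the 2p x 2q size of the skip input, and fusing the weighted first-input path with its pooled path before the channel concatenation (which is what makes the output come out at 2n channels). The class name LWModule and the learnable weights standing in for the preset W1, W2, W3 are likewise illustrative.

```python
# Sketch of the LW module (one possible interpretation; see the caveats above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LWModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.avg_pool = nn.AvgPool2d(kernel_size=2)   # n x p x q -> n x p/2 x q/2
        self.max_pool = nn.MaxPool2d(kernel_size=2)
        # W1, W2, W3: weights of the three spliced paths (preset in the text,
        # made learnable here for convenience)
        self.w = nn.Parameter(torch.ones(3))

    def forward(self, x1, x2):
        # x1: first input, n x p x q (the path that fed the former Upsample unit)
        # x2: second input, n x 2p x 2q (skip connection from the backbone C3)
        pooled = self.avg_pool(x1) + self.max_pool(x1)           # dimension superposition
        size = x2.shape[-2:]                                      # target 2p x 2q
        up_x1 = F.interpolate(x1, size=size, mode="bilinear", align_corners=False)
        up_pool = F.interpolate(pooled, size=size, mode="bilinear", align_corners=False)
        fused = self.w[0] * up_x1 + self.w[1] * up_pool           # weighted first path
        return torch.cat([fused, self.w[2] * x2], dim=1)          # 2n x 2p x 2q

# Quick shape check with n=256, p=q=16 (illustrative values):
# x1 = torch.randn(1, 256, 16, 16); x2 = torch.randn(1, 256, 32, 32)
# LWModule()(x1, x2).shape -> torch.Size([1, 512, 32, 32])
```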

S2. Replace the first Upsample unit and the first Concat unit of the YOLOv5 model with an LW module, denoted the first LW module, and replace the second Upsample unit and the second Concat unit with an LW module, denoted the second LW module. The first input end of the first LW module is connected to the output end of the sixth Conv unit, its second input end is connected to the output end of the third C3 module, and its output end is connected to the input end of the fifth C3 module. The first input end of the second LW module is connected to the output end of the seventh Conv unit, its second input end is connected to the output end of the second C3 module, and its output end is connected to the input end of the sixth C3 module.

In this embodiment, the LW module splices three paths of data. One path is the data input to the LW module; letting this input take part in the splicing preserves a larger feature map, which benefits the detection of small targets. Another path is obtained by applying average pooling and max pooling to the LW module's input and then superimposing the results: average pooling helps extract overall features, max pooling helps extract the most prominent features, and combining the two ensures comprehensive feature extraction, which helps improve detection accuracy.

Second Detection Model

Referring to FIG. 5 and FIG. 6, the second detection model proposed by the invention is obtained by further improving the first detection model: the first C3 module and the fourth C3 module are replaced by a first MMT module and a second MMT module, respectively.

The first MMT module and the second MMT module have the same structure and are collectively referred to as the MMT module. It comprises a Bottleneck unit, a MultiAttenCat unit and a tenth Conv unit connected in sequence; the input end of the MMT module is connected to the input ends of both the Bottleneck unit and the MultiAttenCat unit, and the output end of the tenth Conv unit serves as the output end of the MMT module.

The input data i×j×k of the MMT module is converted by the Bottleneck unit into data 4i×j×k; the MultiAttenCat unit samples and splices the input data i×j×k and the data 4i×j×k to generate data i×j×k as its output.
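
A sketch of the MMT module under stated assumptions follows. The internals of the Bottleneck and MultiAttenCat units are not specified in the text, so the sketch stands in a Conv-BN-SiLU expansion for the Bottleneck, reduces the 4i-channel branch back to i channels with a 1x1 convolution, and fuses the two branches with per-branch weights by weighted sum; whether the actual MultiAttenCat uses a weighted sum or a weighted concatenation followed by a reduction is left open by the patent, as are the class names used here.

```python
# Sketch of the MMT module under the assumptions stated above.
import torch
import torch.nn as nn

class MultiAttenCat(nn.Module):
    """Fuses the i-channel input with the 4i-channel Bottleneck output (assumed form)."""
    def __init__(self, channels: int):
        super().__init__()
        self.reduce = nn.Conv2d(4 * channels, channels, kernel_size=1)  # 4i -> i channels
        self.w = nn.Parameter(torch.ones(2))   # manually settable branch weights

    def forward(self, x, bottleneck_out):
        return self.w[0] * x + self.w[1] * self.reduce(bottleneck_out)  # i x j x k

class MMTModule(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Bottleneck stand-in: expands i channels to 4i channels, j x k unchanged
        self.bottleneck = nn.Sequential(
            nn.Conv2d(channels, 4 * channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(4 * channels),
            nn.SiLU(),
        )
        self.fuse = MultiAttenCat(channels)
        self.conv10 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # "tenth Conv"

    def forward(self, x):                       # x: i x j x k
        expanded = self.bottleneck(x)           # 4i x j x k
        fused = self.fuse(x, expanded)          # i x j x k
        return self.conv10(fused)
```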

In this embodiment, a lighter MMT module is introduced into the backbone network in place of two C3 modules so as to increase the data-processing speed.

The main function of the C3 module is to improve the information flow in the model through partial cross-stage connections: the C3 module convolves its input and feeds it into a Bottleneck network, and the output of the Bottleneck network is concatenated with the input data.

The MMT module in this embodiment splices the Bottleneck network's processing result with the input data through the MultiAttenCat unit so as to fuse more information. The MultiAttenCat unit allows the weights of the different inputs to be set manually, so the network can select channels and weights in the most suitable way and can suppress features that would interfere with the output, which benefits target detection.

Third Detection Model

Referring to FIG. 7, the third detection model proposed by the invention is obtained by further improving the second detection model: both the fourth Conv unit and the fifth Conv unit are replaced by RepVGG units.

The fourth Conv unit and the fifth Conv unit are the two convolution modules with the largest number of parameters in the original YOLOv5 model. In this embodiment they are made lightweight by replacing them with RepVGG units, which improves the accuracy of the model while reducing the number of parameters.
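
RepVGG is an existing re-parameterizable convolution design from the literature; its standard training-time block is sketched below for reference. The stride, the ReLU activation and the branch layout follow the usual RepVGG recipe and are not details taken from the patent.

```python
# Training-time RepVGG-style block: 3x3, 1x1 and identity branches that can
# later be folded ("re-parameterized") into a single 3x3 convolution.
import torch.nn as nn

class RepVGGBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Identity branch only exists when input and output shapes match
        self.identity = nn.BatchNorm2d(out_ch) if in_ch == out_ch and stride == 1 else None
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.branch3x3(x) + self.branch1x1(x)
        if self.identity is not None:
            out = out + self.identity(x)
        return self.act(out)
```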

The three detection models described above are verified below with reference to specific embodiments.

In this embodiment, the dataset used for target detection comprises 1371 concrete damage images taken with a high-resolution camera under different lighting conditions, depicting concrete pavements of different flatness. Each crack image was manually annotated as a binary image in Photoshop and cropped to a resolution of 512×512. That is, a sample in this embodiment is a concrete damage image with an annotated crack image, the crack image being a binary image with a resolution of 512×512.

Concrete cracking has three modes, namely longitudinal cracking, transverse cracking and fatigue cracking, also referred to as longitudinal cracks, transverse cracks and fatigue cracks, as shown in FIG. 8.

In this embodiment, the dataset is randomly divided into a training set and a validation set containing 960 and 411 different crack images, respectively, as shown in Table 1 (a minimal sketch of this random split is given after Table 1).

Table 1: Number of images of each damage type in the self-built dataset
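
The 960/411 split described above can be reproduced along the following lines; the function name, the fixed seed and the list-of-paths input are illustrative assumptions rather than details from the patent.

```python
# Randomly split the 1371 annotated crack images into 960 training / 411 validation.
import random

def split_dataset(image_paths, n_train=960, seed=0):
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = list(image_paths)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]   # 960 train, remaining 411 validation
```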

In this embodiment, the model evaluation metrics include precision, recall, mean average precision (mAP), F1 score, giga floating-point operations (GFLOPs) and the number of parameters (Params).

The larger the precision, recall and mAP values, the more accurate the crack detection; the smaller the Params and GFLOPs values, the less computing power the model requires. For F1, a value of 1 indicates the best performance.
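
For reference, the basic metrics can be computed from true-positive, false-positive and false-negative counts as sketched below, with AP taken as the area under the precision-recall curve and mAP as the average of AP over the crack classes; this is a generic illustration, not code from the patent.

```python
# Generic precision / recall / F1 / AP computation from detection counts.
import numpy as np

def precision_recall_f1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def average_precision(recalls, precisions):
    # AP = area under the P-R curve; mAP averages AP over the crack classes
    order = np.argsort(recalls)
    return float(np.trapz(np.asarray(precisions)[order], np.asarray(recalls)[order]))
```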

In this embodiment, the first, second and third detection models are denoted YOLOv5-L, YOLOv5-ML and YOLOv5-RML, respectively. Fast R-CNN, YOLOv5, YOLOv6, YOLOv7 and YOLOv8 are selected as comparison models, so that the performance of the three detection models can be verified by comparing their results against the five comparison models.

In this embodiment, the three detection models and the five comparison models are trained on the training set.

As the number of training iterations increases, the loss curves of the three detection models are shown in FIG. 9, FIG. 10 and FIG. 11, respectively; the scatter points show each model's loss on the training set and on the validation set. As can be seen from FIG. 9 to FIG. 11, for the three detection models YOLOv5-L, YOLOv5-ML and YOLOv5-RML the model loss saturates after roughly 100 training steps, with YOLOv5-RML reaching a final loss value of 0.017.

In this embodiment, after the three detection models and the five comparison models have been trained, the model performance is verified on the validation set; the results are shown in Table 2.

Table 2: Comparison of model validation results

As Table 2 shows, in terms of precision the three detection models are all above 0.9 while the five comparison models are all below 0.9; in recall, the three detection models are all above 0.86 while the five comparison models are all below 0.84; in mean average precision, the three detection models are all above 0.93 while the five comparison models are all at or below 0.92; in F1 score, the three detection models are all above 0.89 while the five comparison models are all below 0.86; and in parameter count, the three detection models are all below 5.61 while the five comparison models are all above 7.02.

In particular, compared with the original YOLOv5 model, the third detection model YOLOv5-RML improves the mean average precision by 6.86%, reduces the inference time by 4.82% and reduces the model weight by 18.23%; compared with YOLOv8, it improves precision, recall, mean average precision and F1 score by 3.28%, 8.46%, 3.79% and 5.89%, respectively. The third detection model proposed by the invention therefore performs better than YOLOv8 on the surface inspection of precast concrete components, which greatly advances the state of the art in this area.

It can be seen that the three detection models proposed by the invention are more accurate than the existing comparison models and also converge faster than most of them; the one comparison model that converges faster, YOLOv8, still differs considerably in accuracy from any of the three detection models.

Referring to FIG. 12, the P-R curves further illustrate the mean average precision (the area under the P-R curve) of each model in Table 2; the larger the area under the P-R curve, the better the model performs, and it can be seen that the third detection model proposed by the invention outperforms YOLOv8.

In this embodiment, the detection accuracy of the trained detection models for different crack types is further verified on the validation set. For convenience, transverse cracks are labelled D00, longitudinal cracks D10 and fatigue cracks D20. The prediction accuracy of the detection model YOLOv5-L for the different crack types is shown in the confusion matrix of FIG. 13, that of YOLOv5-ML in FIG. 14, and that of YOLOv5-RML in FIG. 15. The confusion matrices show that, compared with the other cases, the third detection model YOLOv5-RML exhibits better precision, recall and mAP50 on longitudinal, transverse and fatigue cracks.
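
A per-class confusion matrix of the kind shown in FIG. 13 to FIG. 15 can be produced from predicted and ground-truth class labels, for example with scikit-learn; the label convention follows D00/D10/D20 above and the sample lists are placeholders.

```python
# Build a 3-class confusion matrix for the crack types D00, D10, D20.
from sklearn.metrics import confusion_matrix

labels = ["D00", "D10", "D20"]          # transverse, longitudinal, fatigue
y_true = ["D00", "D10", "D20", "D10"]   # placeholder ground-truth labels
y_pred = ["D00", "D10", "D20", "D00"]   # placeholder model predictions

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)   # rows: true class, columns: predicted class
```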

Of course, it will be apparent to those skilled in the art that the invention is not limited to the details of the exemplary embodiments described above and can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-limiting, the scope of the invention being defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced therein. No reference sign in the claims should be construed as limiting the claim concerned.

Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution; this manner of description is adopted merely for clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions in the individual embodiments may also be combined appropriately to form other embodiments understandable to those skilled in the art.

Techniques, shapes and structural parts not described in detail in the invention are all well-known techniques.

Claims (10)

1. A training method for a precast concrete crack detection model, comprising the following steps: first, a base model and learning samples are obtained, the learning samples being pairs of associated crack photographs and crack annotation images; the base model is built on the YOLOv5 model by replacing the connected Upsample unit and Concat unit with an LW module, the LW module first applying pooling to the input data of the Upsample unit in the YOLOv5 model and then performing a three-way splice of the pooled data with the two inputs of the combination module formed by the Upsample unit and the Concat unit in the YOLOv5 model before producing its output; the base model is then made to learn from the learning samples so as to iterate the model parameters until convergence, and the converged base model is used as the precast concrete crack detection model, whose input is a photographed image and whose output is a crack annotation image.

2. The training method of a precast concrete crack detection model according to claim 1, wherein the LW module comprises an average pooling layer, a max pooling layer, a dimension superposition unit and a fifth Concat unit; the fifth Concat unit has three input ends and the LW module has two input ends; in the combination module formed by the Upsample unit and the Concat unit in the YOLOv5 model, the input end of the Upsample unit is denoted the first input end of the combination module and the input end of the Concat unit connected to the second C3 module is denoted the second input end of the combination module; the first input end of the LW module replaces the first input end of the combination module and the second input end of the LW module replaces the second input end of the combination module; the first input end of the LW module is connected to the input end of the average pooling layer, the input end of the max pooling layer and the first input end of the fifth Concat unit, respectively; the output ends of the average pooling layer and the max pooling layer are both connected to the input end of the dimension superposition unit; the output end of the dimension superposition unit is connected to the second input end of the fifth Concat unit; the third input end of the fifth Concat unit is connected to the second input end of the LW module; and the output end of the fifth Concat unit serves as the output end of the LW module.

3. The training method of a precast concrete crack detection model according to claim 1, wherein modules of identical structure in the YOLOv5 model are numbered sequentially along the direction of data flow; on the basis of the YOLOv5 model, the base model further replaces both the first C3 module and the fourth C3 module along the data-flow direction with MMT modules, an MMT module comprising a Bottleneck unit, a MultiAttenCat unit and a tenth Conv unit connected in sequence; the input end of the MMT module is connected to the input ends of both the Bottleneck unit and the MultiAttenCat unit, and the output end of the tenth Conv unit serves as the output end of the MMT module.

4. The training method of a precast concrete crack detection model according to claim 3, wherein, on the basis of the YOLOv5 model, the base model further replaces both the fourth Conv unit and the fifth Conv unit with RepVGG units.

5. The training method of a precast concrete crack detection model according to claim 1, wherein the iteration of the model parameters comprises the following steps: St1, dividing the learning samples into a training set and a test set; St2, drawing a number of samples from the training set as training samples and letting the base model learn from the training samples so as to iterate the model parameters; St3, drawing a number of samples from the test set as test samples and letting the base model predict on the test samples and output crack annotation images; St4, computing the loss of the base model on the test samples and judging whether the base model has converged, returning to step St2 if not, and fixing the base model if so.

6. The training method of a precast concrete crack detection model according to claim 5, wherein, in St4, the loss of the base model is a cross-entropy loss or a mean-squared-error loss.

7. A precast concrete crack detection method using the training method of a precast concrete crack detection model according to any one of claims 1-6, wherein a detection model is first obtained with the training method according to any one of claims 1-6; the target to be inspected is then photographed to obtain a target image; the target image is then fed into the detection model, and the detection model outputs a crack annotation image in which the crack type is annotated.

8. The precast concrete crack detection method according to claim 7, wherein the crack types include transverse cracks, longitudinal cracks and fatigue cracks.

9. A precast concrete crack detection system, comprising a memory and a processor, wherein a computer program is stored in the memory, the processor is connected to the memory, and the processor executes the computer program to implement the training method of the precast concrete crack detection model according to any one of claims 1-6.

10. A readable medium storing a computer program which, when executed, implements the training method of the precast concrete crack detection model according to any one of claims 1-6.
Application CN202410432015.9A, filed 2024-04-11, priority date 2024-04-11: Precast concrete crack detection model, method, system and readable medium (Active; granted as CN118036476B)

Priority Applications (1)

CN202410432015.9A (granted as CN118036476B): Precast concrete crack detection model, method, system and readable medium

Applications Claiming Priority (1)

CN202410432015.9A (granted as CN118036476B): Precast concrete crack detection model, method, system and readable medium

Publications (2)

CN118036476A, published 2024-05-14
CN118036476B (granted publication), published 2024-07-02

Family

ID=90991696

Family Applications (1)

CN202410432015.9A: Precast concrete crack detection model, method, system and readable medium (Active; granted as CN118036476B)

Country Status (1)

Country Link
CN (1) CN118036476B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020046213A1 (en) * 2018-08-31 2020-03-05 Agency For Science, Technology And Research A method and apparatus for training a neural network to identify cracks
CN114998756A (en) * 2022-05-17 2022-09-02 大连理工大学 Yolov 5-based remote sensing image detection method and device and storage medium
CN116485709A (en) * 2023-02-01 2023-07-25 武汉科技大学 A Bridge Concrete Crack Detection Method Based on YOLOv5 Improved Algorithm
CN116486231A (en) * 2023-04-24 2023-07-25 佛山科学技术学院 Concrete crack detection method based on improved YOLOv5
CN116468716A (en) * 2023-04-26 2023-07-21 山东省计算中心(国家超级计算济南中心) YOLOv 7-ECD-based steel surface defect detection method
CN117078624A (en) * 2023-08-18 2023-11-17 广西大学 A lightweight concrete crack detection method and device based on multi-dimensional attention module
CN116993732A (en) * 2023-09-27 2023-11-03 合肥工业大学 A gap detection method, system and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
WEI JIA et al.: "Real-time automatic helmet detection of motorcyclists in urban traffic using improved YOLOv5 detector", IET Image Processing, 2021, pages 3623-3637
YUAN DAI et al.: "YOLO-Former: Marrying YOLO and Transformer for Foreign Object Detection", IEEE Transactions on Instrumentation and Measurement, 2022, pages 1-14
董学恒: "Research on a vehicle-mounted road surface anomaly detection and positioning system and method", China Master's Theses Full-text Database (Engineering Science and Technology I), 15 June 2023, pages 026-27
阿里云: "YOLOv5 improvement | Neck | Improving the feature fusion layer with ASF-YOLO (for segmentation and object detection)", https://developer.aliyun.com/article/1438145, 7 February 2024, pages 1-4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118279567A (en) * 2024-05-30 2024-07-02 合肥工业大学 Construction method of target detection model and road crack detection method
CN118279567B (en) * 2024-05-30 2024-08-13 合肥工业大学 Method for constructing target detection model and road crack detection method

Also Published As

Publication number Publication date
CN118036476B (en) 2024-07-02

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant