CN114882474A - Road disease detection method and system based on convolutional neural network - Google Patents

Road disease detection method and system based on convolutional neural network

Info

Publication number
CN114882474A
CN114882474A (application CN202210608746.5A)
Authority
CN
China
Prior art keywords
shadow
image
road
road disease
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210608746.5A
Other languages
Chinese (zh)
Inventor
刘国良
刘泳辰
田国会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202210608746.5A priority Critical patent/CN114882474A/en
Publication of CN114882474A publication Critical patent/CN114882474A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/00 - Computing arrangements based on biological models
        • G06N 3/02 - Neural networks
        • G06N 3/04 - Architecture, e.g. interconnection topology
        • G06N 3/045 - Combinations of networks
        • G06N 3/048 - Activation functions
        • G06N 3/08 - Learning methods
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
        • G06V 10/00 - Arrangements for image or video recognition or understanding
        • G06V 10/40 - Extraction of image or video features
        • G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
        • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
        • G06V 10/764 - using classification, e.g. of video objects
        • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
        • G06V 10/774 - Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
        • G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
        • G06V 10/82 - using neural networks
        • G06V 20/00 - Scenes; scene-specific elements
        • G06V 20/50 - Context or environment of the image
        • G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
        • G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
        • G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of road construction and provides a road disease detection method and system based on a convolutional neural network. A shadow removal module based on a generative adversarial network removes shadows from the road disease image to be detected; the road disease type is then detected from the shadow-free image with a target detection model. The target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies attention mechanisms in the channel and spatial dimensions respectively and extracts feature maps of different dimensions; based on the idea of bidirectional feature fusion, the feature maps of different dimensions are weighted and fused with an adaptive feature fusion method to obtain a fused feature map. The method overcomes the shortcomings of traditional road disease detection schemes and significantly improves detection precision.

Description

Road disease detection method and system based on convolutional neural network
Technical Field
The invention belongs to the technical field of road construction, and particularly relates to a road disease detection method and system based on a convolutional neural network.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
With the rapid development of road construction, road disease detection is becoming increasingly important. Acquiring road disease information promptly and accurately can greatly reduce road maintenance costs and lower the risk of road traffic accidents.
Traditional road disease detection relies mainly on inspection personnel, who stop the vehicle to inspect, photograph, and manually measure road disease data.
This approach has several problems: on the one hand, labor cost is high, detection efficiency is low, and safety is poor; on the other hand, the data are not objective, spatial information cannot be managed effectively, and the approach does not meet the requirements of modern road inspection management. In recent years, computer vision methods and deep learning algorithms have been applied increasingly to road disease detection, and the industrial level has improved markedly. However, current algorithms still suffer from heavy noise interference and poor model stability, and need further optimization.
Disclosure of Invention
In order to solve at least one technical problem in the background art, the invention provides a road disease detection method and system based on a convolutional neural network, which overcome the shortcomings of traditional road disease detection schemes and significantly improve detection precision.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention provides a road disease detection method based on a convolutional neural network, which comprises the following steps:
acquiring a road disease image to be detected;
removing the shadow from the road disease image to be detected with a shadow removal module based on a generative adversarial network;
detecting the road disease type from the shadow-free image with a target detection model; the target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies attention mechanisms in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, the feature maps of different dimensions are weighted and fused with an adaptive feature fusion method to obtain a fused feature map, and feature recognition is performed on the fused feature map to obtain the road disease classification result.
A second aspect of the present invention provides a road disease detection system based on a convolutional neural network, comprising:
the data acquisition module is used for acquiring a road disease image to be detected;
the shadow removal module is used for removing the shadow from the road disease image to be detected, based on a generative adversarial network;
the road disease detection module is used for detecting the road disease type from the shadow-free image with a target detection model; the target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies attention mechanisms in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, the feature maps of different dimensions are weighted and fused with an adaptive feature fusion method to obtain a fused feature map; feature recognition is then performed on the fused feature map to obtain the road disease classification result.
A third aspect of the invention provides a computer-readable storage medium.
A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the convolutional neural network-based road disease detection method as described above.
A fourth aspect of the invention provides a computer apparatus.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the convolutional neural network-based road disease detection method as described above when executing the program.
Compared with the prior art, the invention has the beneficial effects that:
the method aims to solve the problems that in the process of detecting the target of the road disease, the original Yolov5 model has poor robustness for detecting the road disease with different sizes, and particularly focuses on small objects excessively, such as road pits with the diameter less than 30 mm, and the target does not belong to the category of the road disease, otherwise, the workload is increased. A method of fusing attention modules is proposed, which performs an attention mechanism in the channel and spatial dimensions, respectively, with detection confidence that is generally higher for the correct class than for the original Yolov5 model.
In road disease target detection, the detection result is easily disturbed by image shadows; in particular, crack diseases are easily confused with branch shadows. To reduce the probability of missed and false detections, a single-image shadow removal method based on a channel-attention generative adversarial network is designed, forming a shadow removal network.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain, not limit, the invention.
FIG. 1 is a schematic flow chart of a road disease detection method based on a convolutional neural network according to an embodiment of the present invention;
FIGS. 2(a)-2(d) show the sample library of road disease information according to embodiments of the present invention;
FIGS. 3(a)-3(d) are example pictures of the shadow training data set according to an embodiment of the present invention;
FIGS. 4(a)-4(d) are shadowed images to be tested according to an embodiment of the invention;
FIGS. 5(a)-5(d) show the detection effect of the model after shadow removal according to an embodiment of the present invention;
FIGS. 6(a)-6(j) show the visualization results of the training process of the improved model according to an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and the terms "comprises" and/or "comprising" specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The overall idea of the invention is as follows:
Firstly, road disease image data sets are collected and labeled from two viewing angles, vehicle-mounted camera and unmanned aerial vehicle aerial photography, and an image sample library covering four types of road disease information is established. Secondly, on the basis of the Yolov5 target detection algorithm, a shadow removal algorithm based on a generative adversarial network, an attention mechanism, and a feature fusion module are fused to construct a target detection model with better performance.
The invention overcomes the shortcomings of traditional road disease detection schemes and significantly improves detection precision. It is mainly applied to the informatization and intelligent construction of road maintenance: discovering road diseases in time, prolonging the service life of roads and related auxiliary facilities, and promoting innovation of Internet-based road management.
Example one
As shown in fig. 1, the present embodiment provides a road disease detection method based on a convolutional neural network, including the following steps:
step 1: acquiring a road disease image to be detected;
Step 2: removing the shadow from the road disease image to be detected with a shadow removal module based on a generative adversarial network;
Step 3: detecting the road disease type from the shadow-free image with a target detection model.
The target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies attention mechanisms in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, the feature maps of different dimensions are weighted and fused with an adaptive feature fusion method to obtain a fused feature map;
Based on the three fused feature maps of different resolutions, anchor-box regression is performed and a confidence threshold together with non-maximum suppression (NMS) is applied; the network finally generates an output vector containing the class probability, the confidence score, and the target bounding box, thereby classifying and localizing the road disease information.
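The postprocessing step just described (confidence filtering followed by greedy NMS) can be sketched as follows. This is an illustrative NumPy implementation, not the patent's code; the default thresholds 0.25 and 0.45 are the common Yolov5 defaults, assumed here rather than stated in the document.

```python
import numpy as np

def iou(a, b):
    """IoU between one box a and an array of boxes b; boxes are (x1, y1, x2, y2)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter)

def postprocess(boxes, scores, conf_thr=0.25, iou_thr=0.45):
    """Confidence filtering followed by greedy NMS, as in Yolov5-style postprocessing."""
    keep_mask = scores >= conf_thr          # drop low-confidence candidates
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]          # highest score first
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # keep only boxes that do not overlap the winner too much
        order = rest[iou(boxes[i], boxes[rest]) < iou_thr]
    return boxes[kept], scores[kept]
```

In practice the same filtering is run per class, so that, e.g., a transverse crack does not suppress an overlapping road pit.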
As one or more embodiments, in step 1, a comprehensive and standardized data set is the basis of deep learning model training. For road disease detection, this embodiment establishes a dual-view sample library covering four types of road disease information; for image shadow processing, a targeted image shadow data set is compiled.
As shown in FIGS. 2(a)-2(d), the detection targets are divided into four types of road diseases: longitudinal cracks, transverse cracks, alligator (tortoise-shell) cracks, and road pits.
The data set covers two viewing angles, vehicle-mounted camera (HDV) and unmanned aerial vehicle aerial photography (UAV), providing greater flexibility for road maintenance work.
In addition, to address the shadow problem, this embodiment establishes an image shadow training data set comprising four parts (the shadowed picture, the shadow-free picture, the shadow mask, and the shadow edge), shown in turn in FIGS. 3(a)-3(d).
In one or more embodiments, in step 2, the shadow removal module based on a generative adversarial network comprises a shadow detector, a shadow detection discriminator, a shadow eliminator, and a shadow elimination discriminator;
the shadow detector and the shadow eliminator adopt a UNet + + network and are composed of an upsampling layer, a downsampling layer and a plurality of nodes, and each node is a residual block composed of a convolution layer, a batch normalization layer, a Mish activation function and a scSE module.
The shadow detector appends an additional structure after the UNet++ network, consisting of a 3 x 3 convolution layer, a batch normalization layer, an LReLU activation function, and a Sigmoid activation function, which constrains the output shadow mask to the range 0 to 1.
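The two less common pieces named above, the Mish activation and the scSE recalibration inside each residual node, can be sketched numerically. The weight shapes here are hypothetical placeholders (a real scSE block uses a bottlenecked MLP and a learned 1 x 1 convolution), and fusing the two branches with an element-wise maximum is one common scSE variant, assumed here rather than taken from the patent.

```python
import numpy as np

def mish(x):
    # Mish(x) = x * tanh(softplus(x))
    return x * np.tanh(np.log1p(np.exp(x)))

def scse(feat, w_ch, w_sp):
    """Concurrent spatial and channel squeeze-excitation on a (C, H, W) map.
    w_ch: (C, C) channel-excitation weights (bottleneck omitted for brevity);
    w_sp: (C,) weights of the 1x1 spatial-squeeze convolution. Both hypothetical."""
    # cSE branch: squeeze spatially, excite channels
    z = feat.mean(axis=(1, 2))                        # (C,) global average pool
    ch_gate = 1 / (1 + np.exp(-(w_ch @ z)))           # sigmoid gate per channel
    cse = feat * ch_gate[:, None, None]
    # sSE branch: 1x1 conv across channels, excite positions
    sp_gate = 1 / (1 + np.exp(-np.tensordot(w_sp, feat, axes=1)))  # (H, W)
    sse = feat * sp_gate[None, :, :]
    return np.maximum(cse, sse)                       # element-wise max fusion
```

The point of scSE in this network is that both "which channel matters" and "which pixel matters" are recalibrated at every UNet++ node, which suits shadow masks that are both color- and location-dependent.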
The shadow detection discriminator stacks the input image together with the labeled shadow mask or the shadow mask output by the shadow detector into four channels, and judges whether the masked region is a genuine shadow rather than, for example, a crack disease.
The shadow eliminator appends a ColorBlock structure after the UNet++ network. Because shadows are strongly affected by the wavelength of light, the light intensity captured by a camera varies with the color of the target; therefore each color channel, and the relationships between channels, should receive attention during training. ColorBlock estimates the weight of each channel with a fully connected layer, compensating for the inability of ordinary convolution layers to effectively train inter-channel relationships, so that the physical characteristics of each wavelength can be trained effectively and the difference between the input image and the ground truth image can be learned.
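A minimal sketch of the ColorBlock idea follows: one weight per color channel, estimated by a fully connected layer over pooled channel statistics, then applied as a per-channel rescaling. The pooling choice, the sigmoid gate, and the weight shapes are illustrative assumptions, not the patent's exact design.

```python
import numpy as np

def color_block(feat, fc_w, fc_b):
    """Hypothetical ColorBlock sketch.
    feat: (3, H, W) RGB feature map; fc_w: (3, 3) and fc_b: (3,) are the
    fully connected layer that maps pooled channel statistics to gates."""
    pooled = feat.mean(axis=(1, 2))                      # (3,) per-channel average
    gates = 1 / (1 + np.exp(-(fc_w @ pooled + fc_b)))    # sigmoid weight per channel
    return feat * gates[:, None, None]                   # rescale each color channel
```

Because the gates are computed jointly from all three channels, the layer can learn cross-channel relationships (e.g. that shadowed asphalt shifts blue) that a per-channel convolution cannot.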
The elimination process of the shadow eliminator comprises:
expressing the light intensity at an arbitrary position and the light intensity in the shadow region according to a shadow model;
obtaining the difference between the input image and the ground truth image from these two intensities;
eliminating the shadow based on this difference to obtain the shadow-removed image.
The light intensities expressed according to the shadow model are as follows.
The shadow model is
    I(x, λ) = L(x, λ) R(x, λ)
where I is the light intensity, L the illumination, and R the reflectance; I, L and R depend on the position x of a point in the image and on the wavelength λ.
From this, the light intensity I_lit of a point in the non-shadow region is expressed as
    I_lit(x, λ) = L_d(x, λ) R(x, λ) + L_a(x, λ) R(x, λ)
and the light intensity I_shadow of a point in the shadow region as
    I_shadow(x, λ) = L_a(x, λ) R(x, λ)
where L_d denotes the illuminance of direct illumination and L_a the illuminance of indirect illumination.
The difference between a point on the input image and the corresponding point on the ground truth image is expressed as
    Δ = I_gt − I_input
      = P(I_lit(x, λ) − I_shadow(x, λ))
      ≈ I_lit(x, λ) − I_shadow(x, λ)
      = L_d(x, λ) R(x, λ)
where the function P represents the image processing of the camera acquisition system, I_gt the ground truth (shadow-free) image, and I_input the input (shadowed) image.
Since the image has only the three color channels R, G and B, λ in the above equation can be approximated as a function of R, G and B. The effect of ColorBlock is to weight each color channel by a specific value, which can be understood as an estimate L̂_d(λ) of the illuminance of direct illumination for the three color channels of each image.
Assuming that the direct light intensity from the light source is constant over the entire image, so that L_d is independent of the position x, the difference Δ can be expressed as
    Δ(x, λ) ≈ L̂_d(λ) R(x, λ)
The shadow elimination discriminator consists of convolution layers, batch normalization layers, and Mish activation layers connected in sequence, with a configurable convolution stride; it evaluates the shadow-removed image output by the eliminator.
The stride of the convolution layers can be set according to actual requirements; in this embodiment, the stride is set to 2 for every convolution layer.
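The effect of the stride setting can be illustrated with a bare stride-2 valid convolution: each such layer roughly halves the spatial resolution, which is why a stack of them reduces an image to a small realism score map. This is a didactic NumPy sketch, not the discriminator itself.

```python
import numpy as np

def conv2d_stride2(x, k):
    """Valid 2-D convolution with stride 2 over a single-channel image x
    with kernel k; each such layer halves the spatial resolution."""
    kh, kw = k.shape
    h = (x.shape[0] - kh) // 2 + 1
    w = (x.shape[1] - kw) // 2 + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # window anchored at (2i, 2j): stride 2 in both directions
            out[i, j] = np.sum(x[2*i:2*i+kh, 2*j:2*j+kw] * k)
    return out
```

Using stride 2 instead of pooling lets the discriminator learn its own downsampling while keeping the layer count (and receptive field growth) explicit.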
The penalty functions of the shadow elimination discriminator are given in the source only as equation images and are not reproduced here; the quantities they involve are as follows:
I_input is the shadow image, M_gt the shadow mask image, I_gt the shadow-free image, and I_output the de-shadowed image output by the shadow eliminator; V_real is a random-valued matrix with mean 0.5 and V_fake a random-valued matrix with mean -0.5, both following a Gaussian distribution.
The advantage of this technique: in road disease target detection, the result is easily disturbed by image shadows, and crack diseases in particular are easily confused with branch shadows. To reduce the probability of missed and false detections, a single-image shadow removal method based on a channel-attention generative adversarial network (CANet) is designed, forming the shadow removal network.
As one or more embodiments, in step 3, the Convolutional Block Attention Module (CBAM) comprises two sub-modules, a Channel Attention Module (CAM) and a Spatial Attention Module (SAM), which apply attention mechanisms in the channel and spatial dimensions respectively.
In road disease target detection, the original Yolov5 model is not robust to road diseases of different sizes; in particular, it over-focuses on small objects such as road pits less than 30 mm in diameter, which do not count as road diseases and whose detection only increases the workload.
Applying attention mechanisms in the channel and spatial dimensions specifically comprises:
When the feature map F0 (H x W x C) extracted by an ordinary convolution layer is compressed in dimension, average pooling and max pooling are applied simultaneously to obtain two one-dimensional feature maps F1 and F2 (1 x 1 x C). F1 and F2 are each fed into a two-layer shared neural network (MLP), the MLP outputs are added element-wise, and a Sigmoid activation finally generates the channel attention feature Mc.
The channel attention feature Mc and the input feature map F0 are multiplied element-wise to obtain F3, the input feature of the SAM module. Channel-wise global max pooling and global average pooling of F3 yield two feature maps F4 and F5 (H x W x 1); these two maps are concatenated along the channel axis, a 7 x 7 convolution reduces the dimension, and a Sigmoid activation finally generates the spatial attention feature Ms.
Finally, the spatial attention output Ms and the input feature map F3 are multiplied element-wise to obtain the final features.
Spatial attention aims to improve the feature expression of key regions: the spatial information of the original picture is transformed into another space by a spatial transformation module while the key information is retained, a weight mask is generated and applied to each position, and the weighted output enhances the specific target regions of interest while weakening irrelevant background regions.
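The CAM-then-SAM pipeline described above can be condensed into a small NumPy sketch. The MLP weights are placeholders, and the 7 x 7 convolution of SAM is replaced here by a per-pixel mix of the two pooled maps purely for brevity; the data flow (pool, shared MLP, sigmoid gate, then channel-wise pooling, mix, sigmoid gate) follows the description.

```python
import numpy as np

def _sigmoid(x):
    return 1 / (1 + np.exp(-x))

def channel_attention(f, w1, w2):
    """CAM on f (C, H, W): spatial avg- and max-pool, shared two-layer
    MLP (w1, w2 with ReLU between), element-wise sum, sigmoid gate."""
    avg = f.mean(axis=(1, 2))
    mx = f.max(axis=(1, 2))
    mc = _sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return f * mc[:, None, None]

def spatial_attention(f, k):
    """SAM: channel-wise avg and max maps, mixed by k (a stand-in for the
    7x7 convolution of real CBAM), sigmoid gate per position."""
    avg = f.mean(axis=0)
    mx = f.max(axis=0)
    ms = _sigmoid(k[0] * avg + k[1] * mx)
    return f * ms[None, :, :]

def cbam(f, w1, w2, k):
    """Channel attention first, then spatial attention, as in CBAM."""
    return spatial_attention(channel_attention(f, w1, w2), k)
```

Channel attention answers "what to look at" (e.g. the channels responding to crack texture) while spatial attention answers "where to look", which is why the two gates are applied in sequence rather than merged.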
In this embodiment, in order to keep the network structure as unchanged as possible, i.e., to use more pre-trained parameters and make the model training process converge as soon as possible, the attention module is added outside the block.
In the embodiment, all the C3 modules in the Yolov5 backbone network, which are mainly responsible for extracting residual features, are replaced by the "convolution fusion attention + C3" module.
As one or more embodiments, in step 3, the weighted fusion of feature maps of different dimensions with an adaptive feature fusion method specifically comprises:
Considering that feature maps at different stages cannot be fully exploited by simple stacking and superposition, and in order to better fuse the features output by the backbone network and fully represent features of different sizes, this embodiment improves the original feature extraction module of Yolov5.
Based on the idea of bidirectional feature fusion, adaptive spatial feature fusion (ASFF) performs weighted fusion over the stages of the original feature structure at each layer, and an attention mechanism controls the contribution of the other stages to the current stage's features.
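The per-pixel weighted fusion of ASFF can be sketched as follows, assuming the three stage feature maps have already been resized to a common shape; the softmax over learned weight maps guarantees the fusion weights sum to 1 at every pixel. The 1 x 1 convolutions that would produce the weight logits are omitted, so the logits are passed in directly.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def asff_fuse(levels, logits):
    """Adaptive spatial feature fusion at one output level.
    levels: list of 3 arrays (C, H, W), already resized to a common shape;
    logits: (3, H, W) learned per-pixel weight maps (alpha, beta, gamma)."""
    w = softmax(logits, axis=0)                 # weights sum to 1 per pixel
    return sum(w[i][None] * levels[i] for i in range(3))
```

Because the weights vary per pixel, a region dominated by a fine crack can draw mostly on the high-resolution stage while a large pit in the same image draws on the coarse stage.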
Algorithm validation
This embodiment is implemented with the PyTorch deep learning framework on a GeForce RTX 2080Ti hardware platform. Experiments with the Yolov5s pre-trained model required more training rounds to converge without overfitting, since changing the network structure made part of the pre-trained parameters unavailable.
The model training strategy is shown in table 1:
TABLE 1 model training strategy
Various evaluation criteria were used in the experiments.
Accuracy (Acc):
    Acc = (TP + TN) / (TP + TN + FP + FN)
Precision (P):
    P = TP / (TP + FP)
Recall (R):
    R = TP / (TP + FN)
where TP is the number of positive samples judged positive, FP the number of negative samples judged positive, FN the number of positive samples judged negative, and TN the number of negative samples judged negative.
Average Precision (AP):
    AP = Σ_i (r_i − r_{i−1}) · P_interp(r_i), with r_0 = 0
The AP is obtained from the Precision-Recall curve, where r_1, r_2, …, r_n are the recall values corresponding to the precision interpolation segments, arranged in ascending order, and P_interp(r) is the interpolated precision at recall r.
Mean average precision over all classes (mAP):
    mAP = (1/N) Σ_{i=1}^{N} AP_i
mAP_0.5:0.95 denotes the mAP averaged over IoU thresholds from 0.5 to 0.95, and mAP_0.5 denotes the mAP at IoU = 0.5.
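The scalar metrics above can be computed directly from detection counts. The AP helper below uses all-point interpolation over an ascending recall list, which matches the interpolated Precision-Recall construction described; the exact interpolation variant used in the patent's experiments is not stated, so treat this as one standard choice.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from detection counts, per the formulas above."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recalls, precisions):
    """Interpolated AP: area under the P-R curve, taking at each recall
    level the maximum precision at any recall to its right."""
    ap = 0.0
    prev_r = 0.0
    for i, r in enumerate(recalls):       # recalls assumed ascending
        p_interp = max(precisions[i:])    # best precision at recall >= r
        ap += (r - prev_r) * p_interp
        prev_r = r
    return ap
```

mAP is then simply the mean of per-class AP values, and mAP_0.5:0.95 averages that mean over IoU thresholds 0.5, 0.55, ..., 0.95.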
1. Shadow processing effect of image
Using the SRD+ pre-trained model, data set 4, consisting of 1055 branch-shadow pictures and 1398 ISTD open-source pictures, was used for training. FIGS. 4(a)-4(d) show the shadowed images, and FIGS. 5(a)-5(d) show the detection effect of the model after shadow removal.
Experiments show that the shadow removal model based on the generative adversarial network can effectively remove shadows from the pictures.
2. Effect of improvement of Yolov5 algorithm
(1) To test the improvement effect of the model, 3000 vehicle-mounted-view pictures were selected to form data set 1 for training. A total of 25 experiments were designed for the different modification methods; the training results are shown in Table 2:
table 2 data set 1 training results
Experiments show that Yolov5 improves most after the CBAM attention mechanism is added and the feature extraction module is replaced with ASFF; the training mAP increases by 3.5%.
(2) To improve the generalization performance of the model, 14055 multi-view pictures were selected to form data set 2 for training. The training results are shown in Table 3, and the visualization of the improved model's training process is shown in FIGS. 6(a)-6(j):
table 3 data set 2 training results
Methods            metrics/mAP_0.5    metrics/mAP_0.5:0.95
Original Yolov5    0.973              0.748
Improved Yolov5    0.987              0.794
Experiments show that the Yolov5 model with the combined ASFF + CBAM improvements raises mAP by 4.6% over the original Yolov5 algorithm.
(3) To match the conditions of actual road-condition detection, 100 multi-view road disease pictures were collected to form data set 3 for testing. The test results are shown in Table 4.
Table 4 data set 3 test results
Methods | Accuracy
Original Yolov5 | 0.86
Improved Yolov5 | 0.94
Experiments show that the original Yolov5 model tends to over-focus on small objects during detection, such as pits less than 30 mm in diameter, which do not constitute road diseases and would otherwise increase the maintenance workload; the improved Yolov5 model resolves this problem well. At the same time, the improved Yolov5 model generally reports higher confidence for the correct class than the original model.
Because the road disease classes overlap to some extent and are complex, the model makes a small number of class misjudgments during target detection.
3. Yolov5 detection model with image shadow removal
To test how much the shadow processing improves the target detection model, data set 5, consisting of 100 images with shadows at different angles, was constructed for testing. The test results are shown in Table 5.
Table 5 data set 5 test results
Methods | Accuracy
Original Yolov5, pictures before shadow removal | 0.83
Original Yolov5, pictures after shadow removal | 0.90
Improved Yolov5, pictures before shadow removal | 0.86
Improved Yolov5, pictures after shadow removal | 0.94
Experiments show that removing shadow interference from the images markedly reduces the misjudgment probability in road disease detection, locates road diseases more efficiently and accurately, and greatly improves the detection of all four road disease types.
In conclusion, the new road disease detection algorithm proposed by the invention achieves markedly higher detection precision than traditional road disease detection algorithms. The detection results show that the proposed scheme can accurately identify cracks wider than 5 mm and pits larger than 50 mm in diameter, meeting the practical requirements of current road surveys.
Example two
This embodiment provides a road disease detection system based on a convolutional neural network, comprising:
the data acquisition module is used for acquiring a road disease image to be detected;
the shadow removal module is used for removing shadows from the road disease image to be detected, using a shadow remover based on a generative adversarial network;
the road disease detection module is used for detecting the road disease type from the shadow-removed image with a target detection model; the target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies an attention mechanism in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, an adaptive feature fusion method performs weighted fusion of the feature maps of different dimensions to obtain a fused feature map; feature recognition is then performed on the fused feature map to obtain the road disease classification result.
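The adaptive weighted fusion step can be sketched in NumPy, in the spirit of ASFF: per-pixel softmax weights over the feature levels, followed by a weighted sum. All names here are illustrative, and the levels are assumed to have already been resized to a common resolution:

```python
import numpy as np

def adaptive_fuse(levels, logits):
    """ASFF-style adaptive feature fusion sketch.

    levels: list of N feature maps, each C x H x W, already resized
            to a common resolution
    logits: N x H x W array of learned per-level weight logits
    """
    # Softmax over the level axis: the N weights at every spatial
    # position sum to 1 (subtracting the max for numerical stability).
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)
    # Weighted sum of the levels, broadcasting each weight map over C.
    fused = sum(w[i][None, :, :] * levels[i] for i in range(len(levels)))
    return fused, w
```

In the real network the logits are produced by 1x1 convolutions on each level and learned end to end; here they are simply passed in.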
Example three
The present embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the convolutional neural network-based road disease detection method as described above.
Example four
The embodiment provides a computer device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the road disease detection method based on the convolutional neural network.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A road disease detection method based on a convolutional neural network, characterized by comprising the following steps:
acquiring a road disease image to be detected;
removing shadows from the road disease image to be detected with a shadow removal module based on a generative adversarial network;
detecting the road disease type from the shadow-removed image with a target detection model; the target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies an attention mechanism in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, an adaptive feature fusion method performs weighted fusion of the feature maps of different dimensions to obtain a fused feature map, and feature recognition is performed on the fused feature map to obtain the road disease classification result.
2. The convolutional neural network-based road disease detection method according to claim 1, wherein applying the attention mechanism in the channel and spatial dimensions specifically comprises:
introducing average pooling and maximum pooling simultaneously when compressing the dimensions of the original feature map to obtain two one-dimensional feature maps, feeding the two one-dimensional feature maps respectively into a shared two-layer neural network, and adding the outputs to generate the channel attention feature;
multiplying the channel attention feature by the original feature map to obtain a third feature map, performing global maximum pooling and global average pooling along the channel dimension to obtain two one-dimensional feature maps, concatenating the two one-dimensional feature maps, and reducing the dimension with a convolution operation to generate the spatial attention feature.
3. The convolutional neural network-based road disease detection method according to claim 1, wherein, for the removal of shadows from the road disease image to be detected,
the shadow removal module of the generative adversarial network comprises a shadow eliminator, the elimination process of which comprises:
expressing the light intensity at an arbitrary position and the light intensity of the shadow area according to a shadow model;
obtaining the difference between the input image and the ground-truth image based on the light intensity at the arbitrary position and the light intensity of the shadow area;
eliminating the shadow based on the difference between the input image and the ground-truth image to obtain the shadow-removed image.
4. The convolutional neural network-based road disease detection method according to claim 3, wherein the shadow eliminator adopts a UNet++ network structure composed of upsampling, downsampling and a plurality of nodes, each node being a residual block composed of a convolutional layer, a batch normalization layer, a Mish activation function and an scSE module; an additional ColorBlock structure arranged after the UNet++ network uses a fully connected layer to estimate the weight of each color channel of the image.
5. The convolutional neural network-based road disease detection method according to claim 3, wherein the difference between the input image and the ground-truth image is expressed as:
Δ = I_gt − I_input
  = P(I_lit(x, λ) − I_shadow(x, λ))
  ≈ I_lit(x, λ) − I_shadow(x, λ)
  = L_d(x, λ)R(x, λ)
where I_gt denotes the shadow-free image, I_input the shadowed image, P the image-processing function of the camera acquisition system, I_lit the light intensity at position x, I_shadow the light intensity of the shadow area, L_d the illuminance of direct illumination, R the reflectance, and λ the wavelength.
6. The convolutional neural network-based road disease detection method according to claim 1, wherein the road disease types comprise four types, namely longitudinal cracks, transverse cracks, alligator (tortoise-shell) cracks and road pits.
7. The convolutional neural network-based road disease detection method according to claim 1, wherein the shadow removal module further comprises a shadow detector, which arranges an additional structure after the UNet++ network to limit the output shadow mask to the range 0 to 1.
8. A road disease detection system based on a convolutional neural network, characterized by comprising:
the data acquisition module is used for acquiring a road disease image to be detected;
the shadow removal module is used for removing shadows from the road disease image to be detected, using a shadow remover based on a generative adversarial network;
the road disease detection module is used for detecting the road disease type from the shadow-removed image with a target detection model; the target detection model is constructed as follows: a Yolov5 target detection network fused with a convolutional attention module applies an attention mechanism in the channel and spatial dimensions respectively and extracts feature maps of different dimensions;
based on the idea of bidirectional feature fusion, an adaptive feature fusion method performs weighted fusion of the feature maps of different dimensions to obtain a fused feature map, and feature recognition is performed on the fused feature map to obtain the road disease classification result.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the convolutional neural network-based road disease detection method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps in the convolutional neural network-based road disease detection method of any one of claims 1-7 when executing the program.
CN202210608746.5A 2022-05-30 2022-05-30 Road disease detection method and system based on convolutional neural network Pending CN114882474A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210608746.5A CN114882474A (en) 2022-05-30 2022-05-30 Road disease detection method and system based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210608746.5A CN114882474A (en) 2022-05-30 2022-05-30 Road disease detection method and system based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN114882474A true CN114882474A (en) 2022-08-09

Family

ID=82679718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210608746.5A Pending CN114882474A (en) 2022-05-30 2022-05-30 Road disease detection method and system based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114882474A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958825A (en) * 2023-08-28 2023-10-27 中国公路工程咨询集团有限公司 Mobile remote sensing image acquisition method and highway maintenance monitoring method
CN116958825B (en) * 2023-08-28 2024-03-22 中国公路工程咨询集团有限公司 Mobile remote sensing image acquisition method and highway maintenance monitoring method

Similar Documents

Publication Publication Date Title
KR102373456B1 (en) Learning method and learning device, and testing method and testing device for detecting parking spaces by using point regression results and relationship between points to thereby provide an auto-parking system
US20210327042A1 (en) Deep learning-based system and method for automatically determining degree of damage to each area of vehicle
CN109816024B (en) Real-time vehicle logo detection method based on multi-scale feature fusion and DCNN
CN107563265B (en) High beam detection method and device
CN111046880A (en) Infrared target image segmentation method and system, electronic device and storage medium
CN104809443A (en) Convolutional neural network-based license plate detection method and system
CN112766195B (en) Electrified railway bow net arcing visual detection method
CN112651404A (en) Green fruit efficient segmentation method and system based on anchor-frame-free detector
CN113378763A (en) SAR image-oriented target automatic detection method and system
CN113221943B (en) Diesel vehicle black smoke image identification method, system and storage medium
CN111027347A (en) Video identification method and device and computer equipment
CN111079518A (en) Fall-down abnormal behavior identification method based on scene of law enforcement and case handling area
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN113255580A (en) Method and device for identifying sprinkled objects and vehicle sprinkling and leaking
CN112784675B (en) Target detection method and device, storage medium and terminal
CN111062967A (en) Electric power business hall passenger flow statistical method and system based on target dynamic tracking
CN114882474A (en) Road disease detection method and system based on convolutional neural network
CN115984543A (en) Target detection algorithm based on infrared and visible light images
CN111915634A (en) Target object edge detection method and system based on fusion strategy
CN115272882A (en) Discrete building detection method and system based on remote sensing image
CN117671597B (en) Method for constructing mouse detection model and mouse detection method and device
CN117935200A (en) Automatic driving road condition monitoring method based on improvement YOLOv8
CN116229379B (en) Road attribute identification method and device, electronic equipment and storage medium
CN112785610A (en) Lane line semantic segmentation method fusing low-level features
CN116797789A (en) Scene semantic segmentation method based on attention architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination