CN117333845A

CN117333845A - Real-time detection method for small target traffic sign based on improved YOLOv5s

Info

Publication number: CN117333845A
Application number: CN202311453323.1A
Authority: CN
Inventors: 周欣欣; 薛青常; 李茂源; 杨峰; 姜万昌; 宋琼
Original assignee: Northeast Dianli University
Current assignee: Northeast Electric Power University
Priority date: 2023-11-03
Filing date: 2023-11-03
Publication date: 2024-01-02
Anticipated expiration: 2043-11-03
Also published as: CN117333845B

Abstract

The invention discloses a real-time detection method for small target traffic signs based on improved YOLOv5s, which comprises the following steps: (1) creating a traffic sign image dataset; (2) adding annotation information to the images in the dataset; (3) constructing a real-time small target traffic sign detection model based on improved YOLOv5s; (4) employing a new loss function; (5) training the model using the training set and the validation set; (6) testing the model on a test set to obtain the final real-time small target traffic sign detection model. Compared with the prior art, the method effectively improves the detection accuracy for small target traffic signs and greatly improves the real-time performance of traffic sign detection.

Description

A real-time detection method of small target traffic signs based on improved YOLOv5s

Technical field

The invention relates to the field of artificial intelligence, and in particular to a real-time detection method for small target traffic signs.

Background art

With the continuous development of autonomous driving technology, Traffic Sign Recognition (TSR) has become one of the key components of Advanced Driver Assistance Systems (ADAS). TSR can efficiently and accurately recognize signs in images captured by cameras and provide drivers with instructions, warnings and other information in real time. During actual driving, a traffic sign that has just entered the field of view usually occupies less than 1% of the image, which makes this a small target detection problem. Small target traffic signs appear blurred in the image and lack distinct detail features. Fast and accurate recognition of small target traffic signs is therefore of great significance to the development of driver assistance systems.

Traditional traffic sign detection methods mainly extract features based on color segmentation, shape and contour, and then classify these features with a classifier to detect and recognize road traffic signs. These algorithms rely on hand-crafted features as their input. The features are high-dimensional, computationally expensive, and time-consuming and laborious to design, so the detection process suffers from poor real-time performance, stability and accuracy.

With the rapid development of deep learning, target detection based on improved convolutional neural networks has become a current research hotspot. The Convolutional Neural Network (CNN) is a mainstream deep learning approach that includes models such as R-CNN, Fast-R-CNN, Mask-R-CNN, AlexNet and YOLO. These models have achieved good results in target detection across different application scenarios. Target detection algorithms fall into two categories: two-stage algorithms that use region proposals and single-stage algorithms that do not. Two-stage algorithms are accurate but slow. Improved versions are somewhat faster, but because of the region proposal step, two-stage models still cannot adequately meet the real-time requirements of detecting small target traffic signs during high-speed driving, so single-stage algorithms are usually used for traffic sign recognition. Single-stage algorithms offer better real-time performance than two-stage algorithms but lower accuracy. In small target traffic sign detection in particular, their accuracy is limited, which easily leads to missed and false detections. Moreover, the large number of model parameters and high computational complexity degrade real-time performance and make it difficult to migrate the model to mobile devices.

Summary of the invention

In view of the problems in the prior art, the invention provides a real-time detection method for small target traffic signs based on improved YOLOv5s. By improving the YOLOv5s target detection model, applying feature enhancement and adopting the NWD metric, the method addresses the low detection accuracy for small target traffic signs, poor real-time performance, large number of model parameters, high computational complexity and unsuitability for deployment on mobile devices.

The technical solution provided by the invention comprises the following steps:

Step 1: Obtain traffic sign images to form a first data set;

Step 2: Add annotation information to the images in the first data set to form a second data set, and divide the second data set into a training set, a validation set and a test set;

Step 3: Build a traffic sign detection model based on improved YOLOv5s;

Step 4: Adopt a new loss function;

Step 5: Train the improved-YOLOv5s traffic sign detection model using the training set and the validation set;

Step 6: Test the optimal model obtained from training on the traffic sign test set, and obtain the final real-time small target traffic sign detection model.

Further, in step 1, the traffic sign images in the first data set may be captured with a digital camera, collected from the Internet, or extracted from surveillance video.

Preferably, in step 2, the training set, validation set and test set are divided in a ratio of 7:2:1.
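The 7:2:1 division can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name `split_dataset`, the fixed random seed and the file names are hypothetical, and integer arithmetic is used so that the split sizes are deterministic.

```python
import random

def split_dataset(items, parts=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/validation/test lists.

    parts gives the integer ratio train:validation:test (7:2:1 here).
    """
    total = sum(parts)
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = len(shuffled) * parts[0] // total
    n_val = len(shuffled) * parts[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Hypothetical file names, for illustration only.
images = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

Shuffling before slicing keeps each subset representative of the whole data set, and the three subsets are disjoint by construction.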

Further, step 3 specifically comprises steps 3.1 to 3.3:

Step 3.1: Combine the C3 module of the YOLOv5s model with ConvMixer to form a CSPCM module, and use the CSPCM module to replace the last C3 module of the YOLOv5s backbone network and the last C3 module of the neck network;

Step 3.2: Replace the remaining C3 modules in the backbone and neck networks of the YOLOv5s model with the lightweight convolution module C3_Faster, that is, every C3 module other than the two last-layer C3 modules already replaced in step 3.1;

Step 3.3: Add a small target detection head to the output layer alongside the existing three detection heads.

Further, in step 4, the NWD metric is introduced into the CIoU loss function, and the NWD metric is used to optimize the CIoU loss function of YOLOv5s. The optimized loss function is:

$$L = (1-\beta)\,\bigl(1-\mathrm{NWD}(N_a, N_b)\bigr) + \beta\,(1-\mathrm{CIoU}) \tag{1}$$

In formula (1), L is the optimized loss function; NWD is the normalized Wasserstein distance; N_a and N_b are the Gaussian distributions modeled from a = (cx_a, cy_a, w_a, h_a) and b = (cx_b, cy_b, w_b, h_b), where a denotes the ground-truth box and b the predicted box; cx_a, cy_a are the center coordinates of the ground-truth box and w_a, h_a its width and height; cx_b, cy_b are the center coordinates of the predicted box and w_b, h_b its width and height; β is the weighting coefficient; and CIoU is the loss term of the original YOLOv5s, calculated as:

$$\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b_A, b_B)}{c^2} - \alpha v \tag{2}$$

In formula (2), ρ²(b_A, b_B) is the squared Euclidean distance between the center points of the ground-truth box and the predicted box, c is the diagonal length of the smallest rectangle enclosing the predicted box and the ground-truth box, α is a weight factor, v measures aspect-ratio consistency, and IoU is the overlap ratio between the ground-truth box and the predicted box.

Further, step 5 specifically comprises steps 5.1 to 5.4:

Step 5.1: Set the training parameters of the improved-YOLOv5s real-time small target traffic sign detection model, namely: learning rate, momentum, weight decay, optimizer, number of epochs and batch size;

Step 5.2: Input the training and validation images with their labels into the model, use the back-propagation algorithm to compute the gradient of the loss function with respect to the model parameters, and adjust the parameters by minimizing the loss function so that they gradually approach the optimal solution;

Step 5.3: Use the optimizer to update the model parameters in the direction of gradient descent until the loss on the training and validation sets no longer decreases and evaluation metrics such as precision P, recall R and mAP no longer improve;

Step 5.4: Save the trained model parameters as the optimal model.

Further, step 6 specifically comprises steps 6.1 to 6.3:

Step 6.1: Input the test set into the optimal model obtained in step 5;

Step 6.2: Calculate the model performance indicators, namely precision P, recall R, mAP, number of parameters, computational complexity (GFLOPs) and model size, using the following formulas:

$$P = \frac{TP}{TP + FP}$$

$$R = \frac{TP}{TP + FN}$$

$$AP = \int_0^1 P(R)\,dR$$

$$mAP = \frac{1}{m}\sum_{i=1}^{m} AP_i$$

Here P is the precision, R is the recall, mAP is the mean of the average precision over all classes, AP is the average precision of one class, m is the total number of traffic sign classes, TP is the number of positive samples correctly identified as positive, FP is the number of negative samples incorrectly identified as positive, and FN is the number of positive samples incorrectly identified as negative;

Step 6.3: When the performance indicators meet the accuracy requirements, obtain the final real-time small target traffic sign detection model based on improved YOLOv5s.

Compared with the prior art, the beneficial effects of the invention are as follows:

The real-time detection method for small target traffic signs based on improved YOLOv5s improves the detection accuracy for small target traffic signs by introducing the CSPCM module and adding a small target detection head. It reduces model parameters and computational complexity by adopting the C3_Faster structure, which improves the real-time performance of traffic sign detection. It also introduces the NWD metric into the CIoU loss function to form a new loss function, which effectively avoids the large losses caused by small targets and accelerates model convergence.

Brief description of the drawings

Figure 1 is a flow chart of the real-time detection method for small target traffic signs based on improved YOLOv5s according to the invention;

Figure 2 is a schematic diagram of the model structure of the real-time detection method for small target traffic signs based on improved YOLOv5s according to the invention;

Figure 3 is a schematic diagram of the CSPCM structure;

Figure 4 is a schematic diagram of the C3_Faster structure.

Detailed description

To make the technical solution, structural features, objectives and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings. The specific embodiments described here serve only to explain the invention more clearly and do not limit it.

Figure 1 is a flow chart of the real-time detection method for small target traffic signs based on improved YOLOv5s disclosed by the invention. The method is implemented as follows:

Step 1: Obtain traffic sign images to form a first data set. The traffic sign images in the first data set may be captured with a digital camera, collected from the Internet, or extracted from surveillance video.

In this embodiment, to better evaluate the detection performance of the disclosed method, the public TT100K (Tsinghua-Tencent 100K) traffic sign data set is used. The TT100K data set must first be preprocessed: the traffic sign classes that occur fewer than 200 times in TT100K are identified, the labels of those classes are deleted, and the images that contain only those classes are removed. The result is a data set containing 36 classes of traffic signs, which forms the first data set.
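The class-filtering step described above might look like the following sketch. The annotation layout (a dict mapping an image id to a list of objects with a `category` key), the function name `filter_rare_classes` and the tiny sample are assumptions made for illustration, not the actual TT100K tooling.

```python
from collections import Counter

def filter_rare_classes(annotations, min_count=200):
    """Drop labels of classes seen fewer than min_count times.

    annotations: dict mapping image id -> list of object dicts, each with
    a "category" key (hypothetical layout). Images left without any
    retained object are removed as well.
    """
    counts = Counter(obj["category"]
                     for objects in annotations.values()
                     for obj in objects)
    keep = {name for name, n in counts.items() if n >= min_count}
    filtered = {}
    for image_id, objects in annotations.items():
        kept = [obj for obj in objects if obj["category"] in keep]
        if kept:
            filtered[image_id] = kept
    return filtered, sorted(keep)

# Tiny illustrative sample; the threshold is lowered so the effect is visible.
sample = {
    "img1": [{"category": "pl40"}, {"category": "w13"}],
    "img2": [{"category": "pl40"}],
    "img3": [{"category": "io"}],
}
filtered, classes = filter_rare_classes(sample, min_count=2)
print(classes, sorted(filtered))
```

With the embodiment's threshold, `min_count` would stay at its default of 200.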

Step 2: Add annotation information to the images in the first data set using the labelme annotation tool. Because the public TT100K data set used in this embodiment already contains annotations, this step is skipped. The json-format label files of the first data set are converted into the txt-format label files required by YOLOv5s to form the second data set, which is divided into a training set, a validation set and a test set in a ratio of 7:2:1. After division, the training set contains 6598 images, the test set 1889 images and the validation set 970 images. In this embodiment, the images in the second data set are 640*640 pixels.
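The json-to-txt label conversion can be illustrated with a small sketch. It assumes the source annotation stores each box as pixel corner coordinates (xmin, ymin, xmax, ymax), and it emits the YOLO txt convention of one line per object: class index followed by the normalized center x, center y, width and height. The function name `to_yolo_line` and the sample values are hypothetical.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert one pixel-coordinate box to a YOLO-format label line."""
    cx = (xmin + xmax) / 2.0 / img_w   # normalized center x
    cy = (ymin + ymax) / 2.0 / img_h   # normalized center y
    w = (xmax - xmin) / img_w          # normalized width
    h = (ymax - ymin) / img_h          # normalized height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A 32x32 pixel sign in a 640x640 image covers 0.25% of the image area.
line = to_yolo_line(3, 100, 200, 132, 232, 640, 640)
print(line)
```

Normalizing by the image size keeps the labels valid if the images are later resized.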

Step 3: Construct a traffic sign detection model based on improved YOLOv5s. The structure of the improved YOLOv5s model is shown in Figure 2. Model construction specifically comprises steps 3.1 to 3.3:

Step 3.1: Replace the Bottleneck module inside the C3 module with a ConvMixer module to form a new convolution module, CSPCM. Use the CSPCM module to replace the last C3 module of the Backbone and the last C3 module of the Neck;

Further, the structure of the ConvMixer module is shown in Figure 3. It mixes the spatial and channel dimensions separately and keeps the same size and resolution throughout the model. In ConvMixer, the input first passes through a depthwise convolution (DWConv), that is, a grouped convolution whose number of groups equals the number of channels, and then through a pointwise convolution (PWConv), that is, a 1×1 convolution. Each convolution is followed by an activation function and a BatchNorm layer. This design effectively reduces the number of model parameters, which benefits real-time performance, and improves the feature representation ability of the model.
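The parameter saving from splitting a convolution into a depthwise and a pointwise stage can be checked with simple arithmetic. The sketch below only counts weights; the channel count (256) and kernel size (3) are illustrative assumptions, since the source does not state the values used in CSPCM.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a k x k 2-D convolution (bias omitted)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k

def separable_params(channels, k):
    """Depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = conv_params(channels, channels, k, groups=channels)
    pointwise = conv_params(channels, channels, 1)
    return depthwise + pointwise

channels, k = 256, 3  # illustrative values, not taken from the source
standard = conv_params(channels, channels, k)
separable = separable_params(channels, k)
print(standard, separable, round(standard / separable, 2))
```

For these values the separable pair needs roughly one ninth of the weights of a standard convolution with the same input and output channels.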

Step 3.2: Replace the remaining C3 modules in the backbone and neck networks of the YOLOv5s model with the lightweight convolution module C3_Faster, that is, every C3 module other than the two last-layer C3 modules already replaced in step 3.1;

The lightweight convolution module C3_Faster uses PConv. Compared with depthwise convolution (DWConv) and group convolution (GConv), PConv reduces FLOPs, computational redundancy and memory access. Figure 4 shows the structure of C3_Faster: a PConv layer followed by two Conv layers, which together form an inverted residual block whose middle layer has an expanded number of channels, with a normalization layer and an activation layer after each intermediate Conv.
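As a rough illustration of where the saving comes from, the sketch below compares multiply-accumulate counts of a regular convolution with those of a PConv layer. It assumes, as PConv is commonly described, that a regular convolution is applied to only a fraction of the channels while the rest pass through unchanged. The 1/4 channel fraction, the feature-map size and the function names are assumptions, not values from the source, and the comparison is against a regular convolution only.

```python
def regular_conv_macs(h, w, c, k):
    """Multiply-accumulates of a k x k convolution with c input and c output channels."""
    return h * w * k * k * c * c

def pconv_macs(h, w, c, k, divisor=4):
    """Multiply-accumulates when only c // divisor channels are convolved.

    divisor=4 is an assumed setting; the remaining channels are left untouched.
    """
    c_p = c // divisor
    return h * w * k * k * c_p * c_p

h, w, c, k = 40, 40, 128, 3  # illustrative feature-map size
regular = regular_conv_macs(h, w, c, k)
partial = pconv_macs(h, w, c, k)
print(regular, partial, regular // partial)
```

Because the cost is quadratic in the number of convolved channels, convolving a quarter of the channels costs one sixteenth of the regular convolution in this count.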

Step 3.3: Add a small target detection head to the output layer alongside the existing three detection heads. This lets the model process multi-scale feature information more effectively, makes the detector more sensitive to small targets, and improves detection performance on small target traffic signs.
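The effect of an extra head on grid resolution can be sketched numerically. The strides 8, 16 and 32 are the usual YOLOv5 head strides; the stride of 4 for the added small target head is an assumption, since the source does not give it. The function name is hypothetical.

```python
def head_grid_sizes(img_size, strides):
    """Map each detection-head stride to its output grid size."""
    return {s: (img_size // s, img_size // s) for s in strides}

baseline = head_grid_sizes(640, (8, 16, 32))
improved = head_grid_sizes(640, (4, 8, 16, 32))  # stride 4 is assumed
for stride, grid in improved.items():
    print(f"stride {stride:2d}: grid {grid[0]}x{grid[1]}")
```

Under this assumption, a 16x16 pixel sign spans 4x4 cells on the stride-4 grid but only 2x2 cells on the stride-8 grid, which is why a finer head helps with small targets.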

Step 4: Adopt a new loss function. The NWD metric is introduced into the CIoU loss function, and CIoU is combined with the NWD metric to form a new loss function, which effectively avoids the large losses caused by small targets and accelerates model convergence. Step 4 specifically comprises steps 4.1 to 4.3:

Step 4.1: Model each bounding box as a two-dimensional Gaussian distribution, and use the Gaussian distributions of the predicted and actual targets to calculate the Wasserstein distance between the predicted box and the ground-truth box:

$$W_2^2(N_a, N_b) = \left\| \left[cx_a,\ cy_a,\ \tfrac{w_a}{2},\ \tfrac{h_a}{2}\right]^{\mathsf T} - \left[cx_b,\ cy_b,\ \tfrac{w_b}{2},\ \tfrac{h_b}{2}\right]^{\mathsf T} \right\|_2^2 \tag{1}$$

In formula (1), W_2^2(N_a, N_b) is the Wasserstein distance; N_a and N_b are the Gaussian distributions modeled from a = (cx_a, cy_a, w_a, h_a) and b = (cx_b, cy_b, w_b, h_b), where a denotes the ground-truth box and b the predicted box; cx_a, cy_a are the center coordinates of the ground-truth box and w_a, h_a its width and height; cx_b, cy_b are the center coordinates of the predicted box and w_b, h_b its width and height.

Step 4.2: Calculate the normalized Wasserstein distance between them:

$$\mathrm{NWD}(N_a, N_b) = \exp\!\left(-\frac{\sqrt{W_2^2(N_a, N_b)}}{C}\right) \tag{2}$$

In formula (2), NWD(N_a, N_b) is the normalized Wasserstein distance, and C is a constant closely related to the data set.

Step 4.3: Based on the proportional relationship between CIoU and NWD, a new loss function is proposed:

$$L = (1-\beta)\,\bigl(1-\mathrm{NWD}(N_a, N_b)\bigr) + \beta\,(1-\mathrm{CIoU}) \tag{3}$$

In formula (3), L is the optimized loss function, β is the weighting coefficient, and CIoU is the loss term of the original YOLOv5s, calculated as:

$$\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b_A, b_B)}{c^2} - \alpha v \tag{4}$$

In formula (4), ρ²(b_A, b_B) is the squared Euclidean distance between the center points of the ground-truth box and the predicted box, c is the diagonal length of the smallest rectangle enclosing the predicted box and the ground-truth box, α is a weight factor, and v measures aspect-ratio consistency. Intersection over Union (IoU) is used in target detection to measure the overlap between the predicted bounding box and the ground-truth bounding box. A higher IoU means greater overlap between the predicted box and the ground-truth box and a better detection result. IoU is calculated as:

$$\mathrm{IoU} = \frac{\left|B \cap B^{GT}\right|}{\left|B \cup B^{GT}\right|} \tag{5}$$

In formula (5), B is the predicted box and B^GT is the ground-truth box.
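Steps 4.1 to 4.3 can be sketched in plain Python as below. This is an illustrative implementation of the combined loss for a single pair of boxes given as (cx, cy, w, h), not the code used in the embodiment. The values C = 12.8 and β = 0.5 are placeholders (the source says only that C depends on the data set and does not give β), and the expressions for v and α follow the commonly used CIoU definition, which the source does not spell out.

```python
import math

def wasserstein_sq(a, b):
    """Squared Wasserstein distance between Gaussians modeled from boxes (cx, cy, w, h)."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
            + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)

def nwd(a, b, C=12.8):
    """Normalized Wasserstein distance; C is a data-set constant (placeholder value)."""
    return math.exp(-math.sqrt(wasserstein_sq(a, b)) / C)

def corners(box):
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def iou(a, b):
    ax1, ay1, ax2, ay2 = corners(a)
    bx1, by1, bx2, by2 = corners(b)
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def ciou(a, b, eps=1e-9):
    """CIoU = IoU - rho^2 / c^2 - alpha * v (common definition of v and alpha assumed)."""
    ax1, ay1, ax2, ay2 = corners(a)
    bx1, by1, bx2, by2 = corners(b)
    # Squared diagonal of the smallest rectangle enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    rho2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    v = (4 / math.pi ** 2) * (math.atan(a[2] / a[3]) - math.atan(b[2] / b[3])) ** 2
    overlap = iou(a, b)
    alpha = v / (1 - overlap + v + eps)
    return overlap - rho2 / (c2 + eps) - alpha * v

def combined_loss(a, b, beta=0.5, C=12.8):
    """L = (1 - beta) * (1 - NWD) + beta * (1 - CIoU); beta is a placeholder value."""
    return (1 - beta) * (1 - nwd(a, b, C)) + beta * (1 - ciou(a, b))

truth = (50.0, 50.0, 8.0, 8.0)   # an 8x8 pixel ground-truth box
near = (60.0, 50.0, 8.0, 8.0)    # same size, shifted by 10 pixels
far = (70.0, 50.0, 8.0, 8.0)     # same size, shifted by 20 pixels
print(iou(truth, near), nwd(truth, near), nwd(truth, far))
print(combined_loss(truth, near), combined_loss(truth, far))
```

For the two shifted boxes IoU is zero in both cases, while NWD still decreases smoothly with distance, which is the property that makes it useful for very small targets.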

Step 5: Input the training set and validation set into the improved-YOLOv5s small target traffic sign model described in step 3 for training. This specifically comprises steps 5.1 to 5.4:

Step 5.1: Set the training parameters of the improved-YOLOv5s real-time small target traffic sign detection model, namely: learning rate, momentum, weight decay, optimizer, number of epochs and batch size;

In this embodiment, the optimizer is SGD, the initial learning rate lr0 is 0.01, the momentum is 0.937, the weight decay weight_decay is 0.0005, the batch size is 16, and the number of epochs is 300.

Step 5.2: Input the training and validation images with their labels into the improved-YOLOv5s real-time small target traffic sign detection model, and use the back-propagation algorithm to compute the gradient of the loss function with respect to the model parameters. Back-propagation is an efficient method for computing gradients that uses the chain rule to obtain the gradient of the loss function with respect to each parameter. Specifically, it propagates the loss backward from the output layer and computes the gradient of each parameter layer by layer. The gradient of a parameter expresses the rate of change of the loss function with respect to that parameter, that is, how the loss changes as the parameter changes. The model parameters are adjusted by minimizing the loss function so that they gradually approach the optimal solution.

Step 5.3: After the gradients of the model parameters have been computed, use the optimizer to update the parameters. The optimizer updates each parameter in the direction opposite to its gradient. Parameters with larger gradients are updated in larger steps, and parameters with smaller gradients in smaller steps. Iterating these updates gradually reduces the value of the loss function. Minimizing the loss function drives the parameters toward the optimal solution, that is, the parameter values at which the loss reaches its minimum and the difference between the model's predictions and the true values is smallest. Training continues until evaluation metrics such as mAP, recall R and precision P no longer improve.
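The update rule described in step 5.3 can be illustrated with a plain-Python SGD step that uses the hyperparameters stated for this embodiment. The toy objective f(x) = (x - 3)^2 and the names `HYP` and `sgd_step` are hypothetical. The sketch uses one common formulation of SGD with momentum and L2 weight decay and omits the learning-rate schedule and warm-up that a real training run would normally include.

```python
# Hyperparameters as stated in the embodiment.
HYP = {
    "optimizer": "SGD",
    "lr0": 0.01,
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "batch_size": 16,
    "epochs": 300,
}

def sgd_step(params, grads, velocity, lr, momentum, weight_decay):
    """One in-place SGD update with momentum and L2 weight decay."""
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p                   # weight decay adds a pull toward zero
        velocity[i] = momentum * velocity[i] + g   # accumulate momentum
        params[i] = p - lr * velocity[i]           # move against the gradient
    return params, velocity

# Toy problem: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, v = [0.0], [0.0]
for _ in range(300):
    grad = [2.0 * (x[0] - 3.0)]
    sgd_step(x, grad, v, HYP["lr0"], HYP["momentum"], HYP["weight_decay"])
print(x[0])
```

The parameter ends close to 3, slightly below it because the weight decay term pulls it toward zero.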

Step 6: Test the optimal model obtained in step 5 on the test set and evaluate the test results. When the accuracy requirements are met, the final real-time small target traffic sign detection model based on improved YOLOv5s is obtained. Specifically, step 6 further comprises steps 6.1 to 6.3:

Step 6.1: Input the test set into the optimal model obtained in step 5;

Step 6.2: Calculate the model performance indicators, namely precision P, recall R, mAP, number of parameters, computational complexity (GFLOPs) and model size, using the following formulas:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad AP = \int_0^1 P(R)\,dR, \qquad mAP = \frac{1}{m}\sum_{i=1}^{m} AP_i$$
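The accuracy indicators can be computed along the following lines. This is a minimal sketch: precision and recall follow directly from the TP, FP and FN counts, while the average precision is approximated as the area under a precision-recall curve using all-point interpolation, which is one common choice and not necessarily the one used in the embodiment. The function names and sample numbers are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision P = TP / (TP + FP) and recall R = TP / (TP + FN)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

def average_precision(recalls, precisions):
    """Area under the precision-recall curve (recalls in ascending order)."""
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Make the precision envelope non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values over all m classes."""
    return sum(ap_per_class) / len(ap_per_class)

p, r = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(p, r, ap, mean_average_precision([ap, 0.25]))
```

For mAP@0.5, a prediction would count as a true positive when its IoU with a ground-truth box of the same class is at least 0.5.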

In this embodiment, to verify the effect of the improved model, the test set is input into both the improved-YOLOv5s real-time small target traffic sign detection model and the original YOLOv5s model. The evaluation indicators are calculated for each model and are shown in Table 1. As Table 1 shows, the disclosed model achieves higher detection accuracy than the original YOLOv5s model on indicators such as precision P and mAP@0.5. On the number of parameters, model size and computational complexity (GFLOPs), the improved model is lower than the original YOLOv5s model, so it is more lightweight and easier to deploy on mobile devices.

Table 1 Comparative experimental results

The above is only one embodiment of the invention and does not limit its patent scope. Those skilled in the art may make various modifications and changes to the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. The real-time detection method for the small target traffic sign based on the improved YOLOv5s is characterized by comprising the following steps of:

step 1: acquiring traffic sign images to form a first data set; the traffic sign images in the first data set can be shot by a digital camera, collected from a network or obtained from a monitoring video;

step 2: adding annotation information to the images in the first data set to form a second data set, and dividing the second data set into a training set, a validation set and a test set;

step 3: constructing a small target traffic sign real-time detection model based on improved YOLOv5s, wherein the construction of the model further comprises steps 3.1 to 3.3:

step 3.1: combining a C3 module in the YOLOv5s model with ConvMixer to form a CSPCM module, and respectively replacing the C3 module of the last layer of the YOLOv5s backbone network and the C3 module of the last layer of the neck network with the CSPCM module;

step 3.2: replacing the rest of the C3 modules in the backbone network and the neck network of the YOLOv5s model with a lightweight convolution module C3_Faster, namely using C3_Faster to replace all C3 modules other than the last-layer C3 modules of the backbone network and the neck network already replaced in step 3.1;

step 3.3: adding a small target detection head to the output layer in addition to the existing 3 detection heads;

step 4: the new loss function is adopted, and the specific method is as follows:

introducing an NWD metric into the CIoU loss function and optimizing the CIoU loss function of YOLOv5s by using the NWD metric, wherein the optimized loss function formula is as follows:

$$L = (1-\beta)\,\bigl(1-\mathrm{NWD}(N_a, N_b)\bigr) + \beta\,(1-\mathrm{CIoU}) \tag{1}$$

in the formula (1), L is the optimized loss function, NWD is the normalized Wasserstein distance, N_a and N_b are the Gaussian distributions modeled from a = (cx_a, cy_a, w_a, h_a) and b = (cx_b, cy_b, w_b, h_b), a represents the real frame, b represents the predicted frame, cx_a, cy_a represent the coordinates of the center point of the real frame, w_a, h_a represent the width and height of the real frame; cx_b, cy_b represent the coordinates of the center point of the predicted frame, w_b, h_b represent the width and height of the predicted frame; β is a weight proportionality coefficient, CIoU is the loss function in the original YOLOv5s, and the CIoU calculation formula is:

$$\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b_A, b_B)}{c^2} - \alpha v \tag{2}$$

in the formula (2), ρ²(b_A, b_B) represents the squared Euclidean distance between the center points of the real frame and the predicted frame, c represents the diagonal distance of the smallest enclosing rectangle of the predicted frame and the real frame, α is the weight factor, v is the aspect ratio consistency, and IoU is the overlap ratio between the real frame and the predicted frame;

step 5: training the small target traffic sign real-time detection model based on the improved YOLOv5s by using the training set and the validation set, and saving the trained model as an optimal model, further comprising steps 5.1 to 5.4:

step 5.1: setting the training parameters of the small target traffic sign real-time detection model based on the improved YOLOv5s, wherein the model training parameters comprise: learning rate, momentum, weight decay, optimizer, number of epochs, batch size;

step 5.2: inputting the training set and validation set images and the corresponding labels into the small target traffic sign real-time detection model based on the improved YOLOv5s, calculating the gradient of the loss function with respect to the model parameters by using a back propagation algorithm, and adjusting the model parameters to gradually approach an optimal solution by minimizing the loss function;

step 5.3: updating the model parameters by using the SGD optimizer, so that the model parameters are updated in the direction of gradient descent until the loss functions of the training set and the validation set no longer decrease and the evaluation indices mAP, recall R and precision P no longer improve;

step 5.4: saving the trained model parameters as an optimal model;

step 6: testing the optimal model by using the test set and evaluating the test result, and, when the precision requirement is met, obtaining the final small target traffic sign real-time detection model based on the improved YOLOv5s.

2. The real-time detection method of small target traffic sign based on improved YOLOv5s according to claim 1, wherein in the step 2, the training set, the validation set and the test set are divided according to a ratio of 7:2:1.

3. The real-time detection method of small target traffic sign based on improved YOLOv5s according to claim 1, wherein the step 6 further comprises steps 6.1 to 6.3:

step 6.1: inputting the test set into the optimal model in the step 5;

step 6.2: calculating model performance indices: the precision P, the recall R, the mAP, the parameter quantity, the calculation complexity GFLOPs and the model size, according to the following specific calculation formulas:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad AP = \int_0^1 P(R)\,dR, \qquad mAP = \frac{1}{m}\sum_{i=1}^{m} AP_i$$

wherein P is the precision, R is the recall, mAP is the mean of the average precision over all the categories, AP is the average precision, m is the total number of categories of traffic sign, TP represents the number of positive samples correctly identified as positive samples, FP represents the number of negative samples incorrectly identified as positive samples, and FN represents the number of positive samples incorrectly identified as negative samples;

step 6.3: when the performance indices meet the precision requirement, obtaining the small target traffic sign real-time detection model based on the improved YOLOv5s.