CN115690704A - LG-CenterNet model-based complex road scene target detection method and device - Google Patents
- Publication number
- CN115690704A (Application No. CN202211179337.4A)
- Authority
- CN
- China
- Prior art keywords
- module
- model
- feature
- target
- road scene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
Description
Technical Field
The invention belongs to the fields of semantic segmentation, image processing, and intelligent driving, and in particular relates to a complex road scene object detection method and device based on an LG-CenterNet model.
Background Art
In recent years, the steady rise in the number of vehicles has led to frequent traffic accidents, which seriously threaten people's lives. With the development of autonomous driving technology, researchers have shifted from passive vehicle safety research to active vehicle safety research. Automating a vehicle requires advanced technical means to accomplish part of the driving task, and intelligent detection of road scene objects with deep learning is key to active vehicle safety. Current object detection networks mainly rely on a backbone network for feature extraction but give little consideration to low-level multi-scale issues, which can leave multi-scale object detection underpowered.
Summary of the Invention
Purpose of the invention: In view of the poor performance of current object detection in complex road scenes, where conventional detection methods cannot meet the requirements of real road environments, the invention provides a complex road scene object detection method and device based on the LG-CenterNet model.
Technical solution: The invention proposes a complex road scene object detection method based on the LG-CenterNet model, comprising the following steps:
(1) Process images of complex road scenes to obtain road object images containing multiple categories, label the category and position of each road object in the images, and construct and preprocess a complex road scene dataset;
(2) Construct the LG-CenterNet object detection model and train it on the above road object dataset to obtain model S; the LG-CenterNet model comprises a Backbone module, a hierarchical guided attention module, a Scales Encoder module, a deconvolution module, a feature enhancement module, and a Centerpoints prediction module;
(3) Use the trained model S, through the Centerpoints prediction module, to perform object localization, bounding-box sizing, and category prediction on complex road objects in the form of heatmaps, and display the results on video or images.
Further, the preprocessing of the road scene dataset in step (1) normalizes images of varying resolution from complex road scenes to a size of 512×512 pixels, then applies batch normalization, the ReLU activation function, and max pooling to obtain feature samples in which objects are uniformly distributed.
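The preprocessing chain above (resize to 512×512, normalize, ReLU, max-pool) can be sketched minimally in NumPy. This is an illustration only, not the patent's implementation: the nearest-neighbor resize and the 2×2 pooling window are assumptions.

```python
import numpy as np

def nearest_resize(img, size=512):
    # Nearest-neighbor resize of an H x W x C image to size x size
    # (stand-in for the patent's normalization to 512x512 pixels).
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def channel_norm(x, eps=1e-5):
    # Normalize each channel to zero mean / unit variance
    # (inference-style stand-in for batch normalization).
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0)

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2.
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

img = np.random.rand(480, 640, 3).astype(np.float32)  # a raw frame of arbitrary resolution
out = max_pool_2x2(relu(channel_norm(nearest_resize(img))))
print(out.shape)  # (256, 256, 3)
```

After this stage every sample shares a common geometry before entering the backbone.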
Further, step (2) is implemented as follows:
(21) The LG-CenterNet model introduces a new MresneIt50 as the Backbone module. MresneIt50 consists of multiple residual blocks: the feature map extracted by the 4 residual blocks is denoted E1, with 512 channels; the feature map extracted by the 6 residual blocks is denoted E2, with 1024 channels; and the feature map extracted by the 3 residual blocks is denoted E3, with 2048 channels;
(22) The feature maps E1, E2, and E3 extracted by the Backbone are fed into the hierarchical guided attention module, whose structure comprises two branches: a global pooling branch and a hierarchy-guided branch. The 512-channel feature map E1 enters the global pooling branch, where a global max pooling layer followed by an upsampling layer yields EC1; the feature maps E1, E2, and E3, with 512, 1024, and 2048 channels, enter the hierarchy-guided branch, where a series of average pooling and convolution operations combined with upsampling yields EC2. EC1 and EC2 are fused by element-wise addition (add) to obtain EC3, which reduces the number of computed parameters;
(23) The extracted EC3 is fed into the Scales Encoder module, where a series of convolution and residual-block operations yields EC4;
(24) The extracted EC4 is fed into the deconvolution module, which consists of 3 deconv groups. The convolution in each deconv group progressively enlarges the feature map while reducing the number of channels, yielding a feature map of size 128×128×64, denoted EC5;
(25) The feature map EC5 is fed into the feature enhancement module (P-FEM) for convolution, yielding a feature map EC6 of size 128×128×64. P-FEM is composed of a 3×3 Poly-Scale Convolution, batch normalization, a ReLU activation function, and a Sigmoid activation function; its main purpose is to strengthen the correlation of local information in the feature map and enhance its feature representation.
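Steps (21)–(22) can be sketched in NumPy as follows. This is only an illustrative reading of the text: the spatial resolutions (64×64, 32×32, 16×16 for a 512×512 input) follow the usual ResNet-50 stage strides, and the 1×1 projection used to align channel counts is an assumption, since the patent defines the exact layers only in Fig. 4.

```python
import numpy as np

def global_max_pool(x):
    # Collapse spatial dimensions to 1x1 per channel.
    return x.max(axis=(0, 1), keepdims=True)

def upsample_nearest(x, out_hw):
    # Nearest-neighbor upsampling to a target spatial size.
    h, w = x.shape[:2]
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return x[rows][:, cols]

def conv1x1(x, out_ch, rng):
    # Hypothetical 1x1 convolution (per-pixel linear map) used here only
    # to align channel counts; Fig. 4 defines the real layers.
    w = rng.standard_normal((x.shape[-1], out_ch)) * 0.01
    return x @ w

rng = np.random.default_rng(0)
E1 = rng.random((64, 64, 512))   # 512 channels, from 4 residual blocks
E2 = rng.random((32, 32, 1024))  # 1024 channels, from 6 residual blocks
E3 = rng.random((16, 16, 2048))  # 2048 channels, from 3 residual blocks

# Global pooling branch: global max pool on E1, broadcast back up.
EC1 = upsample_nearest(global_max_pool(E1), (64, 64))

# Hierarchy-guided branch: project each level and upsample to E1's resolution.
EC2 = sum(upsample_nearest(conv1x1(E, 512, rng), (64, 64)) for E in (E1, E2, E3))

# Feature union by element-wise add (no concatenation, so fewer parameters).
EC3 = EC1 + EC2
print(EC3.shape)  # (64, 64, 512)
```

The element-wise add keeps the fused map at 512 channels, which is what makes it cheaper than a channel concatenation.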
Further, step (3) is implemented as follows:
The Centerpoints prediction module uses the trained model S to classify and predict the input image, generating from the original image a heatmap with the same scale as EC6. The heatmap loss, denoted L_k, the target width/height loss, denoted L_s, and the center-point offset loss, denoted L_f, are then computed to determine the position and size of each object and to generate the final classification-and-localization heatmap. The overall network loss is:
L_d = L_k + λ_s·L_s + λ_f·L_f
where λ_s = 0.1 and λ_f = 1. For an input image of size 512×512, the feature map generated by the network is of size H×W×C. L_k takes the penalty-reduced focal-loss form standard for center-point heatmaps,

L_k = −(1/N) Σ_HWC { (1 − A′_HWC)^α ln(A′_HWC), if A_HWC = 1; (1 − A_HWC)^β (A′_HWC)^α ln(1 − A′_HWC), otherwise },

L_s is the L1 loss over the predicted sizes,

L_s = (1/N) Σ_{k=1}^{N} |s′_pk − s_k|,

and L_f is the corresponding L1 loss on the center-point offsets. Here, A_HWC is the ground-truth value of the object annotation in the image, A′_HWC is the predicted value, α and β are 2 and 4 respectively, N is the number of keypoints in the image, s′_pk is the predicted size, s_k is the true size, and p is the position of the object's center point in the image.
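A minimal NumPy sketch of this loss combination follows. The focal form of L_k and the L1 form of L_s and L_f are the standard CenterNet losses, which match the variables defined above; the exact formulas in the original filing appear only as figures, so treat this as an assumption-laden illustration.

```python
import numpy as np

def heatmap_focal_loss(A_pred, A_true, alpha=2, beta=4):
    # L_k: pixel-wise penalty-reduced focal loss over the center heatmap,
    # with alpha = 2, beta = 4 and N = number of keypoints, as in the text.
    eps = 1e-12
    pos = A_true == 1
    n = max(int(pos.sum()), 1)
    pos_term = ((1 - A_pred[pos]) ** alpha * np.log(A_pred[pos] + eps)).sum()
    neg_term = ((1 - A_true[~pos]) ** beta * A_pred[~pos] ** alpha
                * np.log(1 - A_pred[~pos] + eps)).sum()
    return -(pos_term + neg_term) / n

def l1_loss(pred, true):
    # L1 form used for both the size loss L_s and the offset loss L_f.
    return float(np.abs(pred - true).mean())

def total_loss(L_k, L_s, L_f, lam_s=0.1, lam_f=1.0):
    # L_d = L_k + lambda_s * L_s + lambda_f * L_f, with lambda_s = 0.1, lambda_f = 1.
    return L_k + lam_s * L_s + lam_f * L_f

# Toy 128x128 heatmap with one keypoint (made-up values for illustration).
A_true = np.zeros((128, 128)); A_true[64, 64] = 1.0
A_pred = np.full((128, 128), 0.01); A_pred[64, 64] = 0.9
L_k = heatmap_focal_loss(A_pred, A_true)
L_s = l1_loss(np.array([30.0, 18.0]), np.array([32.0, 20.0]))  # predicted vs true w/h
L_f = l1_loss(np.array([0.4, 0.6]), np.array([0.5, 0.5]))      # predicted vs true offset
print(total_loss(L_k, L_s, L_f))
```

With λ_s = 0.1 the size term is deliberately down-weighted relative to the heatmap and offset terms.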
Based on the same inventive concept, the invention further provides a complex road scene object detection device based on the LG-CenterNet model, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the computer program is loaded into the processor, it implements the above complex road scene object detection method based on the LG-CenterNet model.
Beneficial effects: Compared with the prior art, the invention offers the following benefits: 1. The backbone network of the LG-CenterNet model is improved by proposing MresneIt50 to strengthen feature extraction; 2. A hierarchical guided attention module is proposed to fuse the feature maps extracted by the backbone network; 3. A new Scales Encoder module and a feature enhancement module are proposed to focus on local feature extraction and avoid the feature loss that occurs in the deconvolution module; 4. The improved LG-CenterNet detection model raises mean average precision (mAP) by 5 percentage points over the original CenterNet framework; 5. The invention also achieves high detection accuracy in complex road scenes.
Brief Description of the Drawings
Fig. 1 is a flowchart of the complex road scene object detection method based on the LG-CenterNet model;
Fig. 2 is a schematic diagram of the LG-CenterNet object detection model proposed by the invention;
Fig. 3 is a schematic diagram of the proposed residual block structure Mblock;
Fig. 4 is a schematic diagram of the hierarchical guided attention module;
Fig. 5 is a schematic diagram of the Scales Encoder module;
Fig. 6 is a schematic diagram of the feature enhancement module;
Fig. 7 shows detection results obtained with the LG-CenterNet object detection model.
Detailed Description of Embodiments
The invention is described in further detail below with reference to the accompanying drawings.
This embodiment involves a number of variables, which are described in Table 1.
Table 1. Variable descriptions
The invention provides a complex road scene object detection method based on the LG-CenterNet model. Images of different road scene objects are collected and labeled to build a complex road scene dataset. The proposed MresneIt50 serves as the backbone network for feature extraction; the feature maps of different scales extracted by the backbone are fed into the hierarchical guided attention module, multiple receptive-field features are then obtained through the Scales Encoder module, the deconvolution module restores feature resolution, and a feature enhancement module built on Poly-Scale Convolution (PSConv) improves the information correlation of local features. Finally, the Centerpoints prediction module predicts the position of each object's center point, the scale of the prediction box, and the center-point offset, while identifying the object category. As shown in Fig. 1, the method comprises the following steps:
Step 1: Process images of complex road scenes to obtain road object images containing multiple categories, preprocess them, and label the category and position of each road object to construct a complex road scene dataset.
The dataset preprocessing normalizes images of varying resolution from complex road scenes to a size of 512×512 pixels, then applies batch normalization, the ReLU activation function, and max pooling so that object samples are distributed relatively uniformly across the image.
Step 2: Construct the LG-CenterNet object detection model, whose structure is shown in Fig. 2, and train it on the above road object dataset to obtain model S. The LG-CenterNet network mainly comprises a Backbone module, a hierarchical guided attention module (Levels guide attention, LGA), a Scales Encoder module, a deconvolution module, a feature enhancement module (P-Feature enhancement module, P-FEM), and a Centerpoints prediction module.
(21) The LG-CenterNet model introduces a new MresneIt50 as the Backbone module. MresneIt50 consists of multiple residual blocks (Mblock); the Mblock structure is shown in Fig. 3. The feature map extracted by the 4 residual blocks is denoted E1, with 512 channels; the feature map extracted by the 6 residual blocks is denoted E2, with 1024 channels; and the feature map extracted by the 3 residual blocks is denoted E3, with 2048 channels.
(22) The feature maps E1, E2, and E3 extracted by the Backbone are fed into the hierarchical guided attention module (Levels guide attention, LGA), whose structure is shown in Fig. 4. It comprises two branches: a global pooling branch and a hierarchy-guided branch. The 512-channel feature map E1 enters the global pooling branch, where a global max pooling layer followed by an upsampling layer yields EC1; the feature maps E1, E2, and E3, with 512, 1024, and 2048 channels, enter the hierarchy-guided branch, where a series of average pooling and convolution operations combined with upsampling yields EC2. EC1 and EC2 are fused by element-wise addition (add) to obtain EC3, which reduces the number of computed parameters.
(23) The extracted EC3 is fed into the Scales Encoder module, whose structure is shown in Fig. 5; a series of convolution and residual-block operations yields EC4.
(24) The extracted EC4 is fed into the deconvolution module, which consists of 3 deconv groups. The convolution in each deconv group progressively enlarges the feature map while reducing the number of channels, yielding a feature map of size 128×128×64, denoted EC5.
(25) The feature map EC5 is fed into P-FEM for convolution, yielding a feature map EC6 of size 128×128×64. P-FEM is composed of a 3×3 Poly-Scale Convolution (PSConv), batch normalization, a ReLU activation function, and a Sigmoid activation function; its main purpose is to strengthen the correlation of local information in the feature map and enhance its feature representation. The P-FEM structure is shown in Fig. 6.
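A minimal NumPy sketch of the enhancement in step (25). The plain depthwise 3×3 convolution stands in for PSConv (which additionally varies dilation across channel groups), and wiring the Sigmoid as a gate that rescales the ReLU output is an assumption; the exact topology is defined by Fig. 6.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depthwise_conv3x3(x, kernels):
    # Plain 3x3 depthwise convolution with zero padding; a simplified
    # stand-in for PSConv, which also varies dilation across channel groups.
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += padded[i:i + h, j:j + w] * kernels[i, j]
    return out

def feature_enhance(x, kernels):
    # conv -> batch-norm-style normalization -> ReLU, then a Sigmoid gate
    # rescaling the activations (assumed wiring; Fig. 6 defines the topology).
    y = depthwise_conv3x3(x, kernels)
    y = (y - y.mean(axis=(0, 1))) / (y.std(axis=(0, 1)) + 1e-5)
    y = np.maximum(y, 0)
    return y * sigmoid(y)

rng = np.random.default_rng(0)
EC5 = rng.random((128, 128, 64)).astype(np.float32)  # output of the deconvolution module
EC6 = feature_enhance(EC5, rng.standard_normal((3, 3, 64)).astype(np.float32) * 0.1)
print(EC6.shape)  # (128, 128, 64)
```

The gate leaves the 128×128×64 shape unchanged while re-weighting locally strong responses, which is one plausible reading of "improving local information correlation".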
Step 3: Use the trained model S, through the Centerpoints prediction module, to perform object localization, bounding-box sizing, and category prediction on road scene objects in the form of heatmaps, and display the results on video or images.
The Centerpoints prediction module uses the trained model S to classify and predict the input image, generating from the original image a heatmap with the same scale as EC6. The heatmap loss, denoted L_k, the target width/height (size) loss, denoted L_s, and the center-point offset loss, denoted L_f, are then computed to determine the position and size of each object and to generate the final classification-and-localization heatmap. The overall network loss is L_d:
L_d = L_k + λ_s·L_s + λ_f·L_f
where λ_s = 0.1 and λ_f = 1. For an input image of size 512×512, the feature map generated by the network is of size H×W×C. L_k takes the penalty-reduced focal-loss form standard for center-point heatmaps,

L_k = −(1/N) Σ_HWC { (1 − A′_HWC)^α ln(A′_HWC), if A_HWC = 1; (1 − A_HWC)^β (A′_HWC)^α ln(1 − A′_HWC), otherwise },

L_s is the L1 loss over the predicted sizes,

L_s = (1/N) Σ_{k=1}^{N} |s′_pk − s_k|,

and L_f is the corresponding L1 loss on the center-point offsets. Here, A_HWC is the ground-truth value of the object annotation in the image, A′_HWC is the predicted value, α and β are 2 and 4 respectively, N is the number of keypoints in the image, s′_pk is the predicted size, s_k is the true size, and p is the position of the object's center point in the image.
Based on the same inventive concept, the invention further provides a complex road scene object detection device based on the LG-CenterNet model, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the computer program is loaded into the processor, it implements the above complex road scene object detection method based on the LG-CenterNet model. Detection results are shown in Fig. 7.
The self-built complex scene dataset is trained through the LG-CenterNet network to obtain a model that can recognize objects in complex scenes, and model performance is verified on the validation split of the dataset, as shown in Fig. 7. On the self-built complex road scene dataset, the invention achieves an average recognition precision of 86.93% and a detection speed of 50 frames/s on road scene images, meeting the requirements of accurate, real-time road scene detection.
Here, Precision is the precision, Recall is the recall rate, AP is the average precision of a class, mAP is the mean average precision, FPS is the frame rate, and t is the time to detect a single image. The dataset contains multiple sample classes (e.g. car, person), and n denotes the number of sample classes. TP (True Positives) is the number of positive samples identified as positive (i.e. the number of car samples identified as car); TN (True Negatives) is the number of negative samples the model identifies as negative; FP (False Positives) is the number of negative samples the model identifies as positive (i.e. samples that are not car but that the model identifies as car); FN (False Negatives) is the number of positive samples the model identifies as negative.
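These quantities combine in the usual way; a small pure-Python sketch (the counts are made-up values for illustration, not results from the patent's experiments):

```python
def precision(tp, fp):
    # Precision = TP / (TP + FP)
    return tp / (tp + fp)

def recall(tp, fn):
    # Recall = TP / (TP + FN)
    return tp / (tp + fn)

def mean_ap(ap_per_class):
    # mAP: mean of the per-class average precisions over the n classes.
    return sum(ap_per_class) / len(ap_per_class)

# Hypothetical counts for the "car" class: 90 correct detections,
# 10 false alarms, 20 missed objects.
p = precision(90, 10)            # 0.9
r = recall(90, 20)               # ~0.818
m = mean_ap([0.91, 0.85, 0.84])  # ~0.867

# Detection time per image t = 1 / FPS: at 50 frames/s, 20 ms per image.
t_ms = 1000 / 50
print(p, round(r, 3), round(m, 3), t_ms)
```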
The embodiments of the invention have been described in detail above with reference to the accompanying drawings, but the invention is not limited to these embodiments; various changes may be made within the knowledge of those of ordinary skill in the art without departing from the spirit of the invention.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211179337.4A CN115690704B (en) | 2022-09-27 | 2022-09-27 | Object detection method and device for complex road scenes based on LG-CenterNet model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211179337.4A CN115690704B (en) | 2022-09-27 | 2022-09-27 | Object detection method and device for complex road scenes based on LG-CenterNet model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115690704A true CN115690704A (en) | 2023-02-03 |
CN115690704B CN115690704B (en) | 2023-08-22 |
Family
ID=85063352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211179337.4A Active CN115690704B (en) | 2022-09-27 | 2022-09-27 | Object detection method and device for complex road scenes based on LG-CenterNet model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115690704B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117690165A (en) * | 2024-02-02 | 2024-03-12 | 四川泓宝润业工程技术有限公司 | Method and device for detecting personnel passing between drill rod and hydraulic pliers |
Citations (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717537A (en) * | 2018-05-30 | 2018-10-30 | 淮阴工学院 | A kind of face identification method and system of the complex scene based on pattern-recognition |
CN110543895A (en) * | 2019-08-08 | 2019-12-06 | 淮阴工学院 | An Image Classification Method Based on VGGNet and ResNet |
CN111382714A (en) * | 2020-03-13 | 2020-07-07 | Oppo广东移动通信有限公司 | Image detection method, device, terminal and storage medium |
CN111709895A (en) * | 2020-06-17 | 2020-09-25 | 中国科学院微小卫星创新研究院 | Blind image deblurring method and system based on attention mechanism |
CN111814889A (en) * | 2020-07-14 | 2020-10-23 | 大连理工大学人工智能大连研究院 | A One-Stage Object Detection Method Using Anchor-Free Module and Boosted Classifier |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN112329800A (en) * | 2020-12-03 | 2021-02-05 | 河南大学 | Salient object detection method based on global information guiding residual attention |
CN112580443A (en) * | 2020-12-02 | 2021-03-30 | 燕山大学 | Pedestrian detection method based on embedded device improved CenterNet |
CN112686207A (en) * | 2021-01-22 | 2021-04-20 | 北京同方软件有限公司 | Urban street scene target detection method based on regional information enhancement |
CN112700444A (en) * | 2021-02-19 | 2021-04-23 | 中国铁道科学研究院集团有限公司铁道建筑研究所 | Bridge bolt detection method based on self-attention and central point regression model |
CN113378815A (en) * | 2021-06-16 | 2021-09-10 | 南京信息工程大学 | Model for scene text positioning recognition and training and recognition method thereof |
CN113408498A (en) * | 2021-08-05 | 2021-09-17 | 广东众聚人工智能科技有限公司 | Crowd counting system and method, equipment and storage medium |
CN113657326A (en) * | 2021-08-24 | 2021-11-16 | 陕西科技大学 | A Weed Detection Method Based on Multiscale Fusion Module and Feature Enhancement |
WO2021244621A1 (en) * | 2020-06-04 | 2021-12-09 | 华为技术有限公司 | Scenario semantic parsing method based on global guidance selective context network |
CN114359153A (en) * | 2021-12-07 | 2022-04-15 | 湖北工业大学 | Insulator defect detection method based on improved CenterNet |
CN114419589A (en) * | 2022-01-17 | 2022-04-29 | 东南大学 | A road target detection method based on attention feature enhancement module |
CN114581866A (en) * | 2022-01-24 | 2022-06-03 | 江苏大学 | Multi-target visual detection algorithm for automatic driving scene based on improved CenterNet |
CN114638836A (en) * | 2022-02-18 | 2022-06-17 | 湖北工业大学 | An urban streetscape segmentation method based on highly effective driving and multi-level feature fusion |
US20220204035A1 (en) * | 2020-12-28 | 2022-06-30 | Hyundai Mobis Co., Ltd. | Driver management system and method of operating same |
US20220237403A1 (en) * | 2021-01-28 | 2022-07-28 | Salesforce.Com, Inc. | Neural network based scene text recognition |
CN114863368A (en) * | 2022-07-05 | 2022-08-05 | 城云科技(中国)有限公司 | Multi-scale target detection model and method for road damage detection |
CN115035361A (en) * | 2022-05-11 | 2022-09-09 | 中国科学院声学研究所南海研究站 | Target detection method and system based on attention mechanism and feature cross fusion |
- 2022-09-27: CN application CN202211179337.4A filed; granted as CN115690704B (status: Active)
Non-Patent Citations (3)
Title |
---|
Yu Fangcheng et al.: "Small Object Detection for Autonomous Driving Based on Improved CenterNet", 《HTTP://KNS.CNKI.NET/KCMS/DETAIL/11.2175.TN.20220719.1838.026.HTML》, pages 1 - 8 * |
Yu Fangcheng et al.: "Small Object Detection for Autonomous Driving Based on Improved CenterNet", 《Electronic Measurement Technology》, vol. 45, no. 15, pages 115 - 122 * |
Cheng Yi et al.: "Improved CenterNet Traffic Sign Detection Algorithm", 《Journal of Signal Processing》, vol. 38, no. 3, pages 511 - 518 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117690165A (en) * | 2024-02-02 | 2024-03-12 | 四川泓宝润业工程技术有限公司 | Method and device for detecting personnel passing between drill rod and hydraulic pliers |
Also Published As
Publication number | Publication date |
---|---|
CN115690704B (en) | 2023-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
Wang et al. | FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection | |
CN109558832B (en) | Human body posture detection method, device, equipment and storage medium | |
CN108492319B (en) | Moving target detection method based on deep full convolution neural network | |
WO2021155792A1 (en) | Processing apparatus, method and storage medium | |
CN108280397B (en) | Human body image hair detection method based on deep convolutional neural network | |
CN110569738B (en) | Natural scene text detection method, equipment and medium based on densely connected network | |
CN110175613A (en) | Street view image semantic segmentation method based on Analysis On Multi-scale Features and codec models | |
CN111340123A (en) | Image score label prediction method based on deep convolutional neural network | |
CN107944020A (en) | Facial image lookup method and device, computer installation and storage medium | |
CN111210446B (en) | Video target segmentation method, device and equipment | |
CN113487610B (en) | Herpes image recognition method and device, computer equipment and storage medium | |
CN116012709B (en) | High-resolution remote sensing image building extraction method and system | |
CN113936195B (en) | Sensitive image recognition model training method and device and electronic equipment | |
CN113033454B (en) | A detection method for building changes in urban video cameras | |
CN114266794A (en) | Pathological section image cancer region segmentation system based on full convolution neural network | |
CN116665054A (en) | Remote sensing image small target detection method based on improved YOLOv3 | |
CN117726575A (en) | Medical image segmentation method based on multi-scale feature fusion | |
CN114743257A (en) | Method for detecting and identifying image target behaviors | |
CN116503726A (en) | Multi-scale light smoke image segmentation method and device | |
CN115690704B (en) | Object detection method and device for complex road scenes based on LG-CenterNet model | |
CN110287970B (en) | A Weakly Supervised Object Localization Method Based on CAM and Masking | |
Li et al. | Incremental learning of infrared vehicle detection method based on SSD | |
CN103927517B (en) | Motion detection method based on human body global feature histogram entropies | |
CN114898290A (en) | Real-time detection method and system for marine ship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
Application publication date: 20230203
Assignee: Jiangsu Kesheng Xuanyi Technology Co.,Ltd.
Assignor: HUAIYIN INSTITUTE OF TECHNOLOGY
Contract record no.: X2023980048436
Denomination of invention: Method and device for complex road scene object detection based on LG-CenterNet model
Granted publication date: 20230822
License type: Common License
Record date: 20231129