WO2022237139A1 - LaneSegNet-based lane line detection method and system - Google Patents

LaneSegNet-based lane line detection method and system

Info

Publication number
WO2022237139A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
convolution
lane line
lanesegnet
image
Prior art date
Application number
PCT/CN2021/135474
Other languages
French (fr)
Chinese (zh)
Inventor
高尚兵
胡序洋
汪长春
陈浩霖
蔡创新
相林
于永涛
周君
朱全银
张正伟
郝明阳
张骏强
李�杰
李少凡
Original Assignee
淮阴工学院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 淮阴工学院
Publication of WO2022237139A1 publication Critical patent/WO2022237139A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods

Definitions

  • the invention belongs to the technical field of computer vision, and in particular relates to a lane line detection method and system based on LaneSegNet (lane line segmentation network).
  • LaneSegNet: lane line segmentation network
  • current traditional lane line detection methods, such as Hough-transform-based detection, first use image processing to extract lane line edge features and then use the Hough transform to detect and fit the lines.
  • such methods work only under uniform illumination in simple scenes without occlusion or blur, and their robustness is poor.
  • the end-to-end lane line detection proposed by Davy Neven et al. first detects lane lines with a deep neural network, then clusters them with a clustering algorithm, and finally fits the lane lines with polynomials.
  • however, this detection process is relatively complicated and time-consuming, making it difficult to meet real-time requirements.
  • the present invention provides a lane line detection method and system based on LaneSegNet.
  • a lane line detection method based on LaneSegNet, comprising the following steps:
  • the LaneSegNet network model includes an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules, connected in sequence;
  • the initial module is used to halve the size of the input image
  • the convolutional downsampling module is used to extract lane line feature information
  • the enhanced receptive field module is used to increase the receptive field of the network
  • the enhanced feature module is used to enhance lane line information
  • the convolutional upsampling module is used to restore image size and image features;
  • the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module
  • the second enhanced feature module is connected to the outputs of the second convolutional downsampling module and the first convolutional upsampling module
  • the second convolutional upsampling module is connected to the output of the second enhanced feature module
  • the third convolutional upsampling module is connected to the output of the first enhanced feature module;
  • step (3): apply the DBSCAN algorithm to the binary image obtained in step (2) to cluster the lane line pixel coordinates, separate the lane lines into different classes, and fit each class with a quadratic polynomial;
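Step (3) can be sketched with scikit-learn's DBSCAN and NumPy's polynomial fitting. This is a minimal illustration on synthetic lane pixel coordinates, not the patent's implementation; the `eps` and `min_samples` values are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def fit_lanes(points, eps=15.0, min_samples=10):
    """Cluster lane pixel coordinates with DBSCAN, then fit a quadratic
    x = a*y^2 + b*y + c to each cluster (x as a function of y, since
    lane lines are roughly vertical in road images)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    fits = {}
    for lab in set(labels) - {-1}:          # -1 marks DBSCAN noise points
        cluster = points[labels == lab]
        ys, xs = cluster[:, 0], cluster[:, 1]
        fits[lab] = np.polyfit(ys, xs, 2)   # quadratic polynomial fit
    return fits

# Synthetic example: two well-separated "lane" pixel clouds
ys = np.arange(0, 100, dtype=float)
left = np.stack([ys, 0.01 * ys**2 + 100], axis=1)
right = np.stack([ys, 0.01 * ys**2 + 400], axis=1)
points = np.vstack([left, right])
fits = fit_lanes(points)
print(len(fits))   # 2: each lane becomes one cluster
```

In a real pipeline `points` would be the `(row, col)` coordinates of nonzero pixels in the binary segmentation map, e.g. `np.argwhere(binary > 0)`.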
  • the initial module includes a convolutional layer with kernel size k×k and stride 1, a convolutional layer with kernel size k×k and stride 2, a max pooling layer, and a concatenation layer; the two convolutional layers are connected in sequence, the convolutional path and the pooling path are connected in parallel, and k is 3 or 5.
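The size-halving behaviour of the initial module can be checked with the standard convolution output-size formula. The 2×2, stride-2 max-pool configuration and the 512-pixel input are assumptions for illustration; the text does not give the pooling parameters.

```python
def conv_out(size, k, stride, pad):
    # standard convolution output-size formula (per spatial axis)
    return (size + 2 * pad - k) // stride + 1

H = 512                                   # example input height (assumed)
h1 = conv_out(H, k=3, stride=1, pad=1)    # stride-1 conv keeps the size
h2 = conv_out(h1, k=3, stride=2, pad=1)   # stride-2 conv halves it
pool = conv_out(H, k=2, stride=2, pad=0)  # assumed 2x2/stride-2 max pool
assert h2 == pool == H // 2               # both parallel paths agree, so
print(h2)                                 # they can be concatenated: 256
```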
  • the three convolutional downsampling modules share the same unit structure: a first branch in which a 1×1 convolution, two k×k convolutions, and a 1×1 convolution are connected in series, and a second branch in which three dilated (atrous) convolutions with dilation rates 1, 2, and 5 are connected in series; the first and second branches are connected in parallel, and the module input is added to the outputs of the two parallel branches. The result then feeds a third branch (a 1×1 convolution, two k×k convolutions, and a 1×1 convolution in series) in parallel with a fourth branch containing a max pooling layer; the outputs of the third and fourth branches are added, and k is 3 or 5.
  • the enhanced receptive field module includes three atrous convolution branches: the first contains one k×k convolution with dilation rate 1, and the second contains four consecutive k×k convolutions with dilation rates 2, 5, 9, and 13; the third branch has the same structure as the second.
  • the output of the first atrous convolution branch is the input of the second, the output of the second is the input of the third, and finally the outputs of the three branches are added together; k is 3 or 5.
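The receptive-field growth of these chained dilated convolutions can be computed with the usual rule for stride-1 stacks: each layer adds (kernel − 1) × dilation pixels per axis. A quick check for k = 3 (the per-axis field, under the assumption that no other layers intervene):

```python
def stacked_rf(layers):
    """Receptive field of stride-1 convolutions applied in sequence:
    each (kernel, dilation) layer adds (kernel - 1) * dilation pixels."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

branch1 = [(3, 1)]                        # one 3x3 conv, dilation rate 1
branch2 = [(3, 2), (3, 5), (3, 9), (3, 13)]
branch3 = list(branch2)                   # same structure as branch 2

# Branches are chained (output of one feeds the next), so fields accumulate:
print(stacked_rf(branch1))                      # 3
print(stacked_rf(branch1 + branch2))            # 61
print(stacked_rf(branch1 + branch2 + branch3))  # 119
```

This shows why the module enlarges the field enough to cover lane lines that span a large extent of the image.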
  • the two enhanced feature modules have the same structure: a first k×k asymmetric convolution; global average pooling and global max pooling in parallel; a second k×k asymmetric convolution; a 1×1 convolution; a sigmoid activation layer; and a threshold layer whose output is multiplied with the module input; k is 3 or 5 (the threshold function is given as a formula image in the original filing).
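The abstract notes that the asymmetric convolutions reduce network parameters. A simple parameter count shows the saving for a k×k kernel replaced by a k×1 plus 1×k pair; the assumption that the channel count is preserved between the two halves (and biases are ignored) is ours.

```python
def square_conv_params(k, c_in, c_out):
    # one k x k convolution, no bias
    return k * k * c_in * c_out

def asym_conv_params(k, c_in, c_out):
    # k x 1 convolution followed by 1 x k, channels kept at c_out between them
    return k * c_in * c_out + k * c_out * c_out

k, c = 3, 64
print(square_conv_params(k, c, c))   # 36864 weights
print(asym_conv_params(k, c, c))     # 24576 weights: a 1/3 saving for k = 3
```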
  • the four convolutional upsampling modules have the same structure: a 1×1 convolution, a k×k convolution, two parallel branches (a transposed convolution and an upsampling layer), and a 1×1 convolution, connected in sequence; each convolution is followed by Batch Normalization and a PReLU nonlinear activation.
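The two parallel upsampling branches (elsewhere in the document identified as Keras `Conv2DTranspose` and `UpSampling2D`) must produce the same spatial size before the final 1×1 convolution. With Keras-style `'same'` padding both double the input; a shape check under that assumption:

```python
def conv2d_transpose_same(size, stride):
    # Keras Conv2DTranspose with padding='same': output = input * stride
    return size * stride

def upsampling2d(size, factor):
    # Keras UpSampling2D: nearest-neighbour repetition by `factor`
    return size * factor

h = 64                                   # example feature-map height (assumed)
a = conv2d_transpose_same(h, stride=2)   # learned 2x upsampling
b = upsampling2d(h, factor=2)            # parameter-free 2x upsampling
assert a == b                            # shape-compatible for merging
print(a, b)
```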
  • training the LaneSegNet network model in step (2) comprises:
  • a lane line detection system based on LaneSegNet, including:
  • a preprocessing module, used to perform polygon filling on the road image and obtain an ROI image containing the lane lines;
  • a lane line recognition module, used to input the ROI image into the trained LaneSegNet network model and obtain a binary image containing the lane lines;
  • the LaneSegNet network model includes an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules, connected in sequence; the initial module halves the size of the input image, the convolutional downsampling modules extract lane line feature information, the enhanced receptive field module increases the receptive field of the network, the enhanced feature modules enhance lane line information, and the convolutional upsampling modules restore image size and image features. The first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second convolutional downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module;
  • a lane line fitting module, used to cluster the lane line pixel coordinates in the binary image output by the LaneSegNet network model using the DBSCAN algorithm, separate the lane lines into different classes, and fit each class with a quadratic polynomial;
  • a result output module, used to display the fitted lane lines on the original road image to visualize the lane line detection.
  • the present invention also provides a lane line detection system based on LaneSegNet, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when loaded into the processor, the computer program implements the described LaneSegNet-based lane line detection method.
  • the beneficial effects of the present invention are as follows:
  • the enhanced receptive field module increases the receptive field of the network model, avoiding the segmentation difficulty caused by lane lines spanning a large extent of the image.
  • the enhanced feature module removes features irrelevant to the task, making the network focus on extracting lane line features and avoiding the segmentation difficulty caused by the small proportion of lane line pixels in the image.
  • Fig. 1 is the method flowchart of the embodiment of the present invention
  • Fig. 2 is the LaneSegNet model network structural diagram in the embodiment of the present invention.
  • Fig. 3 is the initial module network structure diagram in the embodiment of the present invention.
  • FIG. 4 is a network structure diagram of an encoding module in an embodiment of the present invention.
  • FIG. 5 is a network structure diagram of an enhanced receptive field module in an embodiment of the present invention.
  • FIG. 6 is a network structure diagram of an enhanced feature module in an embodiment of the present invention.
  • FIG. 7 is a network structure diagram of a decoding module model in an embodiment of the present invention.
  • Fig. 8 is a schematic diagram of some data sets used in the embodiment of the present invention.
  • Fig. 9 is a schematic diagram of some data set tags used in the embodiment of the present invention.
  • Fig. 10 is the data set picture labeling process in the embodiment of the present invention.
  • Fig. 11 is a segmentation diagram of an example of road dividing line detection in the embodiment of the present invention.
  • Fig. 12 is a fitting diagram of lane lines in an embodiment of the present invention.
  • Fig. 13 is the mIoU curve during training in the embodiment of the present invention.
  • Fig. 14 is the accuracy curve during training in the embodiment of the present invention.
  • a lane line detection method based on LaneSegNet disclosed in an embodiment of the present invention first performs polygon filling on the road image to obtain the ROI (region of interest) containing the lane lines; the ROI image is then input into the trained LaneSegNet network model to obtain a binary image containing the lane lines; the DBSCAN algorithm then clusters the lane line pixel coordinates in the binary image, and quadratic polynomials are fitted to the different lane line classes; finally, the fitted lane lines are displayed on the original road image to visualize the lane line detection.
  • preprocess the road video captured by the camera to obtain valid road video data, extract frames to obtain the road images to be labeled, and use a labeling tool to mark the lane lines in each image, producing road images with lane line annotations.
  • the constructed data set is obtained.
  • the LaneSegNet model constructed by the embodiment of the present invention mainly includes an encoding module, a decoding module, an enhanced receptive field module and an enhanced feature module.
  • the encoding module mainly includes three identical convolutional downsampling modules, which expand the receptive field of the network by using dilated (atrous) convolution;
  • the decoding module mainly includes four identical convolutional upsampling modules, which restore feature information and image size; the enhanced receptive field module further increases the receptive field of the network; the enhanced feature module strengthens information relevant to the current task and discards irrelevant information.
  • the model structure specifically includes an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules, connected in sequence; the initial module is used to halve the size of the input image
  • the convolutional downsampling module is used to extract lane line feature information
  • the enhanced receptive field module is used to increase the receptive field of the network
  • the enhanced feature module is used to enhance the lane line information
  • the convolutional upsampling module is used to restore the image size and image features
  • the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module
  • the second enhanced feature module is connected to the outputs of the second convolutional downsampling module and the first convolutional upsampling module
  • the second convolutional upsampling module is connected to the output of the second enhanced feature module
  • the third convolutional upsampling module is connected to the output of the first enhanced feature module
  • the initial module includes a convolutional layer with kernel size 3×3 and stride 1, a convolutional layer with kernel size 3×3 and stride 2, a max pooling layer, and a concatenation layer; the two convolutional layers are connected in sequence, and the convolutional path and the pooling path are connected in parallel.
  • the unit structure of the three convolutional downsampling modules is the same, as shown in Figure 4: a first branch in which a 1×1 convolution, two 3×3 convolutions, and a 1×1 convolution are connected in series, and a second branch in which three dilated (atrous) convolutions with dilation rates 1, 2, and 5 are connected in series; the first and second branches are connected in parallel, and the module input is added to the outputs of the two parallel branches. The result then feeds a third branch (a 1×1 convolution, two 3×3 convolutions, and a 1×1 convolution in series) in parallel with a fourth branch containing a max pooling layer; the outputs of the third and fourth branches are added.
  • the enhanced receptive field module includes three atrous convolution branches: the first contains one 3×3 convolution with dilation rate 1, and the second contains four consecutive 3×3 convolutions with dilation rates 2, 5, 9, and 13; the third branch has the same structure as the second.
  • the output of the first atrous convolution branch is the input of the second branch
  • the output of the second atrous convolution branch is the input of the third branch
  • the outputs of the three atrous convolution branches are added together.
  • the structure of the two enhanced feature modules is the same, as shown in Figure 6: a first 3×3 asymmetric convolution; global average pooling and global max pooling in parallel; a second 3×3 asymmetric convolution; a 1×1 convolution; a sigmoid activation layer; and a threshold layer whose output is multiplied with the module input;
  • the threshold function is given as a formula image in the original filing.
  • the structure of the four convolutional upsampling modules is the same, as shown in Figure 7: a 1×1 convolution, a 3×3 convolution, two parallel branches (Conv2DTranspose and UpSampling2D), and a 1×1 convolution, connected in sequence; each convolution operation is followed by Batch Normalization and PReLU nonlinear activation.
  • the training steps of the LaneSegNet network model are as follows: first extract the region of interest from the road image using six coordinate points r1(0,270), r2(0,h), r3(w,h), r4(w,470), r5(670,150), r6(570,150), where w is the width and h the height of the input image; pixels outside the region enclosed by these points are set to 0 to obtain the region of interest containing the lane lines.
  • the ROI image and the corresponding binary label image are input into the LaneSegNet network model as training samples, the loss of the LaneSegNet network is computed, and the network parameters are continuously optimized to minimize the loss.
  • when the loss stabilizes within a certain range, the network parameters are saved to obtain the final lane line detection model.
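The ROI extraction described above (zeroing everything outside the polygon r1…r6) would normally use `cv2.fillPoly`; it can be sketched without OpenCV using a vectorised ray-casting point-in-polygon test. The 720×1280 image size here is an assumption for illustration.

```python
import numpy as np

def polygon_mask(h, w, poly):
    """Even-odd (ray casting) point-in-polygon test, vectorised over the
    whole image grid. Pixels outside the polygon can then be zeroed,
    mimicking cv2.fillPoly-style ROI extraction."""
    ys, xs = np.mgrid[0:h, 0:w]
    inside = np.zeros((h, w), dtype=bool)
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # toggle "inside" each time a horizontal ray crosses this edge
        crosses = ((y1 > ys) != (y2 > ys)) & (
            xs < (x2 - x1) * (ys - y1) / (y2 - y1 + 1e-12) + x1)
        inside ^= crosses
    return inside

h, w = 720, 1280   # assumed image size
# ROI vertices from the text: r1..r6 (r5, r6 mark the far end of the road)
roi = [(0, 270), (0, h), (w, h), (w, 470), (670, 150), (570, 150)]
mask = polygon_mask(h, w, roi)
print(mask[700, 640], mask[10, 10])   # inside near the bottom, outside at the top
```

Multiplying the image by `mask` (`img * mask[..., None]`) zeroes everything outside the region of interest.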
  • the constructed data set consists of video captured from vehicles driving on a highway. Since the original video contains information irrelevant to the training task, it is cropped and denoised; frames are then extracted to obtain sufficient training data, as shown in Figure 8.
  • labeling each image generates a .json file in the original folder, and the corresponding binary image is generated from the .json file, as shown in Figure 9.
  • a total of 11807 road images are labeled, and the labeling process is shown in FIG. 10.
  • the data folder is Datasets, which includes two subfolders, Images and Labels; Images holds the training images and Labels holds the corresponding binary images. The ratio of the training set to the validation set is 7:3; the training image paths are saved in train.txt and the validation image paths in val.txt, each storing the relative paths of images and labels.
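The 7:3 split over the 11807 labeled images can be sketched as follows. The shuffling, the seed, and the file-name pattern are assumptions; only the folder layout and the ratio come from the text.

```python
import random

def split_dataset(image_paths, train_ratio=0.7, seed=42):
    """Shuffle and split image paths 7:3 into training and validation
    lists (the lists would then be written to train.txt / val.txt)."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)   # deterministic shuffle
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

# Hypothetical relative paths mirroring the Datasets/Images layout
images = [f"Datasets/Images/{i:05d}.jpg" for i in range(11807)]
train, val = split_dataset(images)
print(len(train), len(val))   # 8264 3543
```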
  • Training of the LaneSegNet network model: input the labeled lane line data set into the LaneSegNet network, set the corresponding parameters, and perform model training to obtain the trained LaneSegNet network model.
  • the specific steps are as follows:
  • mIoU is an evaluation index for semantic segmentation and an important criterion for measuring model performance.
  • mIoU is the mean, over classes, of the ratio of the intersection to the union of the ground-truth and predicted sets.
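The mIoU definition above can be written out directly in NumPy. This is a generic sketch of the metric, not the patent's evaluation code; the toy 4×4 label maps are invented for illustration.

```python
import numpy as np

def miou(pred, target, num_classes=2):
    """Mean IoU: per-class intersection over union between predicted and
    ground-truth label maps, averaged over the classes present."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy binary segmentation: 1 = lane pixel, 0 = background
target = np.zeros((4, 4), dtype=int); target[:, 1] = 1
pred = np.zeros((4, 4), dtype=int); pred[:, 1] = 1; pred[0, 2] = 1
print(round(miou(pred, target), 4))   # 0.8583
```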
  • the mIoU of the LaneSegNet network model proposed in the present invention is shown in Figure 13; the mIoU of the proposed lane line detection method is 78.18%.
  • the method provided by the present invention detects lane lines with the LaneSegNet network model and achieves good detection accuracy; as shown in Figure 14, the lane line detection accuracy of the method reaches 98.62%.
  • a lane line detection system based on LaneSegNet includes a preprocessing module, used to perform polygon filling on the road image and obtain an ROI image containing the lane lines; a lane line recognition module, used to input the ROI image into the trained LaneSegNet network model and obtain a binary image containing the lane lines; the LaneSegNet network model includes an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules, connected in sequence; the initial module halves the size of the input image, the convolutional downsampling modules extract lane line feature information, the enhanced receptive field module increases the receptive field of the network, the enhanced feature modules enhance lane line information, and the convolutional upsampling modules restore image size and image features; wherein the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second convolutional downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module.
  • a LaneSegNet-based lane line detection system provided in an embodiment of the present invention includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the computer program is loaded into the processor, the above LaneSegNet-based lane line detection method is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A LaneSegNet-based lane line detection method and system. Polygon filling is first performed on an image to obtain an ROI, and the ROI image is input to a trained LaneSegNet network model to obtain a binary image comprising lane lines; the pixel coordinates of each lane line are clustered using a DBSCAN algorithm, polynomial fitting is performed on each cluster, and the fitted lane lines are displayed on the original image. The constructed LaneSegNet network model comprises an encoding module, a decoding module, an enhanced receptive field module, and enhanced feature modules. The network receptive field is enlarged by the parallel atrous convolution modules, feature information irrelevant to the current task is removed by the enhanced feature modules, and asymmetric convolutions are used to construct the feature extraction network, reducing network parameters. The method has an accuracy of 98.62%, can be used for detecting lane lines on a highway, and has good robustness and real-time performance.

Description

A lane line detection method and system based on LaneSegNet

Technical Field

The invention belongs to the technical field of computer vision, and in particular relates to a lane line detection method and system based on LaneSegNet (lane line segmentation network).
Background Art

With the improvement of living standards, cars play an increasingly important role in people's lives; however, the growth in car ownership has led to more and more traffic accidents. To ensure driving safety, automatic driving functions receive ever more attention, and lane line detection is an important component of automatic driving.

Current traditional lane line detection methods, such as Hough-transform-based detection, first use image processing to extract lane line edge features and then use the Hough transform to detect and fit the lines. Such methods work only under uniform illumination in simple scenes without occlusion or blur, and their robustness is poor.

With the wide application of deep learning in various fields, lane line detection and deep learning have become ever more closely combined. For example, the end-to-end lane line detection proposed by Davy Neven et al. first detects lane lines with a deep neural network, then clusters them with a clustering algorithm, and finally fits the lane lines with polynomials. However, this detection process is relatively complicated and time-consuming, making it difficult to meet real-time requirements.
Contents of the Invention

Purpose of the invention: to address the poor robustness, complicated process, and long detection time of current lane line detection, the present invention provides a lane line detection method and system based on LaneSegNet.

Technical solution: to achieve the above purpose, the present invention adopts the following technical solution:

A lane line detection method based on LaneSegNet, comprising the following steps:
(1) Perform polygon filling on the road image to obtain an ROI image containing the lane lines;

(2) Input the ROI image into the trained LaneSegNet network model to obtain a binary image containing the lane lines;

The LaneSegNet network model includes an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules, connected in sequence. The initial module halves the size of the input image, the convolutional downsampling modules extract lane line feature information, the enhanced receptive field module increases the receptive field of the network, the enhanced feature modules enhance lane line information, and the convolutional upsampling modules restore image size and image features. The first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second convolutional downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module;

(3) Apply the DBSCAN algorithm to the binary image obtained in step (2) to cluster the lane line pixel coordinates, separate the lane lines into different classes, and fit each class with a quadratic polynomial;

(4) Display the fitted lane lines on the original road image to visualize the lane line detection.
Preferably, the initial module includes a convolutional layer with kernel size k×k and stride 1, a convolutional layer with kernel size k×k and stride 2, a max pooling layer, and a concatenation layer; the two convolutional layers are connected in sequence, the convolutional path and the pooling path are connected in parallel, and k is 3 or 5.

Preferably, the three convolutional downsampling modules share the same unit structure: a first branch in which a 1×1 convolution, two k×k convolutions, and a 1×1 convolution are connected in series, and a second branch in which three dilated (atrous) convolutions with dilation rates 1, 2, and 5 are connected in series. The first and second branches are connected in parallel, and the module input is added to the outputs of the two parallel branches. The result then feeds a third branch (a 1×1 convolution, two k×k convolutions, and a 1×1 convolution in series) in parallel with a fourth branch containing a max pooling layer; the outputs of the third and fourth branches are added, and k is 3 or 5.

Preferably, the enhanced receptive field module includes three atrous convolution branches: the first contains one k×k convolution with dilation rate 1, and the second contains four consecutive k×k convolutions with dilation rates 2, 5, 9, and 13; the third branch has the same structure as the second. The output of the first branch is the input of the second, the output of the second is the input of the third, and finally the outputs of the three branches are added together; k is 3 or 5.
Preferably, the two enhanced feature modules have the same structure: a first k×k asymmetric convolution; global average pooling and global max pooling in parallel; a second k×k asymmetric convolution; a 1×1 convolution; a sigmoid activation layer; and a threshold layer whose output is multiplied with the module input; k is 3 or 5. The threshold function is:

Figure PCTCN2021135474-appb-000001
Preferably, the four convolutional upsampling modules have the same structure: a 1×1 convolution, a k×k convolution, two parallel branches (a transposed convolution and an upsampling layer), and a 1×1 convolution, connected in sequence; each convolution is followed by Batch Normalization and a PReLU nonlinear activation.

Preferably, training the LaneSegNet network model in step (2) comprises:

(2.1) Input the ROI images of the road images and the binary images of the marked lane lines into the LaneSegNet network model as training samples;

(2.2) Compute the loss of the LaneSegNet network and continuously optimize the network parameters to minimize the loss;

(2.3) When the loss stabilizes within a certain range, save the network parameters to obtain the final lane line detection model.
Based on the same inventive concept, the present invention provides a LaneSegNet-based lane line detection system, comprising:
a preprocessing module, configured to perform polygon filling on a road image and obtain an ROI image containing the lane lines;
a lane line recognition module, configured to input the ROI image into a trained LaneSegNet network model to obtain a binary image containing the lane lines, wherein the LaneSegNet network model comprises, connected in sequence, an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules; the initial module halves the size of the input image, the convolutional downsampling modules extract lane line features, the enhanced receptive field module enlarges the receptive field of the network, the enhanced feature modules strengthen lane line information, and the convolutional upsampling modules restore the image size and image features; the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module;
a lane line fitting module, configured to cluster the lane line pixel coordinates of the binary image produced by the LaneSegNet network model using the DBSCAN algorithm, divide the lane lines into different classes, and fit each class with a quadratic polynomial;
and a result output module, configured to display the fitted lane lines on the original road image, thereby visualizing the lane line detection.
Based on the same inventive concept, the present invention provides a LaneSegNet-based lane line detection system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements the above LaneSegNet-based lane line detection method.
Beneficial effects: compared with the prior art, the beneficial effects of the present invention are as follows. 1. The LaneSegNet architecture, with its simple structure and small number of parameters, achieves high-precision extraction of lane lines from road images. 2. The enhanced receptive field module enlarges the receptive field of the network model, alleviating the difficulty of segmenting lane lines that span a large portion of the image. 3. The enhanced feature modules remove task-irrelevant features so that the network concentrates on extracting lane line features, alleviating the segmentation difficulty caused by lane line pixels occupying only a small fraction of the image.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method according to an embodiment of the present invention;
Fig. 2 is a network structure diagram of the LaneSegNet model in an embodiment of the present invention;
Fig. 3 is a network structure diagram of the initial module in an embodiment of the present invention;
Fig. 4 is a network structure diagram of the encoding module in an embodiment of the present invention;
Fig. 5 is a network structure diagram of the enhanced receptive field module in an embodiment of the present invention;
Fig. 6 is a network structure diagram of the enhanced feature module in an embodiment of the present invention;
Fig. 7 is a network structure diagram of the decoding module in an embodiment of the present invention;
Fig. 8 is a schematic diagram of part of the data set used in an embodiment of the present invention;
Fig. 9 is a schematic diagram of part of the data set labels used in an embodiment of the present invention;
Fig. 10 shows the image labeling process for the data set in an embodiment of the present invention;
Fig. 11 is a segmentation map of a road lane line detection example in an embodiment of the present invention;
Fig. 12 is a lane line fitting diagram in an embodiment of the present invention;
Fig. 13 is the mIoU curve during training in an embodiment of the present invention;
Fig. 14 is the accuracy curve during training in an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the LaneSegNet-based lane line detection method disclosed in this embodiment of the present invention first performs polygon filling on the road image to obtain the ROI (region of interest) containing the lane lines; the ROI image is then input into the trained LaneSegNet network model to obtain a binary image containing the lane lines; the DBSCAN algorithm is then used to cluster the lane line pixel coordinates in the binary image, and each class of lane line is fitted with a quadratic polynomial; finally, the fitted lane lines are displayed on the original road image to visualize the lane line detection.
The data set and the specific structure of the network model used in this embodiment are first described in detail below.
The road video captured by the camera is preprocessed to obtain valid road video data; frames are extracted from the video to obtain road images to be labeled, and the lane lines in these images are labeled with a labeling tool to obtain road images with lane line target labels, thereby constructing the data set.
As shown in Fig. 2, the LaneSegNet model constructed in this embodiment of the present invention mainly comprises an encoding module, a decoding module, an enhanced receptive field module, and enhanced feature modules. The encoding module mainly comprises three identical convolutional downsampling modules, which use dilated convolutions to enlarge the receptive field of the network; the decoding module mainly comprises four identical convolutional upsampling modules, which restore the feature information and the image size; the enhanced receptive field module further enlarges the receptive field of the network; the enhanced feature modules strengthen information relevant to the current task and discard irrelevant information. The training data is input into the LaneSegNet network for training to obtain binary images containing lane lines. The model structure specifically comprises, connected in sequence, an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules. The initial module halves the size of the input image, the convolutional downsampling modules extract lane line features, the enhanced receptive field module enlarges the receptive field of the network, the enhanced feature modules strengthen lane line information, and the convolutional upsampling modules restore the image size and image features. The first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module.
As shown in Fig. 3, the initial module comprises a convolutional layer with a 3×3 kernel and stride 1, a convolutional layer with a 3×3 kernel and stride 2, a max pooling layer, and a concatenation layer; the two convolutional layers are connected in sequence, and the convolutional layer and the pooling layer are connected in parallel.
The three convolutional downsampling modules share the same unit structure, as shown in Fig. 4. Each comprises a first branch of a 1×1 convolution, two 3×3 convolutions, and a 1×1 convolution connected in series, and a second branch of three dilated convolutions with dilation rates 1, 2, and 5 connected in series; the first and second branches are connected in parallel, and the module input is added to the outputs of the two parallel branches. The result is then fed into a third branch of a 1×1 convolution, two 3×3 convolutions, and a 1×1 convolution connected in series, and a fourth branch containing a max pooling layer; the third and fourth branches are connected in parallel, and their outputs are added.
As shown in Fig. 5, the enhanced receptive field module comprises three parallel dilated convolution branches. The first branch comprises a 3×3 convolution with dilation rate 1; the second branch comprises four consecutive 3×3 convolutions with dilation rates 2, 5, 9, and 13; the third branch has the same structure as the second. The output of the first branch is the input of the second branch, and the output of the second branch is the input of the third branch; finally, the outputs of the three branches are added together.
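To see how quickly stacked dilated 3×3 convolutions enlarge the receptive field, the effective receptive field of a stride-1 serial chain can be computed with the standard recurrence rf += (k − 1) · d per layer. The sketch below uses the dilation rates stated above; treating the chained branches as one serial stack is a simplification for illustration, not a claim about the exact receptive field of the full module:

```python
def receptive_field(kernel_sizes, dilations):
    """Effective receptive field of a serial stack of stride-1 convolutions.

    For stride-1 layers the receptive field grows by (k - 1) * d per layer.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# First branch: one 3x3 convolution with dilation rate 1.
branch1 = receptive_field([3], [1])                                   # 3
# The second branch consumes the first branch's output, adding four
# dilated 3x3 convolutions with dilation rates 2, 5, 9, 13.
branch2 = receptive_field([3] * 5, [1, 2, 5, 9, 13])                  # 61
# The third branch repeats the same four-convolution structure on top.
branch3 = receptive_field([3] * 9, [1, 2, 5, 9, 13, 2, 5, 9, 13])    # 119

print(branch1, branch2, branch3)
```

Nine stride-1 layers thus cover a 119-pixel span, which is why the module helps with lane lines that stretch across a large part of the image.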
The two enhanced feature modules have the same structure, as shown in Fig. 6: a first 3×3 asymmetric convolution, two parallel branches performing global average pooling and global max pooling, a second 3×3 asymmetric convolution, a 1×1 convolution, a sigmoid activation layer, and a threshold layer; the resulting threshold is multiplied element-wise with the module input. The threshold function is:

(equation shown as image PCTCN2021135474-appb-000002 in the original publication)
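The data flow of the enhanced feature module's gating path can be sketched in plain NumPy. Note the hedges: the asymmetric convolutions are omitted, and the publication gives its threshold function only as an equation image, so the hard step at 0.5 used here is a hypothetical stand-in, not the published formula:

```python
import numpy as np

def enhanced_feature_gate(x, thresh=0.5):
    """Sketch of the enhanced feature module's gating path (NumPy only).

    The k x k asymmetric convolutions are omitted; only the
    pool -> sigmoid -> threshold -> multiply data flow is shown.
    `thresh` and the hard step below are ASSUMED stand-ins for the
    threshold function, which appears only as an image in the source.
    """
    # x: feature map of shape (H, W, C)
    avg = x.mean(axis=(0, 1))                 # global average pooling -> (C,)
    mx = x.max(axis=(0, 1))                   # global max pooling     -> (C,)
    s = 1.0 / (1.0 + np.exp(-(avg + mx)))     # sigmoid activation
    gate = (s >= thresh).astype(x.dtype)      # threshold layer (assumed form)
    return x * gate                           # multiply the gate with the input

# One strongly negative channel (gated off) and one positive channel (kept).
x = np.stack([np.full((4, 4), -3.0), np.full((4, 4), 3.0)], axis=-1)
y = enhanced_feature_gate(x)
```

The intent matches the description above: channels whose pooled response is weak are suppressed entirely, so the decoder concentrates on lane-line-relevant features.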
The four convolutional upsampling modules have the same structure, as shown in Fig. 7. Each comprises, connected in sequence, a 1×1 convolution, a 3×3 convolution, two parallel branches of Conv2DTranspose and UpSampling2D, and a final 1×1 convolution. Each convolution is followed by Batch Normalization and a PReLU nonlinear activation.
The LaneSegNet network model is trained as follows. First, the region of interest is extracted from the road image by setting six coordinate points: r1(0, 270), r2(0, h), r3(w, h), r4(w, 470), r5(670, 150), r6(570, 150), where w is the width and h is the height of the input image. Pixels outside the region enclosed by these points are set to 0, yielding the region of interest containing the lane lines. The ROI images and the corresponding binary images are then input into the LaneSegNet network model as training samples, the loss of the LaneSegNet network is computed, and the network parameters are continuously optimized to minimize this loss. When the loss value stabilizes within a given range, the network parameters are saved to obtain the final lane line detection model. In this embodiment, the data set is constructed from video recorded by a vehicle driving on a highway. Since the original video contains information irrelevant to the training task, it is cropped and denoised, after which frames are extracted to obtain sufficient training data, as shown in Fig. 8.
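The polygon-filling step above can be sketched without any imaging library using a ray-casting point-in-polygon test. This is an illustrative, dependency-free version; an actual implementation would more likely use something like OpenCV's polygon fill, which the source does not specify. The resolution `w, h = 1280, 720` is an assumption for the example:

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal scan line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def roi_mask_image(image, polygon):
    """Zero out every pixel outside the polygon (image: list of rows)."""
    h, w = len(image), len(image[0])
    return [[image[r][c] if point_in_polygon(c, r, polygon) else 0
             for c in range(w)] for r in range(h)]

# The six ROI vertices from the embodiment, for an assumed w x h resolution:
w, h = 1280, 720
roi = [(0, 270), (0, h), (w, h), (w, 470), (670, 150), (570, 150)]

# Tiny demonstration of the masking itself (a full-size image works the same):
img = [[1] * 10 for _ in range(10)]
masked = roi_mask_image(img, [(2, 2), (2, 8), (8, 8), (8, 2)])
```

A pixel near the bottom of the frame, such as (100, 500), falls inside the ROI, while sky pixels such as (100, 100) are zeroed out.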
The Labelme tool is used to annotate the lane lines in the acquired road images. After an image is annotated, a .json file is generated in the original folder, from which the corresponding binary image is generated, as shown in Fig. 9. In this embodiment, a total of 11807 road images are annotated; the annotation process is shown in Fig. 10.
In this embodiment, the data is stored in a folder named Datasets with two subfolders, Images and Labels, where Images holds the training images and Labels holds the corresponding binary images. The ratio of the training set to the validation set is 7:3; the training image paths are saved in train.txt and the validation image paths in val.txt, which store the relative paths of the training images and labels.
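A 7:3 split of the 11807 annotated images could be produced as below. The folder layout and file names (Datasets/Images, train.txt, val.txt) come from the description above; the shuffling, the seed, and the path pattern are assumptions for illustration:

```python
import random

def split_dataset(image_paths, train_ratio=0.7, seed=42):
    """Split image paths 7:3 into train/val lists (shuffle is an assumption)."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

# Hypothetical relative paths matching the layout described above.
images = ["Datasets/Images/%05d.jpg" % i for i in range(11807)]
train, val = split_dataset(images)

# The embodiment then stores the relative paths, e.g.:
#   open("train.txt", "w").write("\n".join(train))
#   open("val.txt", "w").write("\n".join(val))
print(len(train), len(val))  # 8264 3543
```

Sorting before the seeded shuffle makes the split reproducible across runs, which matters when train.txt and val.txt are regenerated.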
Training the LaneSegNet network model: the labeled lane line data set is input into the LaneSegNet network for training; after the relevant parameters are set, the model is trained to obtain the trained LaneSegNet network model. The specific steps are as follows:
1) Set the parameters, including the learning rate, number of epochs, and batch size. The initial learning rate is 1e-3 and finally decays to 2.5e-5; the batch size is 2 and the number of epochs is 100.
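The publication states only the endpoints of the learning rate schedule (1e-3 down to 2.5e-5 over 100 epochs), not its shape. One plausible choice, shown purely as an assumed sketch, is a geometric (exponential) decay that hits both endpoints exactly:

```python
def exp_lr_schedule(epoch, lr0=1e-3, lr_final=2.5e-5, epochs=100):
    """Per-epoch learning rate under an ASSUMED exponential decay from
    lr0 (epoch 0) to lr_final (last epoch). The source gives only the
    two endpoint values, not the schedule shape."""
    decay = (lr_final / lr0) ** (1.0 / (epochs - 1))
    return lr0 * decay ** epoch

lrs = [exp_lr_schedule(e) for e in range(100)]
print(lrs[0], lrs[-1])  # starts at 1e-3, ends at 2.5e-5
```

In Keras (the framework used in this embodiment) such a function would typically be attached via a `LearningRateScheduler` callback.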
2) Train on the data with the parameters set in 1) and run prediction on images; the prediction results are shown in Fig. 11.
After the binary image containing the lane lines is obtained, DBSCAN is used to cluster the segmentation results, and a quadratic polynomial is fitted to each cluster; the fitting results are shown in Fig. 12.
The coordinates of the lane line pixels are obtained from the output of the LaneSegNet network, and the DBSCAN algorithm clusters these coordinates to divide the lane lines into different classes. Each class of lane line is then fitted with a quadratic polynomial, and the fitted polynomials are used to draw the lane lines on the original image, visualizing the lane line detection.
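The cluster-then-fit post-processing above can be sketched end to end. The DBSCAN below is a minimal textbook implementation (the publication's eps/min_pts values are not stated, so the ones here are illustrative), and two synthetic quadratic "lane lines" stand in for real segmentation output:

```python
import numpy as np

def dbscan(points, eps=5.0, min_pts=4):
    """Minimal DBSCAN over (x, y) pixel coordinates.

    Returns one integer label per point (-1 = noise). eps and min_pts
    are illustrative values, not those used in the publication.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        seeds = list(np.flatnonzero(np.linalg.norm(pts - pts[i], axis=1) <= eps))
        if len(seeds) < min_pts:
            continue  # not a core point; may still be claimed as a border point
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster
            if not visited[j]:
                visited[j] = True
                nbrs = np.flatnonzero(np.linalg.norm(pts - pts[j], axis=1) <= eps)
                if len(nbrs) >= min_pts:
                    seeds.extend(nbrs)
        cluster += 1
    return labels

# Two synthetic "lane lines": x = a*y^2 + b*y + c sampled along image rows.
ys = np.arange(0, 100, 2, dtype=float)
left = np.c_[0.002 * ys**2 + 0.1 * ys + 50.0, ys]    # (x, y) points
right = np.c_[0.002 * ys**2 + 0.1 * ys + 300.0, ys]
points = np.vstack([left, right])

labels = dbscan(points)
for c in sorted(set(labels.tolist()) - {-1}):
    xs, yy = points[labels == c, 0], points[labels == c, 1]
    coeffs = np.polyfit(yy, xs, 2)  # quadratic fit x = f(y), as in the method
    print(c, np.round(coeffs, 4))
```

Fitting x as a function of y (rather than y of x) is the natural choice for lane lines, which are roughly vertical in the image and single-valued per row.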
The experimental environment used in this embodiment of the present invention is as follows:
Operating system:
Windows 10 64-bit
Hardware environment:
Intel Core i5-10400F @ 2.90 GHz, six cores
16 GB DDR4 2600 MHz RAM
Nvidia GTX 2060 SUPER with 6 GB DRAM
WDS 500G SSD
Software environment:
Deep learning framework: Keras (2.2.5)
Runtime environment: Python 3.6
JetBrains PyCharm 2020.2 x64
CUDA 10.2
mIoU (mean intersection over union) is an evaluation metric for semantic segmentation and an important measure of model performance; it is the ratio of the intersection to the union of the set of ground-truth pixels and the set of predicted pixels. The mIoU of the LaneSegNet network model proposed in the present invention is shown in Fig. 13; as can be seen from the figure, the LaneSegNet-based lane line detection method proposed in the present invention achieves an mIoU of 78.18%.
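The metric can be computed directly from the predicted and ground-truth label maps. For the binary lane segmentation here there are two classes (background and lane line), and the per-class IoUs are averaged; the tiny masks below are synthetic examples, not data from the publication:

```python
import numpy as np

def miou(pred, target, num_classes=2):
    """Mean IoU: per-class intersection / union of predicted and
    ground-truth pixel sets, averaged over the classes present."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

target = np.zeros((4, 4), dtype=int)
target[:, 1] = 1                 # a one-pixel-wide "lane line"
pred = target.copy()
pred[0, 1] = 0                   # one missed lane pixel
pred[0, 2] = 1                   # one false positive

print(round(miou(pred, target), 4))  # -> 0.7231
```

This also shows why mIoU is a harsh metric for thin structures like lane lines: two mislabeled pixels already pull the lane-class IoU down to 0.6.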
The method provided by the present invention performs lane line detection based on the LaneSegNet network model and achieves good detection precision; as shown in Fig. 14, lane line detection based on the method of the present invention reaches an accuracy of 98.62%.
Based on the same inventive concept, the LaneSegNet-based lane line detection system disclosed in this embodiment of the present invention comprises: a preprocessing module, configured to perform polygon filling on a road image and obtain an ROI image containing the lane lines; a lane line recognition module, configured to input the ROI image into a trained LaneSegNet network model to obtain a binary image containing the lane lines, wherein the LaneSegNet network model comprises, connected in sequence, an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules; the initial module halves the size of the input image, the convolutional downsampling modules extract lane line features, the enhanced receptive field module enlarges the receptive field of the network, the enhanced feature modules strengthen lane line information, and the convolutional upsampling modules restore the image size and image features; the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module; a lane line fitting module, configured to cluster the lane line pixel coordinates of the binary image produced by the LaneSegNet network model using the DBSCAN algorithm, divide the lane lines into different classes, and fit each class with a quadratic polynomial; and a result output module, configured to display the fitted lane lines on the original road image, thereby visualizing the lane line detection.
For specific details, refer to the method embodiment described above; they are not repeated here.
Based on the same inventive concept, the LaneSegNet-based lane line detection system provided in this embodiment of the present invention comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements the above LaneSegNet-based lane line detection method.

Claims (9)

1. A LaneSegNet-based lane line detection method, characterized by comprising the following steps:
(1) performing polygon filling on a road image to obtain an ROI image containing lane lines;
(2) inputting the ROI image into a trained LaneSegNet network model to obtain a binary image containing the lane lines; wherein the LaneSegNet network model comprises, connected in sequence, an initial module, three convolutional downsampling modules, an enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules; the initial module halves the size of the input image, the convolutional downsampling modules extract lane line features, the enhanced receptive field module enlarges the receptive field of the network, the enhanced feature modules strengthen lane line information, and the convolutional upsampling modules restore the image size and image features; the first enhanced feature module is connected to the outputs of the first convolutional downsampling module and the second convolutional upsampling module, the second enhanced feature module is connected to the outputs of the second downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected to the output of the second enhanced feature module, and the third convolutional upsampling module is connected to the output of the first enhanced feature module;
(3) clustering the lane line pixel coordinates of the binary image obtained in step (2) using the DBSCAN algorithm, dividing the lane lines into different classes, and fitting each class of lane line with a quadratic polynomial;
(4) displaying the fitted lane lines on the original road image to visualize the lane line detection.
2. The LaneSegNet-based lane line detection method according to claim 1, characterized in that the initial module comprises a convolutional layer with a k×k kernel and stride 1, a convolutional layer with a k×k kernel and stride 2, a max pooling layer, and a concatenation layer; the two convolutional layers are connected in sequence, and the convolutional layer and the pooling layer are connected in parallel; k is 3 or 5.
3. The LaneSegNet-based lane line detection method according to claim 1, characterized in that the three convolutional downsampling modules share the same unit structure, each comprising a first branch of a 1×1 convolution, two k×k convolutions, and a 1×1 convolution connected in series, and a second branch of three dilated convolutions with dilation rates 1, 2, and 5 connected in series; the first and second branches are connected in parallel, and the module input is added to the outputs of the two parallel branches; the result is then fed into a third branch of a 1×1 convolution, two k×k convolutions, and a 1×1 convolution connected in series, and a fourth branch provided with a max pooling layer; the third and fourth branches are connected in parallel, and their outputs are added; k is 3 or 5.
4. The LaneSegNet-based lane line detection method according to claim 1, characterized in that the enhanced receptive field module comprises three parallel dilated convolution branches; the first branch comprises a k×k convolution with dilation rate 1, the second branch comprises four consecutive k×k convolutions with dilation rates 2, 5, 9, and 13, and the third branch has the same structure as the second; the output of the first branch is the input of the second branch, the output of the second branch is the input of the third branch, and finally the outputs of the three branches are added together; k is 3 or 5.
5. The LaneSegNet-based lane line detection method according to claim 1, characterized in that the two enhanced feature modules have the same structure, comprising a first k×k asymmetric convolution, two parallel branches of global average pooling and global max pooling, a second k×k asymmetric convolution, a 1×1 convolution, a sigmoid activation layer, and a threshold layer; the resulting threshold is multiplied with the input; k is 3 or 5; wherein the threshold function is:
(equation shown as image PCTCN2021135474-appb-100001 in the original publication)
6. The LaneSegNet-based lane line detection method according to claim 1, characterized in that the four convolutional upsampling modules have the same structure, each comprising, connected in sequence, a 1×1 convolution, a k×k convolution, two parallel branches of transposed convolution and upsampling, and a 1×1 convolution; each convolution is followed by Batch Normalization and a PReLU nonlinear activation.
7. The LaneSegNet-based lane line detection method according to claim 1, characterized in that training the LaneSegNet network model in step (2) comprises:
(2.1) inputting the ROI images of the road images and the binary images of the labeled lane lines into the LaneSegNet network model as training samples;
(2.2) computing the loss of the LaneSegNet network and continuously optimizing the network parameters to minimize this loss;
(2.3) when the loss value stabilizes within a given range, saving the network parameters to obtain the final lane line detection model.
  8. 一种基于LaneSegNet的车道线检测系统,其特征在于,包括:A lane detection system based on LaneSegNet, characterized in that, comprising:
    预处理模块,用于对道路图像进行多边形填充,获取含有车道线的ROI区域图像;The preprocessing module is used to carry out polygon filling to the road image, and obtains the ROI area image containing the lane line;
    车道线识别模块,用于将ROI区域图像输入到训练好的LaneSegNet网络模型中,得到含有车道线的二值图像;所述LaneSegNet网络模型包括依次连接的初始模块、三个卷积下采样模块、增强感受野模块、四个卷积上采样模块,以及两个增强特征模块;初始模块用于将输入图像大小减半,卷积下采样模块用于提取车道线特征信息,增强感受野模块用于增大网络的感受野,增强特征模块用于 增强车道线信息,卷积上采样模块用于恢复图像大小以及图像特征;其中第一增强特征模块与第一卷积下采样模块和第二卷积上采样模块输出端连接,第二增强特征模块与第二下采样模块和第一卷积上采样模块输出端连接,第二卷积上采样模块与第二增强特征模块输出端连接,第三卷积上采样模块与第一增强特征模块输出端连接;Lane line recognition module, for inputting the ROI area image into the LaneSegNet network model trained, obtains the binary image that contains lane line; Described LaneSegNet network model comprises the initial module connected successively, three convolution down-sampling modules, Enhanced receptive field module, four convolutional upsampling modules, and two enhanced feature modules; the initial module is used to halve the size of the input image, the convolutional downsampling module is used to extract lane line feature information, and the enhanced receptive field module is used to Increase the receptive field of the network, the enhanced feature module is used to enhance the lane line information, and the convolution upsampling module is used to restore the image size and image features; the first enhanced feature module is combined with the first convolution downsampling module and the second convolution The output terminal of the upsampling module is connected, the second enhanced feature module is connected with the output terminal of the second downsampling module and the first convolutional upsampling module, the second convolutional upsampling module is connected with the output terminal of the second enhanced feature module, and the third volume The product upsampling module is connected with the first enhanced feature module output;
    车道线拟合模块,用于对LaneSegNet网络模型得到二值图像使用DBSCAN算法对车道线像素点坐标进行聚类,划分出不同类别的车道线,对于不同类别的车道线使用二次多项式分别进行拟合;The lane line fitting module is used to cluster the lane line pixel coordinates using the DBSCAN algorithm on the binary image obtained by the LaneSegNet network model, divide different types of lane lines, and use quadratic polynomials for different types of lane lines. combine;
    and a result output module, for displaying the fitted lane lines on the original road image, so as to visualize the lane line detection result.
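The resolution bookkeeping implied by the claimed architecture can be checked with a small pure-Python sketch (stage names are ours, not the patent's): the initial module and each convolutional down-sampling module halve the height and width, each convolutional up-sampling module doubles them, so the first enhanced feature module fuses same-resolution feature maps from the first down-sampling and second up-sampling stages, and the second fuses the second down-sampling and first up-sampling stages.

```python
def halve(hw):
    """Initial module and each conv down-sampling module halve H and W."""
    return (hw[0] // 2, hw[1] // 2)

def double(hw):
    """Each conv up-sampling module doubles H and W."""
    return (hw[0] * 2, hw[1] * 2)

def lanesegnet_sizes(input_hw):
    """Trace feature-map sizes through the stages named in the claim."""
    sizes = {"input": input_hw}
    x = halve(input_hw)                  # initial module: size / 2
    sizes["initial"] = x
    for i in (1, 2, 3):                  # three conv down-sampling modules
        x = halve(x)
        sizes[f"down{i}"] = x
    sizes["erf"] = x                     # enhanced receptive field module keeps size
    for i in (1, 2, 3, 4):               # four conv up-sampling modules
        x = double(x)
        sizes[f"up{i}"] = x
    return sizes

sizes = lanesegnet_sizes((256, 512))
```

With a 256x512 input, the deepest feature map is 1/16 of the input, the fourth up-sampling stage restores full resolution, and the skip connections described in the claim (down1 with up2, down2 with up1) pair feature maps of matching size.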
  9. A LaneSegNet-based lane line detection system, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that, when the computer program is loaded into the processor, it implements the LaneSegNet-based lane line detection method according to any one of claims 1-7.
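The fitting and output stages of the claims (DBSCAN clustering of lane pixels, one quadratic polynomial per cluster, then drawing the curves back onto the road image) can be sketched as below. This is a textbook DBSCAN in NumPy plus `np.polyfit`, not the patent's implementation; the `eps` and `min_pts` values are illustrative only.

```python
import numpy as np

def dbscan(points, eps=3.0, min_pts=4):
    """Minimal DBSCAN over (x, y) pixel coordinates; returns one label per
    point, with -1 meaning noise. Pairwise distances are fine for the few
    thousand lane pixels in a typical binary mask."""
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i] or len(neighbors[i]) < min_pts:
            continue
        visited[i] = True            # grow a new cluster from core point i
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])
        cluster += 1
    return labels

def fit_lanes(binary_mask, eps=3.0, min_pts=4):
    """Cluster lane pixels, then fit x = a*y^2 + b*y + c per cluster
    (x as a function of y, since lanes run roughly vertically in images)."""
    ys, xs = np.nonzero(binary_mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    labels = dbscan(pts, eps, min_pts)
    return {k: np.polyfit(pts[labels == k, 1], pts[labels == k, 0], 2)
            for k in set(labels) - {-1}}

def draw_lanes(image, fits, color=(0, 255, 0)):
    """Rasterize each fitted quadratic onto a copy of the road image."""
    out = image.copy()
    for coeffs in fits.values():
        for y in range(image.shape[0]):
            x = int(round(np.polyval(coeffs, y)))
            if 0 <= x < image.shape[1]:
                out[y, x] = color
    return out
```

For two well-separated lane markings the clustering yields two classes, and evaluating each fitted polynomial with `np.polyval` recovers the lane's x position at any row of the image.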
PCT/CN2021/135474 2021-05-14 2021-12-03 Lanesegnet-based lane line detection method and system WO2022237139A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110527595.6A CN113343778B (en) 2021-05-14 2021-05-14 Lane line detection method and system based on LaneSegNet
CN202110527595.6 2021-05-14

Publications (1)

Publication Number Publication Date
WO2022237139A1 true WO2022237139A1 (en) 2022-11-17

Family

ID=77470031

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135474 WO2022237139A1 (en) 2021-05-14 2021-12-03 Lanesegnet-based lane line detection method and system

Country Status (2)

Country Link
CN (1) CN113343778B (en)
WO (1) WO2022237139A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114241308A (en) * 2021-12-17 2022-03-25 杭州电子科技大学 Lightweight remote sensing image significance detection method based on compression module
CN115993365A (en) * 2023-03-23 2023-04-21 山东省科学院激光研究所 Belt defect detection method and system based on deep learning
CN116129379A (en) * 2022-12-28 2023-05-16 国网安徽省电力有限公司芜湖供电公司 Lane line detection method in foggy environment
CN116453121A (en) * 2023-06-13 2023-07-18 合肥市正茂科技有限公司 Training method and device for lane line recognition model
CN117764988A (en) * 2024-02-22 2024-03-26 山东省计算中心(国家超级计算济南中心) Road crack detection method and system based on heteronuclear convolution multi-receptive field network
WO2024138993A1 (en) * 2022-12-26 2024-07-04 江苏大学 Multi-task joint sensing network model and detection method for traffic road surface information

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343778B (en) * 2021-05-14 2022-02-11 淮阴工学院 Lane line detection method and system based on LaneSegNet
CN113781607B (en) * 2021-09-17 2023-09-19 平安科技(深圳)有限公司 Processing method, device, equipment and storage medium for labeling data of OCR (optical character recognition) image
CN118155105B (en) * 2024-05-13 2024-08-02 齐鲁空天信息研究院 Unmanned aerial vehicle mountain area rescue method, unmanned aerial vehicle mountain area rescue system, unmanned aerial vehicle mountain area rescue medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276267A (en) * 2019-05-28 2019-09-24 江苏金海星导航科技有限公司 Method for detecting lane lines based on Spatial-LargeFOV deep learning network
US20200117916A1 (en) * 2018-10-11 2020-04-16 Baidu Usa Llc Deep learning continuous lane lines detection system for autonomous vehicles
CN111242037A (en) * 2020-01-15 2020-06-05 华南理工大学 Lane line detection method based on structural information
US20200285869A1 (en) * 2019-03-06 2020-09-10 Dura Operating, Llc Convolutional neural network system for object detection and lane detection in a motor vehicle
CN113343778A (en) * 2021-05-14 2021-09-03 淮阴工学院 Lane line detection method and system based on LaneSegNet

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704866B (en) * 2017-06-15 2021-03-23 清华大学 Multitask scene semantic understanding model based on novel neural network and application thereof
CN111460921B (en) * 2020-03-13 2023-05-26 华南理工大学 Lane line detection method based on multitasking semantic segmentation
CN111582083B (en) * 2020-04-25 2023-05-23 华南理工大学 Lane line detection method based on vanishing point estimation and semantic segmentation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"China Master’s Theses Full-text Database", 15 September 2019, ISBN: 978-7-111-60788-5, article LI QIAOYI: "Research on Lane Detection Based on Computer Vision", pages: 1 - 56, XP093003507 *
"Thesis Beijing Jiaotong University", 15 July 2020, BEIJING JIAOTONG UNIVERSITY, CN, article WU YIPENG: "Research on Lane Detection Technology Based on Convolutional Neural Network", pages: 1 - 59, XP093003476 *


Also Published As

Publication number Publication date
CN113343778A (en) 2021-09-03
CN113343778B (en) 2022-02-11

Similar Documents

Publication Publication Date Title
WO2022237139A1 (en) Lanesegnet-based lane line detection method and system
CN112132156B (en) Image saliency target detection method and system based on multi-depth feature fusion
US20220092882A1 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN108062525B (en) Deep learning hand detection method based on hand region prediction
WO2021218786A1 (en) Data processing system, object detection method and apparatus thereof
CN109035172B (en) Non-local mean ultrasonic image denoising method based on deep learning
Zhang et al. Deep learning in lane marking detection: A survey
CN111209811B (en) Method and system for detecting eyeball attention position in real time
CN109726678B (en) License plate recognition method and related device
CN109165658B (en) Strong negative sample underwater target detection method based on fast-RCNN
CN112446292B (en) 2D image salient object detection method and system
CN114332942A (en) Night infrared pedestrian detection method and system based on improved YOLOv3
WO2021083126A1 (en) Target detection and intelligent driving methods and apparatuses, device, and storage medium
CN111428664A (en) Real-time multi-person posture estimation method based on artificial intelligence deep learning technology for computer vision
CN113033321A (en) Training method of target pedestrian attribute identification model and pedestrian attribute identification method
CN108537109B (en) OpenPose-based monocular camera sign language identification method
CN117197763A (en) Road crack detection method and system based on cross attention guide feature alignment network
CN115861981A (en) Driver fatigue behavior detection method and system based on video attitude invariance
CN116129291A (en) Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device
CN116935361A (en) Deep learning-based driver distraction behavior detection method
CN114758382A (en) Face AU detection model establishing method and application based on adaptive patch learning
CN112597996B (en) Method for detecting traffic sign significance in natural scene based on task driving
CN113688930A (en) Thyroid nodule calcification recognition device based on deep learning
CN111797704B (en) Action recognition method based on related object perception
CN116740792A (en) Face recognition method and system for sightseeing vehicle operators

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21941701

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21941701

Country of ref document: EP

Kind code of ref document: A1