CN113095152A - Lane line detection method and system based on regression - Google Patents

Lane line detection method and system based on regression

Info

Publication number
CN113095152A
CN113095152A (application CN202110290948.5A)
Authority
CN
China
Prior art keywords: lane line, image, lane, regression, line key
Prior art date
Legal status
Granted
Application number
CN202110290948.5A
Other languages
Chinese (zh)
Other versions
CN113095152B (en)
Inventor
郑南宁 (Zheng Nanning)
朱丹彤 (Zhu Dantong)
黄宇豪 (Huang Yuhao)
王圣琦 (Wang Shengqi)
南智雄 (Nan Zhixiong)
Current Assignee
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN202110290948.5A
Publication of CN113095152A
Application granted
Publication of CN113095152B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/588: Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/23: Clustering techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The invention discloses a regression-based lane line detection method and system. The method comprises the following steps: an image to be detected is input into a trained lane line key point regression network model to obtain a series of lane line key points and their category confidences; the key points are screened by category confidence, and only those whose confidence exceeds a preset threshold are retained for clustering and fitting; the retained points are transformed to the bird's-eye view through a perspective transformation function and clustered with DBSCAN, dividing the predicted key points of the whole image into feature points on different lane lines; finally, according to the shape of lane lines in actual scenes, a quadratic inverse-proportion curve is fitted to the feature points to obtain the lane line detection result. The images require no complex preprocessing and no per-pixel category segmentation, any number of lane lines can be identified, and the method adapts well to detection under various weather conditions.

Description

Lane line detection method and system based on regression
Technical Field
The invention belongs to the technical field of intelligent transportation, and particularly relates to a regression-based lane line detection method and system.
Background
Unmanned vehicles perceive and identify their surroundings mainly through on-board sensors. Visual perception is a core module in current unmanned-driving applications, and lane line detection is an important part of that perception module. Common perception sensors fall into lidar and cameras; for lane line detection, image data provides richer information than three-dimensional point cloud data. Research on lane line detection therefore mainly adopts the camera as the data-providing sensor, which reduces cost while supplying richer information.
At present, traditional lane line detection methods mainly exploit characteristic information of the lane line, such as color, width, edges, or gradient changes, to extract it from the road surface region, and then complete detection with a clustering algorithm and a line-fitting method. As research has progressed, however, the scenes covered by the lane line detection task have diversified: recognizing a lane line is no longer limited to a low-level understanding of white and yellow lines, and more methods focus on detecting the position of the lane line at a semantic level even when its features are blurred or completely occluded.
Current unmanned-driving technology places high demands on lane line detection: beyond basic accuracy, it requires stable detection under a variety of complex environments and stable, real-time detection across continuous scenes. Compared with traditional methods, deep-learning-based lane line detection improves robustness and yields better results in varied environmental conditions. In deep learning, lane line detection is often treated as a segmentation task: the whole image is classified pixel by pixel to obtain a pixel-level result. Segmentation-based methods, however, are computationally expensive and slow, and because each pixel's receptive field is limited, they struggle with slender shapes such as lane lines; detection under occlusion or complex weather is therefore particularly difficult for them.
Besides segmentation-based methods, regression-based deep learning lane line detection has a simple network structure, robust detection across environments, and good engineering applicability. Existing regression-based methods regress lane line feature points under one of two viewpoints: the bird's-eye view or the front view. Although the bird's-eye view reduces redundant information in the image and favors regression of lane line feature points, it requires a perspective-transformation preprocessing step on the original image, and the accuracy of that transformation strongly affects the method's generalization ability.
Disclosure of Invention
To address the poor robustness and low computational efficiency of lane line detection across varied scenes in the prior art, the invention provides a regression-based lane line key point detection method. The network has low complexity, no complicated image preprocessing is required during training or inference, the algorithm is computationally efficient, the final detection result is obtained through clustering-and-fitting post-processing, and the proposed fitting model fits lane lines of large curvature well.
To achieve this purpose, the invention adopts the following technical scheme. In the regression-based lane line detection method, the positions of a series of lane line key points in the image to be detected are obtained through a lane line key point detection model, and the complete lane line detection result is obtained by a clustering-and-fitting post-processing method; specifically, the method comprises the following steps:
inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
transforming key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and based on the characteristic points obtained by clustering, obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system.
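The three post-processing steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the identity homography, the 0.6 confidence threshold, and the DBSCAN parameters (eps, min_samples) are all assumed values, and a tiny hand-rolled DBSCAN stands in for a library implementation.

```python
import numpy as np

def to_birds_eye(points, H):
    """Apply a 3x3 perspective (homography) transform to Nx2 pixel points."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def dbscan(points, eps, min_samples):
    """Tiny DBSCAN: returns one cluster label per point (-1 = noise)."""
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i] or len(neighbors[i]) < min_samples:
            continue
        stack = [i]                      # grow a new cluster from core point i
        while stack:
            j = stack.pop()
            if visited[j]:
                continue
            visited[j] = True
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:  # only core points expand
                stack.extend(neighbors[j])
        cluster += 1
    return labels

# screen key points by class confidence, transform, then cluster
keys = np.array([[100.0, 400.0], [102.0, 380.0], [104.0, 360.0],
                 [300.0, 400.0], [303.0, 380.0], [500.0, 100.0]])
conf = np.array([0.9, 0.8, 0.95, 0.7, 0.85, 0.3])
kept = keys[conf > 0.6]   # 5 points survive the (assumed) 0.6 threshold
H = np.eye(3)             # placeholder homography; a real one comes from calibration
bev = to_birds_eye(kept, H)
labels = dbscan(bev, eps=25.0, min_samples=2)
```

With these assumed parameters, the first three points form one lane and the next two another, matching the role DBSCAN plays in the claims.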
The lane line key point detection model is obtained by adopting the following method:
constructing a regression method-based lane line key point detection network;
preprocessing the forward-looking lane line image and the label thereof to obtain a training data set;
and training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure based on the training data set to obtain a detection model of the key points of the lane line.
The lane line key point detection network model comprises a convolution network feature extraction module, a multi-scale feature fusion module and a lane line key point position and category prediction module; the convolution network feature extraction module is used for extracting features of a forward-looking image of a driving visual angle, converting the forward-looking image into a feature image with a smaller size and outputting image features extracted by the last three down-sampling modules;
the multi-scale feature fusion module is used for performing feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolution network feature extraction module, and outputting three feature images with different scales as the input of the lane line key point position and category prediction module;
the lane line key point position and category prediction module comprises a position prediction unit and a category prediction unit, wherein the lane line key point position prediction unit is used for further positioning the lane line key points according to three feature images with different scales, and regressing the lane line key point features with different channel numbers through the convolution layer to obtain category confidence information and position information of the lane line key points.
When the forward-looking image and the label thereof collected under the driving visual angle are preprocessed:
cutting the front view image, and only reserving a ground area of the lower half part of the image, wherein the ground area contains lane line characteristics;
the size of the cut image is set according to the parameter characteristics of the convolution layer of the convolution characteristic extraction module;
the convolutional network feature extraction module comprises 52 convolutional layers, specifically comprises a single convolutional layer and 5 repeated residual error units, each residual error unit is a downsampling module, wherein each residual error unit in the 5 repeated residual error units comprises 1 single convolutional layer and a repeated residual error unit, and each residual error unit can change the size of an output feature image into that of an input front-view image
Figure BDA0002982618060000031
After the first two down-sampling modules finish coarse-grained feature extraction on the image, the last three down-sampling modules extract more detailed image features as the input of the feature fusion module.
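As a quick check of the size arithmetic: each of the five down-sampling stages halves the spatial resolution, so an input of height h and width w yields feature maps of h/2^k x w/2^k. A small sketch (the 384 x 960 crop size is an assumed example, not from the patent):

```python
def feature_sizes(h, w, stages=5):
    """Spatial size after each down-sampling stage (each stage halves h and w)."""
    sizes = []
    for _ in range(stages):
        h, w = h // 2, w // 2
        sizes.append((h, w))
    return sizes

# the last three entries are the 1/8, 1/16 and 1/32 maps that
# feed the multi-scale fusion module
print(feature_sizes(384, 960))  # [(192, 480), (96, 240), (48, 120), (24, 60), (12, 30)]
```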
The multi-scale feature fusion module performs feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolutional network feature extraction module, and the specific steps of outputting three feature images with different scales are as follows:
the number of channels for outputting the characteristic image through the convolution layer is T1The down-sampling module of (2) performs further feature extraction and saves the result as fmp 1; secondly, by up-sampling fmp1, the number of channels for fusing output characteristic images in channel dimension is T2And further feature extraction is performed based on the fused features, and the result is fmp 2; finally, fmp2 is upsampled and its channel dimension and number of output feature image channels are T3And fusing the down-sampling modules to finally obtain three characteristic images with different scales.
The method comprises the following steps of training a regression network of key points of a lane line by adopting a multi-scale fusion network structure to obtain a detection model of the key points of the lane line as follows:
taking a training data set as the input of a lane line key point regression network, performing end-to-end training on the lane line key point regression network according to a lane line key point detection loss function, inputting the training data set into the constructed lane line key point detection network, and minimizing the category confidence coefficient and the position loss of the lane line key points based on a back propagation algorithm to obtain an optimal lane line key point detection network model;
the lane line key point detection loss function comprises a category confidence coefficient loss function and a position regression loss function, and a lane line key point detection model can be obtained;
the formula for the category confidence loss function is:
Figure BDA0002982618060000041
where j denotes the feature map of three different scales, I denotes each anchor point, H and W are the height and width of the image, respectively, IijIs a Boolean variable and represents the characteristic image under the jth scale, the ith anchor point of the characteristic image contains a lane line or not, if the anchor point contains the lane line, the value is 1, if the anchor point does not contain the lane line, the value is 0, CijThe category prediction value of the ith anchor point of the feature image under the jth scale is between 0 and 1;
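A sketch of a weighted binary cross-entropy of this shape in NumPy; the concrete w_pos and w_neg values below are assumptions for illustration, since the patent derives them from the positive/negative sample counts:

```python
import numpy as np

def class_confidence_loss(C, I, w_pos, w_neg, eps=1e-7):
    """Weighted BCE over anchor class scores C in (0,1) and labels I in {0,1}."""
    C = np.clip(C, eps, 1 - eps)            # numerical safety for the logs
    pos = -w_pos * I * np.log(C)            # anchors that contain a lane line
    neg = -w_neg * (1 - I) * np.log(1 - C)  # anchors that do not
    return float(np.sum(pos + neg))

I = np.array([1.0, 0.0, 0.0, 1.0])   # ground-truth anchor labels
C = np.array([0.9, 0.1, 0.2, 0.8])   # predicted class confidences
loss = class_confidence_loss(C, I, w_pos=2.0, w_neg=0.5)
```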
the formula for the position loss function is:
Figure BDA0002982618060000051
wherein ,
Figure BDA0002982618060000052
and calculating the loss of the absolute error between the predicted value and the true value of the position, wherein the loss is calculated only for the positive example anchor points containing the lane lines, and the loss is ignored for the negative example anchor points not containing the lane lines.
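The masked L1 position loss can be sketched the same way; only positive anchors (I_ij = 1) contribute:

```python
import numpy as np

def position_loss(pred, truth, I):
    """L1 loss on relative key point positions, masked to positive anchors."""
    return float(np.sum(I * np.abs(pred - truth)))

I = np.array([1.0, 0.0, 1.0])      # anchor labels: two positives, one negative
pred = np.array([0.4, 0.9, 0.7])   # predicted relative positions in [0, 1]
truth = np.array([0.5, 0.0, 0.6])  # the negative anchor's truth is ignored
loss = position_loss(pred, truth, I)  # |0.4-0.5| + |0.7-0.6| = 0.2
```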
When the category confidence loss function is calculated, weights of different sizes are set for the positive and negative samples according to the proportion of positive to negative samples:

[weight formulas not recoverable from the source equation images: w_pos and w_neg are computed from num_pos, num_neg, and a constant c]

where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant.
A single lane line is fitted with a quadratic inverse-proportion curve:

[fitting model equation not recoverable from the source image: u is expressed as a quadratic inverse-proportion function of v with parameters a, b, C, D, E]

where (u, v) are pixel coordinates in the image coordinate system and a, b, C, D and E are fitting parameters. Because lanes on an actual road are parallel to each other and have similar curvature within a single image, C, D and E are shared across all lane lines in the same scene, while a and b serve as independent parameters distinguishing each individual lane line.
The image to be detected is a forward-looking image collected under a driving visual angle.
The invention provides a regression-based lane line detection system, which comprises a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing the predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module is used for obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system based on the characteristic points obtained by clustering.
Compared with the prior art, the invention has at least the following beneficial effects:
according to the regression-based lane line detection method, the images do not need to be subjected to complex preprocessing, and each pixel point in the images does not need to be subjected to category segmentation, so that the regression-based lane line detection method has higher execution efficiency; the invention can identify any plurality of lane lines, the number of the lane lines to be detected does not need to be preset, and the invention has good adaptability to detection under various weather environments; a series of lane line key points are predicted through a deep neural network, and a final detection result is obtained through a post-processing mode of clustering and fitting, so that the algorithm has high calculation efficiency and accuracy, strong robustness under various scenes, small video memory occupation and real-time performance in engineering application; the proposed lane line fitting model has a good fitting effect on the lane line with a large curvature.
Drawings
Fig. 1 is a flowchart of a regression-based lane line detection method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a lane line key point detection network based on a regression method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of positive and negative detection examples of the regression-based lane line key point detection network according to the embodiment of the present invention: FIG. 3a shows a feature map output by the multi-scale feature fusion module; FIG. 3b shows a detected positive sample; FIG. 3c shows a detected negative sample.
Fig. 4 is a schematic diagram of a result of the lane line detection method according to the embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a regression-based lane line detection method that needs no complex image preprocessing and no per-pixel category segmentation, and therefore executes efficiently. It can identify any number of lane lines without presetting the count, and adapts well to detection under various weather environments. A series of lane line key points is predicted by a deep neural network and the final result is obtained through clustering-and-fitting post-processing; the algorithm has high computational efficiency and accuracy and strong robustness across scenes. The method mainly comprises the following steps:
step 1, constructing a regression method-based lane line key point detection network, enabling the network to process an input training data set, and simultaneously realizing the function of training a lane line key point detection model.
In the embodiment of the invention, the detection of the key points of the lane lines is realized by a regression method, each pixel in the image does not need to be classified, the complexity of calculation is reduced, and the detection efficiency is improved.
As shown in fig. 2, the network for detecting the key points of the lane lines based on the regression method includes: the system comprises a convolution network feature extraction module, a multi-scale feature fusion module and a lane line key point position and category prediction module; wherein,
and the multi-scale feature fusion module fuses and outputs three kinds of feature information with different scales based on the result obtained by the convolution network feature extraction module, and the feature information is used as the input of the lane line key point position and category prediction module.
1) And a convolutional network feature extraction module.
In the embodiment of the invention, the convolution network feature extraction module performs feature extraction on the forward-looking image of the driving visual angle through five down-sampling modules and converts the forward-looking image into a feature image with a smaller size.
As an example, the convolutional network feature extraction module comprises 52 convolutional layers, structured as a single convolutional layer followed by 5 repeated residual stages, each stage acting as a down-sampling module. Each of the 5 stages comprises 1 independent convolutional layer plus repeated residual units, the repetition counts of the residual units being 1, 2, 8, 8 and 4 respectively, and each down-sampling stage reduces the output feature image to 1/2 the size of its input.
Further, assume the input image size is h × w × 3, where h is the image height, w the image width, and 3 the number of image channels (RGB). After the 5 down-sampling modules of the convolutional network feature extraction module, the final output feature image has size (h/32) × (w/32) × C, where C is the number of channels of the feature image. After the first two down-sampling modules complete coarse-grained feature extraction on the image, the last three extract finer image features; in the multi-scale feature fusion module, the outputs of these last three down-sampling modules are used for feature fusion.
2) And a multi-scale feature fusion module.
The multi-scale feature fusion module performs feature fusion in the channel dimension using the outputs of the last three down-sampling modules of the convolutional network feature extraction module. The channel numbers T1, T2 and T3 of the 3 outputs are 256, 512 and 1024 respectively, i.e. feature image sizes of (h/8) × (w/8) × 256, (h/16) × (w/16) × 512 and (h/32) × (w/32) × 1024.
As an example: in the first step, the output of the down-sampling module with 1024 feature image channels is passed through a convolutional layer for further feature extraction, and the result is stored as fmp1. In the second step, fmp1 is up-sampled and fused in the channel dimension with the output of the down-sampling module with 512 channels; further feature extraction is performed on the fused features, and the result is stored as fmp2. Finally, fmp2 is up-sampled and fused in the channel dimension with the output of the down-sampling module with 256 channels, finally yielding three feature images of different scales.
3) Lane line key point position and category prediction module
And the lane line key point position and category prediction module is used for further positioning the lane line key points based on the feature images output by the multi-scale feature fusion module, and regressing the features of the lane line key points with different channel numbers to the channel number of 2 through the convolution layer, wherein 2 represents the category confidence degree dimension and the position dimension of the lane line key points.
The lane line key point position and category prediction module comprises a category confidence prediction unit and a position prediction unit, wherein the category confidence prediction unit represents whether the original image pixel semantics corresponding to the feature map anchor points contain lane line features through Boolean variables, if so, the category confidence prediction unit is set to 1, and if not, the category confidence prediction unit is set to 0; the position prediction unit is used for positioning the position of the key point of the lane line in the positive sample containing the lane line characteristics, and the output result is the relative position of the feature point of the lane line in the anchor point, and the size is between 0 and 1.
The category confidence dimension of the lane line key points indicates, through a Boolean variable, whether the original-image pixels behind a feature-map anchor semantically contain a lane line; in cases of occlusion or dashed lines, the detection result is output according to the semantics rather than the actual pixel appearance. As shown by the two regions marked in the feature map of Fig. 3a: the anchor in Fig. 3b contains a lane line in its corresponding original-image pixels and the lane line intersects the anchor's center line, so it is treated as a positive sample; although the region in Fig. 3c also contains lane line features, the line does not intersect the anchor's center line, so it is treated as a negative sample. The position dimension locates the lane line key point within a positive sample containing lane line features; the output is the relative position of the lane line feature point within the anchor, with a value between 0 and 1. If an anchor contains several lane line feature points, only the one closest to its center is kept.
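The positive/negative assignment just described can be sketched with a hypothetical 1-D helper (assumed conventions: a row of anchors each `stride` pixels wide; an anchor is positive when a key point falls inside it; the label stores the point's relative position; when several points land in one anchor, the point nearest the anchor centre is kept):

```python
def assign_anchors(keypoints_u, image_w, stride):
    """Build (contains, rel_pos) labels for a row of image_w // stride anchors."""
    n = image_w // stride
    contains = [0.0] * n
    rel_pos = [0.0] * n
    for u in keypoints_u:
        i = int(u // stride)
        if i >= n:
            continue
        rel = (u - i * stride) / stride  # relative position in [0, 1)
        if contains[i] == 1.0:
            # keep only the point closest to the anchor centre (0.5)
            if abs(rel - 0.5) >= abs(rel_pos[i] - 0.5):
                continue
        contains[i] = 1.0
        rel_pos[i] = rel
    return contains, rel_pos

contains, rel_pos = assign_anchors([10.0, 70.0, 76.0], image_w=128, stride=32)
# anchor 0 holds u=10 (rel 0.3125); anchor 2 keeps u=76
# (rel 0.375, nearer the centre than u=70's rel 0.1875)
```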
And 2, preprocessing the forward-looking image and the label thereof under the driving visual angle to obtain a training data set.
As an example, a forward-looking image and its labels collected under a driving view angle are preprocessed to establish a training data set required by a lane line key point detection network.
The front-view image is processed by cutting away the redundant sky region in the upper part of the image, keeping only the lower ground region containing the lane line features. The size of the cropped image is set according to the parameters of the convolutional layers in the feature extraction module: the convolutional network feature extraction module down-samples 5 times in total, each time halving the spatial size of the feature image relative to its input, so the cropped image must be sized such that every down-sampling step outputs integer feature dimensions.
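Since five halvings require the cropped height and width to be divisible by 2^5 = 32, a small helper along these lines could compute the crop (the crop-from-the-top convention and the example 590 x 1640 image size are assumptions, not from the patent):

```python
def crop_to_multiple(h, w, horizon, factor=32):
    """Crop away the sky above `horizon`, then shrink the crop so both
    dimensions are divisible by `factor` (here 2**5 for five downsamplings)."""
    ch = ((h - horizon) // factor) * factor  # largest valid height below horizon
    cw = (w // factor) * factor
    top = h - ch  # crop from the top, keeping the bottom (road) rows
    return top, ch, cw

top, ch, cw = crop_to_multiple(h=590, w=1640, horizon=200)
# 590 - 200 = 390 -> 384 rows kept, starting at row 206; width 1640 -> 1632
```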
The lane line label is a binary image marking the lane line positions; a series of lane line feature point sets is first obtained from this binary image. Next, according to the key point positions and the sizes of the three different-scale key point outputs of the category prediction module, it is computed whether each anchor of the corresponding feature map contains a lane line key point, and if so, its relative position within the anchor is further computed. For an input image of size h × w × 3, the three scales of lane line key point information have sizes (h/8) × (w/8) × 2, (h/16) × (w/16) × 2 and (h/32) × (w/32) × 2, and the preprocessed labels have the same sizes as this feature information. Here 2 is the number of channels of the key point information: the 1st channel indicates whether the anchor contains a lane line (1 if it does, 0 if not), and the 2nd channel gives the relative position of the lane line within the anchor, with a value between 0 and 1.
In the specific implementation process of the invention, the step 1 and the step 2 do not distinguish the execution sequence, and the step 1 or the step 2 can be executed first or synchronously.
And 3, training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure to obtain a detection model of the key points of the lane line.
In the embodiment of the invention, a training data set is used as the input of the regression network of the key points of the lane line, the regression network of the key points of the lane line is trained end to end according to the designed detection loss function of the key points of the lane line, and the detection loss function of the key points of the lane line comprises a category confidence coefficient loss function and a position regression loss function.
1) A class confidence loss function.
The category confidence loss function is used to calculate whether the original-image pixel region corresponding to a feature map anchor point contains a lane line, and is calculated as:

L_cls = -(1/(H·W)) · Σ_j Σ_i [ I_ij·log(C_ij) + (1−I_ij)·log(1−C_ij) ]

where j indexes the feature maps at the three different scales, i indexes the anchor points, and H and W are the height and width of the image, respectively. I_ij is a Boolean variable indicating whether the i-th anchor point of the feature map at the j-th scale contains a lane line: its value is 1 if the anchor point contains a lane line and 0 if it does not. C_ij is the category prediction value of the i-th anchor point of the feature map at the j-th scale, with a value between 0 and 1. To address the imbalance between positive and negative samples, when the category confidence loss is calculated, weights of different magnitudes are set for the positive and negative examples according to the ratio of positive to negative samples, calculated as:
Figure BDA0002982618060000111
where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant, set to 1.02 in this method;
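The class-confidence loss can be illustrated with a small weighted binary cross-entropy routine. The inverse-frequency weighting scheme below, including the way the constant c enters it, is an assumption for illustration: the patent states only that the weights depend on the positive/negative ratio with c = 1.02.

```python
import numpy as np

def class_confidence_loss(pred, target, c=1.02, eps=1e-7):
    """Weighted binary cross-entropy over anchor class predictions.

    pred, target: arrays of the same shape; target is 0/1, pred in (0, 1).
    The inverse-frequency weights below are illustrative, not the
    patent's exact formula.
    """
    num_pos = max(target.sum(), 1.0)
    num_neg = max(target.size - target.sum(), 1.0)
    w_pos = c * num_neg / (num_pos + num_neg)   # up-weight the rare positives
    w_neg = num_pos / (num_pos + num_neg)
    pred = np.clip(pred, eps, 1 - eps)          # avoid log(0)
    loss = -(w_pos * target * np.log(pred)
             + w_neg * (1 - target) * np.log(1 - pred))
    return loss.mean()
```

With many more negative anchors than positive ones, w_pos grows and w_neg shrinks, which is the stated purpose of the weighting.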
2) a position loss function.
The position loss function is used to calculate the relative position of lane line key points within anchor points that contain a lane line, with the position error measured by the L1 norm:

L_loc = Σ_j Σ_i I_ij · |p_ij − p̂_ij|

where p_ij denotes the true relative position of the lane line key point within the i-th anchor point of the feature map at the j-th scale, and p̂_ij denotes the corresponding predicted value. When the absolute error between the predicted and true positions is calculated, the loss is computed only for positive-example anchor points containing lane lines and is ignored for negative-example anchor points that do not contain lane lines.
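The masked L1 position loss described above can be sketched as follows (function and argument names are illustrative):

```python
import numpy as np

def position_loss(pred_pos, true_pos, pos_mask):
    """L1 position loss computed only over positive (lane-containing) anchors.

    pred_pos, true_pos: relative positions per anchor, same shape.
    pos_mask: 1.0 for anchors containing a lane line, 0.0 otherwise.
    """
    diff = np.abs(pred_pos - true_pos) * pos_mask  # zero out negative anchors
    n_pos = max(pos_mask.sum(), 1.0)               # avoid division by zero
    return diff.sum() / n_pos
```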
In the embodiment of the invention, the detection loss function of the key points of the lane lines in the training is obtained by summing a category confidence coefficient loss function and a position loss function. The regression network of the key points of the lane lines is trained to obtain the optimal model parameters so as to deduce more accurate key points of the lane lines.
And 4, obtaining the positions of a series of lane line key points of the image to be detected through a lane line key point detection model, and obtaining a complete lane line detection result by combining a post-processing method of clustering and fitting.
As shown in the lane line detection flow in fig. 1, the implementation manner of this step is as follows:
1) Inputting the forward-looking image to be detected into the trained lane line key point regression network model to obtain a series of lane line key points and their category confidences.
2) Screening the lane line key points obtained by the regression network model according to the category confidence, and keeping the key points whose confidence is greater than 0.6 (i.e. the preset value) as the lane line key points used in the clustering and fitting steps.
3) After a series of predicted lane line key points are obtained, the points are transformed from the original image view to the bird's-eye-view image through a perspective transformation function for DBSCAN clustering, which divides the predicted lane key points in the whole image into feature points on different instance lane lines.
4) According to the shape of the lane line in an actual scene, a common polynomial curve is fitted to the lane line, the fitting formula being:
y = ax³ + bx² + cx + d
where a, b, c and d are constants, a ≠ 0, and (x, y) are the coordinates of points on the ground.
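The cubic fit in step 4) can be reproduced with NumPy's polynomial fitting; the sample coefficients below are made up purely for illustration:

```python
import numpy as np

# Fit y = a*x^3 + b*x^2 + c*x + d to (synthetic) clustered lane points.
# np.polyfit returns coefficients from highest to lowest degree: [a, b, c, d].
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 0.5 * x**3 - x**2 + 2 * x + 1       # synthetic ground points
a, b, c, d = np.polyfit(x, y, 3)
print(a, b, c, d)                        # recovers ~0.5, -1.0, 2.0, 1.0
```

With exact polynomial data the least-squares fit recovers the generating coefficients to numerical precision; on real clustered key points it returns the best cubic in the least-squares sense.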
Based on the transformation relation between a camera coordinate system and a world coordinate system and the pitch angle and the yaw angle of a camera relative to the ground, a quadratic inverse proportion curve is constructed to describe a lane line model, and based on the result of DBSCAN clustering, a single lane line is fitted through the following fitting model:
Figure BDA0002982618060000121
where (u, v) denotes pixel coordinates in the image coordinate system and a, b, C, D and E are fitting parameters. In a single image, the lanes on the actual road are parallel to each other and have similar curvature, so C, D and E among the fitting parameters are shared across all lane lines in the same scene as shared parameters, while a and b serve as independent parameters that distinguish each individual lane line.
Detection results on the TuSimple dataset, which is commonly used for the lane line detection task, are shown in FIG. 4; it can be seen that the method achieves a good detection effect on curves of large curvature and under occlusion by other vehicles or shadows.
Compared with the prior art, the invention mainly obtains the following technical effects:
1) the method does not need to classify each pixel in the image, offers higher computational efficiency and a smaller GPU memory footprint, and meets the real-time requirements of engineering applications;
2) the method adopts a multi-scale feature fusion method, and has stronger robustness under various complex scenes and diversified weather conditions;
3) the lane line fitting model provided by the invention has a good fitting effect on the lane line with larger curvature.
For ease of understanding, the present invention is further described below with reference to specific examples.
As shown in fig. 1, the method first converts the original lane line image dataset to be processed into a lane line dataset available for training through data preprocessing. And secondly, inputting the training data set into the lane line key point detection network, and training to obtain a lane line key point detection model. And finally, screening the key points of the lane line according to the category confidence of the key points of the lane line, and implementing DBSCAN clustering and curve fitting on the screened feature points to obtain a final lane line detection result.
1. And constructing a lane line training data set.
Firstly, the original front-view image is cropped to retain the ground area containing lane line features, with the size of the cropped image chosen so that the output size of each downsampling layer of the lane line key point detection network is an integer. The label image is then processed, and its size is defined according to the output size of the multi-scale fusion module in the network model. The output of the method comprises two dimensions, category and position: the category indicates whether the current area contains a lane line, and the position gives the relative position of the lane line key point within that area.
2. And (5) training a lane line key point detection network model.
The network uses DarkNet53 as the convolutional feature extraction module, and a DarkNet53 model trained on the ImageNet dataset serves as the pre-trained model of the network. After the training data set is input into the constructed lane line key point detection network, the category confidence and position losses of the lane line key points are minimized by the back-propagation algorithm, thereby obtaining the optimal lane line key point detection network model. The network model architecture is shown in fig. 2.
3. And detecting the lane lines of the images to be predicted.
The image to be detected is input into the trained lane line key point detection network, and forward propagation through the network yields a series of lane line key points in the image together with their category confidences. The network outputs lane line key points at three different scales, whose numbers are (h/32)×(w/32), (h/16)×(w/16) and (h/8)×(w/8), respectively, where h and w represent the height and width of the input image. As an example, for a 1280×720 frame cropped to the ground area, the network outputs lane line key points at the three scales numbering 480, 1920 and 7680, respectively.
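The per-scale counts follow directly from the formulas above. Note that the quoted counts 480, 1920 and 7680 are consistent with a 1280×384 input, i.e. a 1280×720 frame after cropping away the upper region; the function below is an illustrative sketch:

```python
def keypoint_counts(h, w):
    """Number of output anchors at the three scales (strides 32, 16, 8)."""
    return [(h // s) * (w // s) for s in (32, 16, 8)]

# The counts 480, 1920 and 7680 quoted in the text correspond to a
# 1280x384 input (a 1280x720 frame cropped to the road area).
print(keypoint_counts(384, 1280))  # [480, 1920, 7680]
```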
The output lane line key points are screened based on their category confidences: key points with confidence greater than or equal to 0.6 are considered positive-example lane line key points. After the screened lane line key points are obtained, they are processed through clustering and fitting to obtain the final lane line detection result.
Specifically, the DBSCAN clustering method is used to cluster the lane line key points; it does not require the number of clusters to be preset and places no limit on the number of detected lane lines. Because of the near-large, far-small perspective of the camera image, the lane line feature points at the near end of the image are denser than those at the far end, so clustering cannot be performed directly on the forward-looking image. In the present invention, this problem is solved by transforming the key points to the bird's-eye-view image and scaling its height direction; after clustering is completed, the points are projected back to the forward-looking view and restored to the original scale.
As an example, in the method, after conversion to the bird's-eye view, the size in the key point height direction is reduced by a factor of 5.
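The bird's-eye-view clustering step can be sketched as follows. The identity homography, the eps value, and the simplified DBSCAN are illustrative assumptions; a production system would typically use OpenCV for the perspective transform and scikit-learn's DBSCAN for clustering.

```python
import numpy as np

def warp_points(points, H):
    """Apply a 3x3 perspective (homography) matrix to an Nx2 point array."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    out = pts @ H.T
    return out[:, :2] / out[:, 2:3]   # divide by homogeneous coordinate

def dbscan(points, eps, min_pts=2):
    """Tiny density-based clustering in the spirit of DBSCAN.

    Returns an integer label per point, -1 for noise. Border-point
    handling is simplified; illustrative only.
    """
    n = len(points)
    labels = np.full(n, -1)
    cluster = 0
    d = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    for i in range(n):
        if labels[i] != -1:
            continue
        neigh = np.where(d[i] <= eps)[0]
        if len(neigh) < min_pts:
            continue                     # not a core point: leave as noise
        stack = list(neigh)
        while stack:                     # expand the cluster
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster
                nj = np.where(d[j] <= eps)[0]
                if len(nj) >= min_pts:
                    stack.extend(nj)
        cluster += 1
    return labels
```

In the patent's pipeline, the key points would first be warped to the bird's-eye view, the height coordinate scaled down (by a factor of 5 in the example above) to even out the near/far density, clustered, and then projected back.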
After the lane lines of different examples are distinguished through clustering, curve fitting is carried out on the lane lines through a quadratic inverse proportion curve model, and a final lane line detection result is obtained.
The invention also provides a regression-based lane line detection system, which comprises a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the points of the image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module adopts quadratic inverse proportion curve fitting to obtain a lane line detection result according to the shape of the lane line in the actual scene and the characteristic points.

Claims (10)

1. A regression-based lane line detection method is characterized in that positions of a series of lane line key points in an image to be detected are obtained through a lane line key point detection model, and a complete lane line detection result is obtained by combining a clustering and fitting post-processing method, and the method specifically comprises the following steps:
inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
transforming key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and based on the characteristic points obtained by clustering, obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system.
2. The regression-based lane line detection method according to claim 1, wherein the lane line key point detection model is obtained by using the following method:
constructing a regression method-based lane line key point detection network;
preprocessing the forward-looking lane line image and the label thereof to obtain a training data set;
and training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure based on the training data set to obtain a detection model of the key points of the lane line.
3. The regression-based lane line detection method according to claim 1 or 2, wherein the lane line keypoint detection network model comprises a convolution network feature extraction module, a multi-scale feature fusion module, and a lane line keypoint location and category prediction module; the convolution network feature extraction module is used for extracting features of a forward-looking image of a driving visual angle, converting the forward-looking image into a feature image with a smaller size and outputting image features extracted by the last three down-sampling modules;
the multi-scale feature fusion module is used for performing feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolution network feature extraction module, and outputting three feature images with different scales as the input of the lane line key point position and category prediction module;
the lane line key point position and category prediction module comprises a position prediction unit and a category prediction unit, wherein the lane line key point position prediction unit is used for further positioning the lane line key points according to three feature images with different scales, and regressing the lane line key point features with different channel numbers through the convolution layer to obtain category confidence information and position information of the lane line key points.
4. The regression-based lane line detection method according to claim 3, wherein when preprocessing the forward-looking images and their labels collected from the driving perspective:
cutting the front view image, and only reserving a ground area of the lower half part of the image, wherein the ground area contains lane line characteristics;
the size of the cut image is set according to the parameter characteristics of the convolution layer of the convolution characteristic extraction module;
the convolutional network feature extraction module comprises 52 convolutional layers, specifically a single convolutional layer followed by 5 repeated residual units, each residual unit being a downsampling module comprising 1 single convolutional layer and repeated residual blocks; each downsampling module halves the feature image, so that the size of its output feature image is 1/2 that of its input;
After the first two down-sampling modules finish coarse-grained feature extraction on the image, the last three down-sampling modules extract more detailed image features as the input of the feature fusion module.
5. The regression-based lane line detection method according to claim 3, wherein the multi-scale feature fusion module performs feature fusion in channel dimensions using image features output by the last three down-sampling modules in the convolutional network feature extraction module, and outputting three feature images with different scales is specifically:
first, further feature extraction is performed on the output of the downsampling module whose feature image has T1 channels, and the result is saved as fmp1; secondly, fmp1 is upsampled and fused in the channel dimension with the output of the downsampling module whose feature image has T2 channels, further feature extraction is performed on the fused features, and the result is saved as fmp2; finally, fmp2 is upsampled and fused in the channel dimension with the output of the downsampling module whose feature image has T3 channels, finally obtaining three feature images of different scales.
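The fusion scheme can be sketched shape-wise in NumPy. Channel counts and function names are illustrative, and the real network interleaves convolutions between the upsample-and-concatenate steps:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(f32, f16, f8):
    """Channel-dimension fusion of the three deepest feature maps.

    f32, f16, f8: maps at strides 32, 16 and 8 of the input image.
    Returns three fused maps, one per scale (shapes only; the actual
    network applies further convolutions at each step).
    """
    fmp1 = f32                                      # coarsest scale
    fmp2 = np.concatenate([upsample2x(fmp1), f16])  # fuse with stride-16 map
    fmp3 = np.concatenate([upsample2x(fmp2), f8])   # fuse with stride-8 map
    return fmp1, fmp2, fmp3
```

Each fused map keeps the spatial resolution of the finer input while stacking the upsampled coarser features along the channel axis.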
6. The regression-based lane line detection method according to claim 2, wherein a multi-scale fusion network structure is adopted to train a lane line key point regression network, and the obtained lane line key point detection model is specifically as follows:
taking a training data set as the input of a lane line key point regression network, performing end-to-end training on the lane line key point regression network according to a lane line key point detection loss function, inputting the training data set into the constructed lane line key point detection network, and minimizing the category confidence coefficient and the position loss of the lane line key points based on a back propagation algorithm to obtain an optimal lane line key point detection network model;
the lane line key point detection loss function comprises a category confidence loss function and a position regression loss function, from which the lane line key point detection model is obtained;
the formula for the category confidence loss function is:
Figure FDA0002982618050000031
where j denotes the feature maps at the three different scales, i denotes each anchor point, and H and W are the height and width of the image, respectively; I_ij is a Boolean variable indicating whether the i-th anchor point of the feature map at the j-th scale contains a lane line, taking the value 1 if the anchor point contains a lane line and 0 if it does not; C_ij is the category prediction value of the i-th anchor point of the feature map at the j-th scale, with a value between 0 and 1;
the formula for the position loss function is:
Figure FDA0002982618050000032
wherein ,
Figure FDA0002982618050000033
and when the absolute error between the predicted value and the true value of the position is calculated, the loss is computed only for positive-example anchor points containing lane lines, and is ignored for negative-example anchor points not containing lane lines.
7. The regression-based lane line detection method according to claim 6, wherein weights with different magnitudes are set for the positive and negative samples according to the ratio of the positive and negative samples when calculating the category confidence loss function, and the calculation formula is:
Figure FDA0002982618050000041
Figure FDA0002982618050000042
where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant.
8. The regression-based lane line detection method according to claim 1, wherein a quadratic inverse proportion curve is adopted to fit a single lane line, and the fitting model is as follows:
Figure FDA0002982618050000043
the method comprises the steps that (u, v) represents pixel point coordinates under an image coordinate system, a, b, C, D and E are fitting parameters, lanes on an actual road are parallel to each other in a single image and have similar curvatures, C, D and E in the fitting parameters are shared to all lane lines in the same scene as shared parameters, and a and b are used as independent parameters to distinguish each single lane line.
9. The regression-based lane line detection method according to claim 1, wherein said image to be detected is a forward-looking image collected from a driving viewpoint.
10. The regression-based lane line detection system is characterized by comprising a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing the predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module is used for obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system based on the characteristic points obtained by clustering.
CN202110290948.5A 2021-03-18 2021-03-18 Regression-based lane line detection method and system Active CN113095152B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290948.5A CN113095152B (en) 2021-03-18 2021-03-18 Regression-based lane line detection method and system


Publications (2)

Publication Number Publication Date
CN113095152A true CN113095152A (en) 2021-07-09
CN113095152B CN113095152B (en) 2023-08-22

Family

ID=76669311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290948.5A Active CN113095152B (en) 2021-03-18 2021-03-18 Regression-based lane line detection method and system

Country Status (1)

Country Link
CN (1) CN113095152B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463720A (en) * 2022-01-25 2022-05-10 杭州飞步科技有限公司 Lane line detection method based on line segment intersection-to-parallel ratio loss function
CN114708569A (en) * 2022-02-22 2022-07-05 广州文远知行科技有限公司 Road curve detection method, device, equipment and storage medium
CN116543365A (en) * 2023-07-06 2023-08-04 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium
CN113706705B (en) * 2021-09-03 2023-09-26 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium for high-precision map

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008600A (en) * 2019-12-06 2020-04-14 中国科学技术大学 Lane line detection method
CN111460984A (en) * 2020-03-30 2020-07-28 华南理工大学 Global lane line detection method based on key point and gradient balance loss
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN111814593A (en) * 2020-06-19 2020-10-23 浙江大华技术股份有限公司 Traffic scene analysis method and device, and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
蔡英凤; 张田田; 王海; 李?承; 孙晓强; 陈龙: "Multi-lane line detection based on instance segmentation and adaptive perspective transformation algorithm", Journal of Southeast University (Natural Science Edition), no. 04 *
陈立潮; 徐秀芝; 曹建芳; 潘理虎: "Multi-scene lane line detection with auxiliary loss", Journal of Image and Graphics, no. 09 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113706705B (en) * 2021-09-03 2023-09-26 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium for high-precision map
CN114463720A (en) * 2022-01-25 2022-05-10 杭州飞步科技有限公司 Lane line detection method based on line segment intersection-to-parallel ratio loss function
CN114463720B (en) * 2022-01-25 2022-10-21 杭州飞步科技有限公司 Lane line detection method based on line segment intersection ratio loss function
CN114708569A (en) * 2022-02-22 2022-07-05 广州文远知行科技有限公司 Road curve detection method, device, equipment and storage medium
CN116543365A (en) * 2023-07-06 2023-08-04 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium
CN116543365B (en) * 2023-07-06 2023-10-10 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113095152B (en) 2023-08-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant