CN113095152A - Lane line detection method and system based on regression - Google Patents

Lane line detection method and system based on regression

Info

Publication number
CN113095152A
CN113095152A (application CN202110290948.5A)
Authority
CN
China
Prior art keywords: lane line, image, lane, regression, line key
Prior art date
Legal status
Granted
Application number
CN202110290948.5A
Other languages
Chinese (zh)
Other versions
CN113095152B (en)
Inventor
郑南宁 (Zheng Nanning)
朱丹彤 (Zhu Dantong)
黄宇豪 (Huang Yuhao)
王圣琦 (Wang Shengqi)
南智雄 (Nan Zhixiong)
Current Assignee
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN202110290948.5A
Publication of CN113095152A
Application granted
Publication of CN113095152B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/588: Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/23: Clustering techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The invention discloses a regression-based lane line detection method and system. The method comprises the following steps: an image to be detected is input into a trained lane line key point regression network model to obtain a series of lane line key points and their category confidences; the key points are screened by category confidence, and only those whose confidence exceeds a preset threshold are retained for clustering and fitting; the retained points are transformed to the bird's-eye view through a perspective transformation function and clustered with DBSCAN, dividing the predicted key points of the whole image into feature points on different lane lines; finally, according to the shape of lane lines in actual scenes, a quadratic inverse-proportion curve is fitted to the feature points to obtain the lane line detection result. The images require no complex preprocessing and no per-pixel category segmentation, any number of lane lines can be identified, and the method adapts well to detection under various weather conditions.

Description

Lane line detection method and system based on regression
Technical Field
The invention belongs to the technical field of intelligent transportation, and particularly relates to a regression-based lane line detection method and system.
Background
Unmanned vehicles perceive and identify their surroundings mainly through on-board sensors. Visual perception is a core module in current unmanned-driving applications, and lane line detection is an important part of that perception module. Common perception sensors fall into lidar and cameras; for lane line detection, image data provides richer information than three-dimensional point cloud data. Research on lane line detection therefore mainly adopts the camera as the data-providing sensor, which reduces cost while supplying richer information.
At present, traditional lane line detection methods mainly exploit characteristic information of the lane line, such as color, width, edges, or gradient changes, to extract it from the road surface region, and then complete detection with a clustering algorithm and a line-fitting method. As research has progressed, however, the scenes covered by the lane line detection task have diversified: recognizing a lane line is no longer limited to a low-level understanding of white and yellow lines, and more methods focus on detecting the position of the lane line at a semantic level even when its features are blurred or completely occluded.
Current unmanned-driving technology places high demands on lane line detection: beyond basic accuracy, it requires stable detection under a variety of complex environments and stable, real-time detection across continuous scenes. Compared with traditional methods, deep-learning-based lane line detection improves robustness and yields better results in varied environmental conditions. In deep learning, lane line detection is often treated as a segmentation task: the whole image is classified pixel by pixel to obtain a pixel-level result. Segmentation-based methods, however, are computationally expensive and slow, and because each pixel's receptive field is limited, they struggle with slender shapes such as lane lines; detection under occlusion or complex weather is therefore particularly difficult for them.
Besides segmentation-based methods, regression-based deep learning lane line detection has a simple network structure, robust detection across environments, and good engineering applicability. Existing regression-based methods regress lane line feature points under one of two viewpoints: the bird's-eye view or the front view. Although the bird's-eye view reduces redundant information in the image and favors regression of lane line feature points, it requires a perspective-transformation preprocessing step on the original image, and the accuracy of that transformation strongly affects the method's generalization ability.
Disclosure of Invention
To address the poor robustness and low computational efficiency of lane line detection across varied scenes in the prior art, the invention provides a regression-based lane line key point detection method. The network has low complexity, no complicated image preprocessing is required during training or inference, the algorithm is computationally efficient, the final detection result is obtained through clustering-and-fitting post-processing, and the proposed fitting model fits lane lines of large curvature well.
To achieve this purpose, the invention adopts the following technical scheme. In the regression-based lane line detection method, the positions of a series of lane line key points in the image to be detected are obtained through a lane line key point detection model, and the complete lane line detection result is obtained by a clustering-and-fitting post-processing method; specifically, the method comprises the following steps:
inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
transforming key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and based on the characteristic points obtained by clustering, obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system.
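The three post-processing steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the identity homography, the 0.6 confidence threshold, and the DBSCAN parameters (eps, min_samples) are all assumed values, and a tiny hand-rolled DBSCAN stands in for a library implementation.

```python
import numpy as np

def to_birds_eye(points, H):
    """Apply a 3x3 perspective (homography) transform to Nx2 pixel points."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def dbscan(points, eps, min_samples):
    """Tiny DBSCAN: returns one cluster label per point (-1 = noise)."""
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i] or len(neighbors[i]) < min_samples:
            continue
        stack = [i]                      # grow a new cluster from core point i
        while stack:
            j = stack.pop()
            if visited[j]:
                continue
            visited[j] = True
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:  # only core points expand
                stack.extend(neighbors[j])
        cluster += 1
    return labels

# screen key points by class confidence, transform, then cluster
keys = np.array([[100.0, 400.0], [102.0, 380.0], [104.0, 360.0],
                 [300.0, 400.0], [303.0, 380.0], [500.0, 100.0]])
conf = np.array([0.9, 0.8, 0.95, 0.7, 0.85, 0.3])
kept = keys[conf > 0.6]   # 5 points survive the (assumed) 0.6 threshold
H = np.eye(3)             # placeholder homography; a real one comes from calibration
bev = to_birds_eye(kept, H)
labels = dbscan(bev, eps=25.0, min_samples=2)
```

With these assumed parameters, the first three points form one lane and the next two another, matching the role DBSCAN plays in the claims.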
The lane line key point detection model is obtained by adopting the following method:
constructing a regression method-based lane line key point detection network;
preprocessing the forward-looking lane line image and the label thereof to obtain a training data set;
and training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure based on the training data set to obtain a detection model of the key points of the lane line.
The lane line key point detection network model comprises a convolution network feature extraction module, a multi-scale feature fusion module and a lane line key point position and category prediction module; the convolution network feature extraction module is used for extracting features of a forward-looking image of a driving visual angle, converting the forward-looking image into a feature image with a smaller size and outputting image features extracted by the last three down-sampling modules;
the multi-scale feature fusion module is used for performing feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolution network feature extraction module, and outputting three feature images with different scales as the input of the lane line key point position and category prediction module;
the lane line key point position and category prediction module comprises a position prediction unit and a category prediction unit, wherein the lane line key point position prediction unit is used for further positioning the lane line key points according to three feature images with different scales, and regressing the lane line key point features with different channel numbers through the convolution layer to obtain category confidence information and position information of the lane line key points.
When the forward-looking image and the label thereof collected under the driving visual angle are preprocessed:
cutting the front view image, and only reserving a ground area of the lower half part of the image, wherein the ground area contains lane line characteristics;
the size of the cut image is set according to the parameter characteristics of the convolution layer of the convolution characteristic extraction module;
the convolutional network feature extraction module comprises 52 convolutional layers, specifically comprises a single convolutional layer and 5 repeated residual error units, each residual error unit is a downsampling module, wherein each residual error unit in the 5 repeated residual error units comprises 1 single convolutional layer and a repeated residual error unit, and each residual error unit can change the size of an output feature image into that of an input front-view image
Figure BDA0002982618060000031
After the first two down-sampling modules finish coarse-grained feature extraction on the image, the last three down-sampling modules extract more detailed image features as the input of the feature fusion module.
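As a quick check of the size arithmetic: each of the five down-sampling stages halves the spatial resolution, so an input of height h and width w yields feature maps of h/2^k x w/2^k. A small sketch (the 384 x 960 crop size is an assumed example, not from the patent):

```python
def feature_sizes(h, w, stages=5):
    """Spatial size after each down-sampling stage (each stage halves h and w)."""
    sizes = []
    for _ in range(stages):
        h, w = h // 2, w // 2
        sizes.append((h, w))
    return sizes

# the last three entries are the 1/8, 1/16 and 1/32 maps that
# feed the multi-scale fusion module
print(feature_sizes(384, 960))  # [(192, 480), (96, 240), (48, 120), (24, 60), (12, 30)]
```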
The multi-scale feature fusion module performs feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolutional network feature extraction module, and the specific steps of outputting three feature images with different scales are as follows:
the number of channels for outputting the characteristic image through the convolution layer is T1The down-sampling module of (2) performs further feature extraction and saves the result as fmp 1; secondly, by up-sampling fmp1, the number of channels for fusing output characteristic images in channel dimension is T2And further feature extraction is performed based on the fused features, and the result is fmp 2; finally, fmp2 is upsampled and its channel dimension and number of output feature image channels are T3And fusing the down-sampling modules to finally obtain three characteristic images with different scales.
The method comprises the following steps of training a regression network of key points of a lane line by adopting a multi-scale fusion network structure to obtain a detection model of the key points of the lane line as follows:
taking a training data set as the input of a lane line key point regression network, performing end-to-end training on the lane line key point regression network according to a lane line key point detection loss function, inputting the training data set into the constructed lane line key point detection network, and minimizing the category confidence coefficient and the position loss of the lane line key points based on a back propagation algorithm to obtain an optimal lane line key point detection network model;
the lane line key point detection loss function comprises a category confidence coefficient loss function and a position regression loss function, and a lane line key point detection model can be obtained;
the formula for the category confidence loss function is:
Figure BDA0002982618060000041
where j denotes the feature map of three different scales, I denotes each anchor point, H and W are the height and width of the image, respectively, IijIs a Boolean variable and represents the characteristic image under the jth scale, the ith anchor point of the characteristic image contains a lane line or not, if the anchor point contains the lane line, the value is 1, if the anchor point does not contain the lane line, the value is 0, CijThe category prediction value of the ith anchor point of the feature image under the jth scale is between 0 and 1;
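A sketch of a weighted binary cross-entropy of this shape in NumPy; the concrete w_pos and w_neg values below are assumptions for illustration, since the patent derives them from the positive/negative sample counts:

```python
import numpy as np

def class_confidence_loss(C, I, w_pos, w_neg, eps=1e-7):
    """Weighted BCE over anchor class scores C in (0,1) and labels I in {0,1}."""
    C = np.clip(C, eps, 1 - eps)            # numerical safety for the logs
    pos = -w_pos * I * np.log(C)            # anchors that contain a lane line
    neg = -w_neg * (1 - I) * np.log(1 - C)  # anchors that do not
    return float(np.sum(pos + neg))

I = np.array([1.0, 0.0, 0.0, 1.0])   # ground-truth anchor labels
C = np.array([0.9, 0.1, 0.2, 0.8])   # predicted class confidences
loss = class_confidence_loss(C, I, w_pos=2.0, w_neg=0.5)
```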
the formula for the position loss function is:
Figure BDA0002982618060000051
wherein ,
Figure BDA0002982618060000052
and calculating the loss of the absolute error between the predicted value and the true value of the position, wherein the loss is calculated only for the positive example anchor points containing the lane lines, and the loss is ignored for the negative example anchor points not containing the lane lines.
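The masked L1 position loss can be sketched the same way; only positive anchors (I_ij = 1) contribute:

```python
import numpy as np

def position_loss(pred, truth, I):
    """L1 loss on relative key point positions, masked to positive anchors."""
    return float(np.sum(I * np.abs(pred - truth)))

I = np.array([1.0, 0.0, 1.0])      # anchor labels: two positives, one negative
pred = np.array([0.4, 0.9, 0.7])   # predicted relative positions in [0, 1]
truth = np.array([0.5, 0.0, 0.6])  # the negative anchor's truth is ignored
loss = position_loss(pred, truth, I)  # |0.4-0.5| + |0.7-0.6| = 0.2
```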
When the category confidence loss function is calculated, weights of different sizes are set for the positive and negative samples according to the proportion of positive to negative samples:

[weight formulas not recoverable from the source equation images: w_pos and w_neg are computed from num_pos, num_neg, and a constant c]

where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant.
A single lane line is fitted with a quadratic inverse-proportion curve:

[fitting model equation not recoverable from the source image: u is expressed as a quadratic inverse-proportion function of v with parameters a, b, C, D, E]

where (u, v) are pixel coordinates in the image coordinate system and a, b, C, D and E are fitting parameters. Because lanes on an actual road are parallel to each other and have similar curvature within a single image, C, D and E are shared across all lane lines in the same scene, while a and b serve as independent parameters distinguishing each individual lane line.
The image to be detected is a forward-looking image collected under a driving visual angle.
The invention provides a regression-based lane line detection system, which comprises a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing the predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module is used for obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system based on the characteristic points obtained by clustering.
Compared with the prior art, the invention has at least the following beneficial effects:
according to the regression-based lane line detection method, the images do not need to be subjected to complex preprocessing, and each pixel point in the images does not need to be subjected to category segmentation, so that the regression-based lane line detection method has higher execution efficiency; the invention can identify any plurality of lane lines, the number of the lane lines to be detected does not need to be preset, and the invention has good adaptability to detection under various weather environments; a series of lane line key points are predicted through a deep neural network, and a final detection result is obtained through a post-processing mode of clustering and fitting, so that the algorithm has high calculation efficiency and accuracy, strong robustness under various scenes, small video memory occupation and real-time performance in engineering application; the proposed lane line fitting model has a good fitting effect on the lane line with a large curvature.
Drawings
Fig. 1 is a flowchart of a regression-based lane line detection method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a lane line key point detection network based on a regression method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of positive and negative detection examples of the regression-based lane line key point detection network according to the embodiment of the present invention: FIG. 3a shows a feature map output by the multi-scale feature fusion module; FIG. 3b shows a detected positive sample; FIG. 3c shows a detected negative sample.
Fig. 4 is a schematic diagram of a result of the lane line detection method according to the embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a regression-based lane line detection method that needs no complex image preprocessing and no per-pixel category segmentation, and therefore executes efficiently. It can identify any number of lane lines without presetting the count, and adapts well to detection under various weather environments. A series of lane line key points is predicted by a deep neural network and the final result is obtained through clustering-and-fitting post-processing; the algorithm has high computational efficiency and accuracy and strong robustness across scenes. The method mainly comprises the following steps:
step 1, constructing a regression method-based lane line key point detection network, enabling the network to process an input training data set, and simultaneously realizing the function of training a lane line key point detection model.
In the embodiment of the invention, the detection of the key points of the lane lines is realized by a regression method, each pixel in the image does not need to be classified, the complexity of calculation is reduced, and the detection efficiency is improved.
As shown in fig. 2, the network for detecting the key points of the lane lines based on the regression method includes: the system comprises a convolution network feature extraction module, a multi-scale feature fusion module and a lane line key point position and category prediction module; wherein,
and the multi-scale feature fusion module fuses and outputs three kinds of feature information with different scales based on the result obtained by the convolution network feature extraction module, and the feature information is used as the input of the lane line key point position and category prediction module.
1) And a convolutional network feature extraction module.
In the embodiment of the invention, the convolution network feature extraction module performs feature extraction on the forward-looking image of the driving visual angle through five down-sampling modules and converts the forward-looking image into a feature image with a smaller size.
As an example, the convolutional network feature extraction module comprises 52 convolutional layers, structured as a single convolutional layer followed by 5 repeated residual stages, each stage acting as a down-sampling module. Each of the 5 stages comprises 1 independent convolutional layer plus repeated residual units, the repetition counts of the residual units being 1, 2, 8, 8 and 4 respectively, and each down-sampling stage reduces the output feature image to 1/2 the size of its input.
Further, assume the input image size is h × w × 3, where h is the image height, w the image width, and 3 the number of image channels (RGB). After the 5 down-sampling modules of the convolutional network feature extraction module, the final output feature image has size (h/32) × (w/32) × C, where C is the number of channels of the feature image. After the first two down-sampling modules complete coarse-grained feature extraction on the image, the last three extract finer image features; in the multi-scale feature fusion module, the outputs of these last three down-sampling modules are used for feature fusion.
2) And a multi-scale feature fusion module.
The multi-scale feature fusion module performs feature fusion in the channel dimension using the outputs of the last three down-sampling modules of the convolutional network feature extraction module. The channel numbers T1, T2 and T3 of the 3 outputs are 256, 512 and 1024 respectively, i.e. feature image sizes of (h/8) × (w/8) × 256, (h/16) × (w/16) × 512 and (h/32) × (w/32) × 1024.
As an example: in the first step, the output of the down-sampling module with 1024 feature image channels is passed through a convolutional layer for further feature extraction, and the result is stored as fmp1. In the second step, fmp1 is up-sampled and fused in the channel dimension with the output of the down-sampling module with 512 channels; further feature extraction is performed on the fused features, and the result is stored as fmp2. Finally, fmp2 is up-sampled and fused in the channel dimension with the output of the down-sampling module with 256 channels, finally yielding three feature images of different scales.
3) Lane line key point position and category prediction module
And the lane line key point position and category prediction module is used for further positioning the lane line key points based on the feature images output by the multi-scale feature fusion module, and regressing the features of the lane line key points with different channel numbers to the channel number of 2 through the convolution layer, wherein 2 represents the category confidence degree dimension and the position dimension of the lane line key points.
The lane line key point position and category prediction module comprises a category confidence prediction unit and a position prediction unit, wherein the category confidence prediction unit represents whether the original image pixel semantics corresponding to the feature map anchor points contain lane line features through Boolean variables, if so, the category confidence prediction unit is set to 1, and if not, the category confidence prediction unit is set to 0; the position prediction unit is used for positioning the position of the key point of the lane line in the positive sample containing the lane line characteristics, and the output result is the relative position of the feature point of the lane line in the anchor point, and the size is between 0 and 1.
The category confidence dimension of the lane line key points indicates, through a Boolean variable, whether the original-image pixels behind a feature-map anchor semantically contain a lane line; in cases of occlusion or dashed lines, the detection result is output according to the semantics rather than the actual pixel appearance. As shown by the two regions marked in the feature map of Fig. 3a: the anchor in Fig. 3b contains a lane line in its corresponding original-image pixels and the lane line intersects the anchor's center line, so it is treated as a positive sample; although the region in Fig. 3c also contains lane line features, the line does not intersect the anchor's center line, so it is treated as a negative sample. The position dimension locates the lane line key point within a positive sample containing lane line features; the output is the relative position of the lane line feature point within the anchor, with a value between 0 and 1. If an anchor contains several lane line feature points, only the one closest to its center is kept.
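The positive/negative assignment just described can be sketched with a hypothetical 1-D helper (assumed conventions: a row of anchors each `stride` pixels wide; an anchor is positive when a key point falls inside it; the label stores the point's relative position; when several points land in one anchor, the point nearest the anchor centre is kept):

```python
def assign_anchors(keypoints_u, image_w, stride):
    """Build (contains, rel_pos) labels for a row of image_w // stride anchors."""
    n = image_w // stride
    contains = [0.0] * n
    rel_pos = [0.0] * n
    for u in keypoints_u:
        i = int(u // stride)
        if i >= n:
            continue
        rel = (u - i * stride) / stride  # relative position in [0, 1)
        if contains[i] == 1.0:
            # keep only the point closest to the anchor centre (0.5)
            if abs(rel - 0.5) >= abs(rel_pos[i] - 0.5):
                continue
        contains[i] = 1.0
        rel_pos[i] = rel
    return contains, rel_pos

contains, rel_pos = assign_anchors([10.0, 70.0, 76.0], image_w=128, stride=32)
# anchor 0 holds u=10 (rel 0.3125); anchor 2 keeps u=76
# (rel 0.375, nearer the centre than u=70's rel 0.1875)
```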
And 2, preprocessing the forward-looking image and the label thereof under the driving visual angle to obtain a training data set.
As an example, a forward-looking image and its labels collected under a driving view angle are preprocessed to establish a training data set required by a lane line key point detection network.
The front-view image is processed by cutting away the redundant sky region in the upper part of the image, keeping only the lower ground region containing the lane line features. The size of the cropped image is set according to the parameters of the convolutional layers in the feature extraction module: the convolutional network feature extraction module down-samples 5 times in total, each time halving the spatial size of the feature image relative to its input, so the cropped image must be sized such that every down-sampling step outputs integer feature dimensions.
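Since five halvings require the cropped height and width to be divisible by 2^5 = 32, a small helper along these lines could compute the crop (the crop-from-the-top convention and the example 590 x 1640 image size are assumptions, not from the patent):

```python
def crop_to_multiple(h, w, horizon, factor=32):
    """Crop away the sky above `horizon`, then shrink the crop so both
    dimensions are divisible by `factor` (here 2**5 for five downsamplings)."""
    ch = ((h - horizon) // factor) * factor  # largest valid height below horizon
    cw = (w // factor) * factor
    top = h - ch  # crop from the top, keeping the bottom (road) rows
    return top, ch, cw

top, ch, cw = crop_to_multiple(h=590, w=1640, horizon=200)
# 590 - 200 = 390 -> 384 rows kept, starting at row 206; width 1640 -> 1632
```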
The lane line label is a binary image marking the lane line positions; a series of lane line feature point sets is first obtained from this binary image. Next, according to the key point positions and the sizes of the three different-scale key point outputs of the category prediction module, it is computed whether each anchor of the corresponding feature map contains a lane line key point, and if so, its relative position within the anchor is further computed. For an input image of size h × w × 3, the three scales of lane line key point information have sizes (h/8) × (w/8) × 2, (h/16) × (w/16) × 2 and (h/32) × (w/32) × 2, and the preprocessed labels have the same sizes as this feature information. Here 2 is the number of channels of the key point information: the 1st channel indicates whether the anchor contains a lane line (1 if it does, 0 if not), and the 2nd channel gives the relative position of the lane line within the anchor, with a value between 0 and 1.
In the specific implementation process of the invention, the step 1 and the step 2 do not distinguish the execution sequence, and the step 1 or the step 2 can be executed first or synchronously.
And 3, training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure to obtain a detection model of the key points of the lane line.
In the embodiment of the invention, a training data set is used as the input of the regression network of the key points of the lane line, the regression network of the key points of the lane line is trained end to end according to the designed detection loss function of the key points of the lane line, and the detection loss function of the key points of the lane line comprises a category confidence coefficient loss function and a position regression loss function.
1) A class confidence loss function.
The category confidence loss function is used to calculate whether the original-image pixel region corresponding to a feature map anchor point contains a lane line, and is calculated as:

L_cls = -(1/(H·W)) · Σ_j Σ_i [ I_ij·log(C_ij) + (1−I_ij)·log(1−C_ij) ]

where j indexes the feature maps at the three different scales, i indexes the anchor points, and H and W are the height and width of the image, respectively. I_ij is a Boolean variable indicating whether the i-th anchor point of the feature map at the j-th scale contains a lane line: its value is 1 if the anchor point contains a lane line and 0 if it does not. C_ij is the category prediction value of the i-th anchor point of the feature map at the j-th scale, with a value between 0 and 1. To address the imbalance between positive and negative samples, when the category confidence loss is calculated, weights of different magnitudes are set for the positive and negative examples according to the ratio of positive to negative samples, calculated as:
Figure BDA0002982618060000111
where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant, set to 1.02 in this method;
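The class-confidence loss can be illustrated with a small weighted binary cross-entropy routine. The inverse-frequency weighting scheme below, including the way the constant c enters it, is an assumption for illustration: the patent states only that the weights depend on the positive/negative ratio with c = 1.02.

```python
import numpy as np

def class_confidence_loss(pred, target, c=1.02, eps=1e-7):
    """Weighted binary cross-entropy over anchor class predictions.

    pred, target: arrays of the same shape; target is 0/1, pred in (0, 1).
    The inverse-frequency weights below are illustrative, not the
    patent's exact formula.
    """
    num_pos = max(target.sum(), 1.0)
    num_neg = max(target.size - target.sum(), 1.0)
    w_pos = c * num_neg / (num_pos + num_neg)   # up-weight the rare positives
    w_neg = num_pos / (num_pos + num_neg)
    pred = np.clip(pred, eps, 1 - eps)          # avoid log(0)
    loss = -(w_pos * target * np.log(pred)
             + w_neg * (1 - target) * np.log(1 - pred))
    return loss.mean()
```

With many more negative anchors than positive ones, w_pos grows and w_neg shrinks, which is the stated purpose of the weighting.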
2) a position loss function.
The position loss function is used to calculate the relative position of lane line key points within anchor points that contain a lane line, with the position error measured by the L1 norm:

L_loc = Σ_j Σ_i I_ij · |p_ij − p̂_ij|

where p_ij denotes the true relative position of the lane line key point within the i-th anchor point of the feature map at the j-th scale, and p̂_ij denotes the corresponding predicted value. When the absolute error between the predicted and true positions is calculated, the loss is computed only for positive-example anchor points containing lane lines and is ignored for negative-example anchor points that do not contain lane lines.
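The masked L1 position loss described above can be sketched as follows (function and argument names are illustrative):

```python
import numpy as np

def position_loss(pred_pos, true_pos, pos_mask):
    """L1 position loss computed only over positive (lane-containing) anchors.

    pred_pos, true_pos: relative positions per anchor, same shape.
    pos_mask: 1.0 for anchors containing a lane line, 0.0 otherwise.
    """
    diff = np.abs(pred_pos - true_pos) * pos_mask  # zero out negative anchors
    n_pos = max(pos_mask.sum(), 1.0)               # avoid division by zero
    return diff.sum() / n_pos
```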
In the embodiment of the invention, the detection loss function of the key points of the lane lines in the training is obtained by summing a category confidence coefficient loss function and a position loss function. The regression network of the key points of the lane lines is trained to obtain the optimal model parameters so as to deduce more accurate key points of the lane lines.
And 4, obtaining the positions of a series of lane line key points of the image to be detected through a lane line key point detection model, and obtaining a complete lane line detection result by combining a post-processing method of clustering and fitting.
As shown in the lane line detection flow in fig. 1, the implementation manner of this step is as follows:
1) Inputting the forward-looking image to be detected into the trained lane line key point regression network model to obtain a series of lane line key points and their category confidences.
2) Screening the lane line key points obtained by the regression network model according to the category confidence, and keeping the key points whose confidence is greater than 0.6 (i.e. the preset value) as the lane line key points used in the clustering and fitting steps.
3) After a series of predicted lane line key points are obtained, the points are transformed from the original image view to the bird's-eye-view image through a perspective transformation function for DBSCAN clustering, which divides the predicted lane key points in the whole image into feature points on different instance lane lines.
4) According to the shape of the lane line in an actual scene, a common polynomial curve is fitted to the lane line, the fitting formula being:
y = ax³ + bx² + cx + d
where a, b, c and d are constants, a ≠ 0, and (x, y) are the coordinates of points on the ground.
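The cubic fit in step 4) can be reproduced with NumPy's polynomial fitting; the sample coefficients below are made up purely for illustration:

```python
import numpy as np

# Fit y = a*x^3 + b*x^2 + c*x + d to (synthetic) clustered lane points.
# np.polyfit returns coefficients from highest to lowest degree: [a, b, c, d].
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 0.5 * x**3 - x**2 + 2 * x + 1       # synthetic ground points
a, b, c, d = np.polyfit(x, y, 3)
print(a, b, c, d)                        # recovers ~0.5, -1.0, 2.0, 1.0
```

With exact polynomial data the least-squares fit recovers the generating coefficients to numerical precision; on real clustered key points it returns the best cubic in the least-squares sense.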
Based on the transformation relation between a camera coordinate system and a world coordinate system and the pitch angle and the yaw angle of a camera relative to the ground, a quadratic inverse proportion curve is constructed to describe a lane line model, and based on the result of DBSCAN clustering, a single lane line is fitted through the following fitting model:
Figure BDA0002982618060000121
where (u, v) denotes pixel coordinates in the image coordinate system and a, b, C, D and E are fitting parameters. In a single image, the lanes on the actual road are parallel to each other and have similar curvature, so C, D and E among the fitting parameters are shared across all lane lines in the same scene as shared parameters, while a and b serve as independent parameters that distinguish each individual lane line.
Detection results on the TuSimple dataset, which is commonly used for the lane line detection task, are shown in FIG. 4; it can be seen that the method achieves a good detection effect on curves of large curvature and under occlusion by other vehicles or shadows.
Compared with the prior art, the invention mainly obtains the following technical effects:
1) the method does not need to classify each pixel in the image, offers higher computational efficiency and a smaller GPU memory footprint, and meets the real-time requirements of engineering applications;
2) the method adopts a multi-scale feature fusion method, and has stronger robustness under various complex scenes and diversified weather conditions;
3) the lane line fitting model provided by the invention has a good fitting effect on the lane line with larger curvature.
For ease of understanding, the present invention is further described below with reference to specific examples.
As shown in fig. 1, the method first converts the original lane line image dataset to be processed into a lane line dataset available for training through data preprocessing. And secondly, inputting the training data set into the lane line key point detection network, and training to obtain a lane line key point detection model. And finally, screening the key points of the lane line according to the category confidence of the key points of the lane line, and implementing DBSCAN clustering and curve fitting on the screened feature points to obtain a final lane line detection result.
1. And constructing a lane line training data set.
Firstly, the original front-view image is cropped to retain the ground area containing lane line features, with the size of the cropped image chosen so that the output size of each downsampling layer of the lane line key point detection network is an integer. The label image is then processed, and its size is defined according to the output size of the multi-scale fusion module in the network model. The output of the method comprises two dimensions, category and position: the category indicates whether the current area contains a lane line, and the position gives the relative position of the lane line key point within that area.
2. And (5) training a lane line key point detection network model.
The network uses DarkNet53 as the convolutional feature extraction module, and a DarkNet53 model trained on the ImageNet dataset serves as the pre-trained model of the network. After the training data set is input into the constructed lane line key point detection network, the category confidence and position losses of the lane line key points are minimized by the back-propagation algorithm, thereby obtaining the optimal lane line key point detection network model. The network model architecture is shown in fig. 2.
3. And detecting the lane lines of the images to be predicted.
The image to be detected is input into the trained lane line key point detection network, and forward propagation through the network yields a series of lane line key points in the image together with their category confidences. The network outputs lane line key points at three different scales, whose numbers are (h/32)×(w/32), (h/16)×(w/16) and (h/8)×(w/8), respectively, where h and w represent the height and width of the input image. As an example, for a 1280×720 frame cropped to the ground area, the network outputs lane line key points at the three scales numbering 480, 1920 and 7680, respectively.
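The per-scale counts follow directly from the formulas above. Note that the quoted counts 480, 1920 and 7680 are consistent with a 1280×384 input, i.e. a 1280×720 frame after cropping away the upper region; the function below is an illustrative sketch:

```python
def keypoint_counts(h, w):
    """Number of output anchors at the three scales (strides 32, 16, 8)."""
    return [(h // s) * (w // s) for s in (32, 16, 8)]

# The counts 480, 1920 and 7680 quoted in the text correspond to a
# 1280x384 input (a 1280x720 frame cropped to the road area).
print(keypoint_counts(384, 1280))  # [480, 1920, 7680]
```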
The output lane line key points are screened based on their category confidences: key points with confidence greater than or equal to 0.6 are considered positive-example lane line key points. After the screened lane line key points are obtained, they are processed through clustering and fitting to obtain the final lane line detection result.
Specifically, the DBSCAN clustering method is used to cluster the lane line key points; it does not require the number of clusters to be preset and places no limit on the number of detected lane lines. Because of the near-large, far-small perspective of the camera image, the lane line feature points at the near end of the image are denser than those at the far end, so clustering cannot be performed directly on the forward-looking image. In the present invention, this problem is solved by transforming the key points to the bird's-eye-view image and scaling its height direction; after clustering is completed, the points are projected back to the forward-looking view and restored to the original scale.
As an example, in the method, after conversion to the bird's-eye view, the size in the key point height direction is reduced by a factor of 5.
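The bird's-eye-view clustering step can be sketched as follows. The identity homography, the eps value, and the simplified DBSCAN are illustrative assumptions; a production system would typically use OpenCV for the perspective transform and scikit-learn's DBSCAN for clustering.

```python
import numpy as np

def warp_points(points, H):
    """Apply a 3x3 perspective (homography) matrix to an Nx2 point array."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    out = pts @ H.T
    return out[:, :2] / out[:, 2:3]   # divide by homogeneous coordinate

def dbscan(points, eps, min_pts=2):
    """Tiny density-based clustering in the spirit of DBSCAN.

    Returns an integer label per point, -1 for noise. Border-point
    handling is simplified; illustrative only.
    """
    n = len(points)
    labels = np.full(n, -1)
    cluster = 0
    d = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    for i in range(n):
        if labels[i] != -1:
            continue
        neigh = np.where(d[i] <= eps)[0]
        if len(neigh) < min_pts:
            continue                     # not a core point: leave as noise
        stack = list(neigh)
        while stack:                     # expand the cluster
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster
                nj = np.where(d[j] <= eps)[0]
                if len(nj) >= min_pts:
                    stack.extend(nj)
        cluster += 1
    return labels
```

In the patent's pipeline, the key points would first be warped to the bird's-eye view, the height coordinate scaled down (by a factor of 5 in the example above) to even out the near/far density, clustered, and then projected back.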
After the lane lines of different examples are distinguished through clustering, curve fitting is carried out on the lane lines through a quadratic inverse proportion curve model, and a final lane line detection result is obtained.
The invention also provides a regression-based lane line detection system, which comprises a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the points of the image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module adopts quadratic inverse proportion curve fitting to obtain a lane line detection result according to the shape of the lane line in the actual scene and the characteristic points.

Claims (10)

1. A regression-based lane line detection method is characterized in that positions of a series of lane line key points in an image to be detected are obtained through a lane line key point detection model, and a complete lane line detection result is obtained by combining a clustering and fitting post-processing method, and the method specifically comprises the following steps:
inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
transforming key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing predicted lane key points in the whole image into feature points on different lane lines;
and based on the characteristic points obtained by clustering, obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system.
2. The regression-based lane line detection method according to claim 1, wherein the lane line key point detection model is obtained by using the following method:
constructing a regression method-based lane line key point detection network;
preprocessing the forward-looking lane line image and the label thereof to obtain a training data set;
and training a regression network of the key points of the lane line by adopting a multi-scale fusion network structure based on the training data set to obtain a detection model of the key points of the lane line.
3. The regression-based lane line detection method according to claim 1 or 2, wherein the lane line keypoint detection network model comprises a convolution network feature extraction module, a multi-scale feature fusion module, and a lane line keypoint location and category prediction module; the convolution network feature extraction module is used for extracting features of a forward-looking image of a driving visual angle, converting the forward-looking image into a feature image with a smaller size and outputting image features extracted by the last three down-sampling modules;
the multi-scale feature fusion module is used for performing feature fusion on the channel dimension by using the image features output by the last three down-sampling modules in the convolution network feature extraction module, and outputting three feature images with different scales as the input of the lane line key point position and category prediction module;
the lane line key point position and category prediction module comprises a position prediction unit and a category prediction unit, wherein the lane line key point position prediction unit is used for further positioning the lane line key points according to three feature images with different scales, and regressing the lane line key point features with different channel numbers through the convolution layer to obtain category confidence information and position information of the lane line key points.
4. The regression-based lane line detection method according to claim 3, wherein when preprocessing the forward-looking images and their labels collected from the driving perspective:
cutting the front view image, and only reserving a ground area of the lower half part of the image, wherein the ground area contains lane line characteristics;
the size of the cut image is set according to the parameter characteristics of the convolution layer of the convolution characteristic extraction module;
the convolutional network feature extraction module comprises 52 convolutional layers, specifically a single convolutional layer followed by 5 repeated residual units, each residual unit being a downsampling module comprising 1 single convolutional layer and repeated residual blocks; each downsampling module halves the feature image, so that the size of its output feature image is 1/2 that of its input;
After the first two down-sampling modules finish coarse-grained feature extraction on the image, the last three down-sampling modules extract more detailed image features as the input of the feature fusion module.
5. The regression-based lane line detection method according to claim 3, wherein the multi-scale feature fusion module performs feature fusion in channel dimensions using image features output by the last three down-sampling modules in the convolutional network feature extraction module, and outputting three feature images with different scales is specifically:
first, further feature extraction is performed on the output of the downsampling module whose feature image has T1 channels, and the result is saved as fmp1; secondly, fmp1 is upsampled and fused in the channel dimension with the output of the downsampling module whose feature image has T2 channels, further feature extraction is performed on the fused features, and the result is saved as fmp2; finally, fmp2 is upsampled and fused in the channel dimension with the output of the downsampling module whose feature image has T3 channels, finally obtaining three feature images of different scales.
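The fusion scheme can be sketched shape-wise in NumPy. Channel counts and function names are illustrative, and the real network interleaves convolutions between the upsample-and-concatenate steps:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(f32, f16, f8):
    """Channel-dimension fusion of the three deepest feature maps.

    f32, f16, f8: maps at strides 32, 16 and 8 of the input image.
    Returns three fused maps, one per scale (shapes only; the actual
    network applies further convolutions at each step).
    """
    fmp1 = f32                                      # coarsest scale
    fmp2 = np.concatenate([upsample2x(fmp1), f16])  # fuse with stride-16 map
    fmp3 = np.concatenate([upsample2x(fmp2), f8])   # fuse with stride-8 map
    return fmp1, fmp2, fmp3
```

Each fused map keeps the spatial resolution of the finer input while stacking the upsampled coarser features along the channel axis.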
6. The regression-based lane line detection method according to claim 2, wherein a multi-scale fusion network structure is adopted to train a lane line key point regression network, and the obtained lane line key point detection model is specifically as follows:
taking a training data set as the input of a lane line key point regression network, performing end-to-end training on the lane line key point regression network according to a lane line key point detection loss function, inputting the training data set into the constructed lane line key point detection network, and minimizing the category confidence coefficient and the position loss of the lane line key points based on a back propagation algorithm to obtain an optimal lane line key point detection network model;
the lane line key point detection loss function comprises a category confidence loss function and a position regression loss function, from which the lane line key point detection model is obtained;
the formula for the category confidence loss function is:
Figure FDA0002982618050000031
where j denotes the feature maps at the three different scales, i denotes each anchor point, and H and W are the height and width of the image, respectively; I_ij is a Boolean variable indicating whether the i-th anchor point of the feature map at the j-th scale contains a lane line, taking the value 1 if the anchor point contains a lane line and 0 if it does not; C_ij is the category prediction value of the i-th anchor point of the feature map at the j-th scale, with a value between 0 and 1;
the formula for the position loss function is:
Figure FDA0002982618050000032
wherein ,
Figure FDA0002982618050000033
and when the absolute error between the predicted value and the true value of the position is calculated, the loss is computed only for positive-example anchor points containing lane lines, and is ignored for negative-example anchor points not containing lane lines.
7. The regression-based lane line detection method according to claim 6, wherein weights with different magnitudes are set for the positive and negative samples according to the ratio of the positive and negative samples when calculating the category confidence loss function, and the calculation formula is:
Figure FDA0002982618050000041
Figure FDA0002982618050000042
where num_pos denotes the number of positive samples, num_neg denotes the number of negative samples, and c is a constant.
8. The regression-based lane line detection method according to claim 1, wherein a quadratic inverse proportion curve is adopted to fit a single lane line, and the fitting model is as follows:
Figure FDA0002982618050000043
the method comprises the steps that (u, v) represents pixel point coordinates under an image coordinate system, a, b, C, D and E are fitting parameters, lanes on an actual road are parallel to each other in a single image and have similar curvatures, C, D and E in the fitting parameters are shared to all lane lines in the same scene as shared parameters, and a and b are used as independent parameters to distinguish each single lane line.
9. The regression-based lane line detection method according to claim 1, wherein said image to be detected is a forward-looking image collected from a driving viewpoint.
10. The regression-based lane line detection system is characterized by comprising a lane line key point acquisition module, a clustering module and a curve fitting module;
the lane line key point acquisition module is used for inputting an image to be detected into a trained lane line key point regression network model to obtain a series of lane line key points and category confidence coefficients thereof, screening the lane line key points according to the category confidence coefficients, and reserving the lane line key points with the confidence coefficients larger than a preset value as the lane line key points used in clustering and fitting;
the clustering module is used for transforming the key points in the screened image to be detected to the aerial view image through a perspective transformation function to perform DBSCAN clustering, and dividing the predicted lane key points in the whole image into feature points on different lane lines;
and the curve fitting module is used for obtaining a lane line detection result by adopting quadratic inverse proportion curve fitting according to the shape of the lane line in the actual scene and the transformation relation between the camera coordinate system and the world coordinate system based on the characteristic points obtained by clustering.
CN202110290948.5A 2021-03-18 2021-03-18 Regression-based lane line detection method and system Active CN113095152B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290948.5A CN113095152B (en) 2021-03-18 2021-03-18 Regression-based lane line detection method and system


Publications (2)

Publication Number Publication Date
CN113095152A true CN113095152A (en) 2021-07-09
CN113095152B CN113095152B (en) 2023-08-22

Family

ID=76669311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290948.5A Active CN113095152B (en) 2021-03-18 2021-03-18 Regression-based lane line detection method and system

Country Status (1)

Country Link
CN (1) CN113095152B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463720A (en) * 2022-01-25 2022-05-10 杭州飞步科技有限公司 Lane line detection method based on line segment intersection-to-parallel ratio loss function
CN114708569A (en) * 2022-02-22 2022-07-05 广州文远知行科技有限公司 Road curve detection method, device, equipment and storage medium
CN116543365A (en) * 2023-07-06 2023-08-04 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium
CN113706705B (en) * 2021-09-03 2023-09-26 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium for high-precision map

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008600A (en) * 2019-12-06 2020-04-14 中国科学技术大学 Lane line detection method
CN111460984A (en) * 2020-03-30 2020-07-28 华南理工大学 Global lane line detection method based on key point and gradient balance loss
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN111814593A (en) * 2020-06-19 2020-10-23 浙江大华技术股份有限公司 Traffic scene analysis method and device, and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
蔡英凤; 张田田; 王海; 李?承; 孙晓强; 陈龙: "Multi-lane line detection based on instance segmentation and adaptive perspective transformation algorithm", Journal of Southeast University (Natural Science Edition), no. 04 *
陈立潮; 徐秀芝; 曹建芳; 潘理虎: "Multi-scene lane line detection with auxiliary loss", Journal of Image and Graphics, no. 09 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113706705B (en) * 2021-09-03 2023-09-26 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium for high-precision map
CN114463720A (en) * 2022-01-25 2022-05-10 杭州飞步科技有限公司 Lane line detection method based on line segment intersection-to-parallel ratio loss function
CN114463720B (en) * 2022-01-25 2022-10-21 杭州飞步科技有限公司 Lane line detection method based on line segment intersection ratio loss function
CN114708569A (en) * 2022-02-22 2022-07-05 广州文远知行科技有限公司 Road curve detection method, device, equipment and storage medium
CN116543365A (en) * 2023-07-06 2023-08-04 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium
CN116543365B (en) * 2023-07-06 2023-10-10 广汽埃安新能源汽车股份有限公司 Lane line identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113095152B (en) 2023-08-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant