CN110163069B - Lane line detection method for driving assistance - Google Patents

Lane line detection method for driving assistance

Info

Publication number
CN110163069B
CN110163069B
Authority
CN
China
Prior art keywords
lane line
image
neural network
convolutional neural
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910008697.XA
Other languages
Chinese (zh)
Other versions
CN110163069A (en)
Inventor
王孝润
欧阳琼林
刘晓清
王亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Cookoo Technology Co ltd
Original Assignee
Shenzhen Cookoo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Cookoo Technology Co ltd filed Critical Shenzhen Cookoo Technology Co ltd
Priority to CN201910008697.XA
Publication of CN110163069A
Application granted
Publication of CN110163069B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a lane line detection method for driving assistance. An image sample is trained on a multi-task convolutional neural network, and the correlation between vehicles and lane lines is used to assist lane line recognition, which improves detection safety for the complex and changeable environments of real-world automated driving while reducing the demand on hardware computing. The lane line detection method not only makes assisted driving safer but also offers higher working efficiency, more accurate recognition, fewer misjudgments, and other advantages.

Description

Lane line detection method for driving assistance
Technical Field
The invention relates to the technical field of intelligent transportation, and in particular to a lane line detection method for driving assistance.
Background
With the rapid development of China's economy, the number of vehicles on the road keeps growing quickly and traffic safety is receiving ever more attention, so giving automobiles intelligent environment-sensing capability and lane line detection technology can greatly improve their safety and comfort. Algorithms based on pattern recognition and machine learning have found some application in vehicle lane line detection, but most past implementations rely on conventional image processing and pattern recognition techniques. The basic pipeline is: sliding-window feature extraction -> classifier classification. Almost all vehicle detection work follows this pattern of hand-crafted feature operators combined with classifiers, while lane line detection combines image thresholding with curve fitting and image morphology. Typical representatives of the traditional approach are Haar features + AdaBoost and HOG features + SVM, and the DPM model proposed by Felzenszwalb in 2008 is arguably the best hand-crafted feature method in target detection: it improves on HOG features and introduces combined global and local models, greatly raising the detection accuracy achievable with hand-crafted features. However, DPM features are relatively complex, computation is slow, and detection of rotated or stretched objects is poor. Because in-vehicle video generally has high resolution and a non-fixed viewing angle, applying DPM directly to the target detection problem of driverless video is unlikely to guarantee real-time performance or model generalization. Moreover, because driving environments are variable and complex, these methods often perform poorly, suffer from weak robustness and generalization, and are limited in use. Their results also lack good semantic descriptions and cannot provide a sound decision basis for automated driving.
In recent years, the rapid development of deep learning has offered a new way to tackle detection in traffic scenes: target detection can be handled with deep learning and treated as regression of bounding boxes.
Girshick carried out such a series of studies in 2014, forming the research line R-CNN -> SPPNet -> Fast R-CNN -> Faster R-CNN. The general procedure of these methods is: candidate region generation -> deep-network feature extraction -> classifier classification and regression correction. The advent of R-CNN successfully reduced the regression problem to a classification problem, but it is also clear that repeating feature extraction for every candidate region makes it slow. SPPNet places a spatial pyramid pooling layer after the last convolutional layer, so the network input need not be a fixed size and information loss from stretching and cropping the image is avoided as far as possible; it establishes a mapping between regions of the original image and the extracted features, so the features for a given region can be computed directly, avoiding repeated convolution. To solve R-CNN's repeated computation over many candidate regions, Fast R-CNN borrows the SPPNet idea and adds a RoI pooling layer, which plays the same role as SPPNet's pooling layer; based on ample experiments it replaces the SVM with softmax, and classification and bounding-box regression are carried out at the back of the same network, greatly reducing computation cost. This avoids repeated convolution and integrates several tasks, further improving efficiency. With the architecture and optimization of the whole network basically settled, the speed bottleneck becomes candidate region generation. To solve the slowness of region proposals, Faster R-CNN introduces the RPN idea: its core is to hand candidate region generation to the network as well. Because the target location is corrected again in the subsequent Fast R-CNN detection stage, candidate region generation does not need to be overly precise. The region proposal network is essentially itself a Fast R-CNN whose input is a preset region of the image and whose output is whether the region belongs to the foreground or the background, together with a corrected region. Such an approach nominates only the few regions likely to contain targets and is much faster than both sliding windows and over-segmentation. YOLO converted the two-stage problem into a one-stage problem for the first time and is faster still: it divides the image into grids, regresses bounding boxes and confidence values for each, and finally filters low-scoring boxes with NMS. YOLO's drawbacks are poor detection of closely spaced objects and weak generalization ability, and owing to its loss function, localization error is the main factor limiting detection performance. Even though YOLO is currently imperfect and not yet as good as the very mature Faster R-CNN, its speed and accuracy surpass hand-crafted feature methods, and once these problems are solved its performance has very large headroom.
Meanwhile, deep learning brings problems of its own: the required computing power is large, the data volume is huge, and embedded hardware cannot meet the algorithms' requirements, which poses a great challenge for detecting multiple task targets (lane lines, people, vehicles, and traffic signs) on a resource-limited embedded platform.
In summary, traditional intelligent-vehicle lane line detection methods cannot meet the requirements of automated driving in the new era because actual driving environments are complex and changeable; models based on deep-learning lane line and vehicle detection algorithms are complex, and limited hardware computing power cannot meet real-time processing requirements; and existing lane line detection learning systems focus only on a single task, giving poor model generalization and high data-volume requirements, so that existing lane line detection methods suffer from low working efficiency, low recognition accuracy, and large judgment errors.
Disclosure of Invention
The present invention aims to overcome the above disadvantages of the prior art and to provide a lane line detection method for driving assistance.
The technical scheme of the invention is realized as follows:
the lane line detection method for assisting driving is characterized by comprising the following specific steps of:
step 1, acquiring an image sample, wherein the image sample comprises a vehicle image sample and a lane line image sample; randomly dividing an image sample into 80% and 20% for data training and result testing;
step 2, constructing a multi-task convolutional neural network, wherein the multi-task convolutional neural network comprises a plurality of convolutional layers, three pooling layers and three full-connection layers, wherein the first three convolutional layers are alternately connected with the three pooling layers, a BN layer is added before a function is activated in the convolutional layers, normalization is carried out on data, the pooling layers adopt a mode of maximizing pooling, the pooling layers are not connected after the fourth convolutional layer, the convolutional layers are directly connected with the three full-connection layers, the last full-connection layer is formed by a plurality of independent sublayers, each sublayer executes different tasks, and a SOFTMAX loss function is used as an objective function to calculate a loss value so as to complete model establishment on a vehicle and a lane line;
step 3, obtaining a vehicle identification result by using the trained multitask convolutional neural network model, and designing and generating an image mask according to the vehicle identification result, wherein the image mask is a binary matrix with the same size as a vehicle image;
step 4, performing dot multiplication on the image mask information generated by design and the image element image to construct a new image for extracting the lane line characteristics;
step 5, extracting the depth network characteristics of the appointed convolution layer of the second pooling layer, and inputting the depth network characteristics into a classifier;
and 6, determining vehicle information and lane line information according to the classifier result, and training a corresponding SVM classifier by using the prepared data and the LibSVM tool kit.
In step 2, constructing the multi-task convolutional neural network specifically includes:
Step 201, input the image sample data into the multi-task convolutional neural network, where each neuron first performs a weighted accumulation of its input values, then normalizes the data through the BN layer, and finally passes it through the activation function as the neuron's output value;
Step 202, input the output value into an error function with a regularization penalty, compare it with the expected value, and judge the degree of recognition by the error, where the smaller the error value, the better the recognition effect;
Step 203, take the reverse derivative of the error function and of each activation function in the multi-task convolutional neural network, and determine the gradient vector through backpropagation;
Step 204, adjust each weight via the gradient vector, in the direction that drives the error at the output toward zero or convergence;
Step 205, repeat steps 201 to 204 until the iteration count reaches a preset value or the average error stops decreasing and stabilizes near its minimum;
Step 206, input the remaining 20% of the images into the multi-task convolutional neural network and perform verification and evaluation on the test set until the recognition accuracy for vehicles and lane lines exceeds 90%.
Further, step 2 also includes judging whether the lane line positions are clear; if they are not, a lane line recognition post-processing step is added on the network output after the fully connected layers: first sample the seed points output by the network and remove redundant points, then apply an inverse perspective transform, cluster in the inverse-perspective view, convert the clustering result back to the normal view, and finally fit the lane line by least squares.
Further, in step 1, the image samples are denoised and deduplicated.
Further, in step 1, the image samples further include pedestrians and traffic signs.
Further, in step 2, the upper half of the multi-task convolutional neural network adopts an AlexNet-like network model.
Compared with the prior art, the invention trains image samples on a multi-task convolutional neural network and uses the correlation between vehicles and lane lines to assist lane line recognition, which improves detection safety for the complex and changeable environments of real-world automated driving and reduces the demand on hardware computing; the lane line detection method not only makes assisted driving safer but also offers higher working efficiency, more accurate recognition, fewer misjudgments, and other advantages.
Drawings
The invention is described in further detail below in connection with the embodiments in the drawings, but is not to be construed as limiting the invention in any way.
FIG. 1 is a flow chart of a lane line detection method according to an embodiment of the present invention;
FIG. 2 is a frame diagram of lane line detection and recognition provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a single-task learning model framework;
fig. 4 is a schematic diagram of a framework of a multi-task learning model according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings, which are not intended to limit the invention.
As shown in figs. 1 to 2, the lane line detection method for driving assistance works on a multi-task convolutional neural network and can be divided into two main processes: training and recognition. Because of AlexNet's high performance on the image-classification task of the ImageNet challenge, AlexNet is selected as the reference network, i.e. the network structure contains only common layers such as convolutional, pooling, fully connected, and softmax layers. All training is carried out on a PC, preferably configured with an Intel i7 CPU, 32GB of RAM, and a GTX 980 GPU, running Ubuntu 16.04. The multi-task convolutional neural network is trained under the Caffe framework in GPU mode; the SVM classifier is trained in MATLAB 2015b in CPU mode, using the LIBSVM toolkit published by Chang et al. The code consists of two main parts, Python and MATLAB: Python is mainly used to extract pooling-layer features from the deep network model, and MATLAB is mainly used to train the SVM classifier and test the final classification result.
The specific implementation steps are as follows:
Step 1, acquire image samples, including vehicle image samples and lane line image samples; the samples may be denoised and deduplicated, and may also include pedestrians, traffic signs, and the like. Preferably, 3626 images with a resolution of 1280x720 are selected from the public dataset released by TuSimple, each containing at least the two traffic elements of vehicles and lane lines; moreover, the vehicle is generally located in the middle of the lane lines on both sides. The two elements, vehicles and lane lines, are marked with bright wire frames and dots respectively to form candidate regions, using the Ground Truth Labeler provided by the MATLAB 2017b Automated Driving Toolbox as the labeling tool. The images undergo preprocessing operations including but not limited to data augmentation, centering, and resizing: for example, fixed-ROI extraction crops a partial region of the original image into a new image to raise the proportion of vehicle and lane line pixels, and bicubic interpolation resamples the images to reduce wasted computing resources during training. The processed images are then converted to the LMDB format supported by Caffe (or the TFRecord format supported by TensorFlow). All 3626 images in the database are randomly split into 80% and 20% portions for subsequent data training and result testing, respectively.
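As a minimal illustration of this preprocessing and split, the Python sketch below assumes a hypothetical dataset directory, and the ROI coordinates and output size are placeholders; the actual values used by the method are not disclosed.

```python
import glob
import random

import cv2  # OpenCV, used here for the fixed-ROI crop and bicubic resampling

def preprocess(path, roi=(0, 240, 1280, 720), out_size=(640, 360)):
    """Crop a fixed ROI from a 1280x720 frame, then resample bicubically."""
    img = cv2.imread(path)
    x0, y0, x1, y1 = roi
    img = img[y0:y1, x0:x1]  # fixed-ROI extraction raises the lane/vehicle pixel share
    return cv2.resize(img, out_size, interpolation=cv2.INTER_CUBIC)

paths = sorted(glob.glob("dataset/*.jpg"))  # hypothetical image directory
random.seed(0)
random.shuffle(paths)
split = int(0.8 * len(paths))
train_paths, test_paths = paths[:split], paths[split:]  # random 80%/20% split
```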
Step 2, construct the multi-task convolutional neural network, using the properties of multi-task learning to introduce the recognition of the vehicle, located in the middle of the lane lines, as an associated aid to lane line recognition. The configuration of the multi-task convolutional neural network covers the network structure file, the learning rate, the learning-rate update policy, the maximum iteration count, and so on; the deep-learning framework is TensorFlow, an Adam (Adaptive Moment Estimation) optimization strategy is adopted, the initial learning rate is set to 0.005, the first-order momentum decay coefficient is set to 0.9, and the second-order momentum decay coefficient is set to 0.999.
The upper half of the multi-task convolutional neural network is defined similarly to AlexNet. The network comprises several convolutional layers, three pooling layers, and three fully connected layers: the first three convolutional layers alternate with the three pooling layers, a BN (Batch Normalization) layer is added before each activation function to normalize the data, the pooling layers use max pooling, no pooling layer follows the fourth convolutional layer, which connects directly to the three fully connected layers, and a softmax loss function serves as the objective function for computing the loss values of the task branches, completing model establishment for vehicles and lane lines. A structural sketch is given below, and the training sub-steps follow it.
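The following Keras sketch is one plausible reading of the architecture just described, not the patented network itself: the input size, filter counts, kernel sizes, and two-way classification heads are assumptions (the filter counts follow AlexNet, and the fully connected width of 3072 follows the compression note later in the text; the real lane branch outputs seed points rather than a binary label).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_bn(x, filters, kernel, strides=1, pool_name=None):
    x = layers.Conv2D(filters, kernel, strides=strides, padding="same")(x)
    x = layers.BatchNormalization()(x)   # BN inserted before the activation
    x = layers.Activation("relu")(x)
    if pool_name:                        # only the first three convs get max pooling
        x = layers.MaxPool2D(pool_size=3, strides=2, name=pool_name)(x)
    return x

inputs = layers.Input(shape=(224, 224, 3))               # input size assumed
x = conv_bn(inputs, 96, 11, strides=4, pool_name="pool1")
x = conv_bn(x, 256, 5, pool_name="pool2")
x = conv_bn(x, 384, 3, pool_name="pool3")
x = conv_bn(x, 384, 3)                                   # fourth conv: no pooling after it
x = layers.Flatten()(x)
x = layers.Dense(3072, activation="relu")(x)             # 3072 rather than 4096 (compressed)
x = layers.Dense(3072, activation="relu")(x)
# The last fully connected stage branches into independent per-task sub-layers
vehicle_out = layers.Dense(2, activation="softmax", name="vehicle")(x)
lane_out = layers.Dense(2, activation="softmax", name="lane")(x)
model = Model(inputs, [vehicle_out, lane_out])
```

With these assumed sizes the model lands near the roughly 50M parameters mentioned at the end of the description, which is why 224x224 was chosen for the sketch.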
Step 201, input the image sample data into the multi-task convolutional neural network, where each neuron first performs a weighted accumulation of its input values, then normalizes the data through the BN layer, and finally passes it through the activation function as the neuron's output value;
Step 202, input the output value into an error function with a regularization penalty to prevent overfitting, compare it with the expected value, and judge the degree of recognition by the error, where the smaller the error value, the better the recognition effect;
Step 203, to minimize the error, take the reverse derivative of the error function and of each activation function in the multi-task convolutional neural network, and determine the gradient vector through backpropagation;
Step 204, adjust each weight via the gradient vector, in the direction that drives the error at the output toward zero or convergence;
Step 205, repeat steps 201 to 204 until the iteration count reaches a preset value or the average error stops decreasing and stabilizes near its minimum;
Step 206, input the remaining 20% of the images into the multi-task convolutional neural network and perform verification and evaluation on the test set until the recognition accuracy for vehicles and lane lines exceeds 90%, as in the training sketch below.
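Continuing the architecture sketch above, this hedged sketch wires in the Adam settings from the text and the 80/20 evaluation; the label arrays, batch size, and epoch budget are placeholders.

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.005, beta_1=0.9, beta_2=0.999),
    loss={"vehicle": "categorical_crossentropy",  # softmax loss on each task head
          "lane": "categorical_crossentropy"},
    loss_weights={"vehicle": 0.5, "lane": 0.5},   # alpha = beta = 0.5, see the loss discussion
    metrics=["accuracy"])

# Stop once the mean error stops decreasing, or at a preset iteration budget
stopper = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=5)
model.fit(train_images, {"vehicle": y_vehicle, "lane": y_lane},
          epochs=100, batch_size=64, callbacks=[stopper])

# Verify and evaluate on the held-out 20% until both accuracies exceed 90%
results = model.evaluate(test_images, {"vehicle": y_vehicle_test, "lane": y_lane_test},
                         return_dict=True)
assert results["vehicle_accuracy"] > 0.9 and results["lane_accuracy"] > 0.9
```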
The models of the vehicle and the lane line are thus established through step 2.
through the training neural network, the obtained lane line position is not clear enough, and the network output lane line recognition task after the full connection layer is required to be further processed, and the specific process is as follows: firstly, sampling a seed point of network output, removing redundant points, then performing inverse transformation perspective operation, clustering under the inverse transformation perspective operation, converting a clustering result into a normal view, and finally, fitting by a least square method to obtain a lane line.
Step 3, obtaining a vehicle identification result by using the trained multitask convolutional neural network model, and designing and generating an image mask according to the vehicle identification result, wherein the image mask is a binary matrix with the same size as a vehicle image;
step 4, performing dot multiplication on the image mask information generated by design and the image element image to construct a new image for extracting the lane line characteristics;
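A minimal sketch of the mask product; whether the vehicle regions are zeroed out or kept is not stated in the text, so the polarity chosen here (suppressing vehicle pixels) is an assumption.

```python
import numpy as np

def apply_vehicle_mask(image, vehicle_boxes):
    """Build a binary mask the same size as the image from the detected vehicle
    boxes, then take its element-wise (pixel-wise) product with the image."""
    mask = np.ones(image.shape[:2], dtype=image.dtype)
    for x0, y0, x1, y1 in vehicle_boxes:  # boxes from the vehicle-recognition branch
        mask[y0:y1, x0:x1] = 0            # assumed polarity: suppress vehicle regions
    return image * mask[..., None]        # broadcast the mask over the colour channels
```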
Step 5, extract the deep-network features at the second pooling layer of the network and input them into an SVM classifier.
Step 6, determine the vehicle information and lane line information from the SVM classifier results, training the corresponding SVM classifier with the prepared data and the LIBSVM toolkit, along the lines of the sketch below.
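The sketch below reuses the `model` from the architecture sketch and its assumed layer name "pool2"; scikit-learn's SVC, which wraps the same LIBSVM library, stands in for the MATLAB LIBSVM workflow described above, and the image and label arrays are placeholders.

```python
from sklearn.svm import SVC              # scikit-learn's SVC is built on LIBSVM
from tensorflow.keras import Model

# Truncate the trained network at the second pooling layer to get a feature extractor
feature_model = Model(model.input, model.get_layer("pool2").output)

def pool2_features(images):
    feats = feature_model.predict(images)
    return feats.reshape(len(images), -1)  # flatten each feature map to a vector

# train_images / test_images here are the masked images from step 4 (placeholders)
clf = SVC(kernel="rbf", C=1.0)
clf.fit(pool2_features(train_images), y_lane_train)
print("held-out accuracy:", clf.score(pool2_features(test_images), y_lane_test))
```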
The whole multi-task convolutional neural network is ported to the S32V vehicle embedded platform. The existing model has high accuracy but many parameters and a large memory footprint, which hinders embedded deployment, so the network is compressed before porting to the embedded S32V platform. The compression method used here optimizes the network weights, i.e. it reduces the fully connected layer parameters to a certain extent, changing the fully connected layers of size 4096 into fully connected layers of size 3072. At runtime, a forward-looking camera mounted on the vehicle captures the road image ahead; the forward-vision process passes the image to the convolutional neural network, and the trained deep network model and the SVM lane line recognition model recognize the lane lines and pass them to the lane keeping and lane departure warning processes for decision making. This lane line detection method based on a multi-task convolutional neural network addresses both the limited computing power of driver-assistance vehicle embedded platforms and the complex, hard-to-recognize features of traditional algorithms.
As shown in fig. 2, in actual driving the vehicle is in most cases located in the middle of the lane lines on both sides. The captured image is masked at the vehicle positions and point-multiplied, the deep-network features of the pooling layer of the trained multi-task convolutional neural network are extracted and used as training data for the SVM classifier, the SVM classifier model is trained to recognize lane line positions, and the resulting recognition is taken as the final lane line result. The design of the loss function matters greatly for network convergence: the multi-task convolutional neural network proposed here branches after the fully connected layers, with different branches representing different tasks. For each task, the softmax output and the label the image carried at input time are therefore used as the inputs to the loss function to compute its loss value. The overall loss is Loss = α·Loss(vehicle) + β·Loss(lane line): the total loss consists of two parts, where Loss(vehicle) is the loss function of the vehicle-recognition subtask and Loss(lane line) is the loss function of the lane-line-recognition subtask, and both individual losses are of softmax loss type. Each loss function has a corresponding weight coefficient, α and β; to ensure weight balance between the two subtasks, α and β may each be set to 0.5, as spelled out below.
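To make the weighted objective concrete, this sketch expands one training step with the combined loss; it is the hand-written equivalent of the `loss_weights` setting used in the compiled model earlier, reusing `model` from the architecture sketch.

```python
loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.005, beta_1=0.9, beta_2=0.999)
alpha, beta = 0.5, 0.5                    # equal weights balance the two subtasks

@tf.function
def train_step(images, y_vehicle, y_lane):
    with tf.GradientTape() as tape:
        vehicle_pred, lane_pred = model(images, training=True)
        loss = (alpha * loss_fn(y_vehicle, vehicle_pred)    # Loss(vehicle)
                + beta * loss_fn(y_lane, lane_pred))        # Loss(lane line)
    grads = tape.gradient(loss, model.trainable_variables)  # backpropagation
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```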
As shown in figs. 3 to 4, in conventional single-task learning each task uses a separate data source and a separate model is learned for each task. In the multi-task learning of this invention, multiple data sources use a shared representation to learn several subtask models simultaneously. The invention recognizes lane lines, people, vehicles, traffic signs, and turn markings; because lane lines and vehicles have a certain correlation, training them independently cannot exploit this information, whereas multi-task learning can jointly exploit the correlation between tasks to improve the accuracy of multi-attribute classification. Multi-task learning is a machine learning method that learns multiple tasks at once: here it simultaneously learns the classifiers for lane lines, people, vehicles, and traffic signs, and the classifiers for left-turn and right-turn signs. Its basic assumption is that the tasks are correlated, so the correlation between tasks can be used to promote one another. Experiments show that with only about 50M parameters, accuracy improves over the single-task model, and efficiency improves greatly compared with running several single tasks.
The above embodiment is only one of the preferred embodiments of the present invention; ordinary changes and substitutions made by those skilled in the art within the scope of the present invention shall fall within its protection scope.

Claims (4)

1. A lane line detection method for driving assistance, characterized by comprising the following specific steps: step 1, acquire image samples, including vehicle image samples and lane line image samples, and randomly divide the image samples into 80% and 20% portions for data training and result testing; step 2, construct a multi-task convolutional neural network comprising several convolutional layers, three pooling layers and three fully connected layers, wherein the first three convolutional layers alternate with the three pooling layers, a BN layer is added before the activation function in each convolutional layer to normalize the data, the pooling layers use max pooling, no pooling layer follows the fourth convolutional layer, which connects directly to the three fully connected layers, the last fully connected layer consists of several independent sub-layers each executing a different task, and a softmax loss function is used as the objective function to compute the loss value, completing model establishment for vehicles and lane lines; in step 2, constructing the multi-task convolutional neural network specifically includes: step 201, input the image sample data into the multi-task convolutional neural network, where each neuron first performs a weighted accumulation of its input values, then normalizes the data through the BN layer, and finally passes it through the activation function as the neuron's output value; step 202, input the output value into an error function with a regularization penalty, compare it with the expected value, and judge the degree of recognition by the error, where the smaller the error value, the better the recognition effect; step 203, take the reverse derivative of the error function and of each activation function in the multi-task convolutional neural network, and determine the gradient vector through backpropagation; step 204, adjust each weight via the gradient vector, in the direction that drives the error at the output toward zero or convergence; step 205, repeat steps 201 to 204 until the iteration count reaches a preset value or the average error stops decreasing and stabilizes near its minimum; step 206, input the remaining 20% of the images into the multi-task convolutional neural network and perform verification and evaluation on the test set until the recognition accuracy for vehicles and lane lines exceeds 90%; step 2 further includes: judging whether the lane line positions are clear, and if they are not clear, adding a lane line recognition post-processing step on the network output after the fully connected layers: first sample the seed points output by the network and remove redundant points, then apply an inverse perspective transform, cluster in the inverse-perspective view, convert the clustering result back to the normal view, and finally fit the lane line by least squares; step 3, obtain vehicle recognition results with the trained multi-task convolutional neural network model and design an image mask from them, the image mask being a binary matrix of the same size as the vehicle image; step 4, take the element-wise (pixel-wise) product of the generated image mask and the original image to construct a new image for extracting lane line features; step 5, extract the deep-network features at the second pooling layer and input them into a classifier; and step 6, determine vehicle information and lane line information from the classifier results, training the corresponding SVM classifier with the prepared data and the LIBSVM toolkit.
2. The lane line detection method for driving assistance according to claim 1, wherein in step 1 the image samples are denoised and deduplicated.
3. The lane line detection method for driving assistance according to claim 1, wherein in step 1 the image samples further include pedestrians and traffic signs.
4. The lane line detection method for driving assistance according to claim 1, wherein in step 2 the upper half of the multi-task convolutional neural network adopts an AlexNet network model.
CN201910008697.XA 2019-01-04 2019-01-04 Lane line detection method for driving assistance Active CN110163069B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910008697.XA | 2019-01-04 | 2019-01-04 | Lane line detection method for driving assistance

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910008697.XA | 2019-01-04 | 2019-01-04 | Lane line detection method for driving assistance

Publications (2)

Publication Number | Publication Date
CN110163069A (en) | 2019-08-23
CN110163069B (en) | 2023-09-08

Family

ID=67645332

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN201910008697.XA | Lane line detection method for driving assistance | 2019-01-04 | 2019-01-04 | Active

Country Status (1)

Country Link
CN (1) CN110163069B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785595B (en) * 2019-11-07 2023-02-28 北京市商汤科技开发有限公司 Target attribute detection, neural network training and intelligent driving method and device
CN110864908B (en) * 2019-11-21 2020-10-20 中国汽车工程研究院股份有限公司 Automatic drive car place lane line and gather dolly
CN111539401B (en) * 2020-07-13 2020-10-23 平安国际智慧城市科技股份有限公司 Lane line detection method, device, terminal and storage medium based on artificial intelligence
CN112418236B (en) * 2020-11-24 2022-10-18 重庆邮电大学 Automobile drivable area planning method based on multitask neural network
CN113111859B (en) * 2021-05-12 2022-04-19 吉林大学 License plate deblurring detection method based on deep learning
CN113433695A (en) * 2021-06-25 2021-09-24 杭州炽云科技有限公司 Head-up display method and system based on WHUD
CN113822226A (en) * 2021-10-15 2021-12-21 江西锦路科技开发有限公司 Deep learning-based lane line detection method in special environment
CN114648745A (en) * 2022-02-14 2022-06-21 成都臻识科技发展有限公司 Road detection method, device and equipment based on deep learning and storage medium
CN115294545A (en) * 2022-09-06 2022-11-04 中诚华隆计算机技术有限公司 Complex road surface lane identification method and chip based on deep learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599869A (en) * 2016-12-22 2017-04-26 安徽大学 Vehicle attribute identification method based on multi-task convolutional neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6220327B2 (en) * 2014-07-23 2017-10-25 株式会社Soken Traveling lane marking recognition device, traveling lane marking recognition program
US10262213B2 (en) * 2014-12-16 2019-04-16 Here Global B.V. Learning lanes from vehicle probes

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599869A (en) * 2016-12-22 2017-04-26 安徽大学 Vehicle attribute identification method based on multi-task convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Liangfu et al. A multi-task model based on residual networks. China Integrated Circuit, 2017, No. 08. *

Also Published As

Publication number Publication date
CN110163069A (en) 2019-08-23

Similar Documents

Publication Publication Date Title
CN110163069B (en) Lane line detection method for driving assistance
CN110059558B (en) Orchard obstacle real-time detection method based on improved SSD network
CN110033002B (en) License plate detection method based on multitask cascade convolution neural network
CN108960245B (en) Tire mold character detection and recognition method, device, equipment and storage medium
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN111079674B (en) Target detection method based on global and local information fusion
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN114220124A (en) Near-infrared-visible light cross-modal double-flow pedestrian re-identification method and system
CN110321967B (en) Image classification improvement method based on convolutional neural network
CN109344825A (en) A kind of licence plate recognition method based on convolutional neural networks
CN107239730B (en) Quaternion deep neural network model method for intelligent automobile traffic sign recognition
CN111950453A (en) Optional-shape text recognition method based on selective attention mechanism
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN108154102A (en) A kind of traffic sign recognition method
CN110032952B (en) Road boundary point detection method based on deep learning
CN112052772A (en) Face shielding detection algorithm
CN109871892A (en) A kind of robot vision cognitive system based on small sample metric learning
CN108230330B (en) Method for quickly segmenting highway pavement and positioning camera
CN110751005B (en) Pedestrian detection method integrating depth perception features and kernel extreme learning machine
CN111553414A (en) In-vehicle lost object detection method based on improved Faster R-CNN
CN112949510A (en) Human detection method based on fast R-CNN thermal infrared image
CN111461006A (en) Optical remote sensing image tower position detection method based on deep migration learning
CN117611932A (en) Image classification method and system based on double pseudo tag refinement and sample re-weighting
CN115019133A (en) Method and system for detecting weak target in image based on self-training and label anti-noise
CN113627477B (en) Vehicle multi-attribute identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant