CN108009524B - Lane line detection method based on full convolution network


Info

Publication number
CN108009524B
Authority
CN
China
Prior art keywords
lane line
layer
convolution
loss
network
Prior art date
Legal status
Active
Application number
CN201711420524.6A
Other languages
Chinese (zh)
Other versions
CN108009524A (en)
Inventor
周巍 (Zhou Wei)
臧金聚 (Zang Jinju)
张冠文 (Zhang Guanwen)
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN201711420524.6A
Publication of CN108009524A
Application granted
Publication of CN108009524B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 - Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Abstract

The invention provides a lane line detection method based on a full convolution network, relating to the field of traffic information detection. The invention can detect straight lane lines and curved lane lines at the same time, trains a full-convolution lane line detection network with a lane line detection loss function to improve the lane line detection effect, and uses a convolutional neural network to learn abstract features of the lane lines from a lane line classification data set instead of simply extracting their external features. Only the lane line detection network model needs to be stored to detect new input images, which saves storage space and makes the method suitable for vehicle-mounted embedded equipment; the small, shallow full-convolution lane line detection network accelerates detection and achieves a high detection speed.

Description

Lane line detection method based on full convolution network
Technical Field
The invention relates to the field of traffic information detection, and in particular to a lane line detection method.
Background
Intelligent driving requires sensing and understanding the traffic environment and situation. The traffic environment of a vehicle includes surrounding vehicles, lane lines, traffic lights and the like, and lane line detection plays an extremely important role in keeping the vehicle within a safe driving area. When the vehicle deviates significantly, lane line detection can warn the driver in time so that the driving direction can be corrected and traffic accidents avoided.
Lane line detection technology is mainly divided into three types: detection techniques based on color features, detection techniques based on texture features, and detection techniques based on multi-feature fusion. Color features are divided into grayscale features and color features; for grayscale features, the gray level of lane line pixels is usually much larger than that of non-lane-line pixels, and foreign researchers have distinguished lane line pixels from non-lane-line pixels by selecting a suitable threshold, thereby detecting the lane lines. Detection techniques based on color features use the color information in the image to detect road boundaries and lane markings. Researchers at the State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body at Hunan University use the RGB color space and the brightness characteristics of lane lines to enhance white and yellow pixels, increasing the proportion of lane line pixels and improving the contrast between the lane lines and the background area, so that the lane lines can be detected. Researchers at Shanghai Jiao Tong University use the HSV color space to split color into hue, saturation and value, set corresponding thresholds for lane lines and classify colors according to these thresholds, and take the dominant color in the classification result as the recognition result, thereby detecting the lane lines and their types. When the amount of data is large, detection methods based on color features often detect a large number of background areas, and the detection accuracy is not high.
Detection methods based on texture obtain results that meet lane line detection requirements by measuring the texture strength and texture direction of the pixels in a region, and have strong noise resistance. Graovac S. and Goma A. take the texture features and road structure of the lane line area and the background area as information sources and obtain the optimal lane line area from the statistical information. Liu of Jilin University uses multi-directional Gabor templates of different frequencies to transform and analyze the captured image, votes according to the texture intensity and direction characteristic values of the pixels to obtain the road vanishing point, establishes a road equation through the vanishing point using the straight-line slope extracted from the effective voting area, and segments the road area in unstructured roads. In addition, researchers at the School of Automation of Southeast University use multi-scale sparse coding, exploiting local texture information of the road at large scale and context structure characteristics of the road at medium and small scales to segment the road, distinguishing similar textures of the road and its surroundings more effectively. Due to interference from factors such as illumination, the texture in the captured image is not necessarily the real texture of the three-dimensional object surface, which affects the effect of texture-based detection methods to a certain extent.
Detection methods based on multi-feature fusion improve the lane line detection effect by exploiting the characteristics of different features. Researchers at Hunan University divide the road area using the lane line vanishing point and vanishing line, apply direction-following filtering to the captured image, then construct a lane line confidence function by fusing features such as the road texture direction, boundary parallelism and pixel gray value, and extract the lane lines using the Hough transform. Although detection methods based on multi-feature fusion achieve good detection results, the image processing procedure is complex and the requirements on the operating environment are high, so the method is not suitable for vehicle-mounted embedded devices.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a method for lane line detection with a full convolution network. Aiming at embedded applications, the invention constructs a small, shallow convolutional neural network on the premise of ensuring the lane line detection effect, saving storage space and accelerating lane line detection. The full-convolution lane line detection network detects one block of the input picture at a time, and the size of the detection area is the product of the pooling kernel sizes of all pooling layers in the detection network. The method performs a probability operation on the output feature map of the full-convolution lane line detection network to obtain the probability that a lane line appears in each block of the input picture, and sets a prediction probability threshold to extract and detect the lane lines.
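As a small worked example of the relation just stated (a sketch, not part of the original disclosure): with the three 2×2 pooling layers used in the network below, each output cell corresponds to an 8×8 block of the input picture.

```python
# Illustrative sketch: the area covered by one output cell equals the product of
# the pooling kernel sizes along the network (2 x 2 x 2 = 8 for Table 1 below).
pooling_kernels = [2, 2, 2]      # assumed 2x2 pooling kernels, as in Table 1

block = 1
for k in pooling_kernels:
    block *= k

print(f"each output cell summarizes a {block}x{block} pixel block of the input")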
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
the first step is as follows: constructing a lane line classification network
The lane line classification network is composed of three convolutional layers, three pooling layers and two fully connected layers. Its input is restricted to n×n-pixel pictures containing lane lines, and its output is the category number of the lane line contained in the input picture, where 0 denotes the background area, 1 a yellow solid line, 2 a yellow dotted line, 3 a white solid line and 4 a white dotted line. Each convolutional layer in the lane line classification network is followed by a pooling layer and connected to an activation function, and the first fully connected layer follows the last pooling layer and is also connected to an activation function. The specific structure of the lane line classification network is therefore: convolutional layer 1, pooling layer 1, convolutional layer 2, pooling layer 2, convolutional layer 3, pooling layer 3, fully connected layer 1 and fully connected layer 2 connected in sequence, with convolutional layer 1, convolutional layer 2, convolutional layer 3 and fully connected layer 1 each connected to an activation function. The loss layer and the accuracy layer are both connected to fully connected layer 2 but are not connected to each other, and label information must be input as a bottom layer connected to the loss layer and the accuracy layer. The pooling layers of the lane line classification network use MAX pooling, which takes the maximum value of the pixels covered by the pooling kernel as the pooling result, reducing the dimensions of the feature map;
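By way of illustration only (and assuming the Caffe framework that the detailed description below uses), the structure just described with the Table 1 parameters could be expressed with pycaffe's NetSpec roughly as follows. The ReLU activation, the SoftmaxWithLoss and Accuracy layers, and the lmdb path and batch size are assumptions for the sketch, not taken from the patent text.

```python
# Hedged pycaffe sketch of the Table 1 classification network; layer names and the
# data source are illustrative assumptions.
import caffe
from caffe import layers as L, params as P

def lane_classification_net(lmdb_path, batch_size):
    n = caffe.NetSpec()
    # n x n lane line patches with category labels 0-4, read from an lmdb database
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB,
                             source=lmdb_path, ntop=2)
    # convolutional layer 1 + activation + MAX pooling (32 kernels, 5x5, stride 1, pad 2)
    n.conv1 = L.Convolution(n.data, num_output=32, kernel_size=5, stride=1, pad=2)
    n.relu1 = L.ReLU(n.conv1, in_place=True)
    n.pool1 = L.Pooling(n.relu1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    # convolutional layer 2 + activation + MAX pooling
    n.conv2 = L.Convolution(n.pool1, num_output=32, kernel_size=5, stride=1, pad=2)
    n.relu2 = L.ReLU(n.conv2, in_place=True)
    n.pool2 = L.Pooling(n.relu2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    # convolutional layer 3 + activation + MAX pooling (64 kernels, 3x3, pad 1)
    n.conv3 = L.Convolution(n.pool2, num_output=64, kernel_size=3, stride=1, pad=1)
    n.relu3 = L.ReLU(n.conv3, in_place=True)
    n.pool3 = L.Pooling(n.relu3, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    # fully connected layer 1 (64 outputs) + activation, fully connected layer 2 (5 outputs)
    n.fc1 = L.InnerProduct(n.pool3, num_output=64)
    n.relu4 = L.ReLU(n.fc1, in_place=True)
    n.fc2 = L.InnerProduct(n.relu4, num_output=5)
    # loss layer and accuracy layer both fed by fc2 and the label bottom
    n.loss = L.SoftmaxWithLoss(n.fc2, n.label)
    n.accuracy = L.Accuracy(n.fc2, n.label)
    return n.to_proto()

# e.g. print(lane_classification_net('lane_cls_train_lmdb', 1000))
```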
the second step is that: training lane line classification network model
Training the lane line classification network constructed in the first step on a lane line classification data set yields the lane line classification network model. The lane line annotation information of the original video sequence contains the category of each lane line and the pixel positions of the lane line boundary points in each video frame. Straight-line fitting is performed on the boundary point positions of a lane line to obtain the boundary equations of its two side lines; coordinate points on the two side lines are selected according to the annotation information to form a rectangular box, and the rectangular box is used to crop the lane line area at the corresponding position of the original video sequence. Each cropped lane line area is stored as an n×n picture, consistent with the input picture size of the classification network in the first step. The cropped lane line area pictures are made into a lane line classification data set in lmdb database format, which comprises a training set and a test set; the lane line classification network is trained on the training set, and the effect of the obtained model is checked on the test set to obtain the lane line classification network model;
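A minimal sketch of the patch-cropping idea just described is given below; it assumes straight-line fits of the form x = a·y + b for the two side lines and 32×32 patches as in the embodiment, and the function and variable names are illustrative.

```python
# Hedged sketch: fit the two boundary point sets of a lane line with straight lines,
# then crop n x n patches between the fitted side lines for the classification set.
import numpy as np

def crop_lane_patches(frame, left_pts, right_pts, n=32):
    """frame: HxWx3 image; left_pts/right_pts: arrays of (x, y) boundary points."""
    patches = []
    # straight-line fit x = a*y + b for each side line (column as a function of row)
    la, lb = np.polyfit(left_pts[:, 1], left_pts[:, 0], 1)
    ra, rb = np.polyfit(right_pts[:, 1], right_pts[:, 0], 1)
    for y in np.arange(left_pts[:, 1].min(), left_pts[:, 1].max(), n):
        x_left, x_right = la * y + lb, ra * y + rb
        x0 = int(max(0, min(x_left, x_right)))
        y0 = int(max(0, y))
        patch = frame[y0:y0 + n, x0:x0 + n]
        if patch.shape[:2] == (n, n):
            patches.append(patch)    # stored as an n x n picture for the lmdb set
    return patches
```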
the third step: the fully connected layers in the lane line classification network are replaced by convolutional layers to construct a full-convolution lane line detection network, and the lane line classification network model obtained in the second step is converted into an initialization detection network model used to initialize the full-convolution lane line detection network; the input picture size of the lane line classification network is n×n pixels, and the classification network structure and parameter settings are shown in Table 1;
TABLE 1 Lane line Classification network architecture and parameter configuration
Network layer | Number of convolution kernels | Convolution kernel size | Step size | Zero padding
Convolutional layer 1 | 32 | 5×5 | 1 | 2
Activation function 1 | 32 | -- | -- | --
Pooling layer 1 | 32 | 2×2 | 2 | 0
Convolutional layer 2 | 32 | 5×5 | 1 | 2
Activation function 2 | 32 | -- | -- | --
Pooling layer 2 | 32 | 2×2 | 2 | 0
Convolutional layer 3 | 64 | 3×3 | 1 | 1
Activation function 3 | 64 | -- | -- | --
Pooling layer 3 | 64 | 2×2 | 2 | 0
Fully connected layer 1 | 64 | -- | -- | --
Activation function 4 | -- | -- | -- | --
Fully connected layer 2 | 5 | -- | -- | --
Loss layer | -- | -- | -- | --
Accuracy layer | -- | -- | -- | --
Setting the size of convolution kernels of the conversion convolutional layer 1 converted from the fully-connected layer 1 to be 4 x 4, setting the size of convolution kernels of the conversion convolutional layer 2 converted from the fully-connected layer 2 to be 1 x 1, and keeping the number of convolution kernels of the convolutional layer converted from the fully-connected layer consistent with the number of outputs of the original fully-connected layer;
the step of converting the lane line classification network model into the initialization detection network model comprises the following steps:
the parameter matrix of each fully connected layer in the lane line classification network model is unfolded into a column vector, and the element values of this column vector are assigned in sequence to the elements of the column vector obtained by unfolding the parameter matrix of the corresponding conversion convolutional layer in the full-convolution lane line detection network; the parameters of the other layers of the full-convolution lane line detection network are taken directly from the classification network model. This yields the initialization detection network model, which is applied as the initial model of the full-convolution lane line detection network in its training process;
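A hedged numpy sketch of this parameter conversion (in the style of Caffe net surgery) is shown below; the shapes follow Caffe's weight layouts (fully connected: [out, in]; convolution: [out, in_channels, kh, kw]) and the names are illustrative.

```python
# Hedged sketch: copy fully connected weights element-by-element into the kernels of
# the corresponding conversion convolutional layer.
import numpy as np

def fc_to_conv(fc_weights, fc_bias, in_channels, kh, kw):
    out_dim, in_dim = fc_weights.shape
    assert in_dim == in_channels * kh * kw
    # unfold the fc parameter matrix and refill it as convolution kernels
    conv_weights = fc_weights.reshape(out_dim, in_channels, kh, kw).copy()
    return conv_weights, fc_bias.copy()

# e.g. fully connected layer 1 (64 outputs, fed by 64 channels of 4x4 feature maps
# for a 32x32 input) becomes conversion convolutional layer 1 with 64 kernels of 4x4:
# w_conv, b_conv = fc_to_conv(w_fc1, b_fc1, in_channels=64, kh=4, kw=4)
```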
the fourth step: training lane line detection network model
The parameters of each network layer in the full-convolution lane line detection network are assigned from the initialization detection network model obtained in the third step to complete the initialization of the detection network, and the full-convolution lane line detection network is then trained on a detection data set using the lane line detection loss. The lane line detection task must identify both the type of a lane line and its position in the image, so the detection loss comprises a classification loss and a regression loss, the regression loss being a position loss. The lane line detection loss L is defined as shown in formula (1):
L = αL_C + βL_R (1)
wherein α represents the proportionality coefficient of the classification loss in the detection loss, β represents the proportionality coefficient of the regression loss in the detection loss, L_C is the classification loss, and L_R is the regression loss;
the classification loss represents the loss between the prediction tag and the real data, and is defined as shown in formula (2):
[Formula (2): classification loss L_C, defined over the label values g(i,k,h,w) and the prediction probabilities p(i,k,h,w); reproduced only as an image in the original]
wherein M represents the number of input pictures to the detection network; K represents the number of channels of the label matrix, which is consistent with the total number of lane line categories including the background area; H represents the height of the output feature map of the network-end convolutional layer and W represents its width, and H and W are consistent with the height and width of the sub-matrix in each channel of the label matrix; g(i,k,h,w) represents the label value at (i,k,h,w) in the label array of the real data, i.e. the probability that the label category at position (h,w) of the convolved feature map of the i-th input picture is k; the values in the label array are 0 or 1, where 0 means the label category at (h,w) is not k and 1 means it is k, with k = 0 denoting the background area, k = 1 a yellow solid line, k = 2 a yellow dotted line, k = 3 a white solid line and k = 4 a white dotted line; p(i,k,h,w) represents the prediction probability of category k at position (h,w) of the convolved feature map of the i-th input picture, a decimal value in the interval (0, 1). The detection loss layer converts the input feature map into a prediction probability matrix using the Softmax algorithm, and the prediction probability of each pixel on the feature map is calculated as shown in formula (3):
p(i,c,h,w) = exp(y(i,c,h,w)) / Σ_k exp(y(i,k,h,w)), k ∈ {0,1,2,3,4} (3)
wherein y(i,c,h,w) = y'(i,c,h,w) - max(y'(i,k,h,w)), k ∈ {0,1,2,3,4}; y'(i,c,h,w) represents the value of the pixel in channel c at position (h,w) of the i-th input convolved feature map, and max(y'(i,k,h,w)) represents the maximum pixel value over the five channels at position (h,w) of the i-th convolved feature map; k is the channel index used to traverse the feature map channels, and since each feature map contains 5 channels, k takes values in {0,1,2,3,4};
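The following numpy sketch illustrates this computation: a numerically stable Softmax over the five channels with the per-pixel maximum subtracted first. The array name and shape are assumptions.

```python
# Hedged numpy sketch of formula (3) as described above.
import numpy as np

def prediction_probabilities(scores):
    """scores: array of shape (M, 5, H, W), the network-end convolution output y'."""
    y = scores - scores.max(axis=1, keepdims=True)   # subtract the channel-wise maximum
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)          # p(i, c, h, w) in (0, 1)
```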
the regression loss represents the loss between the lane line position predicted by the detection network and the lane line position in the tag data, the position of the lane line in the feature map can be judged by using the prediction probability in the formula (3), and then the regression loss is calculated by comparing the position of the lane line in the tag data, wherein the detailed steps of the comparison are as follows:
a row of the feature map is selected; the column positions at which a lane line is predicted in that row are stored in a vector P (the predicted position vector), and the column positions of the lane line in the corresponding row of the input label data are stored in a vector L (the label position vector), a column position being the horizontal coordinate. The L2 loss between P and L gives the regression loss of that row of the feature map, and the output regression loss is obtained by summing the regression losses of all rows of the feature map and taking the average, as shown in formula (4):
[Formula (4): regression loss L_R, the average over all feature map rows of the per-row losses ||D(j(i,k,h)-g'(i,k,h))||²; reproduced only as an image in the original]
wherein D(j(i,k,h)-g'(i,k,h)) is the vector obtained by subtracting the label position vector g'(i,k,h) from the predicted position vector j(i,k,h), and j(i,k,h) represents the set of column positions in row h of the output feature map of the i-th picture whose category is k, i.e. the predicted position vector. The prediction probability p(i,k,h,w) of each pixel in the feature map is compared with the prediction probability threshold and the comparison result is recorded as t(i,k,h,w): when p(i,k,h,w) is greater than the prediction probability threshold, t(i,k,h,w) = 1, otherwise t(i,k,h,w) = 0; if t(i,k,h,w) = 1, then w is stored in j(i,k,h). t(i,k,h,w) is defined as shown in formula (5):
t(i,k,h,w) = 1 if p(i,k,h,w) > p_t, and t(i,k,h,w) = 0 otherwise (5)
wherein p_t represents the prediction probability threshold used to judge whether the current pixel belongs to lane line category k: t(i,k,h,w) = 1 indicates that position (h,w) of the i-th feature map is classified as lane line category k, and t(i,k,h,w) = 0 indicates that position (h,w) does not belong to lane line category k, with k = 0 denoting the background area. g'(i,k,h) is the label position vector; it is obtained in a similar way to j(i,k,h), except that the label data in the detection data set already provide a label probability of 0 or 1, so the label data g(i,k,h,w) are judged directly: if the value of g(i,k,h,w) is 1, then w is saved in g'(i,k,h), and if the value of g(i,k,h,w) is 0, then w is not saved;
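A small sketch of how j(i,k,h) and g'(i,k,h) can be collected for one row and one category under the thresholding rule just described is given below; the default threshold 0.8 follows the embodiment and the names are illustrative.

```python
# Hedged sketch: build the predicted and label position vectors for one row h, one category k.
import numpy as np

def position_vectors(p, g, i, k, h, p_t=0.8):
    """p: prediction probabilities (M, K, H, W); g: 0/1 label array of the same shape."""
    t_row = p[i, k, h, :] > p_t            # t(i,k,h,w) from formula (5)
    j_vec = np.flatnonzero(t_row)          # predicted position vector j(i,k,h)
    g_vec = np.flatnonzero(g[i, k, h, :])  # label position vector g'(i,k,h)
    return j_vec, g_vec
```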
||D(j(i,k,h)-g'(i,k,h))||² represents the L2 loss between the predicted position vector j(i,k,h) and the label position vector g'(i,k,h), i.e. the squared modulus of the vector D(j(i,k,h)-g'(i,k,h)). The calculation of ||D(j(i,k,h)-g'(i,k,h))||² is divided into the following four cases, where "having elements" means the vector contains lane line information:
● j(i,k,h) has no elements and g'(i,k,h) has no elements: neither the predicted position vector nor the label position vector contains a lane line, and ||D(j(i,k,h)-g'(i,k,h))||² = 0;
● j(i,k,h) has no elements and g'(i,k,h) has elements:
[Formula (6): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has no elements:
[Formula (7): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has elements:
[Formula (8): reproduced only as an image in the original]
In formulas (6) to (8), w denotes an element of the predicted position vector j(i,k,h) whenever j(i,k,h) has elements; if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the output feature map of the network-end convolutional layer. In formula (8), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose difference from the value of w has the smallest absolute value; w' is found by traversing every element w'' of g'(i,k,h) and comparing the absolute values of their differences from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the regression loss contribution of the corresponding point in the output feature map of the network-end convolutional layer is set to 0;
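The following sketch covers only the two cases that the text specifies completely: the case where both vectors are empty, and one plausible reading of formula (8) in which every predicted column w is matched to its nearest label column w' and the squared differences are summed. Formulas (6) and (7) are reproduced only as images in the original, so those cases are left unimplemented here.

```python
# Hedged sketch of the per-row regression term; names and the formula (8) reading are assumptions.
import numpy as np

def row_regression_loss(j_vec, g_vec):
    if len(j_vec) == 0 and len(g_vec) == 0:
        return 0.0                          # ||D||^2 = 0, neither vector contains a lane line
    if len(j_vec) == 0 or len(g_vec) == 0:
        raise NotImplementedError("single-empty cases follow formulas (6)/(7), not reproduced")
    # both vectors have elements: match every w in j(i,k,h) to its nearest w' in g'(i,k,h)
    nearest = g_vec[np.argmin(np.abs(g_vec[None, :] - j_vec[:, None]), axis=1)]
    return float(np.sum((j_vec - nearest) ** 2))
```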
the invention trains a full-convolution lane line detection network according to a Back Propagation (BP) algorithm, network updating is carried out by utilizing a derivative of lane line detection loss, and the network updating gradient calculation mode is shown as a formula (9):
[Formula (9): network update gradient, combining the derivatives of the classification loss and the regression loss weighted by α and β; reproduced only as an image in the original]
the derivative of the classification loss in the update gradient is calculated as shown in equation (10):
[Formulas (10) and (11): derivative of the classification loss with respect to the output of the network-end convolutional layer; reproduced only as images in the original]
wherein C represents the total number of channels of the output feature map of the network-end convolutional layer, and c represents the channel index of the output feature map of the network-end convolutional layer;
according to | | D (j (i, k, h) -g' (i, k, h)) | survival optical circuit2Form of definition ofThe return loss derivative is calculated as follows:
● j(i,k,h) has no elements and g'(i,k,h) has no elements:
[Formula (12): reproduced only as an image in the original]
● j(i,k,h) has no elements and g'(i,k,h) has elements:
[Formula (13): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has no elements:
[Formula (14): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has elements:
[Formula (15): reproduced only as an image in the original]
In formulas (12) to (15), w denotes an element of the predicted position vector j(i,k,h) whenever j(i,k,h) has elements; if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the output feature map of the network-end convolutional layer. In formula (15), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose difference from the value of w has the smallest absolute value; w' is found by traversing every element w'' of g'(i,k,h) and comparing the absolute values of their differences from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the derivative of the regression loss part of the corresponding point in the output feature map of the network-end convolutional layer is set to 0;
the method comprises the steps of taking a process of calculating detection loss as a forward propagation process of detecting a loss layer, taking a process of calculating a lane line detection loss derivative as an error reverse propagation process of detecting the loss layer, taking a proportional coefficient of classification loss, a proportional coefficient of regression loss and a prediction probability threshold as layer parameters of the detection loss layer, training a full convolution lane line detection network by using a Back Propagation (BP) algorithm on a detection data set through setting the layer parameters of the detection loss layer to obtain a lane line detection network model, and realizing the detection of a lane line by using the obtained lane line detection network model.
The method has the advantage that straight lane lines and curved lane lines can be detected at the same time, and training the full-convolution lane line detection network with the lane line detection loss function improves the lane line detection effect. Compared with traditional lane line detection methods, the original captured image is used directly as input, eliminating the complex image preprocessing process; the convolutional neural network learns abstract features of the lane lines from the lane line classification data set instead of simply extracting their external features; only the lane line detection network model needs to be stored to detect new input images, which saves storage space and makes the method suitable for vehicle-mounted embedded equipment; and the small, shallow full-convolution lane line detection network accelerates detection, so the detection speed is high.
Drawings
Fig. 1 is a schematic diagram of a lane line classification network according to the present invention.
Fig. 2 is a schematic diagram of a full-convolution lane line detection network according to the present invention.
Fig. 3 is an overall flow chart of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The embodiment of the invention is implemented according to the flow in fig. 3, firstly, the lane line classification network is built, and the lane line classification network is trained on the classification data set to obtain a lane line classification network model. Then, the invention converts the model into an initialization detection network model to initialize the full-convolution lane line detection network, and trains the full-convolution lane line detection network on a detection data set by using the defined lane line detection loss to obtain the lane line detection network model. In the embodiment of the invention, a Caffe frame is used as an experimental platform, a lane line classification network is built in the Caffe frame, and the lane line classification network is trained on a lane line classification data set to obtain a lane line classification network model. The embodiment of the invention modifies the full-connection layer in the lane line classification network into the convolution layer, constructs the lane line detection network suitable for full convolution, and realizes the detection of the loss layer in the Caffe framework according to the definition of the lane line detection loss. And training the full-convolution lane line detection network on the detection data set by setting parameters of the detection loss layer to obtain a lane line detection network model.
The first step is as follows: constructing a lane line classification network
The lane line classification network is composed of three convolutional layers, three pooling layers and two fully connected layers. Its input is restricted to n×n-pixel pictures containing lane lines, and its output is the category number of the lane line contained in the input picture, where 0 denotes the background area, 1 a yellow solid line, 2 a yellow dotted line, 3 a white solid line and 4 a white dotted line. Each convolutional layer in the lane line classification network is followed by a pooling layer and connected to an activation function, and the first fully connected layer follows the last pooling layer and is also connected to an activation function. The specific structure of the lane line classification network is therefore: convolutional layer 1, pooling layer 1, convolutional layer 2, pooling layer 2, convolutional layer 3, pooling layer 3, fully connected layer 1 and fully connected layer 2 connected in sequence, with convolutional layer 1, convolutional layer 2, convolutional layer 3 and fully connected layer 1 each connected to an activation function. The loss layer and the accuracy layer are both connected to fully connected layer 2 but are not connected to each other, and label information must be input as a bottom layer connected to the loss layer and the accuracy layer. The pooling layers of the lane line classification network use MAX pooling, which takes the maximum value of the pixels covered by the pooling kernel as the pooling result, reducing the dimensions of the feature map;
the second step is that: training lane line classification network model
Training the lane line classification network constructed in the first step on a lane line classification data set yields the lane line classification network model. The lane line annotation information of the original video sequence contains the category of each lane line and the pixel positions of the lane line boundary points in each video frame. Straight-line fitting is performed on the boundary point positions of a lane line to obtain the boundary equations of its two side lines; coordinate points on the two side lines are selected according to the annotation information to form a rectangular box, and the rectangular box is used to crop the lane line area at the corresponding position of the original video sequence. Each cropped lane line area is stored as an n×n picture, consistent with the input picture size of the classification network in the first step. The cropped lane line area pictures are made into a lane line classification data set in lmdb database format, which comprises a training set and a test set; the lane line classification network is trained on the training set, and the effect of the obtained model is checked on the test set to obtain the lane line classification network model;
the third step: the fully connected layers in the lane line classification network are replaced by convolutional layers to construct a full-convolution lane line detection network, and the lane line classification network model obtained in the second step is converted into an initialization detection network model used to initialize the full-convolution lane line detection network; the input picture size of the lane line classification network is n×n pixels, and the classification network structure and parameter settings are shown in Table 1;
TABLE 1 Lane line Classification network architecture and parameter configuration
Network layer | Number of convolution kernels | Convolution kernel size | Step size | Zero padding
Convolutional layer 1 | 32 | 5×5 | 1 | 2
Activation function 1 | 32 | -- | -- | --
Pooling layer 1 | 32 | 2×2 | 2 | 0
Convolutional layer 2 | 32 | 5×5 | 1 | 2
Activation function 2 | 32 | -- | -- | --
Pooling layer 2 | 32 | 2×2 | 2 | 0
Convolutional layer 3 | 64 | 3×3 | 1 | 1
Activation function 3 | 64 | -- | -- | --
Pooling layer 3 | 64 | 2×2 | 2 | 0
Fully connected layer 1 | 64 | -- | -- | --
Activation function 4 | -- | -- | -- | --
Fully connected layer 2 | 5 | -- | -- | --
Loss layer | -- | -- | -- | --
Accuracy layer | -- | -- | -- | --
Setting the size of convolution kernels of the conversion convolutional layer 1 converted from the fully-connected layer 1 to be 4 x 4, setting the size of convolution kernels of the conversion convolutional layer 2 converted from the fully-connected layer 2 to be 1 x 1, and keeping the number of convolution kernels of the convolutional layer converted from the fully-connected layer consistent with the number of outputs of the original fully-connected layer;
the step of converting the lane line classification network model into the initialization detection network model comprises the following steps:
the parameter matrix of each fully connected layer in the lane line classification network model is unfolded into a column vector, and the element values of this column vector are assigned in sequence to the elements of the column vector obtained by unfolding the parameter matrix of the corresponding conversion convolutional layer in the full-convolution lane line detection network; the parameters of the other layers of the full-convolution lane line detection network are taken directly from the classification network model. This yields the initialization detection network model, which is applied as the initial model of the full-convolution lane line detection network in its training process;
the fourth step: training lane line detection network model
The parameters of each network layer in the full-convolution lane line detection network are assigned from the initialization detection network model obtained in the third step to complete the initialization of the detection network, and the full-convolution lane line detection network is then trained on a detection data set using the lane line detection loss. The lane line detection task must identify both the type of a lane line and its position in the image, so the detection loss comprises a classification loss and a regression loss, the regression loss being a position loss. The lane line detection loss L is defined as shown in formula (1):
L = αL_C + βL_R (1)
wherein α represents the proportionality coefficient of the classification loss in the detection loss, β represents the proportionality coefficient of the regression loss in the detection loss, L_C is the classification loss, and L_R is the regression loss;
the classification loss represents the loss between the prediction tag and the real data, and is defined as shown in formula (2):
[Formula (2): classification loss L_C, defined over the label values g(i,k,h,w) and the prediction probabilities p(i,k,h,w); reproduced only as an image in the original]
wherein M represents the number of input pictures to the detection network; K represents the number of channels of the label matrix, which is consistent with the total number of lane line categories including the background area; H represents the height of the output feature map of the network-end convolutional layer and W represents its width, and H and W are consistent with the height and width of the sub-matrix in each channel of the label matrix; g(i,k,h,w) represents the label value at (i,k,h,w) in the label array of the real data, i.e. the probability that the label category at position (h,w) of the convolved feature map of the i-th input picture is k; the values in the label array are 0 or 1, where 0 means the label category at (h,w) is not k and 1 means it is k, with k = 0 denoting the background area, k = 1 a yellow solid line, k = 2 a yellow dotted line, k = 3 a white solid line and k = 4 a white dotted line; p(i,k,h,w) represents the prediction probability of category k at position (h,w) of the convolved feature map of the i-th input picture, a decimal value in the interval (0, 1). The detection loss layer converts the input feature map into a prediction probability matrix using the Softmax algorithm, and the prediction probability of each pixel on the feature map is calculated as shown in formula (3):
p(i,c,h,w) = exp(y(i,c,h,w)) / Σ_k exp(y(i,k,h,w)), k ∈ {0,1,2,3,4} (3)
wherein y(i,c,h,w) = y'(i,c,h,w) - max(y'(i,k,h,w)), k ∈ {0,1,2,3,4}; y'(i,c,h,w) represents the value of the pixel in channel c at position (h,w) of the i-th input convolved feature map, and max(y'(i,k,h,w)) represents the maximum pixel value over the five channels at position (h,w) of the i-th convolved feature map; k is the channel index used to traverse the feature map channels, and since each feature map contains 5 channels, k takes values in {0,1,2,3,4};
the regression loss represents the loss between the lane line position predicted by the detection network and the lane line position in the tag data, the position of the lane line in the feature map can be judged by using the prediction probability in the formula (3), and then the regression loss is calculated by comparing the position of the lane line in the tag data, wherein the detailed steps of the comparison are as follows:
a row of the feature map is selected; the column positions at which a lane line is predicted in that row are stored in a vector P (the predicted position vector), and the column positions of the lane line in the corresponding row of the input label data are stored in a vector L (the label position vector), a column position being the horizontal coordinate. The L2 loss between P and L gives the regression loss of that row of the feature map, and the output regression loss is obtained by summing the regression losses of all rows of the feature map and taking the average, as shown in formula (4):
[Formula (4): regression loss L_R, the average over all feature map rows of the per-row losses ||D(j(i,k,h)-g'(i,k,h))||²; reproduced only as an image in the original]
wherein D(j(i,k,h)-g'(i,k,h)) is the vector obtained by subtracting the label position vector g'(i,k,h) from the predicted position vector j(i,k,h), and j(i,k,h) represents the set of column positions in row h of the output feature map of the i-th picture whose category is k, i.e. the predicted position vector. The prediction probability p(i,k,h,w) of each pixel in the feature map is compared with the prediction probability threshold and the comparison result is recorded as t(i,k,h,w): when p(i,k,h,w) is greater than the prediction probability threshold, t(i,k,h,w) = 1, otherwise t(i,k,h,w) = 0; if t(i,k,h,w) = 1, then w is stored in j(i,k,h). t(i,k,h,w) is defined as shown in formula (5):
t(i,k,h,w) = 1 if p(i,k,h,w) > p_t, and t(i,k,h,w) = 0 otherwise (5)
wherein p_t represents the prediction probability threshold used to judge whether the current pixel belongs to lane line category k: t(i,k,h,w) = 1 indicates that position (h,w) of the i-th feature map is classified as lane line category k, and t(i,k,h,w) = 0 indicates that position (h,w) does not belong to lane line category k, with k = 0 denoting the background area. g'(i,k,h) is the label position vector; it is obtained in a similar way to j(i,k,h), except that the label data in the detection data set already provide a label probability of 0 or 1, so the label data g(i,k,h,w) are judged directly: if the value of g(i,k,h,w) is 1, then w is saved in g'(i,k,h), and if the value of g(i,k,h,w) is 0, then w is not saved;
||D(j(i,k,h)-g'(i,k,h))||² represents the L2 loss between the predicted position vector j(i,k,h) and the label position vector g'(i,k,h), i.e. the squared modulus of the vector D(j(i,k,h)-g'(i,k,h)). The calculation of ||D(j(i,k,h)-g'(i,k,h))||² is divided into the following four cases, where "having elements" means the vector contains lane line information:
● j(i,k,h) has no elements and g'(i,k,h) has no elements: neither the predicted position vector nor the label position vector contains a lane line, and ||D(j(i,k,h)-g'(i,k,h))||² = 0;
● j(i,k,h) has no elements and g'(i,k,h) has elements:
[Formula (6): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has no elements:
[Formula (7): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has elements:
[Formula (8): reproduced only as an image in the original]
In formulas (6) to (8), w denotes an element of the predicted position vector j(i,k,h) whenever j(i,k,h) has elements; if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the output feature map of the network-end convolutional layer. In formula (8), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose difference from the value of w has the smallest absolute value; w' is found by traversing every element w'' of g'(i,k,h) and comparing the absolute values of their differences from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the regression loss contribution of the corresponding point in the output feature map of the network-end convolutional layer is set to 0;
the invention trains a full-convolution lane line detection network according to a Back Propagation (BP) algorithm, network updating is carried out by utilizing a derivative of lane line detection loss, and the network updating gradient calculation mode is shown as a formula (9):
[Formula (9): network update gradient, combining the derivatives of the classification loss and the regression loss weighted by α and β; reproduced only as an image in the original]
the derivative of the classification loss in the update gradient is calculated as shown in equation (10):
[Formulas (10) and (11): derivative of the classification loss with respect to the output of the network-end convolutional layer; reproduced only as images in the original]
wherein C represents the total number of channels of the output feature map of the network-end convolutional layer, and c represents the channel index of the output feature map of the network-end convolutional layer;
According to the definition of ||D(j(i,k,h)-g'(i,k,h))||², the regression loss derivative is calculated as follows:
● j(i,k,h) has no elements and g'(i,k,h) has no elements:
[Formula (12): reproduced only as an image in the original]
● j(i,k,h) has no elements and g'(i,k,h) has elements:
[Formula (13): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has no elements:
[Formula (14): reproduced only as an image in the original]
● j(i,k,h) has elements and g'(i,k,h) has elements:
[Formula (15): reproduced only as an image in the original]
In formulas (12) to (15), w denotes an element of the predicted position vector j(i,k,h) whenever j(i,k,h) has elements; if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the output feature map of the network-end convolutional layer. In formula (15), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose difference from the value of w has the smallest absolute value; w' is found by traversing every element w'' of g'(i,k,h) and comparing the absolute values of their differences from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the derivative of the regression loss part of the corresponding point in the output feature map of the network-end convolutional layer is set to 0;
the method comprises the steps of taking a process of calculating detection loss as a forward propagation process of detecting a loss layer, taking a process of calculating a lane line detection loss derivative as an error reverse propagation process of detecting the loss layer, taking a proportional coefficient of classification loss, a proportional coefficient of regression loss and a prediction probability threshold as layer parameters of the detection loss layer, training a full convolution lane line detection network by using a Back Propagation (BP) algorithm on a detection data set through setting the layer parameters of the detection loss layer to obtain a lane line detection network model, and realizing the detection of a lane line by using the obtained lane line detection network model.
An embodiment of the present invention comprises the steps of:
the first step is as follows: and constructing a lane line classification network. And constructing a lane line classification network in a Caffe framework, wherein the structure of the lane line classification network is shown in FIG. 1, and the setting of each network layer parameter in the lane line classification network is shown in Table 1.
The second step is that: and training a lane line classification network model. In the embodiment of the invention, the lane line classification network is trained on the lane line classification data set, and the picture size of the training set and the test set adopts pictures with the size of 32 multiplied by 32 pixels. The ratio of the number of pictures in the training set to the number of pictures in the test set is 5: 1. the method adopts the following strategy to train the lane line classification network, wherein the training network inputs 1000 pictures each time, the test is performed on the test set after the whole training set is input and trained, the initial learning rate of the training is set to be 0.001, the learning rate is multiplied by 0.1 to be reduced every 200 epochs of training, and the network is trained for 1000 epochs to obtain a lane line classification network model; the classification accuracy of the obtained lane line classification network model to the background area and each type of lane line is more than 92%, the classification accuracy is high, and the effect is good.
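A hedged sketch of a Caffe solver configuration matching this schedule is shown below. Caffe counts iterations rather than epochs, so the training set size used to convert epochs to iterations, the test_iter value and the file names are placeholder assumptions, not values from the patent.

```python
# Hedged sketch: generate a Caffe solver prototxt for the classification training schedule
# (base_lr 0.001, multiplied by 0.1 every 200 epochs, 1000 epochs, test after each epoch).
TRAIN_IMAGES = 100000          # placeholder: the actual data set size is not given
BATCH_SIZE = 1000              # 1000 pictures per training input
ITERS_PER_EPOCH = TRAIN_IMAGES // BATCH_SIZE

solver = f"""
net: "lane_classification_train_test.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: {200 * ITERS_PER_EPOCH}
max_iter: {1000 * ITERS_PER_EPOCH}
test_interval: {ITERS_PER_EPOCH}
test_iter: 100
snapshot: {100 * ITERS_PER_EPOCH}
snapshot_prefix: "lane_cls"
"""
print(solver)
```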
The third step: modifying the full-connection layer in the lane line classification network into a convolutional layer, constructing a full-convolutional lane line detection network model, converting the classification network model obtained in the second step into an initialization detection network model, wherein the full-convolutional lane line detection network structure is shown in fig. 2, and the parameter settings of each layer are shown in table 2.
TABLE 2 full-convolution lane line detection network structure and parameter configuration
[Table 2 is reproduced only as an image in the original. The detection network repeats the layers of Table 1 with fully connected layer 1 replaced by conversion convolutional layer 1 (64 kernels of size 4×4) and fully connected layer 2 replaced by conversion convolutional layer 2 (5 kernels of size 1×1).]
The fourth step: and training a lane line detection network model. In this embodiment, a lane line detection loss layer is compiled in Caffe, a proportional coefficient of classification loss in the detection loss layer is set to 0.5, a coefficient of regression loss is set to 0.5, a prediction probability threshold is set to 0.8, and an initialized detection network model obtained by converting a lane line classification network model is used to perform parameter initialization on a detection network. The full convolution lane detection network is trained on the detection data set, the initial learning rate of the training is set to be 0.00001, and the training is kept unchanged in the whole training process. 10 pictures are input each time to train the network, 100 epochs are trained to obtain a lane line detection network model, and the obtained lane line detection network model is used for detecting lane lines.
A large number of outliers (background-area points detected as lane line area points) appear in the detection results of the initialization detection network model obtained by converting the lane line classification network model, and these outliers strongly interfere with the subsequent lane line fitting. Compared with the initialization detection network model, the trained full-convolution detection network model shows a weakened response to the inner region of a lane line but can still detect the boundary region points of the lane line. More importantly, the lane line detection network model removes a large number of spurious points, reducing the complexity of the subsequent lane line fitting. Comparing the detection results of the initialization detection network model and the lane line detection network model shows that the regression loss part of the detection loss function defined in the invention corrects the detected lane line positions and improves the lane line detection effect.
According to the invention, a quadratic curve model is adopted to fit lane lines to the extracted lane line area points. The lane line detection network model gives a good lane line detection effect under good road conditions, while the effect is less ideal under poor road conditions such as wear, reflections and occlusion by vehicles. Because the characteristics of a solid line and a dotted line of the same color are close to each other, the deep learning technique cannot always distinguish them accurately, and the lane line detection network model may confuse a solid line and a dotted line of the same color.
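A minimal numpy sketch of such a quadratic fit over the extracted lane line area points is given below; the point format and function names are assumptions.

```python
# Hedged sketch: fit one lane line's area points with a second-degree polynomial x = a*y^2 + b*y + c.
import numpy as np

def fit_lane_curve(points):
    """points: array of (x, y) lane-area points belonging to one lane line."""
    ys, xs = points[:, 1], points[:, 0]
    a, b, c = np.polyfit(ys, xs, 2)          # quadratic curve model
    return a, b, c

def lane_x(coeffs, y):
    a, b, c = coeffs
    return a * y * y + b * y + c             # column coordinate of the fitted lane at row y
```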
The lane line detection network model detects pictures of size 1024×1280 with an average time of 54.57 ms (only the forward propagation of the network is executed when detecting new input pictures), reaching about 18 FPS, which is a high detection speed. The lane line detection network model is only 440 kB in size, occupies little storage space, and is suitable for vehicle-mounted embedded equipment.
In a word, the lane line detection network model can realize the lane line detection task, occupies a smaller storage space, has higher detection speed, meets the application real-time performance and achieves the aim of the invention.

Claims (1)

1. A lane line detection method based on a full convolution network is characterized by comprising the following steps:
the first step is as follows: constructing a lane line classification network
The lane line classification network is composed of three convolutional layers, three pooling layers and two fully connected layers. Its input is restricted to n×n-pixel pictures containing lane lines, and its output is the category number of the lane line contained in the input picture, where 0 denotes the background area, 1 a yellow solid line, 2 a yellow dotted line, 3 a white solid line and 4 a white dotted line. Each convolutional layer in the lane line classification network is followed by a pooling layer and connected to an activation function, and the first fully connected layer follows the last pooling layer and is also connected to an activation function. The specific structure of the lane line classification network is therefore: convolutional layer 1, pooling layer 1, convolutional layer 2, pooling layer 2, convolutional layer 3, pooling layer 3, fully connected layer 1 and fully connected layer 2 connected in sequence, with convolutional layer 1, convolutional layer 2, convolutional layer 3 and fully connected layer 1 each connected to an activation function. The loss layer and the accuracy layer are both connected to fully connected layer 2 but are not connected to each other, and label information must be input as a bottom layer connected to the loss layer and the accuracy layer. The pooling layers of the lane line classification network use MAX pooling, which takes the maximum value of the pixels covered by the pooling kernel as the pooling result, reducing the dimensions of the feature map;
the second step is that: training lane line classification network model
Training the lane line classification network constructed in the first step on a lane line classification data set yields the lane line classification network model. The lane line annotation information of the original video sequence contains the category of each lane line and the pixel positions of the lane line boundary points in each video frame. Straight-line fitting is performed on the boundary point positions of a lane line to obtain the boundary equations of its two side lines; coordinate points on the two side lines are selected according to the annotation information to form a rectangular box, and the rectangular box is used to crop the lane line area at the corresponding position of the original video sequence. Each cropped lane line area is stored as an n×n picture, consistent with the input picture size of the classification network in the first step. The cropped lane line area pictures are made into a lane line classification data set in lmdb database format, which comprises a training set and a test set; the lane line classification network is trained on the training set, and the effect of the obtained model is checked on the test set to obtain the lane line classification network model;
the third step: the fully connected layers in the lane line classification network are replaced by convolutional layers to construct a full-convolution lane line detection network, and the lane line classification network model obtained in the second step is converted into an initialization detection network model used to initialize the full-convolution lane line detection network; the input picture size of the lane line classification network is n×n pixels, and the classification network structure and parameter settings are shown in Table 1;
TABLE 1 Lane line Classification network architecture and parameter configuration
Network layer | Number of convolution kernels | Convolution kernel size | Step size | Zero padding
Convolutional layer 1 | 32 | 5×5 | 1 | 2
Activation function 1 | 32 | -- | -- | --
Pooling layer 1 | 32 | 2×2 | 2 | 0
Convolutional layer 2 | 32 | 5×5 | 1 | 2
Activation function 2 | 32 | -- | -- | --
Pooling layer 2 | 32 | 2×2 | 2 | 0
Convolutional layer 3 | 64 | 3×3 | 1 | 1
Activation function 3 | 64 | -- | -- | --
Pooling layer 3 | 64 | 2×2 | 2 | 0
Fully connected layer 1 | 64 | -- | -- | --
Activation function 4 | -- | -- | -- | --
Fully connected layer 2 | 5 | -- | -- | --
Loss layer | -- | -- | -- | --
Accuracy layer | -- | -- | -- | --
The convolution kernel size of conversion convolutional layer 1, converted from full-connection layer 1, is set to 4 × 4, and the convolution kernel size of conversion convolutional layer 2, converted from full-connection layer 2, is set to 1 × 1; the number of convolution kernels of each convolutional layer converted from a full-connection layer is kept equal to the number of outputs of the original full-connection layer;
the step of converting the lane line classification network model into the initialization detection network model comprises the following steps:
The parameter matrix of each full-connection layer in the lane line classification network model is unfolded into a column vector, and the element values of this column vector are assigned, in order, to the elements of the unfolded parameter matrix of the corresponding conversion convolutional layer in the fully convolutional lane line detection network. The parameters of all other layers of the detection network are taken directly from the classification network model. The result is the initialization detection network model, which is used as the initial model of the fully convolutional lane line detection network in its training process;
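In practice this conversion amounts to reshaping each fully-connected weight matrix into the corresponding convolution kernel tensor, as in the numpy sketch below under the Table 1 shapes (64 × 64·4·4 for full-connection layer 1, 5 × 64 for full-connection layer 2). The output-major/channel/height/width ordering of the flattening is an assumption; the biases can be copied unchanged.

```python
import numpy as np

def fc_to_conv(fc_weight, in_channels, kernel_size):
    """Reshape an FC weight matrix (out_features, in_features) into conv kernels
    of shape (out_channels, in_channels, k, k)."""
    out_features, in_features = fc_weight.shape
    assert in_features == in_channels * kernel_size * kernel_size
    return fc_weight.reshape(out_features, in_channels, kernel_size, kernel_size)

# FC1 (64 outputs, fed by a 64-channel 4x4 map) -> conversion convolutional layer 1, 4x4 kernels
fc1_w = np.random.randn(64, 64 * 4 * 4).astype(np.float32)   # stands in for the trained weights
conv1_w = fc_to_conv(fc1_w, in_channels=64, kernel_size=4)    # shape (64, 64, 4, 4)

# FC2 (5 outputs, fed by the 64-dimensional FC1 output) -> conversion convolutional layer 2, 1x1 kernels
fc2_w = np.random.randn(5, 64).astype(np.float32)
conv2_w = fc_to_conv(fc2_w, in_channels=64, kernel_size=1)    # shape (5, 64, 1, 1)
```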
The fourth step: training the lane line detection network model
The parameters of each network layer of the fully convolutional lane line detection network are assigned from the corresponding parameters of the initialization detection network model obtained in the third step, completing the initialization of the detection network, and the detection network is then trained on a detection data set with the lane line detection loss. A lane line detection task has to identify both the category of a lane line and its position in the image, so the detection loss consists of a classification loss and a regression loss, the regression loss being the position loss. The lane line detection loss L is defined as shown in formula (1):
L = αL_C + βL_R    (1)
where α is the proportionality coefficient of the classification loss in the detection loss, β is the proportionality coefficient of the regression loss in the detection loss, L_C is the classification loss and L_R is the regression loss;
The classification loss represents the loss between the predicted labels and the ground-truth labels and is defined as shown in formula (2):
L_C = -\frac{1}{M H W} \sum_{i=1}^{M} \sum_{k=0}^{K-1} \sum_{h=1}^{H} \sum_{w=1}^{W} g(i,k,h,w) \log p(i,k,h,w)    (2)
where M is the number of input pictures of the detection network; K is the number of channels of the label matrix, equal to the total number of categories (the lane line categories plus the background area); H is the height and W the width of the output feature map of the convolutional layer at the end of the network, consistent with the height and width of the sub-matrix in each channel of the label matrix. g(i,k,h,w) is the label value at (i,k,h,w) in the ground-truth label array and represents the probability that the label category at (h,w) of the feature map of the i-th input picture after convolution is k; its value is 0 or 1, where 0 means the label category at (h,w) is not k and 1 means it is k. When k = 0 the category is the background area, k = 1 a yellow solid line, k = 2 a yellow dotted line, k = 3 a white solid line and k = 4 a white dotted line. p(i,k,h,w) is the predicted probability of category k at (h,w) of the feature map of the i-th input picture after convolution, a decimal in the interval (0, 1); the detection loss layer converts the input feature map into a prediction probability matrix with the Softmax algorithm, and the prediction probability of each pixel point of the feature map is calculated as shown in formula (3):
p(i,c,h,w) = \frac{e^{y(i,c,h,w)}}{\sum_{k=0}^{4} e^{y(i,k,h,w)}}    (3)
where y(i,c,h,w) = y'(i,c,h,w) − max(y'(i,k,h,w)), k ∈ {0,1,2,3,4}; y'(i,c,h,w) is the value of the pixel of channel c at position (h,w) of the i-th input convolution feature map, max(y'(i,k,h,w)) is the maximum pixel value over the five channels at position (h,w) of the i-th convolution feature map, and k is the channel index used to traverse the feature map channels; since each feature map contains 5 channels, k takes values in {0,1,2,3,4};
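As a sketch of formulas (2) and (3), the stabilized softmax and the cross-entropy can be written in a few lines of numpy; the 1/(M·H·W) normalization follows the reconstruction of formula (2) above, and the small epsilon inside the logarithm is an added numerical safeguard.

```python
import numpy as np

def softmax_probs(y_raw):
    """Formula (3): y_raw has shape (M, K, H, W); softmax over the K = 5 channel axis
    after subtracting the per-pixel channel maximum for numerical stability."""
    y = y_raw - y_raw.max(axis=1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)

def classification_loss(y_raw, g):
    """Formula (2): cross-entropy between the 0/1 label array g (M, K, H, W) and the
    predicted probabilities, averaged over pictures and grid positions."""
    p = softmax_probs(y_raw)
    m, _, h, w = g.shape
    return -np.sum(g * np.log(p + 1e-12)) / (m * h * w)
```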
The regression loss represents the loss between the lane line positions predicted by the detection network and the lane line positions in the label data. The position of the lane line in the feature map can be determined from the prediction probability of formula (3) and then compared with the lane line position in the label data to compute the regression loss; the detailed steps of the comparison are as follows:
For a given row of the feature map, the column positions (horizontal coordinates) at which a lane line is predicted in that row are stored in a vector P, the predicted position vector, and the column positions of the lane line in the same row of the input label data are stored in a vector L, the label position vector. The L2 loss between P and L is then the regression loss of that row; summing the regression losses of all rows of the feature map and taking the average gives the output regression loss, calculated as shown in formula (4):
L_R = \frac{1}{M K H} \sum_{i=1}^{M} \sum_{k=0}^{K-1} \sum_{h=1}^{H} \left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2    (4)
where D(j(i,k,h) − g'(i,k,h)) is the vector obtained by subtracting the label position vector g'(i,k,h) from the predicted position vector j(i,k,h), and j(i,k,h) is the set of column positions in row h of the output feature map of the i-th picture whose category is k, i.e. the predicted position vector. The prediction probability p(i,k,h,w) of each pixel point of the feature map is compared with a prediction probability threshold and the result is recorded as t(i,k,h,w): when p(i,k,h,w) is greater than the threshold, t(i,k,h,w) = 1, otherwise t(i,k,h,w) = 0; whenever t(i,k,h,w) = 1, w is stored in j(i,k,h). t(i,k,h,w) is defined as shown in formula (5):
t(i,k,h,w) = \begin{cases} 1, & p(i,k,h,w) > p_t \\ 0, & p(i,k,h,w) \le p_t \end{cases}    (5)
wherein p istRepresenting a prediction probability threshold value, used for judging whether the current pixel point belongs to a lane line class k, when t (i, k, h, w) is ' 1 ', representing that (h, w) on the ith feature map is classified into the lane line class k, when t (i, k, h, w) is ' 0 ', representing that the position of (h, w) does not belong to the lane line class k, when k is 0, representing a background area, g ' (i, k, h) is a label position vector, the obtaining process is similar to j (i, k, h), the difference is that label probability 0 or 1 is provided for the label data in the detection data set, and directly judging 0 and 1 for the label data g (i, k, h, w), if the value of g (i, k, h, w) is 1, then w is saved in g' (i, k, h), if the value of g (i, k, h, w) is 0, then w is not saved;
||D(j(i,k,h) − g'(i,k,h))||^2 denotes the L2 loss between the predicted position vector j(i,k,h) and the label position vector g'(i,k,h), i.e. the square of the modulus of the vector D(j(i,k,h) − g'(i,k,h)). Its calculation is divided into the following four cases, where an element is a column position carrying lane line information:
j(i,k,h) has no elements and g'(i,k,h) has no elements: neither the predicted position vector nor the label position vector contains a lane line in this row, so ||D(j(i,k,h) − g'(i,k,h))||^2 = 0;
j(i,k,h) has no elements, g'(i,k,h) has elements:
\left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2 = \sum_{w \in g'(i,k,h)} W^2    (6)
j(i,k,h) has elements, g'(i,k,h) has no elements:
\left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2 = \sum_{w \in j(i,k,h)} W^2    (7)
j(i,k,h) has elements, g'(i,k,h) has elements:
\left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2 = \sum_{w \in j(i,k,h)} (w - w')^2    (8)
In formulas (6) to (8), as long as the predicted position vector j(i,k,h) has elements, w denotes an element of j(i,k,h); if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the output feature map of the network-end convolutional layer. In formula (8), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose absolute difference from w is smaller than that of any other element of g'(i,k,h); w' is found by traversing all elements w'' of g'(i,k,h) and taking the one with the minimum absolute difference from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the regression loss contribution of the corresponding point in the network-end convolutional layer output feature map is set to 0;
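A numpy sketch of the four cases and of formula (4) follows. The W^2 penalty used when one of the two vectors is empty follows the reconstruction of formulas (6) and (7) above and should be read as an assumption; the nearest-element matching implements formula (8).

```python
import numpy as np

def row_squared_distance(j, g_prime, W):
    """||D(j - g')||^2 for one row, following the four cases of formulas (6)-(8)."""
    if j.size == 0 and g_prime.size == 0:
        return 0.0                                   # neither prediction nor label in this row
    if j.size == 0:                                  # missed lane-line columns: W^2 per label element (assumed)
        return float(g_prime.size) * W ** 2
    if g_prime.size == 0:                            # spurious predictions: W^2 per predicted element (assumed)
        return float(j.size) * W ** 2
    # both non-empty: match every predicted column w to its nearest label column w'
    nearest = g_prime[np.abs(g_prime[None, :] - j[:, None]).argmin(axis=1)]
    return float(np.sum((j - nearest) ** 2))

def regression_loss(p, g, p_t=0.5):
    """Formula (4): average the per-row distances over pictures, categories and rows."""
    m, k_total, h_total, W = p.shape
    total = 0.0
    for i in range(m):
        for k in range(k_total):
            for h in range(h_total):
                j = np.flatnonzero(p[i, k, h] > p_t)
                g_prime = np.flatnonzero(g[i, k, h] == 1)
                total += row_squared_distance(j, g_prime, W)
    return total / (m * k_total * h_total)
```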
The fully convolutional lane line detection network is trained according to the back-propagation algorithm and is updated with the derivative of the lane line detection loss; the network update gradient is calculated as shown in formula (9):
\frac{\partial L}{\partial y(i,c,h,w)} = \alpha \frac{\partial L_C}{\partial y(i,c,h,w)} + \beta \frac{\partial L_R}{\partial y(i,c,h,w)}    (9)
The derivative of the classification loss in the update gradient is calculated as shown in formulas (10) and (11):
\frac{\partial L_C}{\partial y(i,c,h,w)} = -\frac{1}{M H W} \sum_{k=0}^{C-1} g(i,k,h,w) \frac{\partial \log p(i,k,h,w)}{\partial y(i,c,h,w)}    (10)

\frac{\partial L_C}{\partial y(i,c,h,w)} = \frac{1}{M H W} \left( p(i,c,h,w) \sum_{k=0}^{C-1} g(i,k,h,w) - g(i,c,h,w) \right)    (11)
where C is the total number of channels of the network-end convolutional layer output feature map and c is the channel index of that feature map;
According to the four cases of ||D(j(i,k,h) − g'(i,k,h))||^2, the regression loss derivative is calculated as follows:
j(i,k,h) has no elements, g'(i,k,h) has no elements:
\frac{\partial \left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2}{\partial w} = 0    (12)
j(i,k,h) has no elements, g'(i,k,h) has elements:
\frac{\partial \left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2}{\partial w} = 2W, \quad w \in g'(i,k,h)    (13)
j(i,k,h) has elements, g'(i,k,h) has no elements:
\frac{\partial \left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2}{\partial w} = 2W, \quad w \in j(i,k,h)    (14)
j(i,k,h) has elements, g'(i,k,h) has elements:
\frac{\partial \left\| D\big(j(i,k,h) - g'(i,k,h)\big) \right\|^2}{\partial w} = 2(w - w'), \quad w \in j(i,k,h)    (15)
In formulas (12) to (15), as long as the predicted position vector j(i,k,h) has elements, w denotes an element of j(i,k,h); if only the label position vector g'(i,k,h) has elements, w denotes an element of g'(i,k,h). W denotes the width of the network-end convolutional layer output feature map. In formula (15), w'' is an arbitrary element of the label position vector g'(i,k,h), and w' is the element of g'(i,k,h) whose absolute difference from w is smaller than that of any other element of g'(i,k,h); w' is found by traversing all elements w'' of g'(i,k,h) and taking the one with the minimum absolute difference from w. For column coordinates that appear in neither j(i,k,h) nor g'(i,k,h), the derivative of the regression loss contribution of the corresponding point in the network-end convolutional layer output feature map is set to 0;
The process of calculating the detection loss serves as the forward propagation process of the detection loss layer, and the process of calculating the derivative of the lane line detection loss serves as its error back-propagation process. The proportionality coefficient of the classification loss, the proportionality coefficient of the regression loss and the prediction probability threshold are the layer parameters of the detection loss layer. With these layer parameters set, the fully convolutional lane line detection network is trained on the detection data set by the back-propagation algorithm to obtain the lane line detection network model, and the obtained lane line detection network model is used to detect lane lines.
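Putting the pieces together, the detection loss layer can be outlined as a small Python class whose forward pass evaluates formula (1) and whose backward pass applies the α- and β-weighted gradients. It reuses the classification_loss and regression_loss sketches above; the regression gradient is left as a zero placeholder rather than an implementation of formulas (12)-(15), and all default parameter values are illustrative.

```python
import numpy as np

class LaneDetectionLoss:
    """Detection loss layer sketch: L = alpha * L_C + beta * L_R (formula (1)).

    cls_fn and reg_fn stand for the classification_loss and regression_loss
    sketches above."""

    def __init__(self, cls_fn, reg_fn, alpha=1.0, beta=1.0, p_t=0.5):
        self.cls_fn, self.reg_fn = cls_fn, reg_fn
        self.alpha, self.beta, self.p_t = alpha, beta, p_t   # the three layer parameters

    @staticmethod
    def _softmax(y_raw):
        """Stabilized softmax over the channel axis (formula (3))."""
        y = y_raw - y_raw.max(axis=1, keepdims=True)
        e = np.exp(y)
        return e / e.sum(axis=1, keepdims=True)

    def forward(self, y_raw, g):
        """Forward pass: y_raw is the end-convolution-layer output (M, 5, H, W),
        g the 0/1 label array of the same shape."""
        p = self._softmax(y_raw)
        return self.alpha * self.cls_fn(y_raw, g) + self.beta * self.reg_fn(p, g, self.p_t)

    def backward(self, y_raw, g):
        """Backward pass: weighted sum of the classification gradient (the usual softmax
        cross-entropy derivative, formulas (10)-(11)) and the case-by-case regression
        gradient (formulas (12)-(15)), left here as a zero placeholder."""
        m, _, h, w = g.shape
        p = self._softmax(y_raw)
        grad_cls = (p * g.sum(axis=1, keepdims=True) - g) / (m * h * w)
        grad_reg = np.zeros_like(y_raw)
        return self.alpha * grad_cls + self.beta * grad_reg
```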
CN201711420524.6A 2017-12-25 2017-12-25 Lane line detection method based on full convolution network Active CN108009524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711420524.6A CN108009524B (en) 2017-12-25 2017-12-25 Lane line detection method based on full convolution network

Publications (2)

Publication Number Publication Date
CN108009524A CN108009524A (en) 2018-05-08
CN108009524B (en) 2021-07-09

Family

ID=62061049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711420524.6A Active CN108009524B (en) 2017-12-25 2017-12-25 Lane line detection method based on full convolution network

Country Status (1)

Country Link
CN (1) CN108009524B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608456A (en) * 2015-12-22 2016-05-25 华中科技大学 Multi-directional text detection method based on full convolution network
CN105631426A (en) * 2015-12-29 2016-06-01 中国科学院深圳先进技术研究院 Image text detection method and device
CN106097444A (en) * 2016-05-30 2016-11-09 百度在线网络技术(北京)有限公司 High-precision map generates method and apparatus
CN106940562A (en) * 2017-03-09 2017-07-11 华南理工大学 A kind of mobile robot wireless clustered system and neutral net vision navigation method
CN107229904A (en) * 2017-04-24 2017-10-03 东北大学 A kind of object detection and recognition method based on deep learning
CN107424161A (en) * 2017-04-25 2017-12-01 南京邮电大学 A kind of indoor scene image layout method of estimation by thick extremely essence
CN107506765A (en) * 2017-10-13 2017-12-22 厦门大学 A kind of method of the license plate sloped correction based on neutral net

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Traffic sign detection and recognition using fully convolutional network guided proposals; Yingying Zhu et al.; Neurocomputing; 2016-12-31; pp. 758-766 *
A real-time urban road lane line recognition method and implementation; Zeng Zhi et al.; Electronic Technology & Software Engineering; 2015-12-31; pp. 88-90 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant