CN114463721A - Lane line detection method based on spatial feature interaction - Google Patents

Lane line detection method based on spatial feature interaction

Info

Publication number
CN114463721A
CN114463721A CN202210113686.XA CN202210113686A CN114463721A CN 114463721 A CN114463721 A CN 114463721A CN 202210113686 A CN202210113686 A CN 202210113686A CN 114463721 A CN114463721 A CN 114463721A
Authority
CN
China
Prior art keywords
feature
lane line
interaction
spatial
line detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210113686.XA
Other languages
Chinese (zh)
Inventor
宋立新
焦守文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN202210113686.XA priority Critical patent/CN114463721A/en
Publication of CN114463721A publication Critical patent/CN114463721A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4084 Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lane line detection method based on spatial feature interaction. The lane line detection problem is defined as position selection and classification in the row direction on top of spatial feature interaction. The aim is to keep a fast detection speed while, through the interaction of spatial features, letting every position perceive all of the spatial information in the same feature map, which alleviates the poor detection results caused by factors such as vehicle occlusion, worn road markings and lighting conditions. In addition, the invention provides, in the upsampling stage, bilateral upsampling that combines coarse-grained and fine-grained features, so that a low-resolution feature map can be accurately restored to pixel-level predictions. The method comprises the following steps. Step one: processing the training data; step two: constructing a lane line detection network based on spatial feature interaction; step three: training the lane line detection model; step four: testing the lane line detection model. The invention belongs to the technical field of automatic driving.

Description

Lane line detection method based on spatial feature interaction
Technical Field
The invention relates to the technical field of automobile driver assistance and automatic driving, and in particular to a lane line detection method.
Background
Lane line detection is the process of automatically sensing the shape and position of marked lane lines and is a key component of an automatic driving system. As a basic module of automatic driving, it plays an important role in applications such as real-time vehicle localization, driving route planning, lane keeping assistance and adaptive cruise control. However, lane line detection still faces many challenges owing to severe occlusion, bad weather conditions, blurred road surfaces, and the inherently thin and elongated shape of the lane lines themselves.
Conventional lane line detection methods typically rely on hand-crafted features followed by post-processing to fit the shape of the lane line. However, such methods cannot remain robust in real scenes, because manually designed models cannot cope with the diversity of lane lines across different scenes.
In recent years, most research on lane line detection has focused on deep learning. Early deep-learning methods detected lane lines by segmentation, but the strict real-time requirements of autonomous driving make a fast detection speed essential for lane line detection algorithms. For this purpose, row-wise detection methods define lane line detection as finding the set of positions of the lane lines in certain rows of the image, i.e., position selection and classification along the row direction. Although such methods are fast, the slender shape of lane lines means that the number of annotated lane pixels is far smaller than the number of background pixels, and fine lane line features are often difficult to extract, so detection performance is low. It is even more challenging when a lane line is almost completely occluded by crowded cars and can only be inferred from common-sense cues. As a result, the low-quality features extracted by a plain CNN tend to weaken the fine lane line features, and performance degrades in complex scenes. In fact, lane lines are highly correlated with one another, and studying how to capture this correlation promises more accurate lane line detection in complex scenes with weak visual cues.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a lane line detection method based on spatial feature interaction. The method defines the lane line detection problem as position selection and classification in the row direction on top of spatial feature interaction. The aim is to keep a fast detection speed while, through the interaction of spatial features, alleviating the poor detection results caused by factors such as vehicle occlusion, worn road markings and lighting conditions, thereby effectively improving the accuracy and robustness of lane line detection.
The above object is achieved by the following technical solution:
a lane line detection method based on spatial feature interaction comprises the following steps:
step one: processing the training data;
step two: constructing a lane line detection network based on spatial feature interaction;
step three: training the network model established in the second step by using the data processed in the first step, performing parameter learning on the model by using an Adam optimization strategy, and storing a final training model;
step four: and testing the final network model in the third step.
The lane line detection method based on the spatial feature interaction is characterized in that the first step comprises the following processes:
first, the original image is resized to 288 × 800. Then, in order to improve generalization capability, a data enhancement method combining rotation, vertical movement and horizontal movement is applied to the scaled image. In addition, since the edge of the enhanced image may be vacant, in order to maintain the lane structure, the lane line is extended to the boundary of the image.
The lane line detection method based on the spatial feature interaction is characterized in that the second step comprises the following processes:
(1) integral structure of lane line detection network
The invention defines the lane line detection problem as position selection and classification in the row direction on top of spatial feature interaction. The whole network structure consists mainly of three parts: a feature extractor, a spatial feature interaction module and a classification-based predictor. In addition, an auxiliary segmentation module is provided; note that the auxiliary segmentation task is used only in the training stage and is removed in the testing stage.
(2) Feature extractor
At this stage, preliminary features are extracted, and ResNet with the fully connected layer removed is used as the feature extractor. The feature extractor consists of 17 convolution layers, each followed by a batch normalization layer and a ReLU activation layer.
(3) Spatial feature interaction module
The feature map extracted by the feature extractor is fed into the spatial feature interaction module, which achieves the interaction of spatial information by shifting sliced feature maps in the vertical and horizontal directions. In each iteration, the sliced feature map is shifted in 4 directions, passing information vertically and horizontally. In total, K iterations are needed so that every position can receive information from the whole feature map. Specifically, consider a three-dimensional feature map tensor X of size C × H × W, where C, H and W denote the number of channels, rows and columns, respectively.
Let $X^{k}_{c,i,j}$ denote the value of the feature map X at the k-th iteration, where c, i and j are the channel, row and column indices, respectively. The forward calculation formulas (1), (2), (3) and (4) of the spatial feature interaction module are then as follows:
$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,(i+s_{k})\bmod H,\,j+n-1}^{k}\tag{1}$$

$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,i+n-1,\,(j+s_{k})\bmod W}^{k}\tag{2}$$

$$X_{c,i,j}^{\prime\,k}=X_{c,i,j}^{k}+f\!\left(Z_{c,i,j}^{k}\right)\tag{3}$$

$$s_{k}=2^{\,k-1},\qquad k=1,2,\ldots,K\tag{4}$$
where K is the total number of iterations and K = log2 L, L being the number of slices traversed in the corresponding direction (H for the vertical transfer in formula (1) and W for the horizontal transfer in formula (2)). f is a nonlinear activation function; the invention uses ReLU. X marked with a prime (X') denotes the updated element. s_k is the shift step in the k-th iteration. Formula (1) and formula (2) are the vertical and horizontal information transfer formulas, respectively. F is a set of one-dimensional convolution kernels, where m, c and n are the indices of the input channel, output channel and convolution kernel width, respectively; here both the number of input channels and the number of output channels equal C. Z in formulas (1) and (2) is the intermediate result of information transfer. Note that the feature map X is divided into H slices in the horizontal direction and W slices in the vertical direction. The shift step s_k is determined dynamically by the iteration index k and controls the information transfer distance.
Information is transferred in four directions; the invention uses "bottom-to-top" and "top-to-bottom" as vertical information interaction, and "left-to-right" and "right-to-left" as horizontal information interaction. By repeatedly shifting the sliced feature map in the vertical and horizontal directions, every position can interact with and perceive all the spatial information in the same feature map.
(4) Classification-based predictor
In pursuit of a faster detection speed, the prediction part of the network selects and classifies the lane line position on each predefined row. h predefined rows are selected according to the training data, and each predefined row is divided into (w + 1) small cells. For classification-based prediction, the rich feature map learned by the spatial feature interaction module is mapped, through two fully connected layers, to a feature map of size m × h × (w + 1), the dimensions required for row-by-row classification, where m is the number of lane lines. A (w + 1)-way classification is then performed on each of the h predefined rows. Once the lane line positions on all predefined rows are found, the whole lane line is predicted.
(5) Auxiliary segmentation module
Because a segmentation network gives finer predictions of lane line edges but has a large computation cost, the invention uses the segmentation task only in the training stage to help the main network train a better model. Thus, even though an extra segmentation task is added, the detection speed is not affected. In the auxiliary segmentation task, the feature map processed by the spatial feature interaction module and two feature maps of different scales extracted by the feature extractor are first unified to the same size and concatenated; the concatenated feature map is passed through a convolution layer and then upsampled to the size of the original image by bilateral upsampling before segmentation prediction. Bilateral upsampling has two parts: one part relies on bilinear interpolation to obtain coarse-grained upsampled features; the other part relies on transposed convolution to compensate for the loss of fine detail in the coarse-grained part. The results of the two parts are fused by an addition operation.
The lane line detection method based on the spatial feature interaction is characterized in that the third step comprises the following processes:
and taking the processed lane line image as the input of the network, and training the model by using an Adam optimization algorithm to minimize a composite loss function. The recombination loss is: l istotal=Lcls+βLseg. Wherein L isclsTo classify the loss, LsegFor the segmentation loss, β is the loss coefficient. The present invention uses focus loss as classification loss and cross entropy as auxiliary segmentation loss. L isclsAnd LsegAs shown in formulas (5) and (6):
$$L_{cls}=-\alpha\,(1-p)^{\gamma}\log(p)\tag{5}$$

where p ∈ [0, 1] is the model's predicted probability for the label y = 1, α ∈ [0, 1] is a balance factor, and (1 − p)^γ is the sample-difficulty weight modulation factor.
$$L_{seg}=-\left[y\log(p)+(1-y)\log(1-p)\right]\tag{6}$$

where p ∈ [0, 1] is the model's predicted probability for the label y = 1.
An early-stopping strategy is adopted to prevent overfitting during model training, and the final trained model is saved after training is finished.
The lane line detection method based on the spatial feature interaction is characterized in that the fourth step comprises the following processes:
the original image is first resized to 288 x 800. And taking the processed lane line image as the input of the network, loading the trained model, and obtaining the detection result of the lane line through forward propagation.
The invention has the following beneficial effects:
compared with the existing method, the lane line detection method based on the spatial feature interaction has the advantages that the robustness is enhanced, and the lane line detection method can be better suitable for the road conditions of complex roads, different light conditions and the like. By continuously moving the slice feature map in the vertical and horizontal directions, all spatial information in the same feature map can be interacted and perceived at each position. Lane line detection is a task that is highly dependent on surrounding cues. If one lane line is occluded or worn but has strong shape priors, it can be inferred from other lanes, car direction, road shape, or other visual cues by capturing the spatial relationship of the pixels between rows and columns. In addition, the present invention provides a bilateral upsampling combining coarse-grained and fine-grained features at the upsampling stage, which can accurately restore the low-resolution feature map to a pixel-level prediction. Finally, the detection method of the invention selects and classifies the lane line position on each predefined line. Because the predefined line number is far smaller than the height of the image, the lane line detection method can achieve higher detection speed.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of a lane line detection network according to the present invention;
FIG. 3 is a schematic diagram of a spatial feature interaction module according to the present invention;
FIG. 4 is a schematic diagram of the spatial feature interaction "from right to left" message delivery in accordance with the present invention;
FIG. 5 is a schematic diagram of a bilateral upsampling structure of the present invention;
fig. 6 is a diagram illustrating the lane line detection effect of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
As shown in fig. 1, the lane line detection method based on spatial feature interaction according to the present invention includes the following steps: firstly, processing training data; secondly, constructing a lane line detection network based on spatial feature interaction; thirdly, training the network model established in the second step by using the data processed in the first step, performing parameter learning on the model by using an Adam optimization strategy, and storing a final training model; and fourthly, testing the final network model in the third step.
Step one: processing the training data;
the present embodiment uses the cuiane data set, collected by cameras mounted on six different vehicles driven by different drivers in beijing. The CULane dataset collected over 55 hours of video and extracted 133,235 frames. Wherein the training set size is 88880 frames, the validation set size is 9675 frames, and the test set size is 34680 frames. The data set contains 9 different scenes including normal, crowded, curved, glare, night, no lane, shadow, intersection and downtown arrow scenes.
First, to balance detection speed, the images of the original dataset are resized to 288 × 800. Then, to improve generalization, a data enhancement method combining rotation, vertical shift and horizontal shift is applied to the resized images. In addition, since the edges of the augmented image may be left empty, the lane lines are extended to the boundary of the image in order to preserve the lane structure.
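For illustration, one possible implementation of this preprocessing step is sketched below in Python with OpenCV; the rotation and shift ranges, and the omission of the matching transform of the lane labels, are assumptions made for brevity rather than values fixed by the invention.

```python
import cv2
import numpy as np

def preprocess(image, angle_range=6.0, max_shift=(100, 50)):
    """Resize to 288 x 800 and apply random rotation plus horizontal/vertical shift.
    The same affine transform would also be applied to the lane annotations, which
    are then extended to the image boundary to preserve the lane structure."""
    img = cv2.resize(image, (800, 288))  # cv2 expects (width, height)

    # Random rotation about the image centre.
    angle = np.random.uniform(-angle_range, angle_range)
    rot = cv2.getRotationMatrix2D((400, 144), angle, 1.0)
    img = cv2.warpAffine(img, rot, (800, 288))

    # Random horizontal and vertical shift.
    tx = np.random.uniform(-max_shift[0], max_shift[0])
    ty = np.random.uniform(-max_shift[1], max_shift[1])
    shift = np.float32([[1, 0, tx], [0, 1, ty]])
    img = cv2.warpAffine(img, shift, (800, 288))
    return img
```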
Step two: constructing a lane line detection network based on spatial feature interaction;
(1) integral structure of lane line detection network
The invention defines the lane line detection problem as position selection and classification in the row direction on top of spatial feature interaction. The whole network structure is shown in FIG. 2 and consists mainly of three parts: a feature extractor, a spatial feature interaction module and a classification-based predictor. In addition, an auxiliary segmentation module is provided; note that the auxiliary segmentation task is used only in the training stage and is removed in the testing stage.
(2) Feature extractor
At this stage, preliminary features are extracted, and ResNet with the fully connected layer removed is used as the feature extractor. The feature extractor consists of 17 convolution layers, each followed by a batch normalization layer and a ReLU activation layer.
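A minimal sketch of such a truncated backbone is given below, assuming the torchvision ResNet-18 implementation; the invention only specifies a ResNet with the fully connected layer removed, so the specific depth and library are assumptions.

```python
import torch.nn as nn
import torchvision

class FeatureExtractor(nn.Module):
    """ResNet-18 backbone with the average-pooling and fully connected layers removed."""
    def __init__(self, pretrained=True):
        super().__init__()
        resnet = torchvision.models.resnet18(pretrained=pretrained)
        # Keep conv1 ... layer4 (17 convolution layers); drop avgpool and fc.
        self.body = nn.Sequential(*list(resnet.children())[:-2])

    def forward(self, x):
        # Input:  N x 3 x 288 x 800
        # Output: N x 512 x 9 x 25 feature map (overall stride 32).
        return self.body(x)
```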
(3) Spatial feature interaction module
The feature map extracted by the feature extractor is fed into the spatial feature interaction module, which achieves the interaction of spatial information by shifting sliced feature maps in the vertical and horizontal directions. In each iteration, the sliced feature map is shifted in 4 directions, passing information vertically and horizontally. In total, K iterations are needed so that every position can receive information from the whole feature map. Specifically, consider a three-dimensional feature map tensor X of size C × H × W, where C, H and W denote the number of channels, rows and columns, respectively.
Let $X^{k}_{c,i,j}$ denote the value of the feature map X at the k-th iteration, where c, i and j are the channel, row and column indices, respectively. The forward calculation formulas (1), (2), (3) and (4) of the spatial feature interaction module are then as follows:
$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,(i+s_{k})\bmod H,\,j+n-1}^{k}\tag{1}$$

$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,i+n-1,\,(j+s_{k})\bmod W}^{k}\tag{2}$$

$$X_{c,i,j}^{\prime\,k}=X_{c,i,j}^{k}+f\!\left(Z_{c,i,j}^{k}\right)\tag{3}$$

$$s_{k}=2^{\,k-1},\qquad k=1,2,\ldots,K\tag{4}$$
where K is the total number of iterations and K = log2 L, L being the number of slices traversed in the corresponding direction (H for the vertical transfer in formula (1) and W for the horizontal transfer in formula (2)). f is a nonlinear activation function; the invention uses ReLU. X marked with a prime (X') denotes the updated element. s_k is the shift step in the k-th iteration. Formula (1) and formula (2) are the vertical and horizontal information transfer formulas, respectively. F is a set of one-dimensional convolution kernels, where m, c and n are the indices of the input channel, output channel and convolution kernel width, respectively; here both the number of input channels and the number of output channels equal C. Z in formulas (1) and (2) is the intermediate result of information transfer. Note that the feature map X is divided into H slices in the horizontal direction and W slices in the vertical direction, as shown in FIG. 3(a) and FIG. 3(b). The shift step s_k is determined dynamically by the iteration index k and controls the information transfer distance.
Information is transferred in four directions; the invention uses "bottom-to-top" (as shown in FIG. 3(a)) and "top-to-bottom" as vertical information interaction, and "left-to-right" and "right-to-left" (as shown in FIG. 3(b)) as horizontal information interaction. By repeatedly shifting the sliced feature map in the vertical and horizontal directions, every position can interact with and perceive all the spatial information in the same feature map. The "right-to-left" information transfer is taken as an illustration here, as shown in FIG. 4. When the iteration index k = 1, s_1 = 1 and each X_i receives the features of X_{i+1}. Because the shift is circular, the column at the end also receives features from the other side, i.e., X_{w-1} receives the features of X_0. When k = 2, s_2 = 2 and each X_i receives the features of X_{i+2}. Taking X_0 as an example, X_0 receives X_2 in the second iteration; considering that X_0 already received information from X_1 in the previous iteration and that X_2 had received information from X_3, X_0 has now received information from X_0, X_1, X_2 and X_3 in only two iterations. Subsequent iterations proceed in the same way. After all K iterations, i.e., when the iteration index k = K, every X_i can perceive the information of the entire feature map. A sketch of this shift-and-aggregate process is given below.
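One possible PyTorch realization of this process, consistent with formulas (1) to (4), follows; the 1-D kernel width, the use of a circular shift via torch.roll, and the shift schedule s_k = 2^(k-1) are illustrative assumptions rather than details fixed by the invention.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialFeatureInteraction(nn.Module):
    """Shifts sliced feature maps in four directions over K = log2(L) iterations so
    that every position aggregates information from the whole feature map."""
    def __init__(self, channels, height, width, kernel=9):
        super().__init__()
        self.iters_v = int(math.ceil(math.log2(height)))  # vertical iterations, L = H
        self.iters_h = int(math.ceil(math.log2(width)))   # horizontal iterations, L = W
        # One-dimensional convolution kernels F, applied along the axis orthogonal to the shift.
        self.conv_v = nn.Conv2d(channels, channels, (1, kernel), padding=(0, kernel // 2), bias=False)
        self.conv_h = nn.Conv2d(channels, channels, (kernel, 1), padding=(kernel // 2, 0), bias=False)

    def forward(self, x):
        # Vertical interaction: "bottom-to-top" (+s_k) and "top-to-bottom" (-s_k).
        for k in range(self.iters_v):
            s_k = 2 ** k                                              # s_1 = 1, s_2 = 2, ...
            for sign in (1, -1):
                z = self.conv_v(torch.roll(x, sign * s_k, dims=2))    # Z of formula (1)
                x = x + F.relu(z)                                     # X' = X + f(Z), formula (3)
        # Horizontal interaction: "left-to-right" and "right-to-left".
        for k in range(self.iters_h):
            s_k = 2 ** k
            for sign in (1, -1):
                z = self.conv_h(torch.roll(x, sign * s_k, dims=3))    # Z of formula (2)
                x = x + F.relu(z)
        return x
```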
(4) Classification-based predictor
In pursuit of a faster detection speed, the prediction part of the network selects and classifies the lane line position on each predefined row. h predefined rows are selected according to the training data, and each predefined row is divided into (w + 1) small cells. For classification-based prediction, the rich feature map learned by the spatial feature interaction module is mapped, through two fully connected layers, to a feature map of size m × h × (w + 1), the dimensions required for row-by-row classification, where m is the number of lane lines. A (w + 1)-way classification is then performed on each of the h predefined rows. Once the lane line positions on all predefined rows are found, the whole lane line is predicted.
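A hedged sketch of such a classification head follows; the hidden width of the fully connected layers and the default values of m, h and (w + 1) (taken from the CULane embodiment, with one extra cell assumed for the "no lane in this row" case) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RowWisePredictor(nn.Module):
    """Two fully connected layers map the interacted features to an m x h x (w+1)
    tensor, on which a (w+1)-way classification is performed for each predefined row."""
    def __init__(self, in_dim, num_lanes=4, num_rows=33, num_cells=201, hidden=2048):
        # num_rows = 33 and 200 location cells follow the CULane embodiment;
        # the extra cell (201st) for "no lane in this row" is an assumption.
        super().__init__()
        self.out_shape = (num_lanes, num_rows, num_cells)   # (m, h, w + 1)
        self.fc = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_lanes * num_rows * num_cells),
        )

    def forward(self, feat):
        x = torch.flatten(feat, start_dim=1)                 # N x in_dim
        x = self.fc(x)
        return x.view(-1, *self.out_shape)                   # N x m x h x (w + 1)
```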
(5) Auxiliary segmentation module
Because a segmentation network gives finer predictions of lane line edges but has a large computation cost, the invention uses the segmentation task only in the training stage to help the main network train a better model. Thus, even though an extra segmentation task is added, the detection speed is not affected. In the auxiliary segmentation task, the feature map processed by the spatial feature interaction module and two feature maps of different scales extracted by the feature extractor are first unified to the same size and concatenated; the concatenated feature map is passed through a convolution layer and then upsampled to the size of the original image by bilateral upsampling before segmentation prediction.
Bilateral upsampling consists of a coarse-grained branch and a fine-grained branch, whose structure is shown in FIG. 5. The coarse-grained branch quickly obtains coarse upsampled features from the previous layer: the number of channels is first reduced to 1/2 of the input feature map by a 1 × 1 convolution, and the feature map is then upsampled directly with bilinear interpolation. The fine-grained branch is used to compensate for the loss of fine detail in the coarse-grained branch, and its path is deeper. The feature map is upsampled by a transposed convolution with stride 2, which also reduces the number of channels to 1/2. Two non-bottleneck blocks are then stacked; each non-bottleneck block consists of 4 convolutions of size 3 × 1 and 1 × 3 with BN and ReLU, which preserves the shape of the feature map and extracts information efficiently through factorization. Finally, the two branches are fused by an addition operation.
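The following sketch shows one way the bilateral upsampling and the non-bottleneck blocks could be realized in PyTorch; the transposed-convolution kernel size and the exact ordering of BN and ReLU inside the non-bottleneck block are assumptions, since the text above only fixes the overall structure.

```python
import torch.nn as nn
import torch.nn.functional as F

class NonBottleneck1D(nn.Module):
    """Four factorized 3x1 / 1x3 convolutions with BN and ReLU; shape preserving."""
    def __init__(self, ch):
        super().__init__()
        layers = []
        for k in ((3, 1), (1, 3), (3, 1), (1, 3)):
            pad = (k[0] // 2, k[1] // 2)
            layers += [nn.Conv2d(ch, ch, k, padding=pad), nn.BatchNorm2d(ch), nn.ReLU(inplace=True)]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

class BilateralUpsample(nn.Module):
    """Coarse branch: 1x1 conv halving the channels + bilinear 2x upsampling.
    Fine branch: stride-2 transposed conv halving the channels + two non-bottleneck
    blocks. The two branches are fused by element-wise addition."""
    def __init__(self, in_ch):
        super().__init__()
        out_ch = in_ch // 2
        self.coarse = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.fine_up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.fine_refine = nn.Sequential(NonBottleneck1D(out_ch), NonBottleneck1D(out_ch))

    def forward(self, x):
        coarse = F.interpolate(self.coarse(x), scale_factor=2, mode='bilinear', align_corners=False)
        fine = self.fine_refine(self.fine_up(x))
        return coarse + fine
```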
Step three: training the network model established in the second step by using the data processed in the first step, performing parameter learning on the model by using an Adam optimization strategy, and storing a final training model;
for the CULane dataset, the present invention uses the rows defined by the dataset. Specifically, the range of rows of the CULane dataset with an image height of 590 is 260 to 580, with a step size of 10. The number of cells on each predefined row is set to 200. In the optimization process, the processed lane line image is used as the input of a network, an Adam optimization algorithm is used for enabling a composite loss function to be minimum to train a model, the momentum is 0.9, the cosine attenuation learning rate is initialized by 4e-4, the batch size is 16, and the training iteration number is 50. The recombination loss is: l istotal=Lcls+βLseg. Wherein L isclsTo classify the loss, LsegFor the division loss, β is a loss coefficient, where β is set to 1. The present invention uses focus loss as classification loss and cross entropy as auxiliary segmentation loss. L isclsAnd LsegAs shown in formulas (5) and (6):
$$L_{cls}=-\alpha\,(1-p)^{\gamma}\log(p)\tag{5}$$

where p ∈ [0, 1] is the model's predicted probability for the label y = 1, α ∈ [0, 1] is a balance factor, and (1 − p)^γ is the sample-difficulty weight modulation factor.
$$L_{seg}=-\left[y\log(p)+(1-y)\log(1-p)\right]\tag{6}$$

where p ∈ [0, 1] is the model's predicted probability for the label y = 1.
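As an illustration, formulas (5) and (6) can be combined into the composite loss as follows; the focusing parameter γ and the balance factor α are not given numerically by the invention, so the defaults below are assumptions.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, alpha=1.0, gamma=2.0):
    """Focal loss of formula (5): cross entropy re-weighted by alpha * (1 - p)^gamma,
    where p is the predicted probability of the true class (class dim must be dim 1)."""
    ce = F.cross_entropy(logits, target, reduction='none')  # equals -log(p)
    p = torch.exp(-ce)                                      # probability of the true class
    return (alpha * (1.0 - p) ** gamma * ce).mean()

def total_loss(cls_logits, cls_target, seg_logits, seg_target, beta=1.0):
    """Composite loss L_total = L_cls + beta * L_seg (formulas (5) and (6)).
    For the row-wise head, cls_logits must be permuted so the (w + 1) class
    dimension is dimension 1 before calling this function."""
    l_cls = focal_loss(cls_logits, cls_target)
    l_seg = F.cross_entropy(seg_logits, seg_target)         # cross entropy of formula (6)
    return l_cls + beta * l_seg
```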
An early-stopping strategy is adopted to prevent overfitting during model training, and the final trained model is saved after training is finished.
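A minimal training-loop sketch combining the optimizer settings above with this early-stopping strategy is given below; the patience value, the validation helper and the data-loader interface are assumptions, and the 50 training iterations are interpreted here as epochs.

```python
import torch

def train(model, train_loader, val_loader, evaluate, epochs=50, patience=5):
    """Adam with beta1 = 0.9, initial learning rate 4e-4 with cosine decay, and
    early stopping on the validation loss; `evaluate` is a hypothetical helper
    that returns the validation loss, and `total_loss` is the sketch above."""
    optimizer = torch.optim.Adam(model.parameters(), lr=4e-4, betas=(0.9, 0.999))
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

    best_val, bad_epochs = float('inf'), 0
    for epoch in range(epochs):
        model.train()
        for images, cls_target, seg_target in train_loader:
            optimizer.zero_grad()
            cls_logits, seg_logits = model(images)
            loss = total_loss(cls_logits, cls_target, seg_logits, seg_target, beta=1.0)
            loss.backward()
            optimizer.step()
        scheduler.step()

        val_loss = evaluate(model, val_loader)
        if val_loss < best_val:                  # keep the best model so far
            best_val, bad_epochs = val_loss, 0
            torch.save(model.state_dict(), 'best_model.pth')
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # early stopping
                break
```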
Step four: and testing the final network model in the third step.
The original image is first resized to 288 × 800. The processed lane line image is taken as the input of the network, the trained model is loaded, and the lane line detection result is obtained by forward propagation. As can be seen from FIG. 6, the method of the invention can accurately detect lane lines even in crowded night scenes. For the CULane dataset, each lane line is regarded as a line 30 pixels wide. If the intersection over union (IoU) between a predicted lane line and the ground truth is greater than the threshold (0.5), the prediction is counted as a true positive (TP). The F1 score is used as the evaluation metric, as shown in formula (7):
$$F1=\frac{2\times Precision\times Recall}{Precision+Recall}\tag{7}$$

where

$$Precision=\frac{TP}{TP+FP},\qquad Recall=\frac{TP}{TP+FN}$$
FP and FN denote false positives and false negatives, respectively. The test results on the CULane dataset are shown in Table 1.
Table 1 results of the experiment:
the above-described embodiments are merely illustrative of the present invention and are not limited to the scope thereof, and those skilled in the art can make modifications to the parts thereof without departing from the spirit and scope of the present invention.

Claims (9)

1. A lane line detection method based on spatial feature interaction is characterized by comprising the following steps:
step one: processing the training data;
step two: constructing a lane line detection network based on spatial feature interaction;
step three: training the network model established in the second step by using the data processed in the first step, performing parameter learning on the model by using an Adam optimization strategy, and storing a final training model;
step four: and testing the final network model in the third step.
2. The method as claimed in claim 1, wherein the processing of the training data in the first step comprises the following steps:
first, the original image is resized to 288 × 800; then, to improve generalization, a data enhancement method combining rotation, vertical shift and horizontal shift is applied to the resized image; in addition, since the edges of the augmented image may be left empty, the lane lines are extended to the boundary of the image in order to preserve the lane structure.
3. The lane line detection method based on spatial feature interaction according to claim 1, wherein the lane line detection network in the second step consists mainly of three parts: a feature extractor, a spatial feature interaction module and a classification-based predictor; in addition, an auxiliary segmentation module is provided, the auxiliary segmentation task being used only in the training stage and removed in the testing stage.
4. The method according to claim 3, wherein the feature extractor is obtained by removing the fully connected layer from ResNet; the feature extractor consists of 17 convolution layers, each followed by a batch normalization layer and a ReLU activation layer.
5. The method of claim 3, wherein the spatial feature interaction module achieves the interaction of the spatial information by moving the sliced feature map in vertical and horizontal directions;
the information transmission has four directions, the invention uses 'from bottom to top' and 'from top to bottom' as vertical information interaction, and 'from left to right' and 'from right to left' as horizontal information interaction; continuously moving the slice characteristic diagram in the vertical and horizontal directions to carry out K times of iteration so that all spatial information in the same characteristic diagram can be interacted and sensed at each position; specifically, a three-dimensional feature map tensor X is provided, of size C H W, where C, H and W represent the number of channels, rows and columns respectively,
let $X^{k}_{c,i,j}$ denote the value of the feature map X at the k-th iteration, where c, i and j are the channel, row and column indices, respectively; the forward calculation formulas (1), (2), (3) and (4) of the spatial feature interaction module are then as follows:
$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,(i+s_{k})\bmod H,\,j+n-1}^{k}\tag{1}$$

$$Z_{c,i,j}^{k}=\sum_{m}\sum_{n}F_{m,c,n}\cdot X_{m,\,i+n-1,\,(j+s_{k})\bmod W}^{k}\tag{2}$$

$$X_{c,i,j}^{\prime\,k}=X_{c,i,j}^{k}+f\!\left(Z_{c,i,j}^{k}\right)\tag{3}$$

$$s_{k}=2^{\,k-1},\qquad k=1,2,\ldots,K\tag{4}$$
where K is the number of iterations and K = log2 L, L being the number of slices traversed in the corresponding direction (H for the vertical transfer in formula (1) and W for the horizontal transfer in formula (2)); f is a nonlinear activation function; X marked with a prime (X') denotes the updated element; s_k is the shift step in the k-th iteration; formula (1) and formula (2) are the vertical and horizontal information transfer formulas, respectively; F is a set of one-dimensional convolution kernels, where m, c and n denote the indices of the input channel, output channel and convolution kernel width, respectively, and both the number of input channels and the number of output channels equal C; Z in formulas (1) and (2) is the intermediate result of information transfer; the feature map X is divided into H slices in the horizontal direction and W slices in the vertical direction; the shift step s_k is determined dynamically by the iteration index k and controls the information transfer distance.
6. The method of claim 3, wherein the classification-based predictor selects and classifies the lane line position on each predefined row;
firstly, h predefined rows are selected according to the training data, each predefined row being divided into (w + 1) small cells; for classification-based prediction, the rich feature map learned by the spatial feature interaction module is mapped, through two fully connected layers, to a feature map of size m × h × (w + 1), the dimensions required for row-by-row classification, where m denotes the number of lane lines; a (w + 1)-way classification is then performed on each of the h predefined rows, the lane line positions on all predefined rows are found, and the whole lane line is thereby predicted.
7. The method according to claim 3, characterized in that the auxiliary segmentation module is used only during the training stage, comprising the following process:
in the auxiliary segmentation task, the feature map processed by the spatial feature interaction module and two feature maps of different scales extracted by the feature extractor are first unified to the same size and concatenated; the concatenated feature map is passed through a convolution layer and then upsampled to the size of the original image by bilateral upsampling before segmentation prediction; bilateral upsampling has two parts: one part relies on bilinear interpolation to obtain coarse-grained upsampled features, and the other part relies on transposed convolution to compensate for the loss of fine detail in the coarse-grained part.
8. The method of claim 7, wherein the bilateral upsampling consists of a coarse-grained branch and a fine-grained branch;
the coarse-grained branch quickly obtains coarse upsampled features from the previous layer: the number of channels is first reduced to 1/2 of the input feature map by a 1 × 1 convolution, and bilinear interpolation is then used directly to upsample the input feature map; the fine-grained branch is used to compensate for the loss of fine detail in the coarse-grained branch and has a deeper path: the feature map is upsampled by a transposed convolution with stride 2, which also reduces the number of channels to 1/2, after which two non-bottleneck blocks are stacked, each consisting of 4 convolutions of size 3 × 1 and 1 × 3 with BN and ReLU, which preserves the shape of the feature map and extracts information efficiently through factorization; finally, the two branches are fused by an addition operation.
9. The lane line detection method based on spatial feature interaction according to claim 1, wherein in step three the network model is trained with an Adam optimization strategy, with momentum 0.9, a cosine-decay learning rate initialized to 4e-4, a batch size of 16 and 50 training iterations; model training adopts an early-stopping strategy to prevent overfitting.
CN202210113686.XA 2022-01-30 2022-01-30 Lane line detection method based on spatial feature interaction Pending CN114463721A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210113686.XA CN114463721A (en) 2022-01-30 2022-01-30 Lane line detection method based on spatial feature interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210113686.XA CN114463721A (en) 2022-01-30 2022-01-30 Lane line detection method based on spatial feature interaction

Publications (1)

Publication Number Publication Date
CN114463721A true CN114463721A (en) 2022-05-10

Family

ID=81412498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210113686.XA Pending CN114463721A (en) 2022-01-30 2022-01-30 Lane line detection method based on spatial feature interaction

Country Status (1)

Country Link
CN (1) CN114463721A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294548A (en) * 2022-07-28 2022-11-04 烟台大学 Lane line detection method based on position selection and classification method in row direction
CN115376091A (en) * 2022-10-21 2022-11-22 松立控股集团股份有限公司 Lane line detection method assisted by image segmentation

Similar Documents

Publication Publication Date Title
CN109740465B (en) Lane line detection algorithm based on example segmentation neural network framework
CN108509978B (en) Multi-class target detection method and model based on CNN (CNN) multi-level feature fusion
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
CN111563909B (en) Semantic segmentation method for complex street view image
CN111222396B (en) All-weather multispectral pedestrian detection method
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN110929578A (en) Anti-blocking pedestrian detection method based on attention mechanism
CN113627228B (en) Lane line detection method based on key point regression and multi-scale feature fusion
CN114463721A (en) Lane line detection method based on spatial feature interaction
CN111368830B (en) License plate detection and recognition method based on multi-video frame information and kernel correlation filtering algorithm
CN111652081B (en) Video semantic segmentation method based on optical flow feature fusion
CN112990065B (en) Vehicle classification detection method based on optimized YOLOv5 model
CN112613343B (en) River waste monitoring method based on improved YOLOv4
CN110910413A (en) ISAR image segmentation method based on U-Net
CN113762209A (en) Multi-scale parallel feature fusion road sign detection method based on YOLO
CN114120069B (en) Lane line detection system, method and storage medium based on direction self-attention
CN112651423A (en) Intelligent vision system
CN114120272A (en) Multi-supervision intelligent lane line semantic segmentation method fusing edge detection
CN112766056A (en) Method and device for detecting lane line in low-light environment based on deep neural network
CN115035295A (en) Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function
CN115527096A (en) Small target detection method based on improved YOLOv5
CN116486080A (en) Lightweight image semantic segmentation method based on deep learning
CN116503709A (en) Vehicle detection method based on improved YOLOv5 in haze weather
CN114463205A (en) Vehicle target segmentation method based on double-branch Unet noise suppression
CN111881914B (en) License plate character segmentation method and system based on self-learning threshold

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination