CN112883807A - Lane line detection method and system
- Publication number: CN112883807A
- Application number: CN202110091223.3A
- Authority: CN (China)
- Legal status: Pending
Classifications
- G06V 20/588: Recognition of the road, e.g. of lane markings; recognition of the vehicle driving pattern in relation to the road
- G06N 3/04: Architecture, e.g. interconnection topology
- G06N 3/08: Learning methods
- G06N 3/084: Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a lane line detection method and system. The method specifically comprises the following steps: modeling the lane detection task, building a convolutional neural network model, and extracting features from the convolutional feature map; in the training stage, collecting lane line training sample images, increasing the diversity of the training samples through preprocessing, and obtaining converged network model parameters through iterative training; in the inference stage, post-processing the inference result to obtain the lane lines predicted by the model. An attention mechanism is introduced into the network structure, so that the fusion of local and global information makes the semantic features of the extracted feature map richer, and the network model trained with the attention structure further improves the positioning accuracy of the far end of the lane line.
Description
Technical Field
The invention relates to lane line detection technology, and in particular to a lane line detection method and a lane line detection system.
Background
As an important component of road surface markings, lane lines effectively guide intelligent vehicles to drive within structured road areas. Detecting lane lines on the road surface in real time is therefore an important link in intelligent driver-assistance systems: it supports functions such as path planning and lane-departure warning, and provides a reference for localization and navigation.
Currently, the most advanced lane line detection methods in industry are CNN-based. For example, the SCNN and SAD networks treat lane line detection as a semantic segmentation task and rely on heavy encoder-decoder structures; such methods usually take a small image as input, which makes it difficult to accurately predict the far end of a curved lane line. In addition, these methods are generally limited to detecting a predefined number of lane lines, whereas on real roads the scenes in images vary and the number of lane lines is not fixed. To address this, PointLaneNet follows a candidate-region-based strategy and generates multiple candidate lines in the image, thereby breaking the limitations of inefficient encoders and a predefined number of lanes.
In the prior art, ResNet122 is used as the backbone network to extract semantic features: the input image is processed by the backbone to obtain a corresponding feature map, each grid cell on the feature map is treated as an anchor, and each anchor predicts the offsets of the lane line passing through that cell; lane lines with higher confidence are then retained by an NMS algorithm, which filters out redundant candidate lane lines (a simplified sketch of such filtering is given below). The output of whole lane lines can thus be obtained directly through end-to-end training, removing the restriction of outputting a fixed number of lane lines. However, because anchors on the feature map predict the whole lane line passing through the corresponding grid cell, performance degrades sharply at the far end of curves; moreover, due to the limited receptive field of the convolution kernel, the feature map obtained by this scheme only captures local information and cannot capture long-range and short-range context simultaneously.
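As an illustration of this confidence-plus-NMS filtering step, the following is a minimal Python sketch; the lane representation (points sampled at fixed vertical positions), the mean-horizontal-distance criterion, and the `dist_thresh` value are assumptions for illustration, not the exact post-processing of the cited scheme.

```python
import numpy as np

def lane_nms(lanes, scores, dist_thresh=20.0):
    """Greedy NMS over candidate lane lines (illustrative sketch).

    `lanes` is a list of (N, 2) arrays of (x, y) points sampled at the same
    y positions; `scores` are confidence values; `dist_thresh` is a
    hypothetical mean horizontal distance threshold in pixels."""
    order = np.argsort(scores)[::-1]   # highest-confidence lanes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Mean horizontal distance between the kept lane and the rest.
        dists = np.array([np.mean(np.abs(lanes[i][:, 0] - lanes[j][:, 0]))
                          for j in order[1:]])
        order = order[1:][dists > dist_thresh]  # drop near-duplicate lanes
    return keep
```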
Disclosure of Invention
The embodiment of the invention provides a lane line detection method that enriches the semantic features of the extracted feature map by fusing local and global information, and improves the positioning accuracy of the far end of the lane line at low computational cost.
In a first aspect, a lane line detection method is provided, where the method includes:
constructing a convolutional neural network model;
constructing the training and inference stages on the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
In some realizations of the first aspect, the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map.
In some implementations of the first aspect, an attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W); a code sketch of this structure follows.
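To make the shapes above concrete, here is a minimal PyTorch sketch of the described attention structure; the module name, the C/8 reduction ratio, and the mapping of Q, R and P onto query, key and value branches follow the text above, while everything else (framework, initialization) is an assumption rather than the patent's exact implementation.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Sketch of the three-branch attention block described above (assumes
    PyTorch; q/r/p name the Q, R and P branches as in the text)."""
    def __init__(self, c):
        super().__init__()
        self.q = nn.Conv2d(c, c // 8, 1)    # 1x1 conv, second branch Q
        self.r = nn.Conv2d(c, c // 8, 1)    # 1x1 conv, branch R
        self.p = nn.Conv2d(c, c // 8, 1)    # 1x1 conv, branch P
        self.out = nn.Conv2d(c // 8, c, 1)  # 1x1 conv restoring C channels
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.q(x).view(n, c // 8, h * w)      # N x C/8 x (H*W)
        r = self.r(x).view(n, c // 8, h * w)      # N x C/8 x (H*W)
        p = self.p(x).view(n, c // 8, h * w)      # N x C/8 x (H*W)
        m = torch.bmm(q.permute(0, 2, 1), r)      # M = Q'R, N x (H*W) x (H*W)
        m = self.softmax(m)                       # normalized attention M'
        o = torch.bmm(p, m)                       # O = PM', N x C/8 x (H*W)
        return self.out(o.view(n, c // 8, h, w))  # back to N x C x H x W
```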
In some implementations of the first aspect, the training stage operates as follows:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training flow comprises the following steps (a code sketch follows the list):
(1) initializing the iteration count;
(2) generating a mini-batch of lane line training data according to the iteration count;
(3) obtaining predictions by forward-propagating the training data, and computing the loss function;
(4) updating the weights through back propagation;
(5) judging whether the updated weights reach the training target; if so, the training is finished; otherwise, proceeding to the next step;
(6) judging whether the maximum number of iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending the training.
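The numbered flow above corresponds to a conventional mini-batch loop; a minimal sketch under assumed interfaces (a PyTorch DataLoader yielding (image, target) pairs, and a hypothetical `target_loss` standing in for the training target of step (5)) is:

```python
import torch

def train(model, loader, loss_fn, optimizer, max_iters, target_loss=None):
    """Sketch of the iterative training flow; `target_loss` is a
    hypothetical stand-in for the training target checked in step (5)."""
    it = 0                                      # step (1): initialize count
    while it < max_iters:
        for images, targets in loader:          # step (2): mini-batch data
            preds = model(images)               # step (3): forward pass
            loss = loss_fn(preds, targets)      # step (3): loss function
            optimizer.zero_grad()
            loss.backward()                     # step (4): back propagation
            optimizer.step()
            it += 1
            if target_loss is not None and loss.item() < target_loss:
                return model                    # step (5): target reached
            if it >= max_iters:                 # step (6): iteration budget
                break
    return model
```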
According to one aspect of the invention, the inference stage specifically comprises the following steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
and post-processing the inference results to obtain the predicted lane lines (a code sketch of this stage follows).
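A minimal sketch of these three inference steps, assuming a PyTorch model, a list of CHW image tensors, and a `postprocess` callable standing in for the post-processing described above:

```python
import random
import torch

def infer(model, test_images, postprocess):
    """Sketch of the inference stage; `postprocess` is a stand-in for the
    post-processing that turns raw network output into lane lines."""
    random.shuffle(test_images)               # randomly shuffle the test set
    model.eval()
    results = []
    with torch.no_grad():
        for img in test_images:
            raw = model(img.unsqueeze(0))     # convolutional network inference
            results.append(postprocess(raw))  # post-process into lane lines
    return results
```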
In some implementations of the first aspect, the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
In some realizations of the first aspect, the scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization (a preprocessing sketch follows).
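A sketch of this scale-then-normalize preprocessing, assuming OpenCV and NumPy, an (N, 2) array of annotated lane points for L, and images scaled to [0, 1] before normalization (the RGB mean/std defaults are the values quoted later in the text):

```python
import cv2
import numpy as np

def preprocess(image, points, s_x, s_y,
               mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale image I and annotation L by (s_x, s_y), then normalize I."""
    h, w = image.shape[:2]
    resized = cv2.resize(image, (int(w * s_x), int(h * s_y)))  # scale image I
    scaled_points = points * np.array([s_x, s_y])              # scale labels L
    img = resized.astype(np.float32) / 255.0      # assumed [0, 1] pixel range
    img = (img - np.array(mean)) / np.array(std)  # normalize per channel
    return img, scaled_points
```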
In some implementations of the first aspect, the data enhancement strategies of the lane line detection task used in the training stage include horizontal flipping, rotation, translation and scaling (the four coordinate transforms are sketched in code below). For horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image.
For rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image.
For translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated.
For scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
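The four coordinate transforms above, sketched with NumPy under the stated conventions (rotation about the image center, scaled image placed at (x_top, y_top) inside the padded image); all function and argument names are illustrative:

```python
import math
import numpy as np

def flip_points(pts, w):
    """Horizontal flip: (x, y) -> (w - x, y)."""
    return np.stack([w - pts[:, 0], pts[:, 1]], axis=1)

def rotate_points(pts, theta, w, h):
    """Rotation by theta about the image center (assumed convention)."""
    cx, cy = w / 2.0, h / 2.0
    c, s = math.cos(theta), math.sin(theta)
    x, y = pts[:, 0] - cx, pts[:, 1] - cy
    return np.stack([x * c - y * s + cx, x * s + y * c + cy], axis=1)

def translate_points(pts, start, end):
    """Translation by the vector from `start` to `end`."""
    return pts + (np.asarray(end) - np.asarray(start))

def scale_points(pts, scale, top):
    """Scaling by `scale`, then placement at `top` in the padded image."""
    return pts * scale + np.asarray(top)
```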
In a second aspect, there is provided a lane line detection system, the system comprising:
a first module for constructing a convolutional neural network model;
a second module for constructing the training and inference stages on the network model;
a third module for adjusting the preprocessing of the training stage;
and a fourth module for predicting the lane lines through the inference stage.
In some realizations of the second aspect, the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map;
further, an attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W).
In some implementations of the second aspect, the second module is further configured to perform the training stage; the specific operation steps are as follows:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training flow comprises the following steps:
(1) initializing the iteration count;
(2) generating a mini-batch of lane line training data according to the iteration count;
(3) obtaining predictions by forward-propagating the training data, and computing the loss function;
(4) updating the weights through back propagation;
(5) judging whether the updated weights reach the training target; if so, the training is finished; otherwise, proceeding to the next step;
(6) judging whether the maximum number of iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending the training.
In some implementations of the second aspect, the third module further performs the preprocessing as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
normalizing the scaled image I and annotation data L;
the scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
In some implementations of the second aspect, the data enhancement strategies applied by the fourth module for the lane line detection task in the training stage include horizontal flipping, rotation, translation and scaling. For horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image.
For rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image.
For translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated.
For scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
In a third aspect, there is provided a lane line detection apparatus, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the lane line detection method of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, having computer program instructions stored thereon; when executed by a processor, the computer program instructions implement the lane line detection method of the first aspect.
Advantageous effects: the invention provides a lane line detection method and system aimed at the problem of inaccurate positioning of the far end of the lane line. By introducing an attention mechanism into the network structure, the fusion of local and global information makes the semantic features of the extracted feature map richer while keeping the computational cost low; the attention mechanism enriches the global semantic information of the feature map, and the network model trained with the attention structure improves the positioning accuracy of the far end of the lane line.
Drawings
Fig. 1 is a schematic diagram of the network architecture of the present invention.
Fig. 2 is a schematic view of the attention structure of the present invention.
Fig. 3 is a schematic diagram of the training process of the present invention.
FIG. 4 is a schematic flow chart of the training phase and the reasoning phase of the present invention.
Fig. 5 is a schematic diagram of horizontal flipping in the present invention.
Fig. 6 is a schematic diagram of rotation in the present invention.
Fig. 7 is a schematic diagram of translation in the present invention.
Fig. 8 is a schematic diagram of scaling in the present invention.
Fig. 9 is a flow chart of the preprocessing of the present invention.
Detailed Description
This embodiment provides a lane line detection method and system that enrich the semantic features of the extracted feature map by fusing local and global information, and improve the positioning accuracy of the far end of the lane line at low computational cost.
Currently, the most advanced lane line detection methods in industry are CNN-based. For example, the SCNN and SAD networks treat lane line detection as a semantic segmentation task and rely on heavy encoder-decoder structures; such methods usually take a small image as input, which makes it difficult to accurately predict the far end of a curved lane line. In addition, these methods are generally limited to detecting a predefined number of lane lines, whereas on real roads the scenes in images vary and the number of lane lines is not fixed. To address this, PointLaneNet follows a candidate-region-based strategy and generates multiple candidate lines in the image, thereby breaking the limitations of inefficient encoders and a predefined number of lanes.
In the prior art, ResNet122 is used as the backbone network to extract semantic features: the input image is processed by the backbone to obtain a corresponding feature map, each grid cell on the feature map is treated as an anchor, and each anchor predicts the offsets of the lane line passing through that cell; lane lines with higher confidence are then retained by an NMS algorithm, which filters out redundant candidate lane lines; the output of whole lane lines can thus be obtained directly through end-to-end training, removing the restriction of outputting a fixed number of lane lines.
In summary, in the present application, the applicant believes that there are at least the following disadvantages in the prior art:
and predicting the whole lane line passing through the corresponding grid by using anchors on the obtained characteristic diagram, so that the performance is greatly reduced when the far end of the curve is predicted.
In order to solve the disadvantages in the prior art, embodiments of the present invention provide a lane line detection method and system, and the following describes a technical solution of the embodiments of the present invention with reference to the accompanying drawings.
Embodiment 1:
According to an embodiment, there is provided a lane line detection method, including:
constructing a convolutional neural network model;
constructing the training and inference stages on the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
Embodiment 2:
On the basis of Embodiment 1, the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map.
Embodiment 3:
On the basis of Embodiment 2, an attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W).
Embodiment 4:
On the basis of Embodiment 1, the training stage operates as follows:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model.
Embodiment 5:
On the basis of Embodiment 4, the training flow comprises the following steps:
(1) initializing the iteration count;
(2) generating a mini-batch of lane line training data according to the iteration count;
(3) obtaining predictions by forward-propagating the training data, and computing the loss function;
(4) updating the weights through back propagation;
(5) judging whether the updated weights reach the training target; if so, the training is finished; otherwise, proceeding to the next step;
(6) judging whether the maximum number of iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending the training.
Embodiment 6:
On the basis of Embodiment 1, the inference stage specifically comprises the following operation steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
and post-processing the inference results to obtain the predicted lane lines.
Embodiment 7:
On the basis of Embodiment 6, the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
Embodiment 8:
On the basis of Embodiment 7, the scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
Embodiment 9:
On the basis of Embodiment 7, the data enhancement strategies of the lane line detection task used in the training stage comprise horizontal flipping, rotation, translation and scaling. For horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image.
For rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image.
For translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated.
For scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
Embodiment 10:
According to an embodiment, there is provided a lane line detection system, the system comprising:
a first module for constructing a convolutional neural network model; the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map.
An attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W).
Embodiment 11:
On the basis of Embodiment 10, the second module is used for constructing the training and inference stages on the network model; the second module further performs the training stage, with the following specific operation steps:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model.
Embodiment 12:
On the basis of Embodiment 10, the third module adjusts the preprocessing of the training stage; the third module further performs the preprocessing as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
The scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization; here the means of the RGB three channels are [0.485, 0.456, 0.406] in sequence, and the corresponding standard deviations are [0.229, 0.224, 0.225].
Embodiment 13:
On the basis of Embodiment 12, the fourth module predicts the lane lines through the inference stage; the fourth module further applies data enhancement strategies for the lane line detection task in the training stage, including horizontal flipping, rotation, translation and scaling. For horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image.
For rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image.
For translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated.
For scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
Embodiment 14:
On the basis of Embodiment 12, there is provided a lane line detection apparatus, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the lane line detection method of Embodiment 1.
Embodiment 15:
On the basis of Embodiment 12, there is provided a computer-readable storage medium having computer program instructions stored thereon; when executed by a processor, the computer program instructions implement the lane line detection method of Embodiment 1.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the above-mentioned order of the steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
As will be apparent to those skilled in the art, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.
Claims (10)
1. A lane line detection method is characterized by comprising the following steps:
constructing a convolutional neural network model;
constructing the training and inference stages on the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
2. The method according to claim 1, wherein the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map;
an attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W).
3. The lane line detection method according to claim 1, wherein the training stage operates as follows:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training flow comprises the following steps:
(1) initializing the iteration count;
(2) generating a mini-batch of lane line training data according to the iteration count;
(3) obtaining predictions by forward-propagating the training data, and computing the loss function;
(4) updating the weights through back propagation;
(5) judging whether the updated weights reach the training target; if so, the training is finished; otherwise, proceeding to the next step;
(6) judging whether the maximum number of iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending the training.
4. The lane line detection method according to claim 1, wherein the inference stage specifically comprises the following steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
post-processing the inference results to obtain the predicted lane lines;
the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
5. The lane line detection method according to claim 4, wherein the scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
6. The lane line detection method according to claim 1, wherein the data enhancement strategies of the lane line detection task in the training stage comprise horizontal flipping, rotation, translation and scaling; for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
for scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
7. A lane line detection system, characterized by comprising the following modules:
a first module for constructing a convolutional neural network model;
a second module for constructing the training and inference stages on the network model;
a third module for adjusting the preprocessing of the training stage;
a fourth module for predicting the lane lines through the inference stage;
the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolutional layer;
the 1x1 convolutional layers each compress the number of channels of the feature map;
after one compressed branch is transposed, matrix multiplication is performed between the first and second branches to obtain an attention map;
the attention map is passed through softmax to obtain a normalized attention map;
matrix multiplication is performed between the third compressed branch and the normalized attention map to obtain a weighted attention feature map;
and the number of channels of the feature map is raised back with a 1x1 convolution, outputting a self-attention feature map;
further, an attention structure is derived from the neural network model; the attention structure is specified as follows:
assume the feature map obtained from the backbone network has size N×C×H×W;
the three branches then each apply a 1x1 convolution to compress the number of channels of the feature map,
and the compressed feature maps P, Q and R all have size N×C/8×(H×W);
next, the feature map Q of the second branch is transposed to obtain Q', whose size is N×(H×W)×C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N×(H×W)×(H×W);
M' represents the extracted global semantic information; P and M' are matrix-multiplied to obtain the feature map O, whose size is N×C/8×(H×W);
finally, a 1x1 convolution raises the number of channels, and the output feature map has size N×C×(H×W);
the second module further performs the training stage, with the following specific operation steps:
collecting lane line training sample images, where the samples are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet architecture with the attention structure;
iterative training: forward propagation, computing the loss function between the model prediction and the annotated ground truth, and computing gradients through back propagation;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training flow comprises the following steps:
(1) initializing the iteration count;
(2) generating a mini-batch of lane line training data according to the iteration count;
(3) obtaining predictions by forward-propagating the training data, and computing the loss function;
(4) updating the weights through back propagation;
(5) judging whether the updated weights reach the training target; if so, the training is finished; otherwise, proceeding to the next step;
(6) judging whether the maximum number of iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending the training.
8. The lane line detection system of claim 7, wherein the third module further performs the preprocessing as follows:
reading the image I and the annotation data L;
scaling the image I and the annotation data L;
normalizing the scaled image I and annotation data L;
the scaling of the image I and the annotation data L is expressed as follows:
dst_x = s_x · src_x, dst_y = s_y · src_y
where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:
I'_(x,y) = (I_(x,y) - mean) / std
where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization;
the fourth module further applies data enhancement strategies, including horizontal flipping, rotation, translation and scaling, for the lane line detection task in the training stage; for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation formula is:
x' = w - x, y' = y
where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; rotating about the image center, the calculation formula is:
x' = (x - w/2)·cosθ - (y - h/2)·sinθ + w/2, y' = (x - w/2)·sinθ + (y - h/2)·cosθ + h/2
where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation formula is:
x' = x + (x_end - x_start), y' = y + (y_end - y_start)
where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
for scaling, let the coordinates of a point on the lane line before image scaling be (x, y) and the scaled coordinates be (x', y'); the calculation formula is:
x' = scale·x + x_top, y' = scale·y + y_top
where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
9. A lane line detection apparatus, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement a lane line detection method according to any one of claims 1 to 6.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the lane line detection method according to any one of claims 1 to 6.
Priority Applications (1)
- CN202110091223.3A, priority/filing date 2021-01-22: Lane line detection method and system (CN112883807A)
Publications (1)
- CN112883807A, published 2021-06-01
Family ID: 76050531
Family Applications (1)
- CN202110091223.3A, filed 2021-01-22, status Pending
Patent Citations (3)
- CN110222591A, priority 2019-05-16, published 2019-09-10: A kind of method for detecting lane lines based on deep neural network
- CN110298387A, priority 2019-06-10, published 2019-10-01: Incorporate the deep neural network object detection method of Pixel-level attention mechanism
- CN111242037A, priority 2020-01-15, published 2020-06-05: Lane line detection method based on structural information
Non-Patent Citations (1)
- Wang Zhewei, "Research on lane line detection algorithm based on deep learning", China Master's Theses Electronic Journals Database, no. 01, pages 035-472.
Cited By (4)
- CN113269171A / CN113269171B, Motovis Intelligent Technology (Shanghai) Co., Ltd., priority 2021-07-20: Lane line detection method, electronic device and vehicle
- CN114782915A / CN114782915B, Harbin Institute of Technology, priority 2022-04-11: Intelligent automobile end-to-end lane line detection system and equipment based on auxiliary supervision and knowledge distillation
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination