
Lane line detection method and system

Info

Publication number
CN112883807A
CN112883807A (application CN202110091223.3A)
Authority
CN
China
Prior art keywords
training
image
lane line
coordinates
follows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110091223.3A
Other languages
Chinese (zh)
Inventor
李丰军
周剑光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Automotive Innovation Corp
Original Assignee
China Automotive Innovation Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Automotive Innovation Corp filed Critical China Automotive Innovation Corp
Priority to CN202110091223.3A priority Critical patent/CN112883807A/en
Publication of CN112883807A publication Critical patent/CN112883807A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lane line detection method and system. The method specifically comprises the following steps: modeling the lane detection task, building a convolutional neural network model, and extracting features from the convolutional feature map; in the training stage, collecting lane line training sample images, increasing the diversity of the training samples through preprocessing, and obtaining converged network model parameters through iterative training; in the inference stage, post-processing the inference result to obtain the lane lines predicted by the model. According to the invention, an attention mechanism is introduced into the network structure: by fusing local and global information, the semantic features of the extracted feature map become richer, and the network model trained with the attention structure further improves the positioning accuracy of the far end of the lane line.

Description

Lane line detection method and system
Technical Field
The invention relates to lane line detection technology, and in particular to a lane line detection method and a lane line detection system.
Background
Lane lines are an important component of road surface markings: they effectively guide intelligent vehicles to drive within constrained road areas. Detecting lane lines on the road surface in real time is therefore an important link in intelligent driver-assistance systems; it supports functions such as path planning and lane-departure warning, and can provide a reference for positioning and navigation.
Currently, the most advanced lane line detection methods in industry are CNN-based. For example, the SCNN and SAD networks treat lane line detection as a semantic segmentation task and rely on a heavy encoding-decoding structure; however, such methods usually take a small image as input, which makes it difficult to accurately predict the far-end portion of a curved lane line. In addition, these methods are generally limited to detecting a predefined number of lane lines, whereas in real road scenes the number of lane lines is not fixed. To address this, PointLaneNet follows a candidate-region strategy and generates multiple candidate lines in the image, thereby removing both the inefficient encoder and the limitation to a predefined number of lanes.
In the prior art, ResNet122 is used as the backbone network to extract semantic features: the input image is processed by the backbone to obtain a corresponding feature map, each grid cell on the feature map is treated as an anchor, and each anchor predicts the offsets of the lane line passing through its cell; lane lines with higher confidence are then retained by an NMS algorithm, which filters out redundant candidates. The output for a whole lane line can thus be obtained directly through end-to-end training, removing the restriction to a fixed number of output lane lines. However, because the anchors on the feature map predict an entire lane line passing through the corresponding grid cell, performance drops sharply at the far end of a curve; and due to the limited scale of the convolution kernel, the resulting feature map captures only local information and cannot capture long-range and short-range context simultaneously.
Disclosure of Invention
The embodiments of the invention provide a lane line detection method that enriches the semantic features of the extracted feature map by fusing local and global information, and improves the positioning accuracy of the far end of the lane line at low computational cost.
In a first aspect, a lane line detection method is provided, where the method includes:
constructing a convolutional neural network model;
constructing training and inference stages for the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
In some implementations of the first aspect, the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map.
In some implementations of the first aspect, an attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W.
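To make the tensor shapes above concrete, the following is a minimal PyTorch sketch of such a self-attention block. It is an illustration rather than the patented implementation: the branch names P, Q, R, the C/8 channel reduction, and the final 1x1 expansion follow the description above, while details such as the softmax axis and layer initialization are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SelfAttentionBlock(nn.Module):
        """Sketch of the attention structure described above (C must be divisible by 8)."""
        def __init__(self, channels: int):
            super().__init__()
            reduced = channels // 8
            # Three 1x1 convolutions compress the channel number (branches P, Q, R).
            self.p = nn.Conv2d(channels, reduced, kernel_size=1)
            self.q = nn.Conv2d(channels, reduced, kernel_size=1)
            self.r = nn.Conv2d(channels, reduced, kernel_size=1)
            # A final 1x1 convolution raises the channel number back to C.
            self.out = nn.Conv2d(reduced, channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            n, c, h, w = x.shape
            p = self.p(x).view(n, -1, h * w)        # N x C/8 x (H*W)
            q = self.q(x).view(n, -1, h * w)        # N x C/8 x (H*W)
            r = self.r(x).view(n, -1, h * w)        # N x C/8 x (H*W)
            m = torch.bmm(q.transpose(1, 2), r)     # Q' x R -> M: N x (H*W) x (H*W)
            m = F.softmax(m, dim=-1)                # normalized attention map M'
            o = torch.bmm(p, m)                     # P x M' -> O: N x C/8 x (H*W)
            o = o.view(n, -1, h, w)
            return self.out(o)                      # output: N x C x H x W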
In some implementations of the first aspect, the training stage operates as follows:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training process comprises the following steps (a minimal code sketch follows the list):
initializing the number of training iterations;
generating small batches of lane line training data according to the iteration count;
obtaining predicted values from the forward-propagated training data and computing the loss function;
updating the weights through back propagation;
determining whether the updated weights reach the training target; if so, training ends; otherwise, proceeding to the next step;
and determining whether the maximum number of training iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending training.
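As a rough illustration only, the training flow above might be written as the following PyTorch-style loop; the model, data loader, loss function, optimizer, and the two stopping criteria are placeholders, since this section does not fix any of them.

    import torch

    def train(model, loader, loss_fn, optimizer, max_iters, target_loss):
        """Minimal sketch of the training flow: iterate, forward, compute loss,
        back-propagate, update weights, and stop on target or iteration count."""
        for iteration in range(max_iters):        # steps (1)/(6): iteration counter
            for images, labels in loader:         # step (2): small-batch training data
                preds = model(images)             # step (3): forward pass -> predictions
                loss = loss_fn(preds, labels)     #           loss against ground truth
                optimizer.zero_grad()
                loss.backward()                   # step (4): back propagation
                optimizer.step()                  #           weight update
            if loss.item() < target_loss:         # step (5): training target reached?
                break
        return model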
In some implementations of the first aspect, the inference stage comprises the following steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
and post-processing the inference results to obtain the predicted lane lines.
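A minimal sketch of this stage is given below; the postprocess argument stands in for the NMS-style filtering of candidate lane lines mentioned in the background, whose exact form is not specified here.

    import random
    import torch

    def run_inference(model, test_images, postprocess):
        """Shuffle the test set, run the network on each image, and post-process
        the raw outputs into predicted lane lines."""
        model.eval()
        random.shuffle(test_images)               # randomly shuffle the test set
        lanes = []
        with torch.no_grad():
            for image in test_images:             # image: C x H x W tensor (assumed)
                raw = model(image.unsqueeze(0))   # network inference result
                lanes.append(postprocess(raw))    # predicted lane lines
        return lanes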
In some implementations of the first aspect, the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
In some implementations of the first aspect, the scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
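The two steps can be sketched as follows, assuming OpenCV-style images and lane annotations stored as (x, y) point arrays; the mean and std values are the ImageNet statistics quoted in embodiment twelve below.

    import cv2
    import numpy as np

    MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)  # per-channel means
    STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)   # per-channel standard deviations

    def preprocess(image: np.ndarray, points: np.ndarray, s_x: float, s_y: float):
        """Scale image I and annotation L by (s_x, s_y), then normalize pixels.
        The point format (K x 2 array of x, y) is an assumption for illustration."""
        h, w = image.shape[:2]
        scaled = cv2.resize(image, (int(w * s_x), int(h * s_y)))       # dst = s * src
        scaled_points = points * np.array([s_x, s_y], dtype=np.float32)
        normalized = (scaled.astype(np.float32) / 255.0 - MEAN) / STD  # I' = (I - mean) / std
        return normalized, scaled_points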
In some implementations of the first aspect, the data enhancement strategy of the lane line detection task in the training stage includes horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
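The four coordinate transforms might be implemented as below; the rotation helper takes the image center as the pivot (consistent with w and h appearing in the rotation formula), and the translation and scaling offsets mirror the (x_start, y_start), (x_end, y_end), and (x_top, y_top) quantities defined above.

    import math
    import numpy as np

    def flip_points(points: np.ndarray, w: float) -> np.ndarray:
        """Horizontal flip: (x, y) -> (w - x, y)."""
        return np.stack([w - points[:, 0], points[:, 1]], axis=1)

    def rotate_points(points: np.ndarray, theta: float, w: float, h: float) -> np.ndarray:
        """Rotate by theta about the image center (assumed pivot)."""
        x = points[:, 0] - w / 2.0
        y = points[:, 1] - h / 2.0
        xr = x * math.cos(theta) - y * math.sin(theta) + w / 2.0
        yr = x * math.sin(theta) + y * math.cos(theta) + h / 2.0
        return np.stack([xr, yr], axis=1)

    def translate_points(points: np.ndarray, start, end) -> np.ndarray:
        """Translate by the offset between the selected points."""
        return points + (np.asarray(end, dtype=np.float32) - np.asarray(start, dtype=np.float32))

    def scale_points(points: np.ndarray, scale: float, top) -> np.ndarray:
        """Scale by `scale`, then shift to (x_top, y_top) inside the padded image."""
        return points * scale + np.asarray(top, dtype=np.float32)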
In a second aspect, there is provided a lane line detection system, the system comprising:
a first module for constructing a convolutional neural network model;
a second module for constructing training and inference stages for the network model;
a third module for adjusting the preprocessing of the training stage;
a fourth module for predicting the lane lines through the inference stage.
In some implementations of the second aspect, the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map;
further, an attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W.
In some implementations of the second aspect, the second module is further configured to perform the training stage, with the following specific operation steps:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training process comprises the following steps:
initializing the number of training iterations;
generating small batches of lane line training data according to the iteration count;
obtaining predicted values from the forward-propagated training data and computing the loss function;
updating the weights through back propagation;
determining whether the updated weights reach the training target; if so, training ends; otherwise, proceeding to the next step;
and determining whether the maximum number of training iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending training.
In some implementations of the second aspect, the third module further performs the preprocessing as follows:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
normalizing the scaled image I and annotation data L;
the scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
In some implementations of the second aspect, the fourth module further applies the data enhancement strategies of the lane line detection task in the training stage, including horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
In a third aspect, there is provided a lane line detection apparatus, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the lane line detection method of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, having computer program instructions stored thereon; when executed by a processor, the program instructions implement the lane line detection method of the first aspect.
Beneficial effects: the invention provides a lane line detection method and system aimed at the problem of inaccurate positioning of the far end of the lane line. An attention mechanism is introduced into the network structure: by fusing local and global information, the semantic features of the extracted feature map become richer, while the positioning accuracy of the far end of the lane line is improved at low computational cost. Introducing the attention mechanism enriches the global semantic information of the feature map, and the network model trained with the attention structure improves the positioning accuracy of the far end of the lane line.
Drawings
Fig. 1 is a schematic diagram of the network architecture of the present invention.
Fig. 2 is a schematic view of the attention structure of the present invention.
Fig. 3 is a schematic diagram of the training process of the present invention.
FIG. 4 is a schematic flow chart of the training phase and the reasoning phase of the present invention.
Fig. 5 is a schematic diagram of horizontal flipping in the present invention.
Fig. 6 is a schematic diagram of rotation in the present invention.
Fig. 7 is a schematic diagram of translation in the present invention.
Fig. 8 is a schematic diagram of scaling in the present invention.
FIG. 9 is a flow chart of the pre-processing of the present invention.
Detailed Description
This embodiment provides a lane line detection method and system that enrich the semantic features of the extracted feature map by fusing local and global information, and improve the positioning accuracy of the far end of the lane line at low computational cost.
Currently, the most advanced lane line detection methods in industry are CNN-based. For example, the SCNN and SAD networks treat lane line detection as a semantic segmentation task and rely on a heavy encoding-decoding structure; however, such methods usually take a small image as input, which makes it difficult to accurately predict the far-end portion of a curved lane line. In addition, these methods are generally limited to detecting a predefined number of lane lines, whereas in real road scenes the number of lane lines is not fixed. To address this, PointLaneNet follows a candidate-region strategy and generates multiple candidate lines in the image, thereby removing both the inefficient encoder and the limitation to a predefined number of lanes.
In the prior art, ResNet122 is used as the backbone network to extract semantic features: the input image is processed by the backbone to obtain a corresponding feature map, each grid cell on the feature map is treated as an anchor, and each anchor predicts the offsets of the lane line passing through its cell; lane lines with higher confidence are then retained by an NMS algorithm, which filters out redundant candidates. The output for a whole lane line can thus be obtained directly through end-to-end training, removing the restriction to a fixed number of output lane lines.
In summary, the applicant believes that the prior art has at least the following disadvantage:
anchors on the resulting feature map predict the entire lane line passing through the corresponding grid cell, so performance drops sharply when predicting the far end of a curve.
In order to solve the disadvantages in the prior art, embodiments of the present invention provide a lane line detection method and system, and the following describes a technical solution of the embodiments of the present invention with reference to the accompanying drawings.
Embodiment one:
According to an embodiment, a lane line detection method is provided, comprising:
constructing a convolutional neural network model;
constructing training and inference stages for the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
Embodiment two:
On the basis of embodiment one, the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map.
Embodiment three:
On the basis of embodiment two, an attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W.
Embodiment four:
On the basis of embodiment one, the training stage operates as follows:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
and observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model.
Embodiment five:
On the basis of embodiment four, the training flow comprises the following steps:
initializing the number of training iterations;
generating small batches of lane line training data according to the iteration count;
obtaining predicted values from the forward-propagated training data and computing the loss function;
updating the weights through back propagation;
determining whether the updated weights reach the training target; if so, training ends; otherwise, proceeding to the next step;
and determining whether the maximum number of training iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending training.
Embodiment six:
On the basis of embodiment one, the inference stage comprises the following operation steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
and post-processing the inference results to obtain the predicted lane lines.
Embodiment seven:
On the basis of embodiment six, the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
Embodiment eight:
On the basis of embodiment seven, the scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
Embodiment nine:
On the basis of embodiment seven, the data enhancement strategy of the lane line detection task in the training stage includes horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
Embodiment ten:
A lane line detection system is provided, the system comprising:
a first module for constructing a convolutional neural network model; the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map.
An attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W.
Embodiment eleven:
On the basis of embodiment ten, the second module is used for constructing the training and inference stages of the network model; the second module further performs the training stage, with the following specific operations:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
and observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model.
Embodiment twelve:
On the basis of embodiment ten, a third module adjusts the preprocessing of the training stage; the third module further performs the following preprocessing flow:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
The scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization; here the means of the three RGB channels are [0.485, 0.456, 0.406] in sequence, and the corresponding standard deviations are [0.229, 0.224, 0.225].
Embodiment thirteen:
On the basis of embodiment twelve, a fourth module predicts the lane lines through the inference stage; the fourth module further applies the data enhancement strategies of the lane line detection task in the training stage, including horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
Embodiment fourteen:
A lane line detection apparatus is provided, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the lane line detection method of embodiment one.
Embodiment fifteen:
A computer-readable storage medium is provided, having computer program instructions stored thereon; when executed by a processor, the program instructions implement the lane line detection method of embodiment one.
It should also be noted that the exemplary embodiments in this patent describe some methods or systems as a series of steps or devices. However, the invention is not limited to the order of the steps described above; the steps may be performed in the order given in the embodiments, in a different order, or simultaneously.
As will be apparent to those skilled in the art, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (10)

1. A lane line detection method, characterized by comprising the following steps:
constructing a convolutional neural network model;
constructing training and inference stages for the network model;
adjusting the preprocessing of the training stage;
and obtaining the predicted lane lines through the inference stage.
2. The method according to claim 1, wherein the neural network model is constructed as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map;
an attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W.
3. The lane line detection method according to claim 1, wherein the training stage operates as follows:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training process comprises the following steps:
initializing the number of training iterations;
generating small batches of lane line training data according to the iteration count;
obtaining predicted values from the forward-propagated training data and computing the loss function;
updating the weights through back propagation;
determining whether the updated weights reach the training target; if so, training ends; otherwise, proceeding to the next step;
and determining whether the maximum number of training iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending training.
4. The lane line detection method according to claim 1, wherein the inference stage comprises the following steps:
randomly shuffling the test set;
running convolutional neural network inference on the images in the test set to obtain inference results;
post-processing the inference results to obtain the predicted lane lines;
the preprocessing flow is as follows:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
and normalizing the scaled image I and annotation data L.
5. The lane line detection method according to claim 4, wherein the scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization.
6. The lane line detection method according to claim 1, wherein the data enhancement strategy of the lane line detection task in the training stage comprises horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
7. A lane line detection system, characterized by comprising the following modules:
a first module for constructing a convolutional neural network model;
a second module for constructing training and inference stages for the network model;
a third module for adjusting the preprocessing of the training stage;
a fourth module for predicting the lane lines through the inference stage;
the first module further constructs the neural network model as follows:
the neural network model extracts features from the convolutional feature map;
the convolutional feature map is fed into three branches, each using a 1x1 convolution layer;
the 1x1 convolution layers each compress the channel number of the feature map;
after the first branch is transposed, it is matrix-multiplied with the second branch to obtain an attention feature map;
the attention feature map is passed through softmax to obtain a normalized attention feature map;
the third branch is matrix-multiplied with the normalized attention feature map to obtain a weighted attention feature map;
and a 1x1 convolution raises the channel number of the feature map again, outputting a self-attention feature map;
further, an attention structure is obtained from the neural network model, the attention structure being specified as follows:
suppose the feature map obtained from the backbone network has size N × C × H × W;
each of the three branches then applies a 1x1 convolution to compress the channel number of the feature map,
so the compressed feature maps all have size N × C/8 × (H × W);
the feature map Q of the second branch is transposed to obtain Q', whose size is N × (H × W) × C/8;
Q' and R are matrix-multiplied to obtain M, and M is normalized with Softmax to obtain M', where the size of M is N × (H × W) × (H × W) and M' represents the extracted global semantic information;
P and M' are matrix-multiplied to obtain the feature map O, whose size is N × C/8 × (H × W);
finally, a 1x1 convolution raises the channel number again, and the output feature map has size N × C × H × W;
the second module further performs the training stage, with the following specific operation steps:
collecting lane line training sample images, where the training sample images are RGB three-channel color images annotated with the position and category information of the lane lines;
converting the training samples through preprocessing into the format required by the convolutional neural network, further improving the training effect;
dividing the data set into a training set, a validation set, and a test set, where the training set is used to train the convolutional neural network, the validation set is used to select the optimal trained model, and the test set is used to evaluate model metrics;
building the neural network structure, replacing an ordinary convolution structure in the PointLaneNet structure with the attention structure;
iterative training: forward propagation computes the model prediction and the loss function against the annotated ground truth, and back propagation computes the gradients;
observing the descending loss curve, validating the converged models on the validation set, and selecting the optimal trained model;
the training process comprises the following steps:
initializing the number of training iterations;
generating small batches of lane line training data according to the iteration count;
obtaining predicted values from the forward-propagated training data and computing the loss function;
updating the weights through back propagation;
determining whether the updated weights reach the training target; if so, training ends; otherwise, proceeding to the next step;
and determining whether the maximum number of training iterations has been reached; if not, incrementing the iteration count by 1 and returning to step (2); if so, ending training.
8. The lane line detection system of claim 7, wherein the third module further performs the preprocessing as follows:
reading the image I and the annotation data L;
scaling the read image I and the annotation data L;
normalizing the scaled image I and annotation data L;
the scaling of the read image I and the annotation data L is expressed as follows:

dst_x = s_x × src_x, dst_y = s_y × src_y

where dst_x and dst_y represent the position in the scaled image; src_x and src_y represent the corresponding position in the original image; and s_x and s_y represent the scaling factors;
the normalization of the scaled image I and annotation data L is expressed as follows:

I'_(x,y) = (I_(x,y) - mean) / std

where I'_(x,y) represents the pixel value of the normalized image; I_(x,y) represents the pixel value of the image before normalization; mean represents the mean of the image pixel values before normalization; and std represents the standard deviation of the image pixel values before normalization;
the fourth module further applies the data enhancement strategies of the lane line detection task in the training stage, including horizontal flipping, rotation, translation, and scaling;
for horizontal flipping, let the coordinates of a point on the lane line before flipping be (x, y) and the coordinates after flipping be (x', y'); the calculation is as follows:

x' = w - x, y' = y

where w represents the width of the image;
for rotation, let the coordinates of a point on the lane line before rotation be (x, y), the coordinates after rotation be (x', y'), and the rotation angle be θ; taking the image center as the rotation center, the calculation is as follows:

x' = (x - w/2)·cos θ - (y - h/2)·sin θ + w/2
y' = (x - w/2)·sin θ + (y - h/2)·cos θ + h/2

where w represents the width of the image and h represents the height of the image;
for translation, let the coordinates of a point on the lane line before translation be (x, y) and the coordinates after translation be (x', y'); the calculation is as follows:

x' = x + (x_end - x_start), y' = y + (y_end - y_start)

where (x_start, y_start) represents the coordinates selected before translation and (x_end, y_end) represents the coordinates after the point is translated;
and for scaling, let the coordinates of a point on the lane line before scaling be (x, y) and the coordinates after scaling be (x', y'); the calculation is as follows:

x' = scale·x + x_top, y' = scale·y + y_top

where scale represents the scaling factor and (x_top, y_top) represents the starting coordinates of the scaled image within the padded image.
9. A lane line detection apparatus, the apparatus comprising:
a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement a lane line detection method according to any one of claims 1 to 6.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement a lane line detection method according to any one of claims 1 to 6.
CN202110091223.3A 2021-01-22 2021-01-22 Lane line detection method and system Pending CN112883807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110091223.3A CN112883807A (en) 2021-01-22 2021-01-22 Lane line detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110091223.3A CN112883807A (en) 2021-01-22 2021-01-22 Lane line detection method and system

Publications (1)

Publication Number Publication Date
CN112883807A true CN112883807A (en) 2021-06-01

Family

ID=76050531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110091223.3A Pending CN112883807A (en) 2021-01-22 2021-01-22 Lane line detection method and system

Country Status (1)

Country Link
CN (1) CN112883807A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269171A (en) * 2021-07-20 2021-08-17 魔视智能科技(上海)有限公司 Lane line detection method, electronic device and vehicle
CN114782915A (en) * 2022-04-11 2022-07-22 哈尔滨工业大学 Intelligent automobile end-to-end lane line detection system and equipment based on auxiliary supervision and knowledge distillation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222591A (en) * 2019-05-16 2019-09-10 天津大学 A kind of method for detecting lane lines based on deep neural network
CN110298387A (en) * 2019-06-10 2019-10-01 天津大学 Incorporate the deep neural network object detection method of Pixel-level attention mechanism
CN111242037A (en) * 2020-01-15 2020-06-05 华南理工大学 Lane line detection method based on structural information

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222591A (en) * 2019-05-16 2019-09-10 天津大学 A kind of method for detecting lane lines based on deep neural network
CN110298387A (en) * 2019-06-10 2019-10-01 天津大学 Incorporate the deep neural network object detection method of Pixel-level attention mechanism
CN111242037A (en) * 2020-01-15 2020-06-05 华南理工大学 Lane line detection method based on structural information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Zhewei: "Research on Lane Line Detection Algorithm Based on Deep Learning", China Excellent Master's Theses Electronic Journals Network, no. 01, pages 035-472 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269171A (en) * 2021-07-20 2021-08-17 魔视智能科技(上海)有限公司 Lane line detection method, electronic device and vehicle
CN113269171B (en) * 2021-07-20 2021-10-12 魔视智能科技(上海)有限公司 Lane line detection method, electronic device and vehicle
CN114782915A (en) * 2022-04-11 2022-07-22 哈尔滨工业大学 Intelligent automobile end-to-end lane line detection system and equipment based on auxiliary supervision and knowledge distillation
CN114782915B (en) * 2022-04-11 2023-04-07 哈尔滨工业大学 Intelligent automobile end-to-end lane line detection system and equipment based on auxiliary supervision and knowledge distillation

Similar Documents

Publication Publication Date Title
CN111401361B (en) End-to-end lightweight depth license plate recognition method
CN111696094B (en) Immunohistochemical PD-L1 membrane staining pathological section image processing method, device and equipment
CN112837315B (en) Deep learning-based transmission line insulator defect detection method
CN111160311A (en) Yellow river ice semantic segmentation method based on multi-attention machine system double-flow fusion network
CN111612008B (en) Image segmentation method based on convolution network
CN113177560A (en) Universal lightweight deep learning vehicle detection method
CN111259940A (en) Target detection method based on space attention map
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
CN110659601B (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN112883807A (en) Lane line detection method and system
CN108776777A (en) The recognition methods of spatial relationship between a kind of remote sensing image object based on Faster RCNN
CN114821342A (en) Remote sensing image road extraction method and system
CN114913498A (en) Parallel multi-scale feature aggregation lane line detection method based on key point estimation
CN106372597A (en) CNN traffic detection method based on adaptive context information
CN112037180B (en) Chromosome segmentation method and device
CN110738132A (en) target detection quality blind evaluation method with discriminant perception capability
CN114494870A (en) Double-time-phase remote sensing image change detection method, model construction method and device
CN115205855A (en) Vehicle target identification method, device and equipment fusing multi-scale semantic information
CN114387512A (en) Remote sensing image building extraction method based on multi-scale feature fusion and enhancement
CN114120359A (en) Method for measuring body size of group-fed pigs based on stacked hourglass network
CN111914596B (en) Lane line detection method, device, system and storage medium
CN115147727A (en) Method and system for extracting impervious surface of remote sensing image
KR102416714B1 (en) System and method for city-scale tree mapping using 3-channel images and multiple deep learning
CN115761667A (en) Unmanned vehicle carried camera target detection method based on improved FCOS algorithm
CN115376094A (en) Unmanned sweeper road surface identification method and system based on scale perception neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination