CN117765492A - Lane line detection method and device, electronic equipment and storage medium

Info

Publication number: CN117765492A
Application number: CN202311769108.2A
Authority: CN
Inventors: 邱增玉; 王烽人; 娄舜; 温子腾; 郭涛; 胡金水
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2023-12-20
Filing date: 2023-12-20
Publication date: 2024-03-26

Abstract

The invention provides a lane line detection method, a lane line detection device, electronic equipment and a storage medium. The lane line detection method comprises the following steps: inputting an image to be detected into a feature extractor in a lane line detection model to obtain image features of the image to be detected output by the feature extractor; extracting, from the image features, an initial query feature corresponding to at least one target pixel point, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; performing feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature is used for representing global feature information of the target pixel point in the image to be detected; and detecting the lane line in the image to be detected based on each enhanced query feature. Because feature enhancement is performed based on the image features and the initial query features, the representational strength of the information about each target pixel point is improved, which in turn improves detection accuracy.

Description

Lane line detection method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a lane line detection method, a lane line detection device, electronic equipment and a storage medium.

Background

In autonomous driving technology, lane line detection helps a vehicle identify the lane lines on a road; based on the identified lane lines, the vehicle can keep to the correct direction, lane, speed, and the like. Lane line detection techniques have been used in lane keeping, planning control, real-time positioning, adaptive cruise control, and other applications.

Existing lane line detection methods include methods based on deep learning algorithms, such as lane line detection based on polynomial curves, lane line detection based on semantic segmentation, and the like. Although these methods use deep learning algorithms, their detection accuracy remains low because of shortcomings in the algorithms. Therefore, how to improve the accuracy of lane line detection is a problem to be solved.

Disclosure of Invention

The invention provides a lane line detection method, a lane line detection device, electronic equipment and a storage medium, which are used for solving the defect of low lane line detection accuracy in the prior art and achieving the purpose of improving the detection accuracy.

The invention provides a lane line detection method, which comprises the following steps:

inputting an image to be detected into a feature extractor in a lane line detection model to obtain image features of the image to be detected, which are output by the feature extractor;

extracting initial query features corresponding to at least one target pixel point from the image features, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected;

performing feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature is used for representing global feature information of the target pixel point in the image to be detected;

and detecting the lane line in the image to be detected based on each enhanced query characteristic.

According to the lane line detection method provided by the invention, the feature enhancement is performed on each initial query feature based on the image feature and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, and the method comprises the following steps:

inputting the image features and the initial query features into a Transformer decoder in the lane line detection model, performing self-attention feature enhancement on the initial query features through the Transformer decoder, and performing cross-attention feature enhancement on the image features and the initial query features to obtain enhanced query features corresponding to the target pixel points.

According to the lane line detection method provided by the invention, the initial query feature corresponding to at least one target pixel point is extracted from the image features, and the method comprises the following steps:

inputting the image features into a semantic segmentation detector in the lane line detection model to obtain a mask segmentation map output by the semantic segmentation detector, wherein the mask segmentation map is used for representing the positions of corresponding pixel points of the lane lines in the image to be detected;

and inputting the mask segmentation map and the image features into a query generator in the lane line detection model to obtain initial query features corresponding to each target pixel point output by the query generator.

According to the lane line detection method provided by the invention, the mask segmentation map and the image features are input into a query generator in the lane line detection model to obtain initial query features corresponding to each target pixel point output by the query generator, and the method comprises the following steps:

determining the position of each target pixel point in the image to be detected based on the mask segmentation map through the query generator;

and extracting feature information corresponding to the position of the target pixel point from the image feature based on the position of the target pixel point, and determining the feature information as an initial query feature corresponding to the target pixel point.

determining a position vector corresponding to each target pixel point based on a mask segmentation map through a query generator in the lane line detection model, wherein the mask segmentation map is used for representing the positions of the pixel points corresponding to the lane lines in the image to be detected;

for each target pixel point, splicing the position vector of the target pixel point with the initial query characteristic of the target pixel point to obtain a target query characteristic;

inputting the image features and the target query features into a Transformer decoder in the lane line detection model, performing self-attention feature enhancement on the target query features through the Transformer decoder, and performing cross-attention feature enhancement on the image features and the target query features to obtain enhanced query features corresponding to the target pixel points.
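As a non-limiting illustration, the splicing of a position vector onto an initial query feature can be sketched as follows. The source does not specify how the position vector is formed; the normalized (y, x) coordinates used here, and the function names `position_vector` and `build_target_query`, are assumptions introduced only for this sketch.

```python
def position_vector(y, x, height, width):
    """Hypothetical position vector: (y, x) normalized to [0, 1] (an assumption)."""
    return [y / (height - 1), x / (width - 1)]


def build_target_query(initial_query, y, x, height, width):
    """Splice (concatenate) the position vector onto the initial query feature."""
    return list(initial_query) + position_vector(y, x, height, width)


# Toy 3-dimensional initial query feature for a pixel at the bottom-left
# corner of a 40 x 40 feature grid.
initial_query = [0.5, -1.0, 2.0]
target_query = build_target_query(initial_query, y=39, x=0, height=40, width=40)
print(target_query)  # [0.5, -1.0, 2.0, 1.0, 0.0]
```

The resulting target query feature is two elements longer than the initial query feature, so a decoder consuming it would have to accept the enlarged dimension.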

According to the lane line detection method provided by the invention, the lane line in the image to be detected is detected based on each enhanced query feature, and the lane line detection method comprises the following steps:

for each enhanced query feature, inputting the enhanced query feature into a lane line detector in the lane line detection model to obtain a predicted lane line corresponding to the enhanced query feature, output by the lane line detector, and a prediction confidence of the predicted lane line;

and detecting the lane lines in the image to be detected based on the predicted lane lines corresponding to the enhanced query features and the prediction confidence of the predicted lane lines.

According to the lane line detection method provided by the invention, detecting the lane line in the image to be detected based on the predicted lane line corresponding to each enhanced query feature and the prediction confidence of the predicted lane line comprises the following steps:

determining the similarity between any two enhanced query features, and placing the enhanced query features whose similarity is greater than a preset similarity into the same group to obtain at least one group;

for each group, taking the predicted lane line corresponding to the enhanced query feature with the maximum prediction confidence in the group as the target lane line of the group;

and detecting each target lane line as a lane line in the image to be detected.
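The grouping and selection steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the source only requires a similarity compared against a preset value, so the use of cosine similarity, the greedy assignment of each query to the first matching group, and the names `cosine_similarity` and `select_target_lanes` are all illustrative choices rather than the method of the invention.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def select_target_lanes(queries, lanes, confidences, threshold):
    """Group similar enhanced query features, then keep the most confident lane per group."""
    groups = []  # each group is a list of query indices
    for i, query in enumerate(queries):
        for group in groups:
            # Compare against the first member of the group (greedy assumption).
            if cosine_similarity(query, queries[group[0]]) > threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    # For each group, pick the predicted lane line with the maximum confidence.
    return [lanes[max(group, key=lambda i: confidences[i])] for group in groups]


queries = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
lanes = ["lane_a", "lane_a_dup", "lane_b"]
confidences = [0.7, 0.9, 0.8]
result = select_target_lanes(queries, lanes, confidences, threshold=0.9)
print(result)  # ['lane_a_dup', 'lane_b']
```

The first two query features are nearly parallel and fall into one group, from which the higher-confidence prediction is kept; the third forms its own group.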

The invention also provides a lane line detection device, which comprises:

the input unit is used for inputting the image to be detected into the feature extractor in the lane line detection model to obtain the image features of the image to be detected, which are output by the feature extractor;

the extraction unit is used for extracting initial query characteristics corresponding to at least one target pixel point from the image characteristics, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected;

the enhancement unit is used for carrying out feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature is used for representing global feature information of the target pixel point in the image to be detected;

and the detection unit is used for detecting the lane lines in the image to be detected based on the enhanced query characteristics.

The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the lane line detection method according to any one of the above when executing the computer program.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a lane line detection method as described in any one of the above.

The invention also provides a computer program product comprising a computer program which when executed by a processor implements a lane line detection method as described in any one of the above.

The invention provides a lane line detection method, a lane line detection device, electronic equipment and a storage medium, wherein the method is characterized in that an image to be detected is input into a feature extractor in a lane line detection model to obtain image features of the image to be detected, which are output by the feature extractor; extracting initial query characteristics corresponding to at least one target pixel point from the image characteristics, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; based on the image characteristics and each initial query characteristic, carrying out characteristic enhancement on each initial query characteristic to obtain an enhanced query characteristic corresponding to each target pixel point, wherein the enhanced query characteristic is used for representing global characteristic information of the target pixel point in the image to be detected; and detecting a lane line in the image to be detected based on each enhanced query characteristic. Therefore, the information of the pixel points in the lane line can be represented through the initial query characteristics, and the semantic priori information of the lane line is fully utilized; further, feature enhancement is performed on each initial query feature based on the image features and each initial query feature, so that after feature interaction is performed on the enhanced query features and the image features, information of a target pixel point in the image global can be more effectively represented, information characterization strength of the target pixel point is improved, and therefore, when lane lines in an image to be detected are detected based on each enhanced query feature, detection results with higher accuracy can be obtained, and further the accuracy of lane line detection is improved.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic flow chart of a lane line detection method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a lane line detection model according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of determining a target pixel according to an embodiment of the present invention;

Fig. 4 is a schematic block diagram of implementation steps of a lane line detection method according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a lane line detection apparatus according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention. It should be noted that, in the present invention, the numbers of the described objects, such as "first", "second", etc., are only used to distinguish the described objects, and do not have any sequence or technical meaning.

In practical application, the traditional lane line detection method depends on a detection rule designed manually to detect the lane line, has poor applicability when detecting based on the rule, and cannot meet the practical requirements of applications such as automatic driving perception. With the continuous progress and development of deep learning technology, a lane line detection method based on deep learning is attracting attention.

Existing deep-learning-based lane line detection methods include methods based on polynomial curves, methods based on semantic segmentation, and the like. Different types of methods model the lane line detection process from different angles, but all of them neglect the semantic prior information of lane lines: they do not fully exploit this information to represent lane lines effectively, so detection accuracy is low when these existing methods are used.

Polynomial-curve-based lane line detection treats a lane line as a polynomial curve, which may be a quadratic polynomial, a cubic polynomial, a Bezier curve, or the like, and represents a complete lane line by predicting the coefficients of the curve together with information such as the starting point of the lane line. The drawback of this method is that its model of a lane line is overly idealized and does not fit many real lane line scenes, so detection accuracy is low in practical application.

Semantic-segmentation-based lane line detection classifies each pixel of the image as lane line or background to generate a binary segmentation mask, and then, through post-processing, clusters the pixels belonging to the same lane line into a group to obtain the individual lane line instances. The method thus converts the lane line detection problem into a pixel-by-pixel prediction problem. Although the method uses a binary segmentation mask, it does not model the lane line as a whole and does not fully exploit the representational capability of the lane line's semantic prior information. As a result, when the method is applied at inference time and visual cues of the lane line are lacking, lane lines may be missed or detected as broken segments, and the accuracy of lane line detection is low.

In view of the above problems, an embodiment of the present invention provides a lane line detection method, where an image to be detected is input into a feature extractor in a lane line detection model to obtain image features of the image to be detected output by the feature extractor; extracting initial query characteristics corresponding to at least one target pixel point from the image characteristics, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; based on the image characteristics and each initial query characteristic, carrying out characteristic enhancement on each initial query characteristic to obtain an enhanced query characteristic corresponding to each target pixel point, wherein the enhanced query characteristic is used for representing global characteristic information of the target pixel point in the image to be detected; and detecting a lane line in the image to be detected based on each enhanced query characteristic. Therefore, the information of the pixel points in the lane line can be represented through the initial query characteristics, and the semantic priori information of the lane line is fully utilized; further, feature enhancement is performed on each initial query feature based on the image features and each initial query feature, so that after feature interaction is performed on the enhanced query features and the image features, information of a target pixel point in the image global can be more effectively represented, information characterization strength of the target pixel point is improved, and therefore, when lane lines in an image to be detected are detected based on each enhanced query feature, detection results with higher accuracy can be obtained, and further the accuracy of lane line detection is improved.

The lane line detection method according to the embodiment of the present invention is described below with reference to Figs. 1 to 4.

Fig. 1 is a flow chart of a lane line detection method provided by an embodiment of the invention. The method is applicable to lane line detection in various scenes, for example in technical fields such as autonomous driving perception. The method may be executed by an electronic device such as a vehicle, a computer, a server, or purpose-built lane line detection equipment, or by a lane line detection device arranged in such an electronic device; the lane line detection device may be implemented in software, hardware, or a combination of the two. As shown in Fig. 1, the lane line detection method includes steps 110 to 140.

Step 110, inputting the image to be detected into a feature extractor in the lane line detection model to obtain the image features of the image to be detected output by the feature extractor.

In this step, the image to be detected may be an image of a lane line to be detected acquired by the image acquisition device, or may be an image of a lane line to be detected obtained by frame extraction from a video stream acquired by the video acquisition device, or may be an image to be detected acquired by other means.

The lane line detection model may be a trained neural network model. For example, an initial lane line detection model may be subjected to supervised training based on sample images and the truth labels corresponding to the sample images, to obtain the lane line detection model. The initial lane line detection model may include, but is not limited to, a model composed of at least one of a convolutional neural network (Convolutional Neural Network, CNN), a recurrent neural network (Recurrent Neural Network, RNN), a long short-term memory (LSTM) neural network, and a deep neural network (Deep Neural Network, DNN).

The lane line detection model includes a feature extractor therein, which may be used to extract image features of an image to be detected, for example, the feature extractor may include a plurality of convolutional neural network layers. The image features may be high-dimensional feature vectors characterizing the image information of the image to be detected. The image features can represent the global feature information of the image to be detected, including the feature information of the lane line local part and the lane line whole, and can be understood as high-level semantic features.

Step 120, extracting an initial query feature corresponding to at least one target pixel point from the image features, where the target pixel point is a pixel point corresponding to a lane line in the image to be detected.

In this step, the target pixel point may be a pixel point corresponding to a lane line in the image to be detected, where the target pixel point may be a pixel point obtained randomly from all pixel points of the lane line, or may be a pixel point determined from all pixel points of the lane line in a preset obtaining manner, for example, a target pixel point is determined from all pixel points of the lane line in an equidistant obtaining manner.
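As an illustration of the equidistant acquisition manner mentioned above, the following sketch picks target pixel points at even index spacing from an ordered list of lane line pixel points. The function name `sample_equidistant`, the ordering of the pixel list, and the handling of the edge cases are assumptions made for this sketch only.

```python
def sample_equidistant(lane_pixels, num_points):
    """Pick num_points pixels at approximately even index spacing along the lane."""
    if num_points >= len(lane_pixels):
        return list(lane_pixels)
    if num_points == 1:
        return [lane_pixels[len(lane_pixels) // 2]]
    step = (len(lane_pixels) - 1) / (num_points - 1)
    return [lane_pixels[round(i * step)] for i in range(num_points)]


# Nine lane line pixels given as (row, col), ordered from top to bottom.
lane_pixels = [(y, 10 + y // 2) for y in range(9)]
targets = sample_equidistant(lane_pixels, 3)
print(targets)  # [(0, 10), (4, 12), (8, 14)]
```

Random acquisition, the other manner mentioned above, could be obtained by replacing the index computation with a call such as `random.sample(lane_pixels, num_points)` from the standard library.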

Extracting an initial query feature corresponding to at least one target pixel point from the image features can be understood as extracting, from the already extracted image features, the feature information of the pixel points corresponding to the lane line; this feature information is the initial query feature, and it represents the feature information of the target pixel point. The initial query feature may be understood as a preliminarily extracted query feature, that is, a feature that serves as a query in the subsequent feature enhancement.

For example, once a target pixel point has been determined, the image features of the image to be detected may be looked up using the index information of the target pixel point in the image to be detected, so as to extract the initial query feature of that target pixel point.
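The index-based lookup described above can be sketched as follows. The source does not state how image coordinates map onto the feature grid; the integer division by a downsampling `stride` and the name `extract_initial_queries` are assumptions used only for illustration.

```python
def extract_initial_queries(feature_map, target_pixels, stride):
    """Look up one feature vector per target pixel.

    feature_map[row][col] is a feature vector; target_pixels are (y, x)
    positions in image coordinates; stride is the assumed downsampling factor.
    """
    queries = []
    for y, x in target_pixels:
        row, col = y // stride, x // stride  # map image coordinates onto the grid
        queries.append(feature_map[row][col])
    return queries


# Toy 2 x 2 feature map with 3-dimensional features for a 4 x 4 image (stride 2).
feature_map = [
    [[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]],
    [[2.0, 2.1, 2.2], [3.0, 3.1, 3.2]],
]
queries = extract_initial_queries(feature_map, [(0, 3), (3, 1)], stride=2)
print(queries)  # [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]]
```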

Step 130, performing feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature is used for representing global feature information of the target pixel point in the image to be detected.

In this step, the enhanced query feature may be a feature obtained after feature enhancement of the initial query feature, for example, feature enhancement may be performed on the initial query feature by means of feature information fusion or feature vector concatenation, so as to obtain a corresponding enhanced query feature. The enhanced query feature may represent global feature information of the target pixel point in the image to be detected, and it may be understood that the enhanced query feature may represent feature information associated with the corresponding target pixel point and other target pixel points, feature information associated with the target pixel point and the whole lane line, and feature information associated with the target pixel point and other pixel areas except the lane line in the image to be detected.

For example, the image features and the initial query features may be input into a network layer with an encoder-decoder structure for feature fusion, so as to obtain the enhanced query feature corresponding to each initial query feature.

In this way, the enhanced query features obtained through feature enhancement represent the feature information of the corresponding target pixel points with greater representational power, so that lane line detection makes full use of the semantic prior information of the lane lines, and the accuracy of lane line detection can be further improved.

Step 140, detecting lane lines in the image to be detected based on the enhanced query features.

In this step, when the lane lines in the image to be detected are detected based on the enhanced query features, lane line prediction may, for example, be performed on the enhanced query feature corresponding to each target pixel point to predict the lane line on which that target pixel point lies. A probability value corresponding to each predicted lane line may further be determined, where the probability value indicates the probability that the predicted lane line is a real lane line. The predicted lane lines may then be divided into at least one region based on their positions. For each region, the predicted lane line with the maximum probability value in the region may be determined as a detection result, thereby detecting the lane lines in the image to be detected.
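A minimal sketch of the region-based selection in this step is given below. The source does not define how regions are formed; representing each predicted lane line by a single horizontal position and dividing the image into fixed-width bins are assumptions, as is the name `detect_by_region`.

```python
def detect_by_region(predictions, region_width):
    """Keep the most probable predicted lane line in each region.

    predictions: list of (lane_id, x_position, probability) tuples.
    Regions are fixed-width bins over x_position (an assumption).
    """
    best = {}  # region index -> (lane_id, probability)
    for lane_id, x, prob in predictions:
        region = int(x // region_width)
        if region not in best or prob > best[region][1]:
            best[region] = (lane_id, prob)
    return [best[region][0] for region in sorted(best)]


predictions = [
    ("p0", 52, 0.60),
    ("p1", 58, 0.85),
    ("p2", 130, 0.75),
    ("p3", 121, 0.40),
]
detected = detect_by_region(predictions, region_width=40)
print(detected)  # ['p1', 'p2']
```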

According to the lane line detection method provided by the embodiment of the invention, the image characteristics of the image to be detected, which are output by the characteristic extractor, are obtained by inputting the image to be detected into the characteristic extractor in the lane line detection model; extracting initial query characteristics corresponding to at least one target pixel point from the image characteristics, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; based on the image characteristics and each initial query characteristic, carrying out characteristic enhancement on each initial query characteristic to obtain an enhanced query characteristic corresponding to each target pixel point, wherein the enhanced query characteristic is used for representing global characteristic information of the target pixel point in the image to be detected; and detecting a lane line in the image to be detected based on each enhanced query characteristic. Therefore, the information of the pixel points in the lane line can be represented through the initial query characteristics, and the semantic priori information of the lane line is fully utilized; further, feature enhancement is performed on each initial query feature based on the image features and each initial query feature, so that after feature interaction is performed on the enhanced query features and the image features, information of a target pixel point in the image global can be more effectively represented, information characterization strength of the target pixel point is improved, and therefore, when lane lines in an image to be detected are detected based on each enhanced query feature, detection results with higher accuracy can be obtained, and further the accuracy of lane line detection is improved.

In addition, the existing method does not fully utilize the semantic priori information of the lane lines, so that the existing method has the defects of unstable training, low convergence speed, poor generalization performance and the like during model training, and a large amount of data is needed for training, so that the training cost is high. Meanwhile, the existing method usually only focuses on the local information of the lane lines, does not focus on the global information of the image to be detected, and under the condition of lacking the visual information of the lane lines, the defects of poor detection effect, broken lines or instability of the detected lane lines and the like exist during detection. The lane line detection method provided by the embodiment of the invention pays attention to fully utilizing the semantic priori information of the lane line, performs feature enhancement based on the image features and each initial query feature to obtain the enhanced query feature corresponding to each target pixel point, and performs lane line detection based on the enhanced query feature, and fully utilizes the global information of the image to be detected, in particular to the semantic priori information of the lane line, so that the purpose of improving the detection accuracy can be realized; meanwhile, as useful information is focused on, the interference of irrelevant information is reduced, the purposes of improving the training stability of the model, accelerating the convergence speed of the model and enhancing the generalization performance of the model can be achieved, and other defects in the prior art are overcome to a certain extent.

In practical application, in order to realize effective feature interaction between the image features and each initial query feature, a self-attention mechanism and a cross-attention mechanism can be adopted when the features are enhanced, so that semantic priori information of lane lines is fully utilized, and the enhanced query feature with stronger characterization strength is obtained.

In an embodiment, based on the image feature and each initial query feature, feature enhancement is performed on each initial query feature to obtain an enhanced query feature corresponding to each target pixel, which may be specifically implemented in the following manner:

inputting the image features and the initial query features into a Transformer decoder in a lane line detection model, performing self-attention feature enhancement on the initial query features through the Transformer decoder, and performing cross-attention feature enhancement on the image features and the initial query features to obtain enhanced query features corresponding to each target pixel point.

Specifically, the lane line detection model may include a Transformer decoder, which may be a network structure formed by stacking multiple Transformer network layers. The image features and the initial query features are input into the Transformer decoder in the lane line detection model. The Transformer decoder can perform self-attention feature enhancement on the input initial query features; that is, the initial query features corresponding to the target pixel points interact with one another, which strengthens the representational power of each initial query feature. Cross-attention feature interaction is then performed between the self-attention-enhanced initial query features and the image features, which further strengthens their representational power and yields the enhanced query feature corresponding to each target pixel point, output by the Transformer decoder.
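The order of operations described above, self-attention among the query features followed by cross-attention against the image features, can be sketched in plain Python as follows. This is a simplified illustration rather than the decoder of the invention: it assumes single-head scaled dot-product attention with residual additions and omits the learned projections, normalization, and feed-forward sublayers of a full Transformer layer. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: one weighted sum of values per query."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        outputs.append([sum(w * v[d] for w, v in zip(weights, values))
                        for d in range(len(values[0]))])
    return outputs


def add(a, b):
    """Element-wise residual addition of two lists of vectors."""
    return [[x + y for x, y in zip(u, v)] for u, v in zip(a, b)]


def decoder_layer(queries, image_tokens):
    # Self-attention: the query features interact with one another.
    queries = add(queries, attend(queries, queries, queries))
    # Cross-attention: each query gathers information from all image tokens.
    return add(queries, attend(queries, image_tokens, image_tokens))


initial_queries = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[1.0, 1.0], [0.5, -0.5], [-1.0, 0.0]]  # flattened feature map
enhanced = decoder_layer(initial_queries, image_tokens)
print(len(enhanced), len(enhanced[0]))  # 2 2
```

The enhanced query features keep the number and dimension of the initial query features, while each one now depends on every other query and on every image token.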

Fig. 2 is a schematic diagram of the framework of a lane line detection model according to an embodiment of the present invention. As shown in Fig. 2, the lane line detection model includes a feature extractor, a Transformer decoder, and the like. The feature extractor may be a network structure comprising a plurality of convolution layers; after the image to be detected is input into the feature extractor in the lane line detection model, the feature extractor extracts the image features of the image through convolution operations to obtain the high-level semantic feature F_t. For example, when a 240×240×3 RGB three-channel image is input to the feature extractor, a 40×40×256 high-level semantic feature F_t can be output, where 240×240×3 indicates that the height and width of the image to be detected are 240×240 and the number of channels is 3, and 40×40×256 indicates that the height and width of F_t are 40×40 and the feature dimension is 256. The high-level semantic feature F_t and each initial query feature are input into the Transformer decoder in the lane line detection model, and the Transformer decoder can output each feature-enhanced query feature through encoding and decoding; for example, if the dimension of the initial query feature is 256, the dimension of the enhanced query feature may also be 256.

In this embodiment, the image features and each initial query feature may be input into a Transformer decoder in the lane line detection model; the Transformer decoder performs self-attention feature enhancement on each initial query feature and cross-attention feature enhancement between the image features and each initial query feature, to obtain the enhanced query feature corresponding to each target pixel point. In this way, effective feature interaction between the image features and each initial query feature can be realized based on the attention mechanism, and enhanced query features with stronger characterization strength are obtained while making full use of the semantic prior information of the lane line, so that the accuracy of lane line detection is improved through the enhanced query features.
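The self-attention and cross-attention steps described above can be illustrated with a minimal NumPy sketch. It is a simplification under stated assumptions, not the patented model: the learned projection matrices, multi-head splitting, layer normalization and feed-forward sublayers of a real Transformer decoder layer are omitted, and the names `attention` and `decoder_layer`, the number of queries (5) and the stack depth (2) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: each query row attends over all key rows.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def decoder_layer(queries, image_features):
    """queries: (Q, C) query features; image_features: (H, W, C)."""
    memory = image_features.reshape(-1, image_features.shape[-1])  # (H*W, C)
    # Self-attention: the queries exchange information with one another.
    queries = queries + attention(queries, queries, queries)
    # Cross-attention: each query gathers context from every image location.
    queries = queries + attention(queries, memory, memory)
    return queries

rng = np.random.default_rng(0)
F_t = rng.standard_normal((40, 40, 256))          # stand-in for the image features
initial_queries = rng.standard_normal((5, 256))   # stand-in for 5 initial query features
enhanced = initial_queries
for _ in range(2):  # a stack of layers (depth 2 is an arbitrary choice here)
    enhanced = decoder_layer(enhanced, F_t)
print(enhanced.shape)  # (5, 256)
```

Because every query attends over all 40×40 feature locations in the cross-attention step, each output row mixes in information from the whole image, which is the sense in which the enhanced query feature carries global feature information.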

Next, a detailed description is given of how to extract an initial query feature corresponding to at least one target pixel point from the image features.

In an embodiment, extracting an initial query feature corresponding to at least one target pixel point from the image features may be specifically implemented as follows:

inputting the image features into a semantic segmentation detector in a lane line detection model to obtain a mask segmentation map output by the semantic segmentation detector, wherein the mask segmentation map is used for representing the positions of corresponding pixel points of the lane lines in the image to be detected; and inputting the mask segmentation map and the image features into a query generator in the lane line detection model to obtain initial query features corresponding to each target pixel point output by the query generator.

Specifically, as shown in fig. 2, the lane line detection model includes a semantic segmentation detector and a query generator. The semantic segmentation detector may be a network structure for predicting the semantics of each pixel point in the image to be detected, for example, the semantic segmentation detector may be a network structure formed by a plurality of convolution layers, etc.; the query generator may also be a network structure comprising convolutional layers. Both the semantic segmentation detector and the query generator can be obtained by means of model training.

The image features are input into the semantic segmentation detector in the lane line detection model, and the semantic segmentation detector can output, through computation, a mask segmentation map corresponding to the image features. The mask segmentation map may be a binary image. For example, in the mask segmentation map, a pixel point marked 0 has the semantics of the background category, that is, it is a non-lane-line pixel point; a pixel point marked 1 has the semantics of the foreground category, that is, it is a lane line pixel point. The mask segmentation map can provide the coordinates of position points guided by the semantic prior: a pixel point marked 1 in the mask segmentation map is a pixel point corresponding to a lane line, and the feature information of that pixel point can be queried through its position coordinates.
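As a small illustration of how a binary mask yields position coordinates, the following NumPy sketch lists the coordinates of all pixels marked 1. The function name `lane_pixel_positions` and the toy 6×6 mask are assumptions made for this example only.

```python
import numpy as np

def lane_pixel_positions(mask):
    """Return the (row, col) coordinates of pixels labelled 1 in a binary mask."""
    return np.argwhere(mask == 1)

mask = np.zeros((6, 6), dtype=np.uint8)  # 0 = background (non-lane-line pixel)
mask[1, 2] = 1                           # 1 = foreground (lane line pixel)
mask[4, 3] = 1
positions = lane_pixel_positions(mask)
print(positions.tolist())  # [[1, 2], [4, 3]]
```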

In one implementation manner, the mask segmentation map and the image feature are input into a query generator in the lane line detection model, so as to obtain initial query features corresponding to each target pixel point output by the query generator, which may be: determining the position of each target pixel point in the image to be detected based on the mask segmentation map through a query generator; for each target pixel point, extracting feature information corresponding to the position of the target pixel point from the image feature based on the position of the target pixel point, and determining the feature information as an initial query feature corresponding to the target pixel point.

Illustratively, as shown in Fig. 2, the image to be detected is input into the feature extractor, which may output the high-level semantic feature F_t of the image to be detected. The high-level semantic feature F_t is input into the semantic segmentation detector, which can predict the semantics of each pixel point in the image to be detected and then output a binary mask segmentation map of the image to be detected, from which the positions of the pixel points corresponding to the lane lines can be determined. For example, based on a 40×40×256 high-level semantic feature F_t, a 240×240×1 binary mask segmentation map Mask may be obtained, where the position coordinates of each pixel point in the mask segmentation map Mask may be used as an index to determine the feature information of the corresponding pixel point. Further, the mask segmentation map Mask and the high-level semantic feature F_t are input into the query generator, which, based on the positions of the pixel points corresponding to the lane lines, queries the high-level semantic feature F_t and extracts the feature values, namely the feature information, at the corresponding positions; this feature information is the initial query feature. In this way, the initial query features of the target pixel points are extracted.
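The lookup performed by the query generator can be sketched as an index operation on F_t. The text does not state how a position in the 240×240 mask is mapped onto the 40×40 feature grid; the sketch below assumes a simple proportional integer mapping (a factor of 6 in this example), and the function name `initial_query_features` is hypothetical.

```python
import numpy as np

def initial_query_features(positions, features, image_size):
    """positions: (Q, 2) array of (row, col) in image coordinates; features: (h, w, C)."""
    h, w, _ = features.shape
    H, W = image_size
    # Scale image coordinates down to the feature-map grid (assumed integer mapping).
    rows = np.clip(positions[:, 0] * h // H, 0, h - 1)
    cols = np.clip(positions[:, 1] * w // W, 0, w - 1)
    return features[rows, cols]  # (Q, C): one feature vector per target pixel

F_t = np.arange(40 * 40 * 256, dtype=np.float32).reshape(40, 40, 256)
positions = np.array([[0, 0], [239, 239], [120, 66]])
queries = initial_query_features(positions, F_t, (240, 240))
print(queries.shape)  # (3, 256)
```

Here the pixel at image position (120, 66) reads the feature vector stored at grid cell (20, 11).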

In the embodiment, inputting the image features into a semantic segmentation detector in a lane line detection model to obtain a mask segmentation map output by the semantic segmentation detector; inputting the mask segmentation map and the image characteristics into a query generator in a lane line detection model, and determining the position of each target pixel point in the image to be detected based on the mask segmentation map through the query generator; for each target pixel point, extracting feature information corresponding to the position of the target pixel point from the image feature based on the position of the target pixel point, and determining the feature information as an initial query feature corresponding to the target pixel point. Based on the method, the initial query features of each target pixel point corresponding to the lane line can be accurately extracted through the semantic segmentation detector and the query generator, the lane line in the image to be detected can be accurately detected based on each initial query feature, and the detection accuracy is improved.

In practical application, a position vector corresponding to the target pixel point can be obtained based on the position of the target pixel point, and the corresponding initial query feature can be enhanced based on the position vector to obtain the enhanced query feature with higher characterization strength.

Determining, through a query generator in the lane line detection model, the position vector corresponding to each target pixel point based on a mask segmentation map, wherein the mask segmentation map is used for representing the positions of the pixel points corresponding to the lane lines in the image to be detected; for each target pixel point, splicing the position vector of the target pixel point with the initial query feature of the target pixel point to obtain a target query feature; and inputting the image features and each target query feature into a Transformer decoder in the lane line detection model, performing self-attention feature enhancement on each target query feature through the Transformer decoder, and performing cross-attention feature enhancement on the image features and each target query feature, to obtain the enhanced query feature corresponding to each target pixel point.

Specifically, the position of the target pixel point in the image to be detected can be determined based on the mask segmentation map. The position may be position information such as position coordinates, and the position information can be encoded by the query generator to obtain the position vector corresponding to the target pixel point. The initial query feature is likewise a feature vector representing the target pixel point, so the position vector corresponding to the target pixel point and the initial query feature corresponding to the target pixel point can be added, that is, feature splicing is performed, to obtain the target query feature corresponding to the initial query feature. Because the target query feature carries the position encoding information of the target pixel point, its characterization strength is higher than that of the initial query feature.
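One possible way to turn position coordinates into a position vector and combine it with the initial query feature is sketched below. The text does not specify the encoding, so the sinusoidal scheme here is an assumption borrowed from common Transformer practice, and the names `sinusoidal_encoding` and `target_query_features` are hypothetical. The sketch combines the two vectors by element-wise addition; concatenation along the feature axis would be an alternative reading of "splicing".

```python
import numpy as np

def sinusoidal_encoding(coords, dim):
    """Encode (Q, 2) pixel coordinates as (Q, dim) vectors; dim must be divisible by 4."""
    quarter = dim // 4
    freqs = 1.0 / (10000.0 ** (np.arange(quarter) / quarter))  # (dim/4,)
    parts = []
    for axis in range(2):  # encode the two coordinates separately
        angles = coords[:, axis:axis + 1] * freqs  # (Q, dim/4)
        parts += [np.sin(angles), np.cos(angles)]
    return np.concatenate(parts, axis=1)

def target_query_features(initial_queries, coords):
    # Add the position vector to the initial query feature of the same pixel.
    return initial_queries + sinusoidal_encoding(coords, initial_queries.shape[1])

coords = np.array([[10.0, 20.0], [30.0, 5.0]])
initial_queries = np.zeros((2, 256))
target_queries = target_query_features(initial_queries, coords)
print(target_queries.shape)  # (2, 256)
```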

Further, the image features and each target query feature are input into the Transformer decoder in the lane line detection model; the Transformer decoder performs self-attention feature enhancement on each target query feature and cross-attention feature enhancement between the image features and each target query feature, so that the enhanced query feature corresponding to each target pixel point is obtained. That is, for the target query features carrying position encoding information, self-attention is computed among them and cross-attention is computed with the high-level semantic feature F_t, and this operation is repeated several times to obtain the enhanced query features after feature enhancement. The specific procedures and effects are similar to those in the foregoing embodiments and are not repeated here.

In this embodiment, the position vector corresponding to each target pixel point is determined, through the query generator in the lane line detection model, based on the mask segmentation map, wherein the mask segmentation map is used for representing the positions of the pixel points corresponding to the lane lines in the image to be detected; for each target pixel point, the position vector of the target pixel point is spliced with the initial query feature of the target pixel point to obtain a target query feature; the image features and each target query feature are input into the Transformer decoder in the lane line detection model, self-attention feature enhancement is performed on each target query feature through the Transformer decoder, and cross-attention feature enhancement is performed on the image features and each target query feature, to obtain the enhanced query feature corresponding to each target pixel point. In this way, the initial query feature can be enhanced by the position vector of the target pixel point, so that the information represented by the resulting target query feature is more accurate and richer; the enhanced query feature corresponding to each target pixel point can then be obtained through the Transformer decoder based on the image features and each target query feature, and a lane line detection result with higher accuracy is obtained based on the enhanced query features.

Next, a detailed description is given of how to detect the lane lines in the image to be detected based on each enhanced query feature.

In an embodiment, based on each enhanced query feature, the detection of the lane line in the image to be detected may be specifically implemented as follows:

for each enhanced query feature, inputting the enhanced query feature into a lane line detector in the lane line detection model to obtain the predicted lane line corresponding to the enhanced query feature and the prediction confidence of the predicted lane line output by the lane line detector; and detecting the lane lines in the image to be detected based on the predicted lane line corresponding to each enhanced query feature and the prediction confidence of each predicted lane line.

Specifically, as shown in fig. 2, the lane line detection model includes a lane line detector, which may be a network structure for predicting a lane line corresponding to the enhanced query feature, for example, a network structure composed of a plurality of convolution layers, or the like. The lane line detector may be obtained by means of model training.

In an exemplary embodiment, at least one target pixel point may be determined in the image to be detected at a preset interval, so as to determine, based on the determined target pixel points, a predicted lane line corresponding to the target pixel point, where the predicted lane line is a predicted lane line corresponding to the enhanced query feature corresponding to the target pixel point.

Fig. 3 is a schematic diagram of determining target pixel points according to an embodiment of the present invention. As shown in Fig. 3, sampling lines for the target pixel points may be determined along the Y-axis at a fixed interval, i.e., the sampling lines Y_1, Y_2, …, Y_n shown by dotted lines in Fig. 3. The solid lines in Fig. 3 represent the lane lines in the image to be detected, and the dots at the intersections of the dotted lines and the solid lines represent the sampled target pixel points. A target pixel point can be understood as a semantically guided position point, and the coordinates (x, y) of the target pixel point in the image to be detected are the position coordinates of the target pixel point on the lane line. Alternatively, the position coordinates of the target pixel points may be represented by vectors, and the position vector corresponding to each target pixel point may then be determined in the mask segmentation map based on the position coordinates.
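The sampling of target pixel points on evenly spaced horizontal lines can be sketched as follows. This is an illustrative assumption about how the intersections might be found on a binary mask: each run of adjacent lane pixels on a sampling row is treated as one crossing and represented by its centre. The function name `sample_target_pixels` and the toy mask are hypothetical.

```python
import numpy as np

def sample_target_pixels(mask, interval):
    """Pick lane pixels on horizontal sampling lines spaced `interval` rows apart."""
    points = []
    for y in range(0, mask.shape[0], interval):
        xs = np.flatnonzero(mask[y])
        if xs.size == 0:
            continue  # this sampling line does not cross any lane pixel
        # Split the row into runs of adjacent lane pixels, one run per lane crossing.
        runs = np.split(xs, np.flatnonzero(np.diff(xs) > 1) + 1)
        points += [(int(run.mean()), y) for run in runs]  # (x, y) at each run centre
    return points

mask = np.zeros((8, 10), dtype=np.uint8)
mask[:, 2] = 1      # a thin vertical lane line at x = 2
mask[:, 6:9] = 1    # a thicker lane line covering x = 6..8
print(sample_target_pixels(mask, interval=4))  # [(2, 0), (7, 0), (2, 4), (7, 4)]
```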

Illustratively, on the binary mask segmentation map, for each target pixel point, the position of the target pixel point is mapped onto the high-level semantic feature F_t, and the feature of F_t at the corresponding position is the initial query feature corresponding to the target pixel point. After feature enhancement is performed on each initial query feature, the enhanced query features are obtained. For each target pixel point, the lane line detector may take the target pixel point as a reference point and predict N predicted pixel points corresponding to the reference point at a fixed interval in the Y-axis direction, where N may be any positive integer. Since the distance in the Y-axis direction between two adjacent predicted pixel points is a fixed, preset interval value, only the horizontal offset between each predicted pixel point and the reference point, that is, their distance difference in the X-axis direction, needs to be predicted. By smoothly connecting the N predicted pixel points, the predicted lane line corresponding to the target pixel point can be obtained, which is the predicted lane line corresponding to the enhanced query feature output by the lane line detector.
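Decoding a predicted lane line from a reference point and N horizontal offsets can be sketched as below. The direction in which the predicted points are laid out along the Y-axis is not stated in the text; the sketch assumes they extend upwards (decreasing y) from the reference point. The function name `decode_lane` and the numeric values are hypothetical, and the final smooth connection of the points is left out.

```python
import numpy as np

def decode_lane(reference, offsets, interval):
    """reference: (x, y) of the target pixel; offsets: N predicted horizontal offsets."""
    x0, y0 = reference
    offsets = np.asarray(offsets, dtype=float)
    # The k-th predicted point lies k * interval rows from the reference (assumed: upwards).
    ys = y0 - interval * np.arange(1, len(offsets) + 1)
    xs = x0 + offsets  # only the horizontal offset is predicted
    return np.stack([xs, ys], axis=1)  # (N, 2) points to be connected into a lane line

lane = decode_lane(reference=(100, 200), offsets=[2.0, 5.0, 9.0], interval=10)
print(lane.tolist())  # [[102.0, 190.0], [105.0, 180.0], [109.0, 170.0]]
```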

For example, when the lane line detection model is trained, the lane line located at the position coordinates of each enhanced query feature being trained can be assigned to that enhanced query feature, so that supervised training of the model can be realized and the lane line detection model is finally obtained.

Specifically, in the process of training the lane line detection model, the sample target pixel point corresponding to an enhanced query feature being trained can be used as a sample reference point, and N sample truth-value pixel points are determined according to the preset interval distance in the Y-axis direction. The sample truth-value pixel points are the pixel points determined by the sampling lines on the sample lane line in the sample image, so the coordinate values of the N sample truth-value pixel points in the X-axis and Y-axis directions are known. For each sample truth-value pixel point, the difference between its coordinate value in the X-axis direction and the coordinate value in the X-axis direction of the corresponding sample target pixel point is calculated, which gives the horizontal offset truth-value label between the sample truth-value pixel point and the corresponding sample target pixel point. During training, the initial lane line detector in the initial lane line detection model can predict the sample predicted pixel point corresponding to each sample truth-value pixel point, and the coordinate difference in the X-axis direction between a sample predicted pixel point and the corresponding sample target pixel point is the predicted horizontal offset. Each sample predicted pixel point is constrained by the corresponding horizontal offset truth-value label, and the initial lane line detector is trained so that, after iteration, the predicted horizontal offsets it outputs approach the horizontal offset truth-value labels.
In this way, the initial lane line detection model can be trained in a supervised manner using the sample image, the sample truth-value pixel points in the sample image and the horizontal offset truth-value labels, so that the lane line detector in the trained lane line detection model can output the predicted lane line corresponding to an enhanced query feature.
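The construction of the horizontal offset truth-value labels and their use as a training constraint can be sketched as follows. The text does not name a specific loss function; the mean absolute error used here is an assumption, and the names `offset_labels` and `offset_loss` and the numeric values are hypothetical.

```python
import numpy as np

def offset_labels(truth_xs, reference_x):
    # Truth label: X coordinate of each truth pixel minus that of the sample target pixel.
    return np.asarray(truth_xs, dtype=float) - reference_x

def offset_loss(predicted_offsets, labels):
    # Mean absolute error between predicted and true horizontal offsets (assumed loss).
    return float(np.mean(np.abs(np.asarray(predicted_offsets, dtype=float) - labels)))

labels = offset_labels(truth_xs=[102, 105, 109], reference_x=100)
loss = offset_loss([1.0, 5.0, 11.0], labels)
print(labels.tolist(), loss)  # [2.0, 5.0, 9.0] 1.0
```

Minimizing such a loss over training iterations drives the predicted horizontal offsets towards the truth-value labels.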

Meanwhile, the lane line detector can output the prediction confidence of the predicted lane line corresponding to each enhanced query feature, wherein the prediction confidence is used for representing the probability that the predicted lane line is a real lane line. The higher the prediction confidence is, the greater the probability that the predicted lane line corresponding to the prediction confidence is the real lane line is; the smaller the prediction confidence is, the smaller the probability that the predicted lane line corresponding to the prediction confidence is the real lane line is.

Specifically, in the process of training the lane line detection model, for each sample truth-value pixel point on the sample lane line, the distance difference between the sample truth-value pixel point and its corresponding sample predicted pixel point can be calculated; the mean square error over the N sample predicted pixel points is calculated from these distance differences, and the reciprocal of the mean square error may be determined as the prediction confidence of the predicted lane line corresponding to the N sample predicted pixel points. Alternatively, the prediction confidence may be calculated by means of covariance or the like, which is not described in detail here. In the application stage, the enhanced query feature is input into the lane line detector in the lane line detection model, and the lane line detector can then output the predicted lane line corresponding to the enhanced query feature and the prediction confidence of the predicted lane line.
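The reciprocal-of-mean-square-error confidence can be sketched in a few lines. The small constant `eps` is an assumption added here to avoid division by zero when the prediction is exact; the function name `prediction_confidence` and the sample coordinates are hypothetical.

```python
import numpy as np

def prediction_confidence(predicted_xs, truth_xs, eps=1e-6):
    """Reciprocal of the mean squared distance between predicted and truth pixels."""
    diffs = np.asarray(predicted_xs, dtype=float) - np.asarray(truth_xs, dtype=float)
    mse = np.mean(diffs ** 2)
    return float(1.0 / (mse + eps))  # eps (assumed) avoids division by zero

close = prediction_confidence([101, 105, 110], [102, 105, 109])
far = prediction_confidence([90, 120, 95], [102, 105, 109])
print(close > far)  # True
```

A prediction that lies closer to the truth-value pixel points has a smaller mean square error and therefore a larger confidence.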

In one implementation manner, detecting the lane lines in the image to be detected based on the predicted lane line corresponding to each enhanced query feature and the prediction confidence of each predicted lane line may specifically be:

determining the similarity between any two enhanced query features, and dividing the enhanced query features whose similarity is greater than a preset similarity into one group to obtain at least one group; for each group, detecting the predicted lane line corresponding to the enhanced query feature with the maximum prediction confidence in the group as the target lane line of the group; and detecting each target lane line as a lane line in the image to be detected.

Specifically, as shown in fig. 2, the lane line detection model includes a query similarity discriminator, which may be an algorithm module that discriminates and filters each predicted lane line based on the predicted lane line and the prediction confidence of the predicted lane line corresponding to each enhanced query feature.

For example, if the target pixel points are sampled from the same lane line, the prediction targets of the corresponding enhanced query features are highly consistent, so the enhanced query features corresponding to these target pixel points have a high similarity to one another. A correlation module based on a cosine similarity discrimination function can be added to the structure of the lane line detection model; through supervised training, the correlation module can cluster the enhanced query features whose prediction targets are consistent and separate the enhanced query features whose prediction targets are not consistent. The query similarity discriminator can be understood as this correlation module based on the cosine similarity discrimination function.

Each enhanced query feature is input into the query similarity discriminator, the similarity between any two enhanced query features is determined, and the enhanced query features whose similarity is greater than the preset similarity are divided into one group to obtain at least one group. The preset similarity may be a judgment threshold obtained from empirical data, statistical data, test data or the like; it is used for cluster judgment on the similarity between the enhanced query features and may be any value.

For example, when the groups are determined, the group to which each enhanced query feature belongs may be determined from a similarity matrix formed by the cosine similarity values between the enhanced query features. When the query similarity discriminator is trained, the query similarity matrix can be used as a label for supervised training, so as to improve the stability of model training.
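A cosine similarity matrix and threshold-based grouping can be sketched as follows. The text does not say how pairwise decisions are merged into groups; the sketch assumes that queries linked directly or transitively by a similarity above the threshold share a group (a flood fill over the similarity graph). The names `cosine_similarity_matrix` and `group_queries`, the threshold 0.8 and the toy 2-dimensional features are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(features):
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    return unit @ unit.T  # (Q, Q); entry [i, j] is the cosine similarity of queries i and j

def group_queries(features, threshold):
    """Assign a group id to each query; queries linked by similarity > threshold share an id."""
    sim = cosine_similarity_matrix(features)
    n = len(features)
    groups = [-1] * n
    next_id = 0
    for start in range(n):
        if groups[start] != -1:
            continue  # already assigned to an earlier group
        groups[start] = next_id
        stack = [start]
        while stack:  # flood fill over the "similar enough" relation
            i = stack.pop()
            for j in range(n):
                if groups[j] == -1 and sim[i, j] > threshold:
                    groups[j] = next_id
                    stack.append(j)
        next_id += 1
    return groups

features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(group_queries(features, threshold=0.8))  # [0, 0, 1, 1]
```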

As shown in Fig. 2, based on the cluster information provided by the query similarity discriminator, the at least one group obtained can be filtered by a non-maximum suppression (Non-Maximum Suppression, NMS) post-processing algorithm to obtain the final lane line prediction result.

For example, since the prediction confidence represents the probability that a predicted lane line is a real lane line, for each group the NMS post-processing algorithm can detect the predicted lane line corresponding to the enhanced query feature with the maximum prediction confidence in the group as the target lane line of the group. The target lane line has the maximum prediction confidence, so it can be understood as the predicted lane line closest to the real lane line and has the highest reliability; it can therefore be determined as a lane line in the image to be detected, and a lane line detection result with higher accuracy is obtained.
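The per-group selection step can be sketched as keeping, within each group, only the prediction with the highest confidence. This is a simplified stand-in for the NMS post-processing named above; the function name `select_target_lanes` and the placeholder lane identifiers are hypothetical.

```python
def select_target_lanes(groups, confidences, lanes):
    """Keep, for each group, the predicted lane line with the highest confidence."""
    best = {}  # group id -> index of the most confident query seen so far
    for idx, (group, conf) in enumerate(zip(groups, confidences)):
        if group not in best or conf > confidences[best[group]]:
            best[group] = idx
    return [lanes[idx] for idx in best.values()]

groups = [0, 0, 1, 1]                  # group id of each enhanced query feature
confidences = [0.7, 0.9, 0.4, 0.2]     # prediction confidence of each predicted lane line
lanes = ["lane_a", "lane_b", "lane_c", "lane_d"]  # placeholders for predicted lane lines
print(select_target_lanes(groups, confidences, lanes))  # ['lane_b', 'lane_c']
```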

In this embodiment, for each enhanced query feature, the enhanced query feature is input into the lane line detector in the lane line detection model to obtain the predicted lane line corresponding to the enhanced query feature and the prediction confidence of the predicted lane line output by the lane line detector; further, the similarity between any two enhanced query features is determined, and the enhanced query features whose similarity is greater than the preset similarity are divided into one group to obtain at least one group; for each group, the predicted lane line corresponding to the enhanced query feature with the maximum prediction confidence in the group is detected as the target lane line of the group; and each target lane line is detected as a lane line in the image to be detected. In this way, based on enhanced query features that make full use of the semantic prior information of the lane lines in the image to be detected, the lane line detector can predict the corresponding predicted lane lines and their prediction confidences, and the target lane line closest to the real lane line is determined from the plurality of predicted lane lines through reliable grouping and filtering, so that a detection result with higher accuracy is obtained and the accuracy of lane line detection is improved.

The lane line detection method provided by the embodiment of the invention is described below through a specific implementation manner. Fig. 4 is a schematic block diagram of the implementation steps of a lane line detection method according to an embodiment of the present invention. As shown in Fig. 4, an image to be detected is input; the high-level semantic feature F_t of the image to be detected is extracted through the feature extraction network; the high-level semantic feature F_t is input into the semantic segmentation detector to generate a binary mask segmentation map; at least one target pixel point is obtained based on the binary mask segmentation map through up-sampling, wherein the target pixel point is a semantic prior position point, and the initial query feature corresponding to each target pixel point is generated from the corresponding position on the high-level semantic feature F_t; each initial query feature and the high-level semantic feature F_t are input into the Transformer decoder for feature enhancement to obtain the enhanced query feature corresponding to each initial query feature; on the one hand, each enhanced query feature is input into the query similarity discriminator to cluster the enhanced query features; on the other hand, each enhanced query feature is input into the lane line detector to predict a preliminary result, namely the predicted lane line corresponding to each enhanced query feature and the prediction confidence of the predicted lane line; and NMS post-processing is performed based on the cluster information of the query similarity discriminator to filter out the predicted lane lines with lower credibility, so that a detection result with higher accuracy is obtained.

In this embodiment, the binary mask segmentation map is used to provide semantically guided position points from which the initial query features are initialized, so that lane line semantic prior information serves as guidance and the convergence speed of the model is improved; the query similarity discriminator assists the NMS post-processing to obtain a detection result with higher accuracy. In this scheme, the semantically guided position points provide position encoding information in the Transformer decoder, so that attention can be paid to the position information and global information of each target pixel point and its surroundings, and the generalization capability of the model when visual cues of the lane lines are lacking can be improved.

The lane line detection apparatus provided in the embodiments of the present invention will be described below, and the lane line detection apparatus described below and the lane line detection method described above may be referred to correspondingly.

Fig. 5 is a schematic structural diagram of a lane line detection apparatus according to an embodiment of the present invention, and referring to fig. 5, a lane line detection apparatus 500 includes:

the input unit 510 is configured to input an image to be detected into a feature extractor in the lane line detection model, so as to obtain image features of the image to be detected output by the feature extractor;

the extracting unit 520 is configured to extract an initial query feature corresponding to at least one target pixel in the image features, where the target pixel is a pixel corresponding to a lane line in the image to be detected;

The enhancing unit 530 is configured to perform feature enhancement on each initial query feature based on the image feature and each initial query feature, to obtain enhanced query features corresponding to each target pixel point, where the enhanced query features are used to characterize global feature information of the target pixel point in the image to be detected;

and a detection unit 540, configured to detect a lane line in the image to be detected based on each enhanced query feature.

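In an example embodiment, the enhancement unit 530 is specifically configured to:

inputting the image features and each initial query feature into a Transformer decoder in the lane line detection model, performing self-attention feature enhancement on each initial query feature through the Transformer decoder, and performing cross-attention feature enhancement on the image features and each initial query feature, to obtain the enhanced query feature corresponding to each target pixel point.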

In an example embodiment, the extraction unit 520 is specifically configured to:

inputting the image features into a semantic segmentation detector in a lane line detection model to obtain a mask segmentation map output by the semantic segmentation detector, wherein the mask segmentation map is used for representing the positions of corresponding pixel points of the lane lines in the image to be detected;

determining the position of each target pixel point in the image to be detected based on the mask segmentation map through a query generator;

for each target pixel point, extracting feature information corresponding to the position of the target pixel point from the image feature based on the position of the target pixel point, and determining the feature information as an initial query feature corresponding to the target pixel point.

In an example embodiment, the enhancement unit 530 is specifically configured to:

determining, through a query generator in the lane line detection model, the position vector corresponding to each target pixel point based on a mask segmentation map, wherein the mask segmentation map is used for representing the positions of the pixel points corresponding to the lane lines in the image to be detected;

for each target pixel point, splicing the position vector of the target pixel point with the initial query feature of the target pixel point to obtain a target query feature;

inputting the image features and each target query feature into a Transformer decoder in the lane line detection model, performing self-attention feature enhancement on each target query feature through the Transformer decoder, and performing cross-attention feature enhancement on the image features and each target query feature, to obtain the enhanced query feature corresponding to each target pixel point.

In an example embodiment, the detection unit 540 is specifically configured to:

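for each enhanced query feature, inputting the enhanced query feature into a lane line detector in the lane line detection model to obtain the predicted lane line corresponding to the enhanced query feature and the prediction confidence of the predicted lane line output by the lane line detector;

and detecting the lane lines in the image to be detected based on the predicted lane line corresponding to each enhanced query feature and the prediction confidence of each predicted lane line.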

In an example embodiment, the detection unit 540 is specifically configured to:

determining the similarity between any two enhanced query features, and dividing the enhanced query features whose similarity is greater than a preset similarity into one group to obtain at least one group;

for each group, detecting the predicted lane line corresponding to the enhanced query feature with the maximum prediction confidence in the group as the target lane line of the group;

and detecting each target lane line as a lane line in the image to be detected.

The apparatus of the present embodiment may be used to execute the method of any one of the embodiments of the lane line detection method, and its specific implementation process and technical effects are similar to those of the embodiments of the lane line detection method, and specific reference may be made to the detailed description of the embodiments of the lane line detection method, which is not repeated herein.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 6, the electronic device may include: processor 610, communication interface (Communications Interface) 620, memory 630, and communication bus 640, wherein processor 610, communication interface 620, and memory 630 communicate with each other via communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a lane line detection method comprising: inputting the image to be detected into a feature extractor in the lane line detection model to obtain the image features of the image to be detected output by the feature extractor; extracting initial query characteristics corresponding to at least one target pixel point from the image characteristics, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; based on the image characteristics and each initial query characteristic, carrying out characteristic enhancement on each initial query characteristic to obtain an enhanced query characteristic corresponding to each target pixel point, wherein the enhanced query characteristic is used for representing global characteristic information of the target pixel point in the image to be detected; and detecting a lane line in the image to be detected based on each enhanced query characteristic.

Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the lane line detection method provided by the above methods, the method comprising: inputting an image to be detected into a feature extractor in a lane line detection model to obtain image features of the image to be detected output by the feature extractor; extracting, from the image features, an initial query feature corresponding to at least one target pixel point, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; performing feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature represents global feature information of the target pixel point in the image to be detected; and detecting the lane line in the image to be detected based on each enhanced query feature.

In yet another aspect, an embodiment of the present invention further provides a computer program product including a computer program that can be stored on a non-transitory computer-readable storage medium. When executed by a processor, the computer program is capable of performing the lane line detection method provided by the above methods, the method comprising: inputting an image to be detected into a feature extractor in a lane line detection model to obtain image features of the image to be detected output by the feature extractor; extracting, from the image features, an initial query feature corresponding to at least one target pixel point, wherein the target pixel point is a pixel point corresponding to a lane line in the image to be detected; performing feature enhancement on each initial query feature based on the image features and each initial query feature to obtain an enhanced query feature corresponding to each target pixel point, wherein the enhanced query feature represents global feature information of the target pixel point in the image to be detected; and detecting the lane line in the image to be detected based on each enhanced query feature.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or alternatively by means of hardware. Based on this understanding, the foregoing technical solution, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.

Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A lane line detection method, characterized by comprising:

2. The lane line detection method according to claim 1, wherein the feature enhancing the initial query features based on the image features and the initial query features to obtain enhanced query features corresponding to the target pixel points, includes:

3. The lane line detection method according to claim 1, wherein the extracting an initial query feature corresponding to at least one target pixel point from the image features includes:

4. The lane line detection method according to claim 3, wherein the inputting the mask segmentation map and the image feature into a query generator in the lane line detection model, to obtain initial query features corresponding to each of the target pixel points output by the query generator, includes:

5. The lane line detection method according to claim 1, wherein the feature enhancing the initial query features based on the image features and the initial query features to obtain enhanced query features corresponding to the target pixel points, includes:

6. The lane line detection method according to any one of claims 1 to 5, wherein the detecting the lane line in the image to be detected based on each of the enhanced query features includes:

7. The lane line detection method according to claim 6, wherein the detecting the lane line in the image to be detected based on the predicted lane line corresponding to each of the enhanced query features and the prediction confidence of the predicted lane line comprises:

and detecting each target lane line as a lane line in the image to be detected.

8. A lane line detection apparatus, comprising:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the lane line detection method of any one of claims 1 to 7 when the computer program is executed by the processor.

10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the lane line detection method according to any one of claims 1 to 7.