CN114743126A - Lane line marking segmentation method based on graph attention mechanism network - Google Patents

Lane line marking segmentation method based on graph attention mechanism network

Info

Publication number
CN114743126A
CN114743126A (application CN202210224636.9A)
Authority
CN
China
Prior art keywords
road condition
feature map
initial
condition image
marking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210224636.9A
Other languages
Chinese (zh)
Inventor
张雯玮
杜杰儒
赵子铭
何为
李凤荣
张质懿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hansuo Information Technology Co ltd
Original Assignee
Shanghai Hansuo Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hansuo Information Technology Co ltd filed Critical Shanghai Hansuo Information Technology Co ltd
Priority to CN202210224636.9A priority Critical patent/CN114743126A/en
Publication of CN114743126A publication Critical patent/CN114743126A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a lane marking segmentation method based on a graph attention mechanism network, comprising the following steps: step S1, constructing a traffic marking segmentation model based on the graph attention mechanism network; step S2, collecting a lane marking data set, and training the traffic marking segmentation model on the collected data set to obtain a trained traffic marking segmentation model; step S3, acquiring an initial road condition image, and preprocessing the acquired image to obtain a preprocessed road condition image; and step S4, inputting the preprocessed road condition image into the trained traffic marking segmentation model, obtaining the segmented lane markings, and marking them in the initial road condition image. The invention fully fuses the global features of the image with the features of each sub-region, both spatially and semantically, while screening features by importance, and can therefore accurately segment lane marking information in complex scenes.

Description

Lane line marking segmentation method based on graph attention mechanism network
Technical Field
The invention relates to the technical field of traffic marking segmentation, and in particular to a lane line marking segmentation method based on a graph attention mechanism network.
Background
The growing number of automobiles increases traffic hazards, in part because drivers and vehicles do not recognize traffic marking information well. Traffic marking segmentation technology can provide the driver with good traffic information and assist the driver's judgment, thereby reducing the traffic accident rate; meanwhile, as an important part of environment perception, lane line recognition can provide good lane marking information to a running vehicle and improve the driving experience.
Current lane marking segmentation methods fall into two categories. The first is based on traditional machine learning algorithms such as AdaBoost and support vector machines; these methods mainly rely on an online-trained classifier to distinguish the target from the background, and then use the classifier to locate the target within candidate regions. The second is based on deep learning algorithms such as convolutional neural networks, which are first trained offline on a large-scale lane marking data set and then used to segment the lane markings. The machine learning methods are highly sensitive to illumination and struggle to achieve good recognition under shadow, low brightness, occlusion and motion blur. Deep learning algorithms, relying on strong feature representation capability, can cope with motion blur, occlusion and similar problems, and far exceed traditional machine learning algorithms in segmentation accuracy.
However, current deep learning algorithms focus only on image feature expression and struggle with the interference caused by illumination changes and occlusion. As a result, when lane markings appear in a complex scene, the boundary between foreground and background is not distinct, and the lane marking information cannot be segmented accurately.
Disclosure of Invention
To solve the above problems in the prior art, the invention provides a lane marking segmentation method based on a graph attention mechanism network that can accurately segment lane marking information in complex scenes.
The lane line marking segmentation method based on a graph attention mechanism network provided by the invention comprises the following steps:
step S1, constructing a traffic marking segmentation model based on the graph attention mechanism network;
step S2, collecting a lane marking data set, and training the traffic marking segmentation model according to the collected lane marking data set to obtain a trained traffic marking segmentation model;
step S3, acquiring an initial road condition image, and preprocessing the acquired initial road condition image to obtain a preprocessed road condition image;
and step S4, inputting the preprocessed road condition image into the trained traffic marking segmentation model, obtaining the segmented lane markings, and marking them in the initial road condition image.
Further, the traffic marking segmentation model comprises an initial feature extraction module, a pyramid pooling module, the graph attention mechanism network, an upsampling module and a feed-forward neural network connected in sequence; the upsampling module is additionally connected with a fusion neural network.
Further, the loss function of the traffic marking segmentation model is expressed as:

$L_{total} = \alpha L_{cls} + \beta L_{seg}$

$L_{cls} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ y_i \log x_i + (1 - y_i) \log(1 - x_i) \right]$

$L_{seg} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ \hat{y}_i \log \hat{x}_i + (1 - \hat{y}_i) \log(1 - \hat{x}_i) \right]$

where $L_{total}$ is the total loss of the traffic marking segmentation model, $\alpha$ and $\beta$ are weighting coefficients, $L_{cls}$ is the classification loss, and $L_{seg}$ is the segmentation loss; $w_i$ is a weight matrix, $y_i$ is the ground-truth classification result, $x_i$ is the model's classification prediction, $\hat{y}_i$ is the ground-truth segmentation result, and $\hat{x}_i$ is the model's segmentation prediction.
Further, the step S3 includes:
step S31, shooting a road condition video through a vehicle-mounted camera, and extracting frames of the road condition video according to a set frame rate to obtain an initial road condition image;
and step S32, adjusting the size of the initial road condition image, and enhancing the resized image to obtain the preprocessed road condition image.
Further, in step S32, the resized initial road condition image is enhanced using RetinexNet.
Further, the step S4 includes:
step S41, inputting the preprocessed road condition image into the initial feature extraction module to obtain an initial feature map;
step S42, inputting the initial feature map into the pyramid pooling module to obtain a global pooling feature map and a plurality of sub-region pooling feature maps;
step S43, inputting the global pooling feature map and the sub-region pooling feature maps into the graph attention mechanism network to obtain a multi-head attention feature map;
step S44, reducing the dimension of the multi-head attention feature map through the fusion neural network, inputting the reduced feature map into the up-sampling module to obtain an up-sampled feature map, and then concatenating the up-sampled feature map with the initial feature map to obtain a fusion feature map;
and step S45, classifying each pixel of the fusion feature map through the feed-forward neural network to obtain the segmented lane markings.
The invention fully fuses the global features of the image with the features of each sub-region, both spatially and semantically, while screening features by importance, and can therefore accurately segment lane marking information in complex scenes.
Drawings
Fig. 1 is a flowchart of a lane marking segmentation method based on a graph attention mechanism network according to the present invention.
Detailed Description
Preferred embodiments of the present invention are described in detail below in conjunction with the accompanying drawings.
As shown in fig. 1, the lane marking segmentation method based on a graph attention mechanism network provided by the invention comprises the following steps:
and step S1, constructing a traffic sign segmentation model based on the graph attention machine mechanism network. Specifically, the traffic sign segmentation model comprises an initial feature extraction module, a pyramid pooling module, a graph attention machine mechanism network, an upsampling module and a feedforward neural network which are connected in sequence, wherein the upsampling module is connected with the fusion neural network.
The loss function of the traffic marking segmentation model is:

$L_{total} = \alpha L_{cls} + \beta L_{seg}$

where $L_{total}$ is the total loss of the traffic marking segmentation model, $\alpha$ and $\beta$ are weighting coefficients, $L_{cls}$ is the classification loss, and $L_{seg}$ is the segmentation loss.

$L_{cls} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ y_i \log x_i + (1 - y_i) \log(1 - x_i) \right]$

$L_{seg} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ \hat{y}_i \log \hat{x}_i + (1 - \hat{y}_i) \log(1 - \hat{x}_i) \right]$

where $w_i$ is a weight matrix, $y_i$ is the ground-truth classification result, $x_i$ is the model's classification prediction, $\hat{y}_i$ is the ground-truth segmentation result, and $\hat{x}_i$ is the model's segmentation prediction. Here $\hat{y}_i$ and $\hat{x}_i$ comprise the predicted bounding-box center coordinates, width and height, and the contour information of the detected target.
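For illustration, a minimal PyTorch sketch of such a combined loss, assuming both terms take the weighted binary cross-entropy form given above; the class name, default coefficients, and optional per-element weight are illustrative, not from the patent.

```python
import torch.nn as nn
import torch.nn.functional as F

# A sketch of L_total = alpha * L_cls + beta * L_seg, assuming both terms
# are weighted binary cross-entropy as in the formulas above.
class CombinedLoss(nn.Module):
    def __init__(self, alpha=1.0, beta=1.0):
        super().__init__()
        self.alpha = alpha  # weight of the classification loss L_cls
        self.beta = beta    # weight of the segmentation loss L_seg

    def forward(self, cls_pred, cls_true, seg_pred, seg_true, w=None):
        # l_i = -w_i [ y_i log x_i + (1 - y_i) log(1 - x_i) ]
        l_cls = F.binary_cross_entropy(cls_pred, cls_true, weight=w)
        # same form with the ground-truth and predicted segmentation maps
        l_seg = F.binary_cross_entropy(seg_pred, seg_true, weight=w)
        return self.alpha * l_cls + self.beta * l_seg
```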
Step S2, collecting a lane marking data set, and training the traffic marking segmentation model constructed in step S1 on the collected data set to obtain the trained traffic marking segmentation model. During training, the lane marking data set is automatically divided into a training set, a test set and a validation set according to a preset split ratio, and the traffic marking segmentation model is trained on the resulting sets.
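A minimal sketch of such a split in PyTorch follows; the 8:1:1 ratio stands in for the preset split ratio and is an assumption, not a value from the patent.

```python
import torch
from torch.utils.data import random_split

# Split a lane-marking dataset into train/test/validation subsets.
# The ratios and the fixed seed are illustrative choices.
def split_dataset(dataset, ratios=(0.8, 0.1, 0.1)):
    n = len(dataset)
    n_train = int(ratios[0] * n)
    n_test = int(ratios[1] * n)
    n_val = n - n_train - n_test  # remainder goes to validation
    return random_split(dataset, [n_train, n_test, n_val],
                        generator=torch.Generator().manual_seed(0))
```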
Step S3, acquiring an initial road condition image, and preprocessing the acquired initial road condition image to obtain a preprocessed road condition image.
Specifically, step S3 includes:
and step S31, shooting the road condition video through the vehicle-mounted camera, and extracting frames from the road condition video according to a set frame rate to obtain a plurality of images, wherein the images form an initial road condition image.
Step S32, adjusting the size of the initial road condition image, and enhancing the resized image to obtain the preprocessed road condition image.
Specifically, the initial road condition images are resized to a uniform 512 × 512 pixels. In addition, because images captured under weak illumination, such as in fog, rain or at night, are of poor quality, with low contrast, severe loss of detail and visibility that fails to meet requirements, low-illumination images need to be enhanced. In this embodiment the images are enhanced with RetinexNet, an effective dim-light image enhancement tool whose principle is as follows: first, the image is decomposed by a decomposition network; second, on the basis of this decomposition, an enhancement network adjusts the illumination map of the image; finally, the reflectance is denoised by joint denoising.
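A sketch of the resizing and enhancement step; RetinexNet itself is not reproduced here, so the `enhance` callable is a hypothetical stand-in for a pretrained decomposition/enhancement/denoising model.

```python
import cv2

# Resize to 512x512 as in the text, then optionally apply a low-light
# enhancement model (e.g. a RetinexNet wrapper) for fog/rain/night frames.
def preprocess(frame, enhance=None):
    resized = cv2.resize(frame, (512, 512), interpolation=cv2.INTER_LINEAR)
    if enhance is not None:         # `enhance` is a hypothetical stand-in
        resized = enhance(resized)  # low-light enhancement
    return resized
```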
Step S4, inputting the preprocessed road condition image into the trained traffic marking segmentation model, obtaining the segmented lane markings, and marking them in the initial road condition image.
Specifically, step S4 includes:
and step S41, inputting the preprocessed road condition image into an initial feature extraction module of the traffic sign segmentation model to obtain an initial feature map. Specifically, the initial feature extraction module is of a residual network structure, and obtains an extracted initial feature map through multilayer convolution operation and a residual structure, and the initial feature map can be used for representing semantic and spatial feature information of road condition image information and possibly contains edge information or detail feature information of different targets in a road condition.
Step S42, inputting the initial feature map into the pyramid pooling module of the traffic marking segmentation model to obtain a global pooling feature map and a plurality of sub-region pooling feature maps. In the pyramid pooling module, the initial feature map is partitioned into sub-regions at several scales, for example 1 × 1, 2 × 2, 3 × 3, 6 × 6 and 12 × 12, and a global average pooling operation is applied to the sub-regions to obtain the pooled feature map for each scale. The global pooling feature map is the pooled feature map of the entire image and carries the background information of the whole image.
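A minimal PyTorch sketch of this pooling step, using the scales named above; the 1 × 1 branch corresponds to the global pooling feature map.

```python
import torch.nn as nn
import torch.nn.functional as F

# Pyramid pooling over the initial feature map at the scales from the
# text; the pixels of each pooled map later serve as graph nodes.
class PyramidPooling(nn.Module):
    def __init__(self, scales=(1, 2, 3, 6, 12)):
        super().__init__()
        self.scales = scales

    def forward(self, x):
        # x: (B, C, H, W) initial feature map. Each s x s output is the
        # global average pooling of the corresponding sub-regions.
        return [F.adaptive_avg_pool2d(x, output_size=s) for s in self.scales]
```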
Step S43, inputting the global pooling feature map and the sub-region pooling feature maps into the graph attention mechanism network of the traffic marking segmentation model to obtain a multi-head attention feature map that represents the semantics of the targets in each sub-region and the interrelation of the sub-regions' spatial structures.
A graph neural network is a neural network structure that operates on graph-structured data. The graph attention mechanism network introduces a multi-head attention mechanism on top of a graph neural network; its computation proceeds as follows:
1) Extract the pixels of the pooled feature maps as nodes, and apply a feature transformation to all nodes that converts the pooled channel number $F$ into the reduced channel number $F'$:

$z_j = W h_j$

where $z_j$ is the transformed feature, $h_j$ is the original input feature of node $j$ in the graph attention network, and $W \in \mathbb{R}^{F' \times F}$ is a weight matrix that carries out the feature transformation for all nodes.
2) Introduce a multi-head attention mechanism on the graph structure. The graph attention network uses self-attention with a shared attention function, computed as:

$e_{ij} = W_i h_i \,\Vert\, W_j h_j$

where $e_{ij}$ is the degree of contribution of the feature of node $j$ to node $i$, $W_i$ is the weight matrix of node $i$, $h_i$ is the original input feature of node $i$, "$\Vert$" denotes vector concatenation, $W_j$ is the weight matrix of node $j$, and $h_j$ is the original input feature of node $j$.
3) Normalize the contribution of each adjacent node $k$ so that the weights are better distributed:

$\alpha_{ij} = \dfrac{\exp\left(\mathrm{LeakyReLU}(e_{ij})\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}(e_{ik})\right)}$

In this step the contributions are activated with a LeakyReLU activation function. Here $\alpha_{ij}$ is the normalized correlation of node $i$ and node $j$; $N_i$ is the set of first-order neighbor nodes of node $i$ (the nodes directly connected to node $i$); $e_{ik}$ is the correlation of node $i$ and node $k$; and $k$ ranges over the first-order neighbors of node $i$.
4) After the contribution of every neighbor of node $i$ has been computed, the features of all neighbors of node $i$ are summed according to these weights, and the sum is taken as the final output of node $i$. To stabilize the learning of the attention mechanism, a multi-head attention mechanism is adopted:

$h_i' = \big\Vert_{k=1}^{K} \, \sigma\left( \sum_{j \in N_i} \alpha_{ij}^{k} W^{k} h_j \right)$

where $h_i'$ is the output of the graph attention network for node $i$, $K$ is the number of attention heads, "$\Vert$" denotes concatenation of the results of the $K$ heads, $\sigma$ is a sigmoid activation function, $\alpha_{ij}^{k}$ is the attention value (correlation) of the $k$-th head for node $i$ and node $j$, $W^{k}$ is the transformation matrix of the $k$-th head, and $h_j$ is the input to the graph attention network.
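A sketch of one such multi-head graph attention layer in PyTorch. It follows the formulas above, completing the concatenation-based score with a shared attention vector (the standard graph attention formulation) and, for simplicity, letting every node attend to every other node; both choices are assumptions, not details from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Multi-head graph attention: shared linear transform z_j = W h_j,
# concatenation-based scores e_ij, LeakyReLU before softmax, and K
# concatenated heads passed through a sigmoid, as in the text.
class GraphAttentionLayer(nn.Module):
    def __init__(self, f_in, f_out, heads=4):
        super().__init__()
        self.heads = heads
        self.W = nn.Linear(f_in, f_out * heads, bias=False)   # per-head feature transform
        self.a = nn.Parameter(torch.empty(heads, 2 * f_out))  # shared attention function
        nn.init.xavier_uniform_(self.a)

    def forward(self, h):
        # h: (N, F) node features; fully connected adjacency for simplicity.
        n = h.size(0)
        z = self.W(h).view(n, self.heads, -1)                  # (N, K, F')
        zi = z.unsqueeze(1).expand(n, n, self.heads, z.size(-1))
        zj = z.unsqueeze(0).expand(n, n, self.heads, z.size(-1))
        # e_ij from the concatenated pair [z_i || z_j], projected by a
        e = torch.einsum("ijhf,hf->ijh", torch.cat([zi, zj], dim=-1), self.a)
        alpha = torch.softmax(F.leaky_relu(e), dim=1)          # normalize over neighbors j
        out = torch.einsum("ijh,jhf->ihf", alpha, z)           # weighted sum of neighbor features
        return torch.sigmoid(out.reshape(n, -1))               # concatenate the K heads

# Example use (batch size 1 assumed): flatten pooled-map pixels into nodes.
# nodes = torch.cat([p.flatten(2).squeeze(0).t() for p in pooled_maps])  # (N, C)
# gat = GraphAttentionLayer(f_in=nodes.size(1), f_out=32, heads=4)
# out = gat(nodes)
```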
Step S44, reducing the dimension of the multi-head attention feature map obtained in step S43 through the fusion neural network of the traffic marking segmentation model, inputting the reduced feature map into the up-sampling module of the model to obtain an up-sampled feature map of the same size as the initial feature map obtained in step S41, and then concatenating the up-sampled feature map with the initial feature map to obtain a fusion feature map.
The purpose of reducing the dimension of the multi-head attention feature map is to fuse the multi-head attention features, letting the network better learn the relationships among them and extract the important features.
Step S45, classifying each pixel of the fusion feature map through the feed-forward neural network of the traffic marking segmentation model to obtain the segmented lane markings. Specifically, for each pixel of the fusion feature map the network judges whether it belongs to the predicted target; if so, the pixel is labeled as lane marking, otherwise as background. In this way a complete lane marking shape is finally formed.
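A sketch of steps S44 and S45 combined: a 1 × 1 convolution stands in for the fusion neural network, followed by upsampling, concatenation with the initial feature map, and a per-pixel feed-forward classifier. All channel counts are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Dimension reduction -> upsampling -> fusion -> per-pixel classification.
class FusionHead(nn.Module):
    def __init__(self, attn_channels, init_channels, hidden=64):
        super().__init__()
        self.reduce = nn.Conv2d(attn_channels, hidden, kernel_size=1)  # fusion network (dim reduction)
        self.classifier = nn.Sequential(                               # feed-forward per-pixel head
            nn.Conv2d(hidden + init_channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 1, kernel_size=1),                       # lane-marking vs background logit
        )

    def forward(self, attn_map, init_map):
        x = self.reduce(attn_map)
        x = F.interpolate(x, size=init_map.shape[2:],                  # match the initial feature map
                          mode="bilinear", align_corners=False)
        fused = torch.cat([x, init_map], dim=1)                        # fusion feature map
        return torch.sigmoid(self.classifier(fused))                   # per-pixel probability
```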
The invention fully fuses the global features of the image with the features of each sub-region, both spatially and semantically, while screening features by importance, and can therefore accurately segment lane marking information in complex scenes. Compared with traditional purely vision-based semantic segmentation methods, the method of the invention does not need to build large amounts of carefully designed, highly task-specific expert knowledge into the algorithm, and is simpler and more convenient to implement. In addition, the invention can be applied to other computer vision scenarios, has strong extensibility, and can be integrated into more general intelligent systems.
The above embodiments are merely preferred embodiments of the present invention and are not intended to limit its scope; various changes may be made to them. All simple equivalent changes and modifications made according to the claims and the description of the present application fall within the scope of the claims of this patent application. Conventional technical content is not described in detail herein.

Claims (6)

1. A lane marking segmentation method based on a graph attention mechanism network, characterized by comprising the following steps:
step S1, constructing a traffic marking segmentation model based on the graph attention mechanism network;
step S2, collecting a lane marking data set, and training the traffic marking segmentation model according to the collected lane marking data set to obtain a trained traffic marking segmentation model;
step S3, acquiring an initial road condition image, and preprocessing the acquired initial road condition image to obtain a preprocessed road condition image;
and step S4, inputting the preprocessed road condition image into the trained traffic marking segmentation model, obtaining the segmented lane markings, and marking them in the initial road condition image.
2. The lane marking segmentation method based on the graph attention mechanism network according to claim 1, wherein the traffic marking segmentation model comprises an initial feature extraction module, a pyramid pooling module, the graph attention mechanism network, an upsampling module and a feed-forward neural network connected in sequence; the upsampling module is additionally connected with a fusion neural network.
3. The lane marking segmentation method based on the graph attention mechanism network according to claim 1, wherein the loss function of the traffic marking segmentation model is expressed as:

$L_{total} = \alpha L_{cls} + \beta L_{seg}$

$L_{cls} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ y_i \log x_i + (1 - y_i) \log(1 - x_i) \right]$

$L_{seg} = \{l_1, l_2, \ldots, l_N\}, \quad l_i = -w_i \left[ \hat{y}_i \log \hat{x}_i + (1 - \hat{y}_i) \log(1 - \hat{x}_i) \right]$

where $L_{total}$ is the total loss of the traffic marking segmentation model, $\alpha$ and $\beta$ are weighting coefficients, $L_{cls}$ is the classification loss, and $L_{seg}$ is the segmentation loss; $w_i$ is a weight matrix, $y_i$ is the ground-truth classification result, $x_i$ is the model's classification prediction, $\hat{y}_i$ is the ground-truth segmentation result, and $\hat{x}_i$ is the model's segmentation prediction.
4. The lane marking segmentation method based on the graph attention mechanism network according to claim 1, wherein the step S3 includes:
step S31, shooting road condition videos through a vehicle-mounted camera, and extracting frames of the road condition videos according to a set frame rate to obtain initial road condition images;
and step S32, adjusting the size of the initial road condition image, and enhancing the initial road condition image after the size is adjusted to obtain a road condition image after preprocessing.
5. The lane marking segmentation method based on the graph attention mechanism network according to claim 4, wherein in step S32 the initial road condition image is enhanced using RetinexNet.
6. The lane marking segmentation method based on the graph attention mechanism network according to claim 2, wherein the step S4 comprises:
step S41, inputting the preprocessed road condition image into the initial feature extraction module to obtain an initial feature map;
step S42, inputting the initial feature map into the pyramid pooling module to obtain a global pooling feature map and a plurality of sub-region pooling feature maps;
step S43, inputting the global pooling feature map and the sub-region pooling feature maps into the graph attention mechanism network to obtain a multi-head attention feature map;
step S44, reducing the dimension of the multi-head attention feature map through the fusion neural network, inputting the reduced feature map into the up-sampling module to obtain an up-sampled feature map, and then concatenating the up-sampled feature map with the initial feature map to obtain a fusion feature map;
and step S45, classifying each pixel of the fusion feature map through the feed-forward neural network to obtain the segmented lane markings.
CN202210224636.9A 2022-03-09 2022-03-09 Lane line marking segmentation method based on graph attention mechanism network Pending CN114743126A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210224636.9A CN114743126A (en) 2022-03-09 Lane line marking segmentation method based on graph attention mechanism network

Publications (1)

Publication Number Publication Date
CN114743126A (en) 2022-07-12

Family

ID=82275305

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210224636.9A Pending CN114743126A (en) Lane line marking segmentation method based on graph attention mechanism network

Country Status (1)

Country Link
CN (1) CN114743126A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294548A (en) * 2022-07-28 2022-11-04 烟台大学 Lane line detection method based on position selection and classification method in row direction
CN115294548B (en) * 2022-07-28 2023-05-02 烟台大学 Lane line detection method based on position selection and classification method in row direction
CN116071374A (en) * 2023-02-28 2023-05-05 华中科技大学 Lane line instance segmentation method and system
CN116071374B (en) * 2023-02-28 2023-09-12 华中科技大学 Lane line instance segmentation method and system

Similar Documents

Publication Publication Date Title
CN106845478B (en) A kind of secondary licence plate recognition method and device of character confidence level
Roychowdhury et al. Machine learning models for road surface and friction estimation using front-camera images
CN110728200B (en) Real-time pedestrian detection method and system based on deep learning
CN108875608B (en) Motor vehicle traffic signal identification method based on deep learning
CN111461039B (en) Landmark identification method based on multi-scale feature fusion
CN111723693B (en) Crowd counting method based on small sample learning
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN113902915A (en) Semantic segmentation method and system based on low-illumination complex road scene
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN113255659B (en) License plate correction detection and identification method based on MSAFF-yolk 3
CN108416780B (en) Object detection and matching method based on twin-region-of-interest pooling model
CN114743126A (en) 2022-07-12 Lane line marking segmentation method based on graph attention mechanism network
CN111160407A (en) Deep learning target detection method and system
CN112861970B (en) Fine-grained image classification method based on feature fusion
CN111461213A (en) Training method of target detection model and target rapid detection method
CN113344932A (en) Semi-supervised single-target video segmentation method
CN113610144A (en) Vehicle classification method based on multi-branch local attention network
CN112686242B (en) Fine-grained image classification method based on multilayer focusing attention network
CN112801182A (en) RGBT target tracking method based on difficult sample perception
CN110889360A (en) Crowd counting method and system based on switching convolutional network
Liang et al. Cross-scene foreground segmentation with supervised and unsupervised model communication
CN116883650A (en) Image-level weak supervision semantic segmentation method based on attention and local stitching
CN116129291A (en) Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device
CN111368845A (en) Feature dictionary construction and image segmentation method based on deep learning
CN106650814B (en) Outdoor road self-adaptive classifier generation method based on vehicle-mounted monocular vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination