CN118193628A - Semantic enhancement graph contrast learning method and system for road network representation
- Publication number
- CN118193628A (application number CN202410299789.9A)
- Authority
- CN
- China
- Prior art keywords
- graph
- road network
- road
- representation
- embedding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/26—Visual data mining; Browsing structured data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a semantically enhanced graph contrastive learning method and system for road network representation, and belongs to the technical field of traffic data analysis and machine learning. It provides a semantically enhanced graph contrastive learning framework that combines road attributes and visual information through multimodal feature embedding in order to mine and exploit the features of road network data. The framework constructs a new road network data representation that is rich in high-dimensional information and can be used directly as the input of a machine learning model, which makes road-network-based downstream tasks such as road label classification and speed prediction easier to implement. By incorporating graph contrastive learning and simulating topological changes and missing data in the road network, the scheme improves the robustness and generalization ability of the road network representation, addresses the limitations of traditional methods in complex road network analysis, and improves the efficiency of road network data processing.
Description
Technical Field
The invention belongs to the technical field of traffic data analysis and machine learning, and relates to a semantically enhanced graph contrastive learning method and system for road network representation.
Background
With accelerating urbanization, urban traffic networks have become increasingly complex and changeable, and traditional road network analysis methods face growing challenges. These methods often rely on simplified mathematical models and limited data sources, which makes it difficult to capture and explain the complex dynamics and diversity of road networks. In urban planning and traffic management, for example, a deep understanding of traffic flow, congestion patterns and accident risk is crucial, yet existing techniques often cannot effectively process and analyze large-scale, high-dimensional road network data.
The types and volume of road network data have also changed significantly. From vehicle GPS trajectories to real-time traffic monitoring, the data are richer and more diverse than ever, which creates new challenges for processing and analysis. Existing tools struggle with data at this scale, especially where real-time monitoring of traffic flow, prediction of traffic trends, and response to traffic emergencies are required. In real-time traffic monitoring, traditional data analysis methods cannot immediately reflect changes in traffic flow; in road condition prediction, the lack of in-depth analysis of large data sets often prevents accurate prediction of future conditions; in traffic accident response, traditional methods show clear deficiencies in the timeliness and accuracy of data processing.
Given the complex and changeable nature of modern urban traffic networks, new analysis methods are therefore urgently needed to better understand and manage these systems. Such methods should cope effectively with large-scale, high-dimensional traffic data. This includes not only processing existing data efficiently, but also simulating and predicting topological changes and missing data in the traffic network. Such progress would support urban planning and traffic management more comprehensively and accurately, and thereby improve the efficiency and safety of urban traffic systems.
Summary of the Invention
In view of this, the purpose of the invention is to provide a semantically enhanced graph contrastive learning method and system for road network representation. The method and system provide a framework called SE-GCL (Semantic Enhanced Graph Contrastive Learning), which combines road attributes and visual information through multimodal feature embedding in order to mine and fully exploit the features of road network data. It constructs a new road network data representation that is rich in high-dimensional information and can be used directly as the input of a machine learning model. This design makes road-network-based downstream tasks, such as road label classification, speed prediction and travel time prediction, easier to implement. SE-GCL incorporates graph contrastive learning: by simulating topological changes and missing data in the road network, it improves the robustness and generalization ability of the road network representation and addresses the limitations of traditional methods in complex road network analysis.
To achieve the above purpose, the invention provides the following technical solution:
A semantically enhanced graph contrastive learning method for road network representation, which constructs a semantically enhanced graph contrastive learning framework for road network representation and comprises the following stages:
Stage 1: build a multimodal feature embedding module that uses the attribute data and the street view image data of each road segment to obtain, respectively, the segment's attribute embedding (an embedding being a low-dimensional representation vector) and its visual feature embedding, and concatenates the two as the segment's initial embedding;
Stage 2: build a graph augmentation module that, with the road network modeled as a graph, applies two different augmentation strategies to perturb the topology and the node features of the road network (graph), thereby generating two different augmented graphs (graph variants);
Stage 3: build a graph representation learning module that applies a graph neural network model to the two graph variants generated by the graph augmentation module in order to extract node-level representations, that is, representations of road segments;
Stage 4: build a semantically enhanced contrastive optimization module that samples sample pairs using road network characteristics and the mobility semantics in historical trajectories, and uses a dual-view contrastive loss function to guide the contrastive optimization process.
Further, the first stage consists of two parts: attribute feature embedding and visual feature embedding.
The attribute feature embedding part is as follows. For any road segment v_i, its attributes are considered together: the segment identifier ID, the segment length LEN, and the segment midpoint coordinates (LON, LAT), denoted v_i.id, v_i.len, v_i.lon and v_i.lat respectively. Because longitude and latitude vary very little within a single segment, the coordinates of the segment's geometric midpoint suffice to represent its location. The real-valued attributes v_i.lon and v_i.lat are discretized by binning. A separate linear layer then maps each attribute value to a corresponding feature vector, and the four feature vectors are concatenated to form the attribute embedding of the segment.
The visual feature embedding part is as follows. A pre-trained Swin Transformer is used as the visual feature encoder; it takes the panoramic image collected at the segment midpoint as input and outputs the segment's visual embedding. Finally, the two feature types are concatenated to obtain the initial embedding of the segment.
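The attribute-embedding part of the first stage can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `bin_index`, `make_table` and `attribute_embedding`, the bin count, the coordinate ranges and the embedding width are all hypothetical. The segment length is passed through a scalar linear map because the text only states that longitude and latitude are binned. Applying a linear layer to a one-hot (discretized) value is equivalent to reading one row of a weight table, which is what the lookups do here.

```python
import random

def bin_index(value, low, high, num_bins):
    """Map a real value to a bin index in [0, num_bins - 1], clamping out-of-range values."""
    if high <= low:
        raise ValueError("high must be greater than low")
    ratio = (value - low) / (high - low)
    return min(num_bins - 1, max(0, int(ratio * num_bins)))

def make_table(num_rows, dim, rng):
    """Randomly initialized weights; row k is the feature vector of discrete value k."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]

def attribute_embedding(seg, params):
    """Concatenate the four per-attribute feature vectors of one road segment."""
    nb = params["num_bins"]
    id_vec = params["id_table"][seg["id"]]
    # Assumption: the length is embedded as w * len + b (a linear layer on a scalar).
    len_vec = [w * seg["len"] + b
               for w, b in zip(params["len_weight"], params["len_bias"])]
    lon_vec = params["lon_table"][bin_index(seg["lon"], *params["lon_range"], nb)]
    lat_vec = params["lat_table"][bin_index(seg["lat"], *params["lat_range"], nb)]
    return id_vec + len_vec + lon_vec + lat_vec

rng = random.Random(0)
dim, num_bins, num_segments = 4, 16, 3   # hypothetical sizes
params = {
    "num_bins": num_bins,
    "id_table": make_table(num_segments, dim, rng),
    "len_weight": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
    "len_bias": [0.0] * dim,
    "lon_table": make_table(num_bins, dim, rng),
    "lat_table": make_table(num_bins, dim, rng),
    "lon_range": (106.0, 107.0),   # hypothetical bounding box of the city
    "lat_range": (29.0, 30.0),
}
segment = {"id": 1, "len": 0.35, "lon": 106.55, "lat": 29.56}
x_a = attribute_embedding(segment, params)
```

The visual embedding would come from an image encoder and is outside the scope of this sketch; the initial embedding is then the concatenation of `x_a` with that visual vector.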
Further, in the second stage, a graph augmentation module is built that augments the road network with two strategies: mobility-based edge removal and modality-based feature masking. The mobility-based edge removal strategy targets the topology of the road network and determines the removal probability of each edge from the transition probabilities between road segments in the historical trajectory data. If two segments frequently appear together in historical trajectories, the connection between them is considered strong and important, so the probability of removing it is low. The modality-based feature masking strategy perturbs the node features: it randomly selects either the attribute features or the image features and masks them, so that the contrastive learning model adapts to missing segment data and becomes more robust. In each training iteration, both augmentation strategies are applied to generate two graph views for contrast.
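As a rough illustration of mobility-based edge removal, the sketch below estimates transition probabilities from map-matched trajectories and turns them into removal probabilities. The function names, the cap `p_max`, and the linear mapping `p_max * (1 - p)` are assumptions chosen only to reproduce the stated behavior (frequently traversed edges are rarely removed); the source text does not give the exact formula.

```python
from collections import Counter

def transition_probabilities(trajectories):
    """Estimate P(v_j | v_i) from consecutive segments in map-matched trajectories."""
    pair_counts = Counter()
    out_counts = Counter()
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            pair_counts[(a, b)] += 1
            out_counts[a] += 1
    return {pair: count / out_counts[pair[0]] for pair, count in pair_counts.items()}

def removal_probabilities(edges, trans_prob, p_max=0.8):
    """Assumed mapping: the higher the transition probability, the lower the removal probability."""
    return {edge: p_max * (1.0 - trans_prob.get(edge, 0.0)) for edge in edges}

# Toy trajectories over road segments 0..3 (hypothetical data).
trajectories = [[0, 1, 2], [0, 1, 3], [0, 2]]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
trans_prob = transition_probabilities(trajectories)
removal_prob = removal_probabilities(edges, trans_prob)
```

In this toy example the edge (0, 1), which is traversed most often from segment 0, receives a lower removal probability than (0, 2), and the never-traversed edge (2, 3) receives the cap `p_max`.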
Further, in the third stage, a graph representation learning module is built that takes the two graph views generated by the graph augmentation module as input and learns a representation for each. Specifically, a graph encoder learns and extracts high-level features from each graph view to produce an expressive representation of every road segment, and the segment representations produced by the graph encoder are then mapped to a low-dimensional latent space by a nonlinear projection head.
Further, in the fourth stage, a semantically enhanced contrastive optimization module is built. It contains a semantically enhanced sampling strategy that constructs positive and negative samples using road network characteristics and the mobility semantics in historical trajectories. Specifically, given a target node (the anchor node) in one view, its neighbor nodes within h hops are treated as intra-view positive samples, and its mirror node in the other view is treated as an inter-view positive sample. Among the nodes beyond h hops, those that co-occur with the anchor node in historical trajectories are filtered out, and the remaining nodes are taken as true negative samples. A contrastive loss function designed on the InfoNCE objective is then used to compute the contrastive loss of all nodes on the two graph views, from which the final dual-view contrastive loss function is obtained [the two equations are not reproduced in the source text]. Here v'_i denotes the mirror node of v_i in the other view. Finally, the network is trained with this loss function to optimize the road network embedding.
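The sampling rule described above can be sketched with a breadth-first search plus a trajectory filter. This is an illustrative sketch only: the function names are hypothetical, hops are counted along the adjacency lists exactly as given (whether the patent follows edge direction is not stated), and "co-occurrence" is taken to mean "appears in at least one common trajectory".

```python
from collections import deque

def h_hop_neighbors(adj, anchor, h):
    """Nodes reachable from the anchor in at most h hops (anchor excluded)."""
    depth = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        if depth[node] == h:
            continue  # do not expand beyond h hops
        for nxt in adj.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return {n for n in depth if n != anchor}

def co_occurring(trajectories, anchor):
    """Nodes that share at least one historical trajectory with the anchor."""
    nodes = set()
    for traj in trajectories:
        if anchor in traj:
            nodes.update(traj)
    nodes.discard(anchor)
    return nodes

def sample_pairs(adj, trajectories, anchor, h):
    """Return (intra-view positives, true negatives) for one anchor node."""
    all_nodes = set(adj) | {n for nbrs in adj.values() for n in nbrs}
    positives = h_hop_neighbors(adj, anchor, h)
    candidates = all_nodes - positives - {anchor}
    negatives = candidates - co_occurring(trajectories, anchor)
    return positives, negatives

# Toy chain of five road segments: 0 - 1 - 2 - 3 - 4 (hypothetical data).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
trajectories = [[0, 1, 2, 3]]
positives, negatives = sample_pairs(adj, trajectories, anchor=0, h=1)
```

With h = 1, segment 1 is the only intra-view positive for anchor 0. Segments 2 and 3 lie beyond one hop but share a trajectory with the anchor, so they are excluded from the negatives, leaving only segment 4. The inter-view positive (the mirror node in the other view) is simply the same node index in the second view and needs no sampling.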
The invention also provides a semantically enhanced graph contrastive learning system for road network representation.
The beneficial effects of the invention are as follows:
The invention provides a framework called SE-GCL (Semantic Enhanced Graph Contrastive Learning), which combines road attributes and visual information through multimodal feature embedding in order to mine and fully exploit the features of road network data. It constructs a new road network data representation that is rich in high-dimensional information and can be used directly as the input of a machine learning model. This design makes road-network-based downstream tasks, such as road label classification, speed prediction and travel time prediction, easier to implement. SE-GCL incorporates graph contrastive learning: by simulating topological changes and missing data in the road network, it improves the robustness and generalization ability of the road network representation, addresses the limitations of traditional methods in complex road network analysis, and improves both the efficiency of road network data processing and the understanding of the complex dynamics of urban traffic networks.
Other advantages, objectives and features of the invention will be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of the following, or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained through the following description.
Brief Description of the Drawings
To make the purpose, technical solution and advantages of the invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a system block diagram of the invention;
FIG. 2 is an illustrative example of the semantically enhanced sampling strategy.
Detailed Description
The technical solution of the invention is described in detail below with reference to the accompanying drawings.
The semantically enhanced graph contrastive learning framework provided by the invention is used for representation learning on road networks. The increasing complexity of urban traffic networks has produced massive and diverse road network data, whose processing and analysis incur large storage and computation overhead. At present, road network data are usually processed by combining road attributes and visual information. The invention proposes a new framework for representing and optimizing these data, implemented in four stages. First, a multimodal feature embedding module uses the attribute data and street view image data of each road segment to obtain the segment's attribute and visual feature embeddings. Second, a graph augmentation module applies two augmentation strategies that perturb the topology and node features of the road network, generating two different augmented graphs. Third, a graph representation learning module uses a graph neural network model to extract segment representations from the two graph variants. Finally, a semantically enhanced contrastive optimization module samples sample pairs using road network characteristics and the mobility semantics in historical trajectories, and uses a dual-view contrastive loss function to guide optimization. This framework improves the efficiency of road network data processing and the understanding of the complex dynamics of urban traffic networks. FIG. 1 is a system block diagram of the invention.
This embodiment involves seven necessary concepts, as follows:
Concept 1: road network. A road network is a directed graph G = (V, A). Each vertex v ∈ V represents a road segment and carries attributes such as its length and its coordinates on the map. A is an adjacency matrix in which each entry A_{i,j} is a binary value indicating whether road segment v_j is connected to road segment v_i.
Concept 2: street view image. A street view image is a photograph taken by a vehicle at a specific location on the road network. Specifically, multiple images are collected for each road segment and stitched into a 360-degree panoramic view. The panoramic image is a snapshot that captures the visual details of the surrounding environment.
Concept 3: trajectory. A trajectory is the sequence of GPS points recorded while a vehicle moves on the road network, denoted <p_1, p_2, p_3, p_4, ..., p_|T|>, where p_i is the coordinate of the i-th point and |T| is the total number of points. Each trajectory is further mapped onto the road network using our proposed L2MM deep model, so that a trajectory is represented by a sequence of connected road segments <v_1, v_2, v_3, v_4, ..., v_m> in the road network G.
Concept 4: embeddings. z_i denotes the embedding, or representation, of a particular node (such as a road segment); every node in a graph has its own embedding that represents its features. A positive-sample embedding belongs to a positive sample, which in contrastive learning is a node similar or related to the anchor node (z_i). A negative-sample embedding belongs to a negative sample, which, in contrast, is a node dissimilar or unrelated to the anchor node.
Concept 5: anchor node. An anchor node is the reference point, or target node, used for comparison. During learning, the representation (feature vector) of the anchor node is compared with the representations of other nodes (positive or negative samples) in order to learn discriminative node embeddings.
Concept 6: h-hop. The term describes the neighborhood of a node in a graph. The h-hop neighborhood of a node is the set of all nodes that can be reached from it in at most h hops.
Concept 7: InfoNCE objective. InfoNCE is a loss function commonly used in contrastive learning. Its basic idea is to maximize the similarity between positive pairs while minimizing the similarity between negative pairs. In contrastive learning, positive pairs are usually similar or related entities, whereas negative pairs are unrelated entities.
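For orientation, a generic InfoNCE-style loss for one anchor with several positives and negatives might be written as follows. This is the common textbook form, not the patent's exact loss (its equations are not available in the source text); cosine similarity, the temperature `tau`, and the function names are assumptions, and all vectors are assumed to be non-zero.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positives, negatives, tau=0.5):
    """-log( sum_pos exp(sim/tau) / (sum_pos exp(sim/tau) + sum_neg exp(sim/tau)) )."""
    pos_sum = sum(math.exp(cosine(anchor, p) / tau) for p in positives)
    neg_sum = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos_sum / (pos_sum + neg_sum))

anchor = [1.0, 0.0]
# Positive aligned with the anchor, negative pointing away: small loss.
loss_good = info_nce(anchor, positives=[[1.0, 0.0]], negatives=[[-1.0, 0.0]])
# Roles swapped: the loss becomes large.
loss_bad = info_nce(anchor, positives=[[-1.0, 0.0]], negatives=[[1.0, 0.0]])
```

Minimizing this quantity pulls the anchor embedding toward its positives and pushes it away from its negatives; a dual-view loss would evaluate it for every node in both views and combine the results.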
FIG. 2 is an illustrative example of the semantically enhanced sampling strategy. The system framework comprises four modules: a multimodal feature embedding module, a graph augmentation module, a graph representation learning module, and a semantically enhanced contrastive optimization module. FIG. 1 shows the system block diagram of the invention, in which:
Multimodal feature embedding module: this module uses the attribute data and street view image data of each road segment to obtain the segment's attribute embedding and visual feature embedding, and then combines them as the segment's initial embedding. It comprises the following steps:
Step 1: take as input the road network G = (V, A), the street view image data, and the historical GPS trajectory data; obtain the attribute embedding X_a and the visual feature embedding X_v of the road segments through the multimodal feature embedding module;
Step 2: concatenate the attribute embedding and the visual feature embedding to construct the initial embedding X = X_a ⊕ X_v of the road segments, and use it as the node (road segment) feature matrix of the road network.
Graph augmentation module: this module applies two different augmentation strategies to perturb the topology and the node features of the road network (graph), thereby generating two different augmented graphs (graph variants). It comprises the following steps:
Step 1: compute the edge deletion probability from the historical trajectory data, where e_{i,j} denotes the edge connecting road segments v_i and v_j, and the deletion probability depends on the estimated transition probability from v_i to v_j [the equation is not reproduced in the source text];
Step 2: with the edge deletion probability Pr and the road network G as input, perturb the road network twice (each perturbation affects both the topology and the node features), obtaining a first augmented graph with its adjacency matrix and feature matrix, and a second augmented graph with its own adjacency matrix and feature matrix.
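The two-view generation in these steps can be sketched as below. It is a minimal illustration under assumptions: edges are stored as a list of pairs rather than an adjacency matrix, masking is decided independently per node with a hypothetical rate `mask_prob`, and masking means zeroing out one modality; the patent text does not fix these details.

```python
import random

def drop_edges(edges, removal_prob, rng):
    """Keep each edge unless a random draw falls below its removal probability."""
    return [e for e in edges if rng.random() >= removal_prob.get(e, 0.0)]

def mask_modality(x_attr, x_vis, mask_prob, rng):
    """Per node, with probability mask_prob, zero out either the attribute or the visual part."""
    rows = []
    for attr, vis in zip(x_attr, x_vis):
        if rng.random() < mask_prob:
            if rng.random() < 0.5:
                attr = [0.0] * len(attr)   # hide the attribute modality
            else:
                vis = [0.0] * len(vis)     # hide the visual modality
        rows.append(list(attr) + list(vis))
    return rows

def make_view(edges, removal_prob, x_attr, x_vis, mask_prob, rng):
    """One augmented view: perturbed edge list plus perturbed feature matrix."""
    return drop_edges(edges, removal_prob, rng), mask_modality(x_attr, x_vis, mask_prob, rng)

rng = random.Random(42)
edges = [(0, 1), (1, 2), (0, 2)]
removal_prob = {(0, 1): 0.1, (1, 2): 0.4, (0, 2): 0.8}   # hypothetical values
x_attr = [[0.2, 0.1], [0.5, 0.3], [0.9, 0.7]]            # attribute embeddings
x_vis = [[1.0], [2.0], [3.0]]                            # visual embeddings
view_1 = make_view(edges, removal_prob, x_attr, x_vis, 0.3, rng)
view_2 = make_view(edges, removal_prob, x_attr, x_vis, 0.3, rng)
```

Because the two calls draw different random numbers, `view_1` and `view_2` generally differ in both their edges and their masked features, which is what gives the contrastive objective two distinct views of the same road network.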
Graph representation learning module: Starting from the two graph variants generated by the graph augmentation module, this module uses a graph neural network model to extract node-level representations, that is, representations of road segments. It includes the following steps:
Step 1: Use the graph encoder to encode the adjacency matrix and the feature matrix of the first augmented graph obtained in the graph augmentation module, yielding its embedding matrix;
Step 2: Use the graph encoder to encode the adjacency matrix and the feature matrix of the second augmented graph obtained in the graph augmentation module, yielding its embedding matrix;
Step 3: Through a nonlinear projection head, map the two embedding matrices produced by the graph encoder into a low-dimensional latent space, obtaining two sets of road segment embeddings Z_1 and Z_2.
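The encoding and projection steps can be sketched as below. The text does not fix the encoder architecture, so this example assumes a single GCN-style layer with row-normalized adjacency and self-loops, weights shared between the two views, and a two-layer ReLU projection head. All names (`gcn_encode`, `project`), dimensions, and the stand-in adjacency matrices are hypothetical.

```python
import numpy as np

def normalize_adj(adj):
    """Add self-loops and row-normalize so each node averages over its neighbors."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def gcn_encode(adj, feats, w):
    """One assumed GCN-style layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalize_adj(adj) @ feats @ w, 0.0)

def project(h, w1, w2):
    """Nonlinear projection head: linear, ReLU, linear."""
    return np.maximum(h @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
n, d, hidden, latent = 4, 6, 8, 3
adj_1 = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]], dtype=float)
adj_2 = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
feats_1 = rng.normal(size=(n, d))
feats_2 = rng.normal(size=(n, d))

w_enc = rng.normal(size=(d, hidden))        # encoder weights shared by both views
w_p1 = rng.normal(size=(hidden, hidden))
w_p2 = rng.normal(size=(hidden, latent))

z1 = project(gcn_encode(adj_1, feats_1, w_enc), w_p1, w_p2)
z2 = project(gcn_encode(adj_2, feats_2, w_enc), w_p1, w_p2)
print(z1.shape, z2.shape)
```

In a trained system the weights would be learned by backpropagation through the contrastive loss; here they are random placeholders.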
Semantic-enhanced contrastive optimization module: This module samples sample pairs using the characteristics of the road network and the mobility semantics in the historical trajectories, and designs a dual-view contrastive loss function that guides the contrastive optimization process. The network is trained with this loss function to optimize the road network embedding representation. It includes the following steps:
Step 1: Construct a co-occurrence matrix M from the historical road trajectories T. Then, based on the co-occurrence matrix M and the neighbor hop count h, sample positive and negative samples on the adjacency matrices of the two augmented graphs. Specifically, given a target node (the anchor node) in one view, its neighbor nodes within h hops are treated as intra-view positive samples, and its mirror node in the other view is treated as the inter-view positive sample. Among the nodes beyond h hops, those that co-occur with the anchor node in the historical trajectories are further excluded, and the remaining nodes are selected as true negative samples;
Step 2: Taking the sampling results of Step 1 as input, compute the loss value with the dual-view contrastive loss function designed on the InfoNCE objective, and use it to guide the training of the network.
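The sampling rule and an InfoNCE-style loss can be sketched as follows. The exact loss formula is not reproduced in the text, so the form used here (cosine similarity, temperature `tau`, positives in the numerator, positives plus negatives in the denominator, averaged over both views) is one plausible assumption rather than the claimed function. All function names are hypothetical.

```python
import numpy as np

def cooccurrence(trajectories, n):
    """M[i, j] = number of trajectories in which segments i and j both appear."""
    m = np.zeros((n, n), dtype=int)
    for traj in trajectories:
        ids = sorted(set(traj))
        m[np.ix_(ids, ids)] += 1
    np.fill_diagonal(m, 0)
    return m

def h_hop_mask(adj, h):
    """True where node j is reachable from node i within h hops (i != j)."""
    n = adj.shape[0]
    step = ((adj > 0) | np.eye(n, dtype=bool)).astype(int)
    reach = np.eye(n, dtype=int)
    for _ in range(h):
        reach = ((reach @ step) > 0).astype(int)
    reach = reach.astype(bool)
    np.fill_diagonal(reach, False)
    return reach

def sample_masks(adj, cooc, h):
    """Positives: within h hops. Negatives: beyond h hops and never co-occurring."""
    pos = h_hop_mask(adj, h)
    neg = ~pos & (cooc == 0)
    np.fill_diagonal(neg, False)
    return pos, neg

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def view_loss(z_anchor, z_other, pos, neg, tau):
    intra = np.exp(cosine_sim(z_anchor, z_anchor) / tau)
    inter = np.exp(cosine_sim(z_anchor, z_other) / tau)
    # Inter-view positive is the mirror node (diagonal); intra-view positives use pos.
    positives = np.diag(inter) + (intra * pos).sum(axis=1)
    negatives = (intra * neg).sum(axis=1) + (inter * neg).sum(axis=1)
    return -np.log(positives / (positives + negatives)).mean()

def dual_view_loss(z1, z2, adj1, adj2, cooc, h=1, tau=0.5):
    pos1, neg1 = sample_masks(adj1, cooc, h)
    pos2, neg2 = sample_masks(adj2, cooc, h)
    return 0.5 * (view_loss(z1, z2, pos1, neg1, tau) + view_loss(z2, z1, pos2, neg2, tau))

# Toy example: a 5-segment path 0-1-2-3-4 and two trajectories.
adj = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
cooc = cooccurrence([[0, 1, 2], [3, 4]], n=5)
pos, neg = sample_masks(adj, cooc, h=1)

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
loss = dual_view_loss(z1, z2, adj, adj, cooc, h=1, tau=0.5)
```

In the toy example, segment 2 lies beyond one hop of anchor 0 but shares a trajectory with it, so it is excluded from the negatives of anchor 0; only segments 3 and 4 remain as true negatives.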
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may be modified without departing from its spirit and scope, and all such modifications shall fall within the scope of the claims of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410299789.9A CN118193628A (en) | 2024-03-15 | 2024-03-15 | Semantic enhancement graph contrast learning method and system for road network representation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410299789.9A CN118193628A (en) | 2024-03-15 | 2024-03-15 | Semantic enhancement graph contrast learning method and system for road network representation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118193628A (en) | 2024-06-14 |
Family
ID=91411804
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410299789.9A Pending CN118193628A (en) | 2024-03-15 | 2024-03-15 | Semantic enhancement graph contrast learning method and system for road network representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118193628A (en) |
- 2024-03-15: CN application CN202410299789.9A, published as CN118193628A (en), status: active, pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113486726A (en) | Rail transit obstacle detection method based on improved convolutional neural network | |
Mai et al. | Symbolic and subsymbolic GeoAI: Geospatial knowledge graphs and spatially explicit machine learning. | |
CN109635748B (en) | Method for extracting road characteristics in high-resolution image | |
Liu et al. | Behavior2vector: Embedding users’ personalized travel behavior to vector | |
Wang et al. | Current State of Autonomous Driving Applications Based on Distributed Perception and Decision-Making | |
CN117237559A (en) | Digital twin city-oriented three-dimensional model data intelligent analysis method and system | |
CN112990222A (en) | Image boundary knowledge migration-based guided semantic segmentation method | |
CN115393745A (en) | Automatic bridge image progress identification method based on unmanned aerial vehicle and deep learning | |
CN115565376B (en) | Vehicle journey time prediction method and system integrating graph2vec and double-layer LSTM | |
Wang et al. | Self-attentive local aggregation learning with prototype guided regularization for point cloud semantic segmentation of high-speed railways | |
CN118351682B (en) | Two-stage expressway vehicle discrete point track reconstruction method and system based on deep learning | |
CN114528679A (en) | Multi-mode fault early warning method and system for numerical control system | |
CN117236492B (en) | Traffic demand prediction method based on dynamic multi-scale graph learning | |
Joseph et al. | Investigation of deep learning methodologies in intelligent green transportation system | |
CN116976526B (en) | Land use change prediction method coupling ViViT and ANN | |
CN118193628A (en) | Semantic enhancement graph contrast learning method and system for road network representation | |
CN114648697B (en) | Robot travelable path identification method based on improved BiSeNet network | |
CN117829273A (en) | Connection loss perception training method for federal graph neural network system | |
Pan et al. | Scan-to-graph: automatic generation and representation of highway geometric digital twins from point cloud data | |
CN116721125A (en) | Vehicle motion trail visual prediction method and device based on digital twin | |
Huang et al. | GT-TTE: Modeling Trajectories as Graphs for Travel Time Estimation | |
Zhao et al. | Probing a point cloud based expeditious approach with deep learning for constructing digital twin models in shopfloor | |
Xiong et al. | Travel Time Prediction Based on Transformer | |
CN114372407B (en) | V2X edge computing node equipment arrangement system and method based on road space mode | |
CN114692762B (en) | A vehicle trajectory prediction method based on graph attention interaction mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||