CN114997067A - Trajectory prediction method based on spatio-temporal graph and spatial-domain aggregation Transformer network - Google Patents

Trajectory prediction method based on spatio-temporal graph and spatial-domain aggregation Transformer network

Info

Publication number
CN114997067A
CN114997067A
Authority
CN
China
Prior art keywords
pedestrian
graph
network
trajectory
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210767796.8A
Other languages
Chinese (zh)
Other versions
CN114997067B (en)
Inventor
曾繁虎
杨欣
王翔辰
李恒锐
樊江锋
周大可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202210767796.8A
Publication of CN114997067A
Application granted
Publication of CN114997067B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network. The method addresses the insufficient extraction of interaction features in existing pedestrian trajectory prediction. A spatio-temporal graph convolutional neural network and a temporal feature transformation network extract the pedestrian trajectory features in the scene effectively and accurately, while a newly designed spatial-domain aggregation Transformer architecture transforms the pedestrians' temporal features and efficiently aggregates and exploits the spatial pedestrian features. The predicted pedestrian trajectories are finally output in the form of a probability distribution, so that sudden situations are avoided reasonably and the motion consistency of pedestrian groups is preserved. The relevant metrics show that the framework achieves a breakthrough in predicting pedestrian endpoints and yields a more accurate and efficient prediction of the pedestrian trajectory distribution, providing important help for the development of fields such as autonomous driving and intelligent transportation.

Description

A Trajectory Prediction Method Based on a Spatio-temporal Graph and a Spatial-Domain Aggregation Transformer Network

Technical Field

The invention relates to a trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network, and belongs to the fields of artificial intelligence and autonomous driving.

Background Art

Pedestrian trajectory prediction has a deep theoretical background and practical application value; in fields such as autonomous driving and intelligent surveillance, pedestrian trajectory recognition and prediction have long occupied an important position. In recent years, owing to advances in artificial intelligence and deep learning, the deployment and application of intelligent algorithms for pedestrian trajectory prediction have attracted growing attention and discussion.

The goal of pedestrian trajectory prediction has always been to enable an agent to better understand and judge the behavior of traffic participants in a scene, to build a pedestrian trajectory prediction model that captures spatial interaction features and make the corresponding predictions, and then to reach accurate, fast, and reasonable decisions. However, the high complexity and uncertainty of the problem create the following difficulty: because of the complex scene features, a pedestrian's future trajectory is affected not only by the pedestrian's own historical trajectory and intended route, but also by obstacles in the scene and by other traffic participants in both the spatial and temporal dimensions. Whether a reasonable and accurate model can be built that produces fast prediction outputs and decisions is therefore the key to applying pedestrian trajectory prediction in practical scenarios.

Thanks to the development of machine learning in the field of artificial intelligence, trajectory prediction methods based on LSTM and CNN algorithms were the mainstream approaches for a long time. These methods use simple models and achieve fairly good prediction results with few parameters and basic architectures; they also provided ideas and basic building blocks for subsequent in-depth algorithm research and were of pioneering significance.

Since graphs and graph network architectures have a natural advantage in representing the data of the pedestrian trajectory prediction problem, graph-based pedestrian trajectory prediction has become a popular research direction in recent years. Mohamed A. et al. (Social-STGCNN: A social spatio-temporal graph convolutional neural network for human trajectory prediction [C], 2020) used a spatio-temporal graph neural network that applies two different convolution operations to the temporal and spatial domains respectively, obtaining trajectory feature information and producing the prediction output at the same time. The model accounts for the randomness and uncertainty of pedestrian trajectories in space: at prediction time, the intended trajectory and endpoint of each pedestrian are not known in advance. A reasonable approach is therefore to assume that the coordinates of the predicted trajectory follow a two-dimensional Gaussian distribution and to output trajectories by sampling during evaluation. The model completes its predictions under this assumption and achieves fairly good results. However, this kind of pedestrian trajectory prediction model still does not further process the pedestrian interaction features, which leads to insufficient spatial interaction capability: the generated trajectories carry large inertia, and the model cannot produce closely coupled motion predictions based on the movement patterns within pedestrian groups.

In recent years, many researchers have studied pedestrian trajectory prediction on graph representations in combination with other algorithmic tools and research methods, and progress has been made in many directions. Dan X. et al. (Spatial-Temporal Block and LSTM Network for Pedestrian Trajectories Prediction [J]) proposed a pedestrian trajectory prediction architecture based on spatio-temporal blocks and LSTM: from a graph embedding, the relation feature vector between each pedestrian node and its neighbors is obtained and fed into an LSTM to encode a spatio-temporal pedestrian interaction feature vector, from which predictions are made with good results. Rainbow B. A. et al. (Semantics-STGCNN: A semantics-guided spatial-temporal graph convolutional network for multi-class trajectory prediction [C]) proposed a semantics-based spatio-temporal graph model, Semantics-STGCNN, which starts from semantic scene understanding, embeds the class labels of pedestrian objects into a label adjacency matrix, combines it with a velocity adjacency matrix to output a semantic adjacency matrix, models the semantic information, and finally outputs the prediction. Yu C. et al. (Spatio-temporal graph transformer networks for pedestrian trajectory prediction [C]) used a Transformer-based network model that exploits the Transformer's excellent performance in other fields, directly stacking several basic Transformer blocks to extract the spatio-temporal features of pedestrians in the scene and produce the predictions.

In the present invention, addressing the problems and shortcomings of existing trajectory prediction methods in extracting and predicting pedestrian spatial interaction features, a new network architecture for pedestrian trajectory prediction using a spatio-temporal graph and a spatial-domain aggregation Transformer is proposed. The raw input data are given an appropriate graph representation and preprocessing; a spatio-temporal graph convolutional neural network and a temporal feature transformation network extract the pedestrian trajectory features; and a spatial-domain Transformer architecture is introduced to fully extract and aggregate deep spatial feature information, ensuring the model's effectiveness and accuracy with respect to spatial pedestrian interaction. The invention emphasizes the plausibility of the prediction results with regard to spatial interaction, preserving pedestrians' spatial walking characteristics while accounting for mutual influence, and it achieves a breakthrough in predicting trajectory endpoints in particular. It plays a positive role in modeling pedestrian interaction in complex scenes and predicting pedestrian trajectories, and it helps and inspires research in fields such as autonomous driving and artificial intelligence.

Summary of the Invention

The invention discloses a trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network. The method addresses the insufficient extraction of spatial pedestrian trajectory information in existing prediction methods, which leaves the relative positions of walking pedestrians unclear and prevents wide-angle turns in response to imminent collisions. A new trajectory prediction model is built on a graph representation: pedestrian features in the scene are extracted through a spatio-temporal graph convolutional neural network and a temporal feature transformation network, a newly designed spatial-domain aggregation Transformer architecture transforms and exploits the pedestrians' temporal features, and the predicted trajectories are finally output in the form of a probability distribution, achieving reasonable prediction.

For the spatio-temporal graph convolutional neural network, the pedestrian trajectory features in the scene are represented and preprocessed as a graph, and a graph convolutional neural network performs the preliminary extraction of the spatial pedestrian trajectory features, which serve as the input to the subsequent networks.

In the temporal feature transformation network, a convolutional neural network extracts the temporal feature information and transforms the feature dimensions; the network is designed carefully to simplify the model parameters and improve performance.

In the spatial-domain aggregation Transformer network, the features obtained from the spatio-temporal graph convolutional neural network and the temporal feature transformation network are processed further. To further mine and model the interaction of pedestrian features in the spatial scene, the model uses each pedestrian's temporal feature vector as an input token to the spatial-domain aggregation Transformer, which fully extracts and aggregates the spatial trajectory features of the pedestrians while producing the trajectory prediction output.

The present invention mainly comprises the following steps:

Step (1): Using the properties of graphs, represent and preprocess the pedestrian trajectory features in the scene from the raw input data, and select a suitable kernel function to construct the adjacency matrix, providing accurate and efficient in-scene pedestrian information for the subsequent network architecture;

Step (2): Establish the spatio-temporal graph convolutional neural network module, build the graph convolutional network, and perform the preliminary extraction of spatial pedestrian trajectory features by choosing the number of graph convolutions applied to them, ensuring that the extracted features are accurate and effective;

Step (3): Establish the temporal feature transformation network module, designing a convolutional neural network to extract the temporal features and transform the feature dimensions;

Step (4): Establish the spatial-domain aggregation Transformer network, use the temporal feature vector of each pedestrian in the scene as an input token, further aggregate the spatial features in the Transformer, and output the pedestrian trajectory prediction sequence.

Further, in step (1), a spatio-temporal graph is introduced to represent the raw input pedestrian trajectory data, and a suitable kernel function is selected from several candidates to construct the adjacency matrix in the graph sense, completing an efficient construction and selection of in-scene pedestrian features and providing accurate, efficient information for subsequent modeling.

Further, the graph representation of the raw input pedestrian trajectory data is specifically as follows. For each time t, a spatial graph G_t is introduced to represent the interaction relations among pedestrians at that time. G_t is defined as G_t = (V_t, E_t), where V_t represents the coordinate information of the pedestrians in the scene at time t, i.e. V_t = {v_t^i | i = 1, ..., N}. The feature of each node v_t^i is described by the observed relative coordinate change (Δx_t^i, Δy_t^i):

Δx_t^i = x_t^i − x_{t−1}^i (1.3)

Δy_t^i = y_t^i − y_{t−1}^i (1.4)

where i = 1, ..., N and t = 2, ..., T_obs. At the initial time step the relative offset is defined to be 0, i.e. Δx_1^i = Δy_1^i = 0.

E_t represents the edge information of the spatial graph G_t and is a matrix of size N × N, defined as E_t = {e_t^{ij} | i, j = 1, ..., N}. The value of e_t^{ij} is given as follows: if node v_t^i is connected to node v_t^j, then e_t^{ij} = 1; conversely, if node v_t^i is not connected to node v_t^j, then e_t^{ij} = 0.

Further, the construction of the adjacency matrix in the graph sense by selecting a suitable kernel function is specifically:

A weighted adjacency matrix A_t is introduced to weight the node information of the pedestrian spatial graph; the magnitude of the mutual influence between pedestrians is obtained through a kernel-function transform and stored in the weighted adjacency matrix A_t;

The reciprocal of the distance between two nodes in Euclidean space is chosen as the kernel function, and, to avoid the divergence that would occur when two nodes are very close, a small constant ε is added to accelerate model convergence:

a_t^{ij} = 1 / (||v_t^i − v_t^j||_2 + ε) (1.5)

Stacking the spatial graph G_t of every time step along the time dimension yields the spatio-temporal graph sequence G = {G_1, ..., G_T} for pedestrian trajectory prediction under the graph representation.

Further, step (2) is specifically:

For the input time series of feature graphs, the output is obtained through the constructed spatio-temporal graph convolutional neural network:

e_t = GNN(G_t) (1.6)

where GNN denotes the constructed spatio-temporal graph convolutional network, which produces its output through multiple layers of graph convolution, and e_t denotes the spatio-temporal feature information preliminarily extracted from the spatial dimension by the graph neural network;

This operation is applied to the output of every time step, and the actual output of the graph convolutional network is the stack of this time series:

e_g = Stack(e_t) (1.7)

where Stack(·) denotes stacking the inputs along the extended dimension and e_g denotes the output of the graph convolution; in practice, the extended dimensions are fed into the graph neural network and processed in parallel;

A fully connected layer FC then applies an appropriate dimension transform to the features:

V_GNN = FC(e_g) (1.8)

This yields the preliminary feature extraction output of the spatio-temporal graph convolutional neural network.

Further, in step (3), the output of the spatio-temporal graph convolutional neural network is dimension-transformed, and a CNN-based temporal feature transformation module with a designed number of convolutions extracts the feature information of each pedestrian's own historical trajectory;

Further, step (3) is specifically:

After the feature extraction output of the spatio-temporal graph convolutional network is obtained, it is fed into a temporal feature transformation network to extract the temporal features. Since the feature dimensions have already been transformed appropriately by a fully connected layer in step (2), the network module in this step uses the obtained features directly. In the present invention, a multi-layer CNN is chosen to process the temporal feature information, which can be expressed as:

e_c = CNN(V_GNN) (1.9)

where V_GNN denotes the feature information extracted by the graph convolutional network and e_c denotes the output of the temporal feature transformation network. A multi-layer perceptron MLP then increases the expressive power of the network:

V_CNN = MLP(e_c) (1.10)

Transforming and processing the features through the above networks yields the output V_CNN of the temporal feature transformation network.

Further, the main construction and computation of step (4) is as follows. To strengthen the connections between pedestrian features across the spatial domain, a spatial-domain Transformer network is designed to further spatially aggregate the extracted feature information. In particular, the temporal feature vector of one pedestrian forms one input token, and the extracted features of the different pedestrians are fed in sequence.

For the spatial-domain aggregation Transformer network, the encoder layer of the Transformer architecture is used; a positional encoding is first added to the input:

V_in = V_CNN + PE_{pos,i}(V_CNN) (1.11)

where pos denotes the relative position of the input feature and i denotes its dimension. A multi-head attention layer is then introduced; the three attention inputs Query (Q), Key (K), and Value (V) are obtained from the input by matrix transforms, the input features are divided according to the configured number of heads, and the attention scores are computed as follows:

Attention(Q, K, V) = softmax(Q K^T / √d_k) V (1.12)

head_i = Attention(Q_i, K_i, V_i) (1.13)

where i = 1, ..., n_head and n_head denotes the number of heads. The final multi-head output completes the feature extraction by concatenation:

V_Multi = ConCat(head_1, ..., head_h) W_o (1.14)

where ConCat denotes the concatenation operation and W_o denotes the output parameter matrix of the attention layer.

The final output of the spatial-domain Transformer is then produced by a feed-forward network followed by layer normalization:

V_out = LN(FFN(V_Multi)) (1.15)

Through this architecture, the preliminarily extracted spatio-temporal features are aggregated well into pedestrian spatial interaction features, so that the output trajectories better conform to the associations and interactions of pedestrians in the scene.

For the loss function, the sum of the negative log-likelihoods of every point on a pedestrian's predicted trajectory is used. The loss of the i-th pedestrian is:

L^i = − Σ_{t=T_obs+1}^{T_pred} log P((x_t^i, y_t^i) | μ_t^i, σ_t^i, ρ_t^i) (1.16)

where (μ_t^i, σ_t^i, ρ_t^i) are the unknown pedestrian trajectory distribution parameters to be predicted, and T_obs and T_pred denote the observation and prediction end times respectively; the sum of the losses of all pedestrians is the final loss:

L = Σ_{i=1}^{N} L^i (1.17)

Training the model is completed by forward loss computation and backward parameter updates on the model architecture proposed above, yielding reasonable pedestrian trajectory prediction outputs.
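
The training procedure can be sketched as follows; this is a minimal illustration, not the patent's own code. `model`, `loader`, and `criterion` are placeholders: `criterion` is assumed to implement the negative log-likelihood of eqs. (1.16)-(1.17), and `loader` to yield graph tensors and ground-truth future trajectories.

```python
import torch

def train(model, loader, criterion, epochs=250, lr=1e-3):
    # forward loss computation and backward parameter update, as described above
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for V, A, y_future in loader:      # node features, adjacency, ground truth
            pred = model(V, A)             # forward pass through the full pipeline
            loss = criterion(pred, y_future)
            opt.zero_grad()
            loss.backward()                # backward parameter update
            opt.step()
```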

Beneficial Effects

Existing pedestrian trajectory prediction outputs suffer from insufficient interaction feature extraction, which blurs the spatial characteristics of pedestrians. On the one hand, the predicted trajectories carry large inertia and cannot perform wide-angle avoidance in high-speed or sudden situations; on the other hand, the motion consistency of pedestrian groups is not preserved, so closely associated pedestrians cannot maintain the same motion trend over a period of time. To solve these problems, the present invention proposes a new network model architecture: a spatio-temporal graph convolutional neural network and a temporal feature transformation network extract the in-scene pedestrian features effectively and accurately, a newly designed spatial-domain aggregation Transformer architecture transforms and exploits the pedestrians' temporal features, and the predicted trajectories are finally output in the form of a probability distribution. The method reasonably avoids sudden situations, preserves the motion consistency of pedestrian groups, and predicts pedestrian spatial interaction more accurately and reasonably. It offers a new direction for further research on pedestrian trajectory prediction, is of deep significance for more accurate and timely prediction in real scenarios, and provides help for the development of autonomous driving, intelligent transportation, and related fields.

Description of the Drawings

Fig. 1 is an overall schematic diagram of the network framework based on the spatio-temporal graph and the spatial-domain aggregation Transformer in an embodiment of the present invention;

Fig. 2 is a schematic diagram of trajectory prediction in the present invention, in which the temporally transformed features are fed into the spatial-domain aggregation Transformer network.

Detailed Description of the Embodiments

The present invention relates to a pedestrian trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network; the embodiment mainly comprises the following steps.

The pedestrian trajectory prediction problem in a given scene consists of the coordinates of N pedestrians in the scene at each observation time. The coordinate information of the i-th pedestrian at time t is denoted x_t^i = (x_t^i, y_t^i). With this definition, the general statement of the problem is: for each known observed pedestrian trajectory sequence

X = {x_t^i | i = 1, ..., N; t = 1, ..., T_obs} (1.1)

the constructed network framework extracts and models the pedestrian trajectory characteristics from the input data, obtains suitable trajectory feature information, and outputs a reasonable trajectory prediction for the scene:

Ŷ = {ŷ_t^i | i = 1, ..., N; t = T_obs + 1, ..., T_pred} (1.2)

where T_obs and T_pred denote the pedestrian observation time span and prediction time span respectively, y_t^i denotes the ground-truth future trajectory, and ŷ_t^i denotes the trajectory value predicted by the model.

The overall schematic diagram of the network framework based on the spatio-temporal graph and the spatial-domain aggregation Transformer in this embodiment is shown in Fig. 1.

Step 1: Give the data an appropriate graph representation and preprocessing, providing accurate and efficient in-scene pedestrian information

In the present invention, an appropriate graph representation is first applied to the raw input pedestrian trajectory data for graph conversion and preprocessing, which facilitates the subsequent extraction and efficient use of the input feature information.

For each time t, a spatial graph G_t is introduced to represent the interaction relations among pedestrians at that time. G_t is defined as G_t = (V_t, E_t), where V_t denotes the node information of the spatial graph G_t; in this model, V_t specifically represents the coordinate information of the pedestrians in the scene at time t, i.e. V_t = {v_t^i | i = 1, ..., N}. For this model, the feature of each node v_t^i is described by the observed relative coordinate change (Δx_t^i, Δy_t^i):

Δx_t^i = x_t^i − x_{t−1}^i (1.3)

Δy_t^i = y_t^i − y_{t−1}^i (1.4)

where i = 1, ..., N and t = 2, ..., T_obs. At the initial time step the relative offset is defined to be 0, i.e. Δx_1^i = Δy_1^i = 0.

E_t denotes the edge information of the spatial graph G_t and is a matrix of size N × N. In the usual sense it is defined as E_t = {e_t^{ij} | i, j = 1, ..., N}, where the value of e_t^{ij} is given as follows: if node v_t^i is connected to node v_t^j, then e_t^{ij} = 1; conversely, if node v_t^i is not connected to node v_t^j, then e_t^{ij} = 0.

For this prediction task, it is desirable not only to know whether pedestrians are associated, but also to measure the relative magnitude of the mutual influence between pedestrians in the space. A weighted adjacency matrix A_t is therefore introduced to weight the node information of the pedestrian spatial graph; the magnitude of the mutual influence between pedestrians is obtained through a kernel-function transform and stored in A_t. In the present invention, the reciprocal of the distance between two nodes in Euclidean space is chosen as the kernel function, and, to avoid the divergence that would occur when two nodes are very close, a small constant ε is added to accelerate model convergence:

a_t^{ij} = 1 / (||v_t^i − v_t^j||_2 + ε) (1.5)

Stacking the spatial graph G_t of every time step along the time dimension yields the spatio-temporal graph sequence G = {G_1, ..., G_T} for pedestrian trajectory prediction under the graph representation. Through this definition and transformation, the graph representation and preprocessing of the data in the pedestrian trajectory prediction problem are completed.
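
The graph construction of this step can be illustrated with a short sketch. The following Python code is a minimal, illustrative implementation (the function name, array layout, and epsilon value are ours, not the patent's): node features are the frame-to-frame displacements of eqs. (1.3)-(1.4), and the weighted adjacency stores the inverse-distance kernel of eq. (1.5).

```python
import numpy as np

def build_graph_sequence(traj, eps=1e-6):
    """traj: array of shape (T_obs, N, 2) with absolute (x, y) coordinates.

    Returns:
        V: (T_obs, N, 2) relative displacements (zero at the first frame)
        A: (T_obs, N, N) weighted adjacency a_t^{ij} = 1 / (||v_i - v_j||_2 + eps)
    """
    T, N, _ = traj.shape
    V = np.zeros_like(traj)
    V[1:] = traj[1:] - traj[:-1]          # (Delta x_t^i, Delta y_t^i)

    A = np.zeros((T, N, N))
    for t in range(T):
        diff = traj[t, :, None, :] - traj[t, None, :, :]   # pairwise offsets
        dist = np.linalg.norm(diff, axis=-1)               # ||v_t^i - v_t^j||_2
        A[t] = 1.0 / (dist + eps)
        np.fill_diagonal(A[t], 0.0)       # assumption: no self-loop weight
    return V, A
```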

Step 2: Establish the spatio-temporal graph convolutional neural network for the preliminary extraction of feature information

In the present invention, for the data obtained from the graph representation of the raw data in Step 1, a spatio-temporal graph convolutional neural network performs the preliminary feature extraction.

In this architecture, a graph convolutional neural network is used, and an appropriate number of convolution layers is chosen to give a suitable number of feature iterations, so that the spatial trajectory features are extracted well.

For the input time series of feature graphs, the output is obtained through the constructed spatio-temporal graph convolutional neural network:

e_t = GNN(G_t) (1.6)

where GNN denotes the constructed spatio-temporal graph convolutional network, which produces its output through multiple layers of graph convolution, and e_t denotes the spatio-temporal feature information preliminarily extracted from the spatial dimension by the graph neural network.

This operation is applied at every time step, and the actual output of the graph convolutional network is the stack of this time series:

e_g = Stack(e_t) (1.7)

where Stack(·) denotes stacking the inputs along the extended dimension and e_g denotes the output of the graph convolution. In practice, the extended dimensions are fed into the graph neural network and processed in parallel.

A fully connected layer FC then applies an appropriate dimension transform to the features:

V_GNN = FC(e_g) (1.8)

This yields the preliminary feature extraction output of the spatio-temporal graph convolutional neural network.
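
As an illustration of eqs. (1.6)-(1.8), the PyTorch sketch below implements a small stack of graph convolutions. The symmetric adjacency normalization and all layer sizes are assumptions made for the sketch; the patent text does not fix them.

```python
import torch
import torch.nn as nn

class SpatialGraphConv(nn.Module):
    def __init__(self, in_dim=2, hid_dim=32, out_dim=32, n_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * (n_layers - 1) + [out_dim]
        self.layers = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims, dims[1:]))
        self.fc = nn.Linear(out_dim, out_dim)   # dimension transform of eq. (1.8)

    @staticmethod
    def normalize(A):
        # assumed symmetric normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}
        I = torch.eye(A.size(-1), device=A.device)
        A = A + I
        d = A.sum(-1).clamp(min=1e-6).pow(-0.5)
        return d.unsqueeze(-1) * A * d.unsqueeze(-2)

    def forward(self, V, A):
        # V: (T, N, 2) node features, A: (T, N, N) weighted adjacency
        A_hat = self.normalize(A)
        h = V
        for layer in self.layers:                 # multi-layer graph convolution
            h = torch.relu(A_hat @ layer(h))      # e_t = GNN(G_t), eq. (1.6)
        # stacking over t (eq. 1.7) is implicit: h keeps the leading time axis
        return self.fc(h)                         # V_GNN, eq. (1.8)
```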

Step 3: Establish the temporal feature transformation network, designing a convolutional neural network to extract the temporal features and transform the feature dimensions

After the feature extraction output of the spatio-temporal graph convolutional network is obtained, it is fed into a temporal feature transformation network to extract the temporal features. Since the feature dimensions have already been transformed appropriately by a fully connected layer in Step 2, the network module in this step uses the obtained features directly. In the present invention, a multi-layer CNN is chosen to process the temporal feature information, which can be expressed as:

e_c = CNN(V_GNN) (1.9)

where V_GNN denotes the feature information extracted by the graph convolutional network and e_c denotes the output of the temporal feature transformation network. A multi-layer perceptron MLP then increases the expressive power of the network:

V_CNN = MLP(e_c) (1.10)

Transforming and processing the features through the above networks yields the output V_CNN of the temporal feature transformation network.
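
A hedged PyTorch sketch of eqs. (1.9)-(1.10) follows. Treating the time axis as the channel dimension of a 1-D convolution (as in Social-STGCNN's TXP-CNN) is one plausible reading of the temporal feature transformation; the kernel size, layer count, and MLP width are assumptions.

```python
import torch
import torch.nn as nn

class TemporalTransform(nn.Module):
    def __init__(self, feat_dim=32, t_obs=8, t_pred=12, n_conv=3):
        super().__init__()
        convs = [nn.Conv1d(t_obs, t_pred, kernel_size=3, padding=1)]
        convs += [nn.Conv1d(t_pred, t_pred, kernel_size=3, padding=1)
                  for _ in range(n_conv - 1)]
        self.convs = nn.ModuleList(convs)
        self.mlp = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.PReLU(),
                                 nn.Linear(feat_dim, feat_dim))

    def forward(self, v_gnn):
        # v_gnn: (N, T_obs, F); convolving along time (channels = time steps)
        # maps the observed horizon onto the prediction horizon
        h = v_gnn
        for conv in self.convs:
            h = torch.relu(conv(h))               # e_c = CNN(V_GNN), eq. (1.9)
        return self.mlp(h)                        # V_CNN = MLP(e_c), eq. (1.10)
```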

Step 4: Establish the spatial-domain aggregation Transformer network for further aggregation of spatial features and output the pedestrian trajectory prediction sequence

Existing pedestrian trajectory prediction outputs suffer from insufficient interaction feature extraction, which blurs the spatial characteristics of pedestrians: on the one hand, the predicted trajectories carry large inertia and cannot perform wide-angle avoidance in high-speed or sudden situations; on the other hand, the motion consistency of pedestrian groups is not preserved, so closely associated pedestrians cannot maintain the same motion trend over a period of time.

In the present invention, to strengthen the connections between pedestrian features across the spatial domain, a spatial-domain Transformer network is designed to further spatially aggregate the extracted feature information. In particular, the temporal feature vector of one pedestrian forms one input token, and the extracted features of the different pedestrians are fed in sequence.

For the spatial-domain aggregation Transformer network, the encoder layer of the Transformer architecture is used; a positional encoding is first added to the input:

V_in = V_CNN + PE_{pos,i}(V_CNN) (1.11)

where pos denotes the relative position of the input feature and i denotes its dimension. A multi-head attention layer is then introduced; the three attention inputs Query (Q), Key (K), and Value (V) are obtained from the input by matrix transforms, the input features are divided according to the configured number of heads, and the attention scores are computed as follows:

Attention(Q, K, V) = softmax(Q K^T / √d_k) V (1.12)

head_i = Attention(Q_i, K_i, V_i) (1.13)

where i = 1, ..., n_head and n_head denotes the number of heads. The final multi-head output completes the feature extraction by concatenation:

V_Multi = ConCat(head_1, ..., head_h) W_o (1.14)

where ConCat denotes the concatenation operation and W_o denotes the output parameter matrix of the attention layer.

The final output of the spatial-domain Transformer is then produced by a feed-forward network followed by layer normalization:

V_out = LN(FFN(V_Multi)) (1.15)

Through this architecture, the preliminarily extracted spatio-temporal features are aggregated well into pedestrian spatial interaction features, so that the output trajectories better conform to the associations and interactions of pedestrians in the scene.
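
The encoder computation of eqs. (1.11)-(1.15) can be sketched as follows; one token corresponds to one pedestrian's temporal feature vector, so attention aggregates across the spatial domain. The sinusoidal positional encoding, head count, and layer widths are assumptions made for this sketch.

```python
import torch
import torch.nn as nn

class SpatialAggTransformer(nn.Module):
    def __init__(self, d_model=128, n_head=8, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    @staticmethod
    def positional_encoding(n_tokens, d_model, device):
        # assumed sinusoidal encoding over token position and feature dimension
        pos = torch.arange(n_tokens, device=device).unsqueeze(1)
        i = torch.arange(0, d_model, 2, device=device)
        angle = pos / torch.pow(10000.0, i / d_model)
        pe = torch.zeros(n_tokens, d_model, device=device)
        pe[:, 0::2] = torch.sin(angle)
        pe[:, 1::2] = torch.cos(angle)
        return pe

    def forward(self, v_cnn):
        # v_cnn: (1, N, d_model), one token per pedestrian in the scene
        x = v_cnn + self.positional_encoding(v_cnn.size(1), v_cnn.size(2),
                                             v_cnn.device)      # eq. (1.11)
        a, _ = self.attn(x, x, x)          # multi-head attention, eqs. (1.12)-(1.14)
        x = self.ln1(x + a)
        return self.ln2(x + self.ffn(x))   # FFN + layer norm, eq. (1.15)
```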

For the loss function, the sum of the negative log-likelihoods of every point on a pedestrian's predicted trajectory is used. The loss of the i-th pedestrian is:

L^i = − Σ_{t=T_obs+1}^{T_pred} log P((x_t^i, y_t^i) | μ_t^i, σ_t^i, ρ_t^i) (1.16)

where (μ_t^i, σ_t^i, ρ_t^i) are the unknown pedestrian trajectory distribution parameters to be predicted, and T_obs and T_pred denote the observation and prediction end times respectively; the sum of the losses of all pedestrians is the final loss:

L = Σ_{i=1}^{N} L^i (1.17)

Training the model is completed by forward loss computation and backward parameter updates on the model architecture proposed above, yielding reasonable pedestrian trajectory prediction outputs.
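
Assuming the network head outputs the five parameters of a bivariate Gaussian (μ_x, μ_y, σ_x, σ_y, ρ) per pedestrian per future frame, the loss of eqs. (1.16)-(1.17) can be sketched as follows (the exponent/tanh reparameterizations and clamping constants are our assumptions for numerical stability):

```python
import math
import torch

def bivariate_nll(pred, target, eps=1e-6):
    """pred: (N, T_pred, 5) raw network outputs; target: (N, T_pred, 2)."""
    mu = pred[..., 0:2]
    sigma = torch.exp(pred[..., 2:4])          # keep the scales positive
    rho = torch.tanh(pred[..., 4])             # correlation in (-1, 1)

    dx = (target[..., 0] - mu[..., 0]) / sigma[..., 0]
    dy = (target[..., 1] - mu[..., 1]) / sigma[..., 1]
    one_m_rho2 = (1 - rho ** 2).clamp(min=eps)

    # log density of the bivariate Gaussian at the ground-truth point
    log_p = -(dx ** 2 + dy ** 2 - 2 * rho * dx * dy) / (2 * one_m_rho2) \
            - torch.log(2 * math.pi * sigma[..., 0] * sigma[..., 1]
                        * torch.sqrt(one_m_rho2))
    return -log_p.sum()    # sum over frames (eq. 1.16) and pedestrians (eq. 1.17)
```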

In evaluating the accuracy and validity of the model, as in commonly used trajectory prediction evaluations, the Average Displacement Error (ADE) and the Final Displacement Error (FDE) are used as metrics to describe the accuracy of the predicted trajectories. The average displacement error is the mean L2 norm of the error between the predicted and true positions of every pedestrian in the scene at every time step, and the final displacement error is the mean L2 norm of the error between the predicted and true positions of every pedestrian at the final time step:

ADE = (1 / (N · T_p)) Σ_{i=1}^{N} Σ_{t=T_obs+1}^{T_pred} ||ŷ_t^i − y_t^i||_2 (1.18)

FDE = (1 / N) Σ_{i=1}^{N} ||ŷ_{T_pred}^i − y_{T_pred}^i||_2 (1.19)

where y_t^i denotes the ground-truth trajectory to be predicted, ŷ_t^i denotes the predicted pedestrian trajectory output by the model, T_pred denotes the final prediction time, and T_p denotes the prediction horizon. The FDE metric averages only the endpoint coordinate error of each pedestrian in the scene and places no strong requirement on the chosen walking route, whereas the ADE metric sums and averages the coordinate errors at every time step. For both metrics, a smaller value indicates a trajectory closer to the actual one and better prediction performance.
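
A direct implementation of the ADE/FDE definitions of eqs. (1.18)-(1.19) is sketched below; the array layout is an assumption.

```python
import numpy as np

def ade_fde(y_hat, y):
    """y_hat, y: predicted / ground-truth trajectories, shape (N, T_pred, 2)."""
    err = np.linalg.norm(y_hat - y, axis=-1)   # per-frame L2 error, (N, T_pred)
    ade = err.mean()                           # average over pedestrians and frames
    fde = err[:, -1].mean()                    # error at the final predicted frame
    return ade, fde
```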

Since the actual output is a probability distribution of the trajectory in the two-dimensional plane, when evaluating trajectory prediction performance it is common, in order to preserve trajectory diversity and generalization ability, to sample the prediction several times (e.g. 20 times) and take the sampled trajectory closest to the ground truth as the output trajectory for computing ADE/FDE and evaluating the model. Specifically, for the five datasets of ETH and UCY, pedestrian trajectory data are sampled every 0.4 seconds, and every 20 frames form one data sample; the model is trained and validated by taking the past 8 frames (3.2 s) of pedestrian trajectories as input and predicting the next 12 frames (4.8 s). The performance of the model of the present invention is compared with two other algorithms that likewise use graph network models; the results are shown in Table 1, with the best performance marked in red:
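
The best-of-20 protocol described above can be sketched as follows; `sample_trajectory` is a hypothetical hook that draws one sample trajectory from the model's predicted distribution, and selecting by lowest ADE is our assumption.

```python
import numpy as np

def best_of_n(sample_trajectory, y_true, n_samples=20):
    """sample_trajectory(): returns one (N, T_pred, 2) sampled trajectory;
    y_true: ground truth of the same shape. Returns (ADE, FDE) of the
    sample closest to the ground truth."""
    best_ade, best_fde = np.inf, np.inf
    for _ in range(n_samples):
        err = np.linalg.norm(sample_trajectory() - y_true, axis=-1)
        ade, fde = err.mean(), err[:, -1].mean()
        if ade < best_ade:                 # keep the closest sample's metrics
            best_ade, best_fde = ade, fde
    return best_ade, best_fde
```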

[Table 1 appears as an image in the original document: ADE/FDE results of this model and two graph-network baselines on the ETH/UCY datasets; the numerical values are not recoverable here.]

Table 1. Comparison of the prediction results of this model with mainstream graph-network models

As Table 1 shows, the proposed framework achieves a major breakthrough on the endpoint prediction problem: the FDE metric is the best on almost all datasets, and the average ADE and FDE are also the best. Compared with the better of the two graph network algorithms, this model improves FDE by 17%, 21%, 5%, and 12% on ETH, UNIV, ZARA1, and ZARA2 respectively, and improves the average FDE by 16%. These data show that, by feeding the pedestrian temporal feature vectors into a spatial-domain aggregation Transformer architecture and focusing on exploiting the features extracted by the spatio-temporal graph neural network and the temporal feature transformation network, this model aggregates spatial pedestrian interaction features better, achieves a better prediction effect, makes a large breakthrough on FDE, and perceives and expresses the interaction characteristics of pedestrians in space more strongly.

Claims (8)

1.一种基于时空图与空域聚合Transformer网络的轨迹预测方法,其特征在于,包括以下几个步骤:1. a trajectory prediction method based on spatiotemporal map and spatial domain aggregation Transformer network, is characterized in that, comprises the following steps: (1)利用图的特性从输入的原始数据中对场景内行人轨迹特征信息进行图表示与预处理,选取合适的核函数完成对邻接矩阵的构建,为后续网络架构输入提供准确、高效的场景内行人轨迹特征信息;(1) Use the characteristics of the graph to represent and preprocess the pedestrian trajectory feature information in the scene from the input original data, select the appropriate kernel function to complete the construction of the adjacency matrix, and provide an accurate and efficient scene for the subsequent network architecture input Insider trajectory feature information; (2)建立时空图卷积神经网络模块,构建图卷积神经网络,通过选择对行人轨迹特征的图卷积次数完成对步骤(1)中图表示与预处理后的行人轨迹特征信息的初步提取,确保提取特征的准确、有效;(2) Establish a spatiotemporal graph convolutional neural network module, and construct a graph convolutional neural network. By selecting the number of graph convolutions for pedestrian trajectory features, the preliminary representation of the graph in step (1) and the preprocessed pedestrian trajectory feature information are completed. Extraction to ensure that the extracted features are accurate and effective; (3)建立时序特征变换网络模块,通过设计卷积神经网络完成时序特征信息的提取与特征维度的变换;(3) Establish a time series feature transformation network module, and complete the extraction of time series feature information and the transformation of feature dimensions by designing a convolutional neural network; (4)建立空域聚合Transformer网络,使用场景内每个行人的时序特征向量作为输入向量,同时输入Transformer网络进行空域特征的进一步聚合,并且完成行人轨迹预测序列的输出。(4) Establish a spatial aggregation Transformer network, use the time series feature vector of each pedestrian in the scene as the input vector, and input the Transformer network to further aggregate spatial features, and complete the output of the pedestrian trajectory prediction sequence. 2.如权利要求1所述的一种基于时空图与空域聚合Transformer网络的轨迹预测方法,其特征在于,所述步骤(1)中,引入时空图对输入的原始行人轨迹数据进行图表示,从多种核函数中选择合适的核函数构建图意义下的邻接矩阵,完成高效的场景内行人特征构建与选择,为后续建模提供准确、高效的信息。2. a kind of trajectory prediction method based on spatiotemporal map and spatial domain aggregation Transformer network as claimed in claim 1, it is characterized in that, in described step (1), introduce spatiotemporal map to the original pedestrian trajectory data of input to carry out graph representation, Select the appropriate kernel function from a variety of kernel functions to construct an adjacency matrix in the sense of graph, complete the efficient construction and selection of pedestrian features in the scene, and provide accurate and efficient information for subsequent modeling. 3.如权利要求2所述的一种基于时空图与空域聚合Transformer网络的轨迹预测方法,其特征在于,所述引入时空图对输入的原始行人轨迹数据进行图表示具体为:对于每个时刻t,引入一个空间图Gt,用来表示每个时间点行人间的交互特征关系;Gt定义为Gt=(Vt,Et),其中,Vt具体表示时刻t场景内行人的坐标信息,即
Figure FDA0003722819750000011
每个
Figure FDA0003722819750000012
的特征信息使用观测的相对坐标变化
Figure FDA0003722819750000013
来进行刻画,即:
3. a kind of trajectory prediction method based on spatiotemporal map and spatial domain aggregation Transformer network as claimed in claim 2, it is characterized in that, described introducing spatiotemporal map to the original pedestrian trajectory data of input carries out graph representation and is specifically: for each moment t, a spatial graph G t is introduced to represent the interactive feature relationship between pedestrians at each time point; G t is defined as G t =(V t , E t ), where V t specifically represents the time t of pedestrians in the scene. coordinate information, i.e.
Figure FDA0003722819750000011
each
Figure FDA0003722819750000012
The feature information of using the observed relative coordinate changes
Figure FDA0003722819750000013
to characterize, that is:
Figure FDA0003722819750000014
Figure FDA0003722819750000014
Figure FDA0003722819750000015
Figure FDA0003722819750000015
其中,i=1,…,N,t=2,…,Tobs,对于初始时刻,规定其位置相对偏移为0,即
Figure FDA0003722819750000016
Among them, i=1,...,N, t=2,...,T obs , for the initial moment, the relative position offset is defined as 0, that is,
Figure FDA0003722819750000016
Et则表示空间图Gt的边信息,其是一个维度大小为n×n的矩阵;定义为
Figure FDA0003722819750000021
Figure FDA0003722819750000022
的取值由如下方式给出:
E t represents the side information of the spatial graph G t , which is a matrix with a dimension of n×n; it is defined as
Figure FDA0003722819750000021
Figure FDA0003722819750000022
The value of is given by:
if node v_t^i and node v_t^j are connected, then e_t^{ij} = 1; conversely, if node v_t^i and node v_t^j are not connected, then e_t^{ij} = 0.
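For illustration only (not part of the claimed method), a minimal NumPy sketch of the graph representation of claim 3 could look as follows; the function name build_graph_nodes_edges, the array shapes, and the fully connected choice of e_t^{ij} = 1 for every pair of distinct pedestrians are assumptions made here, since the claim leaves the connectivity rule open:

```python
import numpy as np

def build_graph_nodes_edges(traj):
    """Build V_t (relative-displacement node features) and binary E_t
    for every observed time step.

    traj: array of shape (T_obs, N, 2) holding absolute (x, y)
          coordinates of N pedestrians over T_obs time steps.
    Returns:
        V: (T_obs, N, 2) relative coordinate changes (0 at t = 0, as in claim 3)
        E: (T_obs, N, N) binary connectivity (assumed fully connected, no self-loops)
    """
    T, N, _ = traj.shape
    V = np.zeros_like(traj)
    V[1:] = traj[1:] - traj[:-1]          # (dx_t^i, dy_t^i) = p_t^i - p_{t-1}^i
    # The claim only states e = 1 for connected node pairs; connecting every
    # pair of pedestrians in the scene is a common choice assumed here.
    E = np.ones((T, N, N)) - np.eye(N)    # e_t^{ij} = 1 if i != j, else 0
    return V, E
```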
4. The trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network according to claim 2, characterized in that selecting a suitable kernel function from several candidate kernel functions to construct the adjacency matrix in the graph sense specifically comprises:

introducing a weighted adjacency matrix A_t to give a weighted representation of the node information of the pedestrian spatial graph, where the magnitude of the mutual influence between pedestrians is obtained through a kernel-function transformation and stored in the weighted adjacency matrix A_t;

choosing the reciprocal of the distance between two nodes in Euclidean space as the kernel function and, to avoid the divergence that occurs when two nodes are too close to each other, adding a small constant ε, which also accelerates model convergence; the expression is as follows:
a_t^{ij} = 1 / (‖v_t^i − v_t^j‖_2 + ε)
Stacking the spatial graphs G_t of the individual time steps along the time dimension then yields the spatio-temporal graph sequence for pedestrian trajectory prediction under the graph representation, G = {G_1, ..., G_T}.
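A matching sketch of the kernel of claim 4, again illustrative rather than normative (the value eps=1e-6 and the zeroed diagonal are assumptions; the claim defines the kernel only for pairs of nodes):

```python
import numpy as np

def weighted_adjacency(positions, eps=1e-6):
    """Kernel of claim 4: the reciprocal of the Euclidean distance
    between two nodes, stabilised by the small constant eps.
    positions: (N, 2) absolute pedestrian coordinates at one time step.
    """
    diff = positions[:, None, :] - positions[None, :, :]  # pairwise differences
    dist = np.linalg.norm(diff, axis=-1)                  # ||v_t^i - v_t^j||_2
    A = 1.0 / (dist + eps)
    np.fill_diagonal(A, 0.0)                              # zero self-influence (assumed)
    return A
```

Applying this at every time step and stacking the results gives the weighted counterpart of the sequence G = {G_1, ..., G_T}.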
5. The trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network according to claim 4, characterized in that step (2) specifically comprises:

for the input feature-graph time series, obtaining the output through the constructed spatio-temporal graph convolutional neural network:

e_t = GNN(G_t) (1.6)

where GNN denotes the constructed spatio-temporal graph convolutional neural network, which produces its output through multiple layers of iterated graph convolutions, and e_t denotes the spatio-temporal feature information preliminarily extracted from the spatial dimension by the graph neural network;

this operation is applied at every time step, and the actual output of the graph convolutional neural network is the stack of this time series:

e_g = Stack(e_t) (1.7)

where Stack(·) denotes stacking the inputs along an extended dimension and e_g denotes the output of the graph convolution; in actual processing, the multiple extended dimensions are fed into the graph neural network in parallel;

a fully connected layer FC then applies an appropriate dimension transformation to the features:

V_GNN = FC(e_g) (1.8)

which yields the preliminary feature-extraction output of the spatio-temporal graph convolutional neural network.

6. The trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network according to claim 1, characterized in that, in step (3), the output of the spatio-temporal graph convolutional neural network, after an appropriate dimension transformation, is passed through a CNN-based temporal feature transformation network module with a designed number of convolutions to extract the feature information of each pedestrian's own historical trajectory.
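The spatio-temporal graph convolution of claim 5 (eqs. (1.6)-(1.8)) might be sketched in PyTorch as follows; layer widths, depth, the ReLU nonlinearity and the row-normalisation of A are choices made here for the sketch, not values fixed by the claim:

```python
import torch
import torch.nn as nn

class STGraphConv(nn.Module):
    """Sketch of claim 5: per-time-step graph convolution (eq. 1.6),
    implicit stacking over time (eq. 1.7) and a fully connected
    dimension transform (eq. 1.8)."""

    def __init__(self, in_dim=2, hid_dim=32, out_dim=64, n_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * n_layers
        self.gconvs = nn.ModuleList(
            [nn.Linear(dims[k], dims[k + 1]) for k in range(n_layers)])
        self.fc = nn.Linear(hid_dim, out_dim)    # the FC of eq. (1.8)

    def forward(self, V, A):
        """V: (T, N, in_dim) node features; A: (T, N, N) weighted adjacency."""
        # Row-normalise A so aggregated neighbour features stay bounded.
        A = A / A.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        h = V
        for gconv in self.gconvs:                 # e_t = GNN(G_t), eq. (1.6)
            h = torch.relu(gconv(torch.bmm(A, h)))
        # h already holds every time step, i.e. the stack e_g of eq. (1.7);
        # all time steps are processed in parallel, as the claim notes.
        return self.fc(h)                         # V_GNN = FC(e_g), eq. (1.8)
```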
7. The trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network according to claim 6, characterized in that step (3) specifically comprises:

after the feature-extraction information of the spatio-temporal graph convolutional neural network is obtained, feeding it into a temporal feature transformation network to extract the temporal features; since the dimension features have already been suitably transformed by a fully connected layer in step (2), the network module in this step uses the obtained feature information directly; in the present invention, a multi-layer CNN convolutional neural network is chosen to process the feature information along the time dimension, which can be expressed as:

e_c = CNN(V_GNN) (1.9)

where V_GNN denotes the feature information extracted by the graph convolutional neural network and e_c denotes the output of the temporal feature transformation network; this is followed by a multi-layer perceptron MLP to increase the expressive capacity of the network:

V_CNN = MLP(e_c) (1.10)

the feature transformation and processing performed by the above network yields the output V_CNN of the temporal feature transformation network.
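Claim 7 (eqs. (1.9)-(1.10)) admits a similarly small PyTorch sketch; kernel size, depth and channel width are assumed values:

```python
import torch
import torch.nn as nn

class TemporalTransform(nn.Module):
    """Sketch of claim 7: a multi-layer 1-D CNN over the time axis
    (eq. 1.9) followed by an MLP (eq. 1.10)."""

    def __init__(self, feat_dim=64, n_conv=3, kernel=3):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(feat_dim, feat_dim, kernel, padding=kernel // 2)
            for _ in range(n_conv)])
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))

    def forward(self, v_gnn):
        """v_gnn: (N, T, feat_dim) per-pedestrian feature sequences."""
        h = v_gnn.transpose(1, 2)                # Conv1d expects (N, C, T)
        for conv in self.convs:                  # e_c = CNN(V_GNN), eq. (1.9)
            h = torch.relu(conv(h))
        e_c = h.transpose(1, 2)
        return self.mlp(e_c)                     # V_CNN = MLP(e_c), eq. (1.10)
```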
8. The trajectory prediction method based on a spatio-temporal graph and a spatial-domain aggregation Transformer network according to claim 1, characterized in that step (4) specifically comprises:

taking the feature vectors of one pedestrian over the time sequence as an input vector, with the extracted features of the different pedestrians input in turn;

for the spatial-domain aggregation Transformer network, adopting the encoder layer of the Transformer architecture, where position encoding is first added to the input:

V_in = V_CNN + PE_pos,i(V_CNN) (1.11)

where pos denotes the relative position of the input feature and i denotes the dimension of the input feature; a multi-head attention layer is then introduced, which uses the three attention inputs Query, Key and Value obtained from the input layer by matrix transformations, divides the input features according to the configured number of heads, and computes the attention scores as follows:

Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_k) V_i (1.12)

where d_k denotes the dimension of the Key vectors;
head_i = Attention(Q_i, K_i, V_i) (1.13)

where i = 1, ..., n_head and n_head denotes the number of heads; the final multi-head output completes the feature extraction by concatenation, with the following expression:

V_Multi = ConCat(head_1, ..., head_h) W_o (1.14)

where ConCat denotes the concatenation operation and W_o denotes the output parameter matrix of the attention layer;

the final output of the spatial-domain Transformer is then obtained through a feed-forward neural network followed by layer normalization, expressed as:

V_out = LN(Feedback(V_Multi)) (1.15)

as for the loss function, the sum of the negative log-likelihoods of every point on a pedestrian's predicted trajectory is chosen as the loss function; the loss function of the i-th pedestrian is expressed as follows:
L^i = −Σ_{t=T_obs+1}^{T_pred} log P((x_t^i, y_t^i) | Ŵ_t^i) (1.16)
where Ŵ_t^i denotes the unknown pedestrian-trajectory feature parameters to be predicted, and T_obs and T_pred denote the observation and prediction end times, respectively; the sum of the loss functions over all pedestrians then gives the final loss function:
L = Σ_{i=1}^{N} L^i (1.17)
Performing forward loss-function computation and backward parameter updates on the above model architecture completes the training of the model and yields a reasonable predicted pedestrian trajectory output.
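Finally, a hedged PyTorch sketch of the spatial-domain aggregation and the training loss of claim 8; the single encoder layer, the head dimensions, and in particular the bivariate-Gaussian reading of the unknown trajectory feature parameters Ŵ_t^i in eq. (1.16) are assumptions consistent with, but not dictated by, the claim text:

```python
import math
import torch
import torch.nn as nn

class SpatialAggregation(nn.Module):
    """Sketch of claim 8: one Transformer encoder layer aggregates the
    per-pedestrian temporal feature vectors across the scene (position
    encoding, multi-head attention, feed-forward and layer normalisation,
    eqs. (1.11)-(1.15)); a linear head emits the predicted per-step
    distribution parameters of eq. (1.16)."""

    def __init__(self, d_model=64, n_head=8, t_pred=12):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.head = nn.Linear(d_model, 5 * t_pred)  # per step: mu_x, mu_y, log_sx, log_sy, rho
        self.t_pred = t_pred

    def forward(self, v_cnn, pos_enc):
        """v_cnn, pos_enc: (1, N, d_model); one scene of N pedestrians."""
        v_in = v_cnn + pos_enc                      # eq. (1.11)
        v_out = self.encoder(v_in)                  # eqs. (1.12)-(1.15)
        return self.head(v_out).view(-1, self.t_pred, 5)

def trajectory_nll(params, gt):
    """Loss of eqs. (1.16)-(1.17), assuming a bivariate-Gaussian
    parameterisation of each predicted point (an assumption here).
    params: (N, T_pred, 5); gt: (N, T_pred, 2) ground-truth (x, y)."""
    mu = params[..., 0:2]
    sx, sy = params[..., 2].exp(), params[..., 3].exp()  # scales kept positive
    rho = torch.tanh(params[..., 4])                     # correlation in (-1, 1)
    zx = (gt[..., 0] - mu[..., 0]) / sx
    zy = (gt[..., 1] - mu[..., 1]) / sy
    q = zx ** 2 - 2 * rho * zx * zy + zy ** 2
    log_p = -q / (2 * (1 - rho ** 2)) - torch.log(
        2 * math.pi * sx * sy * torch.sqrt(1 - rho ** 2))
    return -log_p.sum()   # summed over every point of every pedestrian, eq. (1.17)
```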
CN202210767796.8A 2022-06-30 2022-06-30 Trajectory prediction method based on space-time diagram and space-domain aggregation Transformer network Active CN114997067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210767796.8A CN114997067B (en) Trajectory prediction method based on space-time diagram and space-domain aggregation Transformer network

Publications (2)

Publication Number Publication Date
CN114997067A true CN114997067A (en) 2022-09-02
CN114997067B CN114997067B (en) 2024-07-19

Family

ID=83019465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210767796.8A Active CN114997067B (en) 2022-06-30 2022-06-30 Trajectory prediction method based on space-time diagram and space-domain aggregation Transformer network

Country Status (1)

Country Link
CN (1) CN114997067B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255597A (en) * 2021-06-29 2021-08-13 南京视察者智能科技有限公司 Transformer-based behavior analysis method and device and terminal equipment thereof
CN113762595A (en) * 2021-07-26 2021-12-07 清华大学 Traffic time prediction model training method, traffic time prediction method and equipment
CN113837148A (en) * 2021-11-04 2021-12-24 昆明理工大学 Pedestrian trajectory prediction method based on self-adjusting sparse graph transform
CN114117892A (en) * 2021-11-04 2022-03-01 中通服咨询设计研究院有限公司 Method for predicting road traffic flow under distributed system
CN114267084A (en) * 2021-12-17 2022-04-01 北京沃东天骏信息技术有限公司 Video recognition method, device, electronic device and storage medium
CN114626598A (en) * 2022-03-08 2022-06-14 南京航空航天大学 A Multimodal Trajectory Prediction Method Based on Semantic Environment Modeling
CN114638408A (en) * 2022-03-03 2022-06-17 南京航空航天大学 A Pedestrian Trajectory Prediction Method Based on Spatio-temporal Information
CN114757975A (en) * 2022-04-29 2022-07-15 华南理工大学 Pedestrian Trajectory Prediction Method Based on Transformer and Graph Convolutional Network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHE HUANG: "Learning Sparse Interaction Graphs of Partially Detected Pedestrians for Trajectory Prediction", IEEE Robotics and Automation Letters, vol. 7, no. 2, 28 December 2021 (2021-12-28), pages 1198-1205 *
CHENG Xingcheng: "Research on Pedestrian Trajectory Prediction Based on Transformer and Graph Convolutional Networks", China Master's Theses Full-text Database, Information Science and Technology, no. 2023, 15 December 2023 (2023-12-15), pages 138-34 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115392595B (en) * 2022-10-31 2022-12-27 北京科技大学 Time-space short-term wind speed prediction method and system based on graph convolution neural network and Transformer
CN115392595A (en) * 2022-10-31 2022-11-25 北京科技大学 Spatio-temporal short-term wind speed prediction method and system based on graph convolutional neural network and Transformer
WO2024119489A1 (en) * 2022-12-09 2024-06-13 中国科学院深圳先进技术研究院 Pedestrian trajectory prediction method, system, device, and storage medium
CN115881286A (en) * 2023-02-21 2023-03-31 创意信息技术股份有限公司 Epidemic prevention management scheduling system
CN115881286B (en) * 2023-02-21 2023-06-16 创意信息技术股份有限公司 Epidemic prevention management scheduling system
CN115966313A (en) * 2023-03-09 2023-04-14 创意信息技术股份有限公司 Integrated management platform based on face recognition
CN115966313B (en) * 2023-03-09 2023-06-09 创意信息技术股份有限公司 Integrated management platform based on face recognition
WO2024193334A1 (en) * 2023-03-22 2024-09-26 重庆邮电大学 Automatic trajectory prediction method based on graph spatial-temporal pyramid
CN117523821A (en) * 2023-10-09 2024-02-06 苏州大学 Vehicle multi-modal driving behavior trajectory prediction system and method based on GAT-CS-LSTM
CN117493424A (en) * 2024-01-03 2024-02-02 湖南工程学院 Vehicle track prediction method independent of map information
CN117493424B (en) * 2024-01-03 2024-03-22 湖南工程学院 A vehicle trajectory prediction method that does not rely on map information
CN117933492B (en) * 2024-03-21 2024-06-11 中国人民解放军海军航空大学 Long-term prediction method of ship track based on spatiotemporal feature fusion
CN117933492A (en) * 2024-03-21 2024-04-26 中国人民解放军海军航空大学 Ship track long-term prediction method based on space-time feature fusion
CN118629006A (en) * 2024-06-11 2024-09-10 南通大学 A Pedestrian Trajectory Prediction Method Based on Sparse Spatiotemporal Graph Transformer Network
CN119131893A (en) * 2024-08-21 2024-12-13 国家电网有限公司华东分部 Method, device and equipment for human behavior recognition in power production based on skeleton

Also Published As

Publication number Publication date
CN114997067B (en) 2024-07-19

Similar Documents

Publication Publication Date Title
CN114997067B (en) Trajectory prediction method based on space-time diagram and space-domain aggregation Transformer network
CN114898293B (en) A multimodal trajectory prediction method for pedestrian groups crossing the street for autonomous vehicles
CN111432015B (en) A full-coverage task assignment method for dynamic noise environments
CN115829171B (en) Pedestrian track prediction method combining space-time information and social interaction characteristics
CN111339867A (en) A Pedestrian Trajectory Prediction Method Based on Generative Adversarial Networks
Yang et al. Long-short term spatio-temporal aggregation for trajectory prediction
CN115331460B (en) A large-scale traffic signal control method and device based on deep reinforcement learning
CN114626598B (en) A multimodal trajectory prediction method based on semantic environment modeling
He et al. IRLSOT: Inverse reinforcement learning for scene‐oriented trajectory prediction
CN118296090A (en) A trajectory prediction method based on multi-dimensional spatiotemporal feature fusion for autonomous driving
CN115659275A (en) Real-time accurate trajectory prediction method and system in unstructured human-computer interaction environment
CN118261051A (en) A method for constructing a pedestrian and vehicle trajectory prediction model at intersections based on heterogeneous graph networks
Liu et al. Multi-agent trajectory prediction with graph attention isomorphism neural network
CN117389333A (en) Unmanned aerial vehicle cluster autonomous cooperation method under communication refusing environment
CN117522920A (en) Pedestrian track prediction method based on improved space-time diagram attention network
CN114723782A (en) A moving target perception method in traffic scene based on heterogeneous graph learning
CN117314956A (en) Interactive pedestrian track prediction method based on graphic neural network
CN116629116A (en) Two-layer data-driven ship trajectory prediction method and system based on GRU network
Wang et al. Human trajectory prediction using stacked temporal convolutional network
Song et al. Multimodal Model Prediction of Pedestrian Trajectories Based on Graph Convolutional Neural Networks
Lin et al. OST-HGCN: Optimized Spatial–Temporal Hypergraph Convolution Network for Trajectory Prediction
Zhou et al. REGION: Relevant Entropy Graph spatIO-temporal convolutional Network for Pedestrian Trajectory Prediction
Wu et al. Traffic Speed Forecasting using GCN and BiGRU
CN116738814A (en) Mobile robot navigation method based on space-time transducer
Liu et al. Toward Efficient Self-Motion-Based Memory Representation for Visuomotor Navigation of Embodied Robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant