CN116629115A - Bidirectional data driving ship track prediction method and system based on attention mechanism - Google Patents
- Publication number
- CN116629115A (application number CN202310583916.3A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- machine learning
- learning model
- input
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses a bidirectional data-driven ship track prediction method and system based on an attention mechanism, and relates to the technical field of track prediction. An observation sequence of forward length g is obtained from an AIS data set and input into a first machine learning model to obtain an intermediate prediction sequence of length l; at the same time, an observation sequence of backward length g is obtained from the AIS data set and input into a second machine learning model to obtain a second intermediate prediction sequence of length l. The two intermediate prediction sequences are spliced to form composite training data, which serves as the input of a third machine learning model to obtain the final prediction result. The method solves the limitation and low prediction accuracy of prior-art methods that predict the future track from one direction only.
Description
Technical Field
The invention relates to the technical field of track prediction, in particular to a bidirectional data driving ship track prediction method and system based on an attention mechanism.
Background
The Automatic Identification System (AIS) is an important technology for ensuring the safety of marine transportation. An AIS system records static information of a ship, such as its Maritime Mobile Service Identity (MMSI), as well as dynamic information such as position and speed over ground. AIS supports offshore navigation decision-making, information broadcasting, collision avoidance at sea, and environmental protection. However, marine vessels are susceptible to problems such as bad weather or mis-planned routes, which reduce sailing efficiency; collisions and channel blockage may also occur due to improper handling by the captain and poor maneuverability. There is therefore a need to explore the great potential of AIS data for maritime risk early warning and route optimization.
Current ship track prediction methods mainly train a model on unidirectional historical track data to predict the future track. They do not consider bidirectional historical track data, i.e., the historical track data recorded after the target prediction window is left out of consideration. In traffic-scene reconstruction and prediction tasks for some maritime areas, existing methods therefore under-exploit the ship's multidirectional historical track data, leaving room to improve prediction accuracy.
Linear models handle track prediction well when a ship sails in a straight line. A typical linear model is the constant velocity model (CVM); representative linear prediction methods include kinematics-based models, constant velocity models, the Ornstein-Uhlenbeck model, and Kalman filter variants. However, linear models perform poorly when the vessel must change heading or speed, or navigate through turns. To overcome this weakness, researchers have studied nonlinear models to improve prediction accuracy. Typical nonlinear trajectory prediction models are machine learning and deep learning based methods. Common machine learning methods include trajectory clustering, support vector machines, and Gaussian models; however, the accuracy of machine-learning-based methods relies on labels and physical domain knowledge. Deep-learning-based methods can capture more features of the input data and improve track prediction accuracy; common deep learning predictors include artificial neural networks, recurrent neural networks, and their variants. Existing methods, however, only consider training and learning on unidirectional historical track data to predict the target track, and research on bidirectional historical track data is lacking. In reconstructed historical sea traffic scenes, how to use bidirectional historical track data features to improve track prediction accuracy remains a difficulty.
Disclosure of Invention
The invention aims to provide a bidirectional data-driven ship track prediction method based on an attention mechanism, so as to solve the prior-art problems of insufficient extraction of bidirectional historical track data features and low prediction accuracy.
In order to solve these technical problems, the invention adopts the following technical scheme: a bidirectional data-driven ship track prediction method based on an attention mechanism, comprising the following steps:
S1, acquiring an observation sequence X_g^f of forward length g from an AIS data set and inputting it into a first machine learning model to obtain an intermediate prediction sequence Ŷ_l^f of length l; at the same time, acquiring an observation sequence X_g^b of backward length g from the AIS data set and inputting it into a second machine learning model to obtain an intermediate prediction sequence Ŷ_l^b of length l;
S2, splicing the intermediate prediction sequence Ŷ_l^f of the first machine learning model and the intermediate prediction sequence Ŷ_l^b of the second machine learning model to form composite training data;
S3, taking the composite training data as the input of a third machine learning model to obtain the prediction result.
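Hedged in code, the three-step flow S1-S3 can be sketched as follows; the three `model_*` callables are hypothetical stand-ins for the trained sub-blocks, and concatenation along the feature axis is an assumption about the splicing operation:

```python
import numpy as np

def predict_bidirectional(x_fwd, x_bwd, model_f, model_b, model_fuse):
    """S1-S3: run both sub-blocks, splice their outputs, fuse.

    x_fwd, x_bwd: (g, d) forward/backward observation sequences.
    model_f, model_b, model_fuse: placeholder callables standing in
    for the trained GRU+attention, BiGRU+attention and MLP blocks.
    """
    y_f = model_f(x_fwd)                             # S1: (l, d) intermediate prediction
    y_b = model_b(x_bwd)                             # S1: (l, d) intermediate prediction
    composite = np.concatenate([y_f, y_b], axis=-1)  # S2: splice to (l, 2d)
    return model_fuse(composite)                     # S3: final (l, d) prediction
```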
Preferably, the acquisition process of the first, second and third machine learning models includes: taking the acquired ship time-series track data as the AIS data set and dividing it into a training set and a test set; taking the training set as the input of the forward sub-block and backward sub-block models and training them to obtain the first and second machine learning models; and combining the intermediate outputs of the optimal forward sub-block and backward sub-block models obtained during training into a composite fusion training set, inputting it into the fusion prediction block, and training the fusion prediction block to obtain the third machine learning model.
More preferably, the first machine learning model is composed of a GRU neural network and an attention mechanism; the second machine learning model is composed of a BiGRU neural network and an attention mechanism; the third machine learning model is a multi-layer perceptron network.
More preferably, in step S1, the first machine learning model maps the input forward observation sequence X_g^f to an output sequence; the acquisition process of the intermediate prediction sequence Ŷ_l^f of length l includes:
sequentially inputting the elements of the forward observation sequence X_g^f = (x_1, x_2, ..., x_g) in the data set into the first machine learning model, wherein t denotes the position in the current forward track sequence;
updating the hidden sequence H^f = (h_1, h_2, ..., h_g) of the first machine learning model as h_t = GRU(x_t, h_{t-1}; θ_gru), wherein each element h_t represents the feature extracted from the sample point at time t among the g intermediate feature states of the forward track sequence, x_t is the track point input to the GRU neural network at the current time, i.e., the input sample point of the first machine learning model, and θ_gru denotes the parameter values in each input-to-output mapping;
computing the forward attention distribution coefficient α_t of the t-th sample point in the forward track sequence as α_t = exp(e_t) / Σ_k exp(e_k), where the attention weighting score of the t-th sample point is e_t = v^T tanh(W h_t + U s); v, W, U are the network parameters of the additive calculation process, h_t is the spatio-temporal feature code extracted by the GRU network from the input sample point at time t of the g-length input track sequence, and s is the spatio-temporal feature extracted from the last sample point of the g-length input sequence;
θ_gru denotes the set of parameters of the GRU mapping process in the first machine learning model, obtained by minimizing the quantization error L = (1/N) Σ_{i=1}^{N} Σ_{j=1}^{l} (ŷ_j^{(i)} − y_j^{(i)})², where x_i denotes the i-th time-series point in the forward track sequence of length g input into the first machine learning model, L denotes the quantization error, N the total number of training samples, ŷ_j the j-th time-series point in the mapped output sample of length l, and θ the parameter values in each input-to-output mapping;
M_{g,l} denotes the mapping that, given a forward input sequence X_g of length g, predicts the output sequence Y_l of length l so as to maximize the conditional probability: M_{g,l} = argmax_Y p(Y_l | X_g), where p(Y_l | X_g) is the probability of mapping the given g observation points X_g to the future l predicted track points Y_l;
inputting the vector composed of the outputs of all hidden layers into a fully connected layer to obtain the intermediate prediction sequence Ŷ_l^f of length l.
In step S1, the second machine learning model maps the input backward observation sequence X_g^b to an output sequence; the acquisition process of the intermediate prediction sequence Ŷ_l^b of length l includes:
sequentially inputting the elements of the backward observation sequence X_g^b = (x_1, x_2, ..., x_g) in the data set into the second machine learning model, wherein t denotes the position in the current backward track sequence;
updating the hidden sequence H^b = (h_1, h_2, ..., h_g) of the second machine learning model, wherein each element h_t represents the feature extracted from the sample point at time t among the g intermediate feature states of the backward input sequence;
computing the backward attention distribution coefficient α_t of the t-th sample point in the backward track sequence as α_t = exp(e_t) / Σ_k exp(e_k), where the attention weighting score of the t-th sample point is e_t = v^T tanh(W h_t + U s); v, W, U are the network parameters of the additive calculation process, and h_t is the spatio-temporal feature code extracted by the BiGRU network from the input sample point at time t of the g-length input track sequence;
the BiGRU network fuses the hidden states of the forward layer and the reverse layer according to h_t = W(h_t^→ ⊕ h_t^←) + b, wherein ⊕ denotes splicing, W is a weight matrix of the network, and b is a bias value;
wherein h_t^→ = GRU(x_t, h_{t-1}^→; θ^→) and h_t^← = GRU(x_t, h_{t+1}^←; θ^←), and x_t denotes the track point of the backward sequence input at the current time t, i.e., the input sample point of the second machine learning model;
θ^→ denotes the set of parameters of the forward-layer mapping process in the second machine learning model, with x_i the i-th time-series point in the backward track sequence of length g input into the forward layer; θ^← denotes the set of parameters of the reverse-layer mapping process, with x_i the i-th time-series point in the backward track sequence of length g input into the reverse layer; L denotes the quantization error, N the total number of training samples, ŷ_j the j-th time-series point in the mapped output sample of length l, and θ the parameter values in each input-to-output mapping;
inputting the vector composed of the outputs of all hidden layers into a fully connected layer to obtain the intermediate prediction sequence Ŷ_l^b of length l.
More preferably, in step S3, the third machine learning model maps the spliced intermediate prediction sequence Ŷ_l^{fb} to an output sequence; the acquisition process of the predicted output Ŷ_l includes:
sequentially inputting the elements of the spliced intermediate prediction sequence into the third machine learning model, wherein t denotes the t-th intermediate prediction pair in the current intermediate prediction sequence;
obtaining the hidden sequence of the third machine learning model as h_t = MLP(ŷ_t^{fb}, h_{t-1}; θ), wherein ŷ_t^{fb} denotes the intermediate prediction splice value input at the current time t, i.e., the input data of the third machine learning model, h_{t-1} denotes the output hidden-state value obtained at the previous time, and θ denotes the parameter values of each mapping from an input splice value to an output;
computing the output of the j-th hidden layer as h_j = σ_j(w_j h_{j-1} + b_j), wherein the hidden state of the 1st hidden layer is h_1 = σ(Σ_{t=1}^{l} w_t ŷ_t^{fb} + b_t), σ is the activation function of the 1st hidden layer, l is the total number of inputs, b_t is the bias of the 1st hidden layer, w_t is the weight of the connection layer, σ_j is a nonlinear activation function of the j-th hidden layer with a learnable parameter θ, b_j is the bias of the j-th hidden layer, and w_j is the weight of the j-th connection layer;
inputting the vector composed of the outputs of all hidden layers into a fully connected layer to obtain the predicted output sequence Ŷ_l of length l.
In addition, the invention also provides a bidirectional data-driven ship track prediction system based on an attention mechanism, comprising:
a data acquisition module for acquiring, from the AIS data set, an observation sequence X_g^f of forward length g and an observation sequence X_g^b of backward length g;
a data processing module for inputting the observation sequence X_g^f into the first machine learning model to obtain an intermediate prediction sequence Ŷ_l^f of length l, and for inputting the observation sequence X_g^b into the second machine learning model to obtain an intermediate prediction sequence Ŷ_l^b of length l;
a data splicing module for splicing the intermediate prediction sequence Ŷ_l^f of the first machine learning model and the intermediate prediction sequence Ŷ_l^b of the second machine learning model to obtain composite training data;
and a result prediction module for taking the composite training data as the input of the third machine learning model to obtain the prediction result.
The system operates according to the above bidirectional data-driven ship track prediction method based on the attention mechanism.
Compared with the prior art, the invention can extract features of bidirectional historical time-series track data, introduces an attention mechanism to assign weights to the output features of the neural network so as to enhance the model's feature-learning perception of the data, and uses a multi-layer perceptron network to fuse and learn the intermediate output features of the different sub-models, thereby greatly improving the model's feature-learning capability on time-series track data and its prediction accuracy.
Drawings
FIG. 1 is a schematic diagram of a prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a prediction method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a GRU network in an embodiment of the invention;
FIG. 4 is a diagram of a BiGRU network in an embodiment of the invention;
FIG. 5 is a diagram of a multi-layer perceptron (MLP) network architecture in an embodiment of the invention;
fig. 6 is a diagram of the results of exploring the optimal input and output steps in RMSE performance in accordance with an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to examples and drawings, to which reference is made, but which are not intended to limit the scope of the invention.
The machine learning model training process adopted by this embodiment is as follows. The existing historical track data in the AIS data set is divided into two parts: a training set and a test set. Those skilled in the art will appreciate that, in practical application of the method, the test set should be obtained from the time-series data set whose ship track is to be predicted; since this embodiment merely explains and verifies the method, time-series data from the existing data set may be used as the test set.
940,943 ship time-series track records of 164 ships, downloaded from an official website and collected by AIS receiver devices, are taken as the AIS data set, which is divided into a training set and a test set at a ratio of 7:3.
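As a minimal sketch of the 7:3 partition, a sequential cut over the records might look like the following; the exact partitioning scheme (sequential vs. shuffled) is not specified in the text, so a chronological cut is an assumption:

```python
def split_train_test(records, train_ratio=0.7):
    """Split a list of AIS track records into train/test sets at the
    7:3 ratio used in the embodiment. A sequential cut is assumed;
    the patent does not specify the exact partitioning scheme."""
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]
```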
In the training set, the track segments of each ship in the data set are preprocessed and then divided, according to the temporal relation to the preprocessed real target labels, into a forward input track sequence and a backward input track sequence, which are input into the first and second machine learning models respectively. Deep features of the time-series data are extracted, prediction is performed with a tanh function at the output layer, the error between the predicted and real track labels is computed with a root-mean-square error function, and the weights and biases of all layers of the neural networks are updated by the back-propagation algorithm. The neural networks in the first and second machine learning models are trained iteratively until the loss function converges, yielding the optimally trained first and second machine learning models and the intermediate prediction sequences of the training data.
The intermediate prediction sequences of the optimally trained first and second machine learning models are spliced, and the spliced intermediate prediction sequence vector is input into the fusion prediction block. The multi-layer perceptron network of the fusion prediction block learns and extracts depth features, the output layer predicts with a tanh function, the error between the predicted and real track labels is computed with a root-mean-square error function, and the weights and biases of all layers are updated by the back-propagation algorithm. The network in the fusion prediction block is trained iteratively until the loss function converges, yielding the optimally trained fusion prediction block; finally, the first, second and third machine learning models are saved as a whole.
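The iterate-until-the-loss-converges loop described above can be hedged into a toy sketch. The one-parameter scalar model below is purely illustrative, not the patent's networks; the RMSE gradient is derived analytically for this single-weight case:

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error, the loss function named in the embodiment."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def train_until_converged(x, y, lr=0.01, tol=1e-9, max_iter=10000):
    """Gradient descent on a toy model y_hat = w * x, iterating until
    the RMSE loss stops decreasing (the convergence criterion above)."""
    w, prev = 0.0, float("inf")
    loss = rmse(w * x, y)
    for _ in range(max_iter):
        pred = w * x
        loss = rmse(pred, y)
        if prev - loss < tol:
            break                                  # loss has converged
        prev = loss
        grad = np.mean((pred - y) * x) / max(loss, 1e-12)  # d(RMSE)/dw
        w -= lr * grad
    return w, loss
```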
Example 1
The embodiment of the invention uses the trained GRU neural network with attention mechanism network, the trained BiGRU neural network with attention mechanism network, and the trained multi-layer perceptron network to realize feature extraction from forward and backward historical track data and better prediction performance of the prediction model.
Fig. 2 shows a structural overview of the proposed bi-directional data-driven ship track prediction method, wherein the forward sub-block, the backward sub-block and the fusion prediction block form an integral prediction framework for track prediction. The key characteristics of the bidirectional data driving prediction method of this embodiment are as follows:
Firstly, the bidirectional data-driven prediction method of this embodiment takes the g observed existing track points as input and, after learning and fusion by the three key blocks, outputs l predicted future track points. The values of g and l can be adjusted according to actual service requirements.
Secondly, the bidirectional data-driven prediction method of this embodiment constructs sub-prediction blocks over the same data set and uses different neural network structures within them to learn and predict from the information in the tracks. The forward and backward track sequence segments of the data set are learned, and intermediate prediction sequences produced, by the forward sub-block and the backward sub-block simultaneously, so that comprehensive track features can be learned and extracted from the bidirectional track data; the final prediction is obtained by splice-fusion learning of the intermediate prediction sequences with the multi-layer perceptron network. The attention-based bidirectional data-driven trajectory prediction flow comprises the following steps, as shown in fig. 2:
Step (1): the forward sub-block receives a forward sequence of length g observed from the data set and inputs it into the GRU network and attention mechanism network for learning; each sequence point in the data set comprises the following data types: time, latitude, longitude, heading, and speed.
Step (2): the backward sub-block receives a backward sequence of length g observed from the data set and inputs it into the BiGRU network and attention mechanism network for learning; each sequence point comprises the same data types: time, latitude, longitude, heading, and speed.
Step (3): the state sequence composed of hidden-layer outputs, obtained after training and learning of the GRU network and attention mechanism network structure, is input into a fully connected layer to obtain the intermediate prediction sequence Ŷ_l^f of length l.
Step (4): the state sequence composed of hidden-layer outputs, obtained after training and learning of the BiGRU network and attention mechanism network structure, is input into a fully connected layer to obtain the intermediate feature-state output sequence Ŷ_l^b of length l.
Step (5): the intermediate prediction sequences of the forward and backward sub-blocks are spliced to form new training data Ŷ_l^{fb}, which is input to the fusion prediction block.
Step (6): the new training data Ŷ_l^{fb} is input into the fusion prediction block, which learns with a multi-layer perceptron network and finally outputs the result Ŷ_l of the overall prediction model.
The attention-based bidirectional data-driven ship track prediction method designed in this embodiment can extract bidirectional historical track data to enhance feature-learning perception of that data, and introduces an attention mechanism to assign weights to behavior feature states in the track data, which strengthens the model's association between different data dimensions of the time-series data and improves prediction model accuracy.
The forward sub-block employs a GRU neural network and an attention mechanism network. The gated recurrent unit (GRU) network is a variant of the recurrent neural network (RNN); as shown in fig. 3, its interior comprises two structures: a reset gate and an update gate. The reset gate reduces information from the previous cell deemed irrelevant, and the update gate determines how much information needs to be transferred from the previous cell to the next.
This embodiment defines the observed forward sequence of the data set as X^f = (x_1, x_2, ..., x_g), where t denotes the position in the current track sequence. The input sequence is sequentially computed into a hidden vector sequence H = (h_1, h_2, ..., h_g) by the GRU network. The specific GRU model is governed by the following formulas (1)-(4):
r_t = σ(W_r x_t + U_r h_{t-1} + b_r)  (1)
z_t = σ(W_z x_t + U_z h_{t-1} + b_z)  (2)
h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t-1}) + b_h)  (3)
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t  (4)
where σ denotes the sigmoid activation function, tanh is the hyperbolic tangent function, r_t and z_t denote the outputs of the reset gate and the update gate, h̃_t and h_t denote the candidate output and the actual output, ⊙ denotes element-wise multiplication, the U's and W's are weight matrices, and the b's are bias terms.
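The GRU step of formulas (1)-(4) can be exercised directly in a small sketch; the dict layout and weight shapes below are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, p):
    """One GRU step per formulas (1)-(4): reset gate, update gate,
    candidate state, and the convex combination giving the new state."""
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])             # (1) reset gate
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])             # (2) update gate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])  # (3) candidate
    return (1.0 - z) * h_prev + z * h_cand                              # (4) output
```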
A fully-connected-layer calculation is performed on the output vectors H = (h_1, ..., h_g) to output a feature-state output sequence of length l, wherein each element encodes the spatio-temporal feature data extracted from the track sequence of the t-th component input into the GRU neural network. H is the output set of the hidden-state fully connected layer attached to the GRU neural network; the dimension of the hidden state is q, which equals the dimension of the model input data.
Weight-distribution assignment learning is performed on the output vectors of the GRU network using the attention mechanism network to obtain the prediction output. The specific attention mechanism network is calculated by the following formulas (5)-(6):
e_t = v^T tanh(W h_t + U s)  (5)
α_t = exp(e_t) / Σ_{k=1}^{g} exp(e_k)  (6)
where v, W, U are the network parameters of the additive calculation process, h_t is the spatio-temporal feature code extracted by the GRU network from the input sample point at time t of the g-length forward track sequence, e_t denotes the attention weighting score value of the t-th sample point in the forward track sequence, and α_t denotes the forward attention distribution coefficient of the t-th sample point.
A fully-connected-layer calculation is performed on the attention-weighted output vector to output the intermediate prediction sequence Ŷ_l^f of length l.
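A minimal numeric sketch of the additive attention of formulas (5)-(6), with a numerically stabilized softmax; the query vector s and the parameter shapes are illustrative assumptions:

```python
import numpy as np

def additive_attention(H, s, v, W, U):
    """Additive attention over GRU outputs H of shape (g, q):
    scores e_t = v^T tanh(W h_t + U s)   -- formula (5)
    coefficients alpha_t = softmax(e_t)  -- formula (6)
    Returns the attention-weighted context and the coefficients."""
    e = np.array([v @ np.tanh(W @ h + U @ s) for h in H])
    alpha = np.exp(e - e.max())   # subtract max for numerical stability
    alpha /= alpha.sum()
    return alpha @ H, alpha
```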
The backward sub-block adopts a BiGRU neural network and an attention mechanism network. As shown in fig. 4, the BiGRU model has one more set of backward-propagating GRU units than the unidirectional GRU model, which enables the BiGRU to exploit both past and future information in the observed backward sequence and thereby provide more effective predictions. The BiGRU maps the input backward sequence X^b = (x_1, ..., x_g) to two output sequences, namely the forward hidden sequence H^→ and the backward hidden sequence H^←, and operates by the following formulas (8)-(10):
h_t^→ = GRU(x_t, h_{t-1}^→; θ^→)  (8)
h_t^← = GRU(x_t, h_{t+1}^←; θ^←)  (9)
h_t = W(h_t^→ ⊕ h_t^←) + b  (10)
where each GRU function is the recurrent network of formulas (1)-(4) and nonlinearly converts the input time-series track vector into the corresponding GRU hidden state, x_t denotes the track point of the backward sequence input at the current time t, θ^→ denotes the parameter set of the forward-layer mapping process in the BiGRU network, θ^← denotes the parameter set of the reverse-layer mapping process, ⊕ denotes splicing, W is a weight matrix of the network, and b is a bias value. The forward-layer and reverse-layer hidden states computed by the two unidirectional GRU networks are spliced by formulas (8)-(10) into a compact bidirectional representation, finally yielding the feature-state output vector of the BiGRU block.
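Hedged sketch of formulas (8)-(10): one step function scans the sequence forward, another backward, and each time step's two hidden states are spliced and linearly fused. The step callables are stand-ins for the two GRU directions:

```python
import numpy as np

def bigru_scan(X, step_fwd, step_bwd, W, b, h0):
    """Formulas (8)-(10): forward pass, backward pass, then per-step
    fusion h_t = W (h_fwd spliced with h_bwd) + b."""
    hf, h = [], h0
    for x in X:                        # forward layer, formula (8)
        h = step_fwd(x, h)
        hf.append(h)
    hb, h = [], h0
    for x in X[::-1]:                  # reverse layer, formula (9)
        h = step_bwd(x, h)
        hb.append(h)
    hb = hb[::-1]                      # re-align to forward time order
    return [W @ np.concatenate([f, bw]) + b for f, bw in zip(hf, hb)]  # (10)
```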
For output vectorMake a full connection layer calculation and output a characteristic state output sequence with length of l>Wherein each element->Representing the encoding of spatiotemporal feature data extracted from the sequence of trajectories of the t-th component of the sequence input into the biglu neural network. />Is an output set representing a hidden state full connection layer connected to the BiGRU neural network, the dimension size of the hidden state is q, and q is the same as the dimension size of model input data.
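The bidirectional pass of formulas (8)-(10) can be sketched as follows. This is an assumption-laden NumPy sketch, not the embodiment's code: a toy tanh cell stands in for the GRU of formulas (1)-(4), and the final linear fusion of formula (10) is left out so the splice itself is visible.

```python
import numpy as np

def bidirectional_pass(step, X, h0):
    """Run a recurrent `step` function over the sequence X in both time
    directions and splice the two hidden sequences per time step, as the
    BiGRU block does.  `step(x, h) -> h'` stands in for the GRU cell."""
    fwd, h = [], h0
    for x in X:                       # forward layer: x_1 .. x_g
        h = step(x, h)
        fwd.append(h)
    bwd, h = [], h0
    for x in X[::-1]:                 # reverse layer: x_g .. x_1
        h = step(x, h)
        bwd.append(h)
    bwd.reverse()                     # realign with forward time order
    # compact bidirectional representation: per-step concatenation
    return np.concatenate([np.stack(fwd), np.stack(bwd)], axis=-1)

rng = np.random.default_rng(1)
q = 3
Wx, Wh = rng.normal(size=(q, q)), rng.normal(size=(q, q))
step = lambda x, h: np.tanh(x @ Wx + h @ Wh)   # toy recurrent cell
X = rng.normal(size=(5, q))
H = bidirectional_pass(step, X, np.zeros(q))
```

Each output row splices a forward state that has seen x_1..x_t with a backward state that has seen x_g..x_t, which is why the BiGRU can use both past and future context at every position.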
The output vector of the BiGRU neural network is fed to the attention mechanism network, which learns a weight distribution over it to obtain the prediction output of the backward sub-block. The attention mechanism network is calculated by the following formulas (11)-(13):

e_t = v·tanh(W·s + U·h_t)    (11)

α_t = exp(e_t) / Σ_{k=1..g} exp(e_k)    (12)

c = Σ_{t=1..g} α_t·h_t    (13)

where v, W, U are network parameters of the additive computation process, s is the query state, h_t is the space-time feature code extracted by the BiGRU network from the input sample point at the t-th moment of the backward track sequence of length g, e_t is the attention weighting score of the t-th sample point in the backward track sequence, and α_t is the backward attention distribution coefficient of the t-th sample point in the backward track sequence.
The output vector c is passed through a fully connected layer to output the intermediate prediction sequence of length l.
The present embodiment splices the intermediate prediction sequences of the forward and backward sub-blocks to form the input sequence of the final fusion prediction block.
The fusion prediction block adopts a multi-layer perceptron network. A multi-layer perceptron (MLP) is a neural network trained by supervised learning using the back-propagation method. The multi-layer perceptron network reads in turn the intermediate prediction sequences output by the forward and backward blocks and updates its internal hidden state according to the following formula (14):

h_1 = σ(Σ_{t=1..l} w_t·ẑ_t + b_t)    (14)

where σ is the activation function of the 1st hidden layer, ẑ_t represents the splice state at the t-th moment in the input variable, l is the total number of inputs, b_t is the bias of the layer, and w_t is the weight of the connection layer. Each subsequent hidden layer then updates the internal hidden state by the following formula:

h_j = σ_j(w_j·h_{j-1} + b_j)    (15)

where σ_j is a nonlinear activation function with a learnable parameter θ for the j-th hidden layer, b_j is the bias of the j-th hidden layer, w_j is the weight of the j-th layer connection, h_{j-1} represents the hidden-state value output at the previous step, and θ represents the parameter values of the mapping from each pair of input splice values to the output. Finally, an output layer is added that accepts the hidden state of formula (15) as input and makes predictions in order.
In the fusion prediction block, the training process of the multi-layer perceptron (MLP) network structure, shown in FIG. 5, maps the input splice sequence to an output sequence, i.e. the hidden sequence, calculated by formula (16):

H = MLP(Ẑ)    (16)

where MLP denotes the operations of formula (14) and formula (15). A fully connected layer operation is then applied to the hidden-layer output vector, producing a sequence of length l. Finally, the predicted output sequence of the fusion prediction block is calculated by formula (17):

ŷ_t = W_y·h_t + b_y    (17)

where W_y and b_y are the trainable parameters of the neural network that map the MLP output to the next predicted position, and ŷ_t represents the output value of the t-th track sequence point in the output sequence of length l.
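The fusion block described by formulas (14)-(17) can be sketched as a small feed-forward pass in NumPy. The depth (two hidden layers), tanh activations, and all dimensions are assumptions for illustration, not the embodiment's configuration.

```python
import numpy as np

def mlp_fusion(Z, W1, b1, W2, b2, Wy, by):
    """Fusion prediction block sketch: hidden layers over the spliced
    forward/backward intermediate predictions Z (shape (l, d)), then the
    linear output layer of formula (17)."""
    h1 = np.tanh(Z @ W1 + b1)     # 1st hidden layer, formula (14)-style
    h2 = np.tanh(h1 @ W2 + b2)    # further hidden layer, formula (15)-style
    return h2 @ Wy + by           # output layer, formula (17)

rng = np.random.default_rng(2)
l, d, hdim, q = 4, 6, 8, 3       # seq length, splice dim, hidden dim, point dim
Z = rng.normal(size=(l, d))
Y = mlp_fusion(Z,
               rng.normal(size=(d, hdim)), np.zeros(hdim),
               rng.normal(size=(hdim, hdim)), np.zeros(hdim),
               rng.normal(size=(hdim, q)), np.zeros(q))
```

The output Y has one q-dimensional predicted point per position of the length-l input splice, matching the predicted output sequence of formula (17).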
The following describes the prediction scene and application model in which the attention-based bidirectional data-driven prediction method of this embodiment operates, and then analyzes the effectiveness of the method in that scene.
Prediction model:
(1) It is assumed that the user knows the track observation points of the ship at sea over a certain period of time, and that the track observation sequence comes from an AIS (Automatic Identification System) dataset.
(2) By inputting the known observed forward and backward sequences into the attention-mechanism-based bidirectional data-driven prediction method, the user can derive the final future predicted track sequence from the forward sub-block, the backward sub-block, and the fusion prediction block.
(3) In the attention-mechanism-based bidirectional data-driven prediction method of this embodiment, the number of track points in the input observation sequence must not be less than 2, and the number of track points in the output prediction sequence must not be less than 1.
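The three-stage flow just outlined (sub-blocks, then splice, then fusion) can be sketched as a single data-flow function. The stand-in lambda models and all shapes below are hypothetical; only the order of operations follows the described method.

```python
import numpy as np

def predict_trajectory(x_fwd, x_bwd, model_f, model_b, model_fuse):
    """Run both observation sequences through their sub-models, splice
    the intermediate predictions, and feed the splice to the fusion
    model to obtain the final predicted track sequence."""
    y_f = model_f(x_fwd)                       # forward intermediate sequence
    y_b = model_b(x_bwd)                       # backward intermediate sequence
    z = np.concatenate([y_f, y_b], axis=-1)    # splice into composite input
    return model_fuse(z)                       # final fused prediction

# stand-in models that only fix the output shapes (l = 3, point dim 2)
model_f = lambda x: np.zeros((3, 2))
model_b = lambda x: np.ones((3, 2))
model_fuse = lambda z: z.mean(axis=-1, keepdims=True)
y = predict_trajectory(np.zeros((5, 2)), np.zeros((5, 2)),
                       model_f, model_b, model_fuse)
```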
Validity analysis:
This section takes the ship track sequences of the west coast region of the United States as an example to analyze the effectiveness of the method provided by this embodiment.
(1) Comparison with existing work. This embodiment compares the method with naive LSTM and GRU networks and with three state-of-the-art existing works. In terms of RMSE, this embodiment outperforms the prior studies by 85.42% on average, with an improvement of at least 30.77% and of up to 99.15% over individual prior studies. The attention-mechanism-based bidirectional data-driven prediction method provided by this embodiment therefore has good prediction precision. In addition, in experiments predicting target track sequences of different lengths, the method proposed by this embodiment is superior to existing research in short-term, medium-term, and long-term target track prediction tasks.
(2) Study of the optimal input-step and prediction-step parameters, to further explore the optimal parameters of the proposed method. In this embodiment, each record in the dataset represents a track point of the ship in the sea area; the number of records input into the model is taken as the input step size, and the number of records the model outputs as predictions is taken as the output step size. This embodiment explores the best relationship between input step size and output step size. As shown in FIG. 6, in terms of RMSE the prediction effect is good when the input step size is 4, 5, 8 or 9. Reasonable input and output step sizes can therefore be set in actual work to achieve the best working effect.
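The input/output step-size notion can be made concrete with a small windowing helper. This is a hypothetical sketch (the function name and the list-of-points representation are assumptions, not part of the embodiment):

```python
def make_windows(track, g, l):
    """Slice one ship track into (input, target) training pairs with
    input step size g and output step size l, as in the FIG. 6 study."""
    return [(track[i:i + g], track[i + g:i + g + l])
            for i in range(len(track) - g - l + 1)]

# toy track of 10 record indices, input step 4, output step 2
pairs = make_windows(list(range(10)), g=4, l=2)
```

Varying g and l in such a helper is exactly the parameter sweep behind the step-size study: each (g, l) pair yields a different training set shape for the same underlying track.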
Example 2
The present embodiment relates to a bidirectional data-driven ship track prediction device based on an attention mechanism, which comprises a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, implements the attention-mechanism-based bidirectional data-driven ship track prediction method of Embodiment 1.
Specifically, the processor may be an Intel(R) Core(TM) i7-1165G7 @ 2.80GHz processor with 16GB of memory, programmed in Python 3.6 on the Keras framework.
The device provided by this embodiment is used to implement the attention-mechanism-based bidirectional data-driven ship track prediction method of Embodiment 1, and therefore has the technical effects of Embodiment 1, which will not be repeated here.
In order to make it easier for a person skilled in the art to understand the improvements of the present invention over the prior art, some of the figures and descriptions of the present invention have been simplified. The above-described embodiments are preferred implementations of the present invention; the present invention may, however, be implemented in other ways, and any obvious substitution that does not depart from the concept of the present technical solution falls within the scope of protection of the present invention.
Claims (8)
1. A bidirectional data-driven ship track prediction method based on an attention mechanism, characterized by comprising the following steps:
S1, acquiring an observation sequence of forward length g from an AIS dataset and inputting it into a first machine learning model to obtain an intermediate prediction sequence of length l; at the same time, acquiring an observation sequence of backward length g from the AIS dataset and inputting it into a second machine learning model to obtain an intermediate prediction sequence of length l;
S2, splicing the intermediate prediction sequence of the first machine learning model and the intermediate prediction sequence of the second machine learning model to form composite training data;
S3, taking the composite training data as the input of a third machine learning model to obtain a prediction result.
2. The attention-mechanism-based bidirectional data-driven vessel trajectory prediction method of claim 1, wherein the acquisition process of the first, second and third machine learning models comprises:
taking the acquired ship time-series track data as the AIS dataset, and dividing the AIS dataset into a training set and a testing set;
training the forward sub-block and backward sub-block models with the training set as input, to obtain the first machine learning model and the second machine learning model; and
combining the intermediate outputs of the optimal forward sub-block and backward sub-block models obtained during training into a composite fusion training set, inputting it into the fusion prediction block, and training the fusion prediction block to obtain the third machine learning model.
3. The attention-mechanism-based bidirectional data-driven vessel trajectory prediction method according to claim 1 or 2, wherein: the first machine learning model is composed of a GRU neural network and an attention mechanism; the second machine learning model is composed of a BiGRU neural network and an attention mechanism; and the third machine learning model is a multi-layer perceptron network.
4. The attention-mechanism-based bidirectional data-driven vessel trajectory prediction method of claim 3, wherein in step S1 the first machine learning model maps the input forward observation sequence to an output sequence, and the acquisition process of the intermediate prediction sequence of length l comprises:
sequentially inputting the elements of the forward observation sequence in the dataset into the first machine learning model, wherein t represents the position in the current forward track sequence;
updating the hidden sequence of the first machine learning model, wherein each element represents the features extracted from the sample point at time t among the g intermediate feature states of the forward track sequence;
calculating the forward attention distribution coefficient α_t of the t-th sample point in the forward track sequence as α_t = exp(e_t)/Σ_{k=1..g} exp(e_k), and the attention weighting score e_t of the t-th sample point in the forward track sequence as e_t = v·tanh(W·s + U·h_t), wherein v, W, U are network parameters of the additive calculation process and h_t is the space-time feature code extracted by the GRU network from the input sample point at the t-th moment of the input track sequence of length g;
wherein x_t represents the track point of the sequence input into the GRU neural network at the current moment, i.e. the input sample point of the first machine learning model, and θ_gru represents the parameter values in each input-to-output mapping process;
θ_gru represents the set of parameters of the GRU network mapping process in the first machine learning model; x_i represents the i-th time-series point in the forward track sequence of length g input into the first machine learning model; L represents the quantization error; N represents the total number of training samples; ŷ_j represents the j-th time-series point in the mapped output sample of length l; and θ represents the parameter values in each input-to-output mapping process;
M_{g,l} represents predicting the output sequence Y_l of length l given the forward input sequence X_g of length g, thereby maximizing the conditional probability: M_{g,l} = argmax_Y p(Y_l|X_g), where p(Y_l|X_g) represents the probability of mapping the given g-point observation sequence X_g to the future l-point predicted track sequence Y_l;
inputting the vector composed of the outputs of all hidden layers into a fully connected layer to obtain the intermediate prediction sequence of length l.
5. The attention-mechanism-based bidirectional data-driven ship track prediction method of claim 3, wherein in step S1 the second machine learning model maps the input backward observation sequence to an output sequence, and the acquisition process of the intermediate prediction sequence of length l comprises:
sequentially inputting the elements of the backward observation sequence in the dataset into the second machine learning model, wherein t represents the position in the current backward track sequence;
updating the hidden sequence of the second machine learning model, wherein each element represents the features extracted from the sample point at time t among the g intermediate feature states of the backward input time series;
calculating the backward attention distribution coefficient α_t of the t-th sample point in the backward track sequence as α_t = exp(e_t)/Σ_{k=1..g} exp(e_k), and the attention weighting score e_t of the t-th sample point in the backward track sequence as e_t = v·tanh(W·s + U·h_t), wherein v, W, U are network parameters of the additive calculation process and h_t is the space-time feature code extracted by the BiGRU network from the input sample point at the t-th moment of the input track sequence of length g;
fusing the hidden states of the forward layer and the reverse layer of the BiGRU network by the formula h_t = W·[h_t^(f); h_t^(b)] + b, wherein W is a weight matrix in the network and b is a bias value;
wherein x_t represents the track point of the backward sequence input at the current time t, i.e. the input sample point of the second machine learning model;
θ^(f) represents the set of parameters of the forward-layer mapping process in the second machine learning model, with x_i representing the i-th time-series point of the backward track sequence of length g input into the forward layer of the second machine learning model; θ^(b) represents the set of parameters of the reverse-layer mapping process in the second machine learning model, with x_i representing the i-th time-series point of the backward track sequence of length g input into the reverse layer; L represents the quantization error; N represents the total number of training samples; ŷ_j represents the j-th time-series point in the mapped output sample of length l; and θ represents the parameter values in each input-to-output mapping process;
inputting the vector composed of the outputs of all hidden layers into a fully connected layer to obtain the intermediate prediction sequence of length l.
6. The attention-mechanism-based bidirectional data-driven vessel trajectory prediction method of claim 3, wherein in step S3 the third machine learning model maps the spliced input intermediate prediction sequence to an output sequence, and the acquisition process of the predicted output comprises:
sequentially inputting the elements of the spliced intermediate prediction sequence into the third machine learning model, wherein t represents the t-th intermediate prediction pair in the current intermediate prediction sequence;
obtaining the hidden sequence of the third machine learning model, wherein ẑ_t represents the intermediate prediction splice value of the input data at the current time t, i.e. the input data of the third machine learning model, h̄ represents the output hidden-state value obtained at the previous moment, and θ represents the parameter values of the mapping process from each pair of input splice values to the output;
outputting according to the j-th hidden layer h_j = σ_j(w_j·h_{j-1} + b_j), wherein the hidden state of the 1st hidden layer is h_1 = σ(Σ_{t=1..l} w_t·ẑ_t + b_t), σ is the activation function of the 1st hidden layer, l is the total number of inputs, b_t is the bias of the 1st hidden layer, w_t is the weight of the connection layer, σ_j is a nonlinear activation function with a learnable parameter θ for the j-th hidden layer, b_j is the bias of the j-th hidden layer, and w_j is the weight of the j-th connecting layer;
inputting the vector composed of the outputs of all hidden layers into the fully connected layer to obtain the predicted output sequence of length l.
7. A bi-directional data-driven marine vessel trajectory prediction system based on an attention mechanism, comprising:
a data acquisition module for acquiring from an AIS dataset an observation sequence of forward length g and an observation sequence of backward length g;
a data processing module for inputting the forward observation sequence into a first machine learning model to obtain an intermediate prediction sequence of length l, and for inputting the backward observation sequence into a second machine learning model to obtain an intermediate prediction sequence of length l;
a data splicing module for splicing the intermediate prediction sequence of the first machine learning model and the intermediate prediction sequence of the second machine learning model to obtain composite training data;
and the result prediction module is used for taking the composite training data as the input of a third machine learning model to obtain a predicted result.
8. The attention-mechanism-based bidirectional data-driven vessel trajectory prediction system of claim 7, wherein the system is configured to implement the attention-mechanism-based bidirectional data-driven vessel trajectory prediction method of any one of claims 1 to 6.
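For reference, the GRU recurrence of formulas (1)-(4) cited in claims 4 and 5 can be sketched in NumPy as follows. The gate order and parameter shapes below are the standard textbook form of the GRU, assumed rather than taken from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One step of the standard GRU recurrence: update gate z,
    reset gate r, candidate state, and the new hidden state."""
    z = sigmoid(x @ Wz + h @ Uz + bz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)               # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh + bh)    # candidate state
    return (1.0 - z) * h + z * h_cand               # new hidden state

rng = np.random.default_rng(3)
d, q = 2, 4                                         # input dim, hidden dim
params = [rng.normal(size=s) for s in
          [(d, q), (q, q), q, (d, q), (q, q), q, (d, q), (q, q), q]]
h = gru_step(rng.normal(size=d), np.zeros(q), *params)
```

Iterating gru_step over a track sequence yields the hidden states h_1..h_g that the attention networks of the claims score and combine.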
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310583916.3A CN116629115A (en) | 2023-05-23 | 2023-05-23 | Bidirectional data driving ship track prediction method and system based on attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116629115A true CN116629115A (en) | 2023-08-22 |
Family
ID=87612828
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310583916.3A Pending CN116629115A (en) | 2023-05-23 | 2023-05-23 | Bidirectional data driving ship track prediction method and system based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116629115A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109410575B (en) | Road network state prediction method based on capsule network and nested long-time memory neural network | |
Tang et al. | A model for vessel trajectory prediction based on long short-term memory neural network | |
Han et al. | Convective precipitation nowcasting using U-Net model | |
Wang et al. | Multi-vehicle collaborative learning for trajectory prediction with spatio-temporal tensor fusion | |
CN109214452B (en) | HRRP target identification method based on attention depth bidirectional cyclic neural network | |
EP4180892A1 (en) | Self-aware visual-textual co-grounded navigation agent | |
Choi et al. | Real-time significant wave height estimation from raw ocean images based on 2D and 3D deep neural networks | |
CN111832615A (en) | Sample expansion method and system based on foreground and background feature fusion | |
CN115829171B (en) | Pedestrian track prediction method combining space-time information and social interaction characteristics | |
CN112785077A (en) | Travel demand prediction method and system based on space-time data | |
CN115690153A (en) | Intelligent agent track prediction method and system | |
CN117494871A (en) | Ship track prediction method considering ship interaction influence | |
CN114913434B (en) | High-resolution remote sensing image change detection method based on global relation reasoning | |
CN115204032A (en) | ENSO prediction method and device based on multi-channel intelligent model | |
CN114152257A (en) | Ship prediction navigation method based on attention mechanism and environment perception LSTM | |
CN114511710A (en) | Image target detection method based on convolutional neural network | |
CN117743795A (en) | Multi-ship track prediction method and system based on dynamic space-time refinement network | |
Madhukumar et al. | Consensus forecast of rainfall using hybrid climate learning model | |
CN113887330A (en) | Target detection system based on remote sensing image | |
CN117275222A (en) | Traffic flow prediction method integrating one-dimensional convolution and attribute enhancement units | |
Chaganti et al. | Predicting Landslides and Floods with Deep Learning | |
CN116629115A (en) | Bidirectional data driving ship track prediction method and system based on attention mechanism | |
CN116503700A (en) | Track prediction method and device, storage medium and vehicle | |
CN115880660A (en) | Track line detection method and system based on structural characterization and global attention mechanism | |
CN114140524B (en) | Closed loop detection system and method for multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||