CN117634661A - Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network - Google Patents


Info

Publication number: CN117634661A
Application number: CN202310571525.XA
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 王宁, 孔祥军, 宋佳麟, 董琪, 郝立柱, 韩冰
Original and current assignee: Dalian Maritime University (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Dalian Maritime University
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classification: Management, Administration, Business Operations System, And Electronic Commerce

Abstract

The invention discloses a ship maneuvering motion forecasting method based on a self-attention two-way long-short-term memory network, which comprises the following steps: acquiring ship motion history data to obtain a time sequence of the ship history data; constructing a self-attention weighted two-way long-short-term memory network model and training it to obtain a trained self-attention weighted two-way long-short-term memory network model; and using the trained model to forecast the motion data of ship navigation. The method adopts the two-way long-short-term memory network, can cyclically learn and extract the forward and reverse features of the ship motion time-sequence data, attains excellent prediction accuracy and model generalization capability in ship maneuvering motion forecasting, and has strong practical applicability.

Description

Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network
Technical Field
The invention belongs to the fields of ship and ocean engineering technology and ship motion forecasting, and relates to a ship maneuvering motion forecasting method based on a self-attention-weighted bidirectional long short-term memory network (SaeBil).
Background
Forecasting technology for intelligent ship maneuvering is of great significance in practical ocean engineering applications. Marine weather is unpredictable and the ocean environment is complex and changeable; these adverse factors strongly affect the motion of small intelligent ships, so intelligent ships face great challenges in offshore operation. Data-driven ship maneuvering motion forecasting technology can predict the motion state of a ship over a future period by analyzing its current and historical motion states, providing important technical support for functions such as safe navigation and dynamic positioning of unmanned ships.
In recent years, data-driven methods, including support vector machine regression and neural networks, have been widely used in the field of ship maneuvering motion forecasting. Initially, deep-learning-based ship maneuvering motion forecasting models were often implemented with a single long-short-term memory network, convolutional neural network, gated recurrent unit, and the like. In practical applications, however, a single forecasting model has limitations and struggles to meet actual engineering requirements, so hybrid forecasting models have gradually been developed. A hybrid forecasting model generally combines a data preprocessing method with a forecasting model, or organically fuses several forecasting models to obtain a better forecasting effect; nevertheless, conventional hybrid forecasting models still suffer from untimely forecasting, limited forecasting accuracy and poor practical performance caused by their high computational complexity.
Disclosure of Invention
In order to solve the problems of untimely forecasting, limited forecasting accuracy and poor practical performance caused by the high computational complexity of conventional hybrid forecasting models, the invention provides the following technical scheme: a ship maneuvering motion forecasting method based on a self-attention two-way long-short-term memory network, comprising the following steps:
acquiring ship motion history data to obtain a time sequence of the ship history data;
constructing a self-attention weighted two-way long-short-term memory network model, and training the model to obtain a trained self-attention weighted two-way long-short-term memory network model;
the trained self-attention weighted two-way long-short-term memory network model is used for predicting the motion data of ship navigation.
Further: the self-attention weighted two-way long-short-term memory network model comprises:
a one-dimensional convolution module, which extracts a one-dimensional feature map of the ship history data changing over time to obtain prediction data;
a two-way long-short-term memory module, which obtains the hidden state of each time step from the prediction data through a two-way long-short-term memory network;
a self-attention weight generation module, which multiplies the feature vector of each time step by a query matrix, an address matrix and a value matrix respectively, thereby forming a self-attention mechanism;
the one-dimensional convolution module, the two-way long-short-term memory module and the self-attention weight generation module are cascaded in sequence.
Further: the process of acquiring the ship motion history data and obtaining the time sequence of the ship history data comprises the following steps:
The heading angle $x_{1,t}$, yaw rate $x_{2,t}$, roll angle $x_{3,t}$, total speed $x_{4,t}$ and rudder angle $u_{r,t}$ are acquired in time sequence $t$, where the heading angle $x_{1,t}$, yaw rate $x_{2,t}$, roll angle $x_{3,t}$ and total speed $x_{4,t}$ serve as normal inputs and the rudder angle $u_{r,t}$ serves as an external input; the sampling time interval is denoted $T$, and the ship maneuvering motion data matrix over $n$ sampling periods $T$ is acquired and expressed as:

$$X_{t\mid t-n}=[x_{all,t-n},x_{all,t-n+1},\ldots,x_{all,t}],\qquad u_{r,t\mid t-n}=[u_{r,t-n},u_{r,t-n+1},\ldots,u_{r,t}]^{T}$$

where $x_{all,t}=[x_{1,t},x_{2,t},x_{3,t},x_{4,t}]^{T}$ and $u_{r,t}$ denote, respectively, the ship motion data and the external rudder-angle input at time $t$;
A sliding window is used to collect the navigation data block $X_{t\mid t-d+1}$ from the voyage data matrix $X_{t\mid t-n}$ and the exogenous input $u_{r,t\mid t-d+1}$ from $u_{r,t\mid t-n}$; the data block to be predicted is denoted $\hat{X}$, where $d$ is the width of the sliding data window, and cyclic prediction of the ship maneuvering motion is performed in a sliding-window manner;
The mapping to the predicted data is expressed as:

$$\hat{X}=F(X_{t\mid t-d+1},u_{r,t\mid t-d+1};\Theta)$$

where $F$ is the designed prediction model and $\Theta$ is the parameter matrix learned by the sliding window from the data samples.
Further: the process of extracting the time-varying one-dimensional feature map from the ship history data to obtain the prediction data comprises the following steps:
$X_{t\mid t-n}$ and $u_{t\mid t-n}$ are taken as the input of the one-dimensional convolution module, and $m$ filter kernels $w_1,w_2,\ldots,w_m$ are assumed; the constant $y_{i,t-n}$ obtained by the $i$-th filter kernel convolving the input data at time $t-n$ is calculated by:

$$y_{i,t-n}=w_i\otimes z_{t-n}$$

where $z_{t-n}$ is the input vector at time $t-n$ and the symbol $\otimes$ denotes the convolution operation;
The feature vector after convolution decomposition at time $t-n$ is expressed as $y_{t-n}=[y_{1,t-n},y_{2,t-n},\ldots,y_{m,t-n}]^{T}$; a filter is applied to each set of data to generate a coupled temporal feature map between the ship motion parameters: $Y_{t\mid t-n}=[y_{t-n},y_{t-n+1},\ldots,y_{t}]$.
Further: the process of obtaining the hidden state of each time step based on the prediction data is as follows:
The two-way long-short-term memory module processes the sequence data $Y_{t\mid t-n}$ from the one-dimensional convolution module and extracts its periodic features; its hidden state $h_{c,t}$ at time $t$ is calculated by:

$$h_{c,t}=\mathrm{conca}(h_{f,t},h_{b,t})\qquad(5)$$
$$h_{*,t}=o_{*,t}\odot\tanh(c_{*,t})\qquad(6)$$
$$c_{*,t}=f_{*,t}\odot c_{*,t-1}+i_{*,t}\odot c'_{*,t}\qquad(7)$$

where $h_{f,t},h_{b,t}\in\mathbb{R}^{l}$ and $c'_{f,t},c'_{b,t}\in\mathbb{R}^{l}$ are, respectively, the hidden states and candidate states of the forward and backward LSTM layers at time $t$, $l$ is the dimension of the hidden layer, the symbol $\odot$ denotes element-wise multiplication, and $\mathrm{conca}(\cdot,\cdot)$ denotes the vector concatenation operation:

$$\mathrm{conca}(h_{f,t},h_{b,t})=[h_{f,t}^{T},h_{b,t}^{T}]^{T}\qquad(8)$$

The forget gate $f_{*,t}$, input gate $i_{*,t}$, output gate $o_{*,t}$ and a-priori candidate state $c'_{*,t}$ are calculated by:

$$f_{*,t}=\sigma(W_f\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{forget})\qquad(9)$$
$$i_{*,t}=\sigma(W_i\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{input})\qquad(10)$$
$$o_{*,t}=\sigma(W_o\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{output})\qquad(11)$$
$$c'_{*,t}=\tanh(W_c\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_c)\qquad(12)$$

where $W_f$, $W_i$ and $W_o$ are the weight parameters of the forget, input and output gates, $b_{forget}$, $b_{input}$ and $b_{output}$ are their bias parameters, and $\sigma$ is the sigmoid activation function; the hidden states calculated for all time steps can be expressed as $H_c=[h_{c,t-d+1},h_{c,t-d+2},\ldots,h_{c,t}]$.
Further: the process of multiplying the feature vector of each time step by the query matrix, the address matrix and the value matrix, respectively, to form the self-attention mechanism is as follows:
The feature vector of each time step of the LSTM output is multiplied by the query matrix $W_q$, the address matrix $W_k$ and the value matrix $W_v$, thereby forming the self-attention mechanism, specifically expressed as:

$$q_t=h_tW_q\qquad(13)$$
$$k_t=h_tW_k\qquad(14)$$
$$v_t=h_tW_v\qquad(15)$$

where $q_t$, $k_t$ and $v_t$ are, respectively, the query vector, address vector and value vector of a single sequence at different positions, and $m_{ij}=q_i k_j^{T}$ is the attention evaluation variable of the dependency of $h_i$ on $h_j$;
The attention weights are scaled using a softmax function, as shown below:

$$g_{ij}=\frac{\exp(m_{ij})}{\sum_{j=1}^{d}\exp(m_{ij})},\qquad \alpha_i=\sum_{j=1}^{d}g_{ij}v_j$$

where $g_{ij}$ denotes the attention weight parameter and $\alpha_i$ denotes the attention vector of the $i$-th time step; the final output of the self-attention weight generation module can be expressed as $A=[\alpha_1,\alpha_2,\ldots,\alpha_d]^{T}$, where $A$ is the attention matrix over all time steps.
Further comprising: the attention evaluation variables are adjusted using a softmax function to obtain the attention vector of each time step, as follows:
The attention weights are scaled using the softmax function, as shown below:

$$g_{ij}=\frac{\exp(m_{ij})}{\sum_{j=1}^{d}\exp(m_{ij})},\qquad \alpha_i=\sum_{j=1}^{d}g_{ij}v_j$$

where $g_{ij}$ denotes the attention weight parameter and $\alpha_i$ denotes the attention vector of the $i$-th time step; the final output of the self-attention weight generation module can be expressed as $A=[\alpha_1,\alpha_2,\ldots,\alpha_d]^{T}$, where $A$ is the attention matrix over all time steps.
A ship maneuvering motion forecasting apparatus based on a self-attention two-way long-short-term memory network, comprising:
an acquisition module, used to acquire ship motion history data and obtain a time sequence of the ship history data;
a construction and training module, used to construct a self-attention weighted two-way long-short-term memory network model and train it to obtain a trained self-attention weighted two-way long-short-term memory network model;
a prediction module, which uses the trained self-attention weighted two-way long-short-term memory network model to forecast the motion data of ship navigation.
The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network combines one-dimensional convolution with a sliding-window technique and can convert multidimensional ship motion data samples into one-dimensional feature vectors, thereby reducing the complexity of feature extraction. The method can adaptively distribute the attention weights of the samples, so that the forecasting results of ship motion are better understood and explained. By adopting the two-way long-short-term memory network, the method can cyclically learn and extract the forward and reverse features of the ship motion time-sequence data; it attains excellent prediction accuracy and model generalization capability in ship maneuvering motion forecasting and has strong practical applicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a diagram of a self-attention weighted two-way long-short term memory network model of the present invention;
FIG. 3 shows the raw data of the Z-maneuver experiment of KVLCC2;
FIG. 4 is a graph comparing the predicted yaw rate for the method of the present invention with other methods;
FIG. 5 is a graph comparing the prediction error of yaw rate for the method of the present invention with other methods;
FIG. 6 is a graph comparing the predicted roll angle with other methods according to the present invention;
FIG. 7 is a graph comparing the prediction error of roll angle for the method of the present invention with other methods.
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features in the embodiments may be combined with each other, and the present invention will be described in detail below with reference to the drawings and the embodiments.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise. Meanwhile, it should be clear that the dimensions of the respective parts shown in the drawings are not drawn in actual scale for convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In the description of the present invention, it should be understood that the orientations or positional relationships indicated by orientation terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom" are generally based on the orientations or positional relationships shown in the drawings, merely to facilitate and simplify the description of the invention; these terms do not indicate or imply that the apparatus or elements referred to must have a specific orientation or be constructed and operated in a specific orientation, and thus should not be construed as limiting the scope of protection of the invention. The orientation words "inner" and "outer" refer to inner and outer relative to the contour of the respective component itself.
Spatially relative terms, such as "above", "over", "on the upper surface of" and the like, may be used herein for ease of description to describe the spatial position of one device or feature relative to another as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "above" or "over" other devices or structures would then be oriented "below" or "beneath" those devices or structures. Thus, the exemplary term "above" may include both the orientations "above" and "below". The device may also be positioned in other ways (rotated 90 degrees or in other orientations), and the spatially relative descriptors used herein are interpreted accordingly.
In addition, the terms "first", "second", etc. are used to define the components, and are only for convenience of distinguishing the corresponding components, and the terms have no special meaning unless otherwise stated, and therefore should not be construed as limiting the scope of the present invention.
A brief flow of the method of the invention is shown in FIG. 1; the method comprises the following steps:
s1, acquiring ship motion history data to obtain a time sequence of the ship history data;
s2, constructing a self-attention weighted two-way long-short-term memory network model, and training the attention weighted two-way long-term memory network model to obtain a trained self-attention weighted two-way long-term memory network model;
and S3, the trained self-attention weighted two-way long-short-term memory network model is used for predicting the motion data of ship navigation.
Steps S1, S2 and S3 are executed in sequence.
FIG. 2 is a diagram of a self-attention weighted two-way long-short term memory network model of the present invention;
the self-attention weighted two-way long-short term memory network model comprises
Extracting one-dimensional feature mapping of the ship historical data changing along with time to obtain a one-dimensional convolution module of predicted data;
based on the prediction data, obtaining a two-way long-short-term memory module of the hidden state of each time step through a two-way long-term memory network;
multiplying the feature vector of each time step by a query matrix, an address matrix and a value matrix respectively, thereby forming a self-attention weight generating module of a self-attention mechanism;
the one-dimensional feature mapping module, the two-way long-short-term memory module and the self-attention weight generating module are sequentially cascaded.
The process of acquiring the ship motion history data and obtaining the time sequence of the ship history data comprises the following steps:
First, the ship motion history data are collected: the heading angle $x_{1,t}$, yaw rate $x_{2,t}$, roll angle $x_{3,t}$, total speed $x_{4,t}$ and rudder angle $u_{r,t}$ are acquired in time sequence $t$, where the heading angle $x_{1,t}$, yaw rate $x_{2,t}$, roll angle $x_{3,t}$ and total speed $x_{4,t}$ serve as normal inputs and the rudder angle $u_{r,t}$ serves as an external input; the sampling time interval is denoted $T$. Taking the heading angle as an example, the historical time-series data points from $t-nT$ to $t$ can be represented as $x_{1,t\mid t-n}=[x_{1,t-n},x_{1,t-(n-1)},\ldots,x_{1,t}]^{T}$. Thus, the ship maneuvering motion data matrix over $n$ sampling periods $T$ can be acquired and represented as:

$$X_{t\mid t-n}=[x_{all,t-n},x_{all,t-n+1},\ldots,x_{all,t}],\qquad u_{r,t\mid t-n}=[u_{r,t-n},u_{r,t-n+1},\ldots,u_{r,t}]^{T}$$

where $x_{all,t}=[x_{1,t},x_{2,t},x_{3,t},x_{4,t}]^{T}$ and $u_{r,t}$ denote, respectively, the ship motion data and the external rudder-angle input at time $t$.

A sliding window is deployed to collect the navigation data block $X_{t\mid t-d+1}$ from the voyage data matrix $X_{t\mid t-n}$ and the exogenous input $u_{r,t\mid t-d+1}$ from $u_{r,t\mid t-n}$; the data block to be predicted is denoted $\hat{X}$, where $d$ is the width of the sliding data window. In this framework, cyclic prediction of the ship maneuvering motion is performed in a sliding-window manner.

Thus, the mapping from the actual data block to the predicted data block can be expressed as:

$$\hat{X}=F(X_{t\mid t-d+1},u_{r,t\mid t-d+1};\Theta)$$

where $F$ is the designed prediction model and $\Theta$ is the parameter matrix learned by the sliding window from the data samples.
A one-dimensional convolution module is adopted to perform the convolution operation on the time-series data and extract its time-varying one-dimensional feature map. $X_{t\mid t-n}$ and $u_{t\mid t-n}$ are taken as the input of the one-dimensional convolution module, and $m$ filter kernels $w_1,w_2,\ldots,w_m$ are assumed; the constant $y_{i,t-n}$ obtained by the $i$-th filter kernel convolving the input data at time $t-n$ can be calculated by:

$$y_{i,t-n}=w_i\otimes z_{t-n}$$

where $z_{t-n}$ is the input vector at time $t-n$ and the symbol $\otimes$ denotes the convolution operation.

The feature vector after convolution decomposition at time $t-n$ is expressed as $y_{t-n}=[y_{1,t-n},y_{2,t-n},\ldots,y_{m,t-n}]^{T}$. A filter is applied to each set of data to generate a coupled temporal feature map between the ship motion parameters, $Y_{t\mid t-n}=[y_{t-n},y_{t-n+1},\ldots,y_{t}]$, which is passed to the long-short-term memory network for training.
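A minimal numpy sketch of this one-dimensional convolution step follows (the all-ones kernels and shapes are illustrative assumptions; in the actual model the filter bank is learned):

```python
import numpy as np

def conv1d_feature_map(Z, kernels):
    """Slide m filter kernels over the channel-stacked input Z.

    Z       : (c, n) array -- the 4 motion variables plus the rudder
              angle over n time steps (c = 5 channels here).
    kernels : (m, c, k) array -- m filter kernels of width k.

    Returns Y of shape (m, n - k + 1); each column is the feature
    vector y = [y_1, ..., y_m]^T after convolution decomposition.
    """
    m, c, k = kernels.shape
    n = Z.shape[1]
    Y = np.empty((m, n - k + 1))
    for i in range(m):
        for t in range(n - k + 1):
            # cross-correlation form of the discrete 1-D convolution
            Y[i, t] = np.sum(kernels[i] * Z[:, t:t + k])
    return Y

# toy input: 5 channels, 8 time steps; 3 kernels of width 2
Z = np.ones((5, 8))
kernels = np.ones((3, 5, 2))
Y = conv1d_feature_map(Z, kernels)
```

The columns of `Y` play the role of the feature vectors $y_{t-n},\ldots,y_t$ fed to the two-way long-short-term memory module.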
To avoid the gradient vanishing and gradient explosion problems of long-sequence training, a two-way long-short-term memory (LSTM) module is used to process the sequence data $Y_{t\mid t-n}$ from the one-dimensional convolution module and extract its periodic features; its hidden state $h_{c,t}$ at time $t$ can be calculated by:

$$h_{c,t}=\mathrm{conca}(h_{f,t},h_{b,t})\qquad(5)$$
$$h_{*,t}=o_{*,t}\odot\tanh(c_{*,t})\qquad(6)$$
$$c_{*,t}=f_{*,t}\odot c_{*,t-1}+i_{*,t}\odot c'_{*,t}\qquad(7)$$

where $h_{f,t},h_{b,t}\in\mathbb{R}^{l}$ and $c'_{f,t},c'_{b,t}\in\mathbb{R}^{l}$ are, respectively, the hidden states and candidate states of the forward and backward LSTM layers at time $t$, $l$ is the dimension of the hidden layer, the symbol $\odot$ denotes element-wise multiplication, and $\mathrm{conca}(\cdot,\cdot)$ denotes the vector concatenation operation, which can be expressed as:

$$\mathrm{conca}(h_{f,t},h_{b,t})=[h_{f,t}^{T},h_{b,t}^{T}]^{T}\qquad(8)$$

Furthermore, the forget gate $f_{*,t}$, input gate $i_{*,t}$, output gate $o_{*,t}$ and a-priori candidate state $c'_{*,t}$ are calculated by:

$$f_{*,t}=\sigma(W_f\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{forget})\qquad(9)$$
$$i_{*,t}=\sigma(W_i\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{input})\qquad(10)$$
$$o_{*,t}=\sigma(W_o\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_{output})\qquad(11)$$
$$c'_{*,t}=\tanh(W_c\,\mathrm{conca}(h_{*,t-1},y_{*,t})+b_c)\qquad(12)$$

where $W_f$, $W_i$ and $W_o$ are the weight parameters of the forget, input and output gates, $b_{forget}$, $b_{input}$ and $b_{output}$ are their bias parameters, and $\sigma$ is the sigmoid activation function. The hidden states finally calculated for all time steps can be represented as $H_c=[h_{c,t-d+1},h_{c,t-d+2},\ldots,h_{c,t}]$.
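Equations (5) through (12) can be sketched in plain numpy as follows (the stacked-gate weight layout, zero initialization and random toy weights are illustrative assumptions; a production model would use a deep-learning framework):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(h_prev, c_prev, y_t, W, b):
    """One LSTM step, eqs. (6)-(7) and (9)-(12): all four gates act on
    conca(h_{t-1}, y_t); W stacks their weights into 4*l output units."""
    l = h_prev.size
    z = np.concatenate([h_prev, y_t])      # conca(h_{t-1}, y_t)
    a = W @ z + b                          # stacked gate pre-activations
    f = sigmoid(a[0:l])                    # forget gate  (9)
    i = sigmoid(a[l:2 * l])                # input gate   (10)
    o = sigmoid(a[2 * l:3 * l])            # output gate  (11)
    c_cand = np.tanh(a[3 * l:4 * l])       # candidate    (12)
    c = f * c_prev + i * c_cand            # cell state   (7)
    h = o * np.tanh(c)                     # hidden state (6)
    return h, c

def bilstm(Y, l, Wf, bf, Wb, bb):
    """Run a forward and a backward LSTM over the columns of Y and
    concatenate their hidden states: h_c,t = conca(h_f,t, h_b,t) (5)."""
    d = Y.shape[1]
    hf, cf = np.zeros(l), np.zeros(l)
    hb, cb = np.zeros(l), np.zeros(l)
    H_f, H_b = [], []
    for t in range(d):                     # forward pass
        hf, cf = lstm_step(hf, cf, Y[:, t], Wf, bf)
        H_f.append(hf)
    for t in reversed(range(d)):           # backward pass
        hb, cb = lstm_step(hb, cb, Y[:, t], Wb, bb)
        H_b.append(hb)
    H_b.reverse()                          # realign backward states with time
    return np.stack([np.concatenate([a, b]) for a, b in zip(H_f, H_b)], axis=1)

# toy run: l = 2 hidden units, m = 3 input features, d = 4 time steps
rng = np.random.default_rng(0)
Wf_ = rng.standard_normal((8, 5))          # (4*l, l+m)
Wb_ = rng.standard_normal((8, 5))
Yin = rng.standard_normal((3, 4))
H = bilstm(Yin, 2, Wf_, np.zeros(8), Wb_, np.zeros(8))
```

The columns of `H` correspond to the concatenated hidden states $h_{c,t}$ collected in $H_c$.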
The feature vector of each time step of the LSTM output is multiplied by the query matrix $W_q$, the address matrix $W_k$ and the value matrix $W_v$, thereby forming the self-attention mechanism, which can be expressed as:

$$q_t=h_tW_q\qquad(13)$$
$$k_t=h_tW_k\qquad(14)$$
$$v_t=h_tW_v\qquad(15)$$

where $q_t$, $k_t$ and $v_t$ are, respectively, the query vector, address vector and value vector of a single sequence at different positions, and $m_{ij}=q_i k_j^{T}$ is the attention evaluation variable of the dependency of $h_i$ on $h_j$.

In turn, the attention weights are scaled using a softmax function, as shown below:

$$g_{ij}=\frac{\exp(m_{ij})}{\sum_{j=1}^{d}\exp(m_{ij})},\qquad \alpha_i=\sum_{j=1}^{d}g_{ij}v_j$$

where $g_{ij}$ denotes the attention weight parameter and $\alpha_i$ denotes the attention vector of the $i$-th time step.

The final output of the self-attention weight generation module can be represented as $A=[\alpha_1,\alpha_2,\ldots,\alpha_d]^{T}$, where $A$ is the attention matrix over all time steps.
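A compact numpy sketch of the query/address/value projections and the softmax weighting follows (matrix shapes and the unscaled dot-product score are illustrative assumptions):

```python
import numpy as np

def self_attention(H, Wq, Wk, Wv):
    """H: (d, h) matrix whose rows are the hidden states h_t.

    Computes q_t = h_t Wq, k_t = h_t Wk, v_t = h_t Wv, the pairwise
    attention evaluation variables m_ij = q_i . k_j, the softmax-scaled
    weights g_ij, and the attention vectors alpha_i = sum_j g_ij v_j.
    Returns A = [alpha_1, ..., alpha_d]^T of shape (d, dv).
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    M = Q @ K.T                                    # m_ij = q_i . k_j
    G = np.exp(M - M.max(axis=1, keepdims=True))   # numerically stable softmax
    G = G / G.sum(axis=1, keepdims=True)           # each row of G sums to 1
    return G @ V                                   # attention matrix A

# toy hidden states: d = 4 time steps, hidden size 6, projection size 3
rng = np.random.default_rng(1)
Hs = rng.standard_normal((4, 6))
A = self_attention(Hs, rng.standard_normal((6, 3)),
                   rng.standard_normal((6, 3)), rng.standard_normal((6, 3)))
```

Row $i$ of `A` is the attention vector $\alpha_i$, a convex combination of the value vectors over all time steps.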
Finally, the one-dimensional convolution module, the two-way long-short-term memory module and the self-attention weight generation module are cascaded together; the output of the two-way long-short-term memory module is multiplied by the attention weights generated by the self-attention mechanism, and the final prediction of the navigation data is realized through the fully connected layer of the whole network model.
A ship maneuvering motion forecasting apparatus based on a self-attention two-way long-short-term memory network, comprising:
an acquisition module, used to acquire ship motion history data and obtain a time sequence of the ship history data;
a construction and training module, used to construct a self-attention weighted two-way long-short-term memory network model and train it to obtain a trained self-attention weighted two-way long-short-term memory network model;
a prediction module, which uses the trained self-attention weighted two-way long-short-term memory network model to forecast the motion data of ship navigation.
First, the historical navigation data are input to the navigation data acquisition module, and the collected multidimensional data are converted into one-dimensional feature vectors by the one-dimensional convolution module. The two-way long-short-term memory module then learns the forward and reverse time-sequence features of these one-dimensional feature vectors, and the self-attention mechanism of the self-attention weight generation module multiplies the output of the two-way long-short-term memory module by the corresponding attention weights. Finally, online forecasting of the ship maneuvering motion is realized through the fully connected layer of the network.
Next, the intelligent ship maneuvering forecasting method of the invention is verified with a specific embodiment using sailing data of the ship KVLCC2; the data set comes from the Hamburg Ship Model Basin (HSVA), the model scale ratio is 1:45.714, and the specific parameters are shown in the following table:
TABLE 1 Main characteristic parameters of KVLCC2
As shown in FIG. 3, the raw data of the Z-maneuver experiment of KVLCC2 span a total duration of 210 seconds with a sampling interval of 0.5 seconds, and every twenty consecutive samples are input as one group. Considering the limited data, the first 70% of the data set is used as the training set and the last 30% as the test set.
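The chronological 70/30 split can be sketched as follows (the function name is illustrative; the split is not shuffled, so no future ship states leak into the training set):

```python
def train_test_split_series(samples, train_frac=0.7):
    """Chronological split of windowed samples: the first train_frac of
    the sequence becomes the training set and the remainder the test
    set. round() guards against floating-point cut-off errors."""
    cut = round(len(samples) * train_frac)
    return samples[:cut], samples[cut:]

# toy sequence of 10 windowed samples
train, test = train_test_split_series(list(range(10)))
```

With 10 samples and the default 0.7 fraction, the first 7 samples form the training set and the last 3 the test set.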
In this embodiment, the input data comprise the heading angle, yaw rate, roll angle, total speed and rudder angle, and the predicted output data are the heading angle, yaw rate, roll angle and total speed. To fully demonstrate the effectiveness and superiority of the method, typical single-model frameworks, namely support vector regression (SVR), the long-short-term memory network (LSTM) and the gated recurrent unit (GRU), are compared with the present method (SaeBil).
Taking the prediction of the yaw rate and the roll angle as examples: FIGS. 4 and 5 show, respectively, the prediction results and prediction errors of the yaw rate for the above four methods, and FIGS. 6 and 7 show those of the roll angle. It can be seen that the SVR model has a large prediction error; the GRU and LSTM models perform better than SVR yet still deviate from the measured values, indicating that they struggle to accurately extract the coupled mapping relations between the variables; only the present method (SaeBil) is closest to the real measurements. In FIGS. 5 and 7, the prediction error of the present method fluctuates around 0, and its error fluctuation range is significantly smaller than those of the other models, indicating better stability and robustness in practical application.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. A ship maneuvering motion forecasting method based on a self-attention two-way long-short-term memory network, characterized by comprising the following steps:
acquiring ship motion history data to obtain a time sequence of the ship history data;
constructing a self-attention weighted two-way long-short-term memory network model, and training the model to obtain a trained self-attention weighted two-way long-short-term memory network model;
the trained self-attention weighted two-way long-short-term memory network model is used for predicting the motion data of ship navigation.
2. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 1, characterized in that the self-attention weighted two-way long-short-term memory network model comprises:
a one-dimensional convolution module, which extracts a one-dimensional feature map of the ship history data changing over time to obtain prediction data;
a two-way long-short-term memory module, which obtains the hidden state of each time step from the prediction data through a two-way long-short-term memory network;
a self-attention weight generation module, which multiplies the feature vector of each time step by a query matrix, an address matrix and a value matrix respectively, thereby forming a self-attention mechanism;
the one-dimensional convolution module, the two-way long-short-term memory module and the self-attention weight generation module are cascaded in sequence.
3. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 1, wherein the process of acquiring the ship motion history data and obtaining the time sequence of the ship history data comprises the following steps:
the heading angle x_{1,t}, yaw rate x_{2,t}, roll angle x_{3,t}, total speed x_{4,t} and rudder angle u_{r,t} are acquired in time sequence t, wherein the heading angle x_{1,t}, yaw rate x_{2,t}, roll angle x_{3,t} and total speed x_{4,t} serve as normal inputs, and the rudder angle u_{r,t} serves as an external input; the sampling time interval is denoted T, and the ship steering motion data matrix with n sampling periods T is acquired and expressed as:

X_{t|t-n} = [x_{all,t-n}, x_{all,t-n+1}, ..., x_{all,t}],  u_{r,t|t-n} = [u_{r,t-n}, u_{r,t-n+1}, ..., u_{r,t}]

wherein x_{all,t} = [x_{1,t}, x_{2,t}, x_{3,t}, x_{4,t}]^T and u_{r,t} respectively represent the ship motion data and the external rudder angle input at time t;
a navigation data block X_{t|t-d+1} is collected from the voyage data matrix X_{t|t-n} through a sliding window, together with the exogenous input u_{r,t|t-d+1} from u_{r,t|t-n}; the data block to be predicted is denoted x̂_{all,t+1}, wherein d is the width of the sliding data window, and the ship steering motion is cyclically predicted in a sliding-window manner;
the mapping of the predicted data is expressed as:

x̂_{all,t+1} = f(X_{t|t-d+1}, u_{r,t|t-d+1}; Θ)

wherein f is the designed prediction model and Θ is the parameter matrix learned from the data samples through the sliding window.
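The sliding-window sampling described in claim 3 can be sketched with a minimal NumPy illustration. All variable names, the toy record length, and the window width d are assumptions for demonstration, not taken from the patent:

```python
import numpy as np

def sliding_windows(x_all, u_r, d):
    """Split a ship-motion record into sliding-window samples.

    x_all : (n, 4) array of [heading, yaw rate, roll, total speed] per step
    u_r   : (n,)   array of rudder-angle inputs (exogenous input)
    d     : sliding-window width

    Returns (inputs, rudder blocks, targets): each input is the d most
    recent motion rows; the target is the motion vector one step after
    the window (the block to be predicted).
    """
    X, U, Y = [], [], []
    for t in range(d, len(x_all)):
        X.append(x_all[t - d:t])   # navigation data block X_{t|t-d+1}
        U.append(u_r[t - d:t])     # exogenous input u_{r,t|t-d+1}
        Y.append(x_all[t])         # block to be predicted
    return np.stack(X), np.stack(U), np.stack(Y)

# toy record: 20 samples of 4 motion channels plus a rudder angle
rng = np.random.default_rng(0)
x_all = rng.normal(size=(20, 4))
u_r = rng.normal(size=20)
Xw, Uw, Yw = sliding_windows(x_all, u_r, d=10)
print(Xw.shape, Uw.shape, Yw.shape)   # (10, 10, 4) (10, 10) (10, 4)
```

Cyclic prediction would then feed each window to the model in turn, sliding one step at a time.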
4. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 1, wherein the process of extracting the time-varying one-dimensional feature map from the ship historical data to obtain the predicted data is as follows:
X_{t|t-n} and u_{r,t|t-n} are taken as the input of the one-dimensional convolution module, and m filter kernels w_1, w_2, ..., w_m are assumed; the constant y_{i,t-n} obtained by the i-th filter kernel convolving the input data at time t-n is calculated by:

y_{i,t-n} = w_i ⊛ conca(x_{all,t-n}, u_{r,t-n})

wherein the symbol ⊛ represents the convolution operation;
the feature vector after convolution decomposition at time t-n is expressed as y_{t-n} = [y_{1,t-n}, y_{2,t-n}, ..., y_{m,t-n}]^T; a filter is applied to each set of data to generate a coupled temporal feature map between the vessel motion parameters: Y_{t|t-n} = [y_{t-n}, y_{t-n+1}, ..., y_t].
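The per-kernel feature extraction of claim 4 can be illustrated with a small NumPy sketch. For brevity this assumes kernels of width 1 across time (each kernel mixes the coupled channels at one step); the kernel count m = 3 and all sizes are illustrative assumptions, and a real implementation would use a learned 1-D convolution layer:

```python
import numpy as np

def conv1d_features(z, kernels):
    """Apply m filter kernels to each time step of the input block.

    z       : (steps, channels) block of motion data plus rudder input
    kernels : (m, channels) filter weights, one row per kernel w_i

    Each output Y[t, i] is the inner product of kernel i with the input
    row at step t, giving an (steps, m) coupled feature map Y_{t|t-n}.
    """
    return z @ kernels.T

rng = np.random.default_rng(1)
z = rng.normal(size=(10, 5))        # 4 motion channels + rudder angle
kernels = rng.normal(size=(3, 5))   # m = 3 filter kernels
Y = conv1d_features(z, kernels)
print(Y.shape)  # (10, 3)
```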
5. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 1, wherein the process of obtaining the hidden state of each time step based on the predicted data is as follows:
the two-way long-short-term memory module is adopted to process the sequence data Y_{t|t-n} from the one-dimensional convolution module and extract its periodic characteristics; the hidden state h_{c,t} at time t is calculated by:

h_{c,t} = conca(h_{f,t}, h_{b,t}) (5)
h_{*,t} = o_{*,t} ⊙ tanh(c_{*,t}) (6)
c_{*,t} = f_{*,t} ⊙ c_{*,t-1} + i_{*,t} ⊙ c′_{*,t} (7)

wherein h_{f,t}, h_{b,t} ∈ R^l are respectively the hidden states of the forward and backward LSTM layers at time t, the subscript * ∈ {f, b} denotes the forward or backward layer, c_{*,t} and c′_{*,t} are the cell state and the candidate state, l is the dimension of the hidden layer, the symbol ⊙ denotes element-wise multiplication, and conca(·) denotes the vector concatenation operation;
the forget gate f_{*,t}, input gate i_{*,t}, output gate o_{*,t} and a priori candidate state c′_{*,t} ∈ R^l are calculated by:

f_{*,t} = σ(W_f conca(h_{*,t-1}, y_{*,t}) + b_forget) (9)
i_{*,t} = σ(W_i conca(h_{*,t-1}, y_{*,t}) + b_input) (10)
o_{*,t} = σ(W_o conca(h_{*,t-1}, y_{*,t}) + b_output) (11)
c′_{*,t} = tanh(W_c conca(h_{*,t-1}, y_{*,t}) + b_c) (12)

wherein W_f, W_i and W_o are respectively the weight parameters of the forget gate, input gate and output gate, b_forget, b_input and b_output are respectively the bias parameters of the forget gate, input gate and output gate, and σ is the sigmoid activation function; after the calculation, the hidden state of each time step may be represented as H_c = [h_{c,t-d+1}, h_{c,t-d+2}, ..., h_{c,t}].
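The gate equations (5)–(12) of claim 5 can be sketched as a bare NumPy BiLSTM pass. The dimensions (m = 3 input features, l = 4 hidden units, d = 6 steps) and random weights are illustrative assumptions only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(y_t, h_prev, c_prev, W, b):
    """One LSTM step: gates and candidate state computed from the
    concatenation of the previous hidden state and the current input."""
    z = np.concatenate([h_prev, y_t])      # conca(h_{t-1}, y_t)
    f = sigmoid(W["f"] @ z + b["f"])       # forget gate, eq. (9)
    i = sigmoid(W["i"] @ z + b["i"])       # input gate, eq. (10)
    o = sigmoid(W["o"] @ z + b["o"])       # output gate, eq. (11)
    c_cand = np.tanh(W["c"] @ z + b["c"])  # candidate state, eq. (12)
    c = f * c_prev + i * c_cand            # cell state, eq. (7)
    h = o * np.tanh(c)                     # hidden state, eq. (6)
    return h, c

def bilstm(Y, Wf, bf, Wb, bb, l):
    """Forward and backward passes, concatenated per step (eq. 5)."""
    d = len(Y)
    hf = np.zeros(l); cf = np.zeros(l)
    hb = np.zeros(l); cb = np.zeros(l)
    fwd, bwd = [], []
    for t in range(d):                     # forward LSTM layer
        hf, cf = lstm_step(Y[t], hf, cf, Wf, bf)
        fwd.append(hf)
    for t in reversed(range(d)):           # backward LSTM layer
        hb, cb = lstm_step(Y[t], hb, cb, Wb, bb)
        bwd.append(hb)
    bwd.reverse()
    return np.stack([np.concatenate([f_, b_]) for f_, b_ in zip(fwd, bwd)])

rng = np.random.default_rng(2)
m, l, d = 3, 4, 6                          # feature dim, hidden dim, window width
mk = lambda: {k: rng.normal(scale=0.1, size=(l, l + m)) for k in "fioc"}
mkb = lambda: {k: np.zeros(l) for k in "fioc"}
Wf, bf, Wb, bb = mk(), mkb(), mk(), mkb()
H = bilstm(rng.normal(size=(d, m)), Wf, bf, Wb, bb, l)
print(H.shape)  # (6, 8): d time steps, forward + backward hidden states
```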
6. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 1, wherein the process of multiplying the feature vector of each time step by the query matrix, the address matrix and the value matrix respectively, thereby forming the self-attention mechanism, is as follows:
the feature vector h_t of each time step of the LSTM output is multiplied by the query matrix W_q, the address matrix W_k and the value matrix W_v, thereby forming the self-attention mechanism, specifically expressed as:

q_t = h_t W_q (13)
k_t = h_t W_k (14)
v_t = h_t W_v (15)

wherein q_t, k_t and v_t are respectively the query vector, address vector and value vector of a single sequence at different positions, and m_ij is the attention evaluation variable of the dependence of h_i on h_j;
the attention weights are scaled using a softmax function:

g_ij = exp(m_ij) / Σ_j exp(m_ij),  α_i = Σ_j g_ij v_j

wherein g_ij represents the attention weight parameter and α_i represents the attention vector of the i-th time step; the final output of the self-attention weight generation module may be represented as A = [α_1, α_2, ..., α_d]^T, where A is the attention matrix for all time steps.
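The self-attention step of claim 6 can be illustrated with a NumPy sketch. The 1/√d_k scaling of the evaluation variable is a standard scaled dot-product assumption (the patent text does not show its exact scaling), and all dimensions are illustrative:

```python
import numpy as np

def self_attention(H, Wq, Wk, Wv):
    """Self-attention over BiLSTM hidden states (eqs. 13-15).

    H : (d, h) matrix of per-step hidden states.
    Each state is projected to query, address (key) and value vectors;
    the evaluation variable m_ij = q_i . k_j / sqrt(d_k) is normalized
    row-wise with softmax into weights g_ij, and the attention vector
    of step i is the g-weighted sum of the value vectors.
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    dk = Q.shape[1]
    M = Q @ K.T / np.sqrt(dk)                    # evaluation variables m_ij
    G = np.exp(M - M.max(axis=1, keepdims=True)) # numerically stable softmax
    G = G / G.sum(axis=1, keepdims=True)         # attention weights g_ij
    return G @ V                                 # A = [alpha_1, ..., alpha_d]

rng = np.random.default_rng(3)
d, h, dk = 6, 8, 4
H = rng.normal(size=(d, h))
Wq, Wk, Wv = (rng.normal(size=(h, dk)) for _ in range(3))
A = self_attention(H, Wq, Wk, Wv)
print(A.shape)  # (6, 4)
```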
7. The ship maneuvering motion forecasting method based on the self-attention two-way long-short-term memory network according to claim 6, further comprising: adjusting the attention evaluation variable using a softmax function to obtain the attention vector of the time step, wherein the specific process is as follows:
the attention weights are scaled using the softmax function:

g_ij = exp(m_ij) / Σ_j exp(m_ij),  α_i = Σ_j g_ij v_j

wherein g_ij represents the attention weight parameter and α_i represents the attention vector of the i-th time step; the final output of the self-attention weight generation module may be represented as A = [α_1, α_2, ..., α_d]^T, where A is the attention matrix for all time steps.
8. A ship maneuvering motion forecasting device based on a self-attention two-way long-short-term memory network, characterized by comprising:
an acquisition module, configured to acquire ship motion history data to obtain a time sequence of the ship history data;
a construction and training module, configured to construct a self-attention weighted two-way long-short-term memory network model and train it to obtain a trained self-attention weighted two-way long-short-term memory network model;
a prediction module, configured to predict the motion data of ship navigation using the trained self-attention weighted two-way long-short-term memory network model.
CN202310571525.XA 2023-05-19 2023-05-19 Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network Pending CN117634661A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310571525.XA CN117634661A (en) 2023-05-19 2023-05-19 Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network


Publications (1)

Publication Number Publication Date
CN117634661A true CN117634661A (en) 2024-03-01

Family

ID=90018745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310571525.XA Pending CN117634661A (en) 2023-05-19 2023-05-19 Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network

Country Status (1)

Country Link
CN (1) CN117634661A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117893575A (en) * 2024-03-15 2024-04-16 青岛哈尔滨工程大学创新发展中心 Ship motion prediction method and system with self-attention mechanism integrated by graph neural network
CN117909665A (en) * 2024-03-18 2024-04-19 青岛哈尔滨工程大学创新发展中心 Ship motion envelope forecast data processing method and system based on Fourier filtering
CN117893575B (en) * 2024-03-15 2024-05-31 青岛哈尔滨工程大学创新发展中心 Ship motion prediction method and system with self-attention mechanism integrated by graph neural network


Similar Documents

Publication Publication Date Title
Wang et al. Multi-dimensional prediction method based on Bi-LSTMC for ship roll
CN109214107A (en) A kind of ship's navigation behavior on-line prediction method
CN111899510A (en) Intelligent traffic system flow short-term prediction method and system based on divergent convolution and GAT
Jiang et al. Identification modeling and prediction of ship maneuvering motion based on LSTM deep neural network
CN108920888B (en) Continuous stirred tank type reaction process identification method based on deep neural network
CN112733444A (en) Multistep long time sequence prediction method based on CycleGAN neural network
He et al. Nonparametric modeling of ship maneuvering motion based on self-designed fully connected neural network
CN110633846A (en) Gas load prediction method and device
CN117634661A (en) Ship maneuvering motion forecasting method based on self-attention two-way long-short-term memory network
Xue et al. Online identification of a ship maneuvering model using a fast noisy input Gaussian process
Song et al. Segmentation of sidescan sonar imagery using markov random fields and extreme learning machine
CN114357872A (en) Ship motion black box identification modeling and motion prediction method based on stacking model fusion
CN115964932A (en) Gas prediction method based on EMD-BilSTM-Attention mechanism transformer digital twin model
Ezat et al. Evaluation of deep learning yolov3 algorithm for object detection and classification
Zhang et al. A data driven method for multi-step prediction of ship roll motion in high sea states
Mittendorf et al. The prediction of sea state parameters by deep learning techniques using ship motion data
Wang et al. Deep generation network for multivariate spatio-temporal data based on separated attention
Jiang et al. Ship attitude prediction model based on cross-parallel algorithm optimized neural network
Xiong et al. Research on online interactive identification method for motion model of double propeller propulsion unmanned surface vehicle based on ESO disturbance estimation
Zhang et al. Incremental online non-parametric modeling of surface vehicle dynamics using adaptive spectral metric Gaussian processes learning
Liu et al. A Prototype-Empowered Kernel-Varying Convolutional Model for Imbalanced Sea State Estimation in IoT-Enabled Autonomous Ship
Sapkal et al. Analysis of classification by supervised and unsupervised learning
Assani et al. A review of artificial neural networks applications in maritime industry
Zhang et al. Non-parametric dynamics modeling for unmanned surface vehicle using spectral metric multi-output Gaussian processes learning
Lan et al. Learning-based path planning algorithm in ocean currents for multi-glider

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination