CN110955965A - Pedestrian motion prediction method and system considering interaction - Google Patents

Pedestrian motion prediction method and system considering interaction

Info

Publication number
CN110955965A
Authority
CN
China
Prior art keywords
pedestrian
hidden state
observation point
current
pedestrians
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911164676.3A
Other languages
Chinese (zh)
Inventor
毛天露
黄英凡
毕慧堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201911164676.3A
Publication of CN110955965A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a pedestrian motion prediction method and system that take interaction into account. The method predicts a pedestrian's motion trajectory while fully considering the temporal correlation of the interaction between the pedestrian and surrounding pedestrians, thereby greatly improving prediction accuracy.

Description

Pedestrian motion prediction method and system considering interaction
Technical Field
The invention relates to the field of computer vision and simulation, in particular to the field of pedestrian trajectory prediction, and more particularly to a pedestrian motion prediction method and system based on deep learning and considering interaction.
Background
Existing pedestrian trajectory prediction methods fall into two categories: methods based on hand-crafted features and methods based on deep learning. Methods based on hand-crafted features include the social force model and the topic model: the social force model models pedestrian interaction as attractive and repulsive forces, while the topic model models motion patterns by combining temporal and spatial information. Methods based on deep learning include a behavior convolution model based on convolutional neural networks (CNN), a social attention model, and the Social-LSTM model based on long short-term memory (LSTM) networks. The behavior convolution model encodes pedestrians' historical motion trajectories as a series of pictures and then uses a CNN to model the interaction between pedestrians. The social attention model uses a spatial-temporal graph to represent interactions in time and space and incorporates a recurrent neural network (RNN) to model the different interactions. The Social-LSTM model uses a separate LSTM to model each pedestrian's motion trajectory and uses a "pooling" mechanism to share hidden states between different pedestrians.
Because the interactions among pedestrians are very complex and difficult to describe accurately with a few potential-energy functions, existing methods based on hand-crafted features cannot model complex motion patterns, especially in crowded environments. Existing deep-learning-based methods also have two main problems: the modules that model pedestrian interaction are not expressive enough, and the temporal correlation between interactions is not considered, so the accuracy of the predicted pedestrian motion trajectories is not high enough.
In summary, the accuracy of the pedestrian motion trajectory predicted by the existing pedestrian trajectory prediction method needs to be improved.
Disclosure of Invention
Therefore, the present invention aims to overcome the above drawbacks of the prior art and to provide a new pedestrian motion prediction method and system that comprehensively consider each pedestrian's motion pattern and the temporal correlation of pedestrian interactions, thereby overcoming, at least to some extent, the problem of low prediction accuracy in current pedestrian trajectory prediction methods.
According to a first aspect of the present invention, there is provided a pedestrian motion prediction method taking into account interaction, comprising the steps of:
S1, acquiring the position information of multiple pedestrians in the same scene at the current observation moment, and taking it as the relative position information of the first observation point within the total observation step length;
S2, processing the current observation point of each pedestrian with a first social model according to the relative position information of each pedestrian's current observation point, to obtain a first hidden state of the current observation point for each pedestrian;
S3, processing the first hidden states of the pedestrians' current observation points with a graph attention network to obtain an indirect hidden state for each pedestrian's current observation point;
S4, processing the indirect hidden state of each pedestrian's current observation point with a second social model, according to the indirect hidden state of the current observation point and the second hidden state of the previous observation point, to obtain a second hidden state of each pedestrian's current observation point;
S5, taking the first hidden state of each pedestrian's current observation point as the basis for the relative position information of the observation point reached after moving one observation step, and repeatedly executing steps S2 to S5 until the preset total observation step length is reached;
S6, fusing the first hidden state and the second hidden state of each pedestrian at the last observation point to obtain an intermediate hidden state of each pedestrian at the current moment;
S7, predicting the relative position information of each pedestrian at the next moment based on the intermediate hidden state of the pedestrian at the current moment.
Preferably, the pedestrian motion prediction method of the present invention further includes:
S8, predicting the relative position information of the next moment according to the relative position information predicted for the previous moment.
Preferably, the preset total observation step size is 8.
The first hidden state of each pedestrian's current observation point contains the motion-state information of the current observation point, including the speed, acceleration and direction at the current observation point.
The indirect hidden state of each pedestrian's current observation point contains the influence of the pedestrians around the current observation point on the current observation point.
The second hidden state of each pedestrian's current observation point contains the interaction information between that pedestrian and the surrounding pedestrians.
In step S7, a third social model is used to analyze the intermediate hidden state of each pedestrian at the current moment to predict the pedestrian's relative position information at the next moment.
According to another aspect of the present invention, there is provided a pedestrian motion prediction system considering interaction, comprising: the device comprises an encoding module, an intermediate state fusion module and a decoding module.
The encoding module comprises a first social model, a graph attention network and a second social model, wherein: the first social model is used for acquiring a first hidden state of each pedestrian containing the motion state information of the pedestrian according to the position information of each pedestrian; the graph attention network acquires an indirect hidden state of each pedestrian, which contains the influence of surrounding pedestrians on the pedestrian, according to the first hidden states of all the pedestrians in the same scene at the same moment; and the second social model acquires a second hidden state of each pedestrian containing information of interaction between the pedestrian and surrounding pedestrians according to the indirect hidden states of all the pedestrians.
The intermediate state fusion module is used for fusing the first hidden state and the second hidden state of each pedestrian to obtain an intermediate hidden state of each pedestrian;
the decoding module is used for predicting the relative position of the pedestrian at the next moment according to the intermediate implicit state of each pedestrian.
Preferably, the encoding module comprises a plurality of units, each consisting of a first social model and a second social model; the first social model of each unit analyzes the position information of one corresponding pedestrian in the same scene; the graph attention network jointly analyzes the first hidden states of all pedestrians in the same scene after the first-social-model analysis; and the second social model of each unit analyzes the indirect hidden state of the corresponding pedestrian in the same scene at the same moment.
The decoding module includes a plurality of third social models for predicting the relative position of each pedestrian at the next moment.
Compared with the prior art, the invention has the following advantages: the method predicts the pedestrian's motion trajectory while fully considering the temporal correlation of the interaction between the pedestrian and surrounding pedestrians, which greatly improves prediction accuracy; at the same time, the network structure is designed to fully model the mutual influence of the motion of different pedestrians while also considering the temporal correlation of pedestrian interaction, so a more accurate and reasonable trajectory prediction result can be obtained.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
fig. 1 is a schematic flow chart of a process for predicting the track of different pedestrians in the same scene by a pedestrian motion prediction method considering interaction according to an embodiment of the present invention;
FIG. 2 is a block diagram of a pedestrian motion prediction system that accounts for interaction according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a pedestrian trajectory prediction system predicting trajectories of different pedestrians in the same scene according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail by embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
According to an embodiment of the present invention, there is provided a pedestrian motion prediction method considering interaction, including the steps of:
step 1, setting the total observation step length of each moment, wherein each moving step length corresponds to one observation point.
The total observation step size is the number of cycles of iterative processing on the relative position information at the current time, and preferably, in an embodiment of the present invention, the total observation step size is set to 8.
Step 2, acquiring the position information of multiple pedestrians in the same scene at the current observation moment as the relative position information of the first observation point within the total observation step length.
Step 3, processing the current observation point of each pedestrian with a first social model, according to the relative position information of each pedestrian's current observation point, to obtain a first hidden state of the current observation point for each pedestrian.
In one embodiment of the invention, a first social model is adopted to analyze the relative position information of the current observation point of each pedestrian, so as to obtain a first hidden state of each pedestrian containing the motion state information of the current observation point, wherein the motion state information comprises the speed, the acceleration and the direction of the current observation point.
Specifically, the first social model analyzes the relative position information of the pedestrian's current observation point; that is, a social model is used to model the motion pattern of a single pedestrian, and this model is called M-LSTM. For each pedestrian, the relative position coordinates at each moment are used as the input of the first social model M-LSTM at that moment. The position of a pedestrian is represented by an x coordinate (abscissa) and a y coordinate (ordinate), which together determine the pedestrian's location. Here x_i^t denotes the x-axis coordinate of the ith pedestrian at time t, y_i^t denotes the y-axis coordinate of the ith pedestrian at time t, Δx_i^t denotes the change of the x coordinate of the ith pedestrian at time t relative to time t−1, and Δy_i^t denotes the change of the y coordinate at time t relative to time t−1. φ(·) denotes the encoding function of the corresponding single-layer neural network in the social model; it contains learnable parameters and projects the two-dimensional coordinates into a high-dimensional implicit space. W_ee denotes the encoding weights: in machine learning, coordinate information in a two-dimensional space is mapped to a high-dimensional space through an encoding function, so that the whole network has more capacity and can contain more information; the encoding weights are learnable parameters that are continually updated while training the network. W_m denotes the parameters of the M-LSTM, h_{m,i}^t denotes the hidden state of the M-LSTM at time t, i.e., the first hidden state of the ith pedestrian at time t, which is a high-dimensional vector containing the pedestrian's motion-state information such as speed, acceleration and direction, and e_i^t denotes the relative-position encoding of the ith pedestrian at time t:

Δx_i^t = x_i^t − x_i^{t−1},  Δy_i^t = y_i^t − y_i^{t−1}    (1)

e_i^t = φ(Δx_i^t, Δy_i^t; W_ee)    (2)

h_{m,i}^t = M-LSTM(h_{m,i}^{t−1}, e_i^t; W_m)    (3)
after the relative position information of the pedestrian at the current moment is analyzed and processed through the M-LSTM, the motion state information of the pedestrian can be obtained.
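To make this step concrete, the following PyTorch sketch shows one possible per-pedestrian motion encoder corresponding to formulas (2) and (3). It is only an illustration, not the patent's exact implementation: the class name MotionEncoder, the layer sizes embed_dim and hidden_dim, and the use of a ReLU after the single linear layer for φ(·) are assumptions.

```python
import torch
import torch.nn as nn

class MotionEncoder(nn.Module):
    """Per-pedestrian motion encoder: embeds relative displacements and updates an LSTM state."""
    def __init__(self, embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Linear(2, embed_dim)             # phi(.), weights W_ee: (dx, dy) -> high-dim code
        self.cell = nn.LSTMCell(embed_dim, hidden_dim)   # M-LSTM with parameters W_m

    def forward(self, rel_pos, state):
        # rel_pos: (N, 2) relative displacements (dx, dy) of N pedestrians at one observation point
        # state:   tuple (h, c), each of shape (N, hidden_dim); zeros at the first observation point
        e = torch.relu(self.embed(rel_pos))              # relative-position encoding e_i^t, formula (2)
        h, c = self.cell(e, state)                       # first hidden state h_{m,i}^t, formula (3)
        return h, (h, c)
```

At each observation point the forward pass consumes the (Δx, Δy) displacements of all pedestrians in the scene and returns their updated first hidden states together with the LSTM state to carry into the next observation point.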
Step 4, processing the first hidden states of the pedestrians' current observation points with a graph attention network to obtain an indirect hidden state for each pedestrian's current observation point.
In one embodiment of the invention, a graph attention network analyzes the first hidden states of the pedestrians' current observation points and obtains, for each pedestrian, an indirect hidden state containing the influence of the surrounding pedestrians on the pedestrian's current observation point.
Specifically, on the basis of the foregoing embodiment, the M-LSTM hidden states of all pedestrians whose current observation points are in the same scene, i.e., the first hidden states of the pedestrians, are input into a graph attention network (GAT), and the GAT models the mutual influence between different pedestrians based on the first hidden states of all pedestrians at the same observation point. The GAT network contains a nonlinear activation function (denoted σ(·) here) and an exponential function (denoted exp(·) here); in one embodiment of the invention, the nonlinear activation function is a ReLU function, preferably its improved variant LeakyReLU(·). W is a learnable parameter matrix of the GAT network, a is a learnable attention vector, ∥ denotes vector concatenation, and α_{ij}^t denotes the weight of the influence of pedestrian j on pedestrian i at time t, a decimal between 0 and 1; every pedestrian around pedestrian i has an influence weight on him, and all these weights sum to 1:

α_{ij}^t = exp(LeakyReLU(a^T [W h_{m,i}^t ∥ W h_{m,j}^t])) / Σ_k exp(LeakyReLU(a^T [W h_{m,i}^t ∥ W h_{m,k}^t]))    (4)

The weighted sum of the influences of all pedestrians around the ith pedestrian is expressed as formula (5):

ĥ_i^t = σ( Σ_j α_{ij}^t W h_{m,j}^t )    (5)

By analyzing the first hidden states of all pedestrians in the same scene at the same moment with the graph attention network GAT, the weighted influence sum of all pedestrians around each pedestrian, including the pedestrian itself, is obtained; this hidden state of the GAT is the pedestrian's indirect hidden state.
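The graph attention step of formulas (4) and (5) can be sketched in PyTorch as follows, assuming a fully connected interaction graph over all pedestrians in the scene; the class name GraphAttention, the dimensions, and the choice of ELU for the nonlinear activation σ(·) are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttention(nn.Module):
    """Graph attention over all pedestrians in the scene (formulas (4)-(5))."""
    def __init__(self, hidden_dim=32):
        super().__init__()
        self.W = nn.Linear(hidden_dim, hidden_dim, bias=False)  # shared learnable projection W
        self.a = nn.Linear(2 * hidden_dim, 1, bias=False)       # attention vector a

    def forward(self, h):
        # h: (N, hidden_dim) first hidden states of the N pedestrians at one observation point
        Wh = self.W(h)
        N = Wh.size(0)
        # all pairwise concatenations [W h_i || W h_j], shape (N, N, 2*hidden_dim)
        pairs = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),
                           Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        scores = F.leaky_relu(self.a(pairs).squeeze(-1))        # (N, N) raw attention scores
        alpha = F.softmax(scores, dim=1)                        # formula (4): each row sums to 1
        # formula (5): sigma(.) is taken to be ELU here, an assumption (the patent only
        # specifies "a nonlinear activation function")
        return F.elu(alpha @ Wh)
```

Each row of the returned tensor is one pedestrian's indirect hidden state, i.e. the attention-weighted sum of the projected first hidden states of all pedestrians in the scene, including that pedestrian itself.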
Step 5, processing the indirect hidden state of each pedestrian's current observation point with a second social model, according to the indirect hidden state of the current observation point and the second hidden state of the previous observation point, to obtain a second hidden state of each pedestrian's current observation point.
In an embodiment of the invention, the second social model analyzes each pedestrian's indirect hidden state together with the pedestrian's second hidden state at the previous moment, to obtain for each pedestrian's current observation point a second hidden state containing the interaction information between the pedestrian and the surrounding pedestrians; if the current observation point is the first observation point and no second hidden state from a previous moment exists, the indirect hidden state of the pedestrian's first observation point is analyzed directly.
Specifically, on the basis of the foregoing embodiment, the second social model analyzes the indirect hidden state of the pedestrian's current observation point and the pedestrian's second hidden state at the previous moment; this social model models the temporal correlation of the interaction between pedestrians and is called G-LSTM.
The hidden state of the GAT, i.e., the pedestrian's indirect hidden state ĥ_i^t, is input into the G-LSTM to model the temporal correlation of the interaction between pedestrians. W_g denotes the parameters of the G-LSTM, and h_{g,i}^t denotes the hidden state of the G-LSTM at time t, i.e., the second hidden state of the pedestrian:

h_{g,i}^t = G-LSTM(h_{g,i}^{t−1}, ĥ_i^t; W_g)    (6)

The information contained in h_{g,i}^t is the interaction relationship between the ith pedestrian and the surrounding pedestrians at time t.
Step 6, taking the first hidden state of each pedestrian's current observation point as the basis for the relative position information of the observation point reached after moving one observation step, and repeatedly executing steps S2 to S5 until the preset total observation step length is reached.
In one embodiment of the invention, the relative position information associated with the first hidden state of the previous observation point is used as the input of the M-LSTM model at the next observation point.
Specifically, on the basis of the foregoing embodiment, the relative-position encoding corresponding to the first hidden state of the ith pedestrian at time t is used as the input of the M-LSTM at the next moment t+1, i.e., after moving one step, giving the first hidden state of the pedestrian at observation point t+1. Iterating in this way for the number of steps in the total observation step length yields the first hidden state and the second hidden state of the pedestrian at the last observation point.
Step 7, fusing the first hidden state and the second hidden state of each pedestrian at the last observation point to obtain the intermediate hidden state of each pedestrian at the current moment.
In an embodiment of the present invention, on the basis of the foregoing embodiment, the hidden states of the M-LSTM and the G-LSTM corresponding to each pedestrian are fused, i.e., the first hidden state and the second hidden state of each pedestrian are fused; since both hidden states are vectors, the intermediate state of each pedestrian can be obtained directly by vector concatenation. Two encoding functions δ_1(·) and δ_2(·) are used to process the first and second hidden states of each pedestrian respectively, T_obs denotes the last observation point, and a noise matrix z is used to aid training on the pedestrian intermediate-state data:

h_{d,i}^{T_obs} = [ δ_1(h_{m,i}^{T_obs}) ∥ δ_2(h_{g,i}^{T_obs}) ∥ z ]    (10)

Formula (10) represents the intermediate hidden state of the ith pedestrian at the current moment after fusion and the addition of noise to aid training; it contains the motion information of the ith pedestrian at the current moment and the interaction information with the surrounding pedestrians.
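The fusion of formula (10) amounts to two small encodings followed by vector concatenation with a noise vector. The sketch below takes δ_1(·) and δ_2(·) to be linear layers, which is an assumption, as are the class name StateFusion and the dimensions.

```python
import torch
import torch.nn as nn

class StateFusion(nn.Module):
    """Fuses the final M-LSTM and G-LSTM hidden states into one intermediate hidden state."""
    def __init__(self, hidden_dim=32, fused_dim=24, noise_dim=8):
        super().__init__()
        self.delta1 = nn.Linear(hidden_dim, fused_dim)  # delta_1(.), applied to the first hidden state
        self.delta2 = nn.Linear(hidden_dim, fused_dim)  # delta_2(.), applied to the second hidden state
        self.noise_dim = noise_dim

    def forward(self, h_m, h_g):
        # h_m, h_g: (N, hidden_dim) first / second hidden states at the last observation point T_obs
        z = torch.randn(h_m.size(0), self.noise_dim, device=h_m.device)  # noise matrix z to aid training
        # formula (10): intermediate hidden state obtained by vector concatenation
        return torch.cat([self.delta1(h_m), self.delta2(h_g), z], dim=-1)
```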
Step 8, predicting the relative position information of the pedestrian at the next moment based on the intermediate hidden state of the pedestrian at the current moment, and then predicting the relative position information of each subsequent moment from the relative position information predicted for the previous moment, cyclically reusing the most recently predicted relative position information, until the prediction length required for the predicted pedestrian trajectory is reached.
In one embodiment of the invention, a third social model analyzes the intermediate hidden state of each pedestrian at the previous moment to predict the relative position information of each pedestrian at the next moment, until the prediction length required for the predicted pedestrian trajectory is reached; this third social model is called D-LSTM.
Specifically, following the foregoing embodiment, the intermediate state given by formula (10), i.e., the intermediate state of the ith pedestrian at the previous moment, is used as the input of the D-LSTM, and the output is the predicted motion information and position information at the next moment. W_d denotes the parameters of the D-LSTM, h_{d,i}^t denotes the hidden state of the D-LSTM at time t, i.e., the hidden state of the ith pedestrian at time t, and δ_3(·) is the encoding function of the D-LSTM. Based on the intermediate hidden state and the position information of the ith pedestrian at the last observation point T_obs, the D-LSTM obtains the hidden state at time T_obs+1, which is the predicted hidden state and relative position information of the pedestrian at the next moment:

h_{d,i}^{T_obs+1} = D-LSTM(h_{d,i}^{T_obs}, δ_3(Δx_i^{T_obs}, Δy_i^{T_obs}); W_d)    (11)

from which the predicted relative position (Δx̂_i^{T_obs+1}, Δŷ_i^{T_obs+1}) of the pedestrian at the next moment is obtained. Here (Δx̂_i^{T_obs+1}, Δŷ_i^{T_obs+1}) denotes the first relative position predicted from the relative position information of the current moment; it is substituted into formula (2), the relative position at the following moment is predicted from the relative position information predicted for the previous moment, and the loop iterates until the prediction length is reached. The prediction length is the total number of positions required for the predicted pedestrian trajectory. Briefly, the intermediate state is used as the input hidden state of the D-LSTM to obtain the predicted position at the first future moment; that position is then used as input to obtain the position at the next moment, and the loop yields a position sequence of the required prediction length.
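The decoding stage can be sketched as the loop below: the fused intermediate state initialises the D-LSTM hidden state, and each predicted displacement is re-encoded and fed back until the prediction length is reached. The class name TrajectoryDecoder, the linear output layer that maps the hidden state to (Δx, Δy), and the use of a linear layer with ReLU for δ_3(·) are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    """D-LSTM decoder: rolls the prediction forward one moment at a time."""
    def __init__(self, hidden_dim, pred_len, embed_dim=16):
        super().__init__()
        self.delta3 = nn.Linear(2, embed_dim)           # delta_3(.): encodes a (dx, dy) displacement
        self.cell = nn.LSTMCell(embed_dim, hidden_dim)  # D-LSTM with parameters W_d
        self.out = nn.Linear(hidden_dim, 2)             # maps the hidden state to the next (dx, dy)
        self.pred_len = pred_len                        # total number of positions to predict

    def forward(self, intermediate_h, last_rel_pos):
        # intermediate_h: (N, hidden_dim) fused intermediate hidden states at T_obs (formula (10))
        # last_rel_pos:   (N, 2) relative displacement observed at T_obs
        h, c = intermediate_h, torch.zeros_like(intermediate_h)
        rel, preds = last_rel_pos, []
        for _ in range(self.pred_len):
            h, c = self.cell(torch.relu(self.delta3(rel)), (h, c))
            rel = self.out(h)                           # predicted displacement at the next moment
            preds.append(rel)
        return torch.stack(preds)                       # (pred_len, N, 2)
```

The hidden size of the decoder must match the dimension of the fused intermediate state that initialises it.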
Based on the foregoing solution, according to an example of the present invention, as shown in Fig. 1, trajectory prediction is performed for three pedestrians at the same moment in the same scene. Based on the relative position information of the pedestrians 1, 2 and 3 at the current moment t, with the observation step length set to 8 and 50 trajectory position points to be predicted, the trajectories of the three pedestrians are predicted through the following steps:
t1, acquiring position information of the pedestrians 1, 2 and 3 at the current time T, taking the relative position information of the pedestrians 1, 2 and 3 at the current time T as the relative position information of the first observation point, specifically, analyzing the relative position information of the pedestrian 1 at the first observation point by using a first social model M-LSTM _1 model, taking the relative position of the first observation point of the pedestrian 1 as the input of the M-LSTM _1, and acquiring a first hidden state of the first observation point of the pedestrian 1; analyzing the relative position information of the pedestrian 2 at the first observation point by using a first social model M-LSTM _2 model, and taking the relative position of the first observation point of the pedestrian 2 as the input of the M-LSTM _2 to obtain a first hidden state of the first observation point of the pedestrian 2; analyzing the relative position information of the pedestrian 3 at the first observation point by using a first social model M-LSTM _3 model, and taking the relative position of the first observation point of the pedestrian 3 as the input of the M-LSTM _3 to obtain a first hidden state of the first observation point of the pedestrian 3; the first hidden state of the pedestrian comprises information such as the current movement speed, acceleration and direction of the pedestrian;
t2, analyzing the first hidden states of the pedestrians 1, 2, and 3 at the first observation point, obtaining indirect hidden states of the first observation point of the pedestrians 1, 2, and 3, specifically, taking all the first hidden states of the pedestrians 1, 2, and 3 at the first observation point as the input of the first attention network GAT1, so as to analyze the mutual influence of the pedestrians 1, 2, and 3, and obtain indirect hidden states of the first observation point of the pedestrians 1, 2, and 3, where the indirect hidden states of the first observation point of the pedestrian 1 include the weighted sum of the influences of the pedestrians 2 and 3 at the first observation point; the indirect implicit state of the first observation point of the pedestrian 2 comprises the weighted influence sum of the pedestrians 1 and 3 at the first observation point; the indirect implicit state of the first observation point of the pedestrian 3 comprises the weighted influence sum of the first observation point of the pedestrian 1 and the pedestrian 2;
t3, analyzing the indirect hidden states of the pedestrians 1, 2 and 3 at the first observation point respectively to obtain second hidden states of the pedestrians 1, 2 and 3 at the first observation point, specifically, analyzing the indirect hidden state of the pedestrian 1 by using a second social model G-LSTM _1, and obtaining the second hidden state of the pedestrian 1 at the first observation point by using the indirect hidden state of the pedestrian 1 at the first observation point as the input of the G-LSTM _ 1; analyzing the indirect hidden state of the pedestrian 2 by using a second social model G-LSTM _2, and taking the indirect hidden state of the pedestrian 2 at the first observation point as the input of the G-LSTM _2 to obtain a second hidden state of the pedestrian 2 at the first observation point; analyzing the indirect hidden state of the pedestrian 3 by using a second social model G-LSTM _3, and taking the indirect hidden state of the pedestrian 3 at the first observation point as the input of the G-LSTM _3 to obtain a second hidden state of the pedestrian 3 at the first observation point; the second hidden state of the pedestrian comprises interaction information of the pedestrian and surrounding pedestrians;
t4, taking the first hidden states of the pedestrians 1, 2 and 3 at the first observation point as the input of the M-LSTM models at the second observation point, which corresponds to t+1 after the current moment t moves by one observation step, to obtain the first hidden states of the pedestrians 1, 2 and 3 at the second observation point; taking all the first hidden states of the pedestrians 1, 2 and 3 at the second observation point as the input of the second graph attention network GAT2 to obtain the indirect hidden states of the pedestrians 1, 2 and 3 at the second observation point; and taking the indirect hidden states of the pedestrians 1, 2 and 3 at the second observation point together with the second hidden states of the pedestrians 1, 2 and 3 at the first observation point as the input of the corresponding G-LSTMs, to obtain the second hidden states of the pedestrians 1, 2 and 3 at the second observation point;
t5, analyzing each subsequent observation point in the same manner as the second observation point until the last observation point t+7 within the observation step length is reached, obtaining the first hidden states and the second hidden states of the pedestrians 1, 2 and 3 at the last observation point;
t6, respectively fusing the first hidden state and the second hidden state of the last observation point of the pedestrians 1, 2 and 3 to obtain the intermediate hidden states of the pedestrians 1, 2 and 3 at the current moment; the intermediate hidden state of each pedestrian comprises the motion information of the current moment and the interaction information of the pedestrian and the surrounding pedestrians;
t7, analyzing the intermediate hidden states of the pedestrians 1, 2 and 3 at the current time to obtain the position information of the next time, specifically, analyzing the intermediate hidden states of the pedestrians 1, 2 and 3 at the current time by using a third social model, taking the intermediate hidden state of the pedestrian 1 at the current time as the input of a third social model D-LSTM _1, wherein the hidden state processed by the D-LSTM _1 is the predicted relative position information of the pedestrian 1 at the next time; taking the intermediate hidden state of the pedestrian 2 at the current moment as the input of a third social model D-LSTM _2, wherein the hidden state processed by the D-LSTM _2 is the predicted relative position information of the pedestrian 2 at the next moment; taking the intermediate hidden state of the pedestrian 3 at the current moment as the input of a third social model D-LSTM _3, wherein the hidden state processed by the D-LSTM _3 is the predicted relative position information of the pedestrian 3 at the next moment;
t8, predicting the relative position information of the pedestrian at the next moment by using the predicted relative position information of the pedestrians 1, 2, 3 at the next moment, namely, cyclically using the predicted relative position information of the pedestrian at the last moment to predict the relative position information of the pedestrian at the next moment until the predicted length required by the predicted pedestrian track is reached, and obtaining 50 predicted positions;
and t9, obtaining the predicted pedestrian motion trajectory from the 50 predicted positions.
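Putting the pieces together for the worked example above (3 pedestrians, observation step length 8, 50 predicted positions), a schematic driving loop could look like the following. It assumes the illustrative MotionEncoder, GraphAttention, InteractionEncoder, StateFusion and TrajectoryDecoder classes sketched earlier in this description are already defined in scope, and it uses random placeholder data instead of real observations.

```python
import torch

# (MotionEncoder, GraphAttention, InteractionEncoder, StateFusion and TrajectoryDecoder
#  from the sketches above are assumed to be defined in the current scope.)
num_peds, obs_len, pred_len, hid = 3, 8, 50, 32

m_lstm = MotionEncoder(hidden_dim=hid)       # first hidden states (steps t1/t4)
gat = GraphAttention(hidden_dim=hid)         # indirect hidden states (steps t2/t4)
g_lstm = InteractionEncoder(hidden_dim=hid)  # second hidden states (steps t3/t4)
fuse = StateFusion(hidden_dim=hid)           # intermediate hidden states (step t6)
decoder = TrajectoryDecoder(hidden_dim=2 * 24 + 8, pred_len=pred_len)  # 56 = StateFusion output size

rel = torch.randn(obs_len, num_peds, 2)      # observed relative displacements (placeholder data)
m_state = (torch.zeros(num_peds, hid), torch.zeros(num_peds, hid))
g_state = (torch.zeros(num_peds, hid), torch.zeros(num_peds, hid))

for t in range(obs_len):                     # iterate over the 8 observation points (steps t1-t5)
    h_m, m_state = m_lstm(rel[t], m_state)   # first hidden states of pedestrians 1-3
    indirect = gat(h_m)                      # indirect hidden states
    h_g, g_state = g_lstm(indirect, g_state) # second hidden states

intermediate = fuse(h_m, h_g)                # fuse at the last observation point (step t6)
pred_rel = decoder(intermediate, rel[-1])    # 50 predicted displacements, shape (50, 3, 2) (steps t7-t8)
trajectory = torch.cumsum(pred_rel, dim=0)   # positions relative to the last observed position (step t9)
```

The cumulative sum at the end simply turns the predicted relative displacements back into positions relative to the last observed position.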
According to another embodiment of the present invention, the present invention provides a pedestrian motion prediction system considering interaction, as shown in fig. 2, including an encoding module, an intermediate state fusion module, and a decoding module, wherein:
the coding module comprises a first social model, a graph attention network and a second social model:
the first social model is used for modeling the historical motion information of pedestrians: it analyzes the position information of each pedestrian and acquires a first hidden state of each pedestrian, which contains the pedestrian's motion-state information; the graph attention network is used for jointly analyzing the first hidden states of all pedestrians in the same scene at the same moment and acquiring an indirect hidden state of each pedestrian, which contains the influence of the surrounding pedestrians on that pedestrian; the second social model is used for modeling the temporal correlation of the interaction between pedestrians and acquiring, based on the indirect hidden states of all pedestrians, a second hidden state of each pedestrian containing the interaction information between the pedestrian and the surrounding pedestrians;
the intermediate state fusion module is used for fusing the first hidden state and the second hidden state of each pedestrian to obtain an intermediate hidden state of each pedestrian;
and the decoding module is provided with a third social model and is used for predicting the relative position of the pedestrian at the next moment according to the intermediate hidden state of each pedestrian.
Preferably, the encoding module comprises a plurality of units consisting of a first social model and a second social model; at the same moment, the first social model of each unit respectively analyzes the position information of all pedestrians in the same scene, and each first social model corresponds to one pedestrian; the graph attention network carries out combined analysis on the first hidden states of all the pedestrians subjected to the first social model analysis under the same scene at the same moment; at the same moment, the second social model of each unit respectively analyzes the intermediate hidden states of all pedestrians in the same scene at the same moment, and each second social model corresponds to one pedestrian.
The decoding module comprises a plurality of third social models for predicting relative positions of pedestrians at a next moment; each third social model corresponds to a unit consisting of the first social model and the second social model.
According to an example of the present invention, based on the foregoing embodiment, as shown in Fig. 3, in a scene with three pedestrians (pedestrians 1, 2 and 3) at the same moment, the interaction-aware pedestrian motion prediction system of the present invention is used to predict the pedestrians' trajectories, with the observation step length set to n and the number of trajectory position points set to 100. The encoding module comprises at least three first social models (denoted M-LSTM_1, M-LSTM_2 and M-LSTM_3), three second social models (denoted G-LSTM_1, G-LSTM_2 and G-LSTM_3) and n graph attention networks (denoted GAT1, GAT2, ..., GATn). The decoding module includes at least three third social models (denoted D-LSTM_1, D-LSTM_2 and D-LSTM_3).
The coding module analyzes the position information of the pedestrians 1, 2 and 3 at the current time t. Specifically, the M-LSTM _1 analyzes the position information of the pedestrian 1 at the current moment t as the relative position information of the first observation point of the pedestrian 1 to obtain a first hidden state of the pedestrian 1 at the current moment t, the M-LSTM _2 analyzes the position information of the pedestrian 2 at the current moment t as the relative position information of the first observation point of the pedestrian 2 to obtain a first hidden state of the pedestrian 2 at the current moment t, and the M-LSTM _3 analyzes the position information of the pedestrian 3 at the current moment t as the relative position information of the first observation point of the pedestrian 3 to obtain a first hidden state of the pedestrian 3 at the current moment t; the GAT1 analyzes the first hidden state of the first observation point of the pedestrians 1, 2 and 3 to obtain the indirect hidden state of the pedestrians 1, 2 and 3; G-LSTM _1 analyzes the indirect hidden state of the pedestrian 1 at the current moment t to obtain a second hidden state of the pedestrian 1 at the first observation point, G-LSTM _2 analyzes the indirect hidden state of the pedestrian 2 at the first observation point to obtain a second hidden state of the pedestrian 2 at the current moment t, and G-LSTM _3 analyzes the indirect hidden state of the pedestrian 3 at the first observation point to obtain a second hidden state of the pedestrian 3 at the current moment t.
Within the set step-length range, the first hidden state of a pedestrian at the previous observation point is used as the input of the corresponding first social model at the next observation point, and the second hidden state of the pedestrian at the previous observation point together with the indirect hidden state at the next observation point is used as the input of the corresponding second social model at the next observation point; this is iterated for the number of steps in the set step length to obtain the first hidden state and the second hidden state of each pedestrian at the last observation moment. As shown in Fig. 3, for pedestrians 1, 2 and 3 the observation point at t+n is the last observation point.
The intermediate state fusion module fuses the first hidden states and the second hidden states of the pedestrians 1, 2 and 3 at the last observation point to respectively obtain the intermediate hidden states of the pedestrians 1, 2 and 3 at the current moment;
the social model in the decoding module analyzes the intermediate hidden state of the pedestrian and predicts the relative position of the pedestrian at the next moment, specifically, the D-LSTM _1 analyzes the intermediate hidden state of the pedestrian 1 to obtain the relative position of the pedestrian 1 at the next moment, the D-LSTM _2 analyzes the intermediate hidden state of the pedestrian 2 to obtain the relative position of the pedestrian 2 at the next moment, and the D-LSTM _3 analyzes the intermediate hidden state of the pedestrian 3 to obtain the relative position of the pedestrian 3 at the next moment.
The relative positions of the pedestrians 1, 2 and 3 at each subsequent moment are then predicted from the relative positions predicted by the decoding module for the previous moment, until the 100 trajectory position points required by the prediction length are reached.
The method of the invention predicts the pedestrian's motion trajectory while fully considering the temporal correlation of the interaction between the pedestrian and surrounding pedestrians, which greatly improves prediction accuracy; at the same time, the network structure is designed to fully model the mutual influence of the motion of different pedestrians while also considering the temporal correlation of pedestrian interaction, so a more accurate and reasonable trajectory prediction result can be obtained.
It should be noted that, although the steps are described in a specific order, it is not meant that the steps must be executed in the specific order, and in fact, some of the steps may be executed concurrently or even in a different order as long as the required functions are achieved.

Claims (11)

1. A pedestrian motion prediction method taking interaction into account, comprising:
S1, acquiring the position information of multiple pedestrians in the same scene at the current observation moment, and taking it as the relative position information of the first observation point within the total observation step length;
S2, processing the current observation point of each pedestrian with a first social model according to the relative position information of each pedestrian's current observation point, to obtain a first hidden state of the current observation point for each pedestrian;
S3, processing the first hidden states of the pedestrians' current observation points with a graph attention network to obtain an indirect hidden state for each pedestrian's current observation point;
S4, processing the indirect hidden state of each pedestrian's current observation point with a second social model, according to the indirect hidden state of the current observation point and the second hidden state of the previous observation point, to obtain a second hidden state of each pedestrian's current observation point;
S5, taking the first hidden state of each pedestrian's current observation point as the basis for the relative position information of the observation point reached after moving one observation step, and repeatedly executing steps S2 to S5 until the preset total observation step length is reached;
S6, fusing the first hidden state and the second hidden state of each pedestrian at the last observation point to obtain an intermediate hidden state of each pedestrian at the current moment;
S7, predicting the relative position information of each pedestrian at the next moment based on the intermediate hidden state of the pedestrian at the current moment.
2. The interaction-considered pedestrian motion prediction method according to claim 1, further comprising:
S8, predicting the relative position information of the next moment according to the relative position information predicted for the previous moment.
3. A method of pedestrian motion prediction considering interaction according to claim 2,
the preset total observation step size is 8.
4. The interaction-considered pedestrian motion prediction method according to claim 1,
the first hidden state of each pedestrian current observation point comprises the motion state information of the current observation point, and the motion state information comprises the speed, the acceleration and the direction of the current observation point.
5. The interaction-considered pedestrian motion prediction method according to claim 1,
the indirect hidden state of each pedestrian current observation point comprises the influence of the pedestrians around the current observation point on the current observation point.
6. The interaction-considered pedestrian motion prediction method according to claim 1,
the second hidden state of each pedestrian current observation point comprises the interaction information of each pedestrian with surrounding pedestrians.
7. The method as claimed in claim 1, wherein in step S7, a third social model is used to analyze the intermediate hidden state of the pedestrian at the current time to predict the relative position information of the pedestrian at the next time.
8. An interaction-aware pedestrian motion prediction system for implementing the method of any one of claims 1 to 7, comprising:
the coding module comprises a first social model, a graph attention network and a second social model, wherein:
the first social model is used for acquiring a first hidden state of each pedestrian containing the motion state information of the pedestrian according to the position information of each pedestrian;
the graph attention network acquires an indirect hidden state of each pedestrian, which contains the influence of surrounding pedestrians on the pedestrian, according to the first hidden states of all the pedestrians in the same scene at the same moment;
the second social model is used for acquiring a second hidden state of each pedestrian containing information of interaction between the pedestrian and surrounding pedestrians according to the indirect hidden states of all the pedestrians;
the intermediate state fusion module is used for fusing the first hidden state and the second hidden state of each pedestrian to obtain an intermediate hidden state of each pedestrian;
and the decoding module is provided with a third social model and is used for predicting the relative position of the pedestrian at the next moment according to the intermediate hidden state of each pedestrian.
9. The interaction-aware pedestrian motion prediction system of claim 8,
the encoding module comprises a plurality of units consisting of a first social model and a second social model:
the first social model of each unit respectively analyzes the position information of a corresponding pedestrian in the same scene;
the graph attention network carries out combined analysis on the first hidden states of all pedestrians subjected to the first social model analysis in the same scene;
the second social model of each unit respectively analyzes the indirect hidden state of a corresponding pedestrian in the same scene at the same moment;
the decoding module comprises a plurality of third social models, and each third social model corresponds to a unit consisting of the first social model and the second social model.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed, performs the steps of the method of any one of claims 1 to 7.
11. A computer device for pedestrian trajectory prediction, comprising a memory and a processor, on which a computer program is stored which is executable on the processor, characterized in that the processor realizes the steps of the method according to any one of claims 1 to 7 when executing the program.
CN201911164676.3A 2019-11-25 2019-11-25 Pedestrian motion prediction method and system considering interaction Pending CN110955965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911164676.3A CN110955965A (en) 2019-11-25 2019-11-25 Pedestrian motion prediction method and system considering interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911164676.3A CN110955965A (en) 2019-11-25 2019-11-25 Pedestrian motion prediction method and system considering interaction

Publications (1)

Publication Number Publication Date
CN110955965A true CN110955965A (en) 2020-04-03

Family

ID=69978392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911164676.3A Pending CN110955965A (en) 2019-11-25 2019-11-25 Pedestrian motion prediction method and system considering interaction

Country Status (1)

Country Link
CN (1) CN110955965A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459168A (en) * 2020-04-23 2020-07-28 上海交通大学 Fused automatic-driving automobile pedestrian crossing track prediction method and system
CN112215423A (en) * 2020-10-13 2021-01-12 西安交通大学 Pedestrian trajectory prediction method and system based on trend guiding and sparse interaction
CN112418421A (en) * 2020-11-06 2021-02-26 常州大学 Roadside end pedestrian trajectory prediction algorithm based on graph attention self-coding model
CN112651374A (en) * 2021-01-04 2021-04-13 东风汽车股份有限公司 Future trajectory prediction method based on social information and automatic driving system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108428243A (en) * 2018-03-07 2018-08-21 北京航空航天大学 A kind of pedestrian movement's speed predicting method based on artificial neural network
CN108564118A (en) * 2018-03-30 2018-09-21 陕西师范大学 Crowd scene pedestrian track prediction technique based on social affinity shot and long term memory network model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108428243A (en) * 2018-03-07 2018-08-21 北京航空航天大学 A kind of pedestrian movement's speed predicting method based on artificial neural network
CN108564118A (en) * 2018-03-30 2018-09-21 陕西师范大学 Crowd scene pedestrian track prediction technique based on social affinity shot and long term memory network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YINGFAN HUANG et al.: "STGAT: Modeling Spatial-Temporal Interactions for Human Trajectory Prediction", 2019 IEEE/CVF International Conference on Computer Vision (ICCV) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459168A (en) * 2020-04-23 2020-07-28 上海交通大学 Fused automatic-driving automobile pedestrian crossing track prediction method and system
CN112215423A (en) * 2020-10-13 2021-01-12 西安交通大学 Pedestrian trajectory prediction method and system based on trend guiding and sparse interaction
CN112215423B (en) * 2020-10-13 2023-06-06 西安交通大学 Pedestrian track prediction method and system based on trend guidance and sparse interaction
CN112418421A (en) * 2020-11-06 2021-02-26 常州大学 Roadside end pedestrian trajectory prediction algorithm based on graph attention self-coding model
CN112418421B (en) * 2020-11-06 2024-01-23 常州大学 Road side end pedestrian track prediction algorithm based on graph attention self-coding model
CN112651374A (en) * 2021-01-04 2021-04-13 东风汽车股份有限公司 Future trajectory prediction method based on social information and automatic driving system

Similar Documents

Publication Publication Date Title
Huang et al. Stgat: Modeling spatial-temporal interactions for human trajectory prediction
Hou et al. Interactive trajectory prediction of surrounding road users for autonomous driving using structural-LSTM network
JP7335274B2 (en) Systems and methods for geolocation prediction
Pfeiffer et al. A data-driven model for interaction-aware pedestrian motion prediction in object cluttered environments
Li et al. Convolutional sequence to sequence model for human dynamics
Tang et al. Multiple futures prediction
Tang et al. Long-term human motion prediction by modeling motion context and enhancing motion dynamic
CN110955965A (en) Pedestrian motion prediction method and system considering interaction
Ivanovic et al. Generative modeling of multimodal multi-human behavior
CN109086873B (en) Training method, recognition method and device of recurrent neural network and processing equipment
WO2018059300A1 (en) Method and device for predicting walking behaviour, data processing device and electronic apparatus
CN110062934A (en) The structure and movement in image are determined using neural network
CN112419722B (en) Traffic abnormal event detection method, traffic control method, device and medium
Chen et al. Pedestrian trajectory prediction in heterogeneous traffic using pose keypoints-based convolutional encoder-decoder network
de Brito et al. Social-vrnn: One-shot multi-modal trajectory prediction for interacting pedestrians
CN112347923A (en) Roadside end pedestrian track prediction algorithm based on confrontation generation network
WO2021096776A1 (en) Simulating diverse long-term future trajectories in road scenes
CN112418432A (en) Analyzing interactions between multiple physical objects
Su et al. Crossmodal transformer based generative framework for pedestrian trajectory prediction
Singh et al. Multi-scale graph-transformer network for trajectory prediction of the autonomous vehicles
Dan Spatial-temporal block and LSTM network for pedestrian trajectories prediction
Julka et al. Conditional generative adversarial networks for speed control in trajectory simulation
CN116071728A (en) Pedestrian track prediction method based on transducer and attitude estimation and storage medium
CN115222769A (en) Trajectory prediction method, device and agent
Kang et al. ETLi: Efficiently annotated traffic LiDAR dataset using incremental and suggestive annotation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200403