CN115082896A - Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network - Google Patents

Info

Publication number
CN115082896A
Authority
CN
China
Prior art keywords
pedestrian
network
attention
data
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210741506.2A
Other languages
Chinese (zh)
Inventor
孔令悦
孙长银
王远大
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202210741506.2A
Publication of CN115082896A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network. A graph attention network and a graph-based deep self-attention network extract the local and global spatial interaction features, respectively, from pedestrian motion trajectories, and an original deep self-attention network then extracts the temporal features. To model the inherent uncertainty and multi-modal nature of pedestrian motion, the invention expands the exploration space of the trajectory by introducing Gaussian noise into a fully connected network decoder. To further improve the exploration space and smoothness, the trajectory is sent to a trajectory correction module for correction. Compared with other methods, the graph attention network and the graph-based deep self-attention network attend more closely to the various spatial interaction modes in pedestrian motion, such as walking side by side and avoiding potential obstacles. Compared with other pedestrian trajectory prediction methods, the invention is more effective at extracting social interaction features and at multi-modal exploration.

Description

Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network
Technical Field
The invention belongs to the fields of deep learning and autonomous driving, and in particular relates to a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network.
Background
In complex crowd scenes, pedestrians pose a major challenge to dynamic obstacle-avoidance motion planning for mobile robots and unmanned vehicles. Predicting pedestrian trajectories helps improve the efficiency of obstacle-avoidance planning for unmanned vehicles and reduces the accident rate. Traditional pedestrian trajectory prediction methods mainly rely on statistical probability methods, such as hidden Markov models and Bayesian methods, or on hand-crafted rules and functions. These methods are typically applied to sparse crowds with little randomness in motion state and are difficult to transfer to complex nonlinear environments; once complex social interaction occurs among pedestrians, the validity of the prediction is hard to guarantee. With the development of neural networks, methods that recast trajectory prediction as a time-series generation task have been proposed (Alahi A, Goel K, Ramanathan V, et al.). The core idea is to use a recurrent neural network, such as a long short-term memory (LSTM) network, to extract the temporal information in pedestrian trajectories and to model the spatial interaction among trajectories by pooling the hidden states of the network; however, this approach is weak at extracting spatial interaction features among pedestrians. With the development of deep learning, in particular graph neural networks and self-attention mechanisms, it has become possible to model pedestrian interaction with graph neural networks and attention mechanisms.
A better existing trajectory prediction method uses a sparse graph convolution network (Shi L, Wang L, Long C, et al. SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction [C]// CVPR, 2021). Although this method can model the interaction among pedestrians to some extent, its temporal feature extraction is weak. More importantly, it cannot effectively model the multi-modal nature of pedestrian motion, and the inherent uncertainty of pedestrian motion is particularly important in trajectory prediction.
Disclosure of Invention
To overcome the shortcomings of the prior art, balance temporal feature extraction against spatial interaction feature extraction, and account for the inherent randomness of pedestrian trajectories, the invention provides a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network. The method extracts temporal features from historical trajectories, explores the interaction information and behavior patterns among pedestrians, recovers the multi-modal nature of pedestrian motion, expands the exploration space of trajectory prediction, and predicts the future trajectory points of pedestrians.
To achieve this purpose, the technical solution of the invention is as follows:
The pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network comprises the following steps:
Step one: preprocess the data to meet the neural network's requirements on input data, and train and test the model parameters using leave-one-out cross-validation;
Step two: after the raw pedestrian trajectory data of step one are obtained, embed the data into a high-dimensional space using a fully connected network, and construct a topological graph structure to meet the input requirement of the spatial interaction feature encoder; to fully extract spatial interaction features, feed the high-dimensional data into the spatial interaction feature encoder to obtain high-dimensional data with spatial interaction features, and concatenate the outputs of the encoder's two networks using a fully connected network so that the output dimension matches that of the original high-dimensional data;
Step three: concatenate the high-dimensional data with spatial interaction features obtained in step two with the original high-dimensional data, and feed the result into a temporal feature encoder to extract temporal features;
Two self-attention mechanisms extract the global and local spatial interaction information among pedestrian trajectories, respectively, so as to fully explore the interaction information and behavior patterns among pedestrians. The graph attention network adopts an attention mechanism based on the relative-distance correlation coefficient between neighbors
Figure BDA0003718185250000021
as follows:
Figure BDA0003718185250000022
Figure BDA0003718185250000023
wherein (x) i ,y i ) Is a two-dimensional spatial location coordinate point of the pedestrian i,
Figure BDA0003718185250000024
for a fully-connected network embedding function, W is a network matrix parameter, the part of attention mainly emphasizes the influence of relative distance between local neighbors on a motion track, and a depth self-attention network based on a graph adopts a self-attention mechanism to emphasize the influence of a global relation on the motion track;
step four, in order to simulate the inherent uncertainty and the multi-modal characteristics of the pedestrian motion trajectory, Gaussian sampling noise is introduced into the high-dimensional data which is obtained in the step three and has space interaction characteristics and time sequence characteristics at the same time, and then the data are sent into a fully-connected neural network decoder to obtain a predicted pedestrian trajectory sequence;
and step five, sending the result to a trajectory correction module to improve the smoothness and continuity of the path.
As a further improvement of the invention, in step two the spatial interaction feature encoder, which comprises a graph attention network and a deep self-attention network based on the topological graph structure, encodes the high-dimensional data and extracts spatial interaction features from the pedestrian trajectory data.
As a further improvement of the invention, in step three a temporal feature encoder (comprising an original deep self-attention network) extracts the temporal features from the pedestrian trajectory data.
As a further improvement of the invention, in step four the inherent uncertainty and multi-modal nature of pedestrian motion are modeled by introducing Gaussian sampling noise into the data, which expands the exploration space of the pedestrian trajectory.
As a further improvement of the invention, in step five curve fitting and a binary classification network are used to further expand the network's exploration space and enhance the continuity of the curve, wherein the curvature s is calculated as follows:
Figure BDA0003718185250000031
Compared with the prior art, the invention has the following beneficial effects: (1) two self-attention mechanisms, a graph attention network and a graph-based deep self-attention network, extract spatial interaction features, capturing respectively the influence of the relative distance to local neighboring pedestrians and the influence of global relationships on pedestrian trajectories; the social interaction information and the various interaction modes in pedestrian trajectories are thus explored and identified more fully, including complex situations such as pedestrians walking side by side, pedestrians detouring in advance to avoid a potential collision, and a single pedestrian avoiding an oncoming group; (2) a deep self-attention network extracts the temporal features of the historical pedestrian trajectory, and the mask mechanism (Masked) of the deep self-attention network strengthens the model's temporal feature extraction; (3) Gaussian noise is introduced into the multi-modal decoder to model the multi-modal nature and inherent uncertainty of pedestrian motion, expanding the exploration space of the trajectory so that it is closer to real situations; (4) the final position point is sent to a correction module, further improving the trajectory exploration space and continuity.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 illustrates a trajectory prediction network framework according to the present invention;
FIG. 3 illustrates a pedestrian topology of the present invention;
FIG. 4 illustrates a pedestrian candidate trajectory generation module of the present invention;
FIG. 5 illustrates the binary classification network of the present invention.
Detailed Description
The invention is described in further detail below with reference to the following detailed description and accompanying drawings:
The invention provides a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention neural network. The process comprises data preprocessing, deep neural network model construction, model parameter training, and pedestrian trajectory prediction, as shown in FIG. 1.
In the data preprocessing stage, the pedestrian data are preprocessed: pedestrian trajectories are converted into coordinate points in a two-dimensional spatial coordinate system, the time-series information is encoded, the two-dimensional spatial data are expanded into a high-dimensional space to meet the input requirement of the deep self-attention network, and a topological graph structure is constructed from the pedestrian trajectory data to meet the input requirement of the graph neural network.
In the deep neural network construction stage, a machine learning library such as PyTorch is used to build the deep neural network framework. Two encoders and one decoder are built: a spatial feature extraction encoder (comprising a graph attention network and a graph-based deep self-attention network), a temporal feature extraction encoder (comprising a deep self-attention network), and a multi-modal decoder (comprising a fully connected network).
In the model parameter training stage, the network hyper-parameters and loss function are set, the network model parameters are trained using leave-one-out cross-validation, and the model is evaluated with two metrics, the average displacement error (ADE) and the final displacement error (FDE).
In the trajectory correction stage, several candidate trajectories are generated from the final position point and fed into a binary classification network, which is trained to evaluate the trajectories.
In the pedestrian trajectory prediction stage, the pedestrian motion trajectory sequence to be predicted is input, and the trained trajectory prediction network framework generates the pedestrian's future motion trajectory sequence.
The specific contents of each stage are described in detail as follows:
(1) In the data preprocessing stage, the data must be preprocessed to meet the training input requirement of the network model. The raw data generally include the time information t, the pedestrian label i, and the pedestrian's spatial position (x_i, y_i). To meet the input and training requirements of the deep self-attention network, the two-dimensional spatial coordinates are mapped to high-dimensional data (32 dimensions in the invention) using a fully connected network. In addition, to meet the input requirements of the graph attention network and the topological-graph-based self-attention network, the pedestrians are organized into a topological graph structure according to their spatial positions, yielding the adjacency matrix N(i).
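The graph construction described above can be illustrated with a minimal sketch that links pedestrians in one frame whenever their Euclidean distance falls below a threshold. The function name `build_adjacency` and the `radius` value are assumptions made for illustration; the source does not state which rule decides whether two pedestrians are neighbors.

```python
import math

def build_adjacency(points, radius=2.0):
    """Return an n x n 0/1 adjacency matrix for one frame.

    points: list of (x, y) positions, one per pedestrian.
    radius: hypothetical neighbor threshold (not taken from the source).
    """
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                adj[i][j] = adj[j][i] = 1  # undirected edge between i and j
    return adj

# Three pedestrians: the first two are close, the third is far away.
frame = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
adjacency = build_adjacency(frame)
```

A distance threshold is only one possible choice; a fully connected graph or a k-nearest-neighbor graph would fit the description equally well.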
(2) In the model construction stage, the overall trajectory prediction framework is built with a deep learning library. It comprises a spatial interaction feature extraction encoder, a temporal feature extraction encoder, and a multi-modal decoder, as shown in FIG. 2. A correction module is also constructed, as shown in FIGS. 4 and 5; the overall flow is shown in FIG. 1.
The spatial interaction feature extraction encoder consists of a graph attention network and a graph-based deep self-attention network; the pedestrian topological graph structure is shown in FIG. 3. Because pedestrian social interaction is complex, a single network can hardly extract the spatial features of pedestrians sufficiently in complex situations, such as friends walking side by side or avoiding potential collisions with strangers. The invention therefore uses two network structures, the graph attention network and the graph-based deep self-attention network, to strengthen the extraction of global and local spatial interaction features from pedestrian trajectories. The graph attention network uses the relative-distance correlation coefficient between neighbors
Figure BDA0003718185250000041
to capture the influence of local relationships on the pedestrian trajectory. The correlation coefficient
Figure BDA0003718185250000042
is calculated as follows:
Figure BDA0003718185250000043
where l is the number of iteration layers of the graph attention network,
Figure BDA0003718185250000044
is the spatial coordinate of pedestrian i at time t, and W_r is the parameter matrix of the embedding function
Figure BDA0003718185250000045
The attention is obtained from the correlation coefficient:
Figure BDA0003718185250000046
Figure BDA0003718185250000047
where
Figure BDA0003718185250000048
is the state value of pedestrian i at time t, with initial input
Figure BDA0003718185250000049
N(i) is the adjacency matrix, and W_α is likewise the parameter matrix of the embedding function
Figure BDA00037181852500000410
The message passing mechanism of the graph attention network is as follows:
Figure BDA00037181852500000411
The graph-based deep self-attention network also operates on the topological graph structure and takes the same input as the graph attention network; it uses a self-attention mechanism to emphasize the influence of global interaction information on each pedestrian's own trajectory. First, from the high-dimensional spatial input
Figure BDA0003718185250000051
the query matrix q_i, key matrix k_i, and value matrix v_i are extracted:
Figure BDA0003718185250000052
Graph-based deep self-attention networks also employ message passing mechanisms:
Figure BDA0003718185250000053
The attention and the output h'_{S,i} are computed as follows:
Figure BDA0003718185250000054
h'_{S,i} = f_out(Att(i)) + Att(i)
where d_k is the dimension of the key matrix, and f_out is the output function, here a fully connected network; as the formula shows, the output uses a skip (residual) connection.
After the outputs of the graph attention network and the graph-based deep self-attention network are obtained, a fully connected network concatenates the two outputs and keeps their dimension consistent with that of the original high-dimensional data:
Figure BDA0003718185250000055
The high-dimensional data with spatial interaction features
Figure BDA0003718185250000056
is concatenated with the original high-dimensional data to obtain the input of the temporal feature extraction encoder
Figure BDA0003718185250000057
The encoder uses the original deep self-attention network structure. As before, the query matrix Q_i, key matrix K_i, and value matrix V_i are extracted first:
Figure BDA0003718185250000058
The self-attention formula is:
Figure BDA0003718185250000059
To extract feature information of different aspects and strengthen the feature extraction capability of the network, the invention uses a multi-head attention mechanism:
Figure BDA00037181852500000510
where head_j = Attention_j(Q_i, K_i, V_i) is the j-th head, and f_o is a fully connected output function that performs a weighted fusion of the multi-head features.
After the data fusing the spatial interaction features and the temporal features are obtained, Gaussian sampling noise is added to the data, which are then input into the multi-modal decoder. The multi-modal decoder uses a fully connected network to map the high-dimensional data to coordinate points in a two-dimensional spatial coordinate system and thereby outputs the trajectory sequence.
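To illustrate how noise injection yields multiple candidate predictions from one encoded feature, the sketch below perturbs a feature vector with Gaussian noise and decodes each noisy copy with a single linear layer. The function name `sample_predictions`, the additive form of the noise, the number of samples, and the one-layer decoder are all assumptions; the source only states that Gaussian sampling noise is introduced before a fully connected decoder.

```python
import random

def sample_predictions(feature, weights, bias, num_samples=20, sigma=1.0, seed=0):
    """Decode several noisy copies of one feature vector into (x, y) points.

    feature: encoded feature vector of length d.
    weights: 2 x d matrix and bias: length-2 list, a hypothetical single
             linear layer standing in for the fully connected decoder.
    sigma:   standard deviation of the additive Gaussian noise (assumed).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    samples = []
    for _ in range(num_samples):
        noisy = [f + rng.gauss(0.0, sigma) for f in feature]
        point = tuple(sum(w * z for w, z in zip(row, noisy)) + b
                      for row, b in zip(weights, bias))
        samples.append(point)
    return samples

# Toy usage with a 2-dimensional feature and an identity decoder.
candidates = sample_predictions([1.0, 2.0],
                                [[1.0, 0.0], [0.0, 1.0]],
                                [0.5, -0.5])
```

Each sample corresponds to one plausible future position; drawing several samples is what gives the predictor its multi-modal output.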
The known trajectory points and the predicted final position point are then sent to the correction module, as shown in FIG. 4. The correction module first collects eight candidate points around the predicted final position point and then generates candidate trajectories from the candidate points by cubic curve fitting. The curvature s is calculated as follows:
Figure BDA0003718185250000061
where the known final point coordinate is (x_1, y_1), the candidate point coordinate is (x_2, y_2), and n is the number of neighbors in the topological graph. The curvature point is the point on the perpendicular bisector of the two points at distance s from their midpoint.
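The geometry of the correction step can be sketched as follows. The 3 x 3 grid layout of the eight candidate points, the `step` spacing, and both function names are assumptions; the curvature s is taken as an input because its formula is available only as an image reference, and the cubic curve fit itself is not shown.

```python
import math

def candidate_points(final_point, step=0.5):
    """Eight points surrounding the predicted final point on a 3 x 3 grid.

    step: hypothetical grid spacing (not taken from the source).
    """
    x, y = final_point
    return [(x + dx * step, y + dy * step)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def curvature_point(p1, p2, s):
    """Point on the perpendicular bisector of p1-p2 at distance s from the midpoint."""
    (x1, y1), (x2, y2) = p1, p2
    mx, my = (x1 + x2) / 2, (y1 + y2) / 2
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("p1 and p2 must be distinct")
    # Unit normal to the segment; which side to use is an arbitrary choice here.
    nx, ny = -dy / length, dx / length
    return (mx + s * nx, my + s * ny)

# Toy usage: last known point, one candidate end point, and an assumed s.
around = candidate_points((2.0, 0.0))
control = curvature_point((0.0, 0.0), (2.0, 0.0), 1.0)
```

The known point, the curvature point, and the candidate point would then serve as anchors for the curve fit that produces one candidate trajectory.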
All candidate trajectories are labeled as positive or negative samples according to their average displacement error, at a ratio of 1:3, and are fed into a binary classification network for training; the binary classification network consists of two fully connected networks.
(3) In the model parameter training stage, after the hyper-parameters and the loss function are set, the invention trains on two public pedestrian trajectory datasets, ETH and UCY. ETH consists of two subsets, ETH and Hotel, and UCY consists of ZARA1, ZARA2, and UNIV. Leave-one-out cross-validation is used: one dataset serves as the test set and the other four as the training set.
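The leave-one-out protocol over the five subsets named above can be written down in a few lines. The function name `leave_one_out_splits` is an illustrative assumption; only the subset names come from the text.

```python
DATASETS = ["ETH", "Hotel", "ZARA1", "ZARA2", "UNIV"]

def leave_one_out_splits(names):
    """Yield (training subsets, held-out test subset) for each subset in turn."""
    for held_out in names:
        train = [name for name in names if name != held_out]
        yield train, held_out

splits = list(leave_one_out_splits(DATASETS))
for train_names, test_name in splits:
    print(f"train on {train_names}, test on {test_name}")
```

Each of the five runs trains a fresh model, and the reported metric is usually the average over the five held-out subsets.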
Alternatively, a self-collected dataset can be used. The dataset samples the pedestrians in the scene every 0.4 seconds to form one frame of data. During model training, 8 frames (3.2 seconds) are taken as the historical pedestrian trajectory, and the pedestrian trajectory sequence for the following 12 frames (4.8 seconds) is predicted.
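The 8-frame observation and 12-frame prediction split can be sketched with a sliding window over one pedestrian's track. The function name `split_windows` and the stride of one frame are assumptions; the frame counts come from the text.

```python
def split_windows(track, obs_len=8, pred_len=12):
    """Cut one pedestrian track into (observed, future) pairs.

    track: list of (x, y) positions sampled every 0.4 s.
    A stride of one frame between windows is assumed.
    """
    total = obs_len + pred_len
    windows = []
    for start in range(len(track) - total + 1):
        chunk = track[start:start + total]
        windows.append((chunk[:obs_len], chunk[obs_len:]))
    return windows

# A straight-line track of 21 frames yields two training windows.
track = [(0.1 * t, 0.0) for t in range(21)]
pairs = split_windows(track)
```

Tracks shorter than 20 frames produce no windows and are simply skipped under this scheme.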
The evaluation metrics are the average displacement error (ADE) and the final displacement error (FDE). ADE is the mean Euclidean distance between the predicted and actual trajectory points over the 12 future frames:
Figure BDA0003718185250000062
where N is the number of predicted pedestrians, T_p = 12 is the maximum number of frames,
Figure BDA0003718185250000063
is the actual position sequence,
Figure BDA0003718185250000064
is the predicted trajectory sequence, and ||·|| is the Euclidean distance between two points. FDE is the Euclidean distance between the predicted and actual positions in the last frame:
Figure BDA0003718185250000065
where T_f is the final time point.
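The two metrics follow directly from the verbal definitions above. In this sketch ADE averages over all pedestrians and frames, and FDE averages the last-frame error over pedestrians; the function names `ade` and `fde` and the averaging of FDE over pedestrians are assumptions, since the original formulas are available only as image references.

```python
import math

def ade(pred, truth):
    """Mean Euclidean error over all pedestrians and all predicted frames.

    pred, truth: list over pedestrians of equal-length lists of (x, y).
    """
    total, count = 0.0, 0
    for pred_track, true_track in zip(pred, truth):
        for p, t in zip(pred_track, true_track):
            total += math.dist(p, t)
            count += 1
    return total / count

def fde(pred, truth):
    """Mean Euclidean error at the final predicted frame."""
    errors = [math.dist(p[-1], t[-1]) for p, t in zip(pred, truth)]
    return sum(errors) / len(errors)

# Toy usage: one pedestrian, two frames, error only in the last frame.
truth = [[(0.0, 0.0), (1.0, 0.0)]]
pred = [[(0.0, 0.0), (1.0, 1.0)]]
ade_value = ade(pred, truth)
fde_value = fde(pred, truth)
```

With multi-modal predictors, a common practice is to report the minimum ADE and FDE over the sampled trajectories, though the source does not say whether that is done here.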
(4) In the pedestrian trajectory prediction stage, the pedestrian motion trajectory to be predicted is input into the trained deep network framework, which outputs the corresponding pedestrian trajectory for the next 12 frames. Dense scenes are complex: social interaction is frequent and many trajectories are curved, which places high demands on the model's ability to extract spatial interaction features among pedestrians. The method can effectively distinguish complex global and local interactions in dense crowd scenes, including walking side by side in a group, avoiding potential collisions, walking alone, and keeping a reasonable social distance from the crowd, and it approximates the inherent uncertainty and multi-modal nature of pedestrian motion as closely as possible.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, but any modifications or equivalent variations made according to the technical spirit of the present invention are within the scope of the present invention as claimed.

Claims (5)

1. A pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network, characterized by comprising the following steps:
Step one: preprocess the data to meet the neural network's requirements on input data, and train and test the model parameters using leave-one-out cross-validation;
Step two: after the raw pedestrian trajectory data of step one are obtained, embed the data into a high-dimensional space using a fully connected network, and construct a topological graph structure to meet the input requirement of the spatial interaction feature encoder; to fully extract spatial interaction features, feed the high-dimensional data into the spatial interaction feature encoder to obtain high-dimensional data with spatial interaction features, and concatenate the outputs of the encoder's two networks using a fully connected network so that the output dimension matches that of the original high-dimensional data;
Step three: concatenate the high-dimensional data with spatial interaction features obtained in step two with the original high-dimensional data, and feed the result into a temporal feature encoder to extract temporal features;
Two self-attention mechanisms extract the global and local spatial interaction information among pedestrian trajectories, respectively, so as to fully explore the interaction information and behavior patterns among pedestrians, wherein the graph attention network adopts an attention mechanism based on the relative-distance correlation coefficient between neighbors
Figure FDA0003718185240000011
as follows:
Figure FDA0003718185240000012
Figure FDA0003718185240000013
wherein (x) i ,y i ) Is a two-dimensional spatial location coordinate point of the pedestrian i,
Figure FDA0003718185240000014
for a fully connected network embedding function, W is a network matrix parameter,
Figure FDA0003718185240000015
is the state value of the pedestrian i at the time point t. The part of attention mainly emphasizes the influence of relative distance between local neighbors on the motion trail, and the depth self-attention network based on the graph adopts a self-attention mechanism to emphasize the influence of global relations on the motion trail;
step four, in order to simulate the inherent uncertainty and the multi-modal characteristics of the pedestrian motion trajectory, Gaussian sampling noise is introduced into the high-dimensional data which is obtained in the step three and has space interaction characteristics and time sequence characteristics at the same time, and then the data are sent into a fully-connected neural network decoder to obtain a predicted pedestrian trajectory sequence;
and step five, sending the result to a trajectory correction module to improve the smoothness and continuity of the path.
2. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step two, the spatial interaction feature encoder that encodes the high-dimensional data comprises a graph attention network and a graph-based deep self-attention network.
3. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: the temporal feature encoder comprises a deep self-attention network.
4. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step four, the inherent uncertainty and multi-modal nature of pedestrian motion trajectories are modeled by introducing Gaussian sampling noise into the data, which expands the exploration space of the pedestrian trajectory.
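The noise-injection idea of step four can be sketched as follows. This is only an illustration under stated assumptions: the claims do not say whether the Gaussian noise is concatenated or added, nor how the decoder is sized, so the sketch concatenates a noise vector to each feature vector and uses a two-layer fully connected decoder. The function name `sample_trajectories` and all shapes are hypothetical.

```python
import numpy as np


def sample_trajectories(features, W1, b1, W2, b2, noise_dim, k, pred_len, rng):
    """Draw k candidate future trajectories per pedestrian.

    features: (N, D) encoded spatial + temporal features
    W1, b1:   first decoder layer, shapes (D + noise_dim, H) and (H,)
    W2, b2:   second decoder layer, shapes (H, pred_len * 2) and (pred_len * 2,)
    Returns an array of shape (k, N, pred_len, 2).
    """
    n = features.shape[0]
    samples = []
    for _ in range(k):
        z = rng.standard_normal((n, noise_dim))        # fresh Gaussian noise per sample
        x = np.concatenate([features, z], axis=1)      # assumption: noise is concatenated
        hidden = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
        out = hidden @ W2 + b2                         # flat (x, y) sequence
        samples.append(out.reshape(n, pred_len, 2))
    return np.stack(samples)


# Usage with random placeholder weights: 3 pedestrians, 16-dim features,
# 4-dim noise, 12 predicted steps, 5 samples.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 16))
W1, b1 = rng.normal(size=(20, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 24)), np.zeros(24)
trajs = sample_trajectories(features, W1, b1, W2, b2, noise_dim=4, k=5, pred_len=12, rng=rng)
```

Because each sample uses a different noise draw, the same encoded features yield several distinct candidate trajectories, which is one common way to represent multi-modal futures.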
5. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step five, curve fitting and a binary classification network are adopted to further expand the exploration space of the network and enhance the continuity of the curve, wherein the curvature $S$ is calculated as follows:
Figure FDA0003718185240000021
wherein the final point coordinate is $(x_1, y_1)$, the candidate point coordinate is $(x_2, y_2)$, and $n$ is the number of neighbors in the topological graph.
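The curve-fitting part of the trajectory correction in claim 5 can be illustrated with a small sketch. It does not reproduce the curvature $S$ (available only as a figure here) or the binary classification network. It only shows, as an assumption, how a low-degree polynomial least-squares fit over time can smooth a predicted path. The function name `smooth_trajectory` and the choice of degree 2 are hypothetical.

```python
import numpy as np


def smooth_trajectory(traj, degree=2):
    """Smooth a (T, 2) trajectory by fitting x(t) and y(t) polynomials.

    Each coordinate is fitted independently by least squares and then
    re-evaluated at the original time steps.
    """
    t = np.arange(traj.shape[0], dtype=float)
    cx = np.polyfit(t, traj[:, 0], degree)     # coefficients for x(t)
    cy = np.polyfit(t, traj[:, 1], degree)     # coefficients for y(t)
    return np.stack([np.polyval(cx, t), np.polyval(cy, t)], axis=1)


# Usage: a gently curving 12-step path with small synthetic jitter.
rng = np.random.default_rng(0)
t = np.arange(12.0)
clean = np.stack([0.4 * t, 0.05 * t**2], axis=1)
noisy = clean + rng.normal(scale=0.05, size=clean.shape)
smoothed = smooth_trajectory(noisy)
```

A low polynomial degree trades fidelity for smoothness; a path that is already a polynomial of that degree is returned unchanged up to floating-point error.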
CN202210741506.2A 2022-06-28 2022-06-28 Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network Pending CN115082896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210741506.2A CN115082896A (en) 2022-06-28 2022-06-28 Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network

Publications (1)

Publication Number Publication Date
CN115082896A true CN115082896A (en) 2022-09-20

Family

ID=83255079

Country Status (1)

Country Link
CN (1) CN115082896A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538506A (en) * 2021-07-23 2021-10-22 陕西师范大学 Pedestrian trajectory prediction method based on global dynamic scene information depth modeling
US20220011122A1 (en) * 2020-07-09 2022-01-13 Beijing Tusen Weilai Technology Co., Ltd. Trajectory prediction method and device
CN114117259A (en) * 2021-11-30 2022-03-01 重庆七腾科技有限公司 Trajectory prediction method and device based on double attention mechanism

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116291336A (en) * 2023-02-14 2023-06-23 电子科技大学 Automatic segmentation clustering system based on deep self-attention neural network
CN116291336B (en) * 2023-02-14 2024-05-24 电子科技大学 Automatic segmentation clustering system based on deep self-attention neural network
CN116257659A (en) * 2023-03-31 2023-06-13 华中师范大学 Dynamic diagram embedding method and system of intelligent learning guiding system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination