CN115082896A - Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network - Google Patents
Info
- Publication number
- CN115082896A (application CN202210741506.2A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- network
- attention
- data
- self
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network. A graph attention network and a graph-based deep self-attention network extract the local and global spatial interaction features of pedestrian trajectories, respectively, and an original deep self-attention network then extracts the temporal features. To model the inherent uncertainty and multi-modal nature of pedestrian motion, the invention expands the exploration space of the predicted trajectory by introducing Gaussian noise into a fully connected network decoder. To further enlarge the exploration space and improve smoothness, the predicted trajectory is sent to a trajectory correction module. Compared with other methods, the graph attention network and the graph-based deep self-attention network attend more closely to the various spatial interaction patterns in pedestrian trajectories, such as walking side by side and avoiding potential obstacles. Compared with other pedestrian trajectory prediction methods, the invention is more effective at extracting social interaction features and at exploring multiple modes.
Description
Technical Field
The invention belongs to the fields of deep learning and autonomous driving, and in particular relates to a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network.
Background
In complex crowd scenes, pedestrians pose a major challenge to dynamic obstacle-avoidance motion planning for mobile robots and unmanned vehicles. Predicting pedestrian trajectories helps make obstacle-avoidance planning more efficient and reduces the accident rate. Traditional pedestrian trajectory prediction methods mainly use statistical approaches such as hidden Markov models and Bayesian methods, or hand-crafted rules and functions. These methods are usually applied to sparse crowds whose motion shows little randomness, and they are difficult to transfer to complex nonlinear environments: once complex social interaction among pedestrians occurs, the validity of the prediction is hard to guarantee. With the development of neural networks, methods that recast trajectory prediction as a time-series generation task have been proposed (Alahi A, Goel K, Ramanathan V, et al.). Their core idea is to use a recurrent neural network, such as a long short-term memory network, to extract the temporal information in pedestrian trajectories and to model the spatial interaction among trajectories by pooling the hidden states of the network; however, their ability to extract spatial interaction features among pedestrians is insufficient. With the development of deep learning, in particular graph neural networks and self-attention mechanisms, it has become possible to model pedestrian interaction with graph neural networks and attention mechanisms.
A better existing trajectory prediction method uses a sparse graph convolution network (Shi L, Wang L, Long C, et al. SGCN: Sparse graph convolution network for pedestrian trajectory prediction[C]//CVPR, 2021.). Although this method can model the interaction among pedestrians to some extent, its temporal feature extraction is insufficient. More importantly, it cannot effectively model the multi-modal features of pedestrian motion, and the inherent uncertainty of pedestrian motion is particularly important in trajectory prediction.
Disclosure of Invention
To overcome the shortcomings of the prior art, balance temporal feature extraction against spatial interaction feature extraction, and account for the inherent randomness of pedestrian trajectories, the invention provides a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network. The method extracts the temporal features of historical trajectories, explores the interaction information and behavior patterns among pedestrians, recovers the multi-modal nature of pedestrian motion, expands the exploration space of trajectory prediction, and predicts the future trajectory points of pedestrians.
To achieve this purpose, the technical scheme of the invention is as follows:
The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network comprises the following steps:
Step one: preprocess the data to meet the input requirements of the neural network, and train and test the model parameters using leave-one-out cross-validation;
Step two: after obtaining the raw pedestrian trajectory data of step one, embed the data into a high-dimensional space with a fully connected network and construct a topological graph structure to meet the input requirements of the spatial interaction feature encoder; to fully extract spatial interaction features, feed the high-dimensional data into the spatial interaction feature encoder to obtain high-dimensional data carrying spatial interaction features, and fuse the outputs of the encoder's two networks with a fully connected network so that the output dimension matches that of the original high-dimensional data;
Step three: concatenate the high-dimensional data carrying spatial interaction features obtained in step two with the original high-dimensional data, and feed the result into the temporal feature encoder to extract temporal features;
Two self-attention mechanisms are used to extract the global and local spatial interaction information among pedestrian trajectories, respectively, so as to fully explore the interaction information and behavior patterns among pedestrians. The graph attention network uses an attention mechanism based on a parameter related to the relative distance between neighbors:
where (x_i, y_i) is the two-dimensional spatial coordinate of pedestrian i, the embedding function is a fully connected network, and W is a network parameter matrix. This attention mainly emphasizes the influence of the relative distance between local neighbors on the trajectory, while the graph-based deep self-attention network uses a self-attention mechanism to emphasize the influence of global relations on the trajectory;
Step four: to model the inherent uncertainty and multi-modal nature of pedestrian motion, add sampled Gaussian noise to the high-dimensional data obtained in step three, which carries both spatial interaction features and temporal features, and feed the result into a fully connected neural network decoder to obtain the predicted pedestrian trajectory sequence;
Step five: send the result to a trajectory correction module to improve the smoothness and continuity of the path.
As a further improvement of the invention, in step two a spatial interaction feature encoder (comprising a graph attention network and a deep self-attention network based on the topological graph structure) encodes the high-dimensional data and extracts the spatial interaction features of the pedestrian trajectory data.
As a further improvement of the invention, in step three a temporal feature encoder (comprising an original deep self-attention network) extracts the temporal features of the pedestrian trajectory data.
As a further improvement of the invention, in step four the inherent uncertainty and multi-modal nature of pedestrian motion are modeled by adding sampled Gaussian noise to the data, which expands the exploration space of the pedestrian trajectory.
As a further improvement of the invention, in step five curve fitting and a binary classification network are used to further expand the exploration space of the network and enhance the continuity of the curve, where the curvature s is calculated as follows:
Compared with the prior art, the invention has the following beneficial effects: (1) Two self-attention mechanisms, a graph attention network and a graph-based deep self-attention network, extract the spatial interaction features. They capture, respectively, the influence of the relative distance to locally adjacent pedestrians and the influence of global relations on a pedestrian's trajectory, so the social interaction information and the various interaction patterns in pedestrian trajectories are explored and identified more fully, including complex situations such as pedestrians walking side by side, pedestrians detouring in advance to avoid a potential collision, and a single pedestrian avoiding a group. (2) A deep self-attention network extracts the temporal features of the pedestrian's historical trajectory, and the masking mechanism (Masked) of the deep self-attention network strengthens the model's temporal feature extraction. (3) Gaussian noise is introduced into the multi-modal decoder to model the multi-modal nature and inherent uncertainty of pedestrian motion, expanding the exploration space of the trajectory so that it is closer to real situations. (4) The final position point is sent to a correction module, which further improves the exploration space and continuity of the trajectory.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 illustrates a trajectory prediction network framework according to the present invention;
FIG. 3 illustrates a pedestrian topology of the present invention;
FIG. 4 illustrates a pedestrian candidate trajectory generation module of the present invention;
FIG. 5 illustrates the binary classification network of the present invention.
Detailed Description
The invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
The invention provides a pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network. The overall process comprises data preprocessing, deep neural network construction, model parameter training, and pedestrian trajectory prediction, as shown in FIG. 1.
In the data preprocessing stage, the pedestrian data are preprocessed: each pedestrian trajectory is converted into coordinate points in a two-dimensional spatial coordinate system, the time-sequence information is encoded, the two-dimensional data are expanded into a high-dimensional space to meet the input requirements of the deep self-attention network, and a topological graph structure is built from the trajectory data to meet the input requirements of the graph neural network.
In the deep neural network construction stage, a machine learning library such as PyTorch is used to build the network framework. The framework contains two encoders and one decoder: a spatial feature extraction encoder (comprising a graph attention network and a graph-based deep self-attention network), a temporal feature extraction encoder (comprising a deep self-attention network), and a multi-modal decoder (comprising a fully connected network).
In the model parameter training stage, the network hyper-parameters and loss function are set, the network parameters are trained using leave-one-out cross-validation, and the model is evaluated with two metrics, the average displacement error and the final displacement error.
In the trajectory correction stage, several candidate trajectories are generated from the final position point and fed into a binary classification network, which is trained to evaluate them.
In the pedestrian trajectory prediction stage, the pedestrian trajectory sequence to be predicted is fed into the trained trajectory prediction framework, which generates the pedestrian's future trajectory sequence.
Each stage is described in detail below:
(1) In the data preprocessing stage, the data are preprocessed to meet the training input requirements of the network model. The raw data generally contain the time information t, the pedestrian label i, and the pedestrian's spatial position (x_i, y_i). To meet the input and training requirements of the deep self-attention network, a fully connected network maps the two-dimensional coordinates to high-dimensional data (32 dimensions in the invention). To meet the input requirements of the graph attention network and the graph-based self-attention network, the pedestrians are also organized into a topological graph structure according to their spatial positions, which yields the adjacency matrix N(i).
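The preprocessing described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the patented implementation: the text does not state how edges of the topological graph are chosen, so the distance threshold `radius`, the single linear layer with ReLU, and the helper names `embed_positions` and `build_adjacency` are all assumptions made for the example.

```python
import numpy as np

def embed_positions(xy, weight, bias):
    """Map (N, 2) coordinates to (N, D) features with one linear layer + ReLU."""
    return np.maximum(xy @ weight + bias, 0.0)

def build_adjacency(xy, radius):
    """Return an (N, N) 0/1 matrix linking pedestrians closer than `radius`.

    The distance threshold is an assumed rule; the source only says the graph
    is built from spatial positions.
    """
    diff = xy[:, None, :] - xy[None, :, :]   # pairwise offsets, (N, N, 2)
    dist = np.linalg.norm(diff, axis=-1)     # pairwise distances, (N, N)
    adj = (dist < radius).astype(np.float64)
    np.fill_diagonal(adj, 1.0)               # keep a self-loop per pedestrian
    return adj

rng = np.random.default_rng(0)
xy = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])  # three pedestrians, one frame
weight = rng.normal(scale=0.1, size=(2, 32))         # untrained, illustrative weights
bias = np.zeros(32)

features = embed_positions(xy, weight, bias)   # 32-dimensional embedding per pedestrian
adjacency = build_adjacency(xy, radius=2.0)    # first two pedestrians are neighbours
```

In a trained model the embedding weights would be learned jointly with the rest of the network; they are random here only so that the example runs on its own.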
(2) In the model construction stage, a deep learning library is used to build the overall trajectory prediction framework, which comprises a spatial interaction feature extraction encoder, a temporal feature extraction encoder, and a multi-modal decoder, as shown in FIG. 2. A correction module is also constructed, as shown in FIGS. 4 and 5; the overall flow is shown in FIG. 1.
The spatial interaction feature extraction encoder consists of a graph attention network and a graph-based deep self-attention network; the pedestrian topological graph structure is shown in FIG. 3. Pedestrian social interaction is complex, and a single network has difficulty fully extracting the spatial features of situations such as friends walking side by side or strangers avoiding potential obstacles. The invention therefore uses both the graph attention network and the graph-based deep self-attention network to strengthen the extraction of global and local spatial interaction features from pedestrian trajectories. The graph attention network uses a correlation coefficient of the relative distance between neighbors to capture the influence of local relations on the pedestrian trajectory. The correlation coefficient is calculated as follows:
where l is the number of iteration layers of the network, the coordinate term is the spatial coordinate of pedestrian i at time t, and W_r is the parameter matrix of the embedding function.
Here the state value of pedestrian i at time t is initialized from the input data, N(i) is the adjacency matrix, and W_α is likewise the parameter matrix of an embedding function.
The message passing mechanism of the graph attention network is as follows:
The graph-based deep self-attention network also uses the topological graph structure, and its input is the same as that of the graph attention network. It uses a self-attention mechanism to emphasize the influence of global interaction information on each pedestrian's own trajectory. First, the query matrix q_i, key matrix k_i, and value matrix v_i are extracted from the high-dimensional spatial input data:
The graph-based deep self-attention network also uses a message passing mechanism:
The attention Att(i) and the output h'_{S,i} are computed as:
$$h'_{S,i} = f_{out}(\mathrm{Att}(i)) + \mathrm{Att}(i)$$
where d_k is the dimension of the matrix and f_out is the output function, here a fully connected network. As the formula shows, the output uses a skip connection.
After the outputs of the graph attention network and the graph-based deep self-attention network are obtained, a fully connected network fuses the two outputs so that their dimension matches that of the original high-dimensional data:
The high-dimensional data carrying spatial interaction features are concatenated with the original high-dimensional data to form the input of the temporal feature extraction encoder. The encoder uses the original deep self-attention network structure and likewise first extracts the query matrix Q_i, key matrix K_i, and value matrix V_i:
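The self-attention formula is:
$$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\frac{Q_i K_i^{\top}}{\sqrt{d_k}}\right) V_i$$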
To extract feature information from different aspects and strengthen the feature extraction capability of the network, the invention uses a multi-head attention mechanism:
where head_j = Attention_j(Q_i, K_i, V_i) is the j-th head and f_o is the output function, a fully connected network that performs weighted fusion of the multi-head features.
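A compact way to see how multi-head self-attention with masking works along the time axis is the NumPy sketch below. It follows the standard Transformer formulation, with a linear map `w_o` standing in for f_o; the head count, the weight shapes, and the causal form of the mask are assumptions rather than details taken from the patent.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads, causal=True):
    """x: (T, D) feature sequence of one pedestrian; all weights are (D, D)."""
    T, D = x.shape
    d_k = D // num_heads

    def split(m):
        # (T, D) -> (heads, T, d_k)
        return m.reshape(T, num_heads, d_k).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)       # (heads, T, T)
    if causal:
        future = np.triu(np.ones((T, T), dtype=bool), k=1)  # True above the diagonal
        scores = np.where(future, -1e9, scores)             # hide later frames
    heads = softmax(scores) @ v                             # (heads, T, d_k)
    concat = heads.transpose(1, 0, 2).reshape(T, D)         # concatenate the heads
    return concat @ w_o                                     # fuse heads with a linear map

rng = np.random.default_rng(0)
T, D, H = 8, 32, 4
x = rng.normal(size=(T, D))
w_q, w_k, w_v, w_o = (rng.normal(scale=0.1, size=(D, D)) for _ in range(4))

out = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=H)
```

Because of the mask, the output at frame t depends only on frames up to t, so perturbing the last observed frame leaves all earlier outputs unchanged.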
After the data fusing spatial interaction features and temporal features are obtained, sampled Gaussian noise is added to them and the result is fed into the multi-modal decoder. The decoder uses a fully connected network to map the high-dimensional data to coordinate points in the two-dimensional spatial coordinate system and thereby outputs a trajectory sequence.
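The effect of the noise on the decoder output can be illustrated with a small sketch: each noise draw yields a different trajectory for the same encoded features. The additive form of the noise, the single linear decoding layer, the noise scale, and the name `decode_samples` are assumptions for illustration only.

```python
import numpy as np

def decode_samples(features, w_dec, b_dec, num_samples, noise_std, rng):
    """features: (N, D). Returns trajectories of shape (num_samples, N, T_pred, 2)."""
    n, d = features.shape
    noise = rng.normal(scale=noise_std, size=(num_samples, n, d))
    noisy = features[None, :, :] + noise        # one noisy copy per sample (assumed additive)
    flat = noisy @ w_dec + b_dec                # (num_samples, N, T_pred * 2)
    return flat.reshape(num_samples, n, -1, 2)  # split into (x, y) per future frame

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 32))                 # encoded features for three pedestrians
w_dec = rng.normal(scale=0.1, size=(32, 12 * 2))    # untrained weights, 12 future frames
b_dec = np.zeros(12 * 2)

samples = decode_samples(features, w_dec, b_dec, num_samples=20, noise_std=0.5, rng=rng)
```

Setting `noise_std` to zero collapses all samples onto a single deterministic trajectory, which is one way to see that the noise is what produces multiple candidate futures.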
The known trajectory points and the predicted final position point are then sent to the correction module, as shown in FIG. 4. The correction module first collects eight candidate points around the predicted final position point, then generates a candidate trajectory for each candidate point by cubic curve fitting. The curvature s is calculated as follows:
where the known final point has coordinates (x_1, y_1), the candidate point has coordinates (x_2, y_2), and n is the number of neighbors in the topological graph. The curvature point lies on the perpendicular bisector of the segment joining the two points, at a distance s from its midpoint.
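The geometry of the candidate generation can be sketched as follows. Several choices here are assumptions: the eight candidates are placed on a 3x3 grid with spacing `step`, the value of s is passed in directly because its formula is not reproduced in this text, and a quadratic Bezier curve through the curvature point is used in place of the cubic fit so that the example stays short.

```python
import numpy as np

def candidate_points(final_xy, step):
    """Eight points on a 3x3 grid around the predicted final position (assumed layout)."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return np.asarray(final_xy, dtype=float) + step * np.array(offsets, dtype=float)

def control_point(p1, p2, s):
    """Point on the perpendicular bisector of p1-p2 at distance s from the midpoint."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    mid = (p1 + p2) / 2.0
    direction = p2 - p1                      # p1 and p2 must differ
    normal = np.array([-direction[1], direction[0]])
    normal /= np.linalg.norm(normal)
    return mid + s * normal

def bezier_curve(p1, ctrl, p2, num=12):
    """Quadratic Bezier from p1 to p2, a stand-in for the cubic curve fit."""
    p1, ctrl, p2 = (np.asarray(p, dtype=float) for p in (p1, ctrl, p2))
    t = np.linspace(0.0, 1.0, num)[:, None]
    return (1 - t) ** 2 * p1 + 2 * (1 - t) * t * ctrl + t ** 2 * p2

last_observed = np.array([0.0, 0.0])      # known final point (x_1, y_1)
predicted_final = np.array([4.0, 0.0])    # predicted final position
candidates = candidate_points(predicted_final, step=0.5)
curves = [
    bezier_curve(last_observed, control_point(last_observed, c, s=0.3), c)
    for c in candidates                    # each c plays the role of (x_2, y_2)
]
```

Each curve starts at the known final point and ends at one candidate point, so the set of curves gives eight alternative smooth continuations that a classifier could then score.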
All candidate trajectories are labeled as positive or negative samples according to their average displacement error, in a ratio of 1:3, and fed into a binary classification network for training. The binary classification network consists of two fully connected networks.
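One plausible reading of the 1:3 labeling rule is to rank the candidates by their average displacement error against the ground truth and mark the best quarter as positive. The sketch below follows that reading; the ranking rule and the name `label_candidates` are assumptions, and the classifier itself is omitted.

```python
import numpy as np

def label_candidates(candidates, ground_truth, positive_fraction=0.25):
    """candidates: (K, T, 2), ground_truth: (T, 2). Returns (labels, ade)."""
    # Mean Euclidean distance to the ground truth, one value per candidate.
    ade = np.linalg.norm(candidates - ground_truth[None], axis=-1).mean(axis=1)
    num_pos = max(1, int(round(positive_fraction * len(candidates))))
    labels = np.zeros(len(candidates), dtype=int)
    labels[np.argsort(ade)[:num_pos]] = 1    # lowest-error candidates become positives
    return labels, ade

ground_truth = np.stack([np.linspace(0.0, 4.4, 12), np.zeros(12)], axis=1)
shifts = 0.1 * np.arange(8)                               # candidate k is offset by 0.1 * k
candidates = ground_truth[None] + shifts[:, None, None]   # (8, 12, 2)

labels, ade = label_candidates(candidates, ground_truth)
```

With eight candidates, a positive fraction of 0.25 gives two positives and six negatives, which matches the 1:3 ratio.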
(3) In the model parameter training stage, after the hyper-parameters and loss function are set, the invention trains on two public pedestrian trajectory datasets, ETH and UCY. ETH consists of two subsets, ETH and Hotel, and UCY consists of ZARA1, ZARA2, and UNIV. Leave-one-out cross-validation is used: one subset serves as the test set and the other four serve as the training set.
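The leave-one-out protocol over the five subsets can be written down in a few lines. The subset names come from the text; the helper name `leave_one_out_splits` is illustrative.

```python
DATASETS = ["ETH", "Hotel", "ZARA1", "ZARA2", "UNIV"]

def leave_one_out_splits(names):
    """Yield (training subsets, held-out test subset) for each subset in turn."""
    for held_out in names:
        train = [name for name in names if name != held_out]
        yield train, held_out

splits = list(leave_one_out_splits(DATASETS))
for train, test in splits:
    print(f"train on {train}, test on {test}")
```

Each subset is used as the test set exactly once, so five models are trained and evaluated in total.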
Alternatively, a self-collected dataset can be used. The dataset samples the pedestrians in the scene every 0.4 seconds, each sample forming one frame. During training, 8 frames (3.2 seconds) are taken as the pedestrian's historical trajectory, and the trajectory over the next 12 frames (4.8 seconds) is predicted.
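Cutting a sampled track into training pairs of 8 observed and 12 future frames might look like the sketch below. The stride of one frame between consecutive windows and the name `make_windows` are assumptions; the text only fixes the frame interval and the two window lengths.

```python
import numpy as np

OBS_LEN, PRED_LEN = 8, 12   # 3.2 s of history and 4.8 s of future at 0.4 s per frame

def make_windows(track, obs_len=OBS_LEN, pred_len=PRED_LEN):
    """track: (F, 2) positions of one pedestrian, one row per 0.4 s frame."""
    total = obs_len + pred_len
    windows = []
    for start in range(len(track) - total + 1):   # assumed stride of one frame
        clip = track[start:start + total]
        windows.append((clip[:obs_len], clip[obs_len:]))
    return windows

# A pedestrian walking along the x axis for 25 frames.
track = np.stack([np.arange(25) * 0.4, np.zeros(25)], axis=1)
windows = make_windows(track)
history, future = windows[0]
```

A track shorter than 20 frames produces no windows, so such pedestrians would simply be skipped under this scheme.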
The evaluation metrics are the average displacement error (ADE) and the final displacement error (FDE). ADE is the mean Euclidean distance between the predicted and the actual positions over the 12 future frames:
$$\mathrm{ADE} = \frac{1}{N\,T_p} \sum_{i=1}^{N} \sum_{t=1}^{T_p} \left\| p_i^{t} - \hat{p}_i^{t} \right\|$$
where $N$ is the number of predicted pedestrians, $T_p = 12$ is the maximum number of frames, $p_i^{t}$ is the actual position sequence, $\hat{p}_i^{t}$ is the predicted trajectory sequence, and $\|\cdot\|$ is the Euclidean distance between two points. FDE is the Euclidean distance between the predicted and the actual position in the last frame:
$$\mathrm{FDE} = \frac{1}{N} \sum_{i=1}^{N} \left\| p_i^{T_f} - \hat{p}_i^{T_f} \right\|$$
where $T_f$ is the final time point.
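Both metrics reduce to a few array operations. The sketch below assumes predictions and ground truth are stored as arrays of shape (N, T_p, 2); the function name `ade_fde` and the toy data are illustrative.

```python
import numpy as np

def ade_fde(pred, truth):
    """pred, truth: (N, T_p, 2) arrays. Returns (ADE, FDE) as floats."""
    dist = np.linalg.norm(pred - truth, axis=-1)   # per-frame Euclidean error, (N, T_p)
    ade = float(dist.mean())                       # average over pedestrians and frames
    fde = float(dist[:, -1].mean())                # average over pedestrians, last frame only
    return ade, fde

# Two pedestrians whose predicted x coordinate drifts by 0.1 more in every frame.
truth = np.zeros((2, 12, 2))
pred = truth.copy()
pred[:, :, 0] = 0.1 * np.arange(1, 13)

ade, fde = ade_fde(pred, truth)
```

Because the error in this toy case grows with time, the FDE is larger than the ADE, which is the usual pattern for trajectory predictors.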
(4) In the pedestrian trajectory prediction stage, the pedestrian trajectory to be predicted is fed into the trained deep network framework, which outputs the corresponding trajectory for the next 12 frames. Dense scenes are complex: social interaction is frequent and many trajectories are curved, which places high demands on the model's ability to extract spatial interaction features among pedestrians. The method can effectively distinguish complex global and local interactions in dense crowds, including walking side by side in groups, avoiding potential collisions, walking alone, and keeping a reasonable social distance from the crowd, and thereby approximates the inherent uncertainty and multi-modal nature of pedestrian motion as closely as possible.
The above is only a preferred embodiment of the invention and does not limit the invention in any form; any modification or equivalent variation made according to the technical spirit of the invention falls within the scope of protection claimed.
Claims (5)
1. A pedestrian trajectory prediction method based on a topological graph structure and a deep self-attention network, characterized by comprising the following steps:
step one: preprocessing the data to meet the input requirements of the neural network, and training and testing the model parameters using leave-one-out cross-validation;
step two: after obtaining the raw pedestrian trajectory data of step one, embedding the data into a high-dimensional space with a fully connected network and constructing a topological graph structure to meet the input requirements of the spatial interaction feature encoder; to fully extract spatial interaction features, feeding the high-dimensional data into the spatial interaction feature encoder to obtain high-dimensional data carrying spatial interaction features, and fusing the outputs of the encoder's two networks with a fully connected network so that the output dimension matches that of the original high-dimensional data;
step three: concatenating the high-dimensional data carrying spatial interaction features obtained in step two with the original high-dimensional data, and feeding the result into the temporal feature encoder to extract temporal features;
wherein two self-attention mechanisms are used to extract the global and local spatial interaction information among pedestrian trajectories, respectively, so as to fully explore the interaction information and behavior patterns among pedestrians, and the graph attention network uses an attention mechanism based on a parameter related to the relative distance between neighbors:
wherein (x_i, y_i) is the two-dimensional spatial coordinate of pedestrian i, the embedding function is a fully connected network, W is a network parameter matrix, and the state term is the state value of pedestrian i at time t; this attention mainly emphasizes the influence of the relative distance between local neighbors on the trajectory, while the graph-based deep self-attention network uses a self-attention mechanism to emphasize the influence of global relations on the trajectory;
step four: to model the inherent uncertainty and multi-modal nature of pedestrian motion, adding sampled Gaussian noise to the high-dimensional data obtained in step three, which carries both spatial interaction features and temporal features, and feeding the result into a fully connected neural network decoder to obtain the predicted pedestrian trajectory sequence;
step five: sending the result to a trajectory correction module to improve the smoothness and continuity of the path.
2. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step two, the spatial interaction feature encoder that encodes the high-dimensional data comprises a graph attention network and a graph-based deep self-attention network.
3. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: the temporal feature encoder comprises a deep self-attention network.
4. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step four, the inherent uncertainty and multi-modal characteristics of pedestrian motion trajectories are simulated by introducing Gaussian sampling noise into the data, which expands the exploration space of the pedestrian trajectory.
5. The pedestrian trajectory prediction method based on the topological graph structure and the deep self-attention network according to claim 1, characterized in that: in step five, curve fitting and a binary classification network are adopted to further expand the exploration space of the network and to enhance the continuity of the curve, wherein the curvature S is calculated as follows:
[formula not reproduced in the source text]
wherein the final point coordinate is (x_1, y_1), the candidate point coordinate is (x_2, y_2), and n is the number of neighbors in the topological graph.
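The distance-based attention of step three and the Gaussian sampling of step four can be illustrated with a minimal sketch. The attention formula is not reproduced in the text, so the scoring rule used here (a softmax over negative Euclidean distance) is an assumption and not the claimed formula. The function names `distance_attention`, `aggregate`, and `sample_modes`, the feature values, and the noise scale `sigma` are likewise hypothetical and chosen only for illustration.

```python
import math
import random


def distance_attention(positions, i):
    """Softmax attention of pedestrian i over all other pedestrians.

    Scoring by negative Euclidean distance is an assumption, chosen so that
    closer neighbors receive larger weights.
    """
    xi, yi = positions[i]
    scores = {
        j: -math.hypot(xj - xi, yj - yi)
        for j, (xj, yj) in enumerate(positions)
        if j != i
    }
    if not scores:
        return {}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


def aggregate(features, weights):
    """Weighted sum of neighbor feature vectors."""
    dim = len(features[0])
    return [sum(w * features[j][d] for j, w in weights.items()) for d in range(dim)]


def sample_modes(feature, k, sigma, rng):
    """Draw k noisy copies of a feature vector (Gaussian noise with std sigma)."""
    return [[v + rng.gauss(0.0, sigma) for v in feature] for _ in range(k)]


positions = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]  # three pedestrians at one time step
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # hypothetical embedded states
weights = distance_attention(positions, 0)
context = aggregate(features, weights)
candidates = sample_modes(context, k=3, sigma=0.1, rng=random.Random(0))
```

In this sketch the neighbor at distance 1 receives a larger weight than the neighbor at distance 5, and each noisy copy in `candidates` stands for one candidate input to the decoder, which is how added noise can yield several plausible trajectories from a single observed history.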
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210741506.2A CN115082896A (en) | 2022-06-28 | 2022-06-28 | Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115082896A (en) | 2022-09-20 |
Family
ID=83255079
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210741506.2A Pending CN115082896A (en) | 2022-06-28 | 2022-06-28 | Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115082896A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113538506A (en) * | 2021-07-23 | 2021-10-22 | 陕西师范大学 | Pedestrian trajectory prediction method based on global dynamic scene information depth modeling |
US20220011122A1 (en) * | 2020-07-09 | 2022-01-13 | Beijing Tusen Weilai Technology Co., Ltd. | Trajectory prediction method and device |
CN114117259A (en) * | 2021-11-30 | 2022-03-01 | 重庆七腾科技有限公司 | Trajectory prediction method and device based on double attention mechanism |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116291336A (en) * | 2023-02-14 | 2023-06-23 | 电子科技大学 | Automatic segmentation clustering system based on deep self-attention neural network |
CN116291336B (en) * | 2023-02-14 | 2024-05-24 | 电子科技大学 | Automatic segmentation clustering system based on deep self-attention neural network |
CN116257659A (en) * | 2023-03-31 | 2023-06-13 | 华中师范大学 | Dynamic diagram embedding method and system of intelligent learning guiding system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111400620B (en) | User trajectory position prediction method based on space-time embedded Self-orientation | |
CN115082896A (en) | Pedestrian trajectory prediction method based on topological graph structure and depth self-attention network | |
Chen et al. | Vehicle trajectory prediction based on intention-aware non-autoregressive transformer with multi-attention learning for Internet of Vehicles | |
CN110781838A (en) | Multi-modal trajectory prediction method for pedestrian in complex scene | |
CN115829171B (en) | Pedestrian track prediction method combining space-time information and social interaction characteristics | |
CN114802296A (en) | Vehicle track prediction method based on dynamic interaction graph convolution | |
CN110570035B (en) | People flow prediction system for simultaneously modeling space-time dependency and daily flow dependency | |
CN115510174A (en) | Road network pixelation-based Wasserstein generation countermeasure flow data interpolation method | |
CN114757975B (en) | Pedestrian track prediction method based on transformer and graph convolution network | |
CN114898293A (en) | Pedestrian crossing group multi-mode trajectory prediction method for automatically driving automobile | |
Yang et al. | Long-short term spatio-temporal aggregation for trajectory prediction | |
CN116935649A (en) | Urban traffic flow prediction method for multi-view fusion space-time dynamic graph convolution network | |
CN114461931A (en) | User trajectory prediction method and system based on multi-relation fusion analysis | |
Shin et al. | User mobility synthesis based on generative adversarial networks: A survey | |
CN117314956A (en) | Interactive pedestrian track prediction method based on graphic neural network | |
CN115018134A (en) | Pedestrian trajectory prediction method based on three-scale spatiotemporal information | |
CN115359437A (en) | Accompanying vehicle identification method based on semantic track | |
CN114880586A (en) | Confrontation-based social circle inference method through mobility context awareness | |
Zhang et al. | Obstacle‐transformer: A trajectory prediction network based on surrounding trajectories | |
Wang et al. | Self-Attentive Local Aggregation Learning With Prototype Guided Regularization for Point Cloud Semantic Segmentation of High-Speed Railways | |
CN117688257B (en) | Long-term track prediction method for heterogeneous user behavior mode | |
Wang et al. | Human Trajectory Prediction Using Stacked Temporal Convolutional Network | |
CN114328791B (en) | Map matching algorithm based on deep learning | |
CN117933397A (en) | Track prediction method under scene fusion based on space-time structure causal model | |
CN116011638A (en) | Urban space-time prediction method based on space-time attention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||