CN113989326B - Attention mechanism-based target track prediction method - Google Patents

Attention mechanism-based target track prediction method

Info

Publication number
CN113989326B
Authority
CN
China
Prior art keywords
target
historical
track
targets
hidden state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111240446.8A
Other languages
Chinese (zh)
Other versions
CN113989326A (en)
Inventor
罗光春
张栗粽
康昭
段贵多
刘欣
冯科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-10-25
Filing date
2021-10-25
Publication date
2023-08-25
Application filed by University of Electronic Science and Technology of China
Priority to CN202111240446.8A
Publication of CN113989326A
Application granted
Publication of CN113989326B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a trajectory prediction method based on an attention mechanism, belonging to the technical field of computer vision. First, the position sequence of each target is extracted. Each target is then encoded with a long short-term memory (LSTM) network to obtain its trajectory feature representation; a graph attention network fuses the interaction features among target trajectories, and an attention mechanism captures the temporal features across all historical moments of each trajectory. Finally, the trajectory features fused with the interaction and temporal features are taken as the input of an LSTM network, which decodes them to compute the predicted position of the target. By reasoning about the relations between targets and introducing temporal features through the attention mechanism, the method improves the accuracy of trajectory prediction.

Description

Attention mechanism-based target track prediction method
Technical Field
The invention relates to the technical field of computer vision, in particular to a target track prediction method based on an attention mechanism.
Background
With the development of science and technology, positioning devices of many kinds continue to emerge, which greatly reduces the difficulty of acquiring object trajectory data. The volume and variety of collected trajectory data are growing rapidly, and much of this data has considerable research value. Storing and analyzing the collected trajectory data plays an important role in target behavior recognition, traffic planning, urban safety, and prevention and control.
The main task of trajectory prediction is to predict the future trajectory points of a target from its historical trajectory data. Predicting how a target's trajectory will develop, and where the target will be at a specific moment, is of great significance for mining the target's behavior patterns and detailed information. The understanding and reasoning abilities involved in trajectory prediction can advance research on analyzing and predicting human behavior and actions, build on the success of artificial intelligence in vision and speech, and promote higher-level cognition, analysis, and reasoning. Effective analysis of target trajectories is therefore a key link in bringing artificial intelligence into everyday life.
The key challenges of trajectory prediction are the complexity of target behavior and the diversity of external factors. A target's motion is affected by its own intent, the presence and behavior of surrounding targets, the associations between targets, social-rule constraints, and other factors. Trajectory prediction methods fall into two classes according to the technique used: those that model and predict with traditional mathematical-statistical models, and those that model and predict with neural networks. Methods based on traditional statistical models work well when the input trajectory data is linear. Methods based on neural networks are better suited to nonlinear data but place higher demands on the information contained in the network's input data.
Because recurrent neural networks are well suited to processing temporal information, deep-learning-based trajectory prediction methods generally use them in the network structure to extract sequence features. Besides taking the target trajectory sequence as model input, a trajectory prediction task can also take images as input to help improve prediction. Other methods exploit spatio-temporal features to improve prediction. However, methods that rely on a single feature have difficulty capturing the associations between targets, and the growth of error when predicting over long time intervals remains hard to resolve.
Disclosure of Invention
To solve the problems that existing trajectory prediction methods have difficulty modeling target interaction effectively and that their prediction error keeps growing, the invention provides a trajectory prediction method based on an attention mechanism.
To achieve the above purpose, the invention adopts the following technical scheme:
A target trajectory prediction method based on an attention mechanism comprises the following steps:
Step 1. Acquire the historical tracks of all targets in a video data set to obtain a historical trajectory data set of all targets.
Step 2. Based on the historical trajectory data set of the targets, extract the first hidden state of each target at each historical moment t using a long short-term memory (LSTM) network.
The specific steps are as follows: calculate the historical relative position of target i at each historical moment with respect to the preceding historical moment; map the historical relative position through an embedding function to obtain the position vector of target i at historical time t; then input the position vector into the LSTM to obtain the first hidden state of target i output by the LSTM at historical time t, where i ∈ [1, N].
Step 3. Establish an initial association relation between targets and take the first hidden state of each target as the input of the corresponding node in a graph attention network; then fuse the trajectory interaction features of target i and its adjacent targets through the graph attention network; finally, fuse the historical moments for target i based on the attention mechanism to obtain the joint hidden state of target i at the last historical time T_obs.
The specific steps are as follows:
3-1. Establish the initial association relation between targets.
3-2. For target i, fuse the trajectory interaction features of its adjacent targets based on the graph attention network to obtain the second hidden state.
3-3. Input the second hidden state of target i at each historical moment into another graph attention network, introducing the attention mechanism; calculate the correlation between the second hidden state at the last historical time T_obs and the second hidden state at each earlier historical time t; then calculate the joint hidden state of the LSTM output for target i at the last historical time T_obs.
Step 4. Input the joint hidden state of target i at historical time T_obs into the long short-term memory network D-LSTM for decoding to obtain the predicted relative position of target i at the first predicted time T_obs + 1. Then take this predicted relative position as the historical relative position for the next predicted time, return to step 2, and calculate the predicted relative position at the next predicted time; iterate in this way to obtain the predicted relative positions at all subsequent predicted times, and finally obtain the trajectory prediction result of target i.
The attention-based target trajectory prediction method addresses two problems of existing trajectory prediction methods: the difficulty of effectively analyzing the association relations between targets, and low prediction accuracy. The core idea is as follows. First, the position sequence of each target is extracted. Each target is then encoded with an LSTM network to obtain its trajectory feature representation; a graph attention network fuses the interaction features among target trajectories, and an attention mechanism captures the temporal features across the historical moments of each trajectory. Finally, the trajectory features fused with the interaction and temporal features are taken as the input of an LSTM network, which decodes them to compute the predicted position of the target. By reasoning about the relations between targets and introducing temporal features through the attention mechanism, the invention improves the accuracy of trajectory prediction.
Compared with the prior art, the method applies the attention mechanism from natural language processing to target trajectory prediction, mitigates the growth of prediction error with the prediction interval, and, from the perspective of input features, makes both the results and the model computation more convincing.
Drawings
Fig. 1 is a general flow chart of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
As shown in Fig. 1, the attention-based trajectory prediction method of the invention comprises the following steps:
Step 1. Acquire the historical tracks of all targets in a video data set to obtain a historical trajectory data set of all targets.
Step 2. Based on the historical trajectory data set of the targets, extract the hidden state of every target at each historical moment t using a long short-term memory (LSTM) network. The specific steps are as follows:
2-1. Assume there are N targets in total. Each target i has position coordinates at every historical time t and at every predicted time t_q.
2-2. Calculate the historical relative position of target i at each historical moment with respect to the preceding historical moment.
2-3. Map the historical relative position through the embedding function F to obtain the position vector of target i at historical time t, then input the position vector into the LSTM to obtain the first hidden state of target i output by the LSTM at historical time t.
Here F is the embedding function and W_l is the weight of the LSTM unit.
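Step 2 can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the linear form of the embedding `embed`, the zero initial state, the random weights, and all function names are assumptions made for illustration, and the relative position is read as the displacement from the preceding historical moment. The dictionary `params` plays the role of the LSTM weights W_l.

```python
import math
import random

def relative_positions(track):
    """Step 2-2: displacement of each point from the preceding point."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(track, track[1:])]

def embed(delta, weight):
    """Assumed embedding F: a linear map from a 2-D displacement to a vector."""
    dx, dy = delta
    return [w[0] * dx + w[1] * dy for w in weight]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, params):
    """One step of a standard LSTM cell; params maps gate name -> (W_x, W_h, b)."""
    def gate(name, act):
        w_x, w_h, b = params[name]
        return [
            act(sum(wx * xi for wx, xi in zip(w_x[r], x))
                + sum(wh * hi for wh, hi in zip(w_h[r], h)) + b[r])
            for r in range(len(h))
        ]
    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def encode(track, embed_w, params, hidden):
    """Step 2-3: first hidden state of one target at every historical step."""
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for delta in relative_positions(track):
        h, c = lstm_step(embed(delta, embed_w), h, c, params)
        states.append(h)
    return states

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
EMBED, HIDDEN = 4, 3  # illustrative sizes only
embed_w = rand_matrix(EMBED, 2, rng)
params = {name: (rand_matrix(HIDDEN, EMBED, rng),
                 rand_matrix(HIDDEN, HIDDEN, rng),
                 [0.0] * HIDDEN) for name in "ifog"}
track = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.5), (3.5, 2.0)]
states = encode(track, embed_w, params, HIDDEN)
```

Working on displacements rather than absolute coordinates keeps the encoder input independent of where in the frame a target happens to be.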
Step 3. Establish an initial association relation between targets and take the first hidden state of each target as the input of the corresponding node in a graph attention network; then fuse the trajectory interaction features of target i and its adjacent targets through the graph attention network; then fuse the attention correlation of each historical moment for target i, based on the attention mechanism, to obtain the joint hidden state of target i at historical time T_obs. The specific steps are as follows:
3-1. Establish the initial association relation between targets: by default, all targets in the scene are assumed to be mutually associated, and a fully connected relation graph is built over the targets according to their number.
3-2. For target i, fuse the trajectory interaction features of its adjacent targets based on the graph attention network.
3-2-1. Take the first hidden state of each target as the input of the corresponding node in the graph attention network, and calculate the attention coefficient of each target-node pair {(i, j) | j ∈ N_i} at each historical moment t.
In this calculation, ‖ denotes the concatenation operation; the attention coefficient is that of target j to target i at historical time t; N_i is the set of targets adjacent to target i in the relation graph; j is the index of an adjacent target of target i; the first hidden states of target j and of any adjacent target k (k ∈ N_i) are those output by the LSTM at historical time t; and W and a are learnable variables with no specific meaning.
3-2-2. After the attention coefficient of each target-node pair {(i, j) | j ∈ N_i} at historical time t has been calculated, use the graph attention network to compute the second hidden state of target i at historical time t, in which the trajectory interaction features of target i and its adjacent targets are fused.
Here σ denotes a nonlinear function.
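Steps 3-1 and 3-2 can be sketched as follows, assuming the standard graph attention form: score each pair with the vector a applied to the concatenation of the W-projected hidden states, normalize with a softmax over the neighbors k ∈ N_i, then aggregate. The LeakyReLU on the score, the use of tanh for σ, the inclusion of each target in its own neighbor set, and all names and numbers are assumptions, since the text only names the concatenation, W, a, and a nonlinear σ.

```python
import math

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def leaky_relu(value, slope=0.2):
    return value if value > 0 else slope * value

def fully_connected(num_targets):
    """Step 3-1: every target is adjacent to every target (self included, by assumption)."""
    return {i: list(range(num_targets)) for i in range(num_targets)}

def attention_coefficients(hidden, neighbors, W, a):
    """Step 3-2-1: softmax-normalized coefficient of target j to target i."""
    projected = [matvec(W, h) for h in hidden]
    alpha = {}
    for i, nbrs in neighbors.items():
        # List "+" concatenates the two projected states (the ‖ operation).
        scores = [
            leaky_relu(sum(w * v for w, v in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alpha, projected

def fuse(alpha, projected):
    """Step 3-2-2: second hidden state as sigma(sum_j alpha_ij * W h_j), sigma = tanh here."""
    fused = []
    for i in sorted(alpha):
        dim = len(projected[i])
        mixed = [sum(w * projected[j][d] for j, w in alpha[i].items())
                 for d in range(dim)]
        fused.append([math.tanh(v) for v in mixed])
    return fused

# First hidden states of three targets at one historical moment (made-up numbers).
hidden = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.25]
alpha, projected = attention_coefficients(hidden, fully_connected(3), W, a)
second_hidden = fuse(alpha, projected)
```

Because the coefficients for each target sum to one, the second hidden state is a weighted blend of the neighbors' projected states, with the weights learned through W and a rather than fixed by distance.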
3-3. Input the second hidden state of target i at each historical moment into another graph attention network, that is, introduce the attention mechanism, and calculate the correlation between the second hidden state at historical time T_obs and the second hidden state at each earlier historical time t (t ∈ {1, …, T_obs - 1}).
Here <·,·> is the inner product operator; the intermediate variables of the calculation have no specific meaning; and the second hidden state at historical time T_obs is that of target i after the trajectory interaction features of its adjacent targets have been fused.
Then calculate the joint hidden state of the LSTM output for target i at historical time T_obs.
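A minimal sketch of step 3-3 follows. The inner-product scoring matches the operator named above; the softmax normalization of the scores and the way the joint hidden state is assembled (the state at T_obs concatenated with the attention-weighted sum of earlier states) are assumptions for illustration, as are the function names and sample values.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def temporal_attention(second_hidden):
    """second_hidden: one target's second hidden states for t = 1..T_obs."""
    last, earlier = second_hidden[-1], second_hidden[:-1]
    # Correlation of the state at T_obs with each earlier state via <.,.>
    scores = [dot(last, h) for h in earlier]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted summary of the earlier historical moments
    context = [sum(w * h[d] for w, h in zip(weights, earlier))
               for d in range(len(last))]
    # Assumed combination: concatenate the T_obs state with the summary
    return weights, last + context

history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
weights, joint = temporal_attention(history)
```

Earlier moments whose state resembles the state at T_obs receive larger weights, so the joint state can draw directly on distant history instead of relying only on what the recurrent state has retained.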
Step 4. Input the joint hidden state of target i at historical time T_obs into the long short-term memory network D-LSTM for decoding to obtain the trajectory prediction result of the target. The specific steps are as follows:
4-1. Add a noise vector to the joint hidden state of target i at historical time T_obs and input the result into the D-LSTM for decoding to obtain the predicted relative position of target i at predicted time T_obs + 1.
In this calculation, Z denotes the noise vector; the joint hidden state fused with the noise vector is the intermediate state of target i at historical time T_obs and serves as the initial input of the D-LSTM network; the D-LSTM network outputs the hidden state of target i at predicted time T_obs + 1; δ_3(·) denotes a linear layer; the position vector is that of target i at historical time T_obs; and W_D is a learnable training parameter.
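The data flow of step 4-1 can be sketched as below. To stay short, the sketch swaps in a plain tanh recurrent cell for the D-LSTM, so it shows only how the pieces connect: noise fusion by concatenation, one decoding step driven by the position vector at T_obs, and a linear layer standing for δ_3. Gaussian noise, concatenation as the fusion, the weight shapes, and every name are assumptions; `w_x` and `w_h` together stand in for W_D.

```python
import math
import random

def fuse_noise(joint_state, noise_dim, rng):
    """Append a noise vector Z to the joint hidden state (assumed: concatenation)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return joint_state + z

def recurrent_step(x, h, w_x, w_h):
    """Stand-in for one D-LSTM step: a plain tanh recurrent cell, NOT an LSTM."""
    return [
        math.tanh(sum(a * b for a, b in zip(w_x[r], x))
                  + sum(a * b for a, b in zip(w_h[r], h)))
        for r in range(len(h))
    ]

def linear(h, weight, bias):
    """Linear output layer mapping a hidden state to a 2-D relative position."""
    return [sum(a * b for a, b in zip(row, h)) + b0
            for row, b0 in zip(weight, bias)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
joint_state = [0.2, -0.1, 0.4]   # joint hidden state at T_obs (made-up values)
position_vector = [0.5, 0.1]     # embedded relative position at T_obs (made-up)
NOISE = 2

d0 = fuse_noise(joint_state, NOISE, rng)             # intermediate state at T_obs
w_x = rand_matrix(len(d0), len(position_vector), rng)
w_h = rand_matrix(len(d0), len(d0), rng)
d1 = recurrent_step(position_vector, d0, w_x, w_h)   # hidden state at T_obs + 1
out_w = rand_matrix(2, len(d1), rng)
predicted_delta = linear(d1, out_w, [0.0, 0.0])      # predicted relative position
```

Drawing a fresh noise vector for each run lets the same observed history yield different plausible continuations.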
4-2. After the predicted relative position of target i at predicted time T_obs + 1 has been obtained, take it as the historical relative position for the next predicted time T_obs + 2; that is, update the historical times to {1, 2, …, T_obs, T_obs + 1} and return to step 2 to calculate the predicted relative position at the next predicted time. Iterate in this way to obtain the predicted relative positions at all subsequent predicted times.
4-3. Obtain the trajectory prediction result of target i from its predicted relative positions at each predicted time.
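The iteration in steps 4-2 and 4-3 can be sketched independently of the network. Here `predict_next` is a hypothetical placeholder for one full pass through steps 2 to 4-1, and the constant-velocity `repeat_last_delta` is only a stand-in used to make the example runnable. Summing the relative positions from the last observed position to recover absolute coordinates is an assumption about how the final trajectory is assembled.

```python
def rollout(history_deltas, predict_next, steps):
    """Step 4-2: feed each predicted relative position back in as history."""
    deltas = list(history_deltas)
    predicted = []
    for _ in range(steps):
        nxt = predict_next(deltas)  # one pass of steps 2 to 4-1 in the real method
        predicted.append(nxt)
        deltas.append(nxt)          # history becomes {1, ..., T_obs, T_obs + 1, ...}
    return predicted

def to_absolute(last_position, deltas):
    """Step 4-3: accumulate relative positions into absolute track points."""
    x, y = last_position
    path = []
    for dx, dy in deltas:
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def repeat_last_delta(deltas):
    """Stand-in one-step predictor (constant velocity), NOT the patent's model."""
    return deltas[-1]

history = [(1.0, 0.5), (1.0, 1.0), (1.5, 0.5)]
future_deltas = rollout(history, repeat_last_delta, steps=3)
future_path = to_absolute((3.5, 2.0), future_deltas)
```

Since every prediction becomes input for the next one, errors can compound over long horizons, which is the error growth the method aims to mitigate.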
The method first extracts the historical trajectory features of the targets with an LSTM network, then sets the association relations among the targets and fuses their trajectory features through a graph attention network to simulate target interaction. An attention mechanism then fuses, for each target, the trajectory features at different moments to obtain the trajectory features used for prediction. Finally, decoding with an LSTM network yields the trajectory prediction result.
The invention applies the attention mechanism from natural language processing to target trajectory prediction, mitigates the growth of prediction error with the prediction interval, and greatly reduces labor and time costs, making the result more accurate and effective. The foregoing is merely an illustrative embodiment of the invention, and the invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the scope of the invention shall fall within its scope of protection.

Claims (1)

1. A target trajectory prediction method based on an attention mechanism, characterized by comprising the following steps:
Step 1. Acquire the historical tracks of all targets in a video data set to obtain a historical trajectory data set of all targets.
Step 2. Based on the historical trajectory data set of the targets, extract the first hidden state of each target at each historical moment t using a long short-term memory (LSTM) network.
The specific steps are as follows: calculate the historical relative position of target i at each historical moment with respect to the preceding historical moment; map the historical relative position through an embedding function to obtain the position vector of target i at historical time t; then input the position vector into the LSTM to obtain the first hidden state of target i output by the LSTM at historical time t, where i ∈ [1, N].
Step 3. Establish an initial association relation between targets and take the first hidden state of each target as the input of the corresponding node in a graph attention network; then fuse the trajectory interaction features of target i and its adjacent targets through the graph attention network; finally, fuse the historical moments for target i based on the attention mechanism to obtain the joint hidden state of target i at the last historical time T_obs.
Specifically, the method comprises the following steps:
3-1. Establish the initial association relation between targets.
3-2. For target i, fuse the trajectory interaction features of its adjacent targets based on the graph attention network to obtain the second hidden state.
3-3. Input the second hidden state of target i at each historical moment into another graph attention network, introducing the attention mechanism; calculate the correlation between the second hidden state at the last historical time T_obs and the second hidden state at each earlier historical time t; then calculate the joint hidden state of the LSTM output for target i at the last historical time T_obs.
Step 4. Input the joint hidden state of target i at the last historical time T_obs into the long short-term memory network D-LSTM for decoding to obtain the predicted relative position of target i at the first predicted time T_obs + 1. Then take this predicted relative position as the historical relative position for the next predicted time, return to step 2, and calculate the predicted relative position at the next predicted time; iterate in this way to obtain the predicted relative positions at all subsequent predicted times, and finally obtain the trajectory prediction result of target i.
CN202111240446.8A 2021-10-25 2021-10-25 Attention mechanism-based target track prediction method Active CN113989326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111240446.8A CN113989326B (en) 2021-10-25 2021-10-25 Attention mechanism-based target track prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111240446.8A CN113989326B (en) 2021-10-25 2021-10-25 Attention mechanism-based target track prediction method

Publications (2)

Publication Number Publication Date
CN113989326A CN113989326A (en) 2022-01-28
CN113989326B true CN113989326B (en) 2023-08-25

Family

ID=79740914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111240446.8A Active CN113989326B (en) 2021-10-25 2021-10-25 Attention mechanism-based target track prediction method

Country Status (1)

Country Link
CN (1) CN113989326B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114792320A (en) * 2022-06-23 2022-07-26 中国科学院自动化研究所 Trajectory prediction method, trajectory prediction device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570651A (en) * 2019-07-15 2019-12-13 浙江工业大学 Road network traffic situation prediction method and system based on deep learning
CN110781838A (en) * 2019-10-28 2020-02-11 大连海事大学 Multi-modal trajectory prediction method for pedestrian in complex scene
CN111114543A (en) * 2020-03-26 2020-05-08 北京三快在线科技有限公司 Trajectory prediction method and device
CN112215337A (en) * 2020-09-30 2021-01-12 江苏大学 Vehicle trajectory prediction method based on environment attention neural network model
CN112257850A (en) * 2020-10-26 2021-01-22 河南大学 Vehicle track prediction method based on generation countermeasure network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10611371B2 (en) * 2017-09-14 2020-04-07 Toyota Motor Engineering & Manufacturing North America, Inc. System and method for vehicle lane change prediction using structural recurrent neural networks
US20200324794A1 (en) * 2020-06-25 2020-10-15 Intel Corporation Technology to apply driving norms for automated vehicle behavior prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhe Liu et al. Attention-based interaction trajectory prediction. International Conference on AI and Mobile Services. 2020, 168-175. *

Also Published As

Publication number Publication date
CN113989326A (en) 2022-01-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant