CN111553232B - Gate loop unit network pedestrian trajectory prediction method based on scene state iteration - Google Patents

Gate loop unit network pedestrian trajectory prediction method based on scene state iteration

Info

Publication number
CN111553232B
CN111553232B CN202010319857.5A CN202010319857A
Authority
CN
China
Prior art keywords
pedestrian
time
scene
gate
pedestrians
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010319857.5A
Other languages
Chinese (zh)
Other versions
CN111553232A (en
Inventor
路纲
刘远恒
吴晓军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Normal University filed Critical Shaanxi Normal University
Priority to CN202010319857.5A priority Critical patent/CN111553232B/en
Publication of CN111553232A publication Critical patent/CN111553232A/en
Application granted granted Critical
Publication of CN111553232B publication Critical patent/CN111553232B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Traffic Control Systems (AREA)

Abstract

A gated recurrent unit (GRU) network pedestrian trajectory prediction method based on scene state iteration comprises the steps of data preprocessing, scene state extraction, scene state iteration, prediction model construction and pedestrian trajectory prediction. The acquired pedestrian video data set is processed by coordinate normalization and data enhancement and encoded with a GRU network; the spatial relative position relationships between pedestrians in the scene and the attention of the pedestrians are determined, and the acquired pedestrian hidden states are iterated to obtain updated states. A prediction model is constructed and repeatedly trained with the leave-one-out method to obtain the optimal training model parameters, which are applied to the trained model; the coordinates to be predicted are then input, and the pedestrian trajectories are predicted. The method has the advantages of simplicity, low computational complexity and high prediction accuracy; it can be used by unmanned vehicles to predict pedestrian motion trajectories, and in other technical fields that require pedestrian trajectory prediction.

Description

Gate loop unit network pedestrian trajectory prediction method based on scene state iteration
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a method for predicting a walking track of a pedestrian.
Background
With the rapid evolution of computer technology, computing speed has grown by orders of magnitude, and many sequence-based prediction tasks have become increasingly tractable thanks to this dramatic growth in computational power. As a typical sequence problem, pedestrian trajectory prediction has wide application, for example in autonomous driving, unmanned delivery robots and real-time traffic monitoring.
Existing pedestrian trajectory prediction methods fall roughly into three categories: social interaction, motion pattern monitoring, and deep learning based methods. Social interaction methods are represented by the social force model and the Gaussian interaction process: the social force model constructs an attraction-repulsion model, while the Gaussian interaction process introduces a latent interaction network. Both simulate human interaction in a specific environment and achieve notable performance, but have the defect of being limited to a fixed scene. In motion pattern monitoring, clustering is the main prediction method, but it mostly predicts motion trajectories that avoid static obstacles and does not model crowd interaction. Deep learning based methods mainly apply recurrent neural networks, treating the human trajectory as a data chain with an inherent temporal order; the most representative is the Social-LSTM model, which describes the pedestrian network well through its social pooling operation based on social state, but its control of the pedestrian state over time steps is limited and does not reflect the real-time nature of pedestrian state changes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a gated recurrent unit (GRU) network pedestrian trajectory prediction method based on scene state iteration that has low computational complexity and high prediction accuracy.
The technical solution adopted to solve this problem comprises the following steps:
(1) Data pre-processing
Pedestrian trajectory data are extracted from the public data sets ETH and UCY, comprising 5 sub-data sets and including nonlinear pedestrian trajectories. The two-dimensional coordinates of each pedestrian i at time t are extracted from the ETH and UCY data sets and recorded as (x_t^i, y_t^i). All pedestrian coordinate data are processed by the coordinate normalization and data enhancement methods.
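As a non-authoritative illustration of step (1), the following sketch implements the two preprocessing operations described in this document (initial coordinate taken as the origin; random rotation of the corresponding frame). The track format and function names are assumptions, not part of the patent:

```python
import math

def normalize_track(track):
    """Shift a pedestrian track so its first observed point becomes the origin,
    matching the patent's coordinate normalization."""
    x0, y0 = track[0]
    return [(x - x0, y - y0) for x, y in track]

def rotate_track(track, theta):
    """Data enhancement: rotate all coordinates by angle theta, mimicking the
    random rotation of the corresponding video frame."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in track]

track = [(2.0, 3.0), (2.5, 3.5), (3.0, 4.0)]
norm = normalize_track(track)          # first point becomes (0, 0)
aug = rotate_track(norm, math.pi / 2)  # 90-degree rotation: (x, y) -> (-y, x)
```

In training, the rotation angle would be drawn at random per frame; a fixed angle is used here only to make the example deterministic.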
(2) Extracting scene states
1) Each pedestrian i at video time t is encoded by an independent gated recurrent unit (GRU) network, and all pedestrians at time t are encoded by the same method. The current position information of pedestrian i and the hidden state of the GRU network are input into the GRU network; the position information of pedestrian i at time t is represented by the tensor e_t^i, determined according to formula (1):
e_t^i = φ(x_t^i, y_t^i; W_e, b_e)  (1)
where φ(·) is a ReLU nonlinear function with bias, W_e is the weight matrix of the function, and b_e is the bias matrix of the function.
2) The hidden state of pedestrian i at time t is determined according to formula (2):
h_t^i = GRU(h_{t-1}^i, e_t^i; W_G, b_G)  (2)
where h_{t-1}^i is the hidden state of pedestrian i in the GRU network at time t-1, W_G is the internal weight matrix of the GRU network input, and b_G is the internal bias matrix of the GRU network input.
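Formulas (1) and (2) correspond to an embedding layer followed by a GRU step. In the sketch below the patent's internal matrices W_G and b_G appear only as formula images, so a textbook GRU parameterization and illustrative dimensions are assumed:

```python
import numpy as np

rng = np.random.default_rng(0)
D_EMB, D_HID = 16, 32  # illustrative sizes; the patent does not fix dimensions

# Formula (1): ReLU embedding of the raw (x, y) position with weight W_e, bias b_e.
W_e = rng.normal(0, 0.1, (D_EMB, 2)); b_e = np.zeros(D_EMB)

def embed(xy):
    return np.maximum(0.0, W_e @ np.asarray(xy) + b_e)

# Formula (2): one GRU step in the standard update/reset-gate form.
Wz = rng.normal(0, 0.1, (D_HID, D_EMB + D_HID)); bz = np.zeros(D_HID)
Wr = rng.normal(0, 0.1, (D_HID, D_EMB + D_HID)); br = np.zeros(D_HID)
Wn = rng.normal(0, 0.1, (D_HID, D_EMB + D_HID)); bn = np.zeros(D_HID)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h_prev, e):
    xh = np.concatenate([e, h_prev])
    z = sigmoid(Wz @ xh + bz)                                # update gate
    r = sigmoid(Wr @ xh + br)                                # reset gate
    n = np.tanh(Wn @ np.concatenate([e, r * h_prev]) + bn)   # candidate state
    return (1.0 - z) * h_prev + z * n                        # new hidden state h_t^i

h = np.zeros(D_HID)
for xy in [(0.0, 0.0), (0.4, 0.1), (0.8, 0.2)]:  # one pedestrian's observed track
    h = gru_step(h, embed(xy))
```

Each pedestrian would own an independent copy of this hidden state; the weight matrices themselves are typically shared across pedestrians.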
3) The spatial relative position relationship l_t^{ij} between pedestrians in the scene is determined according to formula (3):
l_t^{ij} = φ_l(x_t^i - x_t^j, y_t^i - y_t^j; W_l)  (3)
where φ_l is a ReLU nonlinear function, W_l is the weight of the function, and (x_t^i, y_t^i) and (x_t^j, y_t^j) represent the spatial coordinates of pedestrian i and pedestrian j at time t; i and j are finite positive integers, i ≠ j.
4) The feature matrix F_t^{ij} of the relationship between pedestrians is determined according to formulas (4) and (5) (formula images not reproduced), where σ_F is a Sigmoid nonlinear function with bias, W_F is the weight matrix of the function, and b_F is the bias matrix of the function.
5) The attention α_t^{ij} of each pedestrian in the scene is determined according to formula (6):
α_t^{ij} = Softmax(W_α F_t^{ij})  (6)
where W_α is the weight matrix of the Softmax function.
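Steps 3) to 5) can be sketched together as follows. Since formulas (3) to (6) appear as images in the source, the exact forms below (a ReLU layer over pairwise displacements, a Sigmoid pairwise feature combining hidden states with relative positions, and a Softmax over the other pedestrians) are plausible assumptions, not the patented formulas:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D_HID = 3, 32                       # toy scene: 3 pedestrians

pos = rng.normal(0, 1, (N, 2))         # positions at time t
h = rng.normal(0, 0.5, (N, D_HID))     # GRU hidden states from formula (2)

# Formula (3), assumed form: ReLU layer over every pairwise displacement.
W_l = rng.normal(0, 0.1, (D_HID, 2))
rel = np.maximum(0.0, (pos[:, None, :] - pos[None, :, :]) @ W_l.T)  # (N, N, D_HID)

# Formulas (4)-(5), assumed form: Sigmoid-gated pairwise feature from the
# j-th hidden state and the i-j relative position.
W_F = rng.normal(0, 0.1, (D_HID, 2 * D_HID)); b_F = np.zeros(D_HID)
pair = np.concatenate([np.broadcast_to(h[None, :, :], (N, N, D_HID)), rel], axis=-1)
F = 1.0 / (1.0 + np.exp(-(pair @ W_F.T + b_F)))

# Formula (6), assumed form: Softmax attention of pedestrian i over pedestrians j.
W_a = rng.normal(0, 0.1, D_HID)
scores = F @ W_a                       # (N, N) attention logits
np.fill_diagonal(scores, -np.inf)      # a pedestrian does not attend to itself
alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
alpha /= alpha.sum(axis=1, keepdims=True)
```

Each row of `alpha` sums to one, giving pedestrian i a normalized weighting over the other pedestrians in the scene.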
(3) Iterative scene states
The pedestrian hidden states obtained from formula (2) are iterated according to formulas (7) and (8) (formula images not reproduced) to obtain the updated states ĥ_t^i, where ⊙ denotes the Hadamard product, N is the number of all pedestrians appearing in the scene and is a finite positive integer, W_h is a coefficient matrix, z denotes the update gate in the gated recurrent unit, σ_z is the Sigmoid nonlinear function in the gate, W_z is the weight matrix of the function, and b_z is the bias matrix of the function.
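A plausible reading of this scene state iteration, assuming a GRU-style update gate z (σ_z with W_z, b_z) that blends, via the Hadamard product, each pedestrian's own hidden state with an attention-weighted scene summary passed through W_h (formulas (7) and (8) themselves are images in the source):

```python
import numpy as np

rng = np.random.default_rng(2)
N, D_HID = 3, 32
h = rng.normal(0, 0.5, (N, D_HID))     # hidden states from formula (2)
alpha = np.full((N, N), 1.0 / (N - 1)) # stand-in attention from formula (6)
np.fill_diagonal(alpha, 0.0)

W_h = rng.normal(0, 0.1, (D_HID, D_HID))
W_z = rng.normal(0, 0.1, (D_HID, 2 * D_HID)); b_z = np.zeros(D_HID)

scene = alpha @ h @ W_h.T              # neighbour information gathered for each i
z = 1.0 / (1.0 + np.exp(-(np.concatenate([h, scene], axis=1) @ W_z.T + b_z)))
h_new = (1.0 - z) * h + z * np.tanh(scene)  # Hadamard-product blend (assumed form)
```

The design intent matches the surrounding text: the interaction result corrects each pedestrian's hidden state at the current time step, rather than waiting for the next GRU step.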
(4) Building a prediction model
The predicted coordinates (x̂_{t+1}^i, ŷ_{t+1}^i) of pedestrian i at time t+1 are determined according to formula (9):
(x̂_{t+1}^i, ŷ_{t+1}^i) = W ĥ_t^i  (9)
where W represents the fully learned parameter matrix.
Combining the coordinate sequences provided by the ETH and UCY data sets, with time t_ob as the observation start time and time t_s as the observation end time, a complete observation step period P is determined:
P = t_s - t_ob  (10)
All information of the scene state within the step period P is fed into the GRU network, and the pedestrian coordinates at time t_s+1 are predicted using the scene state iteration of step (3) and formula (9). Then, with time t_ob+1 as the observation start time and time t_s+1 as the observation end time, the pedestrian coordinates at time t_s+2 are predicted by the same method; prediction continues in this way until the pedestrian coordinates at time t_s+t_pred have been predicted, and the prediction model is thus constructed.
(5) Pedestrian trajectory prediction
The prediction model is repeatedly trained on the ETH and UCY data sets with the leave-one-out method, minimizing the mean square error (MSE). The MSE between the predicted trajectory and the real trajectory is determined according to formula (11):
MSE = (1/n) Σ ‖(x̂_t^i, ŷ_t^i) - (x_t^i, y_t^i)‖²  (11)
where the sum runs over all n predicted pedestrian positions. The optimal training model parameters are obtained and applied to the trained model; the coordinate data to be predicted are then input, and the pedestrian trajectory is predicted.
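The leave-one-out protocol of step (5) and the MSE of formula (11) can be sketched as follows; `leave_one_out` is an illustrative helper, and the subset names are the five commonly used ETH/UCY sub-data sets rather than identifiers given in the patent:

```python
def mse(pred, real):
    """Mean squared error between predicted and real trajectories
    (lists of per-pedestrian tracks of (x, y) points), formula (11)."""
    n = sum(len(track) for track in real)
    total = 0.0
    for p_track, r_track in zip(pred, real):
        for (px, py), (rx, ry) in zip(p_track, r_track):
            total += (px - rx) ** 2 + (py - ry) ** 2
    return total / n

def leave_one_out(datasets):
    """Yield (train_split, held_out) pairs, one per sub-data set."""
    for i, held_out in enumerate(datasets):
        yield datasets[:i] + datasets[i + 1:], held_out

subsets = ["eth", "hotel", "univ", "zara1", "zara2"]  # common ETH/UCY names
splits = list(leave_one_out(subsets))  # 5 train/test splits

err = mse([[(1.0, 1.0), (2.0, 2.0)]], [[(1.0, 0.0), (2.0, 1.0)]])
```

Each split trains on four subsets and evaluates on the held-out one; the parameters with the lowest held-out MSE are the "optimal training model parameters" of this step.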
In the data preprocessing step (1) of the invention, the coordinate normalization method is: within the observation time length, the initial coordinate of pedestrian i is taken as the origin. The data enhancement method is: the video image of the corresponding frame is randomly rotated.
In step (4) of constructing the prediction model, the value range of the observation step period P is P ∈ [5, 10], and the value range of t_pred is t_pred ∈ [8, 12].
In the invention, the acquired pedestrian video data set is processed by coordinate normalization and data enhancement, encoded with a gated recurrent unit network, the spatial relative position relationships between pedestrians in the scene and the attention of the pedestrians are determined, and the acquired pedestrian hidden states are iterated to obtain updated states. A prediction model is constructed and repeatedly trained with the leave-one-out method to obtain the optimal training model parameters, which are applied to the trained model; the coordinate data to be predicted are input, and the pedestrian trajectory is predicted. The method effectively uses the state extracted from the scene without being limited by the environment, accurately determines the interaction state between pedestrians in the scene, fully considers their interactions, and corrects the pedestrian hidden states in real time according to the interaction results, so that the predicted trajectory is close to the real pedestrian trajectory. The method has the advantages of simplicity, low computational complexity and high prediction accuracy; it can be used by unmanned vehicles to predict pedestrian motion trajectories and thus avoid pedestrians, and can also be used in other technical fields that require pedestrian trajectory prediction.
Drawings
FIG. 1 is a flowchart of example 1 of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and examples, but the present invention is not limited to the embodiments described below.
Example 1
Taking 3 video sequences from the public video data set ETH and 2 video sequences from the video data set UCY as an example, the gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration of this embodiment comprises the following steps (see FIG. 1):
(1) Data pre-processing
3 video sequences are taken from the public video data set ETH and 2 from the video data set UCY, and pedestrian trajectory data are extracted, comprising 5 sub-data sets and including nonlinear pedestrian trajectories. The two-dimensional coordinates of each pedestrian i at time t are extracted from the ETH and UCY data sets and recorded as (x_t^i, y_t^i). All pedestrian coordinate data are processed by the coordinate normalization and data enhancement methods.
The coordinate normalization method of this embodiment is: within the observation time length, the initial coordinate of pedestrian i is taken as the origin. The data enhancement method is: the video image of the corresponding frame is randomly rotated.
(2) Extracting scene states
1) Each pedestrian i at video time t is encoded by an independent gated recurrent unit (GRU) network, and all pedestrians at time t are encoded by the same method. The current position information of pedestrian i and the hidden state of the GRU network are input into the GRU network; the position information of pedestrian i at time t is represented by the tensor e_t^i, determined according to formula (1):
e_t^i = φ(x_t^i, y_t^i; W_e, b_e)  (1)
where φ(·) is a ReLU nonlinear function with bias, W_e is the weight matrix of the function, and b_e is the bias matrix of the function.
2) The hidden state of pedestrian i at time t is determined according to formula (2):
h_t^i = GRU(h_{t-1}^i, e_t^i; W_G, b_G)  (2)
where h_{t-1}^i is the hidden state of pedestrian i in the GRU network at time t-1, W_G is the internal weight matrix of the GRU network input, and b_G is the internal bias matrix of the GRU network input.
3) The spatial relative position relationship l_t^{ij} between pedestrians in the scene is determined according to formula (3):
l_t^{ij} = φ_l(x_t^i - x_t^j, y_t^i - y_t^j; W_l)  (3)
where φ_l is a ReLU nonlinear function, W_l is the weight of the function, and (x_t^i, y_t^i) and (x_t^j, y_t^j) represent the spatial coordinates of pedestrian i and pedestrian j at time t; i and j are finite positive integers, i ≠ j.
4) The feature matrix F_t^{ij} of the relationship between pedestrians is determined according to formulas (4) and (5) (formula images not reproduced), where σ_F is a Sigmoid nonlinear function with bias, W_F is the weight matrix of the function, and b_F is the bias matrix of the function.
5) The attention α_t^{ij} of each pedestrian in the scene is determined according to formula (6):
α_t^{ij} = Softmax(W_α F_t^{ij})  (6)
where W_α is the weight matrix of the Softmax function.
This step fully considers the interactions between pedestrians in the scene: from the currently extracted pedestrian hidden states and the spatial relative position relationships between pedestrians, the pedestrian relationship features and the pedestrians' attention are obtained, further describing the behavioral characteristics of the pedestrians in the scene.
(3) Iterative scene states
The pedestrian hidden states obtained from formula (2) are iterated according to formulas (7) and (8) (formula images not reproduced) to obtain the updated states ĥ_t^i, where ⊙ denotes the Hadamard product, N is the number of all pedestrians appearing in the scene and is a finite positive integer, W_h is a coefficient matrix, z denotes the update gate in the gated recurrent unit, σ_z is the Sigmoid nonlinear function in the gate, W_z is the weight matrix of the function, and b_z is the bias matrix of the function.
In this step, through the scene state iteration method, the interaction results between pedestrians are expressed by correcting the hidden state of pedestrian i at the current time t, reflecting in real time the changes of pedestrian states in the scene.
(4) Building a prediction model
The predicted coordinates (x̂_{t+1}^i, ŷ_{t+1}^i) of pedestrian i at time t+1 are determined according to formula (9):
(x̂_{t+1}^i, ŷ_{t+1}^i) = W ĥ_t^i  (9)
where W represents the fully learned parameter matrix.
Combining the coordinate sequences provided by the ETH and UCY data sets, with time t_ob as the observation start time and time t_s as the observation end time, a complete observation step period P is determined:
P = t_s - t_ob  (10)
All information of the scene state within the step period P is fed into the GRU network; the value of the observation step period P in this embodiment is 8. The pedestrian coordinates at time t_s+1 are predicted using the scene state iteration of step (3) and formula (9). Then, with time t_ob+1 as the observation start time and time t_s+1 as the observation end time, the pedestrian coordinates at time t_s+2 are predicted by the same method; prediction continues until the pedestrian coordinates at time t_s+t_pred have been predicted. The value of t_pred in this embodiment is 10, and the prediction model is constructed.
(5) Pedestrian trajectory prediction
The prediction model is repeatedly trained on the ETH and UCY data sets with the leave-one-out method, minimizing the mean square error (MSE). The MSE between the predicted trajectory and the real trajectory is determined according to formula (11):
MSE = (1/n) Σ ‖(x̂_t^i, ŷ_t^i) - (x_t^i, y_t^i)‖²  (11)
where the sum runs over all n predicted pedestrian positions. The optimal training model parameters are obtained and applied to the trained model; the coordinate data to be predicted are then input, and the pedestrian trajectory is predicted.
Example 2
Taking 3 video sequences from the public video data set ETH and 2 video sequences from the video data set UCY as an example, the gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration of this embodiment comprises the following steps:
(1) Data pre-processing
This procedure is the same as in example 1.
(2) Extracting scene states
This procedure is the same as in example 1.
(3) Iterative scene states
This procedure is the same as in example 1.
(4) Building a prediction model
The predicted coordinates (x̂_{t+1}^i, ŷ_{t+1}^i) of pedestrian i at time t+1 are determined according to formula (9):
(x̂_{t+1}^i, ŷ_{t+1}^i) = W ĥ_t^i  (9)
where W represents the fully learned parameter matrix.
Combining the coordinate sequences provided by the ETH and UCY data sets, with time t_ob as the observation start time and time t_s as the observation end time, a complete observation step period P is determined:
P = t_s - t_ob  (10)
All information of the scene state within the step period P is fed into the GRU network; the value of the observation step period P in this embodiment is 5. The pedestrian coordinates at time t_s+1 are predicted using the scene state iteration of step (3) and formula (9). Then, with time t_ob+1 as the observation start time and time t_s+1 as the observation end time, the pedestrian coordinates at time t_s+2 are predicted by the same method; prediction continues until the pedestrian coordinates at time t_s+t_pred have been predicted. The value of t_pred in this embodiment is 8, and the prediction model is constructed.
The other steps were the same as in example 1.
Example 3
Taking 3 video sequences from the public video data set ETH and 2 video sequences from the video data set UCY as an example, the gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration of this embodiment comprises the following steps:
(1) Data pre-processing
This procedure is the same as in example 1.
(2) Extracting scene states
This procedure is the same as in example 1.
(3) Iterative scene states
This procedure is the same as in example 1.
(4) Building a prediction model
The predicted coordinates (x̂_{t+1}^i, ŷ_{t+1}^i) of pedestrian i at time t+1 are determined according to formula (9):
(x̂_{t+1}^i, ŷ_{t+1}^i) = W ĥ_t^i  (9)
where W represents the fully learned parameter matrix.
Combining the coordinate sequences provided by the ETH and UCY data sets, with time t_ob as the observation start time and time t_s as the observation end time, a complete observation step period P is determined:
P = t_s - t_ob  (10)
All information of the scene state within the step period P is fed into the GRU network; the value of the observation step period P in this embodiment is 10. The pedestrian coordinates at time t_s+1 are predicted using the scene state iteration of step (3) and formula (9). Then, with time t_ob+1 as the observation start time and time t_s+1 as the observation end time, the pedestrian coordinates at time t_s+2 are predicted by the same method; prediction continues until the pedestrian coordinates at time t_s+t_pred have been predicted. The value of t_pred in this embodiment is 12, and the prediction model is constructed.
The other steps were the same as in example 1.

Claims (3)

1. A gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration, characterized by comprising the following steps:
(1) Data pre-processing
pedestrian trajectory data are extracted from the public data sets ETH and UCY, comprising 5 sub-data sets and including nonlinear pedestrian trajectories, wherein the two-dimensional coordinates of each pedestrian i at time t are extracted from the ETH and UCY data sets and recorded as (x_t^i, y_t^i), and all pedestrian coordinate data are processed by the coordinate normalization and data enhancement methods;
(2) Extracting scene states
1) each pedestrian i at video time t is encoded by an independent gated recurrent unit (GRU) network, and all pedestrians at time t are encoded by the same method; the current position information of pedestrian i and the hidden state of the GRU network are input into the GRU network, and the position information of pedestrian i at time t is represented by the tensor e_t^i, determined according to formula (1):
e_t^i = φ(x_t^i, y_t^i; W_e, b_e)  (1)
where φ(·) is a ReLU nonlinear function with bias, W_e is the weight matrix of the function, and b_e is the bias matrix of the function;
2) the hidden state of pedestrian i at time t is determined according to formula (2):
h_t^i = GRU(h_{t-1}^i, e_t^i; W_G, b_G)  (2)
where h_{t-1}^i is the hidden state of pedestrian i in the GRU network at time t-1, W_G is the internal weight matrix of the GRU network input, and b_G is the internal bias matrix of the GRU network input;
3) the spatial relative position relationship l_t^{ij} between pedestrians in the scene is determined according to formula (3):
l_t^{ij} = φ_l(x_t^i - x_t^j, y_t^i - y_t^j; W_l)  (3)
where φ_l is a ReLU nonlinear function, W_l is the weight of the function, and (x_t^i, y_t^i) and (x_t^j, y_t^j) represent the spatial coordinates of pedestrian i and pedestrian j at time t; i and j are finite positive integers, i ≠ j;
4) the feature matrix F_t^{ij} of the relationship between pedestrians is determined according to formulas (4) and (5) (formula images not reproduced), where σ_F is a Sigmoid nonlinear function with bias, W_F is the weight matrix of the function, and b_F is the bias matrix of the function;
5) the attention α_t^{ij} of each pedestrian in the scene is determined according to formula (6):
α_t^{ij} = Softmax(W_α F_t^{ij})  (6)
where W_α is the weight matrix of the Softmax function;
(3) Iterative scene states
the pedestrian hidden states obtained from formula (2) are iterated according to formulas (7) and (8) (formula images not reproduced) to obtain the updated states ĥ_t^i, where ⊙ denotes the Hadamard product, N is the number of all pedestrians appearing in the scene and is a finite positive integer, W_h is a coefficient matrix, z denotes the update gate in the gated recurrent unit, σ_z is the Sigmoid nonlinear function in the gate, W_z is the weight matrix of the function, and b_z is the bias matrix of the function;
(4) Building a prediction model
the predicted coordinates (x̂_{t+1}^i, ŷ_{t+1}^i) of pedestrian i at time t+1 are determined according to formula (9):
(x̂_{t+1}^i, ŷ_{t+1}^i) = W ĥ_t^i  (9)
where W represents the fully learned parameter matrix;
combining the coordinate sequences provided by the ETH and UCY data sets, with time t_ob as the observation start time and time t_s as the observation end time, a complete observation step period P is determined:
P = t_s - t_ob  (10)
all information of the scene state within the step period P is fed into the GRU network, and the pedestrian coordinates at time t_s+1 are predicted using the scene state iteration of step (3) and formula (9); then, with time t_ob+1 as the observation start time and time t_s+1 as the observation end time, the pedestrian coordinates at time t_s+2 are predicted by the same method, and prediction continues until the pedestrian coordinates at time t_s+t_pred have been predicted, whereby the prediction model is constructed;
(5) Pedestrian trajectory prediction
the prediction model is repeatedly trained on the ETH and UCY data sets with the leave-one-out method, minimizing the mean square error (MSE); the MSE between the predicted trajectory and the real trajectory is determined according to formula (11):
MSE = (1/n) Σ ‖(x̂_t^i, ŷ_t^i) - (x_t^i, y_t^i)‖²  (11)
where the sum runs over all n predicted pedestrian positions; the optimal training model parameters are obtained and applied to the trained model, the coordinate data to be predicted are input, and the pedestrian trajectory is predicted.
2. The gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration according to claim 1, characterized in that in the data preprocessing step (1), the coordinate normalization method is: within the observation time length, the initial coordinate of pedestrian i is taken as the origin; and the data enhancement method is: the video image of the corresponding frame is randomly rotated.
3. The gated recurrent unit network pedestrian trajectory prediction method based on scene state iteration according to claim 1, characterized in that in step (4) of constructing the prediction model, the value range of the observation step period P is P ∈ [5, 10], and the value range of t_pred is t_pred ∈ [8, 12].
CN202010319857.5A 2020-04-22 2020-04-22 Gate loop unit network pedestrian trajectory prediction method based on scene state iteration Active CN111553232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010319857.5A CN111553232B (en) 2020-04-22 2020-04-22 Gate loop unit network pedestrian trajectory prediction method based on scene state iteration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010319857.5A CN111553232B (en) 2020-04-22 2020-04-22 Gate loop unit network pedestrian trajectory prediction method based on scene state iteration

Publications (2)

Publication Number Publication Date
CN111553232A CN111553232A (en) 2020-08-18
CN111553232B true CN111553232B (en) 2023-04-07

Family

ID=72005836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010319857.5A Active CN111553232B (en) 2020-04-22 2020-04-22 Gate loop unit network pedestrian trajectory prediction method based on scene state iteration

Country Status (1)

Country Link
CN (1) CN111553232B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558185A (en) * 2020-11-19 2021-03-26 中国石油大学(华东) Bidirectional GRU typhoon track intelligent prediction and forecast system based on attention mechanism, computer equipment and storage medium
CN113256681B (en) * 2021-05-26 2022-05-13 北京易航远智科技有限公司 Pedestrian trajectory prediction method based on space-time attention mechanism
CN113689470B (en) * 2021-09-02 2023-08-11 重庆大学 Pedestrian motion trail prediction method under multi-scene fusion
CN114446046A (en) * 2021-12-20 2022-05-06 上海智能网联汽车技术中心有限公司 LSTM model-based weak traffic participant track prediction method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107886060A (en) * 2017-11-01 2018-04-06 西安交通大学 Pedestrian's automatic detection and tracking based on video
CN108564118B (en) * 2018-03-30 2021-05-11 陕西师范大学 Crowd scene pedestrian trajectory prediction method based on social affinity long-term and short-term memory network model
CN109241871A (en) * 2018-08-16 2019-01-18 北京此时此地信息科技有限公司 A kind of public domain stream of people's tracking based on video data
CN110737968B (en) * 2019-09-11 2021-03-16 北京航空航天大学 Crowd trajectory prediction method and system based on deep convolutional long and short memory network
CN110781838B (en) * 2019-10-28 2023-05-26 大连海事大学 Multi-mode track prediction method for pedestrians in complex scene

Also Published As

Publication number Publication date
CN111553232A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CN111553232B (en) Gate loop unit network pedestrian trajectory prediction method based on scene state iteration
EP4198820A1 (en) Training method for semi-supervised learning model, image processing method, and device
CN111461258B (en) Remote sensing image scene classification method of coupling convolution neural network and graph convolution network
CN108256562B (en) Salient target detection method and system based on weak supervision time-space cascade neural network
US11663441B2 (en) Action selection neural network training using imitation learning in latent space
Blum et al. Fishyscapes: A benchmark for safe semantic segmentation in autonomous driving
CN107330410B (en) Anomaly detection method based on deep learning in complex environment
Hu et al. Probabilistic future prediction for video scene understanding
Ondruska et al. Deep tracking: Seeing beyond seeing using recurrent neural networks
Chen et al. Efficient movement representation by embedding dynamic movement primitives in deep autoencoders
CN112132149B (en) Semantic segmentation method and device for remote sensing image
CN110084201B (en) Human body action recognition method based on convolutional neural network of specific target tracking in monitoring scene
CN112771542A (en) Learning-enhanced neural network based on learned visual entities
CN110570035B (en) People flow prediction system for simultaneously modeling space-time dependency and daily flow dependency
CN112183742B (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
Zhang et al. A multi-modal states based vehicle descriptor and dilated convolutional social pooling for vehicle trajectory prediction
US11789466B2 (en) Event camera based navigation control
CN115438856A (en) Pedestrian trajectory prediction method based on space-time interaction characteristics and end point information
CN116434033A (en) Cross-modal contrast learning method and system for RGB-D image dense prediction task
CN113657387A (en) Semi-supervised three-dimensional point cloud semantic segmentation method based on neural network
Darapaneni et al. Autonomous car driving using deep learning
Salem et al. Semantic image inpainting using self-learning encoder-decoder and adversarial loss
CN115346207A (en) Method for detecting three-dimensional target in two-dimensional image based on example structure correlation
Robert The Role of Deep Learning in Computer Vision
Xiao et al. Fine-grained road scene understanding from aerial images based on semisupervised semantic segmentation networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant