CN114582128B - Traffic flow prediction method, medium and equipment based on graph discrete attention network - Google Patents


Info

Publication number: CN114582128B (application CN202210234138.2A)
Authority: CN (China)
Legal status: Active (granted)
Inventors: 苏杰 (Su Jie), 刘勇 (Liu Yong), 杨建党 (Yang Jiandang)
Assignee: Zhejiang University (ZJU)
Other versions: CN114582128A
Application CN202210234138.2A filed by Zhejiang University; published as CN114582128A; granted and published as CN114582128B.
Classifications

    • G08G 1/0125 — Traffic control systems for road vehicles; detecting movement of traffic to be counted or controlled; traffic data processing
    • G06N 3/04 — Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
    • G06N 3/08 — Neural networks; learning methods
    • Y02T 10/40 — Climate change mitigation technologies related to transportation; engine management systems
Abstract

The invention discloses a traffic flow prediction method, medium and device based on a graph discrete attention network. The method jointly considers the temporal and spatial characteristics of traffic flow: spatial characteristics are captured by a graph discrete attention mechanism, and temporal characteristics by a multi-layer-encoder-to-multi-layer-decoder sequence architecture, yielding a complete traffic flow model; a road traffic flow prediction model is then obtained by training this model. Results show that the constructed model accurately predicts future traffic flow at traffic monitoring points and characterizes the dynamic change of flow between monitoring points.

Description

Traffic flow prediction method, medium and equipment based on graph discrete attention network
Technical Field
The invention belongs to the field of digital intelligent traffic, and particularly relates to a traffic flow prediction method, medium and equipment based on a graph discrete attention network.
Background
In the past decades, car ownership in China has grown continuously and was expected to exceed 300 million vehicles in 2022. Meanwhile, traffic demand increases day by day, so the load on road traffic grows daily, bringing a series of problems such as congestion and accidents. Although traffic management departments have taken measures such as road construction and license-plate-based driving restrictions to alleviate congestion to some extent, the congestion situation has not improved substantially.
Traffic flow prediction and control are core problems in improving traffic efficiency: making reasonable decisions in advance based on prediction results can effectively improve traffic efficiency and prevent congestion and accidents. However, traffic flow data have both time-series and spatially correlated features, and modeling such spatio-temporally coupled data is quite challenging. Furthermore, conditions in some areas are limited and deploying sensing equipment is difficult, so the collected traffic data are relatively sparse and incomplete, which further increases the difficulty of designing an efficient prediction algorithm.
Conventional traffic flow prediction algorithms treat traffic flow data as time series and fit them with correlation models. For example, the Auto-Regressive Integrated Moving Average (ARIMA) model and the Kalman filter algorithm have been used to predict traffic flow with some effect, but the prediction accuracy is not satisfactory. With the development of deep learning in recent years, traffic flow prediction solutions based on deep learning have stood out: solutions based on deep belief networks, autoencoders, deep convolutional neural networks, recurrent neural networks and the like achieve good results, but the spatial coupling characteristics of traffic flow data are still not effectively exploited.
The expressive power of graph structures over spatial relations has inspired the design of graph neural networks for modeling spatio-temporally coupled data. Research based on spatio-temporal graph models has made breakthrough progress in traffic flow modeling and prediction. Typical methods include the spatio-temporal graph convolutional network (STGCN) proposed in Yu B, Yin H, Zhu Z. "Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting" [C]//IJCAI. 2018; the diffusion convolutional recurrent neural network (DCRNN) proposed in Li Y, Yu R, Shahabi C, et al. "Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting" [C]//International Conference on Learning Representations. 2018; Graph WaveNet, proposed in Wu Z, Pan S, Long G, et al. "Graph WaveNet for Deep Spatial-Temporal Graph Modeling" [C]//IJCAI. 2019; and the graph structure learning method of Zhang Q, Chang J, Meng G, et al. "Spatio-temporal graph structure learning for traffic forecasting" [C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34(01): 1177-1185. These methods combine graph structures with convolutional and recurrent neural networks and better exploit the spatial characteristics of traffic flow. Similarly, the invention patent entitled "Road traffic flow prediction method based on graph convolutional network" (grant announcement number CN110264709B) integrates the spatial and temporal features of road traffic flow data using GCN and LSTM networks to obtain a predicted value of road traffic flow for the next period.
The invention patent entitled "Intelligent guidance based on expressway flow monitoring and prediction" (grant number CN110503826B) treats road traffic flow as a time series, fits it with an ACTI_ARMA algorithm to realize traffic flow prediction, and issues guidance according to an expressway information-release workflow combined with road design and management schemes. The invention patent entitled "Expressway traffic flow prediction method based on multi-modal fusion and graph attention mechanism" (grant number CN111540199B) constructs a temporal convolutional attention network and a graph attention network to predict expressway traffic flow. However, these existing advanced traffic flow prediction models use a static, predefined graph structure. Bai, Lei, et al. "Adaptive graph convolutional recurrent network for traffic forecasting." Advances in Neural Information Processing Systems 33 (2020): 17804-17815 attempts to make the graph structure a dynamically updated parameter, which improves prediction accuracy but reduces the stability of the graph structure information. How to simultaneously combine static and dynamic graph structure information for high-precision traffic flow prediction remains to be studied.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provide a traffic flow prediction method, medium and equipment based on a graph discrete attention network.
In a first aspect, the present invention provides a road traffic flow prediction method based on a graph discrete attention network, comprising the steps of:
s1, acquiring structural traffic flow data which are acquired by sensors at different positions on a road to be predicted and are related to vehicle flow, grouping the structural traffic flow data according to a set interval step length, wherein each group of data comprises vehicle information passing through each sensor in an interval period corresponding to the grouping, and finally obtaining flow statistical data which are ordered according to time;
s2, constructing a road map network structure aiming at a road to be predicted, taking the point position of each sensor deployment as a node of the map network structure, connecting the nodes through edges, normalizing the actual distance between the nodes on the road, and taking the normalized distance as the static weight of the connected edges between the nodes in the map;
s3, modeling traffic flow data of a road based on a graph discrete attention network, firstly constructing a spatial characteristic relation of a regional road network by using a graph discrete attention module during modeling, then constructing a time sequence characteristic relation of a decoder sequence model based on an encoder sequence, and finally forming a graph discrete attention network model;
and S4, training the graph discrete attention network model by utilizing the flow statistical data obtained in the S1 to obtain a flow prediction model of traffic monitoring points and road networks, and performing actual traffic flow prediction.
Preferably, in S1, the structured traffic flow data collected by the sensor include the position information of the sensor, the pass-through time of the vehicle, the license plate number, and the driving direction of the vehicle.
Preferably, in the step S1, the structured traffic flow data collected by the sensors are statistically processed using Linux shell scripts and Python scientific computing tools.
Preferably, in the step S1, if the structured traffic flow data contain data for a plurality of vehicle driving directions, the data are first extracted with driving direction as a dimension, and the data of each driving direction are then grouped and sorted separately to form flow statistics; the flow statistics formed for each driving direction are used only to train the graph discrete attention network model for that driving direction.
Preferably, in the step S3, the skeleton structure of the graph discrete attention network model includes an encoder sequence and a decoder sequence, the encoder sequence is composed of L-1 layer encoders, the decoder sequence is composed of L' -1 layer decoders, and an association link is arranged between the L-1 layer encoder and each layer decoder; each layer of encoder is composed of a separate graph discrete attention module, and the graph discrete attention module contains discrete attention, graph attention and summation regularization operations; historical traffic flow data and a static adjacency matrix are accessed into a first layer encoder through an input full-connection layer; the last layer of the decoder generates a traffic flow prediction result through the output full-connection layer;
in the graph discrete attention network model, the spatial features of traffic flow data are constructed as a graph networkWherein->Representing a set of all N nodes in the graph network, each node representing a sensor capturing road traffic flow information, ε representing a set of edges between nodes, < >>Representing static adjacency matrix constructed after Euclidean distance normalization processing between different sensors; historical traffic flow data for a T' step duration with M dimensions is represented asWherein X is {t-T′+1,…,t} ={X t-T′+1 ,…,X t },X t Traffic flow data representing a t-th time step; traffic flow data to be predicted having a T-step duration of M dimensions is expressed as +.>The goal of model training is to learn a mapping function +.>With historical traffic flow data X {t-T′+1,…,t} And graph network->For input, traffic flow data of a future T-step duration is predicted, namely:
where ψ represents the parameters that can be learned.
Preferably, the graph discrete attention network model includes an input fully connected layer, an L-1 layer encoder, an L' -1 layer decoder, and an output fully connected layer;
the input full connection layer will X {t-T′+1,…,t} Andconversion to the first layer encoder feature matrix +.>And adjacency transfer matrix->First layer encoder feature matrix->And adjacency transfer matrix->
Wherein the method comprises the steps ofAnd->Respectively representing weight matrix and offset of the full connection layer corresponding to the encoder>And->Weight matrix and offset of full connection layer corresponding to decoder are respectively represented>Is->Degree matrix of (2), degree matrix->The element of the mth row and the mth column +.> Representing static adjacency matrix->An element of an mth row and an nth column;
in the L-1 layer encoder, for any first layer encoder, L E [1, L-1 ]]The inputs are allAndoutput is->And->Wherein->The operation mode is as follows:
where ReLU represents the activation function and,representing residual connection,/->Representing a weight matrix, +.>Representing a linear transformation matrix>Representing multi-head discrete attention, calculated as follows:
where i represents a stitching operation, H represents the total number of attention points,represents the h head discrete attention; discrete attention to any head>The calculation mode is as follows:
wherein K represents the dispersion, K represents the total dispersion, θ k Representing discrete weight coefficients, the calculation method is as follows:
wherein the method comprises the steps of"Value" transformation matrix representing self-attention mechanism, view is a matrix transformation operation for transforming a matrix of N rows and N columns in the dimension before transformation into a matrix of N in the dimension 2 A transformed matrix of row 1 columns; />I.e. the input sequence of the self-attention mechanism, e ij Representing the degree of compatibility, i.e., the attention score, between discrete step i and discrete step j, the calculation method is as follows,
wherein the method comprises the steps ofAnd->An "index" (Query) transformation matrix and a "Key" (Key) transformation matrix, respectively, representing self-attention, qs representing an "index" block size;
the adjacency transfer matrixThe calculation mode of (2) is as follows:
wherein the method comprises the steps ofThe elements representing the dynamic update of the first layer, row a and column b, are found from the following multi-headed graph attention expression:
wherein m is E [1, M]A sequence number representing the attention header number, M representing the total header number of the attention header; ne (a) represents the set of neighbor nodes of node a,and->Attention score, each representing the attention of the mth head figure,/>The calculation mode of (2) is as follows:
wherein a is m (. Cndot.) represents the weight vector of the attention of the mth head,and->Respectively represent feature matrix->Is an activation function;
in the L '-1 layer decoder, L E [1, L' -1 ] for any first layer decoder]The inputs are allAndand +.>And->Output is->And-> The calculation method is as follows:
wherein the method comprises the steps ofAnd->Respectively representing a weight matrix and a linear transformation matrix; />Representing the correlation of the last layer encoder module output +.>And->Is a multi-headed discrete attention; discrete attention to any headThe calculation mode is as follows:
wherein the discrete parameter is calculated asAttention score +.>The calculation mode of (a) is as follows:
wherein:transformation matrices representing "index", "key" and "value", respectively;
representing the feature matrix generated by the discrete attention module in the following calculation mode:
wherein the method comprises the steps ofRepresenting multi-headed discrete attention of the discrete attention module; for any one ofDiscrete attention->The calculation mode is +.>Wherein the discrete parameter is calculated asAttention score +.>The calculation mode of (a) is that Transformation matrices representing "index", "key" and "value", respectively; />And->Respectively representing a weight matrix and a linear transformation matrix;
the saidThe manner of calculation of (c) is as follows,
wherein the method comprises the steps ofRepresenting a dynamic update section whose element formula of row a and column b is +.> Wherein->Attention scores representing the attention of the mth plot are calculated as follows:
wherein the method comprises the steps ofAnd->Respectively represent feature matrix->Is the a and b line of (2)>Representing a corresponding mth head attention weight vector;
the prediction result of the output full-connection layer output is as follows:
wherein the method comprises the steps ofAnd->Representing the transformation matrix and the offset of the output fully connected layer, respectively.
Preferably, in the step S4, the flow statistics obtained in the step S1 are used as training data, the parameters of the graph discrete attention network model are updated with minimization of the objective function as the optimization target, and a road flow prediction model is finally obtained by training.
Preferably, the objective function is the Mean Absolute Error (MAE).
In a second aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, enables to implement a traffic flow prediction method based on a graph discrete attention network according to any of the first aspects.
In a third aspect, the present invention provides a road traffic flow prediction device based on a graph discrete attention network, comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the traffic flow prediction method based on a graph discrete attention network according to any one of the first aspects when executing the computer program.
Compared with the prior art, the invention has the following beneficial effects:
1) By combining the graph network, the discrete process and the attention mechanism, the invention designs a graph discrete attention mechanism that fully exploits both static and dynamic graph structure information, unlike traditional algorithms, which can only reflect the flow change at individual traffic monitoring points;
2) By combining the graph discrete attention mechanism with a sequence-to-sequence architecture, the invention designs a graph discrete attention network that can accurately predict road traffic flow.
Drawings
FIG. 1 is a flow chart of a road traffic flow prediction method based on a graph discrete attention network;
FIG. 2 is a schematic diagram of the skeleton structure of the graph discrete attention network model;
FIG. 3 is a schematic diagram of the structure of an encoder layer;
FIG. 4 is a schematic diagram of the structure of a decoder layer;
FIG. 5 is a graph of the dynamic change visualization result of the adjacency transfer matrix;
FIG. 6 is a visualization of the prediction results of the method.
Detailed Description
The invention is further illustrated and described below with reference to the drawings and specific embodiments.
In a preferred embodiment of the present invention, there is provided a traffic flow prediction method based on a graph discrete attention network, comprising the steps of:
s1, acquiring structural traffic flow data which are acquired by sensors at different positions on a road to be predicted and related to vehicle flow, grouping the structural traffic flow data according to a set interval step length, wherein each group of data comprises vehicle information passing through each sensor in an interval period corresponding to the grouping, and finally obtaining flow statistical data which are ordered according to time.
In this step, the sensor may be any sensor on the road capable of sensing vehicles, such as a gantry, an ETC toll gate, a millimeter-wave radar, a buried induction coil or a surveillance camera; to enable traffic flow prediction, the structured traffic flow data collected by the sensor should include the position information of the sensor, the vehicle pass-through time, the license plate number and the vehicle driving direction. The specific source of each item depends on the characteristics of the sensor data: for example, the position information of the sensor may be a stake number, longitude-latitude coordinates, an ID, and so on, and the vehicle pass-through time may be determined from the transaction time of an ETC gantry. If multi-source data exist, multi-source, multi-spatio-temporal-granularity data such as gantry checkpoint flow data of the road, toll station flow data, roadside millimeter-wave radar flow data and buried-coil flow data can first be fused and then used as the flow statistics in the invention.
The time-ordered flow statistics obtained in this step serve as the training data of the subsequent prediction model, so they need to be constructed into corresponding sample data through the grouping. In this embodiment, the structured collected data (including stake number, longitude-latitude coordinates, ID, vehicle pass-through time, license plate number, vehicle driving direction, etc.) may be processed in groups by a statistical processing tool, dividing the structured traffic flow data into multiple groups of flow data by interval step to obtain a flow statistics file; the specific process may be implemented with reference to the following steps:
s11, importing a structured traffic flow data file by using a Linux shell script and a python scientific computing software tool;
s12, if the structured traffic flow data has data of a plurality of vehicle driving directions, the data are firstly extracted by taking the vehicle driving directions as dimensions, then the data of each vehicle driving direction are respectively grouped and ordered, and if the structured traffic flow data have only 1 vehicle driving direction, the data are directly grouped and ordered. The grouping and ordering is as follows: converting the data from an array to a list at preset interval step length, reconstructing an index so that the index is not repeated, and sequencing the data according to the extraction date to form the list; and grouping the data according to the vehicle passing time sensed by the sensor of each vehicle, recording the vehicle information passing through each sensor in the interval period corresponding to the grouping by each group of data, and re-splicing the list into a complete data file to finally form the flow statistical data of the left opening and the right opening of the time period.
The flow statistics formed for each vehicle driving direction dimension are then used only to train a graph discrete attention network process model for that vehicle driving direction.
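The grouping in S11–S12 can be sketched as follows; the record layout (sensor id, ISO pass-through time, plate, direction) is a hypothetical field order, since the embodiment names the fields but not their format:

```python
from collections import Counter
from datetime import datetime

def bin_counts(records, step_minutes=5):
    """Group structured pass-through records into per-sensor flow counts.

    Each record is (sensor_id, pass_time_iso, plate, direction) -- an assumed
    layout. Returns {direction: Counter{(sensor_id, bin_start_iso): count}},
    so each driving direction yields its own time-keyed flow statistics.
    step_minutes must divide 60 (e.g. the 5-minute interval of the example).
    """
    out = {}
    for sensor, ts, _plate, direction in records:
        t = datetime.fromisoformat(ts)
        # Snap the pass-through time down to the start of its interval bin.
        start = t.replace(minute=t.minute - t.minute % step_minutes,
                          second=0, microsecond=0)
        out.setdefault(direction, Counter())[(sensor, start.isoformat())] += 1
    return out
```

Sorting the resulting keys by time then gives the time-ordered flow statistics used as training samples.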
S2, constructing a road map network structure aiming at the road to be predicted, taking the point position of each sensor deployment as a node of the map network structure, connecting the nodes through edges, normalizing the actual distance between the nodes on the road, and taking the normalized distance as the static weight of the edges connected between the nodes in the map. The weights of all edges in the graph network constitute the adjacency matrix.
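The normalization of inter-node distances into static edge weights can be sketched as below; the thresholded Gaussian kernel is an assumption borrowed from common road-graph practice, since the patent only states that the actual distances are normalized:

```python
import numpy as np

def static_adjacency(dist, sigma=None):
    """Build the static edge-weight (adjacency) matrix of the road graph.

    dist[a, b] is the road distance between sensor nodes a and b (np.inf if
    unconnected). Weights use a Gaussian kernel exp(-(d/sigma)^2) -- an
    assumed normalization; sigma defaults to the std of finite distances.
    """
    dist = np.asarray(dist, dtype=float)
    finite = dist[np.isfinite(dist) & (dist > 0)]
    sigma = sigma or finite.std() or 1.0
    w = np.exp(-(dist / sigma) ** 2)
    w[~np.isfinite(dist)] = 0.0          # no edge between unconnected nodes
    np.fill_diagonal(w, 1.0)             # self-loops with unit weight
    return w
```

Closer node pairs thus receive larger static weights, and the resulting matrix of all edge weights is the adjacency matrix of the graph network.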
S3, modeling traffic flow data of a road based on a graph discrete attention network, firstly constructing a spatial characteristic relation of a regional road network by using a graph discrete attention module during modeling, then constructing a time sequence characteristic relation of a decoder sequence model based on an encoder sequence, and finally forming a graph discrete attention network model.
As shown in fig. 2, the skeleton structure of the discrete attention network model comprises an encoder sequence and a decoder sequence, wherein the encoder sequence is composed of an L-1 layer encoder, the decoder sequence is composed of an L' -1 layer decoder, and an association link is arranged between the L-1 layer encoder and each layer decoder; each layer of encoder is composed of a separate graph discrete attention module, and the graph discrete attention module contains discrete attention, graph attention and summation regularization operations; historical traffic flow data and a static adjacency matrix are accessed into a first layer encoder through an input full-connection layer; the last layer of the decoder generates a traffic flow prediction result by outputting the full connection layer.
The graph discrete attention network model comprises an input full connection layer, an L-1 layer encoder, an L' -1 layer decoder and an output full connection layer;
the input full connection layer will X {t-T′+1,…,t} Andconversion to the first layer encoder feature matrix +.>And adjacency transfer matrix->First layer encoder feature matrix->And adjacency transfer matrix->
Wherein the method comprises the steps ofAnd->Respectively representing weight matrix and offset of the full connection layer corresponding to the encoder>And->Weight matrix and offset of full connection layer corresponding to decoder are respectively represented>Is->Degree matrix of (2), degree matrix->The element of the mth row and the mth column +.> Representing static adjacency matrix->An element of an mth row and an nth column;
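Row-normalizing the static adjacency by the degree matrix (d_mm = Σ_n w_mn) gives a plausible form for the adjacency transfer matrix; P = D⁻¹W is an assumption here, since the source images of the formulas are absent:

```python
import numpy as np

def transition_matrix(W):
    """Row-normalize a static adjacency matrix into a transfer matrix.

    d_mm = sum_n w_mn is the degree-matrix diagonal defined in the text;
    P = D^{-1} W (random-walk normalization) is an assumed reconstruction.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)          # degree of each node
    d[d == 0] = 1.0            # guard isolated nodes against division by zero
    return W / d[:, None]
```

Each row of the result sums to one (for connected nodes), so it can be read as transition probabilities between sensor nodes.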
As shown in FIG. 3, in the (L−1)-layer encoder, any l-th layer encoder, l ∈ [1, L−1], takes the l-th layer feature matrix and adjacency transfer matrix as input and outputs those of layer l+1. The feature update applies multi-head discrete attention with a ReLU activation, a residual connection, a weight matrix and a linear transformation matrix. Multi-head discrete attention concatenates (‖) the outputs of H attention heads; the h-th head's discrete attention is a weighted combination over k ∈ [1, K] discrete steps with discrete weight coefficients θ_k. The θ_k are obtained by a self-attention mechanism: a "Value" transformation matrix is applied, and view is a matrix reshaping operation that turns an N-row, N-column matrix into an N²-row, 1-column matrix; the reshaped matrices form the input sequence of the self-attention mechanism, and e_ij denotes the compatibility (attention score) between discrete step i and discrete step j, computed from the "Query" and "Key" transformation matrices of self-attention, where qs denotes the "Query" block size.
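A minimal sketch of how the discrete weight coefficients θ_k could be computed: each of the K adjacency-transfer matrices is flattened to an N²-vector (the `view` operation), "Query"/"Key" transforms produce the compatibilities e_ij, and a softmax yields the weights. Standard scaled dot-product attention is assumed, since the formula images are not reproduced in the source; all names are illustrative:

```python
import numpy as np

def discrete_weights(P_seq, Wq, Wk, qs):
    """Softmax attention scores over a sequence of K flattened matrices.

    P_seq: K x N x N stack of adjacency-transfer matrices; each is flattened
    to an N^2 vector to form the input sequence of the self-attention
    mechanism. Wq/Wk are the assumed "Query"/"Key" transforms (N^2 x qs)
    and qs is the query block size.
    """
    K = P_seq.shape[0]
    seq = P_seq.reshape(K, -1)                 # K x N^2, the `view` step
    q, k = seq @ Wq, seq @ Wk                  # K x qs projections
    e = q @ k.T / np.sqrt(qs)                  # pairwise compatibilities e_ij
    e -= e.max(axis=-1, keepdims=True)         # numerical stability
    theta = np.exp(e) / np.exp(e).sum(axis=-1, keepdims=True)
    return theta                               # each row sums to 1
```

Row i of `theta` then weights the K discrete steps when forming the i-th head output.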
the adjacency transfer matrixThe calculation mode of (2) is as follows:
wherein the method comprises the steps ofThe elements representing the dynamic update of the first layer, row a and column b, are found from the following multi-headed graph attention expression:
wherein m is E [1, M]A sequence number representing the attention header number, M representing the total header number of the attention header; ne (a) represents the set of neighbor nodes of node a,and->Attention score, each representing the attention of the mth head figure,/>The calculation mode of (2) is as follows:
wherein a is m (. Cndot.) represents the weight vector of the attention of the mth head,and->Respectively represent feature matrix->Is an activation function;
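The multi-head graph attention over neighbor sets Ne(a), with a per-head weight vector a_m(·) and LeakyReLU activation, reads like the standard GAT scoring rule; below is a single-head sketch under that assumption (all names are illustrative):

```python
import numpy as np

def gat_scores(H, adj, W, a_vec, alpha=0.2):
    """One head of graph attention scores over a static adjacency.

    H: N x F node features; adj: N x N (nonzero entry = edge, i.e. b in
    Ne(a)); W: F x F' linear transform; a_vec: length-2F' attention weight
    vector. Assumes every node has at least one neighbor (e.g. self-loops).
    """
    Wh = H @ W
    N = Wh.shape[0]
    e = np.full((N, N), -np.inf)               # -inf masks non-neighbors
    for a in range(N):
        for b in range(N):
            if adj[a, b]:
                z = np.concatenate([Wh[a], Wh[b]]) @ a_vec
                e[a, b] = z if z > 0 else alpha * z   # LeakyReLU
    # Softmax over each node's neighborhood (stable: subtract row max).
    e -= np.nanmax(np.where(np.isfinite(e), e, np.nan), axis=1, keepdims=True)
    att = np.exp(e)
    return att / att.sum(axis=1, keepdims=True)
```

Non-neighbors receive exactly zero attention, so the dynamic update stays confined to the edges of the static road graph.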
as shown in FIG. 4, in the L '-1 layer decoder, L E [1, L' -1 for any first layer decoder]The inputs are allAnd->And +.>And->Output is->And-> The calculation method is as follows:
wherein the method comprises the steps ofAnd->Respectively representing a weight matrix and a linear transformation matrix; />Representing the correlation of the last layer encoder module output +.>And->Is a multi-headed discrete attention; discrete attention to any headThe calculation mode is as follows:
wherein the discrete parameter is calculated asAttention score +.>The calculation mode of (a) is as follows:
wherein:transformation matrices representing "index", "key" and "value", respectively;
representing a feature matrix generated by a discrete attention module of the graph, which countsThe calculation method is as follows:
wherein the method comprises the steps ofRepresenting multi-headed discrete attention of the discrete attention module; discrete attention to any head>The calculation mode is +.>Wherein the discrete parameter is calculated asAttention score +.>The calculation mode of (a) is that Transformation matrices representing "index", "key" and "value", respectively; />And->Respectively representing a weight matrix and a linear transformation matrix;
the saidThe manner of calculation of (c) is as follows,
wherein the method comprises the steps ofRepresenting a dynamic update section whose element formula of row a and column b is +.> Wherein->Attention scores representing the attention of the mth plot are calculated as follows:
wherein the method comprises the steps ofAnd->Respectively represent feature matrix->Is the a and b line of (2)>Representing a corresponding mth head attention weight vector;
the prediction result of the output full-connection layer output is as follows:
wherein the method comprises the steps ofAnd->Representing the transformation matrix and the offset of the output fully connected layer, respectively.
S4, training the graph discrete attention network model using the flow statistics obtained in S1 to obtain a flow prediction model for the traffic monitoring points and road network, and performing actual traffic flow prediction.
During training, the objective function defined in S3 serves as the loss function of model training. The flow statistics obtained in S1 are fed into the graph discrete attention network model as training data; with minimization of the objective function as the optimization target, the parameters of the graph discrete attention network model are updated by a gradient descent algorithm, finally yielding the road flow prediction model. Preferably, the mean absolute error (Mean Absolute Error, MAE) is used as the training objective function, expressed as:

$\mathcal{L} = \frac{1}{n}\sum_{i=1}^{n} |x_i - \hat{x}_i|$

where n denotes the number of data samples, and $x_i$ and $\hat{x}_i$ denote the true and predicted values, respectively.
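Under the MAE objective, one parameter update of the kind described can be sketched as follows; the linear predictor and the plain (sub)gradient step stand in for the full model and the Adam-style optimizer actually used, purely for illustration:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: (1/n) * sum |x_i - x_hat_i|."""
    return np.mean(np.abs(y_true - y_pred))

def sgd_step(w, X, y, lr=5e-4):
    """One gradient-descent step on the MAE loss for a linear predictor
    y_hat = X @ w (an illustrative stand-in for updating the full
    graph-network model's parameters)."""
    y_hat = X @ w
    grad = X.T @ np.sign(y_hat - y) / len(y)   # subgradient of MAE
    return w - lr * grad
```

Repeating such steps over the training data drives the objective down, which is exactly the optimization target stated above.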
The traffic flow prediction method based on the graph discrete attention network model of S1 to S4 above is now applied to a specific example to demonstrate its implementation process and technical effects, so that those skilled in the art may better understand the essence of the present invention.
Examples
Data set preparation: traffic flow data from 555 monitoring points on an expressway were collected, covering the period from January 1, 2018 to January 31, 2018. The raw data contain the position information of each acquisition point, as well as the vehicle arrival time, the license plate number and a driving-direction flag. The data were grouped with an interval step of 5 minutes, yielding traffic flow statistics of arrivals at the same location within each 5-minute interval.
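The grouping step, flooring each arrival time to the start of its 5-minute interval and counting arrivals per monitoring point, can be sketched as below (the record layout is hypothetical; the raw data also carry plate numbers and direction flags, ignored here):

```python
from collections import Counter
from datetime import datetime

def five_minute_counts(records, step_minutes=5):
    """Group per-vehicle records into fixed time-interval buckets per
    monitoring point, yielding time-ordered flow counts.

    records : iterable of (point_id, arrival_datetime) pairs
    """
    step = step_minutes * 60
    counts = Counter()
    for point_id, ts in records:
        bucket = int(ts.timestamp()) // step * step  # floor to interval start
        counts[(point_id, datetime.fromtimestamp(bucket))] += 1
    # sort by (time, point) so the statistics are ordered by time
    return dict(sorted(counts.items(), key=lambda kv: (kv[0][1], kv[0][0])))
```

Each dictionary value is then the traffic flow count for one monitoring point in one 5-minute interval.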
This example divides the data set into a training set, a test set and a verification set at a ratio of 60%:30%:10% for model effect verification.
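A split in the stated 60%:30%:10% proportions can be sketched as below; keeping the partitions in chronological order rather than shuffling is one common convention for time-series data, assumed here:

```python
def chrono_split(series, ratios=(0.6, 0.3, 0.1)):
    """Split a time-ordered sequence into train / test / validation
    partitions without shuffling, preserving temporal order."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    n = len(series)
    i = round(n * ratios[0])
    j = i + round(n * ratios[1])
    return series[:i], series[i:j], series[j:]
```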
The hardware configuration of the experimental environment is: a server with an Intel i9-10900K CPU and 16 GB of DDR4 memory; the parallel computing resource is one NVIDIA GeForce RTX 3080 Ti graphics card with 12 GB of video memory.
The software configuration of the experimental environment is: Ubuntu 20.04 LTS as the operating system, with CUDA 11.1 and cuDNN 8.2.1 deployed. Environment management uses Anaconda, with Python 3.8.5 deployed; the established conda environment runs PyTorch 1.10.
The model training configuration is: the number of epochs is set to 20, Adam is used as the training optimization algorithm, and the learning rate is set to 0.0005.
The performance evaluation indices of this example are the mean absolute percentage error (Mean Absolute Percentage Error, MAPE), mean absolute error (Mean Absolute Error, MAE) and root mean square error (Root Mean Square Error, RMSE) of each prediction method:

$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|,\qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i-\hat{y}_i|,\qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}$

where $y_i$ and $\hat{y}_i$ denote the real traffic flow value and the predicted value, respectively.
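These three indices have their standard definitions; a NumPy sketch:

```python
import numpy as np

def mape(y, y_hat):
    """Mean absolute percentage error, in percent (assumes y has no zeros)."""
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))
```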
The final prediction error comparison is given in Table 1:

Table 1. Comparison of algorithm error performance; the best result for each index is highlighted.
Here, SVR is the support vector regression algorithm and GRU is the gated recurrent unit algorithm, both classical time-series methods; STGCN, DCRNN and AGCRN are from the following references, respectively:

(1) STGCN is from Yu B, Yin H, Zhu Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting [C]// IJCAI. 2018;

(2) DCRNN is from Li Y, Yu R, Shahabi C, et al. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting [C]// International Conference on Learning Representations. 2018;

(3) AGCRN is from Bai, Lei, et al. "Adaptive graph convolutional recurrent network for traffic forecasting." Advances in Neural Information Processing Systems 33 (2020): 17804-17815;
GDF denotes the method of the present invention; it can be seen that the present invention outperforms the other methods on the performance metrics. Further, the invention can quantitatively observe the encoder adjacency transfer matrix and the decoder adjacency transfer matrix, and thereby further mine traffic change information among the traffic monitoring nodes. FIG. 5 shows local numerical information of the dynamic transfer matrices of the encoder and decoder at time 0 and time 1; the change amplitude is small, matching the observed fact that flow on the monitored road section changes slowly. The invention can accurately predict traffic flow data: FIG. 6 visually compares the traffic flow predictions over 300 time stamps with the real traffic flow data, for monitoring point No. 443 of the data set starting from time stamp 667 and for monitoring point No. 539 starting from time stamp 1594. The line labeled GDF shows the prediction result of the method of the present invention, and the line labeled Ground Truth shows the real traffic flow data; the method can be seen to track the real data quite accurately.
Additionally, in other embodiments, a road traffic flow prediction device based on a graph discrete attention network may be provided, comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the traffic flow prediction method based on the graph discrete attention network as described in S1 to S4 above when executing the computer program.
In addition, in other embodiments, a computer readable storage medium may be provided, where a computer program is stored, where the computer program, when executed by a processor, can implement the traffic flow prediction method based on the graph discrete attention network as described in S1 to S4 above.
It should be noted that the Memory may include a random access Memory (Random Access Memory, RAM) or a Non-Volatile Memory (NVM), such as at least one magnetic disk Memory. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a neural network processor (Neural Processor Unit, NPU), etc.; but also digital signal processors (Digital Signal Processing, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. Of course, the apparatus should also have necessary components to implement the program operation, such as a power supply, a communication bus, and the like.
The above embodiment is only a preferred embodiment of the present invention, but it is not intended to limit the present invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, all the technical schemes obtained by adopting the equivalent substitution or equivalent transformation are within the protection scope of the invention.

Claims (8)

1. A traffic flow prediction method based on a graph discrete attention network, comprising the steps of:
s1, acquiring structural traffic flow data which are acquired by sensors at different positions on a road to be predicted and are related to vehicle flow, grouping the structural traffic flow data according to a set interval step length, wherein each group of data comprises vehicle information passing through each sensor in an interval period corresponding to the grouping, and finally obtaining flow statistical data which are ordered according to time;
s2, constructing a road map network structure aiming at a road to be predicted, taking the point position of each sensor deployment as a node of the map network structure, connecting the nodes through edges, normalizing the actual distance between the nodes on the road, and taking the normalized distance as the static weight of the connected edges between the nodes in the map;
s3, modeling traffic flow data of a road based on a graph discrete attention network, firstly constructing a spatial characteristic relation of a regional road network by using a graph discrete attention module during modeling, then constructing a time sequence characteristic relation of a decoder sequence model based on an encoder sequence, and finally forming a graph discrete attention network model;
the skeleton structure of the graph discrete attention network model comprises an encoder sequence and a decoder sequence, wherein the encoder sequence is composed of an L-1 layer encoder, the decoder sequence is composed of an L' -1 layer decoder, and an associated link is arranged between the L-1 layer encoder and each layer decoder; each layer of encoder is composed of a separate graph discrete attention module, and the graph discrete attention module contains discrete attention, graph attention and summation regularization operations; historical traffic flow data and a static adjacency matrix are accessed into a first layer encoder through an input full-connection layer; the last layer of the decoder generates a traffic flow prediction result through the output full-connection layer;
in the graph discrete attention network model, the spatial features of the traffic flow data are constructed as a graph network, where the node set contains all N nodes in the graph network, each node representing a sensor capturing road traffic flow information; ε denotes the set of edges between nodes; and the static adjacency matrix is constructed after normalizing the Euclidean distances between the different sensors; historical traffic flow data of T′-step duration with M dimensions is denoted X_{t−T′+1,…,t} = {X_{t−T′+1}, …, X_t}, where X_t denotes the traffic flow data of the t-th time step; the traffic flow data to be predicted has T-step duration with M dimensions; the goal of model training is to learn a mapping function that takes the historical traffic flow data X_{t−T′+1,…,t} and the graph network as input and predicts the traffic flow data of the future T-step duration, namely:

where ψ denotes the learnable parameters;
the graph discrete attention network model comprises an input full connection layer, an L-1 layer encoder, an L' -1 layer decoder and an output full connection layer;
the input fully-connected layer converts X_{t−T′+1,…,t} and the static adjacency matrix into the first-layer encoder feature matrix and adjacency transfer matrix, and into the first-layer decoder feature matrix and adjacency transfer matrix:

where the first weight matrix and offset are those of the fully-connected layer corresponding to the encoder, and the second weight matrix and offset are those of the fully-connected layer corresponding to the decoder; the degree matrix is that of the static adjacency matrix, its element in row m and column m being the sum over n of the elements A_mn, where A_mn denotes the element in row m and column n of the static adjacency matrix;
in the (L−1)-layer encoder, any l-th encoder layer with l ∈ [1, L−1] takes as input the feature matrix and adjacency transfer matrix of the previous layer and outputs the updated feature matrix and adjacency transfer matrix, where the feature matrix is computed as follows:

where ReLU denotes the activation function, a residual connection adds the layer input back to the attention output, and the remaining two matrices denote a weight matrix and a linear transformation matrix; the multi-head discrete attention is computed as follows:

where ∥ denotes the concatenation operation, H denotes the total number of attention heads, and each concatenated term is one discrete attention head; the discrete attention of any head is computed as follows:

where k denotes the discrete step index, K denotes the total number of discrete steps, and θ_k denotes the discrete weight coefficient, computed as follows:

where the "value" transformation matrix is that of the self-attention mechanism, and view(·) is a matrix reshaping operation that transforms a matrix of N rows and N columns into a matrix of N² rows and 1 column; the reshaped matrix is the input sequence of the self-attention mechanism; e_ij denotes the compatibility, i.e., the attention score, between discrete step i and discrete step j, computed as follows,

where the "query" transformation matrix and the "key" transformation matrix are those of self-attention, and q_s denotes the "query" block size;
the adjacency transfer matrix of the encoder is updated as follows:

where the dynamic update portion of the l-th layer has, in row a and column b, an element obtained from the following multi-head graph attention expression:

where m ∈ [1, M] indexes the attention heads and M denotes the total number of heads; Ne(a) denotes the set of neighbor nodes of node a; the normalized and raw attention scores of the m-th graph-attention head are each computed per neighbor pair, the raw score as follows:

where a_m(·) denotes the weight vector of the m-th attention head, the two feature vectors denote rows a and b of the feature matrix, and the remaining term is an activation function;
in the (L′−1)-layer decoder, any l-th decoder layer with l ∈ [1, L′−1] takes as input the feature matrix and adjacency transfer matrix of the previous decoder layer, together with the feature matrix and adjacency transfer matrix output by the last encoder layer, and outputs the updated feature matrix and adjacency transfer matrix, computed as follows:

where the first two matrices denote a weight matrix and a linear transformation matrix, respectively; the multi-head discrete attention term correlates the output of the last encoder layer with the decoder features; the discrete attention of any single head is computed as follows:

where the discrete parameter is computed as in the encoder, and the attention score is computed as follows:

where the three matrices denote the "query", "key" and "value" transformation matrices, respectively;

the feature matrix generated by the graph discrete attention module is computed as follows:

where the multi-head discrete attention of the graph discrete attention module is formed from single discrete attention heads, each computed in the same form as above, with the discrete parameter and attention score obtained from the module's own "query", "key" and "value" transformation matrices; the remaining two matrices denote a weight matrix and a linear transformation matrix, respectively;
the decoder adjacency transfer matrix is updated as follows,

where the dynamic update portion has, in row a and column b, an element given by the multi-head graph attention, in which the attention score of the m-th graph-attention head is computed as follows:

where the two feature vectors denote rows a and b of the feature matrix, and the weight vector is that of the corresponding m-th attention head;

the prediction result produced by the output fully-connected layer is:

where the transformation matrix and the offset are those of the output fully-connected layer;
and S4, training the graph discrete attention network model using the flow statistics obtained in S1 to obtain a flow prediction model for the traffic monitoring points and road network, and performing actual traffic flow prediction.
2. The traffic flow prediction method based on a graph discrete attention network according to claim 1, wherein in S1, the structured traffic flow data collected by the sensors includes the position information of the sensor, the vehicle passing time, the license plate number and the vehicle driving direction.
3. The traffic flow prediction method based on a graph discrete attention network according to claim 1, wherein in S1, the structured traffic flow data collected by the sensors is statistically processed using Linux shell scripts and Python scientific computing tools.
4. The traffic flow prediction method based on a graph discrete attention network according to claim 2, wherein in S1, if the structured traffic flow data contains data for a plurality of vehicle driving directions, the data is first separated with the vehicle driving direction as a dimension, and the data of each driving direction is then grouped and time-ordered separately to form flow statistics; the flow statistics formed for each driving direction are used only to train the graph discrete attention network model for that driving direction.
5. The traffic flow prediction method based on a graph discrete attention network according to claim 1, wherein in S4, the flow statistics obtained in S1 are used as training data; with minimization of the objective function as the optimization target, the parameters of the graph discrete attention network model are updated, finally yielding the road flow prediction model.
6. The traffic flow prediction method based on a graph discrete attention network according to claim 5, wherein the objective function is the mean absolute error function.
7. A computer readable storage medium, wherein a computer program is stored on the storage medium, which, when executed by a processor, is capable of implementing a graph-discrete-attention-network-based traffic flow prediction method according to any one of claims 1 to 6.
8. A road traffic flow prediction device based on a graph discrete attention network, which is characterized by comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the graph-discrete-attention-network-based traffic flow prediction method according to any one of claims 1 to 6 when executing the computer program.
CN202210234138.2A 2022-03-10 2022-03-10 Traffic flow prediction method, medium and equipment based on graph discrete attention network Active CN114582128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210234138.2A CN114582128B (en) 2022-03-10 2022-03-10 Traffic flow prediction method, medium and equipment based on graph discrete attention network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210234138.2A CN114582128B (en) 2022-03-10 2022-03-10 Traffic flow prediction method, medium and equipment based on graph discrete attention network

Publications (2)

Publication Number Publication Date
CN114582128A CN114582128A (en) 2022-06-03
CN114582128B true CN114582128B (en) 2023-08-04

Family

ID=81780156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210234138.2A Active CN114582128B (en) 2022-03-10 2022-03-10 Traffic flow prediction method, medium and equipment based on graph discrete attention network

Country Status (1)

Country Link
CN (1) CN114582128B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114973678B (en) * 2022-06-08 2024-04-09 广州航海学院 Traffic prediction method based on graph attention neural network and space-time big data

Citations (1)

Publication number Priority date Publication date Assignee Title
CN112071062A (en) * 2020-09-14 2020-12-11 山东理工大学 Driving time estimation method based on graph convolution network and graph attention network

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
CN110751261A (en) * 2018-07-23 2020-02-04 第四范式(北京)技术有限公司 Training method and system and prediction method and system of neural network model
US10940863B2 (en) * 2018-11-01 2021-03-09 GM Global Technology Operations LLC Spatial and temporal attention-based deep reinforcement learning of hierarchical lane-change policies for controlling an autonomous vehicle
CN111899510B (en) * 2020-07-28 2021-08-20 南京工程学院 Intelligent traffic system flow short-term prediction method and system based on divergent convolution and GAT
CN112216101B (en) * 2020-09-08 2021-08-24 吉林大学 Traffic prediction method and system based on elastic learning framework
CN113487088A (en) * 2021-07-06 2021-10-08 哈尔滨工业大学(深圳) Traffic prediction method and device based on dynamic space-time diagram convolution attention model
CN113591380B (en) * 2021-07-28 2022-03-22 浙江大学 Traffic flow prediction method, medium and equipment based on graph Gaussian process
CN113762338B (en) * 2021-07-30 2023-08-25 湖南大学 Traffic flow prediction method, equipment and medium based on multiple graph attention mechanism
CN113673769B (en) * 2021-08-24 2024-02-02 北京航空航天大学 Traffic flow prediction method of graph neural network based on multivariate time sequence interpolation
CN114091772A (en) * 2021-11-26 2022-02-25 东北大学 Multi-angle fusion road traffic flow prediction method based on encoder and decoder
CN114038200B (en) * 2021-11-29 2022-09-20 东北大学 Attention mechanism-based time-space synchronization map convolutional network traffic flow prediction method

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN112071062A (en) * 2020-09-14 2020-12-11 山东理工大学 Driving time estimation method based on graph convolution network and graph attention network

Non-Patent Citations (1)

Title
A traffic flow prediction model based on sequence-to-sequence spatio-temporal attention learning; Du Shengdong, Li Tianrui, Yang Yan, Wang Hao, Xie Peng, Hong Xijin; Journal of Computer Research and Development (No. 8); full text *

Also Published As

Publication number Publication date
CN114582128A (en) 2022-06-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant