CN116403397A - Traffic prediction method based on deep learning - Google Patents

Traffic prediction method based on deep learning

Info

Publication number
CN116403397A
Authority
CN
China
Prior art keywords
time
attention
code
spatial
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211651167.5A
Other languages
Chinese (zh)
Inventor
魏迎梅
高敏
杨雨璇
韩贝贝
谢毓湘
康来
蒋杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202211651167.5A priority Critical patent/CN116403397A/en
Publication of CN116403397A publication Critical patent/CN116403397A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125Traffic data processing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a traffic prediction method based on deep learning, which comprises the following steps: acquiring a historical characterization of traffic state information and spatiotemporal information characterizing a first number of historical time steps, and a future characterization of spatiotemporal information characterizing a second number of future time steps; processing the historical characterization by using a first BERT model to obtain a first state code; adding the first state code to the future characterization to obtain a prediction characterization; and processing the prediction characterization by using a second BERT model to obtain a predicted traffic state. In this way, the method can effectively capture the hidden spatiotemporal dependencies in traffic data and improve the accuracy of long-term prediction.

Description

Traffic prediction method based on deep learning
Technical Field
The invention belongs to the technical field of intelligent traffic, and particularly relates to a traffic prediction method based on deep learning.
Background
With the acceleration of urbanization and the rapid development of the economy, urban populations and the number of motor vehicles are continuously increasing. To improve urban operation efficiency to the greatest extent, intelligent traffic systems have been developed in many cities, and traffic prediction plays an important role in these systems. Accurate prediction results can effectively relieve urban traffic congestion and provide a more meaningful decision basis for traffic management. There are two main challenges in traffic prediction: temporal dependence and spatial dependence. Temporal dependence means that the current traffic state is affected by previous traffic states; it has characteristics such as proximity, periodicity and trend. Spatial dependence refers to the effect of the surrounding environment on the traffic conditions of a region; the influence of different adjacent regions differs, and generally, the closer the distance, the greater the impact. Temporal and spatial dependencies are always interleaved, resulting in more complex correlations.
With the great success of deep learning methods in fields such as computer vision and natural language processing, many researchers have attempted to introduce deep learning methods into traffic prediction. Convolutional neural networks (Convolutional Neural Networks, CNN) and graph neural networks (Graph Neural Network, GNN) are used to learn the spatial correlations hidden in traffic data of grid structure and graph structure, respectively. The recurrent neural network (Recurrent Neural Network, RNN) is instructive for modeling temporal correlation. The RNN variants, the long short-term memory (LSTM) model and the gated recurrent unit (GRU), can be applied to predict short-term traffic flow, as they solve the gradient explosion and gradient vanishing problems of the conventional RNN model.
However, the traditional RNN model still has shortcomings in capturing temporal dependence. In traffic prediction, the traffic state of the current time period may be affected by traffic states from long before, yet conventional RNN models have difficulty remembering states that far back; that is, there is a long-term dependence problem. In addition, such existing machine learning methods can only model the dependency in time and cannot capture the dependency in space.
Disclosure of Invention
The invention provides a traffic prediction method based on deep learning, aiming to solve the problem of low accuracy in existing long-term traffic state prediction.
In order to solve the technical problems, the invention provides a traffic prediction method based on deep learning, which comprises the following steps: acquiring historical representations of traffic state information and spatiotemporal information representing a first number of historical time steps and future representations of spatiotemporal information representing a second number of future time steps; processing the history characterization by using a first BERT model to obtain a first state code; adding the first state code to the future representation to obtain a predictive representation; and processing the predictive representation by using a second BERT model to obtain a predicted traffic state.
Optionally, the applying a first BERT model to process the history characterization to obtain a first state code includes: performing time attention calculation on the history characterization to acquire a first time attention code; performing spatial attention calculation on the first time attention code to acquire a first spatial attention code; and carrying out layer normalization processing on the first space attention code to obtain a first state code.
Optionally, the performing a time attention calculation on the history characterization to obtain a first time attention code includes: decomposing the history characterization into time steps and node granularity, and calculating a time input vector of the time attention of the current layer of any time step of any node in the first BERT model according to the history characterization, wherein the time input vector comprises a time query vector, a time key vector and a time value vector; calculating a first time attention weight of the current layer by applying an activation function according to the time query vector and the time key vector; and carrying out weighted summation on the first time attention weight of the current layer and the time value vector, and carrying out residual connection with the first time attention code of the previous layer to obtain the first time attention code of the current layer.
Optionally, the performing spatial attention calculation on the first temporal attention code to obtain a first spatial attention code includes: calculating a spatial input vector of the spatial attention of the current layer of any time step of any node in the first BERT model according to the first spatial attention code, wherein the spatial input vector comprises a spatial query vector, a spatial key vector and a spatial value vector; calculating a first spatial attention weight of the current layer by applying an activation function according to the spatial query vector and the spatial key vector; and carrying out weighted summation on the first spatial attention weight of the current layer and the spatial value vector, and carrying out residual connection with the first time attention code of the current layer to obtain the first spatial attention code of the current layer.
Optionally, the performing layer normalization processing on the first spatial attention code to obtain a first state code includes: processing the first spatial attention code using a feed forward network; and superposing the first space attention code with the output of the feedforward network to obtain a first state code.
Optionally, the adding the first state code to the future representation to obtain a predicted representation includes: if the first number is less than the second number, performing random number filling on the first state code of the first number of time steps, and expanding to the second number of time steps; if the first number is greater than the second number, zero padding the future representation of the second number of time steps to the first number of time steps.
Optionally, the applying a second BERT model to process the predictive representation to obtain a predicted traffic state includes: performing time attention calculation on the predictive representation to obtain a second time attention code; and performing spatial attention calculation on the second time attention code, obtaining a second spatial attention code, performing layer normalization processing on the second spatial attention code, and obtaining a second state code, wherein the second state code is the predicted traffic state.
Optionally, the performing a temporal attention calculation on the predictive representation, and acquiring the second temporal attention code includes: decomposing the predictive representation into time steps and node granularity, and calculating a time input vector of the time attention of the current layer of any time step of any node in the second BERT model according to the predictive representation, wherein the time input vector comprises a time query vector, a time key vector and a time value vector; calculating a second time attention weight of the current layer by applying an activation function according to the time query vector and the time key vector; and carrying out weighted summation on the second time attention weight of the current layer and the time value vector, and carrying out residual connection with the second time attention code of the previous layer to obtain the second time attention code of the current layer.
Based on the same inventive concept, the embodiment of the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the method according to any one of the previous claims.
Based on the same inventive concept, the embodiment of the invention also provides a computer storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to execute the method of any one of the foregoing claims.
From the above, the present invention provides a traffic prediction method based on deep learning, comprising: acquiring historical representations of traffic state information and spatiotemporal information representing a first number of historical time steps and future representations of spatiotemporal information representing a second number of future time steps; processing the history characterization by using a first BERT model to obtain a first state code; adding the first state code to the future representation to obtain a predictive representation; and processing the prediction characterization by using a second BERT model to obtain a predicted traffic state, so that the hidden space-time dependence in traffic data can be effectively captured, and the accuracy of long-term prediction is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a traffic prediction method based on deep learning in an embodiment of the invention;
FIG. 2 is a schematic representation of input representation in a deep learning-based traffic prediction method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of separation spatiotemporal attention in an embodiment of the invention;
FIG. 4 is a flowchart of a method for obtaining a first state code according to an embodiment of the present invention;
FIG. 5 is a schematic view of traffic prediction based on deep learning in an embodiment of the invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present invention should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present disclosure pertains. The terms "first," "second," and the like, as used in embodiments of the present invention, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising," "comprises," or the like means that the element or item preceding the word encompasses the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected," "coupled," and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," etc. are used merely to indicate relative positional relationships, which may change accordingly when the absolute position of the object described changes.
The embodiment of the invention provides a traffic prediction method based on deep learning, as shown in fig. 1, the traffic prediction method based on deep learning comprises the following steps:
step S1: a historical representation of traffic state information and spatiotemporal information characterizing a first number P of historical time steps is obtained, and a future representation of spatiotemporal information characterizing a second number Q of future time steps is obtained.
Bidirectional Encoder Representations from Transformers (BERT) is a milestone model in natural language processing. It includes an attention mechanism that reduces the distance between any two time steps in the time window to 1, effectively solving the long-term dependence problem. Furthermore, BERT is a pre-trained model that can be equipped with different lightweight output heads for different tasks, without a custom model having to be designed individually for each specific task. Accordingly, BERT is expected to serve as a generic model of traffic states that can be used for a number of downstream tasks, such as traffic state classification and traffic state clustering. Aiming at the shortcomings of traditional methods in capturing spatial and temporal dependencies, and at the advantages of the BERT model, the embodiment of the invention modifies the BERT model into a model suitable for traffic prediction scenarios, referred to as TPBERT for short.
Before step S1, the road network is represented as a directed graph G = (V, E, A), where V is the set of nodes, E is the set of edges between nodes, and A is the adjacency matrix. In particular, N = |V| is the number of nodes, and the entry a_ij ∈ A represents the physical distance between nodes v_i and v_j. The traffic state of all vertices on the directed graph G at time step t is represented by a matrix X_t ∈ R^{N×C}, where C is the number of traffic state observations. Based on the directed graph G and the traffic state data observed over P historical time steps, the traffic prediction task can be expressed as learning a function f that predicts the traffic state for the future Q time steps:

f( G, (X_{t-P+1}, …, X_t) ) = (X_{t+1}, …, X_{t+Q})
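For concreteness, a minimal sketch of the tensor shapes this formulation implies, in Python with PyTorch; all concrete numbers here are illustrative assumptions, not values from the patent:

```python
import torch

# Illustrative sizes: nodes, state channels, history length, horizon length
N, C, P, Q = 207, 2, 12, 12

A = torch.rand(N, N)       # adjacency matrix of physical distances a_ij
X = torch.randn(P, N, C)   # observed history (X_{t-P+1}, ..., X_t)
Y = torch.randn(Q, N, C)   # prediction target (X_{t+1}, ..., X_{t+Q})
# The learning task is to find f with f(G, X) = Y.
```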
Since traffic prediction is subject to temporal and spatial dependencies, it is important to encode time and space information and incorporate them into the model. In addition, considering the positional relationship between the historical time steps and the future time steps, the traffic data and its time information, space information and position information need to be encoded to obtain traffic state embedded information, time embedded information, space embedded information and position embedded information. All embedding sizes are set to D. Each kind of information is described as follows:
Traffic state embedded information: the original traffic state observation at time step t is represented as X_t ∈ R^{N×C}. To keep its size consistent with the other embedding size D, X_t is passed through a fully connected network to obtain the final representation E^S_t ∈ R^{N×D}.
Time embedded information: periodicity is an important feature of the time dependence in traffic prediction, and the time embedding mainly comprises daily periodicity and weekly periodicity. Daily periodicity means that traffic states are more similar at the same time of day; weekly periodicity means that traffic states on the same day of the week follow the same pattern. For example, with seven days in a week, 7 different embedding vectors are needed to represent the weekly periodicity. The representation of the daily periodicity depends on the time interval of data collection: assuming a time interval of 5 minutes, there are 24 × 60 / 5 = 288 time steps in a day, so 288 different embedding vectors are used to represent the daily periodicity. The daily-periodic and weekly-periodic embeddings of the present invention are randomly initialized. The time embedded information E^T_t ∈ R^D is obtained by adding the daily-periodic and weekly-periodic embeddings, and is continuously updated during training.
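A minimal sketch of how the two periodic embeddings could be realized, assuming PyTorch; the class and argument names are illustrative, not from the patent:

```python
import torch
import torch.nn as nn

class TimeEmbedding(nn.Module):
    """Sum of learned daily-periodic and weekly-periodic embeddings."""
    def __init__(self, d_model: int, steps_per_day: int = 288):  # 24*60/5 = 288 for 5-min data
        super().__init__()
        self.daily = nn.Embedding(steps_per_day, d_model)   # one vector per time-of-day slot
        self.weekly = nn.Embedding(7, d_model)              # one vector per day of the week

    def forward(self, slot_of_day: torch.Tensor, day_of_week: torch.Tensor) -> torch.Tensor:
        # Both index tensors have shape (T,); the output has shape (T, d_model).
        return self.daily(slot_of_day) + self.weekly(day_of_week)
```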
Spatial embedding information: BERT can encode elements in a sequence and the relationships between them, but cannot model spatial dependencies. To solve this problem, spatial embedding information is proposed based on graph embedding, which preserves the key information of each node in a vector. The node representations are learned with a node embedding algorithm based on biased random walks, in which the hyper-parameters p and q control the walking strategy. All node representation vectors are pre-trained to serve as the spatial embedding, expressed as E^{SP} ∈ R^{N×D}.

Position embedding information: there are two options for position coding, absolute position coding and relative position coding. The embodiments of the invention choose relative position coding, because absolute position coding requires the position within the whole time sequence to be known, whereas relative position coding does not. For the consecutive P historical time steps and Q future time steps, the relative positions can be encoded by P+Q different embeddings. Like the time embedded information, the position embedded information E^{PE}_t ∈ R^D is also randomly initialized and updated during training.
As shown in fig. 2, the historical characterization includes historical traffic state embedded information, historical time embedded information, historical space embedded information, and historical position embedded information, and the future characterization includes future time embedded information, future space embedded information, and future position embedded information. That is, the history characterization of history time step t is e_t = E^S_t + E^T_t + E^{SP} + E^{PE}_t, and the future characterization of future time step t is ê_t = E^T_t + E^{SP} + E^{PE}_t.
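The summation of fig. 2 could then be assembled as in the following sketch; the broadcasting of per-step and per-node terms is an assumption about how the shapes combine:

```python
import torch

def build_tokens(e_state, e_time, e_space, e_pos):
    """
    e_state: (T, N, D) traffic-state embedding, or None for the future window
    e_time:  (T, D)    daily + weekly periodic embedding
    e_space: (N, D)    pre-trained graph (node) embedding
    e_pos:   (T, D)    relative position embedding over the P+Q window
    returns: (T, N, D) summed characterization
    """
    tok = e_time[:, None, :] + e_space[None, :, :] + e_pos[:, None, :]
    if e_state is not None:          # history tokens also carry the observed state
        tok = tok + e_state
    return tok
```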
Step S2: and processing the history characterization by using a first BERT model to obtain a first state code.
To capture the hidden temporal and spatial dependencies in traffic state data, temporal attention and spatial attention are computed in turn using a separated spatiotemporal attention mechanism. As shown in fig. 3, the input is first passed to the temporal attention to capture the temporal dependence, and then to the spatial attention to capture the spatial dependence, producing the final output. Note that attention is calculated for each node in the road network; that is, each node can serve as a query. As shown in fig. 4, step S2 includes:
step S21: and performing time attention calculation on the history characterization to acquire a first time attention code.
Decomposing the history characterization to the granularity of time step t and node v, the time input vectors of the time attention of the current layer l, for any node v at any time step t in the first BERT model, are calculated from the history characterization. The time input vectors comprise a time query vector q, a time key vector k and a time value vector (written u here to avoid confusion with the node index v). Per attention head, the calculation formulas are:

q_{t,v}^{(l,a)} = LN(h_{t,v}^{(l-1)}) W_Q^{(l,a)}
k_{t,v}^{(l,a)} = LN(h_{t,v}^{(l-1)}) W_K^{(l,a)}
u_{t,v}^{(l,a)} = LN(h_{t,v}^{(l-1)}) W_V^{(l,a)}

where LN denotes the layer normalization operation, a denotes the a-th attention head, h^{(l-1)} is the output of the previous layer (h^{(0)} being the history characterization), and W_Q, W_K, W_V are learned projection matrices. Assuming that the total number of attention heads is A, the dimension of each attention head is D_h = D / A.

The first time attention weight of the current layer l is calculated by applying an activation function to the time query vector and the time key vector:

α_{t,t',v}^{(l,a)} = SM( (q_{t,v}^{(l,a)})ᵀ k_{t',v}^{(l,a)} / √(D_h) )

wherein SM is an activation function, preferably the softmax activation function in the embodiment of the present invention.

The first time attention weights of the current layer l are then used in a weighted summation of the time value vectors, and a residual connection with the first time attention code of the previous layer gives the first time attention code of the current layer l:

s_{t,v}^{(l)} = h_{t,v}^{(l-1)} + W_O^{(l)} [ ∥_{a=1}^{A} Σ_{t'} α_{t,t',v}^{(l,a)} u_{t',v}^{(l,a)} ]

where ∥ denotes concatenation over the A attention heads and W_O is a learned output projection.
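A sketch of the per-node temporal attention of step S21, assuming PyTorch; module and variable names are illustrative, and folding the node axis into the batch is one possible realization of computing attention independently per node:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Multi-head attention over the time axis, computed independently per node."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0          # head dimension D_h = D / A must be integral
        self.h, self.dh = n_heads, d_model // n_heads
        self.ln = nn.LayerNorm(d_model)        # the LN(...) preceding the q/k/v projections
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, N, D); fold the node axis into the batch so each node attends over time only
        B, T, N, D = x.shape
        z = self.ln(x).permute(0, 2, 1, 3).reshape(B * N, T, D)
        q, k, v = self.qkv(z).chunk(3, dim=-1)

        def heads(t: torch.Tensor) -> torch.Tensor:   # (B*N, T, D) -> (B*N, h, T, dh)
            return t.view(B * N, T, self.h, self.dh).transpose(1, 2)

        q, k, v = heads(q), heads(k), heads(v)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.dh ** 0.5, dim=-1)
        s = (attn @ v).transpose(1, 2).reshape(B * N, T, D)      # concatenation over heads
        s = self.out(s).reshape(B, N, T, D).permute(0, 2, 1, 3)  # back to (B, T, N, D)
        return x + s                                             # residual connection
```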
step S22: encoding the first time attention
Figure BDA0004010686140000068
Performing spatial attention calculation to obtain the first spatial attention code +.>
Figure BDA0004010686140000069
In an embodiment of the invention, temporal attention encoding
Figure BDA00040106861400000610
Is the input to calculate the spatial attention. That is, the new spatial query vector, spatial key vector and spatial value vector are composed of +.>
Figure BDA00040106861400000611
Obtained, still used herein
Figure BDA00040106861400000612
The representation is not described in detail.
In step S22, a code is encoded according to the first temporal attention
Figure BDA00040106861400000613
Calculating a spatial input vector of spatial attention of the current layer l of any time step t of any node v in the first BERT model, the spatial input vector comprising a spatial query vector +.>
Figure BDA00040106861400000614
Space key vector->
Figure BDA00040106861400000615
Spatial value vector +.>
Figure BDA00040106861400000616
The calculation formula is the same as that in step S21, and will not be described here again.
From the spatial query vector
Figure BDA00040106861400000617
And the spatial key vector->
Figure BDA00040106861400000618
Calculating a first spatial attention weight of the current layer/using an activation function>
Figure BDA00040106861400000619
Figure BDA0004010686140000071
Said first spatial attention weight to the current layer/
Figure BDA0004010686140000072
And the spatial value vector->
Figure BDA0004010686140000073
Weighted summation +.>
Figure BDA0004010686140000074
And residual connection is carried out with the first time attention code of the current layer l to obtain the first space attention code of the current layer l +.>
Figure BDA0004010686140000075
The calculation formula is as follows:
Figure BDA0004010686140000076
Figure BDA0004010686140000077
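Spatial attention in step S22 mirrors the temporal computation with the axes swapped: the time axis is folded into the batch and each time step attends over the N nodes. A sketch, reusing the TemporalAttention class from the previous sketch:

```python
import torch

class SpatialAttention(TemporalAttention):
    """Multi-head attention over the node axis within each time step."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Swap the time and node axes, attend (now over nodes), swap back.
        return super().forward(x.transpose(1, 2)).transpose(1, 2)
```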
step S23: encoding the first spatial attention
Figure BDA0004010686140000078
Performing layer normalization to obtain a first state code +.>
Figure BDA0004010686140000079
Optionally, the first spatial attention is first encoded using a feed forward network
Figure BDA00040106861400000710
And (5) processing. Then the first spatial attention code is superimposed with the output of the feed forward network to obtain a first state code +.>
Figure BDA00040106861400000711
The calculation formula is as follows:
Figure BDA00040106861400000712
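A sketch of the feed-forward sub-block of step S23; the GELU activation and the hidden width of 4·D follow common Transformer practice and are assumptions, not values from the patent:

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise feed-forward sub-block with residual connection (step S23)."""
    def __init__(self, d_model: int, expansion: int = 4):
        super().__init__()
        self.ln = nn.LayerNorm(d_model)
        self.net = nn.Sequential(
            nn.Linear(d_model, expansion * d_model),
            nn.GELU(),
            nn.Linear(expansion * d_model, d_model),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # First state code = spatial attention code + feed-forward output
        return z + self.net(self.ln(z))
```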
step S3: and adding the first state code and the future representation to obtain a prediction representation.
In both inputs, the history characterization and the future characterization, the time, space and position information of the future token is known. Thus, future information must be incorporated into the model through a secondary input to complete the transition from history to future. The first state code spans the first number P of time steps, while the future characterization spans the second number Q of time steps. When P is not equal to Q, i.e. when the dimensions of the future characterization and the first state code are inconsistent: if the first number is smaller than the second number, random number filling is performed on the first state code of the first number of time steps, expanding it to the second number of time steps; if the first number is greater than the second number, zero padding is performed on the future characterization of the second number of time steps, expanding it to the first number of time steps. In this way, the dimensions of the first state code and of the future characterization are guaranteed to be consistent.
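A sketch of the dimension alignment and addition of step S3 (random-number filling when P < Q, zero padding of the future representation when P > Q); the function name and batched shapes are assumptions:

```python
import torch

def align_and_add(h_state: torch.Tensor, e_future: torch.Tensor) -> torch.Tensor:
    """
    h_state:  (B, P, N, D) first state code from the first BERT model
    e_future: (B, Q, N, D) future characterization
    Returns their sum after aligning the time dimensions.
    """
    P, Q = h_state.shape[1], e_future.shape[1]
    if P < Q:    # random-number filling of the state code out to Q time steps
        pad = torch.randn(h_state.shape[0], Q - P, *h_state.shape[2:],
                          device=h_state.device, dtype=h_state.dtype)
        h_state = torch.cat([h_state, pad], dim=1)
    elif P > Q:  # zero padding of the future representation out to P time steps
        pad = torch.zeros(e_future.shape[0], P - Q, *e_future.shape[2:],
                          device=e_future.device, dtype=e_future.dtype)
        e_future = torch.cat([e_future, pad], dim=1)
    return h_state + e_future
```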
Step S4: and processing the predictive representation by using a second BERT model to obtain a predicted traffic state.
The structure of the TPBERT model is shown in fig. 5: the entire TPBERT model is built up from 2L layers. The first L layers are the operation layers of the first BERT model, the last L layers are the operation layers of the second BERT model, and the second BERT model is identical in structure to the first BERT model. The front layers extract the abstract information of the history characterization, and the back layers combine it with the future characterization to make the corresponding prediction. The history characterization is expressed as E_h ∈ R^{P×N×D}, and the future characterization as E_f ∈ R^{Q×N×D}. E_h is fed into the front L layers, generating an output H^L ∈ R^{P×N×D}. When P = Q, or when the output H^L and the future characterization E_f have been brought to the same dimensions by the expansion of step S3, H^L and E_f are added to obtain E'_f ∈ R^{Q×N×D}. E'_f is fed into the back L layers, producing an output E_p ∈ R^{Q×N×D}, i.e. the prediction characterization. To obtain the final prediction Ŷ ∈ R^{Q×N×C}, E_p is input to a fully connected neural network.
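End to end, the 2L-layer pipeline of fig. 5 could be wired up as in the following sketch, reusing the TemporalAttention, SpatialAttention, FeedForward and align_and_add sketches above (all of these names are assumptions):

```python
import torch
import torch.nn as nn

class STLayer(nn.Module):
    """One layer: temporal attention -> spatial attention -> feed-forward."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.t_attn = TemporalAttention(d_model, n_heads)
        self.s_attn = SpatialAttention(d_model, n_heads)
        self.ffn = FeedForward(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, T, N, D)
        return self.ffn(self.s_attn(self.t_attn(x)))

class TPBERT(nn.Module):
    """Front L layers over history tokens, fuse future tokens, back L layers."""
    def __init__(self, d_model: int, n_heads: int, L: int, c_out: int):
        super().__init__()
        self.front = nn.Sequential(*[STLayer(d_model, n_heads) for _ in range(L)])
        self.back = nn.Sequential(*[STLayer(d_model, n_heads) for _ in range(L)])
        self.head = nn.Linear(d_model, c_out)              # final fully connected network

    def forward(self, e_h: torch.Tensor, e_f: torch.Tensor) -> torch.Tensor:
        # e_h: (B, P, N, D) history characterization; e_f: (B, Q, N, D) future characterization
        h_L = self.front(e_h)                       # first state code H^L
        e_p = self.back(align_and_add(h_L, e_f))    # prediction characterization E_p
        return self.head(e_p)                       # predicted traffic state
```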
In step S4, a time attention calculation is performed on the prediction characterization to obtain a second time attention code; a spatial attention calculation is performed on the second time attention code to obtain a second spatial attention code; and layer normalization processing is performed on the second spatial attention code to obtain a second state code, wherein the second state code is the predicted traffic state. The analysis and calculation process is the same as that in step S2, with the history characterization data in the original formulas replaced by the prediction characterization data.
The following experiments were conducted on the deep-learning-based traffic prediction method of the present embodiment. As shown in Table 1, the TPBERT model of the present embodiment was evaluated using two common real-world data sets, METR-LA and PeMS-BAY; the time step of both data sets is 5 minutes, and short-, medium- and long-term predictions are represented by 3, 6 and 12 steps, respectively. METR-LA and PeMS-BAY are traffic data sets of two different scales. The mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE) are the three measures of model performance.
Table 1 test comparative results
(The comparative results of Table 1 appear as images in the original publication.)
HA, ARIMA, SVR, FNN, FC-LSTM, DCRNN, STGCN, MRA-BGCN, Graph WaveNet, STAWnet, MTGNN and GMAN in Table 1 are other different types of predictive models. HA is a predictive model that uses a weighted average of the historical time series as the prediction result; ARIMA with Kalman filtering is a statistical model for time-series prediction and analysis; SVR is a model that treats traffic prediction as a regression task and predicts with the help of a support vector machine; FNN is a prediction model consisting of two dense layers and L2 regularization; FC-LSTM is an encoder-decoder prediction model; DCRNN is a prediction model that captures spatial and temporal correlations using bipartite-graph random walks and RNNs; STGCN is a prediction model built on spatiotemporal convolution blocks that integrates graph convolution and gated temporal convolution; MRA-BGCN is a prediction model that introduces bicomponent graph convolution and a multi-range attention mechanism to integrate traffic information from different neighbors; Graph WaveNet is a prediction model that learns long-sequence information using an adaptive dependency matrix and one-dimensional convolution; STAWnet is a prediction model that uses self-learned node embeddings to represent potential spatial relationships; MTGNN is a multivariate time-series prediction model consisting of graph structure learning, graph convolution and temporal convolution; and GMAN is an encoder-decoder prediction model equipped with various attention mechanisms, such as spatial attention, temporal attention, and transform attention. The experimental results show that the deep-learning-based traffic prediction method of the present embodiment improves the accuracy of traffic prediction. In short-term prediction, MRA-BGCN performs best on both data sets. In medium- and long-term prediction, TPBERT performs better than the other models on both data sets. Across the data sets, the prediction errors on METR-LA are larger than on PeMS-BAY, indicating that METR-LA traffic conditions are more complex than those of the BAY area, while TPBERT performs well on the more challenging METR-LA, indicating that TPBERT has significant capability for modeling complex traffic data.
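For reference, the three reported measures can be computed as in the following sketch (masking of missing values, as is common practice for METR-LA, is omitted):

```python
import torch

def mae(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
    """Mean absolute error."""
    return (pred - true).abs().mean()

def rmse(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
    """Root mean square error."""
    return ((pred - true) ** 2).mean().sqrt()

def mape(pred: torch.Tensor, true: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Mean absolute percentage error; eps guards against division by zero."""
    return ((pred - true).abs() / (true.abs() + eps)).mean()
```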
The embodiment of the invention acquires a historical characterization of traffic state information and spatiotemporal information characterizing a first number of historical time steps, and a future characterization of spatiotemporal information characterizing a second number of future time steps; processes the historical characterization by using a first BERT model to obtain a first state code; adds the first state code to the future characterization to obtain a prediction characterization; and processes the prediction characterization by using a second BERT model to obtain the predicted traffic state. This improves the accuracy of long-term prediction and facilitates capturing the hidden spatiotemporal dependencies in traffic data.
Based on the same inventive concept, the embodiment of the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, and is characterized in that the processor executes the program to implement the method according to any one of the preceding claims.
Fig. 6 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 601, a memory 602, an input/output interface 603, a communication interface 604, and a bus 605. Wherein the processor 601, the memory 602, the input/output interface 603 and the communication interface 604 are communicatively coupled to each other within the device via a bus 605.
The processor 601 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided by the embodiments of the present invention.

The memory 602 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 602 may store an operating system and other application programs; when the technical solutions provided by the embodiments of the present invention are implemented in software or firmware, the relevant program code is stored in the memory 602 and invoked by the processor 601 for execution.
The input/output interface 603 is used for connecting with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The communication interface 604 is used to connect a communication module (not shown in the figure) to enable the present device to interact with other devices for communication. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
The bus 605 includes a path to transfer information between the various components of the device, such as the processor 601, memory 602, input/output interfaces 603, and communication interfaces 604.
It should be noted that although the above device only shows the processor 601, the memory 602, the input/output interface 603, the communication interface 604, and the bus 605, in the implementation, the device may further include other components necessary for realizing normal operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary for implementing the embodiments of the present invention, and not all the components shown in the drawings.
Based on the same inventive concept, the embodiments of the present invention also provide a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform the method of any one of the foregoing.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined under the idea of the present disclosure, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the present invention as described above, which are not provided in details for the sake of brevity.
The present embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Accordingly, any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the embodiments of the invention, are intended to be included within the scope of the present disclosure.

Claims (10)

1. The traffic prediction method based on the deep learning is characterized by comprising the following steps of:
acquiring historical characterization of traffic state information and spatiotemporal information characterizing a first number of historical time steps and future characterization of spatiotemporal information characterizing a second number of future time steps;
processing the history characterization by using a first BERT model to obtain a first state code;
adding the first state code to the future representation to obtain a predictive representation;
and processing the predictive representation by using a second BERT model to obtain a predicted traffic state.
2. The deep learning-based traffic prediction method according to claim 1, wherein the applying a first BERT model to process the history characterization to obtain a first state code includes:
performing time attention calculation on the history characterization to acquire a first time attention code;
performing spatial attention calculation on the first time attention code to acquire a first spatial attention code;
and carrying out layer normalization processing on the first space attention code to obtain a first state code.
3. The deep learning based traffic prediction method according to claim 2, wherein the performing a temporal attention calculation on the history characterization to obtain a first temporal attention code comprises:
decomposing the history characterization into time steps and node granularity, and calculating a time input vector of the time attention of the current layer of any time step of any node in the first BERT model according to the history characterization, wherein the time input vector comprises a time query vector, a time key vector and a time value vector;
calculating a first time attention weight of the current layer by applying an activation function according to the time query vector and the time key vector;
and carrying out weighted summation on the first time attention weight of the current layer and the time value vector, and carrying out residual connection with the first time attention code of the previous layer to obtain the first time attention code of the current layer.
4. The deep learning based traffic prediction method according to claim 2, wherein the performing spatial attention calculation on the first temporal attention code to obtain a first spatial attention code includes:
calculating a spatial input vector of the spatial attention of the current layer of any time step of any node in the first BERT model according to the first time attention code, wherein the spatial input vector comprises a spatial query vector, a spatial key vector and a spatial value vector;
calculating a first spatial attention weight of the current layer by applying an activation function according to the spatial query vector and the spatial key vector;
and carrying out weighted summation on the first spatial attention weight of the current layer and the spatial value vector, and carrying out residual connection with the first time attention code of the current layer to obtain the first spatial attention code of the current layer.
5. The traffic prediction method based on deep learning of claim 2, wherein the performing layer normalization processing on the first spatial attention code to obtain a first state code includes:
processing the first spatial attention code using a feed forward network;
and superposing the first space attention code with the output of the feedforward network to obtain a first state code.
6. The deep learning based traffic prediction method according to claim 1, wherein said adding the first state code to the future representation to obtain a predicted representation comprises:
if the first number is less than the second number, performing random number filling on the first state code of the first number of time steps, and expanding to the second number of time steps;
if the first number is greater than the second number, zero padding the future representation of the second number of time steps to the first number of time steps.
7. The deep learning-based traffic prediction method according to claim 1, wherein the applying the second BERT model to process the predicted representation to obtain the predicted traffic state includes:
performing time attention calculation on the predictive representation to obtain a second time attention code;
and performing spatial attention calculation on the second time attention code, obtaining a second spatial attention code, performing layer normalization processing on the second spatial attention code, and obtaining a second state code, wherein the second state code is the predicted traffic state.
8. The deep learning based traffic prediction method according to claim 7, wherein said performing a temporal attention calculation on the predicted representation, obtaining a second temporal attention code comprises:
decomposing the predictive representation into time steps and node granularity, and calculating a time input vector of the time attention of the current layer of any time step of any node in the second BERT model according to the predictive representation, wherein the time input vector comprises a time query vector, a time key vector and a time value vector;
calculating a second time attention weight of the current layer by applying an activation function according to the time query vector and the time key vector;
and carrying out weighted summation on the second time attention weight of the current layer and the time value vector, and carrying out residual connection with the second time attention code of the previous layer to obtain the second time attention code of the current layer.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 8.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the method of any one of claims 1 to 8.
CN202211651167.5A 2022-12-21 2022-12-21 Traffic prediction method based on deep learning Pending CN116403397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211651167.5A CN116403397A (en) 2022-12-21 2022-12-21 Traffic prediction method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211651167.5A CN116403397A (en) 2022-12-21 2022-12-21 Traffic prediction method based on deep learning

Publications (1)

Publication Number Publication Date
CN116403397A true CN116403397A (en) 2023-07-07

Family

ID=87008129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211651167.5A Pending CN116403397A (en) 2022-12-21 2022-12-21 Traffic prediction method based on deep learning

Country Status (1)

Country Link
CN (1) CN116403397A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094451A (en) * 2023-10-20 2023-11-21 邯郸欣和电力建设有限公司 Power consumption prediction method, device and terminal
CN117094451B (en) * 2023-10-20 2024-01-16 邯郸欣和电力建设有限公司 Power consumption prediction method, device and terminal

Similar Documents

Publication Publication Date Title
Veres et al. Deep learning for intelligent transportation systems: A survey of emerging trends
KR101880907B1 (en) Method for detecting abnormal session
CN113487088A (en) Traffic prediction method and device based on dynamic space-time diagram convolution attention model
Chen et al. A novel reinforced dynamic graph convolutional network model with data imputation for network-wide traffic flow prediction
CN114299723B (en) Traffic flow prediction method
CN104268594A (en) Method and device for detecting video abnormal events
CN111199216B (en) Motion prediction method and system for human skeleton
Stanić et al. R-sqair: Relational sequential attend, infer, repeat
CN116187555A (en) Traffic flow prediction model construction method and prediction method based on self-adaptive dynamic diagram
CN110570035A (en) people flow prediction system for simultaneously modeling space-time dependency and daily flow dependency
CN111091711A (en) Traffic control method and system based on reinforcement learning and traffic lane competition theory
CN116403397A (en) Traffic prediction method based on deep learning
CN112307883A (en) Training method, training device, electronic equipment and computer readable storage medium
CN113015983A (en) Autonomous system including continuous learning world model and related methods
CN114491125A (en) Cross-modal figure clothing design generation method based on multi-modal codebook
CN116543351A (en) Self-supervision group behavior identification method based on space-time serial-parallel relation coding
Wei et al. Deterministic ship roll forecasting model based on multi-objective data fusion and multi-layer error correction
CN116109021B (en) Travel time prediction method, device, equipment and medium based on multitask learning
Li et al. An effective self-attention-based hybrid model for short-term traffic flow prediction
WO2023179609A1 (en) Data processing method and apparatus
CN112215193A (en) Pedestrian trajectory prediction method and system
Almalki et al. Forecasting method based upon gru-based deep learning model
KR102561799B1 (en) Method and system for predicting latency of deep learning model in device
CN116822722A (en) Water level prediction method, system, device, electronic equipment and medium
Huang et al. Multistep coupled graph convolution with temporal-attention for traffic flow prediction

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination