CN117133129A - Traffic speed prediction method based on a multi-component attention graph neural network
- Publication number
- CN117133129A (application number CN202311394555.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G08G1/0129: Traffic data processing for creating historical data or processing based on historical data
- G06N3/042: Knowledge-based neural networks; Logical representations of neural networks
- G06N3/0442: Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045: Combinations of networks
- G06N3/0464: Convolutional networks [CNN, ConvNet]
- G06N3/048: Activation functions
- G06N3/08: Learning methods
- G08G1/0108: Measuring and analyzing of parameters relative to traffic conditions based on the source of data
- G08G1/0137: Measuring and analyzing of parameters relative to traffic conditions for specific applications
- G08G1/052: Detecting movement of traffic to be counted or controlled with provision for determining speed or overspeed
- Y02T10/40: Engine management systems
Abstract
The invention discloses a traffic speed prediction method based on a multi-component attention graph neural network, which belongs to the field of intelligent traffic and comprises the following steps: defining the network structure of the traffic speed sensors, processing the traffic speed sequence at historical moments, and then establishing the model mapping relation; constructing a multi-component attention graph neural network model for traffic speed prediction; training the multi-component attention graph neural network model to obtain a trained model; and acquiring the traffic speed data of the hour before the current moment, acquiring the corresponding day-period and week-period information from the traffic speed sequence at historical moments, inputting them into the trained multi-component attention graph neural network model, and predicting the traffic speed of a future time period. The invention links the various kinds of period information with the time features of the predicted time and uses different period fusion weights at different times, thereby improving the prediction accuracy of traffic speed.
Description
Technical Field
The invention belongs to the field of intelligent traffic, and particularly relates to a traffic speed prediction method based on a multi-component attention graph neural network.
Background
With the continuous improvement of artificial intelligence technology and the rapid development of intelligent traffic systems, intelligent traffic has become one of the important directions for building smart cities. In recent years, traffic speed prediction has attracted the attention of researchers in intelligent traffic systems and has become an important research direction. Reliable and accurate traffic speed prediction not only gives travelers a basis for choosing a driving route in real time, but also helps balance the traffic flow across roads, thereby effectively relieving or avoiding traffic congestion.
Traffic speed prediction is a typical spatio-temporal sequence prediction problem: the historical data of human driving behavior on each road section form complex three-dimensional data. Driving behavior in real traffic changes dynamically and periodically; for example, driving behavior on a Monday is similar to that of the previous Monday and, to a lesser extent, to that of a Sunday. How to effectively model the temporal and spatial correlations in spatio-temporal sequences and the dynamic periodic features in historical information is therefore critical to solving such problems.
In recent years, researchers have had some success in traffic prediction by modeling the spatial correlation in traffic sequences with graph neural networks. However, previous methods either predict future times from the traffic conditions at the most recent times only, or fuse several kinds of historical information with fixed fusion weights, and thus ignore the dynamically changing periodicity of traffic sequences.
Disclosure of Invention
In order to solve the above problems, the invention provides a traffic speed prediction method based on a multi-component attention graph neural network, which finely divides the historical traffic speed data, models several kinds of historical period information, and fully fuses local and global spatio-temporal features with an attention mechanism, improving the granularity of spatio-temporal modeling; at the same time, the various kinds of period information are linked with the time features of the predicted time, and different period fusion weights are used at different times, improving the prediction performance of the model.
The technical scheme of the invention is as follows:
a traffic speed prediction method based on a multi-component attention graph neural network, comprising the following steps:
step 1, defining the network structure of the traffic speed sensors, processing the traffic speed sequence at historical moments, and then establishing the model mapping relation;
step 2, constructing a multi-component attention graph neural network model for traffic speed prediction according to the temporal and spatial correlation of the traffic speed sequence;
step 3, training the multi-component attention graph neural network model with the traffic speed sequence at historical moments processed in step 1, to obtain a trained model;
step 4, acquiring the traffic speed data of the hour before the current moment, acquiring the corresponding day-period and week-period information from the traffic speed sequence at historical moments, inputting them into the trained multi-component attention graph neural network model, and predicting the traffic speed of a future time period.
Further, in step 1, the traffic speed sensor network in a real traffic situation is modeled as a directed graph $G = (V, E, A)$, where $V$ represents the set of traffic speed sensor nodes, $E$ represents the set of connection relations between traffic speed sensor nodes, $A \in \mathbb{R}^{N \times N}$ represents the adjacency matrix of the traffic speed sensor network, and $N$ represents the number of traffic speed sensor nodes; let $a_{ij}$ denote an element of the adjacency matrix $A$: when the distance between traffic speed sensor node $i$ and traffic speed sensor node $j$ is below a threshold, $a_{ij}$ is 1, otherwise $a_{ij}$ is 0;
the traffic speed sequence at historical moments is processed into a four-dimensional spatio-temporal sequence $\mathcal{X} \in \mathbb{R}^{S \times N \times T \times 3}$, where $S$ represents the number of slices of the spatio-temporal sequence, $N$ represents the number of sequences contained in each slice, which corresponds to the number of traffic speed sensor nodes, $T$ represents the length of the time sequence in each hour, and 3 indicates that each spatio-temporal sequence contains three features: the traffic speed data, the time point, and the day of the week;
modeling is performed by fusing three kinds of historical period information (hour, day and week) with the time information of the predicted time; the hour-period history information is denoted $X_h$, the day-period history information $X_d$, and the week-period history information $X_w$; each is a sequence of network snapshots $X_\tau$ selected relative to the current time $t$, where $q$ indicates the number of sequences per day, $T$ represents the length of the history sequence used, and $X_\tau$ indicates the traffic speed information of the traffic speed sensor network at time $\tau$; the time information of the predicted time is denoted $X_p$, a sequence over the predicted time steps whose entry for time $\tau$ is the time information of the traffic speed sensor network at that time;
the three kinds of period history information, the time information of the predicted time and the traffic speed sensor network $G$ together serve as the input of the model, and the traffic speed $\hat{Y}$ of the future time period is the output; the mapping relation between the model input and the model output is expressed as $\hat{Y} = f(X_h, X_d, X_w, X_p; G)$, where $f$ represents the mapping.
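The thresholded adjacency matrix of step 1 can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `build_adjacency`, the distance values and the 2.0 km threshold are all hypothetical, and the treatment of the diagonal (a sensor is at distance 0 from itself, so it gets a self-loop here) is an assumption the text does not settle.

```python
import numpy as np

def build_adjacency(dist, threshold):
    """Return the 0/1 adjacency matrix with a_ij = 1 when dist[i, j] < threshold."""
    dist = np.asarray(dist, dtype=float)
    return (dist < threshold).astype(np.float32)

# Hypothetical distances (km) between three sensors. The graph is directed,
# so dist[i, j] need not equal dist[j, i].
dist = np.array([[0.0, 1.2, 5.0],
                 [3.5, 0.0, 0.8],
                 [5.0, 4.1, 0.0]])

# Assumed threshold of 2.0 km; diagonal entries become 1 because 0 < threshold.
A = build_adjacency(dist, threshold=2.0)
print(A)
```

Because the distances are not symmetric, the resulting matrix is not symmetric either, which is what makes the forward and backward diffusion directions differ later in the model.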
In step 2, the multi-component attention graph neural network model comprises five parts, namely an hour-period spatio-temporal feature extraction module, a day-period spatio-temporal feature extraction module, a week-period spatio-temporal feature extraction module, a multi-period feature fusion module and an output layer; the specific process of traffic speed prediction by the multi-component attention graph neural network model is as follows:
step 2.1, extracting the spatio-temporal features of the hour period, the day period and the week period through the hour-period, day-period and week-period spatio-temporal feature extraction modules respectively;
step 2.2, combining the spatio-temporal features of the hour period, the day period and the week period through the multi-period feature fusion module to obtain the multi-period spatio-temporal features;
step 2.3, setting the time information of the predicted time and inputting it, together with the multi-period spatio-temporal features, into the multi-component attention layer to obtain the final spatio-temporal features;
step 2.4, passing the final spatio-temporal features through the output layer, which scales the feature dimension from $D$ to 1 to obtain the final traffic speed prediction result.
Further, in step 2.1, the hour-period, day-period and week-period spatio-temporal feature extraction modules are all built from spatio-temporal feature extraction modules of the same structure; the spatio-temporal feature extraction module consists of a convolution layer, three spatio-temporal layers and a skip attention layer; the three spatio-temporal layers extract spatio-temporal features of different scales, the output of the first spatio-temporal layer being the input of the second and the output of the second being the input of the third; the concatenation of the outputs of the three spatio-temporal layers is the final required spatio-temporal feature; every spatio-temporal layer has the same structure and comprises two temporal convolution layers and one graph convolution layer.
Further, the process of extracting the spatio-temporal features by the spatio-temporal feature extraction module is as follows:
step 2.1.1, the period history information is input to a convolution layer and subjected to a two-dimensional convolution operation to obtain a feature $H^{(0)}$ with feature dimension $D$, and $H^{(0)}$ is fed into the first spatio-temporal layer;
step 2.1.2, $H^{(0)}$ passes through two parallel temporal convolution layers to extract features, a nonlinear transformation is applied to each with an activation function, and the transformed results are multiplied to obtain the time feature $H_t$; the calculation formula of $H_t$ is as follows:
$$H_t = \tanh\left(W_1 \star H^{(0)}\right) \odot \sigma\left(W_2 \star H^{(0)}\right) \qquad (1);$$
wherein $\star$ denotes the temporal convolution; $\odot$ represents the Hadamard product; $\tanh$ represents the Tanh activation function; $\sigma$ represents the Sigmoid activation function; $W_1$ and $W_2$ are the weights of the two temporal convolution layers respectively;
step 2.1.3, $H_t$ is input to the diffusion convolution layer; on each of the $K$ spatial steps, a forward diffusion matrix, a backward diffusion matrix and an adaptive diffusion matrix are used to model the spatial correlation of adjacent traffic speed sensor nodes in the forward, backward and global directions, to obtain the spatio-temporal feature $H_{st}$; the calculation formulas of $H_{st}$ are as follows:
$$H_{st} = \sum_{k=0}^{K} \left( P_f^{k} H_t W_{k,1} + P_b^{k} H_t W_{k,2} + \tilde{A}_{adp}^{k} H_t W_{k,3} \right) \qquad (2);$$
$$P_f = A / \mathrm{rowsum}(A) \qquad (3);$$
$$P_b = A^{\top} / \mathrm{rowsum}(A^{\top}) \qquad (4);$$
$$\tilde{A}_{adp} = \mathrm{SoftMax}\left(\mathrm{RReLU}\left(E_1 E_2^{\top}\right)\right) \qquad (5);$$
wherein $P_f$ is the forward diffusion matrix and $P_f^{k}$ is the forward diffusion matrix of the $k$-th spatial step; $P_b$ is the backward diffusion matrix and $P_b^{k}$ is the backward diffusion matrix of the $k$-th spatial step; $\tilde{A}_{adp}$ is the adaptive diffusion matrix and $\tilde{A}_{adp}^{k}$ is the adaptive diffusion matrix of the $k$-th spatial step; $W_{k,1}$, $W_{k,2}$ and $W_{k,3}$ respectively represent the learnable parameter matrices of the $k$-th spatial step; $\mathrm{rowsum}(\cdot)$ sums the diffusion matrix by rows; $\mathrm{SoftMax}$ is the normalized exponential function; $\mathrm{RReLU}$ is the Randomized Leaky ReLU activation function; $E_1$ is the source node embedding vector; $E_2$ is the target node embedding vector;
step 2.1.4, $H^{(0)}$ and $H_{st}$ are joined by a residual connection to obtain the output $H^{(1)}$ of the first spatio-temporal layer;
step 2.1.5, $H^{(1)}$ is taken as the input of the second spatio-temporal layer, and the output $H^{(2)}$ of the second spatio-temporal layer is obtained by the same process as steps 2.1.2 to 2.1.4;
step 2.1.6, $H^{(2)}$ is taken as the input of the third spatio-temporal layer, and the output $H^{(3)}$ of the third spatio-temporal layer is obtained by the same process as steps 2.1.2 to 2.1.4;
step 2.1.7, $H^{(1)}$, $H^{(2)}$ and $H^{(3)}$ are merged along the last dimension to obtain a new spatio-temporal feature $H_{cat}$ whose last dimension has length $L$, where $L$ represents the sum of the sequence lengths of $H^{(1)}$, $H^{(2)}$ and $H^{(3)}$;
step 2.1.8, $H_{cat}$ is sent into the skip attention layer, and the correlation between the spatio-temporal features of different scales is calculated to obtain the spatio-temporal feature $Z$ of a single period; the skip attention layer adopts 4 attention heads, and each attention head uses different weights;
in the hour-period spatio-temporal feature extraction module, the period history information input in step 2.1.1 is the hour-period history information $X_h$, and the hour-period spatio-temporal feature $Z_h$ is extracted through the process of steps 2.1.1 to 2.1.8;
in the day-period spatio-temporal feature extraction module, the period history information input in step 2.1.1 is the day-period history information $X_d$, and the day-period spatio-temporal feature $Z_d$ is extracted through the process of steps 2.1.1 to 2.1.8;
in the week-period spatio-temporal feature extraction module, the period history information input in step 2.1.1 is the week-period history information $X_w$, and the week-period spatio-temporal feature $Z_w$ is extracted through the process of steps 2.1.1 to 2.1.8.
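The gated temporal convolution of step 2.1.2 (a Tanh branch multiplied element-wise by a Sigmoid branch) can be sketched with plain NumPy. This is an illustrative sketch under stated assumptions: the tensor layout `(nodes, time, features)`, the kernel size 2, the dilation value, and the helper names `temporal_conv` and `gated_temporal_conv` are all hypothetical, since the source does not give kernel sizes or dilation factors.

```python
import numpy as np

def temporal_conv(x, w, dilation=1):
    """Valid dilated convolution along the time axis.

    x: (N, T, D_in) features per node and time step.
    w: (K, D_in, D_out) kernel with K taps.
    Returns an array of shape (N, T - dilation * (K - 1), D_out).
    """
    k = w.shape[0]
    t_out = x.shape[1] - dilation * (k - 1)
    out = np.zeros((x.shape[0], t_out, w.shape[2]))
    for i in range(k):
        # Tap i reads the input shifted by i * dilation time steps.
        out += x[:, i * dilation:i * dilation + t_out, :] @ w[i]
    return out

def gated_temporal_conv(x, w_filter, w_gate, dilation=1):
    """Tanh branch (range (-1, 1)) times Sigmoid branch (range (0, 1))."""
    f = np.tanh(temporal_conv(x, w_filter, dilation))
    g = 1.0 / (1.0 + np.exp(-temporal_conv(x, w_gate, dilation)))
    return f * g  # Hadamard product of the two branches

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 12, 8))          # assumed: 4 nodes, 12 steps, 8 features
w1 = 0.1 * rng.normal(size=(2, 8, 8))    # hypothetical kernel, 2 taps
w2 = 0.1 * rng.normal(size=(2, 8, 8))
h_t = gated_temporal_conv(x, w1, w2, dilation=2)
print(h_t.shape)
```

Each layer shortens the time axis, so stacking three spatio-temporal layers yields outputs of three different sequence lengths, which is why step 2.1.7 merges them along the last dimension.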
Further, the specific process of step 2.2 is as follows:
$Z_h$, $Z_d$ and $Z_w$ are combined in the feature dimension to obtain the multi-period spatio-temporal feature $Z_m$, where $C$ represents the number of kinds of period information (here three: hour, day and week).
Further, the specific process of step 2.3 is as follows:
step 2.3.1, the time information $X_p$ of the predicted time is set, where $d_e$ represents the feature dimension of the time information; $X_p$ is input to a projection layer to obtain the time feature $Q$ with feature dimension $D$;
step 2.3.2, the multi-component attention layer adopts a multi-head attention mechanism; the time feature $Q$ is taken as the query and the multi-period spatio-temporal feature $Z_m$ as the key and the value, and they are input to the multi-component attention layer; the multi-head attention mechanism maps the query, key and value into three subspaces respectively and calculates the correlation between the time feature and the multi-period spatio-temporal features to obtain the final spatio-temporal feature $Z_f$.
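The attention step of 2.3.2, with the time feature of the predicted time as the query and the multi-period spatio-temporal features as key and value, can be illustrated with a standard scaled dot-product multi-head attention. This is a sketch under assumptions, not the patented layer: the shapes, the head count of 4, the output projection `wo`, and the function name `multi_head_attention` are hypothetical choices made for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, memory, wq, wk, wv, wo, n_heads):
    """query: (B, Lq, D) time features; memory: (B, Lm, D) multi-period features."""
    b, lq, d = query.shape
    dh = d // n_heads

    def split(x):
        # (B, L, D) -> (B, heads, L, D / heads)
        return x.reshape(b, -1, n_heads, dh).transpose(0, 2, 1, 3)

    # Query, key and value are projected into three separate subspaces.
    q, k, v = split(query @ wq), split(memory @ wk), split(memory @ wv)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(dh)
    attn = softmax(scores, axis=-1)  # fusion weights over the period features
    ctx = (attn @ v).transpose(0, 2, 1, 3).reshape(b, lq, d)
    return ctx @ wo, attn

rng = np.random.default_rng(0)
B, Lq, Lm, D = 3, 12, 36, 16             # assumed sizes: 3 nodes, 12 predicted steps
time_feat = rng.normal(size=(B, Lq, D))  # stands in for the time feature (query)
multi_period = rng.normal(size=(B, Lm, D))  # stands in for the multi-period feature
wq, wk, wv, wo = (0.1 * rng.normal(size=(D, D)) for _ in range(4))
z_f, attn = multi_head_attention(time_feat, multi_period, wq, wk, wv, wo, n_heads=4)
print(z_f.shape, attn.shape)
```

Because the attention weights depend on the query, two different predicted times produce two different sets of weights over the hour, day and week features, which is the time-dependent fusion the method describes.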
Further, the specific process of step 2.4 is as follows:
first, $Z_f$ is input to an RReLU activation function to apply nonlinear activation to the decoded spatio-temporal features; then, a convolution layer scales the feature dimension up to $D'$, followed by nonlinear activation with the RReLU activation function; finally, a convolution layer scales the feature dimension to 1 to obtain the final output result; the specific formula is as follows:
$$\hat{Y} = W_{o2} \star \mathrm{RReLU}\left(W_{o1} \star \mathrm{RReLU}\left(Z_f\right)\right) \qquad (6);$$
wherein $W_{o1}$ and $W_{o2}$ respectively represent the weights of the two convolution layers.
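The output layer of step 2.4 (activation, a layer that raises the feature dimension, activation again, then a layer that reduces the dimension to 1) can be sketched as follows. Two assumptions are made and should be read as such: the convolutions are modeled as pointwise (per-position) linear maps, since the source does not state the kernel size, and RReLU is evaluated in inference mode with a fixed negative slope equal to the mean of its bounds (1/8 and 1/3 are the PyTorch defaults, used here only for illustration). The names `rrelu_eval` and `output_layer` are hypothetical.

```python
import numpy as np

def rrelu_eval(z, lower=1.0 / 8.0, upper=1.0 / 3.0):
    """Inference-mode RReLU: negative inputs are scaled by (lower + upper) / 2."""
    slope = (lower + upper) / 2.0
    return np.where(z >= 0, z, slope * z)

def output_layer(z_f, w1, w2):
    """z_f: (N, P, D) final features; w1: (D, D_mid); w2: (D_mid, 1)."""
    h = rrelu_eval(z_f)          # activate the decoded features
    h = rrelu_eval(h @ w1)       # raise the feature dimension, activate again
    return (h @ w2)[..., 0]      # reduce the feature dimension to 1

rng = np.random.default_rng(0)
z_f = rng.normal(size=(3, 12, 16))       # assumed: 3 nodes, 12 predicted steps
w1 = 0.1 * rng.normal(size=(16, 32))     # hypothetical intermediate width of 32
w2 = 0.1 * rng.normal(size=(32, 1))
y_hat = output_layer(z_f, w1, w2)
print(y_hat.shape)
```

The result holds one predicted speed per sensor node and predicted time step.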
The invention has the following beneficial technical effects: the invention provides a multi-component attention layer, in which the historical information is subdivided by period and the historical information of the different periods is fused using the time features of the predicted time, so that different fusion weights are obtained for different predicted times. The invention provides a spatio-temporal feature extraction module that fuses spatio-temporal features of different scales, which addresses the loss of local information caused by increasing the dilation factor of the temporal convolution and improves the spatio-temporal modeling granularity of the model. The invention provides a traffic speed prediction method based on multi-period information that fuses the history information of different periods using the time features of the predicted time, which addresses the problem that traditional statistical models and existing deep learning prediction methods cannot capture the dynamic periodicity of a time sequence well.
Drawings
FIG. 1 is a flow chart of the traffic speed prediction method based on a multi-component attention graph neural network according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawing and the detailed description:
as shown in FIG. 1, the invention provides a traffic speed prediction method based on a multi-component attention graph neural network, which can improve the prediction accuracy of urban traffic speed and mainly comprises the following steps:
Step 1, defining the network structure of the traffic speed sensors, processing the traffic speed sequence at historical moments, and then establishing the model mapping relation.
The traffic speed sensor network in a real traffic situation is modeled as a directed graph $G = (V, E, A)$, where $V$ represents the set of traffic speed sensor nodes, $E$ represents the set of connection relations between traffic speed sensor nodes, $A \in \mathbb{R}^{N \times N}$ represents the adjacency matrix of the traffic speed sensor network, and $N$ represents the number of traffic speed sensor nodes; let $a_{ij}$ denote an element of the adjacency matrix $A$: when the distance between traffic speed sensor node $i$ and traffic speed sensor node $j$ is below a threshold, $a_{ij}$ is 1, otherwise $a_{ij}$ is 0; the traffic speed sequence at historical moments is processed into a four-dimensional spatio-temporal sequence $\mathcal{X} \in \mathbb{R}^{S \times N \times T \times 3}$, where $S$ represents the number of slices of the spatio-temporal sequence, $N$ represents the number of sequences contained in each slice, which corresponds to the number of traffic speed sensor nodes, $T$ represents the length of the time sequence in each hour, and 3 indicates that each spatio-temporal sequence contains three features: the traffic speed data, the time point, and the day of the week;
the invention models by combining three kinds of historical period information (hour, day and week) with the time information of the predicted time; the hour-period history information is denoted $X_h$, the day-period history information $X_d$, and the week-period history information $X_w$; each is a sequence of network snapshots $X_\tau$ selected relative to the current time $t$, where $q$ indicates the number of sequences per day, $T$ represents the length of the history sequence used, and $X_\tau$ indicates the traffic speed information of the traffic speed sensor network at time $\tau$; the time information of the predicted time is denoted $X_p$, a sequence over the predicted time steps whose entry for time $\tau$ is the time information of the traffic speed sensor network at that time.
The three kinds of period history information, the time information of the predicted time and the traffic speed sensor network $G$ together serve as the input of the model, and the traffic speed $\hat{Y}$ of the future time period is the output. The mapping between model input and output can be expressed as $\hat{Y} = f(X_h, X_d, X_w, X_p; G)$, where $f$ represents the mapping.
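How the three period inputs might be cut from a single historical series can be sketched as below. The exact index ranges are not recoverable from the text, so the alignment used here is an assumption: the hour window ends at the current time, and the day and week windows are the segments that followed the same clock time one day (`q` steps) and one week (`7 * q` steps) earlier. The 5-minute sampling rate (`q = 288`), the window length of 12 and the function name `period_windows` are likewise hypothetical.

```python
import numpy as np

def period_windows(series, t, T, q):
    """Slice hour-, day- and week-period windows of length T from `series`.

    series: (time, N) speeds for N sensors; t: index of the current time;
    q: number of samples per day. Alignment is an assumption (see above).
    """
    if t - 7 * q + 1 < 0 or t >= series.shape[0]:
        raise ValueError("need at least one full week of history before t")
    hour = series[t - T + 1:t + 1]                   # the most recent T steps
    day = series[t - q + 1:t - q + T + 1]            # same clock time, 1 day ago
    week = series[t - 7 * q + 1:t - 7 * q + T + 1]   # same clock time, 1 week ago
    return hour, day, week

q, T, N = 288, 12, 4   # assumed: 5-minute sampling, one-hour windows, 4 sensors
# Synthetic series whose value equals its time index, to make the slices visible.
speeds = np.arange(8 * q, dtype=float)[:, None].repeat(N, axis=1)
t = 7 * q + 20
x_h, x_d, x_w = period_windows(speeds, t, T, q)
print(x_h.shape, x_d[0, 0], x_w[0, 0])
```

With real data the three windows would be stacked with the time-point and day-of-week features before entering the three feature extraction modules.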
Step 2, constructing a multi-component attention graph neural network model for traffic speed prediction according to the temporal and spatial correlation of the traffic speed sequence; the multi-component attention graph neural network model mainly comprises an hour-period spatio-temporal feature extraction module, a day-period spatio-temporal feature extraction module, a week-period spatio-temporal feature extraction module, a multi-period feature fusion module and an output layer.
The specific process of traffic speed prediction by the multi-component attention graph neural network model is as follows:
step 2.1, extracting the spatio-temporal features of the hour period, the day period and the week period through the hour-period, day-period and week-period spatio-temporal feature extraction modules respectively;
the hour-period, day-period and week-period spatio-temporal feature extraction modules are all built from spatio-temporal feature extraction modules of the same structure. The spatio-temporal feature extraction module consists of a convolution layer, three spatio-temporal layers and a skip attention layer; the three spatio-temporal layers extract spatio-temporal features of different scales, the output of the first spatio-temporal layer being the input of the second and the output of the second being the input of the third; the concatenation of the outputs of the three spatio-temporal layers is the final required spatio-temporal feature; every spatio-temporal layer has the same structure and comprises two temporal convolution layers and one graph convolution layer.
The process of extracting the spatio-temporal features by the spatio-temporal feature extraction module is as follows:
step 2.1.1, the period history information is input to a convolution layer, which changes the feature dimension through a two-dimensional convolution operation to obtain a feature $H^{(0)}$ with feature dimension $D$, and $H^{(0)}$ is fed into the first spatio-temporal layer;
step 2.1.2, $H^{(0)}$ passes through two parallel temporal convolution layers to extract features, a nonlinear transformation is applied to each with an activation function, and the transformed results are multiplied to obtain the time feature $H_t$. The output of one temporal convolution layer is scaled to the range (-1, 1) by the Tanh activation function; the output of the other is scaled to the range (0, 1) by the Sigmoid activation function.
The calculation formula of $H_t$ is as follows:
$$H_t = \tanh\left(W_1 \star H^{(0)}\right) \odot \sigma\left(W_2 \star H^{(0)}\right) \qquad (1);$$
wherein $\star$ denotes the temporal convolution; $\odot$ represents the Hadamard product; $\tanh$ represents the Tanh activation function; $\sigma$ represents the Sigmoid activation function; $W_1$ and $W_2$ are the weights of the two temporal convolution layers respectively.
Step 2.1.3, $H_t$ is input to the diffusion convolution layer; on each of the $K$ spatial steps, a forward diffusion matrix, a backward diffusion matrix and an adaptive diffusion matrix are used to model the spatial correlation of adjacent traffic speed sensor nodes in the forward, backward and global directions, to obtain the spatio-temporal feature $H_{st}$;
The calculation formulas of $H_{st}$ are as follows:
$$H_{st} = \sum_{k=0}^{K} \left( P_f^{k} H_t W_{k,1} + P_b^{k} H_t W_{k,2} + \tilde{A}_{adp}^{k} H_t W_{k,3} \right) \qquad (2);$$
$$P_f = A / \mathrm{rowsum}(A) \qquad (3);$$
$$P_b = A^{\top} / \mathrm{rowsum}(A^{\top}) \qquad (4);$$
$$\tilde{A}_{adp} = \mathrm{SoftMax}\left(\mathrm{RReLU}\left(E_1 E_2^{\top}\right)\right) \qquad (5);$$
wherein $P_f$ is the forward diffusion matrix and $P_f^{k}$ is the forward diffusion matrix of the $k$-th spatial step, whose value is obtained by multiplying $k$ copies of $P_f$; $P_b$ is the backward diffusion matrix and $P_b^{k}$ is the backward diffusion matrix of the $k$-th spatial step, whose value is obtained by multiplying $k$ copies of $P_b$; $\tilde{A}_{adp}$ is the adaptive diffusion matrix and $\tilde{A}_{adp}^{k}$ is the adaptive diffusion matrix of the $k$-th spatial step, whose value is obtained by multiplying $k$ copies of $\tilde{A}_{adp}$; $W_{k,1}$, $W_{k,2}$ and $W_{k,3}$ respectively represent the learnable parameter matrices of the $k$-th spatial step; $\mathrm{rowsum}(\cdot)$ sums the diffusion matrix by rows; $\mathrm{SoftMax}$ is the normalized exponential function; $\mathrm{RReLU}$ is the Randomized Leaky ReLU activation function; $E_1$ is the source node embedding vector; $E_2$ is the target node embedding vector.
Step 2.1.4, finally, $H^{(0)}$ and $H_{st}$ are joined by a residual connection to obtain the output $H^{(1)}$ of the first spatio-temporal layer;
Step 2.1.5, the spatio-temporal features extracted by the first spatio-temporal layer are relatively local, so $H^{(1)}$ is taken as the input of the second spatio-temporal layer, and the output $H^{(2)}$ of the second spatio-temporal layer is obtained by the same process as steps 2.1.2 to 2.1.4;
Step 2.1.6, $H^{(2)}$ is taken as the input of the third spatio-temporal layer, and the output $H^{(3)}$ of the third spatio-temporal layer is obtained by the same process as steps 2.1.2 to 2.1.4;
Step 2.1.7, because $H^{(1)}$, $H^{(2)}$ and $H^{(3)}$ are spatio-temporal features of different scales with different sequence lengths, they are merged along the last dimension to obtain a new spatio-temporal feature $H_{cat}$ whose last dimension has length $L$, where $L$ represents the sum of the sequence lengths of $H^{(1)}$, $H^{(2)}$ and $H^{(3)}$; the spatio-temporal features extracted at this point contain both local and global information.
Step 2.1.8, $H_{cat}$ is sent into the skip attention layer, and the correlation between the spatio-temporal features of different scales is calculated to obtain the spatio-temporal feature $Z$ of a single period; the skip attention layer adopts 4 attention heads, and each attention head uses different weights, so that spatio-temporal features of different scales are fused and the expressive power of the spatio-temporal features is enriched.
In the hour-period space-time feature extraction module, the period history information input in step 2.1.1 is the hour-period history information; after steps 2.1.1 to 2.1.8, the hour-period space-time feature is extracted;
In the day-period space-time feature extraction module, the period history information input in step 2.1.1 is the day-period history information; after steps 2.1.1 to 2.1.8, the day-period space-time feature is extracted;
In the week-period space-time feature extraction module, the period history information input in step 2.1.1 is the week-period history information; after steps 2.1.1 to 2.1.8, the week-period space-time feature is extracted;
The hour-period, day-period and week-period space-time features are all four-dimensional tensors;
Step 2.2: the hour-period, day-period and week-period space-time features are combined by the multi-period feature fusion module to obtain the multi-period space-time feature; the specific process is as follows:
The hour-period, day-period and week-period space-time features are combined along the feature dimension to obtain the multi-period space-time feature, where $P$ denotes the number of periods; since the invention uses the hour period, the day period and the week period, $P$ is set to 3.
Step 2.3: the time information of the predicted time is set and, together with the multi-period space-time feature, input into the multi-component attention layer to obtain the final space-time feature; the specific process is as follows:
Step 2.3.1: the time information of the predicted time is set; its feature dimension covers items such as the day of the week, the time of day and holidays, and its length is the number of time steps of the predicted time. The time information is then mapped to a time feature whose feature dimension is consistent with that of the extracted multi-period space-time feature.
Step 2.3.2: the multi-component attention layer adopts a multi-head attention mechanism. The time feature is used as the query and the multi-period space-time feature as the key and value; both are input into the multi-component attention layer, where the multi-head attention mechanism maps the query, key and value into three subspaces respectively and calculates the correlation between the time feature and the multi-period space-time feature to obtain the final space-time feature.
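The query/key/value roles in step 2.3.2 can be illustrated with a single-head scaled dot-product attention over the three period features. This is a simplified sketch: the learned projections into subspaces and the multiple heads are omitted, and all vectors are hypothetical values for one sensor node and one predicted time step.

```python
import math

def softmax(scores):
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [v / total for v in e]

def period_attention(query, period_feats):
    """Scaled dot-product attention of one time-feature query over period features."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
              for feat in period_feats]
    weights = softmax(scores)                      # one fusion weight per period
    fused = [sum(w * feat[i] for w, feat in zip(weights, period_feats))
             for i in range(d)]
    return weights, fused

# Hypothetical time feature of the predicted time (query).
time_feature = [1.0, 0.0, 0.5, 0.0]
# Hypothetical hour-, day- and week-period space-time features (keys and values).
hour = [0.9, 0.1, 0.4, 0.0]
day = [0.1, 0.8, 0.0, 0.3]
week = [0.0, 0.2, 0.1, 0.9]
weights, fused = period_attention(time_feature, [hour, day, week])
```

The weights sum to 1 and change with the query, so a different predicted time yields a different mix of hour, day and week history; in this toy input the hour-period feature is most similar to the query and therefore receives the largest weight.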
Step 2.4: the final space-time feature is passed through the output layer, which scales the feature dimension down to 1 to obtain the final traffic speed prediction result; the specific process is as follows:
First, the final space-time feature $H$ is input to the RReLU activation function, which applies a nonlinear activation to the decoded space-time feature; then a convolution layer scales up the feature dimension, followed by another nonlinear activation with the RReLU activation function; finally, a second convolution layer scales the dimension to 1 to obtain the final output $\hat{Y}$. The specific formula is as follows:
$$\hat{Y} = W_2 \star \operatorname{RReLU}\left(W_1 \star \operatorname{RReLU}(H)\right) \quad (6)$$
where $W_1$ and $W_2$ denote the weights of the two convolution layers, respectively; $\star$ denotes the convolution operation; and $\operatorname{RReLU}$ denotes the Randomized Leaky ReLU activation function.
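For a single node and time step, the output layer can be sketched as two linear maps over the feature dimension with activations in between. The sketch assumes that both convolution layers have a 1x1 kernel, so that each acts as a plain linear map at every position; the weights are hypothetical, and the randomized slope of RReLU is fixed at the midpoint of assumed bounds 1/8 and 1/3.

```python
def rrelu_eval(v, lower=1 / 8, upper=1 / 3):
    """Leaky ReLU with the fixed midpoint slope (stand-in for RReLU at inference)."""
    slope = (lower + upper) / 2
    return v if v >= 0 else slope * v

def pointwise_conv(features, weight):
    """1x1 convolution at one position: a linear map over the feature dimension."""
    return [sum(w * f for w, f in zip(row, features)) for row in weight]

def output_layer(h, w1, w2):
    a = [rrelu_eval(v) for v in h]                      # activate decoded feature
    b = [rrelu_eval(v) for v in pointwise_conv(a, w1)]  # expand, then activate
    return pointwise_conv(b, w2)[0]                     # reduce to one value

# Hypothetical final space-time feature and weights (2 -> 3 -> 1 channels).
h = [1.0, -2.0]
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [[1.0, 1.0, 1.0]]
speed = output_layer(h, w1, w2)
```

The single returned value plays the role of the predicted speed for that node and time step.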
The multi-period feature fusion module takes into account that the influence weight of each period differs at different times: it applies an attention mechanism to the fusion of the historical traffic speed information and calculates the fusion weights of the history information of the different periods from the time feature of the predicted time, thereby achieving dynamic periodic modeling and improving the accuracy of traffic speed prediction.
Step 3: the multi-component attention graph neural network model is trained using the historical traffic speed sequence processed in step 1. During training, the number of convolution layers is set to 3 and the number of attention heads to 4, the learning rate is adjusted with a cosine annealing function, and the model is optimized with the AdamW optimizer to obtain the trained model.
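The cosine annealing adjustment of the learning rate mentioned in step 3 follows a half cosine wave from a maximum to a minimum value. A minimal sketch is given below; the maximum rate, minimum rate and number of steps are hypothetical, since the training hyperparameters other than the layer and head counts are not stated here.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step` on a single cosine half-wave from lr_max to lr_min."""
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# Hypothetical schedule: 100 steps from 1e-3 down to 1e-5.
schedule = [cosine_annealing_lr(s, 100, lr_max=1e-3, lr_min=1e-5) for s in range(101)]
```

The rate decreases slowly at the start and end of training and fastest in the middle, which is the usual motivation for choosing this schedule over a linear decay.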
Step 4: the traffic speed data for the hour preceding the current time are acquired, the corresponding day-period and week-period information is retrieved from the historical traffic speed sequence, and these are input into the trained multi-component attention graph neural network model to predict the traffic speed in a future time period.
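Gathering the three inputs of step 4 can be sketched as slicing one long speed series. The sketch assumes a 5-minute sampling interval (12 samples per hour, 288 per day) and assumes that the day-period and week-period windows end at the same time of day exactly one day and one week before the current time; the exact offsets used by the method may differ.

```python
def period_windows(series, t, length, per_day):
    """Return (hour, day, week) windows of `length` samples.

    The hour window ends at index t (inclusive); the day and week windows
    end one day and one week earlier (an assumption of this sketch).
    """
    def window(end):
        start = end - length + 1
        if start < 0:
            raise ValueError("not enough history before index %d" % end)
        return series[start:end + 1]

    return window(t), window(t - per_day), window(t - 7 * per_day)

# Hypothetical series: sample i simply holds the value i, to make indices visible.
series = list(range(3000))
hour_hist, day_hist, week_hist = period_windows(series, t=2999, length=12, per_day=288)
```

With these settings the hour window covers indices 2988 to 2999, the day window ends 288 samples earlier and the week window ends 2016 samples earlier.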
In order to demonstrate the feasibility and superiority of the present invention, the following comparative experiments were performed. The specific experimental results are shown in tables 1 and 2.
Table 1: Experimental comparison results on the Los Angeles loop speed dataset.
Table 2: Experimental comparison results on the Los Angeles bay area speed dataset.
The experiment compares the multi-component attention graph neural network model MCAGCN designed and implemented in the invention with five time-series prediction models: the long short-term memory neural network FC-LSTM, the diffusion convolutional recurrent neural network DCRNN, Graph WaveNet, the attention-based spatial-temporal graph convolutional network ASTGCN and the graph multi-attention network GMAN. The comparison uses the mean absolute error MAE, the root mean square error RMSE and the mean absolute percentage error MAPE as evaluation metrics on two datasets, the Los Angeles loop traffic speed dataset and the Los Angeles bay area traffic speed dataset. In the tables, 15 minutes, 30 minutes and 60 minutes denote predictions 15, 30 and 60 minutes into the future; for example, the MAE of the long short-term memory neural network at 15 minutes on the Los Angeles loop speed dataset is 3.44. As Tables 1 and 2 show, the multi-component attention graph neural network model of the invention outperforms the other network models on MAE, RMSE and MAPE, giving the smallest prediction error and the best prediction effect in short-term, medium-term and long-term prediction; that is, the invention can predict traffic speed data with higher accuracy and applicability at short-term, medium-term and long-term scales. The model of the invention can therefore serve as an effective traffic speed prediction model and provide technical support for traffic speed prediction and analysis.
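For reference, the three evaluation metrics can be computed as in the following sketch. The speed values are hypothetical and are not taken from Tables 1 or 2; skipping zero true values in MAPE is an assumption made here to avoid division by zero.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; zero true values are skipped."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

# Hypothetical true and predicted speeds.
y_true = [60.0, 50.0, 40.0]
y_pred = [57.0, 52.0, 40.0]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

RMSE penalizes large individual errors more strongly than MAE, while MAPE expresses the error relative to the true speed, which is why the three are typically reported together.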
The invention builds the multi-component attention graph neural network model for traffic speed prediction on the basis of the multi-component attention layer and the space-time attention layer. By extracting the multiple periodicities in the traffic speed sequence and the local and global information among sequences, it solves the problems of periodic modeling of traffic speed and fine-grained modeling of space-time information, and improves the accuracy of the traffic speed prediction model.
It should be understood that the above description does not limit the invention, and that the invention is not limited to the particular embodiments disclosed; it is intended to cover modifications, adaptations, additions and alternatives falling within the spirit and scope of the invention.
Claims (8)
1. A traffic speed prediction method based on a multi-component attention graph neural network, characterized by comprising the following steps:
step 1: defining the structure of the traffic speed sensor network, processing the historical traffic speed sequence, and then establishing the model mapping relation;
step 2: constructing a multi-component attention graph neural network model for traffic speed prediction according to the temporal and spatial correlation of the traffic speed sequence;
step 3: training the multi-component attention graph neural network model using the historical traffic speed sequence processed in step 1 to obtain a trained model;
step 4: acquiring the traffic speed data for the hour preceding the current time, retrieving the corresponding day-period and week-period information from the historical traffic speed sequence, inputting these into the trained multi-component attention graph neural network model, and predicting the traffic speed in a future time period.
2. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 1, characterized in that in step 1, the traffic speed sensor network in the real traffic scenario is modeled as a directed graph $G = (V, E, A)$, where $V$ represents the set of traffic speed sensor nodes, $E$ represents the set of connection relations between traffic speed sensor nodes, $A \in \mathbb{R}^{N \times N}$ represents the adjacency matrix of the traffic speed sensor network, and $N$ represents the number of traffic speed sensor nodes; let $a_{ij}$ denote an element of the adjacency matrix $A$: when the distance from traffic speed sensor node $v_i$ to traffic speed sensor node $v_j$ is below the threshold, $a_{ij}$ is 1, otherwise $a_{ij}$ is 0;
the historical traffic speed sequence is processed into a four-dimensional space-time sequence whose dimensions are: the number of slices of the space-time sequence; the number of sequences contained in each slice, which equals the number of traffic speed sensor nodes; the length of the time sequence within each hour; and 3, indicating that each space-time sequence contains three features, namely the traffic speed data, the time of day and the day of the week;
modeling is performed by fusing the hour-period, day-period and week-period history information with the time information of the predicted time; the hour-period history information is denoted $X_h$, the day-period history information $X_d$ and the week-period history information $X_w$, each being a sequence of traffic speed information of the traffic speed sensor network defined in terms of the current time, the number of sequences per day and the length of the history sequence used; the time information of the predicted time is denoted $TE$ and consists of the time information of the traffic speed sensor network at each time step of the predicted time;
the history information of the three periods, the time information of the predicted time and the traffic speed sensor network graph $G$ together serve as the input of the model, and the traffic speed $\hat{Y}$ at the predicted time is the output; the mapping relation between the model input and the model output is expressed as $\hat{Y} = f(X_h, X_d, X_w, TE, G)$, where $f$ represents the mapping.
3. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 2, characterized in that in step 2, the multi-component attention graph neural network model comprises five parts: an hour-period space-time feature extraction module, a day-period space-time feature extraction module, a week-period space-time feature extraction module, a multi-period feature fusion module and an output layer; the specific process by which the multi-component attention graph neural network model predicts traffic speed is as follows:
step 2.1: extracting the hour-period, day-period and week-period space-time features through the hour-period space-time feature extraction module, the day-period space-time feature extraction module and the week-period space-time feature extraction module, respectively;
step 2.2: combining the hour-period, day-period and week-period space-time features through the multi-period feature fusion module to obtain the multi-period space-time feature;
step 2.3: setting the time information of the predicted time and inputting it, together with the multi-period space-time feature, into the multi-component attention layer to obtain the final space-time feature;
step 2.4: passing the final space-time feature through the output layer, which scales the feature dimension down to 1, to obtain the final traffic speed prediction result.
4. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 3, characterized in that in step 2.1, the hour-period space-time feature extraction module, the day-period space-time feature extraction module and the week-period space-time feature extraction module are all built from space-time feature extraction modules of the same structure; the space-time feature extraction module consists of a convolution layer, three space-time layers and a jumping attention layer; the three space-time layers extract space-time features of different scales, the output of the first space-time layer being the input of the second space-time layer and the output of the second space-time layer being the input of the third space-time layer; the concatenation of the outputs of the three space-time layers is the final required space-time feature; each space-time layer has the same structure and comprises two temporal convolution layers and a graph convolution layer.
5. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 4, characterized in that the process by which the space-time feature extraction module extracts the space-time features is as follows:
step 2.1.1: the period history information is input into a convolution layer and subjected to a two-dimensional convolution operation to obtain a feature $X$ with the set feature dimension, and $X$ is fed into the first space-time layer;
step 2.1.2: the input $X$ of the space-time layer is passed through two parallel temporal convolution layers to extract features, each followed by a nonlinear transformation with an activation function, and the two transformed results are multiplied to obtain the temporal feature $T$; $T$ is calculated as follows:
$$T = \tanh(W_1 \star X) \odot \sigma(W_2 \star X) \quad (1)$$
where $\odot$ denotes the Hadamard product; $\tanh$ denotes the Tanh activation function; $\sigma$ denotes the Sigmoid activation function; $\star$ denotes the convolution operation; and $W_1$ and $W_2$ are the weights of the two temporal convolution layers, respectively;
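step 2.1.3: the temporal feature $T$ obtained in step 2.1.2 is input into the diffusion convolution layer, where, at each of $K$ spatial steps, a forward diffusion matrix, a backward diffusion matrix and an adaptive diffusion matrix model the spatial correlation of adjacent traffic speed sensor nodes in the forward, backward and global directions, respectively, yielding the space-time feature $Z$; $Z$ is calculated as follows:
$$Z = \sum_{k=1}^{K} \left( P_f^{k}\, T\, W_{k1} + P_b^{k}\, T\, W_{k2} + \tilde{A}_{adp}^{k}\, T\, W_{k3} \right) \quad (2)$$
$$P_f = A / \operatorname{rowsum}(A) \quad (3)$$
$$P_b = A^{\top} / \operatorname{rowsum}(A^{\top}) \quad (4)$$
$$\tilde{A}_{adp} = \operatorname{softmax}\left(\operatorname{RReLU}\left(E_1 E_2^{\top}\right)\right) \quad (5)$$
where $A$ is the adjacency matrix of the traffic speed sensor network; $P_f$ is the forward diffusion matrix, and $P_f^{k}$ is the forward diffusion matrix of the $k$-th spatial step; $P_b$ is the backward diffusion matrix, and $P_b^{k}$ is the backward diffusion matrix of the $k$-th spatial step; $\tilde{A}_{adp}$ is the adaptive diffusion matrix, and $\tilde{A}_{adp}^{k}$ is the adaptive diffusion matrix of the $k$-th spatial step; $W_{k1}$, $W_{k2}$ and $W_{k3}$ are the learnable parameter matrices of the $k$-th spatial step; $\operatorname{rowsum}(\cdot)$ sums the diffusion matrix by rows; $\operatorname{softmax}$ is the normalized exponential function; $\operatorname{RReLU}$ is the Randomized Leaky ReLU activation function; $E_1$ is the source node embedding vector; and $E_2$ is the target node embedding vector;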
step 2.1.4: the input of the first space-time layer and the space-time feature obtained in step 2.1.3 are joined by a residual connection to obtain the output of the first space-time layer;
step 2.1.5: the output of the first space-time layer is used as the input of the second space-time layer, and the output of the second space-time layer is obtained by the same process as in steps 2.1.2 to 2.1.4;
step 2.1.6: the output of the second space-time layer is used as the input of the third space-time layer, and the output of the third space-time layer is obtained by the same process as in steps 2.1.2 to 2.1.4;
step 2.1.7: the outputs of the three space-time layers are concatenated along the last dimension to obtain a new space-time feature whose sequence length is the sum of the three sequence lengths;
step 2.1.8: the concatenated space-time feature is fed into the jumping attention layer, which calculates the correlation between the space-time features of different scales to obtain the space-time feature of a single period; the jumping attention layer uses 4 attention heads, each with its own weights;
in the hour-period space-time feature extraction module, the period history information input in step 2.1.1 is the hour-period history information, and after steps 2.1.1 to 2.1.8 the hour-period space-time feature is extracted;
in the day-period space-time feature extraction module, the period history information input in step 2.1.1 is the day-period history information, and after steps 2.1.1 to 2.1.8 the day-period space-time feature is extracted;
in the week-period space-time feature extraction module, the period history information input in step 2.1.1 is the week-period history information, and after steps 2.1.1 to 2.1.8 the week-period space-time feature is extracted.
6. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 5, characterized in that the specific process of step 2.2 is as follows:
the hour-period, day-period and week-period space-time features are combined along the feature dimension to obtain the multi-period space-time feature, where $P$ denotes the number of periods.
7. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 6, characterized in that the specific process of step 2.3 is as follows:
step 2.3.1: the time information of the predicted time is set, its feature dimension being the feature dimension of the time information; the time information is then mapped to a time feature with the set feature dimension;
step 2.3.2: the multi-component attention layer adopts a multi-head attention mechanism; the time feature is used as the query and the multi-period space-time feature as the key and value, and both are input into the multi-component attention layer, where the multi-head attention mechanism maps the query, key and value into three subspaces respectively and calculates the correlation between the time feature and the multi-period space-time feature to obtain the final space-time feature.
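8. The traffic speed prediction method based on the multi-component attention graph neural network according to claim 7, characterized in that the specific process of step 2.4 is as follows:
first, the final space-time feature $H$ is input to the RReLU activation function, which applies a nonlinear activation to the decoded space-time feature; then a convolution layer scales up the feature dimension, followed by another nonlinear activation with the RReLU activation function; finally, a second convolution layer scales the dimension to 1 to obtain the final output $\hat{Y}$; the specific formula is as follows:
$$\hat{Y} = W_2 \star \operatorname{RReLU}\left(W_1 \star \operatorname{RReLU}(H)\right) \quad (6)$$
where $W_1$ and $W_2$ denote the weights of the two convolution layers, respectively, $\star$ denotes the convolution operation, and $\operatorname{RReLU}$ denotes the Randomized Leaky ReLU activation function.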
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311394555.4A CN117133129B (en) | 2023-10-26 | 2023-10-26 | Traffic speed prediction method based on multi-component attention-seeking neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311394555.4A CN117133129B (en) | 2023-10-26 | 2023-10-26 | Traffic speed prediction method based on multi-component attention-seeking neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117133129A (en) | 2023-11-28 |
CN117133129B (en) | 2024-01-30 |
Family
ID=88854925
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311394555.4A Active CN117133129B (en) | 2023-10-26 | 2023-10-26 | Traffic speed prediction method based on multi-component attention-seeking neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117133129B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220405864A1 (en) * | 2019-11-19 | 2022-12-22 | Zhejiang University | Crop yield estimation method based on deep temporal and spatial feature combined learning |
CN112183862A (en) * | 2020-09-29 | 2021-01-05 | 长春理工大学 | Traffic flow prediction method and system for urban road network |
CN113450568A (en) * | 2021-06-30 | 2021-09-28 | 兰州理工大学 | Convolutional network traffic flow prediction method based on space-time attention mechanism |
CN113705880A (en) * | 2021-08-25 | 2021-11-26 | 杭州远眺科技有限公司 | Traffic speed prediction method and device based on space-time attention diagram convolutional network |
Non-Patent Citations (1)
Title |
---|
Lin Jinxiang: "Road traffic speed prediction based on convolutional neural network", Computer Knowledge and Technology (电脑知识与技术), no. 09, pages 182-184 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117688453A (en) * | 2024-02-02 | 2024-03-12 | 山东科技大学 | Traffic flow prediction method based on space-time embedded attention network |
CN117688453B (en) * | 2024-02-02 | 2024-04-30 | 山东科技大学 | Traffic flow prediction method based on space-time embedded attention network |
Also Published As
Publication number | Publication date |
---|---|
CN117133129B (en) | 2024-01-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||