CN118072518A - Traffic flow prediction method based on big data - Google Patents
- Publication number
- CN118072518A CN202410319851.6A
- Authority
- CN
- China
- Prior art keywords
- road section
- features
- sample
- feature
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/065—Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area, i.e. comparing incoming count with outgoing count
Abstract
The application provides a traffic flow prediction method based on big data. To construct the traffic flow prediction model, a monitoring area is divided into M sample road sections, and a historical traffic data set is used to determine the historical traffic data (including the corresponding traffic flow labels) for each sample road section; real-time features and global features are determined for each sample road section; feature fusion is performed to obtain N training features for each sample road section, forming a training data set of M×N training features; and a GRU model is constructed, trained, and tested to obtain the trained traffic flow prediction model. Combining the real-time features with the global features and predicting with the GRU model can improve the accuracy of traffic flow prediction and make the prediction result more reliable.
Description
Technical Field
The application relates to the technical field of big data, in particular to a traffic flow prediction method based on big data.
Background
With the continuing acceleration of urbanization, urban traffic congestion has become a problem that seriously affects people's daily life and economic development. Frequent congestion lengthens commutes, increases vehicle exhaust emissions, and causes serious environmental pollution; it also lowers quality of life and adds to the pressures of daily living.
Conventional traffic management methods generally provide only static information and cannot respond in time to dynamically changing traffic conditions. How to use modern information technology to improve traffic management (e.g. driving route planning) and to improve the accuracy of traffic flow prediction has therefore become an urgent problem in the traffic field.
With the rapid development of big data technology, the traffic field has gradually begun to apply it to predicting and managing traffic flow. With its powerful data processing and analysis capabilities, big data technology gives traffic management departments more diverse and more timely data. For example, by collecting vehicle track data, road monitoring data, mobile terminal data, etc., traffic management departments can understand traffic conditions more accurately and take corresponding management measures in time. Big data technology can also discover patterns and trends in traffic flow by analyzing massive data, providing a scientific basis for future traffic planning.
Although big data technology brings new hope for traffic flow prediction, problems remain in practice: current big data processing and analysis techniques perform poorly in traffic flow prediction, the prediction results are inaccurate, and downstream applications (for example, guiding driving route planning so that congestion can be avoided in advance) work poorly, which does not help relieve traffic congestion.
Disclosure of Invention
An object of the embodiments of the present application is to provide a traffic flow prediction method based on big data that improves traffic flow prediction accuracy, improves applicability at the application level, and relieves traffic congestion.
In order to achieve the above object, an embodiment of the present application is achieved by:
In a first aspect, an embodiment of the present application provides a traffic flow prediction method based on big data, including: acquiring real-time traffic data of a target road section, wherein the target road section comprises a monitoring road section for which flow is to be predicted and the adjacent road sections connected to the monitoring road section, and in each passing direction of the target road section, a vehicle starting from the adjacent road section at the starting point of that direction can pass through the monitoring road section and reach the adjacent road section at the end point of that direction; preprocessing the real-time traffic data of the target road section, and determining the real-time features corresponding to the target road section; acquiring the global features corresponding to the target road section, and performing feature fusion on the global features and the real-time features corresponding to the target road section to obtain the input features corresponding to the target road section; and inputting the input features corresponding to the target road section into a preset traffic flow prediction model, and determining the traffic flow prediction result corresponding to the target road section through the traffic flow prediction model.
With reference to the first aspect, in a first possible implementation manner of the first aspect, a traffic flow prediction model needs to be constructed before the real-time traffic data is acquired, and the traffic flow prediction model is constructed as follows: acquiring a historical traffic data set, wherein the historical traffic data set comprises historical traffic information collected over N consecutive time steps in a monitoring area; dividing the monitoring area to form M sample road sections, and determining the historical traffic data corresponding to each sample road section, wherein each piece of historical traffic data corresponding to a sample road section comprises a corresponding traffic flow label indicating the traffic flow of the sample road section at that time step, each sample road section comprises a calibration road section and the adjacent road sections connected to the calibration road section, and in each passing direction of the sample road section, a vehicle starting from the adjacent road section at the starting point of that direction can pass through the calibration road section and reach the adjacent road section at the end point of that direction; preprocessing the historical traffic data corresponding to each sample road section in the historical traffic data set, and determining the real-time features and global features corresponding to each sample road section, to obtain N real-time features corresponding to each sample road section and M global features corresponding to the M sample road sections; performing feature fusion on the global features and the real-time features corresponding to each sample road section to obtain N training features corresponding to each sample road section, forming a training data set containing M×N training features; dividing the training data set into a training set and a testing set; and constructing a GRU model, and training and testing the GRU model with the training set and the testing set to obtain the trained traffic flow prediction model.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, dividing the monitoring area to form M sample road sections and determining the historical traffic data corresponding to each sample road section includes: dividing the monitoring area into M road sections; for each road section, taking the current road section as a calibration road section and determining the calibration road section together with its adjacent road sections as one sample road section, to obtain M sample road sections in total, wherein each sample road section comprises one calibration road section and the adjacent road sections connected to the calibration road section, and in each passing direction of the sample road section, a vehicle starting from the adjacent road section at the starting point of that direction can pass through the calibration road section and reach the adjacent road section at the end point of that direction; and for each sample road section, taking the portion of each piece of historical traffic information that falls within the current sample road section as historical traffic data corresponding to the sample road section, wherein each sample road section corresponds to N pieces of historical traffic data, and each piece of historical traffic data comprises a traffic flow label indicating the traffic flow of the calibration road section in the current sample road section at the corresponding time step.
With reference to the first possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, preprocessing the historical traffic data corresponding to each sample road section in the historical traffic data set and determining the real-time features and global features corresponding to each sample road section includes: performing data cleaning on the historical traffic data corresponding to each sample road section in the historical traffic data set; standardizing the cleaned historical traffic data corresponding to each sample road section; performing real-time feature extraction on the standardized historical traffic data corresponding to each sample road section to determine the N real-time features corresponding to the sample road section; and performing global feature extraction on the historical traffic data of the same sample road section across all time steps to determine the M global features corresponding to the M sample road sections.
With reference to the first possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, performing feature fusion on the global features and the real-time features corresponding to each sample road section to obtain N training features corresponding to each sample road section and form a training data set containing M×N training features includes: for each sample road section, performing feature embedding on the global feature corresponding to the current sample road section and on each real-time feature by using a fully connected layer, to obtain N groups of real-time embedded features and one group of global embedded features corresponding to the current sample road section; for each group of real-time embedded features, performing feature fusion on the real-time embedded features and the corresponding global embedded features to obtain the corresponding training feature; and integrating all training features corresponding to the M sample road sections to form the training data set containing M×N training features.
With reference to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, performing feature embedding on the global feature corresponding to the current sample section and on each real-time feature by using a fully connected layer, so as to obtain N groups of real-time embedded features and one group of global embedded features corresponding to the current sample section, includes: for the ith real-time feature X_{i,j} corresponding to the current sample section j and the global feature Y_j corresponding to the current sample section j, performing feature embedding with the following formulas:
$$E^{X}_{i,j} = \sigma\left(W_X X_{i,j} + b_X\right), \qquad E^{Y}_{j} = \sigma\left(W_Y Y_j + b_Y\right)$$
wherein i ∈ [1, N], j ∈ [1, M]; E^{X}_{i,j} is the feature representation of real-time feature X_{i,j} after feature embedding; E^{Y}_{j} is the feature representation of global feature Y_j after feature embedding; W_X is the weight matrix used when embedding real-time feature X_{i,j}; W_Y is the weight matrix used when embedding global feature Y_j; b_X is the bias term used when embedding real-time feature X_{i,j}; b_Y is the bias term used when embedding global feature Y_j; and σ is the activation function.
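The embedding step described above can be illustrated with a minimal sketch. It assumes a sigmoid activation, 3-dimensional inputs, 2-dimensional embeddings, and toy parameter values; none of these are specified in the application, and the names `dense`, `W_X`, `b_X`, `W_Y`, and `b_Y` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic activation, used here as an assumed choice for sigma."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, bias, x, activation=sigmoid):
    """Fully connected layer: activation(W x + b), with W given as a list of rows."""
    return [
        activation(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


# Hypothetical toy parameters (in practice these would be learned).
W_X = [[0.5, -0.2, 0.1], [0.0, 0.3, 0.4]]
b_X = [0.0, 0.1]
W_Y = [[0.2, 0.2, 0.2], [-0.1, 0.0, 0.1]]
b_Y = [0.05, 0.0]

# One sample section with N = 2 real-time features and one global feature.
realtime_features = [[1.0, 2.0, 3.0], [0.5, 0.0, 1.5]]
global_feature = [2.0, 1.0, 0.0]

# N groups of real-time embedded features and one group of global embedded features.
realtime_embedded = [dense(W_X, b_X, x) for x in realtime_features]
global_embedded = dense(W_Y, b_Y, global_feature)
```

Each sample section thus yields N embedded real-time vectors and a single embedded global vector, which is the input shape the subsequent fusion step expects.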
With reference to the fifth possible implementation manner of the first aspect, in a sixth possible implementation manner of the first aspect, performing feature fusion on the real-time embedded feature and the corresponding global embedded feature to obtain the corresponding training feature includes: fusing the real-time embedded feature and the global embedded feature with a fusion formula [formula missing in source],
wherein the result is the ith training feature corresponding to sample section j, i.e. the training feature obtained by fusing the real-time embedded feature with the global embedded feature; α is the attention weight; and the formula further comprises a nonlinear transformation of the real-time embedded feature, a nonlinear transformation of the global embedded feature, a weight parameter corresponding to the real-time embedded feature, a weight parameter corresponding to the global embedded feature, a hysteresis (lag) term of the real-time embedded feature at time step t, t ∈ [1, N], and a hysteresis (lag) term of the global embedded feature at sample section s.
With reference to the sixth possible implementation manner of the first aspect, in a seventh possible implementation manner of the first aspect, the attention weight α satisfies a formula [formula missing in source],
wherein the quantity defined is the kth dimension parameter of the attention weight α at time step t; F(·) is the attention function; h(t-1) is the hidden state of the previous time step; the formula uses the kth dimension parameter of the ith training feature corresponding to sample section j; C is the total number of dimensions of the training feature; and the formula also uses the l-th dimension parameter of the ith training feature corresponding to sample section j.
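The quantities listed for the attention weight (a score F computed from the previous hidden state and one feature dimension, normalized over all C dimensions) are consistent with a softmax-style normalization. The sketch below shows that reading only as an assumption; the application's exact formula is not reproduced in this text, and the scoring function `attention_score` is hypothetical.

```python
import math


def softmax(scores):
    """Normalize scores so the resulting weights are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_score(hidden, z_k):
    """Hypothetical F(h(t-1), z_k): scale the feature dimension by the mean hidden state."""
    return z_k * sum(hidden) / len(hidden)


def attention_weights(hidden, training_feature):
    """One weight per feature dimension k = 1..C, normalized over all C dimensions."""
    return softmax([attention_score(hidden, z) for z in training_feature])


hidden_prev = [0.2, 0.4, 0.6]       # h(t-1), toy values
training_feature = [1.0, 3.0, 2.0]  # C = 3 dimensions
alpha = attention_weights(hidden_prev, training_feature)
```

Under this assumed form, dimensions with a higher score receive a larger share of the attention, and the C weights always sum to 1.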
With reference to the sixth possible implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, the weight parameter corresponding to the real-time embedded feature satisfies a formula [formula missing in source], which itself contains a further weight parameter.
With reference to the sixth possible implementation manner of the first aspect, in a ninth possible implementation manner of the first aspect, the weight parameter corresponding to the global embedded feature satisfies a formula [formula missing in source], which itself contains a further weight parameter.
The beneficial effects are that:
1. In this scheme, to construct the traffic flow prediction model, the monitoring area is divided into M sample road sections, and the historical traffic data set is used to determine the historical traffic data (including the corresponding traffic flow labels) for each sample road section; the historical traffic data is preprocessed, and the real-time features and global features corresponding to each sample road section are determined; feature fusion is then performed to obtain N training features for each sample road section, forming a training data set containing M×N training features; and a GRU model is constructed, trained, and tested to obtain the trained traffic flow prediction model. Combining the real-time features with the global features and predicting with the GRU model can improve the accuracy of traffic flow prediction and make the prediction result more reliable. Predicting from real-time traffic data better reflects dynamic changes in traffic flow, and fusing global features with real-time features makes full use of both historical and current data and improves the model's perception of complex traffic conditions, so that the influence of congestion, accidents, and other emergencies is taken into account and prediction accuracy is improved.
2. The feature embedding and feature fusion design lets the model make better use of the relationship between global features and real-time features and improves feature expression, strengthening the model's ability to weigh different factors together. By performing weighted fusion of the real-time embedded features and the global embedded features, the model can consider real-time information at the current moment and historical global information at the same time, combining the advantages of both. This improves the model's overall grasp of traffic data characteristics and lets it understand road conditions more completely. An attention weight is introduced to dynamically learn the relative importance of the real-time embedded features and the global embedded features, and weighted fusion is performed according to that importance, so that the model adaptively attends to the feature information most critical to the current prediction task, improving its adaptability and generalization across scenarios. Applying nonlinear transformations to the real-time embedded features and the global embedded features introduces more nonlinearity, strengthens the modeling of complex relationships among features, improves the expressive power of the model, fits the nonlinear characteristics of the data better, and improves prediction accuracy. The way the training features are formed, together with the way the area is divided (each sample road section comprises a calibration road section and its adjacent road sections), helps the model capture the temporal dependence and long-term memory of the data and better understand its time-series characteristics, improving the model's understanding and prediction of how traffic data changes over time. The weight parameters of the real-time embedded features and the global embedded features can be learned from the training data (the weight allocation scheme designed in this scheme can also be used), so the weights adjust automatically to the characteristics of the data, making the feature fusion process more flexible and adaptable to different data distributions.
3. The calculation formula of the attention weight lets the model dynamically adjust the weights according to the importance of different features and focus on the information most critical to the current prediction task. By computing over the hidden state and the training features, the model can determine the importance of each feature from the relationship between historical information and current features, which improves the model's sensitivity to data features and strengthens prediction accuracy. The weight parameter of the real-time embedded feature is designed with the tanh function: as the difference between time step t and feature index i (the training feature corresponding to the ith time step) increases, the weight gradually decreases, so the model pays more attention to recent feature information and depends less on long-range features, capturing short-term trends in the data better and becoming more sensitive to real-time information. The weight parameter of the global embedded feature takes different values according to the position s of sample road section j in the overall sequence, where M is the total number of sample road sections, so the model can assign different weights to road sections at different positions (mainly distinguishing, along the passing direction, the influence of the adjacent road sections ahead from that of the adjacent road sections behind). This better separates the contributions of different road sections to the prediction task and lets the model attend to the road section information that matters most to the overall prediction, improving generalization and the understanding of global information, ultimately improving the accuracy of traffic flow prediction, improving applicability at the application level (such as driving route planning and guiding the design of intelligent traffic management), and relieving traffic congestion.
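The recency behaviour described for the tanh-based weight can be illustrated as follows. The exact formula is not reproduced in this text, so the form `1 - tanh(|t - i| / scale)` and the `scale` parameter are assumptions chosen only to show a weight that shrinks as the gap between time step t and feature index i grows.

```python
import math


def recency_weight(t: int, i: int, scale: float = 5.0) -> float:
    """Assumed illustrative form: equals 1 when i == t and decays toward 0 as |t - i| grows."""
    return 1.0 - math.tanh(abs(t - i) / scale)


# Weights for the features of time steps 6..10 when predicting at time step t = 10.
weights = [recency_weight(10, i) for i in range(6, 11)]
```

With this form the most recent feature receives the full weight and older features contribute progressively less, which matches the stated goal of emphasizing short-term trends.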
To make the above objects, features, and advantages of the present application easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be regarded as limiting its scope; a person skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a flowchart of constructing a traffic flow prediction model.
Fig. 2 is a flowchart of a traffic flow prediction method based on big data according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
Because the big-data-based traffic flow prediction method of this scheme relies mainly on the traffic flow prediction model constructed in this scheme, the process of constructing that model is introduced first to aid understanding.
Referring to fig. 1, fig. 1 is a flowchart for constructing a traffic flow prediction model. In the present embodiment, constructing the traffic flow prediction model may include step S11, step S12, step S13, step S14, step S15, step S16.
In order to construct the traffic flow prediction model, step S11 may be performed first.
Step S11: acquiring a historical traffic data set, wherein the historical traffic data set comprises historical traffic information collected over N consecutive time steps in a monitoring area.
In this embodiment, historical traffic information may be collected over N consecutive time steps in the monitoring area; the length of a time step, for example 1 minute or 10 minutes, is not limited. In theory, to maintain high prediction accuracy, the time span of data collection is preferably 1 year or more. Because that data volume is very large, this embodiment takes a time span of 1 month as an example, and long-term features are gradually refined through continuous learning and updating while the model is in use (that is, the global features can be updated and optimized as the model is used, or can be designed to be updated periodically). The historical traffic information includes data collected by sensors (such as traffic cameras, geomagnetic sensors, and radars), GPS data (sent by GPS devices mounted on vehicles), geographic information system (GIS) data, weather data (rainfall, wind speed, visibility, etc.), event data (traffic accidents, construction projects, special events, etc.), vehicle data (vehicle running state and running data, etc.), and the like.
After obtaining the historical traffic information, step S12 may be performed.
Step S12: dividing the monitoring area to form M sample road segments, and determining the historical traffic data corresponding to each sample road segment, wherein each piece of historical traffic data corresponding to a sample road segment comprises a corresponding traffic flow label indicating the traffic flow of the sample road segment at that time step, each sample road segment comprises a calibration road segment and the adjacent road segments connected to the calibration road segment, and in each passing direction of the sample road segment, a vehicle starting from the adjacent road segment at the starting point of that direction can pass through the calibration road segment and reach the adjacent road segment at the end point of that direction.
In this embodiment, to enable flow prediction for local road segments within the monitoring area, the monitoring area may be divided to form M sample road segments, and the historical traffic data corresponding to each sample road segment is determined.
For example, the monitoring area may be divided into M segments.
For each road segment, the current road segment may be taken as the calibration road segment, and the calibration road segment together with its adjacent road segments is determined as one sample road segment, giving M sample road segments in total. In this embodiment, the road segments directly connected to the calibration road segment are taken as the adjacent road segments; in other embodiments, to further improve prediction accuracy, a wider range may be used, for example the three nearest road segments connected to the calibration road segment in each passing direction. Each sample road segment comprises one calibration road segment and the adjacent road segments connected to it, and in each passing direction of the sample road segment, a vehicle starting from the adjacent road segment at the starting point of that direction can pass through the calibration road segment and reach the adjacent road segment at the end point of that direction.
Then, for each sample road segment, the portion of each piece of historical traffic information that falls within the current sample road segment is taken as historical traffic data for that sample road segment. Each sample road segment corresponds to N pieces of historical traffic data, and each piece includes a traffic flow label indicating the traffic flow of the calibration road segment in the current sample road segment at the corresponding time step. This yields M×N pieces of historical traffic data for the M sample road segments (N pieces per sample road segment).
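The division into sample road segments and the resulting M×N labelled records can be sketched as follows. The road network, the segment names, the flow values, and the record layout are all hypothetical; the sketch uses the directly connected segments as the adjacent segments, as in this embodiment.

```python
# Hypothetical road network: segment id -> directly connected segment ids.
adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
}

# Hypothetical per-time-step flow observations for every segment (N = 2 time steps).
flows_by_step = [
    {"A": 10, "B": 25, "C": 7},
    {"A": 12, "B": 30, "C": 9},
]


def build_sample_segments(adjacency):
    """One sample segment per road segment: the calibration segment plus its neighbours."""
    return [
        {"calibration": seg, "adjacent": sorted(neighbours)}
        for seg, neighbours in adjacency.items()
    ]


def build_history(samples, flows_by_step):
    """N records per sample segment; the label is the calibration segment's flow."""
    history = []
    for sample in samples:
        members = [sample["calibration"], *sample["adjacent"]]
        for step, flows in enumerate(flows_by_step):
            history.append({
                "calibration": sample["calibration"],
                "step": step,
                "flows": {seg: flows[seg] for seg in members},
                "label": flows[sample["calibration"]],
            })
    return history


samples = build_sample_segments(adjacency)        # M = 3 sample segments
history = build_history(samples, flows_by_step)   # M x N = 6 records
```

Because every road segment serves once as a calibration segment, neighbouring sample segments overlap, so the same observation can appear as the label in one record and as context in another.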
After determining the historical traffic data corresponding to each sample road segment, step S13 may be performed.
Step S13: preprocessing historical traffic data corresponding to each sample road section in the historical traffic data set, determining real-time features and global features corresponding to each sample road section, and obtaining N real-time features corresponding to each sample road section and M global features corresponding to M sample road sections.
In this embodiment, data cleaning may be performed on the historical traffic data corresponding to each sample road section in the historical traffic data set, for example handling missing values and identifying and eliminating abnormal values, so as to ensure the reliability and accuracy of the data.
Then, the cleaned historical traffic data corresponding to each sample road section is standardized, for example by logarithmic transformation and normalization.
After standardization, real-time feature extraction can be performed on the standardized historical traffic data corresponding to each sample road section to determine the N real-time features corresponding to that sample road section, giving MN real-time features in total. Feature extraction here may include, for example, aggregation, extraction of statistical features (such as mean and variance), and construction of space-time features (the adjacent road sections in each passing direction are combined with the calibration road section, and each passing direction contributes part of the feature parameters), which are integrated into the real-time features corresponding to the sample road section.
In addition, global feature extraction can be performed on the historical traffic data of the same sample road section over all time steps to determine the M global features corresponding to the M sample road sections. In the training phase, this embodiment takes data spanning 1 to 3 months as an example and extracts the global features from it. After the model has been trained and put into use, the global feature of each sample road section can be updated as the time span of the data grows, so that the accuracy of the model improves further over time.
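A minimal sketch of the cleaning, standardization, and feature extraction steps for a single flow series is given below. The median fill, the clipping factor, the trailing window, and all function names are assumptions made for illustration; the text names the kinds of operation but not their parameters, and the space-time features built from adjacent road sections are omitted here.

```python
import math
from statistics import mean, median, pvariance

def clean(flows, clip_factor=5.0):
    """Fill missing readings (None) with the median and clip extreme values."""
    valid = [v for v in flows if v is not None]
    med = median(valid)
    return [min(med if v is None else v, clip_factor * med) for v in flows]

def standardize(flows):
    """Logarithmic transformation followed by min-max normalization to [0, 1]."""
    logged = [math.log1p(v) for v in flows]
    lo, hi = min(logged), max(logged)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a constant series
    return [(v - lo) / span for v in logged]

def real_time_features(series, window=3):
    """One (mean, variance) pair per time step over a trailing window."""
    feats = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        feats.append((mean(chunk), pvariance(chunk)))
    return feats

def global_feature(series):
    """A single vector summarising all time steps of one sample road section."""
    return (mean(series), pvariance(series), max(series))

raw = [120, None, 135, 5000, 128, 140]   # N = 6 time steps, one gap, one outlier
series = standardize(clean(raw))
rt = real_time_features(series)          # N real-time features
gl = global_feature(series)              # one global feature
```

Each sample road section ends up with N real-time features and one global feature, matching the counts stated above.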
After obtaining the N real-time features corresponding to each sample section and the M global features corresponding to the M sample sections, step S14 may be executed.
Step S14: performing feature fusion on the global feature and the real-time features corresponding to each sample road section to obtain N training features corresponding to each sample road section, and forming a training data set containing MN training features.
In the present embodiment, for each sample road section:
A fully connected layer can be used to perform feature embedding on the global feature and on each real-time feature corresponding to the current sample road section, yielding N groups of real-time embedded features and one group of global embedded features corresponding to the current sample road section.
For example, the i-th real-time feature $X_i^j$ corresponding to the current sample road section j and the global feature $Y_j$ corresponding to the current sample road section j may be feature-embedded using the following formulas:
$$\tilde{X}_i^j = \sigma\left(W_X X_i^j + b_X\right), \qquad \tilde{Y}_j = \sigma\left(W_Y Y_j + b_Y\right)$$
where $i \in [1, N]$ and $j \in [1, M]$; $\tilde{X}_i^j$ is the feature representation of the real-time feature $X_i^j$ after feature embedding; $\tilde{Y}_j$ is the feature representation of the global feature $Y_j$ after feature embedding; $W_X$ is the weight matrix used when embedding the real-time feature $X_i^j$; $W_Y$ is the weight matrix used when embedding the global feature $Y_j$; $b_X$ is the bias term used when embedding the real-time feature $X_i^j$; $b_Y$ is the bias term used when embedding the global feature $Y_j$; and σ is the activation function (the Sigmoid function is taken as an example in this embodiment).
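The embedding step, a fully connected layer followed by a Sigmoid activation, can be illustrated with a short sketch. The layer sizes, the random initialisation, and the names `dense_sigmoid` and `init_layer` are assumptions; in practice the weights and biases would be learned during training.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense_sigmoid(weights, bias, x):
    """Fully connected layer followed by Sigmoid: sigma(W x + b)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def init_layer(out_dim, in_dim, rng):
    """Hypothetical initialisation: small random weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

rng = random.Random(0)
W_x, b_x = init_layer(4, 2, rng)   # real-time features: 2 inputs -> 4 outputs
W_y, b_y = init_layer(4, 3, rng)   # global feature: 3 inputs -> 4 outputs

real_time = [(0.2, 0.01), (0.5, 0.03), (0.4, 0.02)]   # N = 3 real-time features
global_feat = (0.35, 0.02, 1.0)

rt_embedded = [dense_sigmoid(W_x, b_x, x) for x in real_time]  # N embedded groups
gl_embedded = dense_sigmoid(W_y, b_y, global_feat)             # one embedded group
```

Embedding both kinds of feature into the same dimension (4 here) is a design choice of this sketch that makes the later fusion step straightforward.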
For each set of real-time embedded features: the real-time embedded features and the corresponding global embedded features can be subjected to feature fusion to obtain corresponding training features.
In particular, the real-time embedded feature $\tilde{X}_i^j$ and the global embedded feature $\tilde{Y}_j$ can be fused using the following formula:
where $Z_i^j$ is the i-th training feature corresponding to sample road section j, that is, the training feature obtained by fusing the real-time embedded feature $\tilde{X}_i^j$ with the global embedded feature $\tilde{Y}_j$, and α is the attention weight. The other terms of the formula are: a nonlinear transformation of $\tilde{X}_i^j$; a nonlinear transformation of $\tilde{Y}_j$; the weight parameter corresponding to $\tilde{X}_i^j$; the weight parameter corresponding to $\tilde{Y}_j$; the hysteresis term of $\tilde{X}_i^j$ at time step t, with $t \in [1, N]$; and the hysteresis term of $\tilde{Y}_j$ at sample road section s.
The attention weight α satisfies:
$$\alpha_k^{(t)} = \frac{\exp\left(F\left(h^{(t-1)}, Z_{i,k}^{j}\right)\right)}{\sum_{l=1}^{C} \exp\left(F\left(h^{(t-1)}, Z_{i,l}^{j}\right)\right)}$$
where $\alpha_k^{(t)}$ is the k-th dimension parameter of the attention weight α at time step t, F(·) is the attention function, $h^{(t-1)}$ is the hidden state of the previous time step, $Z_{i,k}^{j}$ is the k-th dimension parameter of the i-th training feature corresponding to sample road section j, C is the total number of dimensions of the training feature, and $Z_{i,l}^{j}$ is the l-th dimension parameter of the i-th training feature corresponding to sample road section j.
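The terms defined here (an attention function F, the previous hidden state, and a total over the C dimensions) suggest a softmax-style normalization, although the text does not state this outright. Under that assumption, and with a purely hypothetical scoring function standing in for F, the computation might look like this:

```python
import math

def score(h_prev, z_k):
    """Hypothetical attention function F: scales z_k by the mean hidden state."""
    return z_k * sum(h_prev) / len(h_prev)

def attention_weights(h_prev, z):
    """Softmax of F(h_prev, z_k) over the C dimensions of one training feature."""
    scores = [score(h_prev, z_k) for z_k in z]
    peak = max(scores)                          # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

h_prev = [0.1, 0.4, 0.3]        # hidden state of the previous time step
z = [0.2, 0.9, 0.5, 0.1]        # one training feature with C = 4 dimensions
alpha = attention_weights(h_prev, z)
```

The weights are positive and sum to one, so each dimension receives a share of attention proportional to its exponentiated score.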
The weight parameter corresponding to the real-time embedded feature satisfies:
The weight parameter corresponding to the global embedded feature satisfies:
Then, all training features corresponding to the M sample road sections may be integrated to form a training data set containing MN training features.
After obtaining the training data set, step S15 may be run.
Step S15: the training data set is divided into a training set and a test set.
In this embodiment, the training data set is divided at a ratio of 8.5:1.5 into a training set (85%) and a test set (15%). The division must be made in equal proportion within each sample road section, so that every sample road section in the monitoring area can subsequently be both trained on and tested, which avoids sample imbalance.
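An equal-proportion split per sample road section can be sketched as below. Keeping the time order and placing the most recent 15% in the test set is an assumption of this sketch; the text fixes only the ratio and the per-section proportionality.

```python
def split_by_section(dataset, train_ratio=0.85):
    """Split each sample road section's features at the same ratio.

    `dataset` is a hypothetical mapping: section id -> list of N training
    features in time order.
    """
    train, test = {}, {}
    for section, feats in dataset.items():
        cut = round(len(feats) * train_ratio)
        train[section] = feats[:cut]
        test[section] = feats[cut:]
    return train, test

dataset = {section: list(range(20)) for section in ("s1", "s2", "s3")}
train, test = split_by_section(dataset)
```

Because the cut is applied inside every sample road section, each section contributes to both the training set and the test set.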
After obtaining the training set and the test set, step S16 may be performed.
Step S16: constructing a GRU model, and training and testing the GRU model with the training set and the test set to obtain a trained traffic flow prediction model.
In this embodiment, a GRU model may be constructed. A GRU (Gated Recurrent Unit) is a recurrent neural network model suitable for processing time series data, and the update formulas of the gates in the model are as follows:
Reset gate:
$$r^{(t)} = \sigma\left(W_r \cdot \left[h^{(t-1)}, x_j^{(t)}\right] + b_r\right)$$
where $r^{(t)}$ is the reset gate at time step t, σ is the activation function (again the Sigmoid function), $W_r$ is the weight of the reset gate, $h^{(t-1)}$ is the hidden state at time step t-1, $x_j^{(t)}$ is the training feature at time step t corresponding to sample road section j (the training feature obtained above, rewritten to fit the parametric description of the GRU model), and $b_r$ is the bias of the reset gate.
Update gate:
$$z^{(t)} = \sigma\left(W_z \cdot \left[h^{(t-1)}, x_j^{(t)}\right] + b_z\right)$$
where $z^{(t)}$ is the update gate at time step t, σ is the activation function (again the Sigmoid function), $W_z$ is the weight of the update gate, $h^{(t-1)}$ is the hidden state at time step t-1, $x_j^{(t)}$ is the training feature at time step t corresponding to sample road section j, and $b_z$ is the bias of the update gate.
Candidate hidden state:
$$\tilde{h}^{(t)} = \tanh\left(W_k \cdot \left[r^{(t)} \odot h^{(t-1)}, x_j^{(t)}\right] + b_k\right)$$
where $\tilde{h}^{(t)}$ is the candidate hidden state at time step t, $W_k$ is the weight corresponding to the candidate hidden state, ⊙ denotes element-wise multiplication, and $b_k$ is the bias corresponding to the candidate hidden state.
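Hidden state update:
$$h^{(t)} = \left(1 - z^{(t)}\right) \odot h^{(t-1)} + z^{(t)} \odot \tilde{h}^{(t)}$$
where $h^{(t)}$ is the hidden state at time step t, $z^{(t)}$ is the update gate, $h^{(t-1)}$ is the hidden state at time step t-1, and $\tilde{h}^{(t)}$ is the candidate hidden state.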
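For illustration, one GRU update with the reset gate, update gate, candidate hidden state, and hidden state update named above can be written in plain Python. The sketch assumes the common GRU form in which each weight acts on the concatenation of the previous hidden state and the input, and in which the update gate weights the candidate state; the opposite interpolation convention is also in common use, and the parameter dictionary layout is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Matrix-vector product plus bias."""
    return [sum(w * vi for w, vi in zip(row, v)) + bi for row, bi in zip(W, b)]

def gru_step(x, h_prev, p):
    """One GRU update; weights act on the concatenation [h_prev, x]."""
    hx = h_prev + x
    r = [sigmoid(a) for a in affine(p["W_r"], hx, p["b_r"])]   # reset gate
    z = [sigmoid(a) for a in affine(p["W_z"], hx, p["b_z"])]   # update gate
    rh = [ri * hi for ri, hi in zip(r, h_prev)]                # element-wise product
    cand = [math.tanh(a) for a in affine(p["W_k"], rh + x, p["b_k"])]
    # Assumed convention: z weights the candidate, (1 - z) the previous state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

hidden, inputs = 2, 3
params = {name: zeros(hidden, hidden + inputs) for name in ("W_r", "W_z", "W_k")}
params.update({name: [0.0] * hidden for name in ("b_r", "b_z", "b_k")})

# With all-zero parameters both gates equal 0.5 and the candidate is 0,
# so the new hidden state is half of the previous one.
h = gru_step([0.3, 0.1, 0.7], [0.4, -0.2], params)
```

In practice a deep learning framework's built-in GRU layer would normally be used; the hand-written step only makes the gate arithmetic visible.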
The mean square error is selected as the loss function of the GRU model.
Finally, the constructed GRU model is trained with the training set and tested with the test set to obtain the trained traffic flow prediction model.
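The mean square error between predicted and labelled traffic flows is simple to state in code. The function name and the sample values below are illustrative only.

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between predictions and flow labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical predictions and traffic flow labels for three time steps.
loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Squaring penalises large errors more heavily than small ones, which suits flow prediction where occasional large misses matter most.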
After the trained traffic flow prediction model is obtained, it can be deployed on a server together with the program that implements the traffic flow prediction method based on big data, so that the server can run the method.
Referring to fig. 2, fig. 2 is a flowchart of a traffic flow prediction method based on big data according to an embodiment of the present application. In the present embodiment, the traffic flow prediction method based on big data may include step S21, step S22, step S23, step S24.
First, the server may run step S21.
Step S21: acquiring real-time traffic data of a target road section, wherein the target road section comprises a monitoring road section on which flow prediction is performed and the adjacent road sections connected to the monitoring road section, and in each passing direction of the target road section, the adjacent road section at the end point of the passing direction can be reached by starting from the adjacent road section at the start point of the passing direction and passing through the monitoring road section.
In this embodiment, in order to predict the traffic flow of a monitoring road section, real-time traffic data of the target road section (as defined in step S21) must be acquired. The real-time traffic data includes data collected by sensors (such as traffic cameras, geomagnetic sensors, and radar), GPS data (data transmitted by GPS devices mounted on vehicles), geographic information system (GIS) data, weather data (rainfall, wind speed, visibility, etc.), event data (traffic accidents, construction work, special activities, etc.), vehicle data (the running state of vehicles, driving data, etc.), and the like, and corresponds to the historical traffic data of the sample road sections described above.
After obtaining the real-time traffic data of the target road segment, the server may run step S22.
Step S22: preprocessing the real-time traffic data of the target road section, and determining the real-time feature corresponding to the target road section.
In this embodiment, the server may preprocess the real-time traffic data of the target road section, for example by data cleaning and standardization, which are not described again here. Real-time feature extraction is then performed to obtain the real-time feature corresponding to the target road section.
After obtaining the real-time feature corresponding to the target road segment, the server may execute step S23.
Step S23: acquiring the global feature corresponding to the target road section, and performing feature fusion on the global feature and the real-time feature corresponding to the target road section to obtain the input feature corresponding to the target road section.
In this embodiment, the server may obtain the global feature corresponding to the target road segment, and since the target road segment corresponds to one sample road segment in the monitoring area, the global feature corresponding to the sample road segment may be directly obtained as the global feature of the target road segment.
After the global features corresponding to the target road segments are obtained, feature fusion can be carried out on the global features corresponding to the target road segments and the real-time features to obtain the input features corresponding to the target road segments. The feature fusion process of the global feature and the real-time feature may specifically refer to the process shown in the foregoing step S14, which is not described herein again.
After obtaining the input features, the server may run step S24.
Step S24: inputting the input feature corresponding to the target road section into a preset traffic flow prediction model, and determining the traffic flow prediction result corresponding to the target road section through the traffic flow prediction model.
In this embodiment, the input feature may be input into the preset traffic flow prediction model, and the traffic flow prediction result corresponding to the target road section is determined through the model. In practical applications, when the model has only just been put into use (that is, there are no preceding prediction results or input data), it is preferable to run several consecutive predictions so as to obtain a more accurate prediction result.
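The advice to predict several times in succession after start-up reflects the fact that a recurrent model carries a hidden state from one prediction to the next. A small wrapper can make that explicit. The class name, the warm-up count of three, and the toy step and readout functions are all assumptions for illustration and stand in for the trained model.

```python
class StreamingPredictor:
    """Carries the recurrent hidden state from one prediction to the next."""

    def __init__(self, step_fn, readout_fn, hidden_size, warmup_steps=3):
        self.step_fn = step_fn
        self.readout_fn = readout_fn
        self.hidden = [0.0] * hidden_size   # no history at start-up
        self.warmup_steps = warmup_steps    # hypothetical threshold
        self.seen = 0

    def predict(self, input_features):
        """Return (prediction, warmed_up) for the newest input features."""
        self.hidden = self.step_fn(input_features, self.hidden)
        self.seen += 1
        return self.readout_fn(self.hidden), self.seen >= self.warmup_steps

def toy_step(x, h):
    # Stand-in for the trained recurrent update.
    return [0.5 * hi + 0.5 * xi for hi, xi in zip(h, x)]

def toy_readout(h):
    # Stand-in for the output layer.
    return sum(h) / len(h)

predictor = StreamingPredictor(toy_step, toy_readout, hidden_size=2)
results = [predictor.predict([1.0, 1.0]) for _ in range(4)]
```

Callers can ignore predictions flagged as not yet warmed up, which mirrors the recommendation to run a few predictions before relying on the output.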
After the traffic flow prediction result is obtained, it can be passed to the application layer, for example for path planning based on the prediction result, congestion prompts, or traffic management guidance, which are not elaborated here.
In summary, the embodiment of the application provides a traffic flow prediction method based on big data. To construct the traffic flow prediction model, the monitoring area is divided into M sample road sections, and the historical traffic data (including the corresponding traffic flow labels) of each sample road section is determined from a historical traffic data set; the historical traffic data is preprocessed to determine the real-time features and the global feature corresponding to each sample road section; feature fusion then yields N training features for each sample road section, forming a training data set containing MN training features; finally, a GRU model is constructed, trained, and tested to obtain the trained traffic flow prediction model. Combining the real-time features with the global features and predicting with the GRU model can improve the accuracy of traffic flow prediction and make the prediction result more reliable. Predicting from real-time traffic data better reflects dynamic changes in traffic flow, while fusing the global features with the real-time features makes full use of both historical and current data and improves the model's ability to perceive complex traffic conditions, so that the influence of congestion, accidents, and other emergencies is taken into account.
The feature embedding and feature fusion design allows the model to make better use of the relation between the global features and the real-time features and improves feature expression, which strengthens the model's ability to weigh different factors together. By performing weighted fusion of the real-time embedded features and the global embedded features, the model can consider both the real-time information at the current moment and the historical global information, combining the advantages of each. This improves the model's overall grasp of the characteristics of traffic data and lets it understand road conditions more comprehensively. The attention weight is introduced to dynamically learn the relative importance of the real-time embedded features and the global embedded features and to weight the fusion accordingly, so that the model adaptively attends to the feature information most critical to the current prediction task, which improves its adaptability and generalization in different scenarios. Applying nonlinear transformations to the real-time embedded features and the global embedded features introduces additional nonlinearity, strengthens the modeling of complex relations among features, improves the expressive power of the model, fits the nonlinear characteristics of the data better, and improves prediction accuracy.
The way the training features are formed, together with the area division mode (each sample road section comprises a calibration road section and its adjacent road sections), helps the model capture temporal dependence and long-term memory in the data and better understand its time-series characteristics, which improves the model's ability to understand and predict how traffic data changes over time. The weight parameters of the real-time embedded features and the global embedded features can be learned from the training data (the weight allocation scheme designed in this solution can also be used), so that they adjust automatically to the characteristics of the data, making the feature fusion process more flexible and adaptable to different data distributions.
The calculation formula of the attention weight enables the model to dynamically adjust the weights according to the importance of different features and to focus on the information most critical to the current prediction task. By computing over the hidden state and the training features, the model can determine the importance of each feature from the relation between historical information and current features, which improves its sensitivity to data features and enhances prediction accuracy. The weight parameter corresponding to the real-time embedded feature is designed using the tanh function: as the difference between the time step t and the feature index i (the training feature corresponding to the i-th time step) increases, the weight gradually decreases, so that the model pays more attention to recent feature information and relies less on long-past features, captures short-term trends in the data better, and is more sensitive to real-time information.
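The behaviour described for the tanh-based weight, largest when the time step equals the feature index and shrinking as the two move apart, can be illustrated with one function that has this property. The specific form `1 - tanh(|t - i| / scale)` and the `scale` parameter are hypothetical and are not the formula of this application, which is not reproduced in the text.

```python
import math

def recency_weight(t, i, scale=3.0):
    """Hypothetical tanh-based weight: 1 at t == i, smaller as |t - i| grows."""
    return 1.0 - math.tanh(abs(t - i) / scale)

# Weights of time steps 1..10 relative to the feature at index i = 10.
weights = [recency_weight(t, i=10) for t in range(1, 11)]
```

Any function with the same monotone decay would serve the stated purpose of favouring recent feature information.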
The weight parameter corresponding to the global embedded feature takes different values according to the position s of sample road section j in the whole sequence, where M is the total number of sample road sections. The model can therefore assign different weights to road sections at different positions (mainly distinguishing, according to the passing direction, the influence of the adjacent road sections ahead from that of the adjacent road sections behind), better distinguish the contributions of different road sections to the prediction task, and attend to the road section information that matters most to the overall prediction. This improves the generalization ability of the model and its understanding of global information, ultimately improves the accuracy of traffic flow prediction and its applicability at the application layer (such as driving path planning and guidance for intelligent traffic management design), and helps relieve traffic congestion.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (10)
1. A traffic flow prediction method based on big data, comprising:
Acquiring real-time traffic data of a target road section, wherein the target road section comprises a monitoring road section for flow prediction and adjacent road sections connected to the monitoring road section, and in each passing direction of the target road section, the adjacent road section at the end point of the passing direction can be reached by starting from the adjacent road section at the start point of the passing direction and passing through the monitoring road section;
preprocessing real-time traffic data of a target road section, and determining real-time characteristics corresponding to the target road section;
acquiring global features corresponding to the target road segments, and carrying out feature fusion on the global features corresponding to the target road segments and the real-time features to obtain input features corresponding to the target road segments;
Inputting the input feature corresponding to the target road section into a preset traffic flow prediction model, and determining the traffic flow prediction result corresponding to the target road section through the traffic flow prediction model.
2. The traffic flow prediction method based on big data according to claim 1, wherein before acquiring real-time traffic data, a traffic flow prediction model is constructed by:
Acquiring a historical traffic data set, wherein the historical traffic data set comprises historical traffic information collected at N consecutive time steps in a monitoring area;
Dividing the monitoring area to form M sample road sections, and determining historical traffic data corresponding to each sample road section, wherein each piece of historical traffic data corresponding to each sample road section comprises a corresponding traffic flow label indicating the traffic flow of the sample road section at the corresponding time step, each sample road section comprises a calibration road section and adjacent road sections connected to the calibration road section, and in each passing direction of the sample road section, the adjacent road section at the end point of the passing direction can be reached by starting from the adjacent road section at the start point of the passing direction and passing through the calibration road section;
preprocessing historical traffic data corresponding to each sample road section in the historical traffic data set, and determining real-time features and global features corresponding to each sample road section to obtain N real-time features corresponding to each sample road section and M global features corresponding to M sample road sections;
Feature fusion is carried out on the global features and the real-time features corresponding to each sample road section, N training features corresponding to each sample road section are obtained, and a training data set containing MN training features is formed;
dividing the training data set into a training set and a testing set;
Constructing a GRU model, and training and testing the GRU model with the training set and the test set to obtain a trained traffic flow prediction model.
3. The traffic flow prediction method based on big data according to claim 2, wherein dividing the monitoring area to form M sample segments, determining historical traffic data corresponding to each sample segment, includes:
Carrying out road section division on the monitoring area to obtain M road sections;
For each road section, taking the current road section as a calibration road section, and determining the calibration road section together with its adjacent road sections as one sample road section, so as to obtain M sample road sections in total, wherein each sample road section comprises one calibration road section and the adjacent road sections connected to the calibration road section, and in each passing direction of the sample road section, the adjacent road section at the end point of the passing direction can be reached by starting from the adjacent road section at the start point of the passing direction and passing through the calibration road section;
For each sample road section, taking the information in each piece of historical traffic information that corresponds to the current sample road section as the historical traffic data corresponding to the sample road section, wherein each sample road section corresponds to N pieces of historical traffic data, and each piece of historical traffic data comprises a traffic flow label indicating the traffic flow of the calibration road section in the current sample road section at the corresponding time step.
4. The big data based traffic flow prediction method according to claim 2, wherein preprocessing the historical traffic data corresponding to each sample section in the historical traffic data set and determining the real-time feature and the global feature corresponding to each sample section comprises:
Carrying out data cleaning on the historical traffic data corresponding to each sample road section in the historical traffic data set;
Normalizing the historical traffic data corresponding to each cleaned sample road section;
Carrying out real-time feature extraction on the standardized historical traffic data corresponding to each sample road section, and determining N real-time features corresponding to the sample road sections;
and carrying out global feature extraction on the historical traffic data of the same sample road section in all time steps, and determining M global features corresponding to the M sample road sections.
5. The traffic flow prediction method based on big data according to claim 2, wherein the feature fusion is performed on the global feature and the real-time feature corresponding to each sample road segment to obtain N training features corresponding to each sample road segment, and a training data set including MN training features is formed, and the method includes:
for each sample road section: performing feature embedding on the global feature and each real-time feature corresponding to the current sample road section by using a fully connected layer to obtain N groups of real-time embedded features and one group of global embedded features corresponding to the current sample road section;
for each group of real-time embedded features: performing feature fusion on the real-time embedded features and the corresponding global embedded features to obtain the corresponding training feature;
integrating all training features corresponding to the M sample road sections to form a training data set containing MN training features.
6. The traffic flow prediction method based on big data according to claim 5, wherein feature embedding is performed on global features and each real-time feature corresponding to a current sample section by using a full connection layer, so as to obtain N sets of real-time embedded features and a set of global embedded features corresponding to the current sample section, and the method comprises:
the i-th real-time feature $X_i^j$ corresponding to the current sample road section j and the global feature $Y_j$ corresponding to the current sample road section j are feature-embedded using the following formulas:
$$\tilde{X}_i^j = \sigma\left(W_X X_i^j + b_X\right), \qquad \tilde{Y}_j = \sigma\left(W_Y Y_j + b_Y\right)$$
where $i \in [1, N]$ and $j \in [1, M]$; $\tilde{X}_i^j$ is the feature representation of the real-time feature $X_i^j$ after feature embedding; $\tilde{Y}_j$ is the feature representation of the global feature $Y_j$ after feature embedding; $W_X$ is the weight matrix used when embedding the real-time feature $X_i^j$; $W_Y$ is the weight matrix used when embedding the global feature $Y_j$; $b_X$ is the bias term used when embedding the real-time feature $X_i^j$; $b_Y$ is the bias term used when embedding the global feature $Y_j$; and σ is the activation function.
7. The traffic flow prediction method based on big data according to claim 6, wherein feature fusion is performed between the real-time embedded feature and the corresponding global embedded feature to obtain the corresponding training feature, comprising:
the real-time embedded feature $\tilde{X}_i^j$ and the global embedded feature $\tilde{Y}_j$ are fused using the following formula:
where $Z_i^j$ is the i-th training feature corresponding to sample road section j, that is, the training feature obtained by fusing the real-time embedded feature $\tilde{X}_i^j$ with the global embedded feature $\tilde{Y}_j$, and α is the attention weight; the other terms of the formula are: a nonlinear transformation of $\tilde{X}_i^j$; a nonlinear transformation of $\tilde{Y}_j$; the weight parameter corresponding to $\tilde{X}_i^j$; the weight parameter corresponding to $\tilde{Y}_j$; the hysteresis term of $\tilde{X}_i^j$ at time step t, with $t \in [1, N]$; and the hysteresis term of $\tilde{Y}_j$ at sample road section s.
8. The big data based traffic flow prediction method according to claim 7, wherein the attention weight α satisfies:
$$\alpha_k^{(t)} = \frac{\exp\left(F\left(h^{(t-1)}, Z_{i,k}^{j}\right)\right)}{\sum_{l=1}^{C} \exp\left(F\left(h^{(t-1)}, Z_{i,l}^{j}\right)\right)}$$
where $\alpha_k^{(t)}$ is the k-th dimension parameter of the attention weight α at time step t, F(·) is the attention function, $h^{(t-1)}$ is the hidden state of the previous time step, $Z_{i,k}^{j}$ is the k-th dimension parameter of the i-th training feature corresponding to sample road section j, C is the total number of dimensions of the training feature, and $Z_{i,l}^{j}$ is the l-th dimension parameter of the i-th training feature corresponding to sample road section j.
9. The traffic flow prediction method based on big data according to claim 7, wherein the weight parameter corresponding to the real-time embedded feature satisfies:
wherein the formula contains a further weight parameter.
10. The traffic flow prediction method based on big data according to claim 7, wherein the weight parameter corresponding to the global embedded feature satisfies:
wherein the formula contains a further weight parameter.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410319851.6A CN118072518A (en) | 2024-03-20 | 2024-03-20 | Traffic flow prediction method based on big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118072518A (en) | 2024-05-24 |
Family
ID=91111087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410319851.6A Withdrawn CN118072518A (en) | 2024-03-20 | 2024-03-20 | Traffic flow prediction method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118072518A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20240524 |