CN113160563B - Short-time road condition prediction method, system and equipment based on historical road conditions

Info

Publication number
CN113160563B
Authority
CN
China
Prior art keywords
data
time
road condition
road
target point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110352681.8A
Other languages
Chinese (zh)
Other versions
CN113160563A (en)
Inventor
赵玺
田文斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202110352681.8A
Publication of CN113160563A
Application granted
Publication of CN113160563B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G08: SIGNALLING
    • G08G: TRAFFIC CONTROL SYSTEMS
    • G08G1/00: Traffic control systems for road vehicles
    • G08G1/01: Detecting movement of traffic to be counted or controlled
    • G08G1/0104: Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125: Traffic data processing
    • G08G1/0129: Traffic data processing for creating historical data or processing based on historical data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067: Enterprise or organisation modelling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Abstract

The invention discloses a short-time road condition prediction method, system and equipment based on historical road conditions. The method first down-samples the data in the time domain at the macro level to construct homogeneous data. It then designs, at the micro level, road condition features that contain the spatio-temporal relations of road network points and hotspot semantics: a shallow fully connected network captures the semantic features of a target point in micro space-time, a multilayer neural network constructs the time-series features, and the spatio-temporal data and semantic features are fused and fed into a GBDT ensemble learning model trained with cross validation. Neighboring data around the target point is down-sampled, and the semantic features of the target point in micro space-time are captured by constructing spatio-temporal data.

Description

Short-time road condition prediction method, system and equipment based on historical road conditions
Technical Field
The invention belongs to the technical field of traffic road condition prediction, and particularly relates to a short-time road condition prediction method, a system and equipment based on historical road conditions.
Background
Road condition prediction has been studied for decades, and short-time road condition prediction has high practical and commercial value in many important fields, such as route planning, taxi scheduling, and ambulance arrival time prediction. It is nevertheless difficult, for two reasons: 1. the data fluctuates strongly and the traffic flow spectrum is unstable; 2. abnormal road conditions and the hotspot semantic information of the target point have a large influence on the result, and this hotspot information often fluctuates randomly and is hard to generalize with a fixed filter or feature structure. Short-time road condition prediction has been approached in three ways: methods based on physical rules, methods based on probabilistic graph statistics, and data-driven methods. Methods based on physical rules start from the physical laws of traffic flow displacement and compute the amount of traffic information in the near future from the standpoint of physical motion. Being entirely rule-based, they fit abnormal data and large volumes of data poorly, their only parameter is the movement information of the traffic flow, and their application value is low. Methods based on probabilistic graph statistics have been widely used over the past decade; they include Markov chain inference, Bayesian prediction, and the autoregressive integrated moving average (ARIMA) method. These methods have achieved significant results on particular data and in particular fields and have been applied successfully in commercial practice.
However, methods based on probability statistics have two disadvantages: 1) simple probabilistic models have few parameters and cannot handle more complex data sources; 2) because the models are simple, they cannot guarantee good generalization, and in particular they do not work well for today's complex, changeable road conditions and for abnormal road condition points (semantic hotspots) that break out at high frequency. Data-driven methods ignore the intrinsic physical rules of traffic flow and represent traffic information with a very large number of parameters; they also subsume the probabilistic graph approach, in that the parameters are used to simulate the trend of the probabilities.
Most recent research on road condition prediction is data-driven, and methods based on deep convolutional neural networks (CNN), time-series neural networks (LSTM), and graph neural networks (GNN) appear one after another. CNN-based methods learn the static traffic flow information of the road conditions in space and learn the mined traffic flow features in hidden layers in order to predict ETA and travel time. In other words, they predict future road conditions from static road conditions only, in an essentially statistical way, and do not consider the road conditions of a specific road section in the time intervals before and after the time point. Moreover, a plain CNN does not take the topology of the road network around a specific road section into account; that is, it largely ignores the upstream and downstream road sections. In fact, when a road section is congested, the probability of congestion upstream or downstream of it is clearly higher than for non-congested sections. Finally, the position data of the traffic flow is loaded directly into the network model, which implicitly assumes that the data is accurate. Position data, especially for vehicles moving at high speed, has a certain error whether it is derived from mobile phone signaling or from GPS navigation, and in a traffic flow network sampled at high frequency the error in traffic information between adjacent time slices is even larger.
To address these problems of CNNs, graph networks have been applied successfully to road condition prediction. A GNN takes the topological relationships within the road network into account: the upstream and downstream sections of each road section form a topological structure with that section, and the trend of traffic flow between graph nodes is represented by node weights. A GNN therefore computes from the perspective of node relationships and considers both the static information and the topological information of the traffic flow. However, such GNNs ignore dynamic traffic information and share the other disadvantages of CNNs. To solve this problem, spatio-temporal graph convolutional networks (STGCN) have been applied to road condition prediction. On top of spatial convolution, STGCN considers the road condition information of the road network within a specific time interval and thus predicts future road conditions more effectively. Although STGCN performs well on the road condition prediction problem, it also raises notable questions: 1) Its structure is bloated and feature construction is left entirely to the model; this may make the model too large, and it is unclear whether node relations constructed blindly between road sections really fit the traffic information. 2) STGCN computes traffic flow using random node sampling, namely random walks and first-order neighbors, and it is unclear whether this is consistent with real traffic flow patterns. 3) The network model loads GPS position data of the traffic flow, and it is unclear whether systematic errors in that data bias the analysis results.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a short-time road condition prediction method based on historical road conditions. The method is based on bidirectional spatio-temporal topology and target-point semantic extraction. It needs only the state information of the traffic flow, not its position information, uses the static information of road sections to represent the movement of traffic through the road network, does not require a large-scale graph model or network model for support, and still ensures high precision.
To achieve this purpose, the invention adopts the following technical scheme: a short-time road condition prediction method based on historical road conditions, comprising the following steps:
down-sampling the data in the time domain to construct homogeneous data;
constructing temporal-neighbor data of the target point in micro space-time, and standardizing, dimension-unifying, and scaling the temporal-neighbor data to obtain the microscopic semantic features of the road condition information;
defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point;
constructing macroscopic historical data from the homogeneous data, and performing label mapping on the mean-down-sampled road condition states to obtain the macroscopic historical periodic features;
fusing the macroscopic historical periodic features with the microscopic semantics of the target point to obtain the loading features;
inputting the loading features into a gradient boosting decision tree (GBDT) algorithm, randomly sampling the loading features, and performing multiple rounds of training with k-fold cross validation to obtain an inference model;
and, after obtaining the loading features of the road condition to be predicted, inputting them into the inference model to obtain the prediction result for the short-time road condition semantics of the future road section.
Down-sampling the data in the time domain to construct homogeneous data specifically comprises: on the macroscopic spatio-temporal scale, capturing the road condition information of the same road section in adjacent time intervals over a first set historical time period and, in addition, capturing the neighboring data of the same road section on each working day over a second set time period, where the first set time period is 7 days and the second set time period is one month; preprocessing the neighboring data, cutting the preprocessed data into spatio-temporally distributed data, converting text-format data into numerical data, and storing the result as the master data for feature construction; and obtaining the macro-period data of each record in the master data and down-sampling it by mean down-sampling to obtain the homogeneous data.
Defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point proceeds as follows: a multilayer linear network is constructed; for each selected target point, the road condition information I in the spatio-temporal neighborhood of the target point is extracted, the road condition of the target point is used as the label, and the spatio-temporal neighborhood data is used as the data set; the data set is split 8:2 into a training set and a validation set, and the DNN is trained on it; after training, historical road conditions are classified into three categories: smooth, congested, and very congested.
The multilayer neural network is structured as follows: several shallow linear networks form the hidden-variable extraction dimension, an activation layer follows each shallow network, a batch standardization layer follows the activation layer, and the output layer is a linear layer; a set containing one shallow linear network is treated as a network module, and three such modules are designed as feature extractors.
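The layer ordering described above (linear layer, then activation, then batch standardization, repeated for three modules and closed by a linear output layer) can be sketched in plain Python. This is only an illustrative forward pass under stated assumptions: the layer widths, the use of ReLU as the activation, the random initial weights, and the helper names `make_layer`, `batch_norm`, and `forward` are not taken from the patent, and no training step is shown.

```python
import math
import random

def linear(batch, weights, bias):
    """Apply y = W x + b to every row; weights holds one row per output unit."""
    return [[sum(x * w for x, w in zip(row, unit)) + b
             for unit, b in zip(weights, bias)] for row in batch]

def relu(batch):
    """Assumed activation: the source does not name the activation function."""
    return [[max(0.0, v) for v in row] for row in batch]

def batch_norm(batch, eps=1e-5):
    """Standardize each feature over the batch (no learned scale or shift)."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n + eps)
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in batch]

def make_layer(rng, in_dim, out_dim):
    """Random weights and zero bias for one linear layer (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def forward(batch, modules, output_layer):
    """Three modules of linear, activation, batch standardization; linear output."""
    for weights, bias in modules:
        batch = batch_norm(relu(linear(batch, weights, bias)))
    return linear(batch, *output_layer)

rng = random.Random(0)
dims = [6, 16, 16, 8]                      # assumed layer widths
modules = [make_layer(rng, dims[i], dims[i + 1]) for i in range(3)]
output_layer = make_layer(rng, dims[-1], 3)  # three road condition classes
batch = [[rng.random() for _ in range(dims[0])] for _ in range(4)]
logits = forward(batch, modules, output_layer)
```

Each row of `logits` holds three scores, one per road condition category; in practice a deep learning framework would supply the layers and the training loop.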
The road condition information comprises legal information of a road section and physical information of the road section; the traffic information represents the amount of traffic information at a specific point in space-time, with the data structure S = [day, link, id, I], where S is the traffic road condition information, day is the date of the target point, link is the road section number of the target point, id is the time slice of the target point within the day (each time slice is 2 min), and I is the traffic information amount, specifically I = [eta, v, label, cars]; the upstream links (up) and downstream links (down) of each link are obtained to form the topological structure of that link, which is expressed as:
$L = \{l_i, L_{up}, L_{down}\}$, $L_{up} = \{l_{up1}, l_{up2}, \ldots, l_{upn}\}$, $L_{down} = \{l_{down1}, l_{down2}, \ldots, l_{downm}\}$
where $l_i$ denotes the $i$-th target road section and $L_{up}$ denotes the upstream sections of $l_i$; $L_{up}$ is composed of the specific road sections $l_{upj}$, and the same holds for the downstream sections;
for the traffic information info(day, id, I) of the target point, the traffic information of the same weekday and adjacent time slices in the past month is acquired, specifically:
$\{info(day', id', I)\}$, $day' \in \{-7, -14, -21, -28\}$ (in days), $id' = id \pm r$, $r \in \{1, 2, 3\}$
the above formula gives the temporal-neighbor road condition information of each target point on the same weekday in each of the past four weeks; this information is grouped by day, and each group is mean-down-sampled, turning the discrete data into continuous data:
Figure GDA0003731558550000051
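The neighbor selection and mean down-sampling described above can be illustrated with a short Python sketch. It assumes the history is a dictionary keyed by `(day, slice_id)` holding a single numeric quantity (for example the speed v); that storage layout and the function names `neighbor_slices` and `mean_downsample` are hypothetical, and the exact formula in the original figure is not reproduced here.

```python
from statistics import mean

DAY_OFFSETS = (-7, -14, -21, -28)  # same weekday in each of the past four weeks
SLICE_RADIUS = 3                   # r in {1, 2, 3}

def neighbor_slices(slice_id, radius=SLICE_RADIUS):
    """Return the time slices id' = id +/- r for r = 1..radius."""
    return [slice_id + sign * r
            for r in range(1, radius + 1) for sign in (-1, 1)]

def mean_downsample(history, day, slice_id):
    """Average the available neighbor values for each past day.

    Records that are missing from `history` are skipped, so a day with no
    data at all simply does not appear in the result.
    """
    grouped = {}
    for offset in DAY_OFFSETS:
        values = [history[(day + offset, s)]
                  for s in neighbor_slices(slice_id)
                  if (day + offset, s) in history]
        if values:
            grouped[offset] = mean(values)
    return grouped

# Hypothetical speeds observed around slice 100 on earlier days.
history = {(23, 99): 30.0, (23, 101): 40.0, (16, 103): 20.0}
features = mean_downsample(history, day=30, slice_id=100)
# features == {-7: 35.0, -14: 20.0}
```

Averaging over the available neighbors gives one value per past day even when individual time slices are missing, which matches the stated goal of turning sparse discrete records into a continuous feature.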
when constructing macroscopic historical data from the homogeneous data and performing label mapping on the mean-down-sampled road condition states to obtain the macroscopic historical periodic features, two time periods of different scales are defined to extract the macroscopic historical periodic features of the traffic information:
the first time period is the four occurrences of the same weekday within one month; the second time period is the seven days within one week;
for the first time period, the road condition information of the target on each same weekday in the past month is obtained and down-sampled, giving the historical time feature based on the first time period, which is added to the feature group; for the second time period, the road condition information of the target in its neighboring time domain on every day of the past week is first obtained,
and then down-sampled to give the historical time feature based on the second time period, which is added to the feature group to obtain:
$F = [Link, cid, fid, I'_{-1week}, \ldots, I'_{-4week}, I'_{-1day}, \ldots, I'_{-6day}]$.
the microscopic semantic features of the road condition information are obtained as follows:
a shallow fully connected network is designed to capture the road condition information amount of the target point over a past period: the road condition information of the target point in the past n periods is first obtained; the normalized road condition information is then used as the input of the shallow fully connected network, with the current road condition information of the target point as the label, and the network is trained; the road condition information amount I is combined with the physical quantities of the road, and the resulting cross information between the target point's road section and its road condition is used as the input feature;
the extracted macroscopic features are combined with the microscopic semantics of the target point to predict the road condition: the information amounts of the past 5 time slices are input into the shallow network, which outputs the road condition information amount of each time slice; the network is trained by cross validation, and the training result is applied to measuring the current road condition information amount.
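As an illustration of how the past 5 time slices can be paired with the current value as a label, the following Python sketch builds supervised samples from a single series. The function name `make_windows` and the example speed values are hypothetical, and the normalization step mentioned in the text is omitted for brevity.

```python
def make_windows(series, window=5):
    """Pair each run of `window` past values with the value that follows it."""
    samples = []
    for end in range(window, len(series)):
        inputs = series[end - window:end]  # the past `window` time slices
        label = series[end]                # the current time slice
        samples.append((inputs, label))
    return samples

# Hypothetical per-slice information amounts (e.g. average speed in km/h).
speeds = [30, 32, 31, 28, 25, 22, 20]
samples = make_windows(speeds)
# samples[0] == ([30, 32, 31, 28, 25], 22)
# samples[1] == ([32, 31, 28, 25, 22], 20)
```

Each `(inputs, label)` pair could then be fed to the shallow network as one training example.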
The invention also provides a short-time road condition prediction system based on historical road conditions, which comprises a homogeneous data module, a microscopic semantic feature acquisition module, a target-point microscopic semantic extraction module, a macroscopic historical periodic feature extraction module, a loading feature acquisition module, a training module and a prediction module;
the homogeneous data module is used for down-sampling the data in the time domain to construct homogeneous data;
the microscopic semantic feature acquisition module is used for constructing temporal-neighbor data of the target point in micro space-time, and for standardizing, dimension-unifying, and scaling the temporal-neighbor data to obtain the microscopic semantic features of the road condition information;
the target-point microscopic semantic extraction module is used for defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point;
the macroscopic historical periodic feature extraction module is used for constructing macroscopic historical data from the homogeneous data and performing label mapping on the mean-down-sampled road condition states to obtain the macroscopic historical periodic features;
the loading feature acquisition module is used for fusing the macroscopic historical periodic features with the microscopic semantics of the target point to obtain the loading features;
the training module is used for inputting the loading features into a gradient boosting decision tree (GBDT) algorithm, randomly sampling the loading features, and performing multiple rounds of training with k-fold cross validation to obtain an inference model;
the prediction module is used for obtaining the loading features of the road condition to be predicted and inputting them into the inference model to obtain the prediction result for the short-time road condition semantics of the future road section.
A computer device comprises a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads part or all of the computer executable program from the memory and executes the computer executable program, and when the processor executes part or all of the computer executable program, the short-time road condition prediction method based on historical road conditions can be realized.
A computer readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program can implement the method for predicting a short-term traffic condition based on historical traffic conditions according to the present invention.
Compared with the prior art, the invention has at least the following beneficial effects: the method builds its model from the abstract semantics of the traffic flow and the road network and does not consider the position of a vehicle on a concrete road (a wide road is forty-five meters wide, and the way a vehicle is represented at different pixel positions affects how well a model fits); instead, it constructs more general features from road physical parameters and traffic flow state data and then models and represents them. Traffic flow data recorded by GPS suffers from inherent defects such as position drift, the ping-pong effect, and distortion under high-speed traffic flow; the invention neither uses nor depends on the position data of the traffic flow and road network and starts only from the movement pattern and microscopic semantics of vehicles, so it is more tolerant of data errors;
the method fits real data better and predicts road conditions more stably. Conventional modeling based on detailed real-time traffic flow statistics tends to reduce model generalization; the invention instead uses the gradient boosting decision tree algorithm as an ensemble model and feeds the microscopic semantics computed by the deep neural network into the tree model, which makes its predictions more stable.
Furthermore, the bidirectional road network captures semantic features and captures congestion semantics at variable scales, so sequences can be predicted more accurately. Compared with approaches that predict road condition sequences from macroscopic periods only, traffic flow networks that consider only temporal relations, and models that consider only the spatial distribution of the road network, the method builds features from the macroscopic time scale, the microscopic hotspot semantics, and the spatio-temporal relations of the hotspot neighborhood, and thus comprehensively considers the semantic information and hidden-variable features of road condition events.
Drawings
Fig. 1a is a schematic diagram illustrating an example road network and its upstream road conditions.
Fig. 1b is a schematic diagram of a road network congestion condition.
FIG. 1c is a schematic diagram illustrating an exemplary road network and its downstream road conditions.
FIG. 1d is a schematic diagram of a topological segment including both upstream and downstream branches.
Fig. 2 is a schematic diagram of the daily road conditions on each Wednesday within a month.
Fig. 3 is a schematic diagram of road condition summary in 720 time periods in a day.
Fig. 4 is a schematic diagram of tag classification of road conditions.
FIG. 5 is a model structure diagram of the algorithm.
FIG. 6 is a training process of a semantic capture network.
FIG. 7 is the loss iteration process of LSTM short-time road condition prediction.
Fig. 8 is the accuracy (acc) iteration process of LSTM short-time road condition prediction.
FIG. 9 shows the cross entropy loss reduction process of the short-term prediction of the GBDT model integration algorithm.
Detailed Description
Traffic flow information prediction is widely applied and has high commercial value, for example in ride-hailing, bus dispatching, private car travel, and ambulance route planning. Short-time road condition prediction, however, tends to fit poorly and tolerate faults badly, which lowers its application value. For short-time road condition prediction, the invention designs an upstream and downstream data construction method based on macroscopic historical periods and microscopic bidirectional road network topological semantics. First, at the macro level, the data is down-sampled in the time domain to construct homogeneous data. Second, at the micro level, road condition features containing the spatio-temporal relations of road network points and hotspot semantics are designed: a shallow fully connected network captures the microscopic semantics of the target point in micro space-time, and a multilayer neural network constructs the time-series features, that is, the historical periodic features. The microscopic semantics and historical periodic features are fused and fed into a GBDT ensemble learning model, and one month of Didi ride-hailing data from a certain city is predicted using cross validation. The approach balances computational performance and timeliness, does not require a large-scale graph model or network model for support, and still ensures high precision; see Fig. 5.
The invention provides a lightweight prediction algorithm based on bidirectional spatio-temporal topology and target-point semantic extraction. By simply constructing spatio-temporal data, it designs a topological feature group containing target point details; the feature group contains a bidirectional topological road network in space and an elastic time-domain road network in time. Because the data comes from a real-world data set and carries natural disturbance and errors, a batch standardization layer is added after each linear layer of the DNN to reduce overfitting and make the network converge quickly. The designed data structure represents the historical time-series information of the target point and the spatial hotspot semantic information of the target point well, and therefore generalizes better to abnormal road conditions at the target point.
To cope with missing data and structural complexity, a DNN is used to construct hidden variables and unify the dimensions of the constructed data; an ensemble learning model then learns and infers from the designed spatio-temporal road network features, and the training result is output. The method is easy to apply to practical needs such as engineering deployment and algorithm tuning.
The concrete implementation is as follows:
step 1, down-sampling the data in the time domain to construct homogeneous data;
step 2, constructing temporal-neighbor data of the target point in micro space-time, making the data dimensionless, unifying its dimensions, and scaling it to obtain the microscopic semantic features of the road condition information;
step 3, defining a shallow semantic catcher based on a multilayer neural network (DNN), and feeding the microscopic semantic feature data into the DNN to capture the microscopic semantics of the target point; the DNN is structured as follows: several shallow linear networks form the hidden-variable extraction dimension, an activation layer follows each shallow network, a batch standardization layer follows the activation layer, and the output layer is a linear layer. A set containing one shallow linear network is treated as a network module, and three such modules are designed as feature extractors.
Step 4, constructing macroscopic historical data from the homogeneous data, and performing label mapping on the mean-down-sampled road condition states to obtain the macroscopic historical periodic features;
step 5, fusing the macroscopic historical periodic features with the microscopic semantics of the target point and inputting them into the gradient boosting decision tree (GBDT) algorithm;
step 6, randomly sampling the loaded features and performing multiple rounds of training with k-fold cross validation, k = 5, to obtain the optimal trained model;
step 7, saving the trained model; obtaining the input features of the road condition to be predicted through steps 1 to 5, and feeding them into the model obtained in step 6 for inference to obtain the prediction result for the short-time road condition semantics of the future road section.
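The k-fold cross validation with k = 5 used in step 6 can be sketched as index splitting in plain Python. Only the splitting is shown; the GBDT training itself would come from a gradient boosting library and is left out. The function name `k_fold_indices`, the fixed seed, and the sample count are assumptions for illustration.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, valid) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # random sampling of the features
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        valid = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid

splits = list(k_fold_indices(10, k=5))
for round_no, (train, valid) in enumerate(splits, start=1):
    # A model would be fitted on `train` and scored on `valid` in each round.
    print(f"round {round_no}: {len(train)} training rows, {len(valid)} validation rows")
```

Every sample is used for validation exactly once across the five rounds, and the model with the best validation score can be kept as the inference model.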
Homogeneous data is constructed from the ride-hailing data source so that the original data has the dimensions and format required by the method: the preprocessed original data is cut into spatio-temporally distributed data, text-format data is converted into numerical data, and the result is stored as the master data for feature construction; the macro-period data of each record in the master data is then obtained and down-sampled by mean down-sampling to give the homogeneous data.
On the macroscopic spatio-temporal scale, the method captures the road condition information of the same road section in adjacent time intervals over a first set historical time period and, in addition, captures the neighboring data of the same road section on each working day over a second set time period, where the first set time period is 7 days and the second set time period is one month.
Referring to FIGS. 1a to 1d, the road network data are identified by desensitized road segment identifiers, and the road condition information includes the legal information and the physical information of each road segment. The traffic information represents the amount of traffic information at a specific point in space and time. Its data structure is S = [day, link, id, I], where S is the traffic road condition information, day is the date of the target point, link is the road segment number of the target point, id is the time slice of the target point within the day (one time slice is 2 minutes), and I is the traffic information amount, specifically I = [eta, v, label, cars]. The upstream and downstream segments of each link are obtained to form the topological structure of that link, which is represented as

$$L=\{l_i,\ L_{up},\ L_{down}\},\quad L_{up}=\{l_{up1},\ l_{up2},\ \dots,\ l_{upn}\},\quad L_{down}=\{l_{down1},\ l_{down2},\ \dots,\ l_{downm}\}$$

where $l_i$ denotes the i-th target road section and $L_{up}$ denotes the upstream sections of $l_i$, made up of the individual sections $l_{upj}$; the downstream sections $L_{down}$ are defined in the same way.
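The record S = [day, link, id, I] and the link topology L described above can be sketched as plain Python data classes. This is only an illustration: the class and field names, the example values, and the convention that time slices are numbered from 0 starting at midnight are assumptions, and the text does not define the units of eta, v or cars.

```python
from dataclasses import dataclass, field
from typing import List

SLICE_MINUTES = 2                            # one time slice is 2 minutes (from the text)
SLICES_PER_DAY = 24 * 60 // SLICE_MINUTES    # 720 slices per day


@dataclass
class TrafficInfo:
    """I = [eta, v, label, cars]; units are not given in the text."""
    eta: float
    v: float
    label: int
    cars: int


@dataclass
class Sample:
    """S = [day, link, id, I]."""
    day: str            # date of the target point
    link: int           # road segment number of the target point
    slice_id: int       # time slice of the target point within the day
    info: TrafficInfo


@dataclass
class LinkTopology:
    """L = {l_i, L_up, L_down}."""
    link: int
    upstream: List[int] = field(default_factory=list)
    downstream: List[int] = field(default_factory=list)


def slice_id_of(hour: int, minute: int) -> int:
    """Map a clock time to a slice index (assumed: slice 0 starts at 00:00)."""
    return (hour * 60 + minute) // SLICE_MINUTES


# Hypothetical example values, for illustration only.
sample = Sample("2019-07-15", 1001, slice_id_of(8, 30),
                TrafficInfo(eta=0.9, v=35.0, label=1, cars=12))
topo = LinkTopology(1001, upstream=[1000, 998], downstream=[1002])
```

Keeping the topology separate from the per-slice record mirrors the text, where the upstream and downstream sets are static properties of a link while I changes with every time slice.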
The invention extracts road network data on two spatio-temporal scales, one month and one week. Because road condition information does not always exist at a historical spatio-temporal position, the invention constructs the macroscopic historical periodic features by mean down-sampling: for each batch of data at a spatio-temporal coordinate, the mean values of v, eta and cars are calculated separately. In addition, general congestion, congestion and smooth traffic occur in different proportions in road condition prediction, and in practical application scenarios it matters more whether a road section is congested than whether it is smooth, so the invention weights the labels. Specifically, the original category labels are mapped as label = {0, 1, 2, 3, 4} → label = {NaN, 0.2, 0.4, 0.6, 0.8}. A label of 0 means that no traffic flow estimate exists for the link, so it is assigned NaN; the labels 1 to 4, which range from smooth through general congestion to extraordinary congestion, are assigned the weights 0.2, 0.4, 0.6 and 0.8 respectively; see FIG. 4.
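The label weighting and mean down-sampling described above can be sketched as follows. The mapping table comes directly from the text; skipping NaN entries when averaging is an assumption about how missing records are handled, and the function names are hypothetical.

```python
import math
from typing import Iterable

# Label mapping from the text: 0 (no estimate) -> NaN, 1..4 -> 0.2..0.8.
LABEL_WEIGHT = {0: math.nan, 1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8}


def weight_label(label: int) -> float:
    """Replace an original category label by its weight."""
    return LABEL_WEIGHT[label]


def nan_mean(values: Iterable[float]) -> float:
    """Mean down-sampling that ignores NaN (assumed treatment of missing data).

    Returns NaN when no valid value is present.
    """
    kept = [x for x in values if not math.isnan(x)]
    return sum(kept) / len(kept) if kept else math.nan


# Four records in one spatio-temporal batch; the second has no estimate.
labels = [1, 0, 3, 2]
weighted = [weight_label(x) for x in labels]
mean_label = nan_mean(weighted)   # average of 0.2, 0.6 and 0.4
```

The same `nan_mean` helper could be applied to v, eta and cars, which the text says are averaged separately.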
For the road condition information info (day, id, I) of the target point, obtaining the road condition information of the adjacent time interval of the same working day in the past month, specifically:
{info(day',id',I)},day'=[-7day,-14day,-21day,-28day],id'=[id±r],r={1,2,3}
The above formula yields the temporal-neighbor road condition information of each target point on the same working day of each of the past four weeks. This information is grouped by working day, and each group is mean down-sampled, which turns the discrete data into continuous data:
Figure GDA0003731558550000111
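The neighbor selection day' = [-7, -14, -21, -28], id' = id ± r with r = {1, 2, 3} can be illustrated with a short sketch. It does not reproduce the formula in the figure above; it uses a plain arithmetic mean per group, drops slice indices that fall outside the day, and skips missing records, all of which are assumptions. The function names and the lookup format are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple

DAY_OFFSETS = (-7, -14, -21, -28)   # same working day in each of the past four weeks
RADII = (1, 2, 3)                   # r in id' = id ± r
SLICES_PER_DAY = 720                # 2-minute slices

Key = Tuple[date, int]


def neighbor_keys(day: date, slice_id: int) -> Dict[int, List[Key]]:
    """Group the (day', id') keys of the temporal neighbors by day offset."""
    ids = sorted({slice_id + sign * r for r in RADII for sign in (-1, 1)})
    ids = [i for i in ids if 0 <= i < SLICES_PER_DAY]   # assumed boundary rule
    return {off: [(day + timedelta(days=off), i) for i in ids]
            for off in DAY_OFFSETS}


def group_means(groups: Dict[int, List[Key]],
                lookup: Dict[Key, float]) -> Dict[int, Optional[float]]:
    """Mean down-sample one quantity (for example v) within each group."""
    means: Dict[int, Optional[float]] = {}
    for off, keys in groups.items():
        vals = [lookup[k] for k in keys if k in lookup]
        means[off] = sum(vals) / len(vals) if vals else None
    return means
```

For a target point on 2019-07-29 at slice 255, the group for offset -7 contains slices 252, 253, 254, 256, 257 and 258 of 2019-07-22.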
The microscopic semantics are extracted by the shallow DNN as follows. First, a multilayer linear network is constructed. For each selected target point, the road condition information in the temporal and spatial neighborhood of the target point is extracted, giving a data set in which the road condition of the target point is the label and the spatio-temporal neighborhood data are the inputs. The data set is split 8:2 into a training set and a validation set, and the DNN is trained on it. After training, the historical road conditions are classified into three categories: good road conditions, congestion, and heavy congestion. The structure of the shallow DNN is shown in FIG. 2: several shallow linear networks form the hidden-variable extraction dimensions; each shallow network is followed by an activation layer, which is followed by a batch normalization layer; and the output layer is a linear layer. A group containing one shallow linear network is treated as a network module, and three such modules are designed as feature extractors.
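A minimal sketch of the 8:2 split into training and validation sets follows. Shuffling before the split and the fixed seed are assumptions; the text only states the ratio.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_2(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the sample indices and split them 8:2 into train and validation."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)       # assumed: random split, reproducible seed
    cut = len(idx) * 8 // 10               # integer arithmetic avoids float rounding
    train = [samples[i] for i in idx[:cut]]
    val = [samples[i] for i in idx[cut:]]
    return train, val


train_set, val_set = split_8_2(list(range(10)))
```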
The microscopic semantic features of the road condition information are as follows:
A shallow feed-forward network is designed to capture the amount of road condition information of a target point over a past period. First, the road condition information of the target point in the past n periods (n = 10) is acquired; then a shallow fully connected network is designed, with the normalized road condition information as the network input and the current road condition information of the target point as the label for training. The road condition information amount I is combined with the road condition physical quantity to obtain cross information between road section and road condition as an input feature.
The extracted macroscopic features are combined with the mined microscopic road condition semantics to predict the road condition: the information amounts of the past 5 time slices are input into the shallow network, which outputs the road condition information amount of each time slice; the network is trained by cross-validation, and the training result is applied to measuring the information amount of the current road condition.
The microscopic semantic features are imported into the macroscopic feature module of the feature matrix. Macroscopic historical data are constructed based on the homogeneous data, and the road condition states are label-mapped after mean down-sampling to obtain the macroscopic historical periodic features.
The microscopic semantics and the macroscopic historical periodic features are fused and input into the gradient boosting decision tree (GBDT) algorithm to predict road conditions. The information amounts of the past 5 time slices are input into the shallow network, and the output is the road condition of each time slice. The shallow network is trained by cross-validation, and the optimal trained model parameters are saved, deployed and packaged for inference on large batches of user data.
The specific implementation process of the invention is as follows:
The vehicle road condition data of a certain city for July 2019 are obtained from the publicly released DiDi GAIA Initiative:
https://outreach.didichuxing.com/app-vue/outreach.didichuxing.comid =1022
The original data are preprocessed and cut into spatio-temporal distribution data, text-format data are converted into numerical data, and the processed data are stored as the master data for feature construction. Macro-period data are acquired for each record in the master data and down-sampled by mean down-sampling to obtain the homogeneous data, namely the microscopic spatio-temporal neighbor data of a target point.
For each road section to be predicted, the microscopic time data of the past 5 time slices are acquired and combined with the data of the macroscopic time period; duplicate attributes are removed, and the result is down-sampled to obtain the microscopic spatio-temporal neighbor data of the currently predicted road section (the target point).
A shallow semantic capturer is defined based on a multilayer neural network (DNN): a multilayer shallow perceptron is designed, with batch normalization after each layer. The microscopic spatio-temporal neighbor data are loaded into the neural network and trained over multiple rounds to obtain the microscopic semantic features. The macroscopic spatio-temporal historical periodic features are taken as input features, and the microscopic road condition semantic features obtained by training are added to them to give the final input feature data used for training and deployment.
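The step of taking the past 5 time slices and merging them with macro-period attributes while dropping duplicates can be sketched as below. The attribute names, the rule that the window excludes the current slice, and the rule that the microscopic value wins when an attribute appears twice are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

WINDOW = 5   # number of past time slices (from the text)


def past_slices(series: Dict[int, float], slice_id: int,
                window: int = WINDOW) -> List[Optional[float]]:
    """Values for slices slice_id-window .. slice_id-1, oldest first.

    Missing slices are returned as None.
    """
    return [series.get(i) for i in range(slice_id - window, slice_id)]


def merge_attributes(micro: Dict[str, float],
                     macro: Dict[str, float]) -> Dict[str, float]:
    """Merge two attribute sets, keeping one value per attribute name.

    Assumed rule: on a duplicate name the microscopic value is kept.
    """
    merged = dict(macro)
    merged.update(micro)
    return merged


# Hypothetical speed series keyed by slice index; slice 252 is missing.
speeds = {250: 30.0, 251: 28.0, 253: 25.0, 254: 22.0}
window = past_slices(speeds, 255)
micro = {f"v_t-{WINDOW - k}": v for k, v in enumerate(window) if v is not None}
macro = {"v_week_mean": 27.5, "v_t-1": 99.0}
features = merge_attributes(micro, macro)
```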
The input feature data are loaded into the GBDT for training, cross-validation is carried out to obtain the training result, and the inference model obtained after training is saved for batch inference deployment.
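The cross-validated training loop can be sketched in plain Python with k = 5, the value given in step 6. The sketch only shows the fold construction and scoring; `MajorityClass` is a deliberately trivial stand-in for the GBDT model, and the helper names and toy data are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs over n samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k disjoint folds
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


class MajorityClass:
    """Placeholder model (NOT a GBDT): always predicts the most common label."""

    def fit(self, X: Sequence, y: Sequence[int]) -> "MajorityClass":
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X: Sequence) -> List[int]:
        return [self.label_ for _ in X]


def cross_validate(X: Sequence, y: Sequence[int],
                   make_model: Callable[[], MajorityClass], k: int = 5) -> List[float]:
    """Train on k-1 folds, score accuracy on the held-out fold, for each fold."""
    scores = []
    for train, val in k_fold_indices(len(X), k):
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        pred = model.predict([X[i] for i in val])
        scores.append(sum(p == y[i] for p, i in zip(pred, val)) / len(val))
    return scores


X = [[float(i)] for i in range(20)]
y = [0] * 14 + [1] * 6
scores = cross_validate(X, y, MajorityClass)
```

In practice the placeholder would be replaced by a gradient boosting implementation exposing `fit` and `predict`; the source does not name a specific library.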
The input features of the road condition to be predicted are then acquired and fed into the inference model, which predicts the short-time road condition semantics of the future road section.
By constructing spatio-temporal data in a simple way, the invention designs a topological feature group containing the details of the target point. The feature group comprises a bidirectional topological road network in space and an elastic time-domain road network in time. To give the data the shape and dimensions required for network inference, a time domain defined by hyper-parameters is designed, and the neighboring data around the target point are down-sampled to capture the semantic features of the target point at the microscopic spatio-temporal scale; here, neighboring data means the data in the temporal neighborhood of the target point. Specifically, a shallow BP network captures the semantic features upstream and downstream of the target point, and its input is the road condition features of the upstream/downstream sections around the time slice of the target point. The algorithm needs only the state information of the traffic flow, not its position information, and uses the static information of the road sections to represent the movement of the traffic flow in the road network, which makes the model tolerant of errors in the acquired position data. The data structure designed by the method represents both the historical time-series information of the road condition target points and the spatial hot-spot semantic information of the target points, so the method generalizes well to abnormal road conditions occurring at the target points. To handle missing data and structural complexity, the DNN is used to construct hidden variables and unify the dimensions of the constructed data, and an ensemble learning model then performs learning and inference on the designed spatio-temporal road network features.
The training results show that the proposed method achieves higher accuracy with lower computational requirements, and that it is easy to use in practice, for example in engineering deployment and algorithm tuning. To compare the proposed algorithm with a conventional time-series network, an LSTM was trained on the road condition information. Its input is the data of the past 5 time slices of the target point: the dynamic road condition data are concatenated with the physical data of the road section, such as length, width, number of lanes and road grade, giving 16 features in total. The road condition semantics for the next 1-60 min are predicted. FIG. 8 shows the LSTM training result: the average accuracy of the LSTM is below 65%, which indicates that using an LSTM alone is not ideal for short-time road condition prediction and supports the effectiveness of the proposed method; see FIGS. 7, 8 and 9.
TABLE 1
Layer                         In_Dim    Out_Dim
InData                        -         n×16
Linear1 + ReLU + BatchNorm    n×16      n×64
Linear2 + ReLU + BatchNorm    n×64      n×32
Linear3 + ReLU + BatchNorm    n×32      n×20
Linear4 + ReLU                n×20      n×3
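The layer dimensions of Table 1 (16 → 64 → 32 → 20 → 3) can be checked with a forward-pass sketch in plain Python. It is not the trained network: the weights are random, batch normalization uses batch statistics without learned scale or shift, and no training loop is included. Following the prose description, the final layer is left purely linear, whereas Table 1 lists a ReLU after Linear4; that choice is an assumption of this sketch.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]
DIMS = [16, 64, 32, 20, 3]   # feature widths taken from Table 1


def linear(x: Matrix, w: Matrix, b: List[float]) -> Matrix:
    """Affine map; w has shape in_dim x out_dim."""
    return [[sum(xi * w[i][j] for i, xi in enumerate(row)) + b[j]
             for j in range(len(b))] for row in x]


def relu(x: Matrix) -> Matrix:
    return [[max(0.0, v) for v in row] for row in x]


def batch_norm(x: Matrix, eps: float = 1e-5) -> Matrix:
    """Normalize each feature column to zero mean and unit variance over the batch."""
    n = len(x)
    cols = list(zip(*x))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s + eps) for v, m, s in zip(row, means, variances)]
            for row in x]


def init_params(dims: List[int], seed: int = 0) -> List[Tuple[Matrix, List[float]]]:
    """Random weights and zero biases for each consecutive pair of widths."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)],
             [0.0] * d_out)
            for d_in, d_out in zip(dims, dims[1:])]


def forward(x: Matrix, params: List[Tuple[Matrix, List[float]]]) -> Matrix:
    """Three modules of linear -> ReLU -> batch norm, then a linear output layer."""
    for w, b in params[:-1]:
        x = batch_norm(relu(linear(x, w, b)))
    w, b = params[-1]
    return linear(x, w, b)


rng = random.Random(1)
batch = [[rng.random() for _ in range(16)] for _ in range(8)]   # n = 8 samples
logits = forward(batch, init_params(DIMS))                      # shape 8 x 3
```

The three output columns correspond to the three road condition categories into which the text says the DNN classifies historical road conditions.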
TABLE 2
epochs    layer_nums    hidden_size    best_score        loss_fun
3         2             16/32/64       0.63/0.63/0.63    0.59/0.53/0.58
3         3             16/32/64       0.63/0.62/0.65    0.54/0.56/0.47
3         4             16/32/64       0.65/0.64/0.65    0.48/0.61/0.54
A computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for predicting a short-time traffic condition based on historical traffic conditions according to the present invention can be implemented.
The computer equipment can be an onboard computer, a notebook computer, a tablet computer, a desktop computer, a mobile phone or a workstation.
The processor may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or a Field-Programmable Gate Array (FPGA).
The memory of the invention can be an internal storage unit of a notebook computer, a tablet computer, a desktop computer, a mobile phone or a workstation, such as a memory and a hard disk; external memory units such as removable hard disks, flash memory cards may also be used.
Computer-readable storage media may include computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. The computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The random access memory may include a resistive random access memory (ReRAM).

Claims (7)

1. A short-time road condition prediction method based on historical road conditions is characterized by comprising the following steps:
downsampling data in a time domain to construct homogenous data;
constructing micro time-space time neighbor data of a target point, and carrying out standardized operation, dimension unification and scaling on the time neighbor data to obtain micro semantic features of road condition information; the target point is a current predicted road section;
defining a shallow semantic capturer based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point;
constructing macroscopic historical data based on the homogeneous data, and performing label mapping on road condition states after mean value down-sampling to obtain macroscopic historical periodic characteristics;
fusing the macroscopic historical period features and the microscopic semantics of the target points to obtain loading features;
inputting the obtained loading features into a gradient boosting decision tree algorithm, randomly sampling the loading features, and performing multiple rounds of training according to k-fold cross-validation to obtain an inference model;
after acquiring the loading features of the road condition to be predicted, inputting the loading features into the inference model for inference to obtain a prediction result about the short-time road condition semantics of a future road section; the down-sampling of the data in the time domain to construct homogeneous data specifically includes: capturing, on a macroscopic spatio-temporal scale, road condition information of the same road condition in adjacent time intervals within a first set time period in the past and, on the other hand, capturing neighboring data on each working day within a second set time period in the past, wherein the first set time period is 7 days and the second set time period is one month; preprocessing the neighboring data, cutting the preprocessed neighboring data into spatio-temporal distribution data, converting text-format data into numerical data, and storing the data as master data for feature construction; acquiring macro-period data of each datum in the constructed master data, down-sampling the macro-period data by mean down-sampling, and thereby obtaining the homogeneous data; the road condition information comprises legal information of a road section and physical information of the road section; the traffic information represents a specific spatio-temporal traffic information amount with the data structure S = [day, link, id, I], where S represents the traffic road condition information, day is the date of the target point, link is the road segment number of the target point, id is the time slice of the target point within one day, one time slice being 2 min, and I is the traffic flow information of the spatio-temporal neighborhood of the target point, which includes the traffic flow information of the current road condition of the target point and of its spatio-temporal neighbor nodes, specifically I = [eta, v, label, cars]; the upstream and downstream segments of each link are obtained to form the topological structure of each link, and the data structure is represented as:

$$L=\{l_i,\ L_{up},\ L_{down}\},\quad L_{up}=\{l_{up1},\ l_{up2},\ \dots,\ l_{upn}\},\quad L_{down}=\{l_{down1},\ l_{down2},\ \dots,\ l_{downm}\}$$

where $l_i$ represents the i-th target road section and $L_{up}$ represents the upstream sections of $l_i$, composed of the specific road sections $l_{upj}$; the downstream road sections are treated in the same way;
the method for acquiring the road condition information of the target point, namely the road condition information info (day, id, I) of the target point in the past month in the adjacent time interval of the same working day comprises the following steps:
{info(day',id',I)},day'=[-7day,-14day,-21day,-28day],id'=[id±r],r={1,2,3}
acquiring time neighbor road condition information of each target point in four working days in the past by the above formula, grouping the time neighbor road condition information according to the working days, carrying out mean value down-sampling on the time neighbor road condition information for each group of grouped road condition information, and changing discrete data into continuous data:
Figure FDA0003731558540000021
constructing macroscopic historical data based on the homogeneous data, performing label mapping on road condition states after mean value downsampling, and defining two time periods with different scales to extract the macroscopic historical period characteristics of the vehicle information when obtaining the macroscopic historical period characteristics:
the first time period is four days of the same working day in one month; the second time period is seven days within one week;
performing feature extraction on the first time period to obtain road condition information of each same working day of a target point in the past month, then performing down-sampling on data to obtain historical time features based on the first time period, adding the historical time features into a feature group, performing feature extraction on the second time period, and firstly obtaining the road condition information of the target point in the adjacent time domain of each day of the week:
and downsampling the data to obtain historical time characteristics based on a second time period, and adding the historical time characteristics into a characteristic group to obtain:
$$F=[\mathrm{Link},\ cid,\ fid,\ I'_{-1week},\ \dots,\ I'_{-1week},\ I'_{-1day},\ \dots,\ I'_{-6day}]$$.
2. The method as claimed in claim 1, wherein a shallow semantic capturer is defined based on a multilayer neural network, the microscopic semantic features are loaded into the shallow multilayer neural network, and the microscopic semantics of the target point are captured as follows: constructing a multilayer linear network; for each selected target point, extracting the current road condition of the target point and the traffic flow information I of its spatio-temporal neighbor nodes to obtain a data set in which the road condition of the target point is the label and the spatio-temporal neighborhood data are the inputs; splitting the data set 8:2 into a training set and a validation set; calling the DNN for training; and, after training, classifying the historical road conditions into three categories: good road conditions, congestion, and heavy congestion.
3. The method as claimed in claim 2, wherein, in the multilayer neural network, a plurality of shallow linear networks form the hidden-variable extraction dimensions, an activation layer is superimposed after each shallow network, a batch normalization layer is superimposed after the activation layer, and the output layer is a linear layer; a group containing one shallow linear network is used as a network module, and three modules are designed as feature extractors.
4. The method as claimed in claim 1, wherein the traffic information has the following micro semantic features:
designing a shallow feed-forward network to capture the road condition information amount of a target point in a past period: first acquiring the road condition information of the target point in the past n periods, then designing a shallow fully connected network, taking the normalized road condition information as the input of the network and the current road condition information of the target point as the label for training, and combining the road condition information amount with the road condition physical quantity to obtain the cross information of the road section of the target point and the road condition as the input feature;
and combining the extracted macroscopic features with the microscopic semantics of the target points to predict road conditions, inputting the information quantity of the past 5 time slices into the shallow layer network, outputting the information quantity of the road conditions of each time slice, training the network by adopting a cross validation method, and applying the training result to the measurement of the information quantity of the current road conditions.
5. The short-time road condition prediction system based on historical road conditions is characterized by comprising a homogeneous data module, a micro semantic feature acquisition module, a micro semantic extraction module of a target point, a macro historical period feature extraction module, a loading feature acquisition module, a training module and a prediction module; the target point is a current predicted road section;
the homogeneous data module is used for down-sampling data in a time domain to construct homogeneous data; the down-sampling of the data in the time domain to construct the homogeneous data specifically includes: capturing, on a macroscopic spatio-temporal scale, road condition information of the same road condition in adjacent time intervals within a first set time period in the past and, on the other hand, capturing neighboring data of the same road condition on each working day within a second set time period in the past, wherein the first set time period is 7 days and the second set time period is one month; preprocessing the neighboring data, cutting the preprocessed neighboring data into spatio-temporal distribution data, converting text-format data into numerical data, and storing the data as master data for feature construction; acquiring macro-period data of each datum in the constructed master data, and down-sampling the macro-period data by mean down-sampling to obtain the homogeneous data;
the microscopic semantic feature acquisition module is used for constructing microscopic time-space time neighbor data of the target point, and carrying out standardized operation, dimension unification and scaling on the time neighbor data to obtain microscopic semantic features of the road condition information; the road condition information comprises legal information of the road section and physical information of the road section; the traffic information represents specific space-time traffic information amount, the data structure is S = [ day, link, id, I ], S represents traffic road condition information, day is date of a target point, link is road segment number of the target point, id is time slice of the target point in one day, 2min is a time slice, I is traffic information of space-time neighborhood of the target point, specifically includes traffic information of current road condition of the target point and space-time neighbor nodes, specifically, I = [ eta, v, label, cars ], obtains link segments of upstream and downstream of each link, and constitutes a topological structure of each link, and the data structure is represented as:
$$L=\{l_i,\ L_{up},\ L_{down}\},\quad L_{up}=\{l_{up1},\ l_{up2},\ \dots,\ l_{upn}\},\quad L_{down}=\{l_{down1},\ l_{down2},\ \dots,\ l_{downm}\}$$

where $l_i$ represents the i-th target road section and $L_{up}$ represents the upstream sections of $l_i$, composed of the specific road sections $l_{upj}$; the downstream road sections are treated in the same way;
for the road condition information info (day, id, I) of the target point, obtaining the road condition information of the adjacent time interval of the same working day in the past month, specifically:
{info(day',id',I)},day'=[-7day,-14day,-21day,-28day],id'=[id±r],r={1,2,3}
acquiring time neighbor road condition information of each target point on the past four working days through the above formula, grouping the time neighbor road condition information according to the working days, carrying out mean value down-sampling on the time neighbor road condition information for each group of grouped road condition information, and changing discrete data into continuous data:
Figure FDA0003731558540000051
the microscopic semantic extraction module of the target point is used for defining a shallow semantic capturer based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point;
the macro history cycle feature extraction module is used for constructing macro history data according to the homogeneous data, and performing label mapping on road condition states after mean value down-sampling to obtain macro history cycle features; constructing macroscopic historical data based on the homogeneous data, performing label mapping on road condition states after mean value downsampling, and defining two time periods with different scales to extract the macroscopic historical period characteristics of the vehicle information when obtaining the macroscopic historical period characteristics:
the first time period is four days of the same working day in one month; the second time period is seven days within one week;
the method comprises the steps of extracting features of a first time period, obtaining road condition information of each same working day of a target in the past month, then down-sampling data, obtaining historical time features based on the first time period, adding the historical time features into a feature group, extracting features of a second time period, and firstly obtaining the road condition information of the target in a neighboring time domain of the target in the past week and every day:
and downsampling the data to obtain historical time characteristics based on a second time period, and adding the historical time characteristics into a characteristic group to obtain:
$$F=[\mathrm{Link},\ cid,\ fid,\ I'_{-1week},\ \dots,\ I'_{-1week},\ I'_{-1day},\ \dots,\ I'_{-6day}]$$;
the loading feature acquisition module is used for fusing the macroscopic history cycle features and the microscopic semantics of the target points to obtain loading features;
the training module is used for inputting the obtained loading characteristics into a gradient boosting decision tree algorithm, randomly sampling the loading characteristics, and performing multiple rounds of training according to k-fold cross validation to obtain a reasoning model;
and the prediction module is used for inputting the loading characteristics of the road conditions to be predicted into the inference model for inference after acquiring the loading characteristics of the road conditions to be predicted, so as to obtain a prediction result about the short-time road condition semantics of the future road section.
6. A computer device, comprising a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads part or all of the computer executable program from the memory and executes the computer executable program, and the processor can implement the method for predicting short-term road conditions based on historical road conditions according to any one of claims 1 to 4 when executing part or all of the computer executable program.
7. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method for predicting a short-term road condition based on historical road conditions as claimed in any one of claims 1 to 4 is implemented.
CN202110352681.8A 2021-03-31 2021-03-31 Short-time road condition prediction method, system and equipment based on historical road conditions Active CN113160563B (en)


Publications (2)

Publication Number Publication Date
CN113160563A CN113160563A (en) 2021-07-23
CN113160563B true CN113160563B (en) 2022-10-25

Family

ID=76886351


Country Status (1)

Country Link
CN (1) CN113160563B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115223363B (en) * 2022-07-05 2023-01-31 中关村科学城城市大脑股份有限公司 Traffic information wireless transmission method and system based on urban brain
CN115472006A (en) * 2022-08-26 2022-12-13 武汉大学 Method for estimating commuting traffic flow of newly added road section of road network by utilizing mobile phone signaling data
CN115374316B (en) * 2022-10-24 2023-01-03 北京科技大学 Intelligent variable-scale data analysis method and system supported by real-time decision
CN115641718B (en) * 2022-10-24 2023-12-08 重庆邮电大学 Short-time traffic flow prediction method based on bayonet flow similarity and semantic association

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI619036B (en) * 2016-02-22 2018-03-21 財團法人資訊工業策進會 Traffic time forecasting system, traffic time forecasting method and traffic model establish method
CN109272169A (en) * 2018-10-10 2019-01-25 深圳市赛为智能股份有限公司 Traffic flow forecasting method, device, computer equipment and storage medium
CN110889558B (en) * 2019-11-29 2023-06-06 北京世纪高通科技有限公司 Road condition prediction method and device
CN111986490A (en) * 2020-09-18 2020-11-24 北京百度网讯科技有限公司 Road condition prediction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113160563A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN113160563B (en) Short-time road condition prediction method, system and equipment based on historical road conditions
Bao et al. A spatiotemporal deep learning approach for citywide short-term crash risk prediction with multi-source data
Cai et al. Applying a deep learning approach for transportation safety planning by using high-resolution transportation and land use data
Essien et al. Improving urban traffic speed prediction using data source fusion and deep learning
Li et al. Graph CNNs for urban traffic passenger flows prediction
CN105354273A (en) Method for fast retrieving high-similarity image of highway fee evasion vehicle
CN113362491B (en) Vehicle track prediction and driving behavior analysis method
CN110458337B (en) C-GRU-based network appointment vehicle supply and demand prediction method
US20210182615A1 (en) Alexnet-based insulator self-explosion recognition method
Zhou et al. Spatial–temporal deep tensor neural networks for large-scale urban network speed prediction
CN113256980A (en) Road network state determination method, device, equipment and storage medium
Vuyyuru et al. A novel weather prediction model using a hybrid mechanism based on MLP and VAE with fire-fly optimization algorithm
CN114692984A (en) Traffic prediction method based on multi-step coupling graph convolution network
Wang et al. Deep learning of spatiotemporal patterns for urban mobility prediction using big data
Jiang et al. Bi-GRCN: A spatio-temporal traffic flow prediction model based on graph neural network
Krishnakumari et al. Understanding network traffic states using transfer learning
Zhong et al. Bus travel time prediction based on ensemble learning methods
Boujemaa et al. Toward road safety recommender systems: Formal concepts and technical basics
Prasad et al. An efficient traffic forecasting system based on spatial data and decision trees.
Kapoor et al. An intelligent railway surveillance framework based on recognition of object and railway track using deep learning
Xu et al. Air traffic density prediction using Bayesian ensemble graph attention network (BEGAN)
Zhang et al. Attention-driven recurrent imputation for traffic speed
Kantavat et al. Transportation mobility factor extraction using image recognition techniques
Li et al. A two-stream graph convolutional neural network for dynamic traffic flow forecasting
Zhu et al. A novel hybrid deep learning model for taxi demand forecasting based on decomposition of time series and fusion of text data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant