CN113160563B

CN113160563B - Short-time road condition prediction method, system and equipment based on historical road conditions

Info

Publication number: CN113160563B
Application number: CN202110352681.8A
Authority: CN
Inventors: 赵玺; 田文斌
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2022-10-25
Anticipated expiration: 2041-03-31
Also published as: CN113160563A

Abstract

The invention discloses a method, a system and equipment for predicting short-time road conditions based on historical road conditions. The method comprises the following steps: first, at the macroscopic level, data are down-sampled in the time domain to construct homogeneous data; second, at the microscopic level, a road condition feature containing the spatio-temporal relations of road network points and hot-spot semantics is designed, the semantic features of a target point in microscopic space-time are captured by a shallow fully connected network, and time-series features are constructed by a multilayer neural network; the spatio-temporal data and the semantic features are then fused and fed into a GBDT ensemble learning model, which is trained by cross validation. Adjacent data around the target point are down-sampled, and the semantic features of the target point in microscopic space-time are captured by constructing spatio-temporal data.

Description

Short-time road condition prediction method, system and equipment based on historical road conditions

Technical Field

The invention belongs to the technical field of traffic road condition prediction, and particularly relates to a short-time road condition prediction method, a system and equipment based on historical road conditions.

Background

Road condition prediction has been studied for decades, and short-time road condition prediction has high practical and commercial value in many important fields, such as route planning, taxi scheduling and ambulance arrival time prediction. Short-time prediction is nevertheless difficult, for two reasons: 1. in short-time road condition prediction the data fluctuate strongly and the traffic flow spectrum is unstable; 2. abnormal road conditions and the hot-spot semantic information of the target point have a larger influence on the road condition in short-time prediction, and the hot-spot semantic information of a target point often fluctuates randomly and is difficult to generalize with a fixed filtering scheme or feature structure. Existing short-time road condition prediction follows three approaches: methods based on physical rules, methods based on probability graph statistics, and data-driven methods. The physical-rule methods rely on the physical laws of traffic flow displacement and compute the amount of traffic information in the near future from the viewpoint of physical motion. Such a method is entirely rule-based, fits abnormal data and large volumes of data poorly, takes the movement information of the traffic flow as its only parameter, and has low application value. Methods based on probability graph statistics have been widely used over the past decade and include Markov chain inference, Bayesian prediction, the autoregressive integrated moving average (ARIMA) model, and the like. These methods have achieved significant results on particular data and in particular fields and have been successfully applied in commercial practice.
However, methods based on probability statistics have two disadvantages: 1) simple probability-statistical methods have few parameters and cannot handle more complex data sources; 2) because the model is simple, they cannot guarantee good generalization, and in particular cannot be applied effectively to today's complex and changeable road conditions and to abnormal road condition points (semantic hot spots) that erupt at high frequency. Data-driven methods ignore the intrinsic physical laws of the traffic flow and represent its traffic information with a very large number of parameters; they can also be seen as covering the probability graph methods, since the parameters are used to simulate the trend of the probabilities.

Most recent research on road condition prediction is data-driven; in particular, methods based on deep convolutional neural networks (CNN), time-series neural networks (LSTM) and graph networks (GNN) appear in large numbers. CNN-based methods learn the static information of the traffic flow spatially and learn traffic flow features in the hidden layers in order to predict ETA and travel time. In other words, they predict future road conditions from static road conditions only, in an essentially statistical manner, and do not consider the road conditions of a specific road section in the time intervals before and after the time point. Moreover, a plain CNN does not consider the topological structure of the network around a specific road section; that is, it largely ignores the information of the upstream and downstream road sections. In fact, when a road section is congested, the probability of congestion upstream or downstream of it is markedly higher than for non-congested sections. Finally, the position data of the traffic flow are loaded directly into the network model, so the data are assumed to be accurate by default. Position data, especially those recorded during high-speed movement and whether derived from mobile phone signaling or from GPS navigation, carry a certain error, and in a traffic flow network with high-frequency sampling the error in the traffic information of adjacent time slices is even larger.
To address these problems of the CNN, graph networks have been applied successfully to road condition prediction. A GNN considers the topological relationship within the road network: the upstream and downstream sections of each road section form a topological structure together with that section, and the trend of the traffic flow at the nodes of the graph is represented by node weights. The GNN therefore computes from the perspective of node relationships and takes both the static information and the topological information of the traffic flow into account. However, such GNNs ignore dynamic traffic information and share the other disadvantages of CNNs. To solve this problem, spatio-temporal graph convolution (STGCN) has been applied to road condition prediction. On top of spatial convolution, the STGCN considers the road condition information of the road network within a specific time interval and thus predicts future road conditions more efficiently. Although the STGCN performs well on road condition prediction, it raises several questions: 1) the structure of the STGCN is bloated and feature construction is left entirely to the model, so it is questionable whether the model is too large, whether the node relations of the road network are constructed blindly, and whether the traffic information is really fitted; 2) the STGCN computes traffic flow by random node sampling, namely random walks and first-order neighbors, and it is questionable whether this is consistent with real traffic flow behavior; 3) the network model loads GPS position data of the traffic flow for computation, and it is questionable whether systematic errors in those data bias the analysis results.

Disclosure of Invention

To solve the problems in the prior art, the invention provides a short-time road condition prediction method based on historical road conditions. The method is based on bidirectional spatio-temporal topology and target semantic extraction; it does not need the position information of the traffic flow, only its state information, and uses the static information of road sections to represent the movement of the traffic flow in the road network. It does not require the support of a large-scale graph model or network model, while still ensuring high accuracy.

In order to achieve the purpose, the invention adopts the technical scheme that: a short-time road condition prediction method based on historical road conditions comprises the following specific steps:

downsampling data in a time domain to construct homogenous data;

constructing time-neighbor data of the target point in microscopic space-time, and standardizing, dimension-unifying and scaling the time-neighbor data to obtain the microscopic semantic features of the road condition information;

defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network, and capturing the microscopic semantics of the target point;

constructing macroscopic historical data based on the homogeneous data, and performing label mapping on road condition states after mean value down-sampling to obtain macroscopic historical periodic characteristics;

fusing the macroscopic historical period features and the microscopic semantics of the target points to obtain loading features;

inputting the obtained loading features into a gradient boosting decision tree algorithm, randomly sampling the loading features, and performing multiple rounds of training with k-fold cross validation to obtain an inference model;

and after acquiring the loading features of the road condition to be predicted, inputting the loading features into the inference model for inference to obtain a prediction result for the short-time road condition semantics of the future road section.

Down-sampling the data in the time domain to construct the homogeneous data specifically includes: on a macroscopic spatio-temporal scale, capturing the road condition information of the same road section in adjacent time intervals within a first set time period of the past and, in addition, capturing the adjacent data of the same road section on each working day within a second set time period of the past, wherein the first set time period is 7 days and the second set time period is one month; preprocessing the adjacent data, cutting the preprocessed adjacent data into spatio-temporal distribution data, converting text-format data into numerical data, and storing the result as master data for feature construction; and acquiring the macro-period data of each datum in the master data and down-sampling the macro-period data by mean-value down-sampling to obtain the homogeneous data.

Defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network and capturing the microscopic semantics of the target point proceeds as follows: a multi-layer linear network is constructed; for each selected target point, the road condition information I in the temporal and spatial neighborhood of the target point is extracted; the road condition of the target point is used as the label and the spatio-temporal neighborhood data as the data set; the data set is split into a training set and a validation set in an 8:2 ratio; the DNN (deep neural network) is trained; and after training the historical road conditions are classified, being fitted into three categories: good road conditions, congested and very congested.

In the multilayer neural network, a plurality of shallow linear networks form the hidden-variable extraction dimension; an activation layer is stacked after each shallow network, a batch normalization layer is stacked after the activation layer, and the output layer is a linear layer. A set containing a shallow linear network is used as a network module, and three modules are designed to serve as feature extractors.
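The module layout described above (linear layer, then activation, then batch normalization, repeated three times, followed by a linear output layer) can be sketched as a forward pass. This is only a minimal illustration in plain Python: the ReLU activation, the hidden width of 8, the random initialization and all function names are assumptions that the source does not specify, and training is omitted.

```python
import math
import random

def linear(batch, weights, bias):
    """Apply y = xW + b to every row of the batch."""
    return [
        [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
         for j in range(len(bias))]
        for x in batch
    ]

def relu(batch):
    """Activation layer (ReLU is an assumption; the source names no activation)."""
    return [[max(0.0, v) for v in row] for row in batch]

def batch_norm(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance over the batch."""
    n = len(batch)
    cols = list(zip(*batch))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(cols, means)]
    return [
        [(v - m) / math.sqrt(var + eps) for v, m, var in zip(row, means, variances)]
        for row in batch
    ]

def init_linear(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def build_network(n_in, hidden, n_classes, seed=0):
    """Three feature-extractor modules plus one linear output layer."""
    rng = random.Random(seed)
    sizes = [n_in] + [hidden] * 3
    modules = [init_linear(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    output = init_linear(hidden, n_classes, rng)
    return modules, output

def forward(batch, network):
    modules, output = network
    for weights, bias in modules:
        # One module: linear -> activation -> batch normalization.
        batch = batch_norm(relu(linear(batch, weights, bias)))
    return linear(batch, *output)

# Four inputs stand in for I = [eta, v, label, cars]; three outputs for the three classes.
net = build_network(n_in=4, hidden=8, n_classes=3)
logits = forward([[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.9]], net)
```

In practice a deep learning framework would supply these layers together with a trainable batch normalization; the sketch only shows the order in which the layers are stacked.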

The road condition information comprises legal information of a road section and physical information of the road section. The traffic information represents the amount of traffic information at a specific point in space-time, with the data structure S = [day, link, id, I], where S denotes the traffic road condition information, day is the date of the target point, link is the road section number of the target point, id is the time slice of the target point within the day (one time slice is 2 min), and I is the amount of traffic information, specifically I = [eta, v, label, cars]. The upstream and downstream links of each link are obtained to form the topological structure of each link, which is represented as:

$$L = \{l_i, L_{up}, L_{down}\},\quad L_{up} = \{l_{up1}, l_{up2}, \ldots, l_{upn}\},\quad L_{down} = \{l_{down1}, l_{down2}, \ldots, l_{downm}\}$$

where $l_i$ denotes the i-th target road section and $L_{up}$ denotes the upstream road sections of $l_i$; in $L_{up} = \{l_{up1}, l_{up2}, \ldots, l_{upn}\}$ the upstream set consists of the specific road sections $l_{upj}$, and the downstream road sections $L_{down}$ are defined in the same way;
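The topology L = {l_i, L_up, L_down} can be held in a small record per link. The following is a minimal sketch, assuming the road network is available as directed (from_link, to_link) pairs; the class name, function name and link identifiers are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class LinkTopology:
    """L = {l_i, L_up, L_down} for one target link."""
    link: str
    upstream: list = field(default_factory=list)    # L_up
    downstream: list = field(default_factory=list)  # L_down

def build_topologies(edges):
    """Build the per-link topology from directed (from_link, to_link) pairs."""
    topo = {}
    for src, dst in edges:
        topo.setdefault(src, LinkTopology(src)).downstream.append(dst)
        topo.setdefault(dst, LinkTopology(dst)).upstream.append(src)
    return topo

# Hypothetical network: links A and B both flow into C, which flows into D.
edges = [("A", "C"), ("B", "C"), ("C", "D")]
topo = build_topologies(edges)
```

Here `topo["C"]` carries both its upstream set (A, B) and its downstream set (D), which is the bidirectional view of a target link used throughout the method.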

for the road condition information info(day, id, I) of the target point, acquiring the road condition information of the adjacent time intervals on the same working day over the past month, specifically:

$$\{info(day', id', I)\},\quad day' = [-7\,day, -14\,day, -21\,day, -28\,day],\quad id' = [id \pm r],\quad r = \{1, 2, 3\}$$

The time-neighbor road condition information of each target point on the four past same working days is acquired by the above formula and grouped by working day; each group is then mean-value down-sampled, turning the discrete data into continuous data.
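The neighbor lookup and the per-group mean down-sampling can be sketched as follows. This is an illustration under stated assumptions: days are integer day indices, history is a dictionary keyed by (day, time slice id) holding a single numeric value, missing entries are simply skipped, and slice ids that would cross a day boundary are not wrapped. All names are hypothetical.

```python
from statistics import mean

WEEK_OFFSETS = (-7, -14, -21, -28)  # day' offsets from the formula
SLICE_RADIUS = (1, 2, 3)            # r from the formula

def neighbour_keys(day, slice_id):
    """Map each of the four past same weekdays to the slice ids id - r and id + r."""
    ids = sorted({slice_id + sign * r for r in SLICE_RADIUS for sign in (-1, 1)})
    return {day + offset: ids for offset in WEEK_OFFSETS}

def mean_downsample(history, day, slice_id):
    """Average the available neighbour values for each past weekday (one group per day)."""
    averages = {}
    for past_day, ids in neighbour_keys(day, slice_id).items():
        values = [history[(past_day, i)] for i in ids if (past_day, i) in history]
        if values:  # days without any record contribute nothing
            averages[past_day] = mean(values)
    return averages

# Hypothetical history: values recorded around slice 10 one and two weeks earlier.
history = {(93, 9): 30.0, (93, 11): 40.0, (86, 8): 20.0}
averages = mean_downsample(history, day=100, slice_id=10)
```

For the sample history, the group for day 93 averages two neighboring slices and the group for day 86 has a single value, while days 79 and 72 are absent because no records exist for them.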

constructing macroscopic historical data based on the homogeneous data, performing label mapping on road condition states after mean value downsampling, and defining two time periods with different scales to extract the macroscopic historical period characteristics of the vehicle information when obtaining the macroscopic historical period characteristics:

the first time period is four days of the same working day in one month; the second time period is seven days within one week;

For the first time period, the road condition information of the target on each same working day of the past month is obtained and the data are down-sampled to obtain the historical time feature based on the first time period, which is added to the feature group. For the second time period, the road condition information of the target in its neighboring time domain on every day of the past week is first obtained:

the data are then down-sampled to obtain the historical time feature based on the second time period, which is added to the feature group to obtain:

$$F = [Link, cid, fid, I'_{-1week}, \ldots, I'_{-1week}, I'_{-1day}, \ldots, I'_{-6day}].$$

the microscopic semantic features of the road condition information are as follows:

a shallow fully connected feed-forward network is designed to capture the amount of road condition information of the target point over a past period: the road condition information of the target point over the past n periods is first acquired; a shallow fully connected network is then designed that takes the normalized road condition information as input and the current road condition information of the target point as the label for training; and the road condition information amount I is combined with the physical quantities of the road section to obtain the cross information of the target point's road section and the road condition as the input feature;

the extracted macroscopic features are combined with the microscopic semantics of the target points to predict road conditions: the information amounts of the past 5 time slices are input into the shallow network, the road condition information amount of each time slice is output, the network is trained by cross validation, and the training result is applied to measuring the current road condition information amount.

The invention also provides a short-time road condition prediction system based on historical road conditions, which comprises a homogeneous data module, a microscopic semantic feature acquisition module, a microscopic semantic extraction module of target points, a macroscopic historical period feature extraction module, a loading feature acquisition module, a training module and a prediction module;

the homogeneous data module is used for downsampling the data in the time domain to construct homogeneous data;

the microscopic semantic feature acquisition module is used for constructing time-neighbor data of the target point in microscopic space-time, and standardizing, dimension-unifying and scaling the time-neighbor data to obtain the microscopic semantic features of the road condition information;

the microscopic semantic extraction module of the target point is used for defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network and capturing the microscopic semantics of the target point;

the macro history cycle feature extraction module is used for constructing macro history data according to the homogeneous data, and performing label mapping on road condition states after mean value down-sampling to obtain macro history cycle features;

the loading feature acquisition module is used for fusing the macroscopic history cycle features and the microscopic semantics of the target points to obtain loading features;

the training module is used for inputting the obtained loading features into a gradient boosting decision tree algorithm, randomly sampling the loading features, and performing multiple rounds of training with k-fold cross validation to obtain an inference model;

the prediction module is used for inputting the loading features of the road condition to be predicted, once acquired, into the inference model for inference, so as to obtain a prediction result for the short-time road condition semantics of the future road section.

A computer device comprises a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads part or all of the computer executable program from the memory and executes the computer executable program, and when the processor executes part or all of the computer executable program, the short-time road condition prediction method based on historical road conditions can be realized.

A computer readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program can implement the method for predicting a short-term traffic condition based on historical traffic conditions according to the present invention.

Compared with the prior art, the invention has at least the following beneficial effects: the method builds its model from the abstract semantics of the traffic flow and the road network and does not consider the position of a vehicle on a concrete road (a wide road may be forty-five meters wide, and representing vehicles at different pixel positions affects the fit of the model); instead it starts from the physical parameters of the road and the state data of the traffic flow to construct more general features, which are then modeled and represented. Traffic flow data recorded by GPS suffer from inherent defects such as position drift, the ping-pong effect and data distortion under high-speed traffic flow; the invention neither uses nor depends on the position data of the traffic flow and the road network, and starts only from the movement pattern and microscopic semantics of vehicles, so it has a higher fault tolerance to the data;

the method fits real data better and predicts road conditions more stably. Conventional modeling that relies on detailed real-time traffic flow statistics reduces model generalization; by contrast, the invention integrates the model using the gradient boosting decision tree algorithm and feeds the microscopic semantics computed by the deep neural network into the tree model, which gives more stable predictions.

Furthermore, the bidirectional road network captures semantic features, captures congestion semantics from variable-scale logic, and can predict sequences more accurately. Compared with a road condition sequence only predicted in a macroscopic period, a traffic flow network only considering a time relation and a model only considering road network space distribution, the method carries out feature modeling from a macroscopic time scale, microscopic hot point semantics and a time-space relation of a hot point neighborhood, and comprehensively considers semantic information and hidden variable features of a road condition event.

Drawings

FIG. 1a is a schematic diagram illustrating an example road network and its upstream road conditions.

FIG. 1b is a schematic diagram of a road network congestion condition.

FIG. 1c is a schematic diagram illustrating an example road network and its downstream road conditions.

FIG. 1d is a schematic diagram of a topological road section including both upstream and downstream branches.

FIG. 2 is a schematic view of the daily road condition on each Wednesday in a month.

FIG. 3 is a schematic diagram of the road condition summary over the 720 time periods in a day.

FIG. 4 is a schematic diagram of the label classification of road conditions.

FIG. 5 is a model structure diagram of the algorithm.

FIG. 6 shows the training process of the semantic capture network.

FIG. 7 shows the loss iteration of LSTM short-time road condition prediction.

FIG. 8 shows the accuracy iteration of LSTM short-time road condition prediction.

FIG. 9 shows the cross-entropy loss reduction of short-time prediction with the GBDT model integration algorithm.

Detailed Description

Traffic flow information prediction is widely applied in many fields and has high commercial value, for example in taxi hailing, bus dispatching, private car travel and ambulance route planning. Short-time road condition prediction, however, suffers from poor fit and low fault tolerance, which limits its application value. For short-time road condition prediction, the invention designs an upstream and downstream data construction method based on the macroscopic historical period and the topological semantics of a microscopic bidirectional road network. First, at the macroscopic level, data are down-sampled in the time domain to construct homogeneous data. Second, at the microscopic level, a road condition feature containing the spatio-temporal relations of road network points and hot-spot semantics is designed; the microscopic semantics of the target point in microscopic space-time are captured by a shallow fully connected network; time-series features, namely the historical period features, are constructed by a multilayer neural network; and the microscopic semantics and the historical period features are fused and fed into a GBDT ensemble learning model. Using cross validation, predictions are made on one month of DiDi ride-hailing data from a certain city. The method balances computational performance and timeliness, does not require a large-scale graph model or network model, and still achieves high accuracy; see FIG. 5.

The invention provides a lightweight prediction algorithm based on bidirectional spatio-temporal topology and target point semantic extraction. By simply constructing spatio-temporal data, it designs a topological feature group containing the details of the target point; the feature group contains a bidirectional topological road network in space and an elastic time-domain road network in time. Because the data come from a real-world data set and carry natural disturbances and errors, a batch normalization layer is added after each linear layer of the DNN to reduce overfitting, so that the network converges quickly. The data structure designed by the invention represents well both the historical time-series information of the road condition target point and its spatial hot-spot semantic information, and therefore generalizes better to abnormal road conditions at the target point.

To handle missing data and structural complexity, the DNN is used to construct hidden variables and the constructed data are unified in dimension; an ensemble learning model then learns from and performs inference on the designed spatio-temporal road network features, and the training result is expressed. The method is easy to apply to practical needs such as engineering deployment and algorithm tuning.

The concrete implementation is as follows:

Step 1, down-sampling data in the time domain to construct homogeneous data;

Step 2, constructing time-neighbor data of the target point in microscopic space-time, making the data dimensionless and dimension-unified, and scaling them to obtain the microscopic semantic features of the road condition information;

Step 3, defining a shallow semantic catcher based on a multilayer neural network (DNN), and feeding the microscopic semantic feature data into the DNN to capture the microscopic semantics of the target point. The structure of the DNN is as follows: a plurality of shallow linear networks form the hidden-variable extraction dimension, an activation layer is stacked after each shallow network, a batch normalization layer is stacked after the activation layer, and the output layer is a linear layer. A set containing a shallow linear network is taken as a network module, and three modules are designed as feature extractors;

Step 4, constructing macroscopic historical data based on the homogeneous data, and performing label mapping on the road condition states after mean-value down-sampling to obtain the macroscopic historical period features;

Step 5, fusing the macroscopic historical period features with the microscopic semantics of the target point and inputting them into the gradient boosting decision tree (GBDT) algorithm;

Step 6, randomly sampling the loaded features and performing multiple rounds of training with k-fold cross validation (k = 5) to obtain the optimal trained model;

Step 7, saving the trained model, obtaining the input features of the road condition to be predicted through steps 1 to 5, and feeding them into the model obtained in step 6 for inference to obtain the prediction result for the short-time road condition semantics of the future road section.
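The k-fold procedure of step 6 (k = 5) can be sketched independently of the model being trained. In this minimal illustration the `fit` and `score` callables are placeholders for the GBDT trainer and its evaluation metric; the majority-class model used in the demo is only a stand-in, not the method of the invention, and all names are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and yield (train, validation) index lists for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validate(features, labels, fit, score, k=5):
    """Train one model per fold and return the best (score, model) pair."""
    best = None
    for train, val in k_fold_indices(len(features), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        s = score(model, [features[i] for i in val], [labels[i] for i in val])
        if best is None or s > best[0]:
            best = (s, model)
    return best

# Stand-in model: always predict the most frequent training label.
def fit_majority(X, y):
    return Counter(y).most_common(1)[0][0]

def accuracy(model, X, y):
    return sum(1 for target in y if target == model) / len(y)

X = [[i] for i in range(10)]
y = [0] * 7 + [1] * 3
best_score, best_model = cross_validate(X, y, fit_majority, accuracy, k=5)
```

Keeping the fold logic separate from the learner means the same loop could drive any trainer that follows the `fit(X, y)` and `score(model, X, y)` shape assumed here.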

Homogeneous data construction is carried out on the ride-hailing data source so that the original data take the dimensions and format required by the method: the preprocessed original data are cut into spatio-temporal distribution data, text-format data are converted into numerical data, and the result is stored as the master data for feature construction. The macro-period data of each datum in the master data are then acquired and down-sampled by mean-value down-sampling to obtain the homogeneous data.

On a macroscopic spatio-temporal scale, the road condition information of the same road section in adjacent time intervals is captured within a first set time period of the past and, in addition, the adjacent data of the same road section on each working day are captured within a second set time period of the past, wherein the first set time period is 7 days and the second set time period is one month.

Referring to FIGS. 1a to 1d, the road network data are characterized by desensitized road section identifiers, and the road condition information includes legal information of a road section and physical information of the road section. The traffic information represents the amount of traffic information at a specific point in space-time, with the data structure S = [day, link, id, I], where S denotes the traffic road condition information, day is the date of the target point, link is the road section number of the target point, id is the time slice of the target point within the day (one time slice is 2 minutes), and I is the amount of traffic information, specifically I = [eta, v, label, cars]. The upstream and downstream links of each link are obtained to form the topological structure of each link, which is represented as:

$$L = \{l_i, L_{up}, L_{down}\},\quad L_{up} = \{l_{up1}, l_{up2}, \ldots, l_{upn}\},\quad L_{down} = \{l_{down1}, l_{down2}, \ldots, l_{downm}\}$$

where $l_i$ denotes the i-th target road section and $L_{up}$ denotes the upstream road sections of $l_i$; in $L_{up} = \{l_{up1}, l_{up2}, \ldots, l_{upn}\}$ the upstream set consists of the specific road sections $l_{upj}$, and the downstream road sections $L_{down}$ are defined in the same way;
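The record S = [day, link, id, I] with I = [eta, v, label, cars] and the 2-minute time slice can be sketched as plain data classes. The field types, the zero-based slice numbering and the sample values are assumptions; the source gives only the field names and the slice length.

```python
from dataclasses import dataclass

SLICE_MINUTES = 2
SLICES_PER_DAY = 24 * 60 // SLICE_MINUTES  # 720 slices, matching FIG. 3

@dataclass
class Info:
    """I = [eta, v, label, cars]; fields are stored as given, types are assumed."""
    eta: float
    v: float
    label: int
    cars: int

@dataclass
class Record:
    """S = [day, link, id, I]."""
    day: int
    link: str
    id: int
    info: Info

def slice_id(hour, minute):
    """Zero-based index of the 2-minute slice containing the given clock time."""
    return (hour * 60 + minute) // SLICE_MINUTES

# Hypothetical sample record for link "C" at 08:31 on day 100.
record = Record(day=100, link="C", id=slice_id(8, 31), info=Info(12.5, 31.0, 2, 14))
```

With 2-minute slices a day contains 720 slices, so slice ids run from 0 to 719 under the zero-based convention assumed here.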

The invention extracts road network data at two spatio-temporal scales, one month and one week. Because road condition information does not always exist at a historical spatio-temporal position, the invention uses mean-value down-sampling to construct the macroscopic historical period features, computing the mean values of v, eta and cars over a batch of data at a spatio-temporal coordinate. In addition, general congestion, congestion and smooth traffic occur in different proportions, and in practical applications whether a road section is congested matters more than whether it is smooth, so the invention weights the label. Specifically, the original category labels are weighted as follows: label = {0, 1, 2, 3, 4} → label = {NaN, 0.2, 0.4, 0.6, 0.8}. A label of 0 means that there is no target traffic flow estimation data for the link, and the invention assigns NaN to it; the remaining states, from smooth through general congestion to extreme congestion, are assigned the weights 0.2, 0.4, 0.6 and 0.8; see FIG. 4.
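The label weighting and a mean that tolerates missing records can be sketched as below. The mapping values come from the text; skipping NaN entries when averaging is an assumption about how missing historical records are handled, and the function names are hypothetical.

```python
import math

# label = {0, 1, 2, 3, 4} -> {NaN, 0.2, 0.4, 0.6, 0.8}, as stated in the text.
LABEL_WEIGHTS = {0: math.nan, 1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8}

def weight_label(label):
    """Map an original category label to its weight; 0 (no data) becomes NaN."""
    return LABEL_WEIGHTS[label]

def nan_mean(values):
    """Mean over the values that are present; NaN if nothing is present (assumed policy)."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present) if present else math.nan

# Hypothetical batch of labels at one spatio-temporal coordinate; one record is missing.
weights = [weight_label(label) for label in [1, 0, 3, 2]]
avg = nan_mean(weights)
```

Averaging the weighted labels yields a continuous value between 0.2 and 0.8, which is how the discrete road condition states become a continuous feature after down-sampling.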

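For the road condition information info(day, id, I) of the target point, the road condition information of the adjacent time intervals on the same working day over the past month is obtained, specifically:

$$\{info(day', id', I)\},\quad day' = [-7\,day, -14\,day, -21\,day, -28\,day],\quad id' = [id \pm r],\quad r = \{1, 2, 3\}$$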

The microscopic semantics are extracted by the shallow DNN as follows: first, a multilayer linear network is constructed; for each selected target point, the road condition information in the temporal and spatial neighborhood of the target point is extracted; the road condition of the target point is used as the label and the spatio-temporal neighborhood data as the data set; the data set is split into a training set and a validation set in an 8:2 ratio; the DNN is trained; and after training the historical road conditions are classified, being fitted into three categories: good road conditions, congested and very congested. The structure of the shallow DNN is shown in the attached figure 2. In the DNN, a plurality of shallow linear networks form the hidden-variable extraction dimension, an activation layer is stacked after each shallow network, a batch normalization layer is stacked after the activation layer, and the output layer is a linear layer. A set containing a shallow linear network is taken as a network module, and three modules are designed as feature extractors.

A shallow fully connected network is designed to capture the road condition information quantity of a target point over a past period. First, the road condition information of the target point in the past n periods (n = 10) is obtained. Then a shallow fully connected network is designed, with the normalized road condition information as the network input and the current road condition information of the target point as the label for training. The road condition information quantity I is combined with the road condition physical quantities, and the cross information of the road section and the road condition is obtained as the input feature.
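A minimal sketch of how such training samples might be assembled is given here: each sample pairs the normalized observations of the past n periods with the label of the current period. Min-max scaling is an assumption, since the description only says the input is normalized, and the function names are hypothetical.

```python
def normalize(values):
    """Min-max scale a list of numbers to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, labels, n=10):
    """Pair each period's label with the normalized n preceding observations.

    series[t] is one road condition quantity of the target point at period t,
    labels[t] is its road condition class at period t.
    """
    samples = []
    for t in range(n, len(series)):
        samples.append((normalize(series[t - n:t]), labels[t]))
    return samples
```

A series of 12 periods with n = 10 therefore produces two samples, one for period 10 and one for period 11.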

The extracted macroscopic features and the mined microscopic road condition semantics are combined to predict the road condition: the information quantity of the past 5 time slices is input into the shallow network, which outputs the road condition information quantity of each time slice; the network is trained by cross validation, and the training result is applied to measuring the information quantity of the current road condition.

Leading the microscopic semantic features into a macroscopic feature module of the feature matrix; constructing macroscopic historical data based on the homogeneous data, and performing label mapping on road condition states after mean value downsampling to obtain macroscopic historical periodic characteristics;

The microscopic semantics and the macroscopic historical periodic features are fused and input into a gradient boosting decision tree (GBDT) algorithm to predict the road condition. The information quantity of the past 5 time slices is input into the shallow network, whose output is the road condition of each time slice. The shallow network is trained by cross validation; the optimal trained model parameters are saved, packaged and deployed, and used for inference on large batches of user data.

The specific implementation process of the invention is as follows:

Vehicle road condition data of a certain city for July 2019 are obtained from the published Didi GAIA open data program:

https://outreach.didichuxing.com/app-vue/outreach.didichuxing.comid ＝1022

The original data are preprocessed and cut into space-time distribution data, text-format data are converted into numerical data, and the result is stored as the master data for feature construction. For each datum in the master data, its macro period data are acquired and down-sampled by mean-value down-sampling, which yields the homogeneous data of that datum, namely the micro space-time neighbor data of a target point;

for each road section needing to be predicted, acquiring micro time data of the past 5 time slices, combining the micro time data with data of a macro time period, cutting out duplicate attributes, and then performing down-sampling to obtain micro space-time neighbor data of the current predicted road section (target point);
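One possible way to collect the micro space-time neighbor data of a target road section is sketched here. The Link class, the store dictionary keyed by (day, link, time slice) and the use of None for missing entries are illustrative assumptions; the field order I = [eta, v, label, cars] and the 5 past time slices follow the description.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """A road section together with its upstream and downstream neighbours."""
    link_id: int
    upstream: list = field(default_factory=list)
    downstream: list = field(default_factory=list)


def neighbor_data(store, link, day, slice_id, n_slices=5):
    """Collect I = [eta, v, label, cars] for the target link and its spatial
    neighbours over the n_slices time slices preceding slice_id.

    store maps (day, link_id, slice_id) -> I; missing entries become None.
    """
    links = [link.link_id] + link.upstream + link.downstream
    return {
        lid: [store.get((day, lid, s)) for s in range(slice_id - n_slices, slice_id)]
        for lid in links
    }
```

The returned dictionary has one list of 5 entries per road section, so gaps in the historical record stay visible and can be handled by the later down-sampling step.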

A shallow semantic catcher is defined based on a multilayer neural network (DNN): the catcher is built as a multilayer shallow perceptron with batch normalization after each layer. The micro space-time neighbor data are loaded into the neural network for multiple rounds of training to obtain the micro semantic features. The macroscopic spatiotemporal historical period features are taken as the input features, and the micro road condition semantic features obtained by training are added to them, giving the final input feature data used for training and deployment.

The input feature data are loaded into the GBDT for training with cross validation, and the inference model obtained after training is saved for batch inference deployment.
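The cross validation loop around the GBDT training can be sketched in a library-independent way. The fit and score callables are placeholders for whichever GBDT implementation is used, and the default of k = 5 folds is an assumption, since the description does not state the number of folds.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs that partition range(n_samples) into k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def cross_validate(features, labels, fit, score, k=5):
    """Average a validation score over k folds.

    fit(X, y) returns a trained model and score(model, X, y) returns a float;
    both are hypothetical hooks for the chosen GBDT library.
    """
    scores = []
    for train, val in k_fold_indices(len(features), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        scores.append(score(model, [features[i] for i in val], [labels[i] for i in val]))
    return sum(scores) / len(scores)
```

Each sample is used for validation exactly once, and the averaged score can be used to select the model parameters that are saved for deployment.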

The input features of the road condition to be predicted are acquired and fed into the inference model, which predicts the short-time road condition semantics of the future road section.

By simply constructing space-time data, the invention designs a topological feature group containing the details of the target point. The feature group comprises a bidirectional topological road network in space and an elastic time-domain road network in time. So that the shape and dimensions of the data satisfy network inference, a time domain defined by hyper-parameters is designed, and the adjacent data around the target point are down-sampled to capture the semantic features of the target point in microscopic space-time; here "adjacent" refers to the data adjacent to the target point in its time neighborhood. Specifically, a shallow BP network captures the upstream and downstream semantic features of the target point, and the input of the BP network is the road condition features of the sections upstream/downstream of the target point around the target point's time slice. The algorithm does not need the position information of the traffic flow, only its state information, and uses the static information of the road sections to represent the movement of the traffic flow in the road network, which makes the model tolerant of faults in position data acquisition. The data structure designed by the method represents well both the historical time-series information of the target points and the spatial hotspot semantic information of the target points, so abnormal road conditions occurring at the target points can be generalized well. In view of data loss and structural complexity, a DNN is used to construct hidden variables and unify the dimensions of the constructed data, and an ensemble learning model then performs learning and inference on the designed spatiotemporal road network features.
The training results show that the proposed method achieves higher accuracy while requiring less computing power, and it is easy to use for practical needs such as engineering deployment and algorithm tuning. To compare the proposed algorithm with a conventional time-series network, the road condition information was also trained with an LSTM. The input is the data of the past 5 time slices of the target point: the dynamic road condition data are spliced with the physical data of the road section, such as length, width, number of lanes and road grade, giving 16 features in total. The road condition semantics for the next 1-60 min are predicted. FIG. 8 shows the result of the LSTM training. The average accuracy of the LSTM is below 65%, which indicates that using an LSTM alone is not ideal for short-time road condition prediction. This also illustrates the effectiveness of the method of the invention, with reference to fig. 7, 8 and 9.

TABLE 1

Layer	In_Dim	Out_Dim
InData	n×16
Linear1 relu BatchNorm	n×16	n×64
Linear2 relu BatchNorm	n×64	n×32
Linear3 relu BatchNorm	n×32	n×20
Linear4 relu	n×20	n×3
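The layer stack of Table 1 can be traced with a small pure-Python forward pass that checks the shapes n×16 → n×64 → n×32 → n×20 → n×3. This is only a shape-level sketch under stated assumptions: the weights are random and untrained, the batch normalization omits learnable scale and shift, and the final layer applies ReLU as listed in Table 1 even though the description elsewhere calls the output layer a linear layer. All function names are hypothetical.

```python
import random


def linear(x, w, b):
    """Affine map: x is n×in (list of rows), w is in×out, b has length out."""
    return [[sum(xi * w[i][j] for i, xi in enumerate(row)) + b[j]
             for j in range(len(b))] for row in x]


def relu(x):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in x]


def batch_norm(x, eps=1e-5):
    """Normalize each column to zero mean and unit variance over the batch."""
    n = len(x)
    cols = list(zip(*x))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(cols, means)]
    return [[(v - m) / (var + eps) ** 0.5 for v, m, var in zip(row, means, variances)]
            for row in x]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (placeholders for trained parameters)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out


def forward(x, layers):
    """Linear -> ReLU -> BatchNorm for hidden blocks, Linear -> ReLU at the end."""
    for w, b in layers[:-1]:
        x = batch_norm(relu(linear(x, w, b)))
    w, b = layers[-1]
    return relu(linear(x, w, b))


rng = random.Random(0)
dims = [16, 64, 32, 20, 3]  # layer widths taken from Table 1
layers = [init_layer(a, b, rng) for a, b in zip(dims, dims[1:])]
batch = [[rng.random() for _ in range(16)] for _ in range(4)]  # n = 4 samples
out = forward(batch, layers)  # 4 rows of 3 class scores
```

The three output columns correspond to the three road condition categories; in practice the same stack would be built and trained with a deep learning framework rather than evaluated by hand.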

TABLE 2

epochs	layer_nums	hidden_size	best_score	loss_fun
3	2	16/32/64	0.63/0.63/0.63	0.59/0.53/0.58
3	3	16/32/64	0.63/0.62/0.65	0.54/0.56/0.47
3	4	16/32/64	0.65/0.64/0.65	0.48/0.61/0.54

A computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for predicting a short-time traffic condition based on historical traffic conditions according to the present invention can be implemented.

The computer equipment can be an onboard computer, a notebook computer, a tablet computer, a desktop computer, a mobile phone or a workstation.

The processor may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or a field-programmable gate array (FPGA).

The memory of the invention can be an internal storage unit of a notebook computer, a tablet computer, a desktop computer, a mobile phone or a workstation, such as a memory and a hard disk; external memory units such as removable hard disks, flash memory cards may also be used.

Computer-readable storage media may include computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. The computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The random access memory may include a resistive random access memory (ReRAM).

Claims

1. A short-time road condition prediction method based on historical road conditions is characterized by comprising the following steps:

downsampling data in a time domain to construct homogenous data;

constructing micro time-space time neighbor data of a target point, and carrying out standardized operation, dimension unification and scaling on the time neighbor data to obtain micro semantic features of road condition information; the target point is a current predicted road section;

after acquiring the loading characteristics of the road condition to be predicted, inputting the loading characteristics into the reasoning model for reasoning to obtain a prediction result about the short-time road condition semantics of a future road section;

the down-sampling of the data in the time domain to construct homogeneous data specifically includes: capturing, according to a macroscopic spatiotemporal scale, road condition information of the same road condition in an adjacent time interval in the historical past first set time period and, on the other hand, capturing adjacent data in each working day in the past second set time period, wherein the first set time period is 7 days and the second set time period is one month; preprocessing the adjacent data, cutting the preprocessed adjacent data into spatiotemporal distribution data, processing text format data into numerical data, and storing the data as master data constructed by characteristics; acquiring macro period data of each datum in the constructed master data, performing down-sampling on the macro period data according to a mean value down-sampling method, and then acquiring homogeneous data of the data;

the road condition information comprises legal information of a road section and physical information of the road section; the traffic information represents a specific space-time traffic information amount, and its data structure is S = [day, link, id, I], where S represents the traffic road condition information, day is the date of a target point, link is the road segment number of the target point, id is the time slice of the target point within one day (one time slice is 2 min), and I is the traffic information of the space-time neighborhood of the target point, which includes the traffic information of the current road condition of the target point and of its space-time neighbor nodes, specifically I = [eta, v, label, cars]; the upstream and downstream link segments of each link are obtained to constitute the topological structure of each link, and the data structure is represented as:

where l_i represents the i-th target road section and L_up represents the upstream road sections of l_i, L_up = {l_up1, l_up2, ..., l_upn}, i.e. the upstream road section set is composed of the specific road sections l_upj; the downstream road sections are treated in the same way;

the method for acquiring the road condition information of the target point, namely the road condition information info (day, id, I) of the target point in the past month in the adjacent time interval of the same working day comprises the following steps:

performing feature extraction on the first time period to obtain road condition information of each same working day of a target point in the past month, then performing down-sampling on data to obtain historical time features based on the first time period, adding the historical time features into a feature group, performing feature extraction on the second time period, and firstly obtaining the road condition information of the target point in the adjacent time domain of each day of the week:

and downsampling the data to obtain historical time characteristics based on a second time period, and adding the historical time characteristics into a characteristic group to obtain:

F = [Link, cid, fid, I'_-1week, ..., I'_-1week, I'_-1day, ..., I'_-6day].

2. The method as claimed in claim 1, wherein a shallow semantic catcher is defined based on a multilayer neural network, the micro semantic features are loaded into a shallow multilayer neural network, and the process of capturing the micro semantics of the target point is as follows: constructing a multilayer linear network; for each selected target point, extracting the current road condition of the target point and the traffic flow information I of the space-time neighbor nodes to obtain a data set in which the road condition of the target point is the label and the space-time neighborhood data are the features; splitting the data set 8:2 into a training set and a validation set; calling the DNN (deep neural network) for training; and, after training, classifying the historical road conditions and fitting them into three categories: good road conditions, congestion and heavy congestion.

3. The method as claimed in claim 2, wherein, in the multilayer neural network, multiple shallow linear networks form the hidden-variable extraction dimensions, an activation layer is placed after each shallow network, a batch normalization layer is placed after the activation layer, and the output layer is a linear layer; a set containing one shallow linear network is used as a network module, and three such modules are designed as feature extractors.

4. The method as claimed in claim 1, wherein the traffic information has the following micro semantic features:

designing a shallow fully connected network to capture the road condition information quantity of a target point in a past period: firstly acquiring the road condition information of the target point in the past n periods, then designing a shallow fully connected network, taking the normalized road condition information as the input of the network and the current road condition information of the target point as the label for training, combining the road condition information quantity with the road condition physical quantity, and acquiring the cross information of the road section of the target point and the road condition as the input characteristic.

5. The short-time road condition prediction system based on historical road conditions is characterized by comprising a homogeneous data module, a micro semantic feature acquisition module, a micro semantic extraction module of a target point, a macro historical period feature extraction module, a loading feature acquisition module, a training module and a prediction module; the target point is a current predicted road section;

the homogeneous data module is used for down-sampling data in a time domain to construct homogeneous data; the down-sampling of the data in the time domain to construct the homogeneous data specifically includes: capturing road condition information of the same road condition in an adjacent time interval in a historical past first set time period according to a macroscopic spatiotemporal scale, on the other hand, capturing adjacent data of the same road condition in a past second set time period on each working day, wherein the first set time period is 7 days and the second set time period is one month; preprocessing the adjacent data, cutting the preprocessed adjacent data into spatiotemporal distribution data, processing text format data into numerical data, and storing the data as master data constructed by characteristics; acquiring macro period data of each datum in the constructed master data, and performing down-sampling on the macro period data according to a mean value down-sampling method to obtain homogeneous data of the data;

the microscopic semantic feature acquisition module is used for constructing microscopic time-space time neighbor data of the target point, and carrying out standardized operation, dimension unification and scaling on the time neighbor data to obtain microscopic semantic features of the road condition information; the road condition information comprises legal information of the road section and physical information of the road section; the traffic information represents specific space-time traffic information amount, the data structure is S = [ day, link, id, I ], S represents traffic road condition information, day is date of a target point, link is road segment number of the target point, id is time slice of the target point in one day, 2min is a time slice, I is traffic information of space-time neighborhood of the target point, specifically includes traffic information of current road condition of the target point and space-time neighbor nodes, specifically, I = [ eta, v, label, cars ], obtains link segments of upstream and downstream of each link, and constitutes a topological structure of each link, and the data structure is represented as:

acquiring time neighbor road condition information of each target point on the past four working days through the above formula, grouping the time neighbor road condition information according to the working days, carrying out mean value down-sampling on the time neighbor road condition information for each group of grouped road condition information, and changing discrete data into continuous data:

the microscopic semantic extraction module of the target point is used for defining a shallow semantic catcher based on a multilayer neural network, loading the microscopic semantic features into the shallow multilayer neural network and capturing the microscopic semantics of the target point;

the macro history cycle feature extraction module is used for constructing macroscopic historical data according to the homogeneous data and performing label mapping on the road condition states after mean value down-sampling to obtain the macroscopic historical period features; when obtaining the macroscopic historical period features, two time periods of different scales are defined to extract the macroscopic historical period features of the vehicle information:

F = [Link, cid, fid, I'_-1week, ..., I'_-1week, I'_-1day, ..., I'_-6day];

the training module is used for inputting the obtained loading characteristics into a gradient boosting decision tree algorithm, randomly sampling the loading characteristics, and performing multiple rounds of training according to k-fold cross validation to obtain a reasoning model;

and the prediction module is used for inputting the loading characteristics of the road conditions to be predicted into the inference model for inference after acquiring the loading characteristics of the road conditions to be predicted, so as to obtain a prediction result about the short-time road condition semantics of the future road section.

6. A computer device, comprising a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads part or all of the computer executable program from the memory and executes the computer executable program, and the processor can implement the method for predicting short-term road conditions based on historical road conditions according to any one of claims 1 to 4 when executing part or all of the computer executable program.

7. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method for predicting a short-term road condition based on historical road conditions as claimed in any one of claims 1 to 4 is implemented.