CN110837888A - Traffic missing data completion method based on bidirectional cyclic neural network - Google Patents
- Publication number
- CN110837888A CN110837888A CN201911106967.7A CN201911106967A CN110837888A CN 110837888 A CN110837888 A CN 110837888A CN 201911106967 A CN201911106967 A CN 201911106967A CN 110837888 A CN110837888 A CN 110837888A
- Authority
- CN
- China
- Prior art keywords
- data
- time
- completion
- traffic flow
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention provides a traffic missing data completion method based on a bidirectional recurrent neural network, and belongs to the field of traffic. The method exploits the time-series character of the data: it considers the influence of the data both before and after the time point being completed, which greatly improves data utilization and completion accuracy. It also adds the influence of external features and of adjacent sensors' data on the current sensor's data into the completion model, further improving completion accuracy. The method greatly improves completion accuracy not only at low data-loss rates but also at high data-loss rates.
Description
Technical Field
The invention belongs to the field of traffic, and particularly relates to a traffic missing data completion method based on a bidirectional recurrent neural network.
Background
Road loop-detector traffic data exhibit periodicity, time-series structure, and trend. At present, methods for completing traffic data are mainly based on the time series.

Time-series-based completion of traffic flow data takes the data from a period before the current missing point and completes the missing value through a neural network. For example, to complete today's traffic data at 16:00, the data from 8:00 to 15:00 are taken as input, and the value at the next time point, 16:00, is produced by a recurrent neural network. Such history-based completion makes good use of the time-series character of the data and gives relatively good results, but it has a limitation: when a special event occurs, the current missing point may itself be preceded by a series of missing points. For example, a power outage can cause a continuous stretch of data to be lost; when the last missing point in such a stretch is completed, the input data are severely incomplete and the completion is very poor.
Neural networks were originally inspired by, and appeared in order to simulate, biological nervous systems; they consist of a large number of interconnected nodes (neurons). A neural network adjusts its weights according to changes in the input, improves the system behavior, and automatically learns a model capable of solving the problem. The LSTM (long short-term memory network) is a special form of RNN (recurrent neural network); it effectively alleviates the vanishing- and exploding-gradient problems of training multi-layer neural networks and can handle sequences with long-range temporal dependencies. The LSTM can capture the time-series characteristics of the traffic flow data, and using an LSTM model can effectively improve completion accuracy.

The LSTM network consists of LSTM units, and each LSTM unit consists of a cell, an input gate, an output gate, and a forget gate.
Forget gate: decides how much information to discard from the output state of the previous cell. The formula is as follows:

f_t = σ_g(W_f x_t + U_f h_{t-1} + b_f)

where f_t is the output of the forget gate, x_t is the input sequence, h_{t-1} is the output of the previous cell, σ_g denotes the sigmoid function, W_f is the weight parameter matrix for the input, U_f is the weight parameter matrix for the previous cell's output, and b_f is a bias parameter vector.
Input gate: decides how much new information to add to the cell state and updates the cell state C. The formulas are as follows:

i_t = σ_g(W_i x_t + U_i h_{t-1} + b_i)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ σ_c(W_c x_t + U_c h_{t-1} + b_c)

where c_t denotes the cell state of the current cell, σ_g denotes the sigmoid function and σ_c the hyperbolic tangent, ⊙ denotes the element-wise product, W_i and W_c are weight parameter matrices for the input, U_i and U_c are weight parameter matrices for the previous cell's output, b_i and b_c are bias parameter vectors, f_t is the output of the forget gate, and c_{t-1} is the cell state of the previous cell.
Output gate: produces the output based on the current cell state. The formulas are as follows:

o_t = σ_g(W_o x_t + U_o h_{t-1} + b_o)
h_t = o_t ⊙ σ_h(c_t)

where h_t denotes the output of the current cell, σ_g denotes the sigmoid function and σ_h the hyperbolic tangent, ⊙ denotes the element-wise product, W_o is the weight parameter matrix for the input, U_o is the weight parameter matrix for the previous cell's output, and b_o is a bias parameter vector.
Disclosure of Invention
The invention provides a traffic missing-data completion method based on a bidirectional recurrent neural network. It is a deep-learning completion method that exploits time-series, periodic, and spatial structure, and it aims to improve the completion accuracy of road traffic flow data.
The technical scheme of the invention is as follows:
a traffic loss data completion method based on a bidirectional cyclic neural network comprises the following steps:
firstly, preprocessing the traffic flow data
The preprocessing comprises time granularity division and data standardization;
Second, apply random data-point loss processing to the preprocessed data to construct a data set with missing points, then record the positions and values of the missing points as verification values, which are used to verify the completion performance of the method.
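The random loss-processing step can be sketched as follows; the function name `make_missing` and the NaN marker for missing points are illustrative assumptions, since the patent does not specify an encoding.

```python
import numpy as np

def make_missing(data, miss_rate, seed=0):
    """Randomly drop `miss_rate` of the points; return the masked copy,
    the missing mask, and the held-out true values used later to verify
    completion accuracy."""
    rng = np.random.default_rng(seed)
    mask = rng.random(data.shape) < miss_rate      # True = point is missing
    masked = data.astype(float).copy()
    masked[mask] = np.nan                          # mark missing points
    held_out = data[mask]                          # verification values
    return masked, mask, held_out

flow = np.arange(100, dtype=float)                 # toy traffic flow series
masked, mask, truth = make_missing(flow, 0.2)
```

The held-out values never enter the model; they only serve for the comparison in the final verification step.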
Meanwhile, a time-dimension influence attenuation matrix is constructed. Data can be missing in continuous stretches; for example, a damaged power-supply element in a sensor can cause data loss over a period of time. As time accumulates, the influence of historical data on the data at the missing point becomes smaller and smaller, which affects completion accuracy, so the attenuation of the influence of time-dimension data needs to be recorded. The time-dimension influence attenuation matrix is defined as follows:

where n_t denotes the current time.
Third, divide the loss-processed traffic data into a training set, a verification set, and a test set. In each data set, the different models use the following types of data:

External feature data used by the external feature module: F_n;

where n denotes the current time, t denotes the step length of the time series, and p denotes the step length of the periodic series. S denotes the traffic flow data and T denotes the reversal of S in the time dimension. s_n denotes the traffic flow data at time n; the remaining symbols denote, respectively, the traffic flow data at the same time of day i days before time n, the set of traffic flow data at the t time points before time n, and the set of traffic flow data at the same time of day over the p days up to and including the day of time n. F_n denotes the external features at time n, including holiday, location area, weather, and air temperature.
And fourthly, constructing a completion model, wherein the completion model comprises a forward time series deep learning module, a reverse time series deep learning module, a periodic characteristic module and an external characteristic module, and the structure and the training mechanism of each module are as follows:
(1) Forward time-series deep learning module: an LSTM model combining a linear regression network with multi-layer long short-term memory networks. A layer of linear regression adds continuity information about the current missing point in time, to handle long stretches of missing sequence and improve completion accuracy.

Implementation details of the forward-sequence deep learning module: the time-dimension attenuation matrix is input into a linear regression network, and the output of the linear regression network together with the forward time-series data is input into the LSTM network. If a data point x_t is not missing, it is input directly; when a data point is missing, the hidden state of the previous moment is used as the input for the current moment. After the input is processed, the deep learning network is trained, and the final output of the forward-sequence deep learning module is obtained through continuous iterative updates.
(2) Reverse time-series deep learning module: its network structure is identical to that of the forward-sequence deep learning module; the difference is that the input of the forward module is reversed in the time dimension and used as this module's input.
(3) Periodic feature learning module: a module formed of three fully connected layers. By extracting features from periodic data, it captures the variation pattern of traffic flow for the same sensor and same time period across the historical data, and then outputs the extracted features. Implementation details: the periodic sequence data are input into the fully connected layers; the time-series features of the periodic data are extracted through the three layers and then output.
(4) External feature module: it consists of two parts. The first part handles holiday and weather features and is a feature-encoding layer. Implementation details: the external feature data are input into the feature-encoding layer and converted into vector form, and the resulting vector is combined with the outputs of the other three modules.

The second part handles spatial features. To take information across the road space into account, all sensors on the road section are input into the second part simultaneously; the hidden states of the other sensors at the same time as the current sensor's missing point are used as input, weights are computed through a Softmax network to obtain the output, and the output is fed into the forward and reverse time-series deep learning modules.
Finally, the outputs of the four modules are combined into a one-dimensional vector, and the final completion result is obtained through one fully connected layer.
Fifth, pre-train the pre-training parts of the forward and reverse time-series deep learning modules with the training set data, optimizing their parameters in advance to avoid converging to a local optimum during whole-model training.
Sixth, use the training set data and verification set data to train the four modules built in step four as a whole:

The preprocessed data are input into the corresponding modules, and all modules are trained jointly. After each training pass, the loss between the completed values and the true traffic flow values is computed, and the model parameters are trained toward the target values. The model's hyper-parameters are tuned continuously according to its performance on the training and verification sets, improving completion accuracy while reducing overfitting.
The input data comprises: forward time series data(front t)1Hourly traffic data), reverse directionTime series data(after t)2Hourly traffic data), periodic sequence data(front t)3Traffic data at the same time of day), time dimension impact attenuation matrixExtrinsic feature data Fn(external characteristic data of holidays, regions, weather and air temperature at the nth time) and truth value of traffic flow data(traffic flow data at the present time).
After one iteration, the traffic flow data after one completion pass are obtained. These data are used as input for the next iteration; previously missing points now carry completed values but are still labeled as missing. In subsequent iterations the goal is still to complete the data at the missing points, but because values close to the truth are already present, they provide prior knowledge that improves the model's convergence speed and completion accuracy.
And seventhly, completing the traffic flow data by using the test set and utilizing the model trained in the sixth step.
The input data is: forward time series dataReverse time series dataPeriodic sequence dataTime dimension influence attenuation matrixExtrinsic feature dataTruth value of sum traffic flow data
And obtaining a completion value of the missing traffic flow data through the model in the sixth step, and comparing the completion value with the verification value obtained after the loss processing in the second step to verify the completion effect of the model.
In the first step, the specific process of pretreatment is as follows:
(1) Time-granularity division: all traffic flow data are aggregated at a time granularity of k minutes into per-k-minute traffic flow data;

(2) Data normalization: the traffic flow data are normalized using the minimum and maximum values, as follows:

x* = (x - x_min) / (x_max - x_min) * (max - min) + min

where x denotes the original value, x_min the minimum of the original values, x_max the maximum of the original values, max the normalized upper limit, min the normalized lower limit, [min, max] the normalized interval, and x* the normalized result.
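The two preprocessing operations can be sketched as below. The helper names `bin_by_minutes` and `minmax_normalize` are illustrative; min-max scaling to a target interval is the standard reading of the normalization described here.

```python
import numpy as np

def bin_by_minutes(timestamps_min, counts, k=5):
    """Aggregate per-minute vehicle counts into totals per k-minute interval."""
    bins = np.asarray(timestamps_min) // k
    out = np.zeros(bins.max() + 1)
    np.add.at(out, bins, counts)          # sum the counts falling in each bin
    return out

def minmax_normalize(x, lo=0.0, hi=1.0):
    """Min-max normalization: x* = (x - x_min)/(x_max - x_min)*(hi - lo) + lo."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min) * (hi - lo) + lo

flow_5min = bin_by_minutes(np.arange(10), np.ones(10), k=5)  # two 5-minute bins
norm = minmax_normalize(np.array([10.0, 20.0, 30.0]))
```

With `lo=0.0, hi=1.0`, the normalized minimum is 0 and the normalized maximum is 1, matching the [min, max] interval in the formula.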
In step four, the road-space information part (the Softmax process) is handled as follows: let the hidden states of all sensors at the current moment be h = <h_1, h_2, h_3, …, h_i, …, h_l>, where h_i is the hidden state of the i-th sensor at the current moment. A weight is then computed for each h_i, yielding the new hidden state h'_i of the current sensor.

After processing with Softmax, the sum of all weights is 1. Here l denotes the number of sensors, and h_ij denotes the hidden state of the j-th sensor at time i.
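The Softmax weighting over sensor hidden states can be sketched as below. How the scalar scores feeding the Softmax are produced is not spelled out in the text, so the `scores` vector here is an illustrative assumption; the key property shown is that the weights sum to 1 and the new state is their convex mixture of the sensors' hidden states.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attend_over_sensors(h_states, scores):
    """Weight the hidden states of the l sensors with softmax weights and
    return the new hidden state h' for the current sensor as their mixture."""
    w = softmax(scores)
    h_new = (w[:, None] * h_states).sum(axis=0)
    return w, h_new

# l = 3 sensors, hidden dimension 2
h_states = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
scores = np.array([0.1, 0.2, 0.3])  # assumed per-sensor relevance scores
w, h_new = attend_over_sensors(h_states, scores)
```

Because the weights are a convex combination, each component of the new state stays within the range spanned by the corresponding components of the input states.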
In step six, the mean absolute error (MAE) between the completed data and the true traffic flow values is computed at each iteration, and the MAE is minimized with the Adam method:

MAE = (1/n) Σ_{i=1}^{n} |x'_i - x_i|

where x'_i denotes the true sensor value at the i-th moment, and x_i denotes the completed sensor value at the i-th moment.
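The loss in the source is abbreviated MAE, i.e. the standard mean absolute error (the original formula appears only as an image); a minimal sketch under that reading:

```python
import numpy as np

def mae(true_vals, completed_vals):
    """Mean absolute error between true and completed sensor values:
    MAE = (1/n) * sum_i |x'_i - x_i|."""
    t = np.asarray(true_vals, dtype=float)
    c = np.asarray(completed_vals, dtype=float)
    return float(np.abs(t - c).mean())

loss = mae([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])  # errors 2, 2, 3
```

In training, this scalar would be the quantity the Adam optimizer drives down across iterations.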
The beneficial effects of the invention are as follows. First, the use of the time-series character of the data is improved: traditional methods usually consider only the influence of historical data on the current time point, but in traffic flow completion the information at subsequent time points also influences the data at the current point. The invention considers the forward and reverse time series simultaneously, greatly improving completion accuracy. Second, the influence of external features such as holidays, and of areas adjacent to the sensor, on the traffic flow data is added to the completion model, greatly improving completion accuracy, including the completion of unusual values. Finally, the attenuation over the time dimension of the influence of missing data is also considered, further improving completion accuracy. The method not only greatly improves completion accuracy for traffic flow data at low loss rates but also achieves a good completion effect at higher data loss rates.
Drawings
Fig. 1 is a diagram of a completion model structure according to the present invention.
Fig. 2 is a graph comparing the completion results at a low loss rate (20% data loss) with the real values.

Fig. 3 is a graph comparing the completion results at a high loss rate (50% data loss) with the real values.
Detailed description of the invention
The technical solution of the present invention will be further described with reference to the following specific embodiments and accompanying drawings.
A traffic missing-data completion method based on a bidirectional recurrent neural network comprises the following steps:
first, preprocessing the traffic flow data
(1) Time-granularity division: all traffic flow data are aggregated at a time granularity of 5 minutes into per-5-minute traffic flow data;

(2) Data normalization: the traffic flow data are normalized using the minimum and maximum values, as follows:

x* = (x - x_min) / (x_max - x_min) * (max - min) + min

where x denotes the original value, x_min the minimum of the original values, x_max the maximum of the original values, max the normalized upper limit, min the normalized lower limit, [min, max] the normalized interval, and x* the normalized result.
Second, apply random data-point loss to the preprocessed data: a random-number method marks a chosen proportion of the data (set according to the experimental requirements) with missing labels as missing points, and the values at those points are recorded as true values for verifying the model's final completion performance.
Meanwhile, a time-dimension influence attenuation matrix is established. Data can be missing in continuous stretches; for example, a power failure may prevent a sensor from collecting data for several hours. As time accumulates, the influence of historical data on the data at the missing point becomes smaller and smaller, which affects completion accuracy, so the attenuation of the influence of time-dimension data needs to be recorded. The time-dimension influence attenuation matrix is defined as follows:
Third, divide the preprocessed traffic flow data into a training set, a verification set, and a test set in the ratio 8:1:1. In each data set, the different models use the following types of data:
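The 8:1:1 division can be sketched as below. The text does not say whether the split is sequential or shuffled; a sequential split, common for time-series data, is assumed here, and the function name `split_8_1_1` is illustrative.

```python
import numpy as np

def split_8_1_1(data):
    """Sequentially split a series into train/validation/test in an 8:1:1 ratio."""
    n = len(data)
    n_train, n_val = int(n * 0.8), int(n * 0.1)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

train, val, test = split_8_1_1(np.arange(100))
```

For 100 samples this yields 80 training, 10 validation, and 10 test points, with the test segment covering the most recent data.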
External feature data used by the external feature model: F_n;

Periodic sequence data used by the periodic feature module:

where n denotes the current time, t denotes the step length of the time series, and p denotes the step length of the periodic series. S denotes the traffic flow data and T denotes the reversal of S in the time dimension. s_n denotes the traffic flow data at time n; the remaining symbols denote, respectively, the traffic flow data at the same time of day i days before time n, the set of traffic flow data at the t time points before time n, and the set of traffic flow data at the same time of day over the p days up to and including the day of time n. F_n denotes the external features at time n, including holiday, location area, weather, and air temperature.
And fourthly, constructing a completion model, wherein the completion model comprises a forward sequence deep learning module, a reverse time sequence deep learning module, a periodic characteristic module and an external characteristic module, and the structure and the training mechanism of each module are as follows:
(1) Forward-sequence deep learning module: an LSTM model combining a linear regression network with multi-layer long short-term memory networks. A layer of linear regression adds continuity information about the current missing point in time, to handle long stretches of missing sequence and improve completion accuracy.

Implementation details of the forward-sequence deep learning module: the time-dimension attenuation matrix is input into a linear regression network, and the output of the linear regression network together with the forward time-series data is input into the LSTM network. If a data point x_t is not missing, it is input directly; when a data point is missing, the hidden state of the previous moment is used as the input for the current moment. After the input is processed, the deep learning network is trained, and the final output of the forward-sequence deep learning module is obtained through continuous iterative updates.

(2) Reverse-sequence deep learning module: its network structure is identical to that of the forward-sequence deep learning module; the difference is that the input of the forward-sequence module is reversed in the time dimension and used as this module's input.

(3) Periodic feature module: a module formed of three fully connected layers. By extracting features from periodic data, it captures the variation pattern of traffic flow for the same sensor and same time period across the historical data, and then outputs the extracted features. Implementation details: the periodic sequence data are input into the fully connected layers; the time-series features of the periodic data are extracted through the three layers and then output.
(4) External feature module: a feature-encoding layer. Implementation details: the external feature data are input into the feature-encoding layer, and textual external features such as weather and holidays are categorized. For example, the holiday feature is encoded as 1 for a holiday and 0 otherwise; the data are converted into vector form, and the resulting vector is output to the next step.
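The feature encoding described above can be sketched as below. Only the 1/0 holiday convention comes from the text; the weather vocabulary, the one-hot scheme, and the numeric region id are illustrative assumptions about how the other textual features might be categorized.

```python
def encode_external(is_holiday, weather, region_id,
                    weather_vocab=("sunny", "rain", "snow")):
    """Encode textual external features into a flat numeric vector:
    a 0/1 holiday flag, a one-hot weather code, and a region id."""
    vec = [1 if is_holiday else 0]                          # 1 = holiday, 0 = not
    vec += [1 if weather == w else 0 for w in weather_vocab]  # one-hot weather
    vec.append(region_id)                                   # numeric region code
    return vec

v = encode_external(True, "rain", 3)   # a holiday, rainy day in region 3
```

The resulting vector is what would be concatenated with the outputs of the other modules downstream.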
To take information across the road space into account, a spatial feature learning module is added: all sensors on the road section are input into the model simultaneously; the hidden states of the other sensors at the same time as the current sensor's missing point are used as input; the output is obtained after weights are computed through a Softmax network; and the output is fed into the forward-sequence and reverse-sequence modules.
Finally, the outputs of the modules are combined into a one-dimensional vector, and the final completion result is obtained through one fully connected layer.
Fifth, pre-train the pre-training part of the time-series deep learning model with the training set data, optimizing its parameters in advance to avoid converging to a local optimum during whole-model training.
Sixth, use the training set data and verification set data (points with missing data are replaced by their completion values; non-missing data remain unchanged) to train the four modules built in step four as a whole:

The preprocessed data are input into the corresponding modules, and all modules are trained jointly. After each training pass, the loss between the completed values and the true traffic flow values is computed, and the model parameters are trained toward the target values. The hyper-parameters are tuned continuously according to the model's performance on the training and verification sets, improving completion accuracy while reducing overfitting. During training, the mean absolute error (MAE) between the completed data and the true traffic flow values is computed at each iteration, and the MAE is minimized with the Adam method:

MAE = (1/n) Σ_{i=1}^{n} |x'_i - x_i|

where x'_i denotes the true sensor value at the i-th moment, and x_i denotes the completed sensor value at the i-th moment.
The input data comprises: forward time series data(front t)1Hourly traffic data), reverse time series data(after t)2Hourly traffic data), time dimension impact attenuation matrixPeriodic sequence data(front t)3Traffic data at the same time of day), extrinsic feature data Fn(external characteristic data of holidays, regions, weather and air temperature at the nth time) and truth value of traffic flow data(traffic flow data at the present time).
And seventhly, completing the traffic flow data by using the test set and utilizing the model trained in the sixth step.
The input data is: forward time series dataReverse time series dataPeriodic sequence dataExtrinsic feature dataTruth value of sum traffic flow dataTime dimension influence attenuation matrix
Fig. 2 compares the completion results with the real values at a data loss rate of 20%; the mean absolute error (MAE) between the model's completion results and the true traffic flow is 29.18 (the first 100 missing points are shown in the figure).

Fig. 3 compares the completion results with the real values at a data loss rate of 50%; the mean absolute error (MAE) between the model's completion results and the true traffic flow is 31.94 (the first 100 missing points are shown in the figure).
Claims (5)
1. A traffic missing-data completion method based on a bidirectional recurrent neural network, characterized by comprising the following steps:
firstly, preprocessing the traffic flow data
The preprocessing comprises time granularity division and data standardization;
secondly, performing random data point loss processing on the preprocessed data to construct a data set with missing points, and then recording position information of the missing points to be used as verification values; meanwhile, constructing a time dimension influence attenuation matrix:
thirdly, dividing the traffic data after loss processing into a training set, a verification set and a test set; in each data set, the data used by the different models are of the following types:
external feature data used by the external feature module: F_n;

where n denotes the current time, t denotes the step length of the time series, and p denotes the step length of the periodic series; S denotes the traffic flow data and T denotes the reversal of S in the time dimension; s_n denotes the traffic flow data at time n, and the remaining symbols denote, respectively, the traffic flow data at the same time of day i days before time n, the set of traffic flow data at the t time points before time n, and the set of traffic flow data at the same time of day over the p days up to and including the day of time n; F_n denotes the external features at time n, including holiday, location area, weather, and air temperature;
and fourthly, constructing a completion model, wherein the completion model comprises a forward time series deep learning module, a reverse time series deep learning module, a periodic characteristic module and an external characteristic module, and the structure and the training mechanism of each module are as follows:
(1) a forward time-series deep learning module: an LSTM model combining a linear regression network with multi-layer long short-term memory networks, in which a layer of linear regression adds continuity information about the current missing point in time, to handle long stretches of missing sequence and improve completion accuracy;

implementation details of the forward-sequence deep learning module: the time-dimension attenuation matrix is input into a linear regression network, and the output of the linear regression network together with the forward time-series data is input into the LSTM network; if a data point x_t is not missing, it is input directly, and when a data point is missing, the hidden state of the previous moment is used as the input of the current moment; after the input is processed, the deep learning network is trained, and the final output of the forward-sequence deep learning module is obtained through continuous iterative updates;
(2) the reverse time series deep learning module: the network structure is consistent with the forward sequence deep learning module, and the difference is that the input of the forward time sequence deep learning module is reversely processed in the time dimension and is used as the input of the module;
(3) a periodic feature learning module: the system is a module consisting of three layers of fully-connected networks, and is used for acquiring the change rule of the traffic flow of the same sensor and the same time period in historical data by extracting the characteristics of periodic data and then outputting the extracted characteristics; implementation details: inputting the periodic sequence data into a full-connection layer, extracting the time sequence characteristics of the periodic data through three full-connection layers, and then outputting;
(4) an external feature module: the module consists of two parts: the first part processes holiday and weather characteristics and is a characteristic coding layer; implementation details: inputting external feature data into a feature coding layer, converting the data into a vector form, and then combining the obtained vector with the outputs of the three modules;
the second part processes spatial characteristics, all sensors on a road section are simultaneously input into the second part, then the implicit states of other sensors at the same time as the missing point of the current sensor are used as input, the output is obtained after the weight is calculated through a Softmax network, and the output is input into a forward time sequence deep learning module and a reverse time sequence deep learning module;
finally, the outputs of the four modules are combined into a one-dimensional vector, and the final completion result is obtained through one fully-connected layer;
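The input rule of the forward sequence module described above — feed the observed value when present, otherwise substitute an estimate derived from the previous hidden state scaled by a temporal decay computed from the attenuation matrix — can be sketched as follows. This is a minimal illustration in the style of decay-gated recurrent imputation; all names (`decay`, `W_gamma`, `W_h`, etc.) and the exact form of the decay are assumptions, not the patent's implementation.

```python
import numpy as np

def decay(delta, W, b):
    # Temporal decay gamma in (0, 1]: shrinks toward 0 as the gap since the
    # last observation grows (an assumed, GRU-D/BRITS-style parameterisation).
    return np.exp(-np.maximum(0.0, W * delta + b))

def step_input(x_t, m_t, h_prev, delta_t, W_gamma, b_gamma, W_h, b_h):
    """Choose the LSTM input for one time step.

    x_t     : observed value (may be arbitrary where missing)
    m_t     : 1.0 if the point is observed, 0.0 if missing
    h_prev  : previous hidden state (vector)
    delta_t : time elapsed since the last observation
    """
    gamma = decay(delta_t, W_gamma, b_gamma)      # decay factor from the gap
    x_hat = float(W_h @ (gamma * h_prev) + b_h)   # estimate from history
    return m_t * x_t + (1.0 - m_t) * x_hat        # observed value wins
```

When `m_t` is 1 the observed value passes through unchanged; when it is 0 the network falls back on the decayed hidden state, so long gaps pull the estimate toward the regression baseline.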
fifthly, pre-training the pre-trainable parts of the forward and reverse time-series deep learning modules with the training set data, so that the parameters of the time-series deep learning modules are optimized in advance and are less likely to converge to a local optimum during overall training;
and sixthly, performing overall training on the four modules established in the step four by using the training set data and the verification set data:
inputting the preprocessed data into the corresponding modules and training all modules jointly; after each training pass, calculating the loss between the completion values and the true values of the traffic flow data and training the model parameters toward the target values; continuously tuning the hyper-parameters of the model according to its performance on the training and validation sets, improving completion accuracy while reducing overfitting;
the input data comprises:
periodic sequence data: traffic data at the same time of day over the previous t3 days;
external feature data: the external feature data F_n of holidays, area, weather and air temperature at the n-th moment;
After one iteration, the traffic flow data after one completion operation is obtained; the data from this iteration is used as the input of the next iteration, where previously missing points now carry completion values but their labels still mark them as missing, so subsequent iterations still aim to complete the data at these missing points;
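The iterative scheme above — refill only the points whose mask marks them missing, always restoring observed points from the original data — can be sketched as follows. The `model` callable stands in for one pass through the full network and is a placeholder assumption.

```python
import numpy as np

def iterative_complete(x, mask, model, n_iter=3):
    """Iteratively fill missing entries (mask == 0) with model outputs.

    The mask is never updated: originally-missing points keep their
    'missing' label, so each iteration keeps refining their completion
    values, while observed points are always restored from the data.
    """
    filled = np.where(mask == 1, x, 0.0)        # start with missing points zeroed
    for _ in range(n_iter):
        pred = model(filled)                    # one completion pass
        filled = np.where(mask == 1, x, pred)   # overwrite missing points only
    return filled
```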
seventhly, completing the traffic flow data of the test set using the model trained in step six;
the input data are: the forward time series data, the reverse time series data, the periodic sequence data, the time dimension influence attenuation matrix, the external feature data, and the true values of the traffic flow data;
And obtaining completion values of the missing traffic flow data through the model of step six, and comparing the completion values with the verification values obtained after the missing-data processing in step two, so as to verify the completion effect of the model.
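The periodic feature learning module of step four (three stacked fully-connected layers over same-time-of-day history) can be sketched as a plain forward pass. Layer shapes, the ReLU nonlinearity on hidden layers, and all parameter names are assumptions for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def periodic_features(p, params):
    """Three stacked fully-connected layers over the periodic sequence.

    p      : vector of traffic values at the same time of day over previous days
    params : list of (W, b) pairs, one per layer (three for this module)
    """
    h = p
    for i, (W, b) in enumerate(params):
        h = W @ h + b
        if i < len(params) - 1:   # hidden layers get a nonlinearity (assumed)
            h = relu(h)
    return h
```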
2. The traffic missing data completion method based on the bidirectional recurrent neural network as claimed in claim 1, wherein in step one the preprocessing comprises:
(1) time granularity division: processing all traffic flow data into traffic flow data of every k minutes according to the time granularity of k minutes;
(2) data normalization: the traffic flow data is normalized using its minimum and maximum values, as follows:
x* = (x − x_min) / (x_max − x_min) × (max − min) + min
wherein x represents the original value, x_min the minimum of the original values, x_max the maximum of the original values, max the upper limit of the normalized interval, min its lower limit, [min, max] the normalized interval, and x* the normalized result.
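The two preprocessing steps of claim 2 — aggregating raw counts into k-minute bins and min-max scaling into a target interval — can be sketched as follows. The helper names are illustrative; only the min-max formula itself comes from the claim.

```python
import numpy as np

def to_k_minute_bins(counts_per_minute, k):
    """Sum per-minute traffic counts into k-minute totals (length must divide k)."""
    c = np.asarray(counts_per_minute, dtype=float)
    return c.reshape(-1, k).sum(axis=1)

def min_max_normalize(x, lo=0.0, hi=1.0):
    """x* = (x - x_min) / (x_max - x_min) * (hi - lo) + lo"""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min()) * (hi - lo) + lo
```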
3. The traffic missing data completion method based on the bidirectional recurrent neural network as claimed in claim 1 or 2, wherein in step four the specific process of processing spatial features is: let the hidden states of all sensors at the current moment be h = <h_1, h_2, h_3, …, h_i, …, h_l>, where h_i is the hidden state of the i-th sensor at the current moment; then for each h_i a weight is calculated to obtain the new hidden state h'_i of the current sensor;
wherein l represents the number of sensors and h_ij represents the hidden state of the j-th sensor at moment i.
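The Softmax weighting of claim 3 can be sketched as follows. The claim only states that weights come from a Softmax network; the dot-product scoring function and the exclusion of the current sensor used here are assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def spatial_state(h_all, i):
    """Weighted combination of the other sensors' hidden states.

    h_all : (l, d) hidden states of all l sensors at the same moment
    i     : index of the sensor whose missing point is being completed
    """
    scores = h_all @ h_all[i]   # similarity of each sensor to sensor i (assumed)
    scores[i] = -np.inf         # exclude the current sensor itself
    w = softmax(scores)         # Softmax attention weights over the others
    return w @ h_all            # new hidden state h'_i
```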
4. The traffic missing data completion method based on the bidirectional recurrent neural network as claimed in claim 1 or 2, wherein in step six, the mean absolute error (MAE) between the completed data obtained in each iteration and the true values of the traffic flow data is calculated, and the MAE is minimized using the Adam method:
MAE = (1/n) Σ_i |x'_i − x_i|
wherein x'_i represents the true sensor value at the i-th moment, x_i the completed sensor value at the i-th moment, and n the number of evaluated moments.
5. The traffic missing data completion method based on the bidirectional recurrent neural network as claimed in claim 3, wherein in step six, the mean absolute error (MAE) between the completed data obtained in each iteration and the true values of the traffic flow data is calculated, and the MAE is minimized using the Adam method:
MAE = (1/n) Σ_i |x'_i − x_i|
wherein x'_i represents the true sensor value at the i-th moment, x_i the completed sensor value at the i-th moment, and n the number of evaluated moments.
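The MAE of claims 4 and 5 is only meaningful where a ground-truth value exists (e.g. the points artificially dropped in step two and held out for verification). A masked version can be sketched as follows; the masking convention is an assumption.

```python
import numpy as np

def masked_mae(x_true, x_filled, mask):
    """Mean absolute error, evaluated only where ground truth exists.

    mask : 1.0 where the true value is known, 0.0 elsewhere
    """
    diff = np.abs(np.asarray(x_true) - np.asarray(x_filled))
    return float((diff * mask).sum() / mask.sum())
```

In training, this quantity would be the loss handed to an optimizer such as Adam, as the claims state.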
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911106967.7A CN110837888A (en) | 2019-11-13 | 2019-11-13 | Traffic missing data completion method based on bidirectional cyclic neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110837888A true CN110837888A (en) | 2020-02-25 |
Family
ID=69576320
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5822712A (en) * | 1992-11-19 | 1998-10-13 | Olsson; Kjell | Prediction method of traffic parameters |
US20150120174A1 (en) * | 2013-10-31 | 2015-04-30 | Here Global B.V. | Traffic Volume Estimation |
CN107154150A (en) * | 2017-07-25 | 2017-09-12 | 北京航空航天大学 | A kind of traffic flow forecasting method clustered based on road with double-layer double-direction LSTM |
CN107610469A (en) * | 2017-10-13 | 2018-01-19 | 北京工业大学 | A kind of day dimension regional traffic index forecasting method for considering multifactor impact |
CN107680377A (en) * | 2017-11-06 | 2018-02-09 | 浙江工商大学 | Traffic flow data based on trend fitting intersects complementing method |
CN107992536A (en) * | 2017-11-23 | 2018-05-04 | 中山大学 | Urban transportation missing data complementing method based on tensor resolution |
CN108010320A (en) * | 2017-12-21 | 2018-05-08 | 北京工业大学 | A kind of complementing method of the road grid traffic data based on adaptive space-time constraint low-rank algorithm |
CN108090558A (en) * | 2018-01-03 | 2018-05-29 | 华南理工大学 | A kind of automatic complementing method of time series missing values based on shot and long term memory network |
CN108205889A (en) * | 2017-12-29 | 2018-06-26 | 长春理工大学 | Freeway traffic flow Forecasting Methodology based on convolutional neural networks |
CN109146156A (en) * | 2018-08-03 | 2019-01-04 | 大连理工大学 | A method of for predicting charging pile system charge volume |
CN109598935A (en) * | 2018-12-14 | 2019-04-09 | 银江股份有限公司 | A kind of traffic data prediction technique based on ultra-long time sequence |
CN110070713A (en) * | 2019-04-15 | 2019-07-30 | 浙江工业大学 | A kind of traffic flow forecasting method based on two-way nested-grid ocean LSTM neural network |
CN110162744A (en) * | 2019-05-21 | 2019-08-23 | 天津理工大学 | A kind of multiple estimation new method of car networking shortage of data based on tensor |
CN110223510A (en) * | 2019-04-24 | 2019-09-10 | 长安大学 | A kind of multifactor short-term vehicle flowrate prediction technique based on neural network LSTM |
US20190286990A1 (en) * | 2018-03-19 | 2019-09-19 | AI Certain, Inc. | Deep Learning Apparatus and Method for Predictive Analysis, Classification, and Feature Detection |
CN110322695A (en) * | 2019-07-23 | 2019-10-11 | 内蒙古工业大学 | A kind of Short-time Traffic Flow Forecasting Methods based on deep learning |
Non-Patent Citations (8)
Title |
---|
DONALD B. RUBIN: "Inference and missing data", 《BIOMETRIKA》 * |
FILIPE RODRIGUES ET AL.: "Multi-Output Gaussian Processes for Crowdsourced Traffic Data Imputation", 《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》 * |
HAN-GYU KIM ET AL.: "Medical examination data prediction with missing information imputation based on recurrent neural networks", 《INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS》 * |
LABLACK MOURAD ET AL.: "ASTIR: Spatio-Temporal Data Mining for Crowd Flow Prediction", 《IEEE ACCESS》 * |
WEI CAO ET AL.: "BRITS: Bidirectional Recurrent Imputation for Time Series", 《ARXIV》 * |
YI-FAN ZHANG ET AL.: "SSIM—A Deep Learning Approach for Recovering Missing Time Series Sensor Data", 《IEEE INTERNET OF THINGS JOURNAL》 * |
REN YIKE: "Traffic Flow Prediction Based on an Improved LSTM Network", 《WANFANG》 *
ZHU YONG: "Research on Traffic Flow Prediction Methods Based on a Spatio-Temporal Correlation Hybrid Model", 《CHINA MASTER'S THESES FULL-TEXT DATABASE, ENGINEERING SCIENCE AND TECHNOLOGY II》 *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112417000A (en) * | 2020-11-18 | 2021-02-26 | 杭州电子科技大学 | Time sequence missing value filling method based on bidirectional cyclic codec neural network |
CN113094357A (en) * | 2021-04-23 | 2021-07-09 | 大连理工大学 | Traffic missing data completion method based on space-time attention mechanism |
CN113239029A (en) * | 2021-05-18 | 2021-08-10 | 国网江苏省电力有限公司镇江供电分公司 | Completion method for missing daily freezing data of electric energy meter |
CN113392139A (en) * | 2021-06-04 | 2021-09-14 | 中国科学院计算技术研究所 | Multi-element time sequence completion method and system based on association fusion |
CN113392139B (en) * | 2021-06-04 | 2023-10-20 | 中国科学院计算技术研究所 | Environment monitoring data completion method and system based on association fusion |
CN113554105A (en) * | 2021-07-28 | 2021-10-26 | 桂林电子科技大学 | Missing data completion method for Internet of things based on space-time fusion |
CN113554105B (en) * | 2021-07-28 | 2023-04-18 | 桂林电子科技大学 | Missing data completion method for Internet of things based on space-time fusion |
CN114611396A (en) * | 2022-03-15 | 2022-06-10 | 国网安徽省电力有限公司蚌埠供电公司 | Line loss analysis method based on big data |
CN116595806A (en) * | 2023-07-14 | 2023-08-15 | 江西师范大学 | Self-adaptive temperature data complement method |
CN116595806B (en) * | 2023-07-14 | 2023-10-10 | 江西师范大学 | Self-adaptive temperature data complement method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110837888A (en) | Traffic missing data completion method based on bidirectional cyclic neural network | |
CN111899510B (en) | Intelligent traffic system flow short-term prediction method and system based on divergent convolution and GAT | |
CN112365040B (en) | Short-term wind power prediction method based on multi-channel convolution neural network and time convolution network | |
CN113094357B (en) | Traffic missing data completion method based on space-time attention mechanism | |
CN110223517B (en) | Short-term traffic flow prediction method based on space-time correlation | |
CN109255505B (en) | Short-term load prediction method of multi-model fusion neural network | |
CN109685252B (en) | Building energy consumption prediction method based on cyclic neural network and multi-task learning model | |
CN109146156B (en) | Method for predicting charging amount of charging pile system | |
CN110766212B (en) | Ultra-short-term photovoltaic power prediction method for historical data missing electric field | |
CN109886444A (en) | A kind of traffic passenger flow forecasting, device, equipment and storage medium in short-term | |
CN110619430A (en) | Space-time attention mechanism method for traffic prediction | |
CN111815033A (en) | Offshore wind power prediction method based on RCNN and meteorological time sequence characteristics | |
CN109902862A (en) | A kind of time series forecasting system of time of fusion attention mechanism | |
CN111027772A (en) | Multi-factor short-term load prediction method based on PCA-DBILSTM | |
CN113723010B (en) | Bridge damage early warning method based on LSTM temperature-displacement correlation model | |
CN109583565A (en) | Forecasting Flood method based on the long memory network in short-term of attention model | |
CN111626764A (en) | Commodity sales volume prediction method and device based on Transformer + LSTM neural network model | |
CN111861013A (en) | Power load prediction method and device | |
CN112257847A (en) | Method for predicting geomagnetic Kp index based on CNN and LSTM | |
CN114781744A (en) | Deep learning multi-step long radiance prediction method based on codec | |
CN113947182A (en) | Traffic flow prediction model construction method based on double-stage stack graph convolution network | |
CN114120637A (en) | Intelligent high-speed traffic flow prediction method based on continuous monitor | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
Zhichao et al. | Short-term load forecasting of multi-layer LSTM neural network considering temperature fuzzification | |
CN115392387B (en) | Low-voltage distributed photovoltaic power generation output prediction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20200225 |