CN109118763B

CN109118763B - Vehicle flow prediction method based on corrosion denoising deep belief network

Info

Publication number: CN109118763B
Application number: CN201810986737.3A
Authority: CN
Inventors: 阮雅端; 张园笛; 葛嘉琦; 王麟皇; 曹小峰; 陈启美
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2018-08-28
Filing date: 2018-08-28
Publication date: 2021-05-18
Anticipated expiration: 2038-08-28
Also published as: CN109118763A

Abstract

The invention provides a traffic flow prediction method based on a corrosion denoising deep belief network, which is used for predicting the traffic flow conditions at the current time and the future time by using the traffic flow conditions at the historical time. Meanwhile, a traffic flow prediction model is established by combining a specific application scene and considering the spatial correlation and the time regularity of traffic flow, so that accurate, reliable and real-time traffic flow prediction is realized, traffic scheduling is effectively optimized, traffic pressure is relieved, and the operation efficiency of a road network is improved. The method has great significance to the development of intelligent transportation.

Description

Vehicle flow prediction method based on corrosion denoising deep belief network

Technical Field

The invention belongs to the technical field of artificial intelligence, relates to intelligent traffic and neural network learning system technology, and discloses a traffic flow prediction method based on a corrosion denoising deep belief network.

Background

With the development of urbanization, traffic congestion has become a common phenomenon that seriously affects the convenience of travel. Accurate, reliable, real-time traffic flow prediction is an urgent problem in current intelligent transportation: solving it can effectively optimize traffic scheduling, relieve traffic pressure, improve the operating efficiency of the road network, and improve people's comfort and quality of life. In recent years, the development of artificial intelligence has also advanced research on traffic flow prediction, and deep-learning-based methods have been applied to intelligent transportation. Models built with these methods are robust, have strong learning ability, and can cope with complex and changeable traffic conditions. Using deep learning in combination with a specific application scenario to build a corresponding traffic flow prediction model yields better prediction performance, which is of great significance for the development of intelligent transportation.

Traffic flow prediction in the present invention means predicting the traffic flow at the current and future times from the traffic flow at past times. The most common early traffic flow prediction method is the time series model, a linear model with a simple structure and high calculation speed; however, it cannot handle complex and changeable traffic conditions well, and its prediction accuracy is low. The underlying pattern of traffic flow data is nonlinear, and nonparametric machine learning methods can learn a function of any form from training data and so fit a better prediction model. In machine learning, various regression algorithms can be used effectively for traffic flow prediction, such as support vector machine models, K-nearest-neighbor regression models, random forest models, and neural network models. A neural network model simulates the neural network of the human brain, has strong learning capability, needs no manually designed features, and can learn the corresponding nonlinear mapping from raw input data alone, so it is widely used.

As the number of network layers increases, a neural network model can learn more complex mappings. However, training based on the traditional back propagation algorithm usually traps a deep network in a local optimum, which degrades model performance; the deep belief network addresses this problem with a greedy layer-by-layer pre-training algorithm.

When a deep network is trained directly with a gradient-descent-based back propagation algorithm and no pre-training strategy, weights that are initialized too large often lead to a local minimum, while weights that are initialized too small cause the gradient to vanish. Greedy layer-by-layer pre-training enables the network to find the global optimum quickly and is a suitable initialization strategy. However, the overfitting problem in the network is still not well solved.

Disclosure of Invention

The invention aims to solve the problems that: neural network learning technology is increasingly used for traffic flow prediction, but the current neural network prediction method still needs to be improved to meet the technical requirements of traffic flow prediction.

The technical scheme of the invention is as follows: based on a deep belief network, the traffic flow at the current time and at future times is predicted from the traffic flow at historical times. The historical traffic flow data are divided into a training set and a test set; the deep belief network model is trained with the training set, and its prediction performance is tested with the test set to obtain a trained traffic flow prediction model, which is then used to predict the traffic flow at the current and future times. The middle layer of the deep belief network is formed by stacked de-corrosion restricted Boltzmann machines: a random corrosion layer is added at the input end of a restricted Boltzmann machine, the damaged output of the random corrosion layer serves as the visible layer of the restricted Boltzmann machine, and the hidden layer is unchanged.

Further, the random corrosion layer is realized by setting a corrosion probability, which is a global hyperparameter. The smaller the corrosion probability, the more neurons are retained; when the corrosion probability is 0, the random corrosion layer degenerates into an ordinary identity layer that simply copies its input to its output. The greater the corrosion probability, the more neurons are deactivated, the weaker the association between neurons, and the more difficult the feature learning. A reasonable corrosion probability value is determined experimentally.

As a preferred mode, when the traffic flow prediction model is established, the correlation between the traffic flow information of upstream and downstream road sections and adjacent road sections, and the dependency of the traffic flow information at the current time on the traffic flow information at historical times, are combined to obtain the following prediction model framework:

the bottom layer is the data input layer of the prediction model, with input X_{1,t-1}, X_{1,t-2}, …, X_{1,t-d}, X_{2,t-1}, X_{2,t-2}, …, X_{m,t-d}, where X_{i,j} represents the traffic flow of the i-th vehicle detector at time j, i = 1, …, m, m is the total number of vehicle detectors, t is the prediction time, and j = t-1, …, t-d; that is, the input to the prediction model is the traffic flow information of all relevant vehicle detectors in the road network over the previous d time periods;

the middle layer consists of stacked de-corrosion restricted Boltzmann machines. The de-corrosion restricted Boltzmann machine is a generative stochastic neural network based on an energy function, and the whole network is divided into two layers, a visible layer and a hidden layer, where the visible layer is the input layer of the restricted Boltzmann machine and the hidden layer is its feature extraction layer. When the traffic flow prediction model is pre-trained, a random corrosion layer is placed at the very front of each de-corrosion restricted Boltzmann machine: input data first enter the random corrosion layer, and the damaged output after corrosion is used as the visible layer. The traffic flow prediction model has no random corrosion layer during fine-tuning and testing;

the top layer is the logistic regression output layer of the prediction model, with output Y_1, Y_2, Y_3, …, Y_m, where Y_i represents the traffic flow of the i-th vehicle detector at the prediction time t.
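As a non-limiting illustration, the input and output vectors of the framework above could be assembled as in the following Python sketch. The function name `build_samples` and the one-row-per-detector layout of `flows` are assumptions made for this example and do not appear in the original text.

```python
from typing import List, Sequence, Tuple


def build_samples(
    flows: Sequence[Sequence[float]], d: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build (input, target) pairs from per-detector flow series.

    flows[i][j] is the traffic flow of detector i at time step j (assumed layout).
    Each input is [X_1,t-1, ..., X_1,t-d, X_2,t-1, ..., X_m,t-d];
    each target is [Y_1, ..., Y_m], the flows of all m detectors at time t.
    """
    m = len(flows)
    n_steps = len(flows[0])
    inputs, targets = [], []
    for t in range(d, n_steps):
        # For every detector, take its d most recent readings, newest first.
        x = [flows[i][t - k] for i in range(m) for k in range(1, d + 1)]
        y = [flows[i][t] for i in range(m)]
        inputs.append(x)
        targets.append(y)
    return inputs, targets
```

Each input vector has m·d entries, so both the spatial neighbours (all m detectors) and the temporal history (d lags) reach the first visible layer together.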

As a preferred mode, training the deep belief network model specifically comprises:

Step 1: pre-train a de-corrosion restricted Boltzmann machine. Set the corrosion probability and input the traffic flow data into the random corrosion layer of the first stacked de-corrosion restricted Boltzmann machine. The random corrosion layer corrodes the data with the preset corrosion probability, and the damaged output is used as the visible layer of the de-corrosion restricted Boltzmann machine. The hidden layer feature representation is obtained through the energy-based generative function, and the input is reconstructed from the hidden layer feature representation through the same function. The parameters are updated using the log-likelihood function so that the probability distribution represented by the restricted Boltzmann machine under those parameters fits the training data as closely as possible. The pre-trained hidden layer feature output is used as the input of the next de-corrosion restricted Boltzmann machine;

Step 2: fix the weight and bias parameters of the pre-trained de-corrosion restricted Boltzmann machine and begin pre-training the next one. The input end of the next de-corrosion restricted Boltzmann machine is likewise immediately preceded by a random corrosion layer, which corrodes its input with the same preset corrosion probability, and the subsequent training process is the same as for the previous machine. By analogy, the output of each de-corrosion restricted Boltzmann machine first enters the random corrosion layer of the next one, and the damaged input is then used as the visible layer to continue training;

Step 3: after all de-corrosion restricted Boltzmann machines have been pre-trained, add a prediction regression layer on top of the network model for predicting the traffic flow;

Step 4: perform supervised fine-tuning of the whole network model with the back propagation algorithm. In the first three epochs only the weight and bias parameters of the last network layer are updated; afterwards the parameters of all layers are updated, yielding the finally trained deep belief network model.
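The fine-tuning schedule of step 4 can be illustrated with a short Python sketch. The helper name `trainable_layers`, the 0-based epoch counter, and the convention that the regression layer has the highest index are assumptions made for this example only.

```python
from typing import List


def trainable_layers(epoch: int, n_layers: int, warmup_epochs: int = 3) -> List[int]:
    """Return the indices of the layers whose parameters are updated in this epoch.

    Layers are indexed 0..n_layers-1 with the regression output layer last.
    During the first `warmup_epochs` epochs only the last layer is updated;
    afterwards every layer is updated.
    """
    if epoch < warmup_epochs:
        return [n_layers - 1]
    return list(range(n_layers))
```

Updating only the newly added output layer first lets its randomly initialized weights settle before gradients are allowed to disturb the pre-trained layers beneath it.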

Preferably, when the test set is used to test the prediction performance of the model, the evaluation criterion is the mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right|$$

where Y_i is the actual traffic flow, \hat{Y}_i is the predicted traffic flow, and N is the number of samples.
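For illustration, the mean absolute percentage error in its usual definition (the average of |actual − predicted| / |actual| over the N samples) might be computed as in the following Python sketch. The function name `mape` is an assumption of this example, and the result is returned as a fraction rather than multiplied by 100.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")
    # Each term is the absolute error relative to the actual flow.
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))
    return total / len(actual)
```

A sample whose actual flow is zero makes the relative error undefined; this sketch lets the resulting ZeroDivisionError propagate rather than guessing how such samples should be treated.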

When the deep belief network model is trained, the hyperparameters to be tuned are: the number of stacked de-corrosion restricted Boltzmann machines N_layer, the number of hidden layer nodes of each de-corrosion restricted Boltzmann machine N_node, the number of pre-training epochs of each de-corrosion restricted Boltzmann machine N_epoch, the number of historical time periods d required to predict the traffic flow at the current time, and the corrosion probability Clevel. The hyperparameter settings are determined by grid search according to the MAPE error function. To reduce the search space, all de-corrosion restricted Boltzmann machines use the same number of hidden layer nodes N_node, the same number of pre-training epochs N_epoch, and the same corrosion probability Clevel.
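The grid search described above can be sketched in Python as follows. The function `grid_search` and its `evaluate` callback are hypothetical names introduced for this example; `evaluate` is assumed to train a model with the given hyperparameters and return its MAPE, and no particular candidate values are implied by the original text.

```python
import itertools
from typing import Any, Callable, Dict, Sequence, Tuple


def grid_search(
    grid: Dict[str, Sequence[Any]],
    evaluate: Callable[[Dict[str, Any]], float],
) -> Tuple[Dict[str, Any], float]:
    """Try every combination in `grid` and keep the one with the lowest error."""
    names = list(grid)
    best_params: Dict[str, Any] = {}
    best_score = float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # assumed to return the MAPE for these settings
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Because N_node, N_epoch and Clevel are shared by all stacked machines, the grid has only five axes (N_layer, N_node, N_epoch, d, Clevel) regardless of the network depth.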

The corrosion layer of the present invention can be viewed as a means of regularization. Its action mechanism is to randomly erode the neurons in the layer, i.e. each node is damaged and inactivated with a certain probability. This probability is a preset corrosion probability. Before corroding, each neuron participates in the training of the network and coordinates with each other, and the extraction of the features by a certain neuron is influenced by other dependent neurons, so that complex correlation exists. This complex correlation is one of the main causes of overfitting. The random corrosion layer can remove the interdependence among a part of neurons, force the retained neurons to work in a coordinated manner, weaken the fixed association and improve the network robustness and generalization capability.

The corrosion is random and irregular, so a different visible layer is obtained in each training epoch, and therefore a restricted Boltzmann machine with a different structure is obtained each time. Such an operation amounts to training several different network structures and then averaging their results. Averaging different network structures improves the generalization ability of the model and helps mitigate overfitting.

The method uses deep learning and is based on a deep belief network. When the restricted Boltzmann machine, the basic building unit, is trained, a random corrosion layer is added at its input end so that a denoising mechanism is incorporated, which improves the generalization capability of the network and reduces the risk of overfitting. Meanwhile, in combination with the specific application scenario, the method takes into account the correlation between the traffic flow information of upstream and downstream road sections and adjacent road sections, as well as the dependency of the traffic flow information at the current time on that at historical times. The spatial correlation and temporal regularity of the traffic flow information are thus built into the network structure, producing a traffic flow prediction neural network model with higher accuracy. This provides good advance judgment for traffic scheduling and relieves traffic pressure.

Drawings

FIG. 1 is a training diagram of the de-corrosion restricted Boltzmann machine.

FIG. 2 is a diagram of a traffic flow prediction model structure based on a corrosion denoising deep belief network.

FIG. 3 is a flow chart of vehicle flow prediction model training based on a corrosion denoising deep belief network.

FIG. 4 is a diagram of the traffic flow prediction effect on one working day according to the method of the present invention.

FIG. 5 is a diagram of the traffic flow prediction effect over five consecutive working days according to the method of the present invention.

Detailed Description

The invention uses deep learning to modify the traditional deep belief network so as to obtain more representative features, improve the generalization capability of the model, and effectively reduce overfitting. The deep belief network of the invention is built from stacked restricted Boltzmann machines.

The corrosion probability is a global hyperparameter. The smaller the corrosion probability, the more neurons are retained; when the corrosion probability is 0, the random corrosion layer degenerates into an ordinary identity layer that simply copies its input to its output. The greater the corrosion probability, the more neurons are deactivated, the weaker the association between neurons, and the more difficult the feature learning. A reasonable corrosion probability value must be set experimentally.

After the input passes through the corrosion layer, the restricted Boltzmann machine receives an input layer in which some nodes are deactivated, which is equivalent to noise contamination. The machine therefore has to model the energy distribution of the network nodes and also remove the influence of the corrosion noise. The new Boltzmann machine provided by the invention is called a de-corrosion restricted Boltzmann machine; training it forces the hidden layer units to learn more robust features, which yields a network with stronger generalization performance.

Based on the points, the traffic flow prediction method based on the corrosion denoising deep belief network can cope with complex and variable traffic conditions, and meanwhile, the spatial correlation and the time regularity of traffic flow information are considered in the model structure design, so that the prediction accuracy is further improved.

The concrete model framework of the invention is as follows:

the bottom layer is the model data input layer, with input X_{1,t-1}, X_{1,t-2}, …, X_{1,t-d}; X_{2,t-1}, X_{2,t-2}, …; X_{m,t-1}, …, X_{m,t-d}, where X_{i,j} represents the traffic flow of the i-th vehicle detector at time j, i = 1, …, m, m is the total number of vehicle detectors, t is the prediction time, and j = t-1, …, t-d. This input fully reflects the spatial correlation and temporal regularity of the traffic flow: spatial correlation means that the traffic flow of upstream, downstream and adjacent road sections greatly influences the traffic flow of the road section being predicted, and temporal regularity means that there is obvious traffic flow trend information between consecutive time periods;

the middle layer consists of stacked de-corrosion restricted Boltzmann machines, which are also the most important basic building unit of the model. The de-corrosion restricted Boltzmann machine is a generative stochastic neural network based on an energy function, and the whole network is divided into two layers, a visible layer and a hidden layer, where the visible layer is the input layer of the restricted Boltzmann machine and the hidden layer is its feature extraction layer. All hidden layer nodes are conditionally independent given the state of the visible layer, and all visible layer nodes are conditionally independent given the state of the hidden layer;

when the traffic flow prediction model is pre-trained, a random corrosion layer is placed at the very front of each de-corrosion restricted Boltzmann machine: input data first enter the random corrosion layer, and the damaged output after corrosion is used as the visible layer. The traffic flow prediction model has no random corrosion layer during fine-tuning and testing;

the top layer is the model's logistic regression output layer, with output Y_1, Y_2, Y_3, …, Y_m, where Y_i represents the traffic flow of the i-th vehicle detector at the prediction time t.

The invention will be further explained with reference to the drawings.

The training process of the de-corrosion restricted Boltzmann machine is shown in FIG. 1. The input first passes through a random corrosion layer, where each neuron is corroded and damaged with a certain probability; this removes the complex correlation among some of the neurons and strengthens the generalization capability. Compared with the traditional restricted Boltzmann machine, the de-corrosion restricted Boltzmann machine therefore has to model the energy distribution of the hidden and visible layers and also remove the influence of the input corrosion noise. It thus learns the useful information in the randomly retained neurons and ultimately learns more expressive features.

Random corrosion process:

$$\mathrm{mask}_p \sim \mathrm{Bernoulli}(1 - \mathrm{Clevel}) \tag{1}$$

$$v_p = \mathrm{mask}_p \cdot x_p \tag{2}$$

where Clevel is the corrosion probability and mask_p is the state of neuron p after the random corrosion layer. mask_p follows a Bernoulli distribution: each variable is 1 with probability 1-Clevel and 0 with probability Clevel. State 1 means the neuron is intact and undamaged, and state 0 means the neuron is corroded and damaged. x_p represents the original complete input, and v_p represents the visible layer of the de-corrosion restricted Boltzmann machine after the corrosion layer.
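Equations (1) and (2) can be illustrated with the following NumPy sketch. The function name `corrode` and the use of `numpy.random.Generator` are choices made for this example and are not part of the original description.

```python
import numpy as np


def corrode(x: np.ndarray, clevel: float, rng: np.random.Generator) -> np.ndarray:
    """Apply the random corrosion layer: each entry survives with probability 1 - clevel."""
    if not 0.0 <= clevel <= 1.0:
        raise ValueError("clevel must lie in [0, 1]")
    # Equation (1): mask_p is 1 with probability 1 - Clevel and 0 with probability Clevel.
    mask = rng.binomial(1, 1.0 - clevel, size=x.shape)
    # Equation (2): corroded neurons are zeroed, intact ones pass through unchanged.
    return mask * x
```

With `clevel = 0` the mask is all ones and the layer reduces to the identity; with `clevel = 1` every entry is zeroed. A fresh mask is drawn on every call, so each training epoch sees a differently corroded visible layer.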

When the restricted Boltzmann machine is trained, the damaged data enter the machine, and the hidden layer units are forced to use the randomly retained neuron connections to learn more robust features and reconstruct the original undamaged data. The network at this step is divided into two layers: a visible layer and a hidden layer. The visible layer is the damaged input, and the hidden layer is the feature extraction layer. The energy function of the restricted Boltzmann machine is:

$$E(v, h) = -\sum_{p} a_p v_p - \sum_{q} b_q h_q - \sum_{p}\sum_{q} v_p w_{pq} h_q \tag{3}$$

where v denotes the visible layer units, h denotes the hidden layer units, a and b are the biases of the visible layer units and hidden layer units respectively, the subscripts p and q are unit indices, and w is the weight matrix.

In the de-corrosion restricted Boltzmann machine there are connections between the layers but none within a layer. Thus all hidden layer nodes are conditionally independent when the visible layer state is known, and similarly all visible layer nodes are conditionally independent when the hidden layer state is known. This gives the following conditional probability formulas:

$$P(h_q = 1 \mid v) = \mathrm{sigm}\Big(b_q + \sum_{p} v_p w_{pq}\Big) \tag{4}$$

$$P(v_p = 1 \mid h) = \mathrm{sigm}\Big(a_p + \sum_{q} w_{pq} h_q\Big) \tag{5}$$

where sigm(x) = 1/[1 + exp(-x)] is the sigmoid logistic function.
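As an illustration of the two conditional probabilities, the NumPy sketch below evaluates them for a batch of states in the standard restricted Boltzmann machine form. The function names and the convention that `w` has shape (number of visible units, number of hidden units) are assumptions of this example.

```python
import numpy as np


def sigm(x: np.ndarray) -> np.ndarray:
    """Sigmoid logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def p_h_given_v(v: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """P(h_q = 1 | v) for every hidden unit; v has shape (batch, n_visible)."""
    return sigm(b + v @ w)


def p_v_given_h(h: np.ndarray, w: np.ndarray, a: np.ndarray) -> np.ndarray:
    """P(v_p = 1 | h) for every visible unit; h has shape (batch, n_hidden)."""
    return sigm(a + h @ w.T)
```

Because there are no connections within a layer, each probability depends only on the opposite layer, which is why a whole layer can be computed with a single matrix product.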

The process of pre-training the de-corrosion restricted Boltzmann machine is shown in FIG. 1 and is described in detail as follows:

Step 1: the input enters the random corrosion layer, where each node is corroded and damaged with a certain probability, removing the interdependence among some of the neurons; the damaged output data are then used as the visible layer units of the de-corrosion restricted Boltzmann machine;

Step 2: randomly initialize the weight matrix w and the bias vectors a and b; from the state of the visible layer units v_0, obtain the state of the hidden layer units h_0 using formula (4);

Step 3: from the state of the hidden layer units h_0, reconstruct the state of the visible layer units v_1 using formula (5);

Step 4: using formula (4) again, reconstruct the state of the hidden layer units h_1 from the visible layer units v_1;

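Step 5: update the weight matrix w and the bias vectors a and b by the following formulas:

$$\Delta w_{pq} = \varepsilon\,\big[E(v_p h_q)_{input} - E(v_p h_q)_{recon}\big]$$

$$\Delta a_p = \varepsilon\,\big[E(v_p)_{input} - E(v_p)_{recon}\big]$$

$$\Delta b_q = \varepsilon\,\big[E(h_q)_{input} - E(h_q)_{recon}\big]$$

where E(v_p h_q)_input denotes the expectation under the input data distribution, E(v_p h_q)_recon denotes the expectation under the reconstructed data distribution, and ε is the learning rate.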
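Steps 1 to 5 correspond to one step of contrastive divergence (CD-1) applied to a corroded input. The NumPy sketch below is one possible reading under stated assumptions: inputs are scaled to [0, 1], the corroded data v_0 (not the clean input) is used as the "input" term of the update, probabilities rather than binary samples are used for the reconstruction, and the learning rate `lr` and all function names are introduced for this example only.

```python
import numpy as np


def sigm(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def cd1_update(x, w, a, b, clevel, lr, rng):
    """One CD-1 step on a batch x of shape (n, n_visible); returns new (w, a, b)."""
    # Step 1: random corrosion layer produces the visible layer v0.
    v0 = rng.binomial(1, 1.0 - clevel, size=x.shape) * x
    # Step 2: hidden probabilities and a binary sample h0 given v0.
    ph0 = sigm(b + v0 @ w)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Step 3: reconstruct the visible layer v1 from h0.
    v1 = sigm(a + h0 @ w.T)
    # Step 4: hidden probabilities h1 given the reconstruction.
    ph1 = sigm(b + v1 @ w)
    # Step 5: move parameters toward the input statistics, away from the reconstruction's.
    n = x.shape[0]
    w_new = w + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a_new = a + lr * (v0 - v1).mean(axis=0)
    b_new = b + lr * (ph0 - ph1).mean(axis=0)
    return w_new, a_new, b_new
```

The batch averages stand in for the two expectations in the update rule; repeating this step over the training data for N_epoch epochs would pre-train a single de-corrosion restricted Boltzmann machine.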

As can be seen from the training process of the de-corrosion restricted Boltzmann machine, the whole process does not use the label information of the data, which is very advantageous in scenarios that lack label information. It is both a generative model and an unsupervised model.

Stacking the de-corrosion restricted Boltzmann machines and customizing the input layer and the output layer yields the traffic flow prediction model based on the deep belief network, as shown in FIG. 2. The model of the invention has three main parts:

the bottom layer is the model data input layer, with input X_{1,t-1}, X_{1,t-2}, …, X_{1,t-d}; X_{2,t-1}, X_{2,t-2}, …; X_{m,t-1}, …, X_{m,t-d}, where X_{i,j} represents the traffic flow of the i-th vehicle detector at time j and m represents the total number of vehicle detectors; that is, the input of the model is the traffic flow information of all relevant vehicle detectors in the road network over the previous d time periods. This input fully reflects the spatial correlation and temporal regularity of the traffic flow: spatial correlation means that the traffic flow of upstream, downstream and adjacent road sections greatly influences the traffic flow of the road section being predicted, and temporal regularity means that there is obvious traffic flow trend information between consecutive time periods;

the middle layer consists of stacked de-corrosion restricted Boltzmann machines, which are also the most important basic building unit of the model; each is a generative stochastic neural network based on an energy function. When the traffic flow prediction model is pre-trained, a random corrosion layer is placed at the very front of each de-corrosion restricted Boltzmann machine: input data first enter the random corrosion layer, and the damaged output after corrosion is used as the visible layer. The traffic flow prediction model has no random corrosion layer during fine-tuning and testing;

the top layer is the model's logistic regression output layer, with output Y_1, Y_2, Y_3, …, Y_m, where Y_i represents the traffic flow of the i-th vehicle detector at the prediction time t.

The flow chart of the overall training process of the traffic flow prediction model based on the corrosion denoising deep belief network is shown in FIG. 3 and is briefly described as follows:

Step 1: set the corrosion probability and input the traffic flow data into the random corrosion layer of the de-corrosion restricted Boltzmann machine in the first layer of the model. The random corrosion layer corrodes the data with the preset corrosion probability, removing the interdependence among some of the neurons, and the damaged output is used as the visible layer of the de-corrosion restricted Boltzmann machine. The hidden layer feature representation is obtained through the energy-based generative model, and the reconstructed input is obtained from the hidden layer feature representation through the same model. The weights are then updated using the log-likelihood function so that the probability distribution represented by the restricted Boltzmann machine under those parameters fits the training data as closely as possible. The pre-trained hidden layer feature output is used as the input of the next de-corrosion restricted Boltzmann machine;

Step 2: fix the weight and bias parameters of the pre-trained de-corrosion restricted Boltzmann machine and pre-train the second one, and so on; each feature extraction output enters the random corrosion layer of the next de-corrosion restricted Boltzmann machine, and the damaged input is then used as the visible layer to continue training;

Step 3: after all de-corrosion restricted Boltzmann machines have been pre-trained, add a prediction regression layer on top of the network model for predicting the traffic flow;

Step 4: perform supervised fine-tuning of the whole network model with the back propagation algorithm, updating only the weight and bias parameters of the last network layer in the first few epochs and then updating the parameters of all layers.
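The greedy layer-by-layer pre-training loop of steps 1 and 2 can be sketched as follows. The names `pretrain_stack` and `train_rbm` are hypothetical: `train_rbm` is assumed to be any routine that pre-trains one de-corrosion restricted Boltzmann machine (including its own random corrosion layer) on the features it receives and returns the parameters (w, a, b), with `w` of shape (inputs, hidden units).

```python
from typing import Callable, List, Tuple

import numpy as np

Params = Tuple[np.ndarray, np.ndarray, np.ndarray]  # (w, a, b)


def pretrain_stack(
    x: np.ndarray,
    n_layer: int,
    train_rbm: Callable[[np.ndarray], Params],
) -> List[Params]:
    """Pre-train n_layer stacked machines one after another, freezing each when done."""
    layers: List[Params] = []
    features = x
    for _ in range(n_layer):
        # The trainer corrodes `features` internally before using them as its visible layer.
        w, a, b = train_rbm(features)
        layers.append((w, a, b))  # these parameters stay fixed from here on
        # The hidden-layer feature output becomes the input of the next machine.
        features = 1.0 / (1.0 + np.exp(-(b + features @ w)))
    return layers
```

The returned list can then initialize the hidden layers of the full network, on top of which the regression layer of step 3 is added before the supervised fine-tuning of step 4.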

The hyperparameters of the model are determined according to the MAPE evaluation criterion and the test results of the model performance: the number of stacked de-corrosion restricted Boltzmann machines N_layer, the number of hidden layer nodes of each de-corrosion restricted Boltzmann machine N_node, the number of pre-training epochs of each de-corrosion restricted Boltzmann machine N_epoch, the number of historical time periods d required to predict the traffic flow at the current time, and the corrosion probability Clevel. Once the specific architecture of the model has been determined, the model is trained further, and the weight matrices and bias vectors are adjusted to optimize the performance of the model. To reduce the search space, all de-corrosion restricted Boltzmann machines use the same number of hidden layer nodes N_node, the same number of pre-training epochs N_epoch, and the same corrosion probability Clevel.

The corrosion denoising deep belief network framework has the following advantages:

first, it is a probabilistic generative model;

second, it can use unlabeled data for unsupervised learning;

third, the network is pre-trained; compared with random initialization, this algorithm better optimizes the weights of the whole network and prevents the optimization from falling into a local optimum;

fourth, when the model is pre-trained, the very front of each de-corrosion restricted Boltzmann machine is a random corrosion layer, which removes the correlation among some of the neurons, improves the generalization capability of the model, learns more representative features, and reduces overfitting.

FIG. 4 shows the one-day prediction performance of the traffic flow prediction method based on the corrosion denoising deep belief network, and FIG. 5 shows its prediction performance over the five working days of one week. The predicted curve coincides closely with the actual traffic flow curve, and the prediction accuracy remains high even when the traffic flow fluctuates strongly.

Claims

1. A traffic flow prediction method based on a corrosion denoising deep belief network, characterized in that the traffic flow at the current time and at future times is predicted, based on a deep belief network, from the traffic flow at historical times; the historical traffic flow data are divided into a training set and a test set, the deep belief network model is trained with the training set, and its prediction performance is tested with the test set to obtain a trained traffic flow prediction model, which is used to predict the traffic flow at the current and future times; the bottom layer of the deep belief network is the data input layer of the prediction model, the top layer is the logistic regression output layer of the prediction model, and the middle layer is formed by stacked de-corrosion restricted Boltzmann machines; a de-corrosion restricted Boltzmann machine is formed by adding a random corrosion layer at the input end of a restricted Boltzmann machine, the damaged output of the random corrosion layer is used as the visible layer of the restricted Boltzmann machine, and the hidden layer is unchanged; the random corrosion layer is realized by setting a corrosion probability, which is a global hyperparameter; the smaller the corrosion probability, the more neurons are retained, and when the corrosion probability is 0, the random corrosion layer degenerates into an ordinary identity layer that simply copies its input to its output; the greater the corrosion probability, the more neurons are deactivated, the weaker the association between neurons, and the more difficult the feature learning; a reasonable corrosion probability value is determined experimentally.

2. The traffic flow prediction method based on the corrosion denoising deep belief network as claimed in claim 1, wherein when the traffic flow prediction model is established, a prediction model frame is obtained by combining the relevance of the traffic flow information of the upstream and downstream road sections and the adjacent road sections and the dependency of the traffic flow information at the current moment on the traffic flow information at the historical moment:

3. The method for predicting the traffic flow based on the corrosion denoising deep belief network as claimed in claim 1, wherein the training deep belief network model is specifically as follows:

4. The traffic flow prediction method based on the corrosion denoising deep belief network as claimed in claim 1, wherein when the test set is used to test the prediction performance of the model, the evaluation criterion is the mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right|$$

where Y_i is the actual traffic flow, \hat{Y}_i is the predicted traffic flow, and N is the number of test samples.

5. The traffic flow prediction method based on the corrosion denoising deep belief network as claimed in claim 1, wherein when the deep belief network model is trained, the hyperparameters to be tuned are: the number of stacked de-corrosion restricted Boltzmann machines N_layer, the number of hidden layer nodes of each de-corrosion restricted Boltzmann machine N_node, the number of pre-training epochs of each de-corrosion restricted Boltzmann machine N_epoch, the number of historical time periods d required to predict the traffic flow at the current time, and the corrosion probability Clevel; the hyperparameter settings are determined by grid search according to the MAPE error function, and to reduce the search space, all de-corrosion restricted Boltzmann machines use the same number of hidden layer nodes N_node, the same number of pre-training epochs N_epoch, and the same corrosion probability Clevel.