CN111524348A - Long-short term traffic flow prediction model and method - Google Patents

Long-short term traffic flow prediction model and method

Info

Publication number
CN111524348A
CN111524348A (application CN202010289673.9A)
Authority
CN
China
Prior art keywords
traffic flow
layer
model
long
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010289673.9A
Other languages
Chinese (zh)
Inventor
屈立成
吕娇
王海飞
屈艺华
张明皓
李翔
李昭璐
张壮壮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University
Priority to CN202010289673.9A
Publication of CN111524348A
Legal status: Pending

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G 1/00 Traffic control systems for road vehicles
    • G08G 1/01 Detecting movement of traffic to be counted or controlled
    • G08G 1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G 1/0125 Traffic data processing
    • G08G 1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F 16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F 2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application belongs to the technical field of traffic and relates in particular to a long-term and short-term traffic flow prediction model and method. Using forecast external-environment data that contain random errors to predict future traffic flow amplifies those errors, inevitably causing large impact and fluctuation in model prediction accuracy. The application provides a long-term and short-term traffic flow prediction model comprising a contextual factor input layer, a feature learning and pattern recognition layer, and a traffic flow data output layer. The multi-scale prediction model offers high accuracy, good prediction performance, short training time, and strong robustness, and is unaffected by missing historical data. Using contextual factors as the prediction input improves the accuracy of the traffic flow prediction model, which is vital for advanced traffic management and traveler route planning.

Description

Long-short term traffic flow prediction model and method
Technical Field
The application belongs to the technical field of traffic, and particularly relates to a long-term and short-term traffic flow prediction model and a method.
Background
Traffic flow refers to the stream of vehicles formed by cars driving continuously on a road; in a broad sense it also covers the flows of other vehicles and of people. Over a given period, on a road section unaffected by cross intersections, traffic flow is in a continuous-flow state; when controlled by signal lights at an intersection, it is in a discontinuous-flow state. Traffic flow prediction plays an important role in predicting road congestion and accidents, controlling traffic signals, and so on; it is especially crucial for fixed-time signal control strategies. Prediction accuracy is greatly affected by the traffic flow prediction method, the appropriateness of the historical data, and similar factors.
Many internal and external factors influence traffic flow, yet traditional traffic flow prediction considers only historical traffic flow data and the internal and external influence factors of those data. Some methods combine traffic flow data with external factors, but external-factor data are difficult to acquire, and in the model prediction stage the external factors are just as unknown as the traffic flow to be predicted, so only forecast values of the external factors can be used. Using forecast external-environment data that contain random errors to predict future traffic flow only amplifies those errors, inevitably causing large impact and fluctuation in the model prediction accuracy.
Disclosure of Invention
1. Technical problem to be solved
The internal and external factors influencing traffic flow are various, yet conventional traffic flow prediction considers only historical traffic flow data and the internal and external influence factors of those data. Some methods combine traffic flow data with external factors, but external-factor data are difficult to acquire, and in the model prediction stage the external factors are just as unknown as the traffic flow to be predicted, so only forecast values of the external factors can be used. The application provides a long-term and short-term traffic flow prediction model and method based on contextual factors and historical traffic flow data.
2. Technical scheme
In order to achieve the above purpose, the present application provides a long and short term traffic flow prediction model, which comprises a context factor input layer, a feature learning and pattern recognition layer and a traffic flow data output layer in sequence;
the context factor input layer is used for inputting the preprocessed context factors into the neural network;
the feature learning and pattern recognition layer is used for transforming the input contextual factors layer by layer and extracting implicit patterns and features;
and the traffic flow data output layer is used for aggregating and summarizing the modes and characteristics obtained by learning of the previous hidden layer, and obtaining corresponding traffic flow prediction data after nonlinear weighted transformation.
Another embodiment provided by the present application is: the contextual factor input layer takes future contextual factors as input.
Another embodiment provided by the present application is: the traffic flow prediction output by the traffic flow data output layer comprises long-time traffic flow prediction and short-time traffic flow prediction.
Another embodiment provided by the present application is: the predicted time of the traffic flow output by the traffic flow data output layer comprises 5 minutes, 10 minutes, 15 minutes, 20 minutes, 30 minutes or 1 hour.
The application also provides a long-term and short-term traffic flow prediction method, which comprises the following steps:
step 1): inputting historical traffic flow data and context factors into a feature learning and pattern recognition layer for training;
step 2): updating iteration and testing of the model are continuously carried out by utilizing a neural network back propagation algorithm;
step 3): generating a deep belief network model capable of expressing the characteristics of the traffic flow context factors;
step 4): loading a trained deep belief network model;
step 5): sending the prepared future context factors into a prediction model in sequence;
step 6): and outputting the predicted result through the traffic flow data output layer.
Another embodiment provided by the present application is: the contextual factors include year, month, day, week, holiday, hour, minute, and daily data time points.
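As an illustration of how these contextual factors might be assembled, the sketch below encodes a timestamp into a factor vector. The exact encoding, the holiday calendar, and the 5-minute time-point granularity are assumptions for illustration, not details given by the application.

```python
from datetime import datetime, date

def contextual_factors(ts: datetime, holidays=frozenset(), interval_min=5):
    """Encode a timestamp into the contextual-factor vector named above."""
    # index of the interval_min-long slot within the day (an assumption)
    time_point = (ts.hour * 60 + ts.minute) // interval_min
    return [
        ts.year,
        ts.month,
        ts.day,
        ts.isoweekday(),                    # day of week, 1 = Monday ... 7 = Sunday
        1 if ts.date() in holidays else 0,  # holiday indicator
        ts.hour,
        ts.minute,
        time_point,
    ]

# 08:25 on 2020-04-14 (a Tuesday) falls in 5-minute slot 101 of the day
z = contextual_factors(datetime(2020, 4, 14, 8, 25))
```

A vector like `z` would be one row of the input fed to the contextual factor input layer.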
Another embodiment provided by the present application is: the prediction method exploits from historical data the relationship between traffic flow data and contextual factors over a given time interval.
Another embodiment provided by the present application is: and mining the relation between the traffic flow data and the contextual factors by adopting a deep belief network model.
Another embodiment provided by the present application is: and mining the relation between the traffic flow data and the contextual factors by adopting a multi-layer supervised learning algorithm.
3. Advantageous effects
Compared with the prior art, the long-term and short-term traffic flow prediction model and the method have the advantages that:
the long-term and short-term traffic flow prediction model provided by the application improves the accuracy of the traffic flow prediction model. The multi-scale prediction model is high in accuracy, good in prediction effect, short in training time, strong in robustness and free from influence of historical data loss. Plays a crucial role for advanced traffic management and traveler routing.
The long-term and short-term traffic flow prediction model provided by the application researches training methods of different unit models, optimizes model structures and parameters, and reduces training time.
The long-short term traffic flow prediction method provided by the application is a long-short term traffic prediction algorithm based on context factors and historical traffic flow data.
The long-short term traffic flow prediction method provided by the application addresses the low accuracy of existing traffic flow prediction and proposes a long-short term traffic flow prediction algorithm based on a deep belief network.
The long-term and short-term traffic flow prediction method provided by the application researches the relation between the traffic flow and the context factor combination in a given time interval.
The long-term and short-term traffic flow prediction method provided by the application can resist the influence of the historical data missing problem on the prediction precision.
The long-term and short-term traffic flow prediction method provided by the application researches the structure and optimization method of the factor feature extraction network.
The long-short-term traffic flow prediction method provided by the application studies the implementation of a long-term, multi-time-scale traffic flow prediction model.
In the long-term and short-term traffic flow prediction method of the application, the model algorithm analyzes traffic flow data, extracts from it the contextual factors that influence traffic flow operation patterns, establishes a contextual-factor traffic flow prediction model, and mines the complex relationships between contextual factors and traffic flow patterns from historical data. After the model is trained, future contextual factors are fed into it, achieving accurate estimation and prediction of future traffic flow without depending on any external data and resisting the influence of missing historical data on prediction accuracy. Finally, the collected traffic flow data are fed into the model at given time intervals (e.g., 5 minutes, 10 minutes, 15 minutes, or 1 hour) to achieve long-term and short-term neural network traffic flow prediction.
Drawings
FIG. 1 is a schematic diagram of a long and short term traffic flow prediction model of the present application;
in the figure: 1-context factor input layer, 2-feature learning and pattern recognition layer and 3-traffic flow data output layer.
Detailed Description
Hereinafter, specific embodiments of the present application will be described in detail with reference to the accompanying drawings, and it will be apparent to those skilled in the art from this detailed description that the present application can be practiced. Features from different embodiments may be combined to yield new embodiments, or certain features may be substituted for certain embodiments to yield yet further preferred embodiments, without departing from the principles of the present application.
Referring to fig. 1, the present application provides a long and short term traffic flow prediction model comprising a contextual factor input layer 1, a feature learning and pattern recognition layer 2, and a traffic flow data output layer 3.
The whole model is divided into three parts, wherein the leftmost part is a context factor input layer which is responsible for inputting the context factors into a neural network after preprocessing; the middle part is a feature transformation and pattern recognition layer which is the most core part of the whole model and is also called a hidden layer, the input contextual factors are transformed layer by layer here, and the hidden pattern and features are extracted; the last part is a predictor, also called an output layer, which is a simple artificial neural network layer, and is an output part of the whole model, wherein the mode and the characteristics obtained by learning the hidden layer in front are gathered and summarized, and corresponding traffic flow prediction data is obtained after nonlinear weighted transformation.
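The layer-by-layer transformation just described can be sketched as a generic fully connected feedforward pass. The layer sizes, weights, and activation choices below are illustrative assumptions, not parameters from the application.

```python
import math

def layer_forward(x, W, b, activation):
    """One fully connected layer: activation(W x + b), one output per row of W."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def predict(z, layers):
    """Pass the contextual-factor vector z through every layer in turn."""
    out = z
    for W, b, act in layers:
        out = layer_forward(out, W, b, act)
    return out

layers = [
    ([[0.2, -0.1], [0.4, 0.3]], [0.0, 0.1], math.tanh),  # hidden layer (tanh)
    ([[1.0, 1.0]], [0.0], lambda v: v),                  # linear output layer
]
y_hat = predict([0.5, 1.0], layers)  # a single predicted flow value
```

The final list `y_hat` stands in for the output of the traffic flow data output layer after the hidden layers' features are aggregated.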
Further, the contextual factor input layer takes future contextual factors as its input. This realizes accurate estimation and prediction of future traffic flow without depending on any external data and resists the influence of missing historical data on prediction accuracy.
Further, the traffic flow prediction output by the traffic flow data output layer comprises a long-time traffic flow prediction and a short-time traffic flow prediction.
Further, the prediction time intervals of the traffic flow output by the traffic flow data output layer include 5 minutes, 10 minutes, 15 minutes, 20 minutes, 30 minutes and 1 hour; multi-scale prediction is employed, and prediction models of different scales can be selected according to the prediction requirements to perform long-term and short-term traffic flow prediction.
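A minimal sketch of the multi-scale aggregation implied here: per-minute counts are summed into the chosen interval, and each scale would feed its own prediction model. The data and helper below are illustrative assumptions.

```python
def aggregate(counts_per_min, interval_min):
    """Sum consecutive per-minute counts into interval_min-wide buckets."""
    return [sum(counts_per_min[i:i + interval_min])
            for i in range(0, len(counts_per_min), interval_min)]

minute_counts = [3, 5, 4, 6, 2, 7, 8, 5, 4, 6]   # ten minutes of vehicle counts
flow_5min = aggregate(minute_counts, 5)           # two 5-minute totals
flow_10min = aggregate(minute_counts, 10)         # one 10-minute total
```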
The application also provides a long-term and short-term traffic flow prediction method, which comprises the following steps:
(1) model training phase
Step 1): inputting historical traffic flow data and context factors into a feature learning and pattern recognition layer 2 for training;
step 2): updating iteration and testing of the model are continuously carried out by utilizing a neural network back propagation algorithm;
step 3): generating a deep belief network model capable of expressing the characteristics of the traffic flow context factors;
(2) model prediction phase
Step 4): loading the trained deep belief network model;
step 5): sending the prepared future context factors into a prediction model in sequence;
step 6): and outputting the predicted result through the traffic flow data output layer.
Further, the contextual factors include year, month, day, week, holiday, hour, minute, and daily data time points.
Further, the prediction method mines from historical data the relationship between traffic flow data and contextual factors over a given time interval.
Further, the relationship between the traffic flow data and the contextual factors is mined by adopting a deep belief network model.
Further, the relation between the traffic flow data and the contextual factors is mined by adopting a multi-layer supervised learning algorithm.
1. High accuracy
The long-term and short-term traffic flow prediction algorithm based on the deep belief network considers the influence of the context factors on the traffic flow values, fully researches the traffic flow data, and excavates the potential relation between the traffic flow values and the context factors by using the deep belief network model. The method provided by the application is superior to the traditional traffic flow prediction method in the aspect of prediction accuracy.
2. Unaffected by missing historical data
The long-term and short-term traffic flow prediction algorithm based on the deep belief network feeds future contextual factors into the model, realizing accurate estimation and prediction of future traffic flow without depending on any external data.
3. Multi-scale prediction
The long-term and short-term traffic flow prediction algorithm based on the deep belief network adopts multi-scale model prediction. Unlike conventional prediction models, which can train and predict with only one time interval, the trained model can select the time interval as required and perform multi-scale prediction of future traffic flow.
The method mainly considers the influence of contextual factors on the traffic flow value and uses a multi-layer supervised learning algorithm to mine the relationship between the traffic flow value and the contextual factors over a given time interval. For the first time, accurate estimation and prediction of future traffic flow are realized with a future-contextual-factor input model, without depending on any external data; combining the unique advantages of the deep belief network in nonlinear transformation, the method abstracts the contextual factor features at a higher level and realizes prediction output of future traffic flow.
On the basis of analyzing long-term trends, periodic characteristics and external factors of traffic flow data, the method mainly studies the association relationship between the traffic flow time sequence data and internal data attributes thereof, establishes a multi-time-scale long-term traffic flow prediction model by utilizing a time sequence analysis method and a nonlinear characteristic fitting model, and excavates a potential relationship between traffic flow and context factors from historical traffic data, thereby accurately and reliably predicting the future traffic operation condition and the trends.
Historical traffic flow data combined with contextual factors are used to test and evaluate the model over multiple data sources and multiple time scales, verifying the correctness and robustness of the proposed model and its suitability for long-term and short-term traffic flow prediction; meanwhile, multi-scale, multi-fold cross-validation results are compared with those of common, typical traffic flow prediction methods to verify the accuracy, generality, and advancement of the proposed prediction model.
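The multi-fold cross-validation mentioned above can be sketched as a plain k-fold index split; the fold logic below is a generic assumption, not the application's own procedure.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    fold = n // k
    for i in range(k):
        # last fold absorbs any remainder so every sample is tested exactly once
        test = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        test_set = set(test)
        train = [j for j in range(n) if j not in test_set]
        yield train, test

splits = list(k_fold_indices(10, 5))  # five folds over ten samples
```

Each `(train, test)` pair would train one instance of the prediction model and score it on the held-out fold; the k scores are then averaged.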
The application provides a deep belief network long-term and short-term traffic prediction method based on contextual factors and historical traffic flow data. The main idea is to mine the correspondence between traffic flow data and contextual factors, surpassing traditional traffic prediction methods in accuracy and improving the accuracy of traffic flow prediction. The general framework of the contextual-feature traffic flow prediction model that considers contextual factors is shown in FIG. 1.
Deep neural networks are often recommended for mining potential relationships in massive multi-dimensional historical data sets. The deep belief network, a type of deep neural network (DNN), is a deep learning architecture in which a computational model composed of multiple processing layers learns data representations with multiple levels of abstraction. To mine the complex nonlinear relationships between contextual factors and traffic flow patterns, a classical deep belief network is used as the tool for building the model.
The deep belief network is actually an extension of the single-layer artificial neural network in the number of network layers; the term "deep" refers to an artificial neural network with more than one layer. To realize the extraction and conversion of complex features, the deep belief network cascades multiple single-layer neural network layers with nonlinear processing functions. The nodes of each layer are trained on a different group of patterns based on the output of the previous layer; each subsequent neural network layer uses the transformed output of the previous layer as input, and the contextual factors undergo different nonlinear transformations and recombinations at each layer, propagating backwards layer by layer. As the deep belief network advances to deeper layers, the recognizable patterns and features become more complex, so hidden, deeper abstract relationships can be mined. As with the single-layer neural network, the deep belief network is a fully connected feedforward network, modeled as a simple mathematical function map

$$f: Z \rightarrow \hat{Y}$$

or as a distribution of $\hat{Y}$ over $Z$. Here, $Z$ is the set of contextual factors extracted from the traffic flow data, and $\hat{Y}$ is the corresponding traffic flow output.
assuming Z is a context factor vector, Z ∈ Z, network function of neurons f (Z) is defined as some transformation function gi(z) combinations, the functions in these combinations can be further decomposed into more functions to more conveniently represent complex network structures. Arrows in the formula describe the dependency relationship between functions, and can be conveniently expressed by using a combination function, and the combination function widely used in a neural network is a nonlinear weighted sum as shown in formula 1:
Figure BDA0002449913450000064
in the formula: wj is a weight; gj points to an element in a set of function vectors g ═ g1, g 2.., gn); bj represents the offset; σ represents a predetermined transformation function, also called activation function. An important feature of the activation function is that it provides a smooth transition when the input value changes. The activation function is herein tan h, which can generate a non-linear value and compress between-1 and 1.
The structure of the deep belief network prediction model is the three-part structure described above: the contextual factor input layer, the feature transformation and pattern recognition (hidden) layers, and the predictor (output) layer.
Each input node of the deep belief network prediction model corresponds to one contextual factor $z_i$ ($z_i \in z$) in the contextual factor vector, and the output node of the predictor corresponds to the traffic flow prediction $\hat{y}$. As with the previous mapping, the entire deep belief network model can be expressed as shown in formula 2:

$$\hat{y} = f(z; W, b) \tag{2}$$
wherein $z$ is the contextual factor vector; $W$ is the weight parameter; $b$ is the bias parameter. In this function the deep belief network model is regarded as a whole performing a nonlinear transformation, with $W$ and $b$ the weight and bias parameters of the entire model, which greatly simplifies the description and understanding of the model. Since the deep belief network itself is formed by stacking multiple artificial neural network layers, this function can also represent a single artificial neural network layer, or even a single neuron, if $W$ and $b$ are given different meanings and scopes. After the contextual factor vector is input into the network, the internal states of the neurons and layers change according to their weight and bias parameters, and feature outputs are generated by the activation functions. The deep belief network connects the output of each neuron to the inputs of other neurons, transmitting outputs layer by layer from front to back and forming a directed weighted graph. These weight parameters cannot be determined when the model is built; because their relationships are complex and numerous, they must be updated and corrected continuously through learning, a process also called model training. Since the weight parameters in the deep belief network model cannot be determined one by one at construction time, the best approach is to choose random initial values and then iterate continuously according to the error between the model output and the expected value, until that error is minimal or no longer changes. When the training error reaches a predetermined minimum (e.g. $10^{-5}$ or less), the iterative process is declared complete, the model parameters are determined, and training ends. Of course, the training error may stop decreasing without ever reaching the preset value; in that case the cause should be sought in both the model structure and the data set, and the model modified or the data set updated before retraining. In this respect, once the structure of a deep belief network is determined, its learning or training is simply the continuous reduction of the model error through an iterative algorithm, optimizing and determining the values of the weight parameters.
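The iterative training just described (random initialization, then updates driven by the output error until it falls below a preset minimum such as 1e-5) can be sketched with a one-parameter model standing in for the full network; the learning rate and data are illustrative assumptions.

```python
import random

def train(xs, ys, lr=0.05, tol=1e-5, max_iter=10000):
    """Fit y = w * x by gradient descent, stopping when the MSE drops below tol."""
    w = random.uniform(-1.0, 1.0)   # random initial weight
    err = float("inf")
    for _ in range(max_iter):
        err = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        if err < tol:               # preset minimum reached: training ends
            break
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad              # iterative parameter update
    return w, err

w, err = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # underlying relation: y = 2x
```

If `err` stops decreasing before reaching `tol`, the loop exhausts `max_iter`, mirroring the case above where the model or data set must be revised.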
(1) Loss function of model
In the case of supervised learning, after the contextual data are input into the deep belief network model, a predicted value is output from the output layer after a series of abstraction, transformation, and extraction steps in the hidden layers. Because the contextual factors exist in dependence on the traffic flow data, there is inevitably an error between the model's predicted value and the actual traffic flow data. This error, expressed as a function, is the loss function used in model training.
The loss function expresses the degree of inconsistency between the model's predicted value and the true value; it reflects how well the prediction model fits the real data, and the worse the fit, the larger the loss value, which makes it effective in the optimization process. The loss function is also differentiable over its range, with a reasonable gradient: the gradient is larger for higher loss values, significantly smaller for lower ones, and zero at zero loss. In nonlinear regression applications, the most commonly used loss function is the mean square error loss function:
$$E = \frac{1}{2}\sum_i \big(y_i - \hat{y}_i\big)^2 \tag{3}$$

In the formula: $y$ is the true value and $\hat{y}$ is the predicted value. With the loss function, the overall loss of each contextual-factor prediction model can be conveniently calculated; the learning of the model is thereby converted into an optimization problem whose target is the minimization of the loss function.
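A minimal sketch of a mean square error loss in code; whether the sum is averaged or halved is a convention choice, and the sample values are illustrative.

```python
def mse_loss(y_true, y_pred):
    """Mean square error between true and predicted flow values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

loss = mse_loss([10.0, 12.0], [11.0, 12.0])  # (1 + 0) / 2
```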
(2) Back propagation of response errors and parameter updates
Loss is caused by the propagation of errors, and the overall loss of the model may also be called the response error, or simply the error. The errors generated by the deep belief network model propagate from front to back to the output layer, and the total loss can be computed from the loss function; but these losses result from the errors of many neurons' adjustable parameters after many combined transformations, and it is not yet known which parameter's error produced them.
Back Propagation (BP) is a common method of training artificial neural networks used in conjunction with gradient descent methods. In each iteration process of the network model, the method sequentially calculates loss gradients of all weight parameters from back to front by performing forward calculation and backward propagation on response errors of the model, and updates the corresponding weight parameters by using the calculated gradients, thereby realizing parameter adjustment and optimization of the model.
The error back-propagation algorithm generally consists of three stages. The first phase is the excitation propagation phase. The stage is mainly forward propagation of excitation response of the neural network model, weighted sum and excitation response are calculated layer by layer according to the structure and initial parameters of the model, and response error of the model is calculated according to a loss function. The second phase is a gradient calculation phase. The decreasing gradient of each parameter is calculated back by taking the derivative of the loss function. And the third stage is a weight updating stage, and corresponding model parameters are updated according to the calculated gradient values.
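The three phases above can be sketched for a single tanh neuron feeding a linear output; the weights, target value, and learning rate are illustrative assumptions.

```python
import math

def backprop_step(z, y, w1, b1, w2, b2, lr=0.1):
    # phase 1: excitation propagation and response error
    o1 = math.tanh(w1 * z + b1)
    y_hat = w2 * o1 + b2
    err = 0.5 * (y_hat - y) ** 2
    # phase 2: gradient calculation by the chain rule
    d_yhat = y_hat - y                    # dE/d(y_hat) for the 1/2 squared loss
    g_w2, g_b2 = d_yhat * o1, d_yhat
    d_net1 = d_yhat * w2 * (1 - o1 ** 2)  # tanh'(net) = 1 - tanh(net)^2
    g_w1, g_b1 = d_net1 * z, d_net1
    # phase 3: weight update by gradient descent
    return w1 - lr * g_w1, b1 - lr * g_b1, w2 - lr * g_w2, b2 - lr * g_b2, err

params = (0.5, 0.0, 0.5, 0.0)
err = None
for _ in range(200):
    *params, err = backprop_step(1.0, 0.8, *params)  # err shrinks each iteration
```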
In the gradient calculation stage, the model parameters are treated as the independent variables of the loss function when computing its gradient, since it is changes in the parameters that produce the error. Let $o_i$ be the excitation response of the $i$-th neuron in a given layer of the network; from formula 4 it follows that

$o_i = \varphi(\mathrm{net}_i) = \varphi\Big(\sum_{j} w_{ji}\, o_j + b_i\Big)$

In the formula: $\varphi$ is the activation function; $\mathrm{net}_i$ is the weighted sum of neuron $i$; $b_i$ is the bias parameter of neuron $i$; $w_{ji}$ is the weight parameter connecting neuron $i$ to neuron $j$ in the preceding layer. Here $i$ indexes the neurons of the current layer and $j$ indexes those of the preceding layer, so $o_j$ denotes the excitation response of the preceding-layer neuron $j$; at the first layer of the overall model $o_j = z_j$, and at the last layer $o_i = \hat{y}_i$.
Applying the chain rule (Chain Rule), the derivative is computed forward layer by layer; for the weight parameter $w_{ji}$ the derivation formula is:

$\dfrac{\partial E}{\partial w_{ji}} = \dfrac{\partial E}{\partial o_i}\,\dfrac{\partial o_i}{\partial \mathrm{net}_i}\,\dfrac{\partial \mathrm{net}_i}{\partial w_{ji}}$

In the formula: $E$ is the response error of neuron $i$; the other symbols are defined as in the formula above. Wherein,

$\dfrac{\partial \mathrm{net}_i}{\partial w_{ji}} = o_j$

$\dfrac{\partial o_i}{\partial \mathrm{net}_i} = \varphi'(\mathrm{net}_i)$
In the formula: $\varphi'$ is the derivative of the activation function. For the output layer, the response error is the overall loss of the model and can be computed directly from the loss function, so the derivative of the output-layer response error with respect to the excitation response can be computed directly as:

$\dfrac{\partial E}{\partial o_i} = \dfrac{\partial E(o_i, y_i)}{\partial o_i}$
However, neurons in the hidden layers cannot obtain their response error directly; it must be computed by propagating the response error of the output layer backward, layer by layer. Let $L$ be the set of (indices of) all neurons that receive input from neuron $i$. The response error of the current neuron is back-propagated through these connected neurons and is numerically equal to the sum of their contributions; taking the total differential with respect to the excitation response $o_i$, the derivative of the response error of neuron $i$ with respect to its excitation response can then be written as:

$\dfrac{\partial E}{\partial o_i} = \sum_{l \in L} \dfrac{\partial E}{\partial \mathrm{net}_l}\,\dfrac{\partial \mathrm{net}_l}{\partial o_i} = \sum_{l \in L} \dfrac{\partial E}{\partial o_l}\,\varphi'(\mathrm{net}_l)\, w_{il}$
Comparing the two sides of this formula reveals a clear recursive relationship: once the derivatives with respect to the response excitations of all neurons in the next layer are known, the derivative with respect to the response excitation of a hidden-layer neuron can be computed. Substituting into the chain-rule expression above gives the gradient formula corresponding to the weight parameter $W$:

$\dfrac{\partial E}{\partial w_{ji}} = \delta_i\, o_j$

wherein:

$\delta_i = \dfrac{\partial E}{\partial o_i}\,\varphi'(\mathrm{net}_i) = \begin{cases} \dfrac{\partial E(o_i, y_i)}{\partial o_i}\,\varphi'(\mathrm{net}_i) & \text{for an output-layer neuron } i \\ \Big(\sum_{l \in L} w_{il}\,\delta_l\Big)\,\varphi'(\mathrm{net}_i) & \text{for a hidden-layer neuron } i \end{cases}$
The gradient of every neuron's weight parameter can now be calculated, and the weight values are updated by the gradient descent method. The update formula for the weight parameter $W$ is:

$w_{ji}(t+1) = w_{ji}(t) + \Delta w_{ji} + \xi(t)$

$\Delta w_{ji} = -\eta\,\dfrac{\partial E}{\partial w_{ji}}$

In the formula: $t$ is the number of iterations; $\eta$ is the learning rate, usually a value less than 1; $\xi(t)$ is a random variable.
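The update rule with its random perturbation ξ(t) can be written directly. The Gaussian distribution for ξ(t) and the numeric values of the learning rate and noise scale are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def update_weight(w, grad, eta=0.01, noise_scale=1e-4):
    """w(t+1) = w(t) + delta_w + xi(t), with delta_w = -eta * dE/dw."""
    delta_w = -eta * grad
    xi = rng.normal(0.0, noise_scale, size=np.shape(w))  # random variable xi(t)
    return w + delta_w + xi

w = np.array([0.5, -0.2])
grad = np.array([1.0, -2.0])
w_next = update_weight(w, grad)
```

With a small noise scale, the step is dominated by the gradient term while the perturbation helps the model escape shallow local minima.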
The model has one further parameter to learn, the bias parameter $b$. As with the weight parameter $W$, the derivative is computed forward layer by layer with the chain rule; the derivation formula for the bias parameter $b_i$ is:

$\dfrac{\partial E}{\partial b_i} = \dfrac{\partial E}{\partial o_i}\,\dfrac{\partial o_i}{\partial \mathrm{net}_i}\,\dfrac{\partial \mathrm{net}_i}{\partial b_i}$

In the formula:

$\dfrac{\partial \mathrm{net}_i}{\partial b_i} = 1$

$\dfrac{\partial o_i}{\partial \mathrm{net}_i} = \varphi'(\mathrm{net}_i)$
Following the earlier derivation procedure (and equation 10) and taking the total differential with respect to the excitation response $o_i$, the derivative of the response error of neuron $i$ with respect to its excitation response can again be written as:

$\dfrac{\partial E}{\partial o_i} = \sum_{l \in L} \dfrac{\partial E}{\partial o_l}\,\varphi'(\mathrm{net}_l)\, w_{il}$
Substituting into the expression above yields the corresponding gradient formula for the model bias $b$:

$\dfrac{\partial E}{\partial b_i} = \delta_i$

wherein:

$\delta_i = \dfrac{\partial E}{\partial o_i}\,\varphi'(\mathrm{net}_i)$
The gradient of every neuron's bias parameter can now be calculated, and the bias values are updated by the gradient descent method with the following formulas:

$b_i(t+1) = b_i(t) + \Delta b_i + \xi(t)$

$\Delta b_i = -\eta\,\dfrac{\partial E}{\partial b_i}$

In the formula: $t$ is the number of iterations; $\eta$ is the learning rate; $\xi(t)$ is a random variable.
Since different activation functions may be chosen in practice, the derivative of the activation function $\varphi$ is written as $\varphi'$ in the formulas above and must be worked out for the specific function actually used. For example, the hyperbolic tangent function used in this model has the derivative $\tanh'(x) = 1 - \tanh^2(x)$.
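The tanh derivative identity can be verified numerically with a central difference; this small check is illustrative only:

```python
import math

# Derivative identity tanh'(x) = 1 - tanh(x)**2, checked by central difference.
def tanh_prime(x):
    return 1.0 - math.tanh(x) ** 2

eps = 1e-6
for x in (-1.5, 0.0, 0.8):
    numerical = (math.tanh(x + eps) - math.tanh(x - eps)) / (2 * eps)
    assert abs(tanh_prime(x) - numerical) < 1e-6
```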
The gradient descent method is an optimization algorithm for finding the optimal model parameters: the response error of the training set on the model is computed, and the model error is driven downward along the direction of steepest descent until the error stops decreasing or the parameters that minimize the error on the training set are found. Mini-batch gradient descent (Mini-Batch Gradient Descent) is a gradient descent algorithm between online and offline learning. It combines the advantages of the batch gradient descent and stochastic gradient descent algorithms: the training data set is divided into smaller batches, an offline learning method is applied within each batch, and macroscopically the procedure behaves like online learning.
With mini-batch gradient descent the model is updated frequently, which speeds up learning, reduces the noise interference of individual samples, lowers the variance of the gradient, helps the model avoid local minima, and strengthens convergence. The batch-wise mini-batch gradient descent algorithm strikes a balance between the robustness of stochastic gradient descent and the efficiency of batch gradient descent, gives a more efficient computation, makes better use of memory, and is the most frequently used gradient descent method in the field of deep learning.
Using the mini-batch stochastic gradient descent algorithm introduces a hyperparameter, batch_size, into the model training process; it can be assigned from experience or tuned during training. A smaller value gives a process similar to online learning, which converges rapidly at the cost of introducing noise into training; a larger value gives a process similar to offline learning, which estimates the error gradient accurately but converges slowly. Equation (3) can then be written as the following matrix expression:
$\hat{Y} = \varphi\big(Z\,W + B\big), \qquad Z \in \mathbb{R}^{n \times m},\ \hat{Y} \in \mathbb{R}^{n \times o}$

In the formula: $n$ is the number of samples in each training batch (256 is chosen here as the batch-size parameter, based on experience); $m$ is the dimension of the context factors; $o$ is the dimension of the predicted values. Although a somewhat larger batch size raises memory occupancy, feeding the data to the model in batches exploits the parallel processing capability of the computer, increases the parallelism of the computation and the throughput of the machine, shortens the training time, and at the same time preserves good generalization performance.
Although the present application has been described above with reference to specific embodiments, those skilled in the art will recognize that many changes may be made in the configuration and details of the present application within the principles and scope of the present application. The scope of protection of the application is determined by the appended claims, and all changes that come within the meaning and range of equivalency of the technical features are intended to be embraced therein.

Claims (9)

1. A long-short term traffic flow prediction model is characterized in that: the model sequentially comprises a context factor input layer, a feature learning and pattern recognition layer and a traffic flow data output layer;
the context factor input layer is used for inputting the preprocessed context factors into the neural network;
the characteristic learning and pattern recognition layer is used for transforming the input context factors layer by layer and extracting implicit patterns and characteristics;
and the traffic flow data output layer is used for aggregating and summarizing the modes and characteristics obtained by learning of the previous hidden layer, and obtaining corresponding traffic flow prediction data after nonlinear weighted transformation.
2. The long and short term traffic flow prediction model according to claim 1, characterized in that: the contextual factor input layer takes future contextual factors as input.
3. The long and short term traffic flow prediction model according to claim 1, characterized in that: the traffic flow prediction output by the traffic flow data output layer comprises long-time traffic flow prediction and short-time traffic flow prediction.
4. The long and short term traffic flow prediction model according to claim 3, characterized in that: the prediction horizon of the traffic flow output by the traffic flow data output layer comprises 5 minutes, 10 minutes, 15 minutes, 20 minutes, 30 minutes or 1 hour.
5. A long-term and short-term traffic flow prediction method is characterized in that: the method comprises the following steps:
step 1): inputting historical traffic flow data and context factors into a feature learning and pattern recognition layer for training;
step 2): updating iteration and testing of the model are continuously carried out by utilizing a neural network back propagation algorithm;
step 3): generating a deep belief network model capable of expressing the characteristics of the traffic flow context factors;
step 4): loading a trained deep belief network model;
step 5): sending the prepared context factors into a prediction model in sequence;
step 6): and outputting the predicted result through the traffic flow data output layer.
6. The long-and-short-term traffic flow prediction method according to claim 5, characterized in that: the contextual factors include year, month, day, week, holiday, hour, minute, and daily data time points.
7. The long-and-short-term traffic flow prediction method according to claim 5, characterized in that: the prediction method mines, from historical data, the relationship between traffic flow data and contextual factors over a given time interval.
8. The long-and-short-term traffic flow prediction method according to claim 7, characterized in that: and mining the relation between the traffic flow data and the contextual factors by adopting a deep belief network model.
9. The long-and-short-term traffic flow prediction method according to claim 7, characterized in that: and mining the relation between the traffic flow data and the contextual factors by adopting a multi-layer supervised learning algorithm.
CN202010289673.9A 2020-04-14 2020-04-14 Long-short term traffic flow prediction model and method Pending CN111524348A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010289673.9A CN111524348A (en) 2020-04-14 2020-04-14 Long-short term traffic flow prediction model and method


Publications (1)

Publication Number Publication Date
CN111524348A true CN111524348A (en) 2020-08-11

Family

ID=71902178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010289673.9A Pending CN111524348A (en) 2020-04-14 2020-04-14 Long-short term traffic flow prediction model and method

Country Status (1)

Country Link
CN (1) CN111524348A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762578A (en) * 2020-12-28 2021-12-07 京东城市(北京)数字科技有限公司 Training method and device of flow prediction model and electronic equipment
CN114023074A (en) * 2022-01-10 2022-02-08 佛山市达衍数据科技有限公司 Traffic jam prediction method, device and medium based on multiple signal sources

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7948400B2 (en) * 2007-06-29 2011-05-24 Microsoft Corporation Predictive models of road reliability for traffic sensor configuration and routing
JP4790864B2 (en) * 2007-06-28 2011-10-12 マイクロソフト コーポレーション Learning and reasoning about the situation-dependent reliability of sensors
CN105761488A (en) * 2016-03-30 2016-07-13 湖南大学 Real-time limit learning machine short-time traffic flow prediction method based on fusion
CN106935034A (en) * 2017-05-08 2017-07-07 西安电子科技大学 Towards the regional traffic flow forecasting system and method for car networking
CN107103754A (en) * 2017-05-10 2017-08-29 华南师范大学 A kind of road traffic condition Forecasting Methodology and system
CN107730887A (en) * 2017-10-17 2018-02-23 海信集团有限公司 Realize method and device, the readable storage medium storing program for executing of traffic flow forecasting
CN108960496A (en) * 2018-06-26 2018-12-07 浙江工业大学 A kind of deep learning traffic flow forecasting method based on improvement learning rate
CN109242140A (en) * 2018-07-24 2019-01-18 浙江工业大学 A kind of traffic flow forecasting method based on LSTM_Attention network
CN109460855A (en) * 2018-09-29 2019-03-12 中山大学 A kind of throughput of crowded groups prediction model and method based on focus mechanism
CN110223510A (en) * 2019-04-24 2019-09-10 长安大学 A kind of multifactor short-term vehicle flowrate prediction technique based on neural network LSTM


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LICHENG QU et al.: "Daily long-term traffic flow forecasting based on a deep neural network", Expert Systems with Applications *
YOUSEF-AWWAD DARAGHMI et al.: "Negative Binomial Additive Models for Short-Term Traffic Flow Forecasting in Urban Areas", IEEE Transactions on Intelligent Transportation Systems *
赵庶旭 et al.: "Application of an improved deep belief network in traffic flow prediction", Application Research of Computers *


Similar Documents

Publication Publication Date Title
CN110245801A (en) A kind of Methods of electric load forecasting and system based on combination mining model
CN111027772A (en) Multi-factor short-term load prediction method based on PCA-DBILSTM
CN114547974A (en) Dynamic soft measurement modeling method based on input variable selection and LSTM neural network
Barzola-Monteses et al. Energy consumption of a building by using long short-term memory network: a forecasting study
CN113219871B (en) Curing room environmental parameter detecting system
CN112766603A (en) Traffic flow prediction method, system, computer device and storage medium
CN111524348A (en) Long-short term traffic flow prediction model and method
CN104732067A (en) Industrial process modeling forecasting method oriented at flow object
CN114819395A (en) Industry medium and long term load prediction method based on long and short term memory neural network and support vector regression combination model
Lei et al. A novel time-delay neural grey model and its applications
Smith et al. Multi-objective evolutionary recurrent neural network ensemble for prediction of computational fluid dynamic simulations
CN114202063A (en) Fuzzy neural network greenhouse temperature prediction method based on genetic algorithm optimization
Reddy et al. An optimal neural network model for software effort estimation
Liu et al. Network traffic big data prediction model based on combinatorial learning
Wang et al. Short term load forecasting: A dynamic neural network based genetic algorithm optimization
Doudkin et al. Spacecraft Telemetry Time Series Forecasting With Ensembles of Neural Networks
He et al. Application of neural network model based on combination of fuzzy classification and input selection in short term load forecasting
Zain Al-Thalabi et al. Modeling and prediction using an artificial neural network to study the impact of foreign direct investment on the growth rate/a case study of the State of Qatar
Zhu et al. From numeric to granular models: A quest for error and performance analysis
Shamsuddin et al. Artificial neural network time series modeling for revenue forecasting
Kumar et al. Comparison of HDNN with other Machine Learning Models in Stock Market Prediction
CN113887570B (en) Solar flare two-classification prediction method based on neural network
CN118154232B (en) Tourism supply chain management system based on data mining technology
Yang Analysis and Prediction of Economic Industrial Structure by BP Neural Network Based on Genetic Algorithm
CN112183846B (en) TVF-EMD-MCQRNN load probability prediction method based on fuzzy C-means clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200811