CN105701571A

CN105701571A - Short-term traffic flow prediction method based on neural network combination model

Info

Publication number: CN105701571A
Application number: CN201610020549.6A
Authority: CN
Inventors: 陈志�; 林海涛; 岳文静; 黄诚博; 卜杰; 王宇虹; 刘亚威
Original assignee: Nanjing Post and Telecommunication University
Current assignee: Nanjing Post and Telecommunication University
Priority date: 2016-01-13
Filing date: 2016-01-13
Publication date: 2016-06-22

Abstract

The invention provides a short-term traffic flow prediction method based on a neural network combination model. It constructs a backpropagation (BP) neural network combination prediction model and proposes a short-term traffic flow prediction method built on that model. Based on the characteristics of traffic flow, the method first clusters the traffic flow with the fuzzy C-means clustering algorithm, builds a backpropagation neural network prediction model for each resulting cluster, and takes the membership-weighted sum of the individual models' predictions as the final prediction. To improve prediction accuracy, the Taguchi method is used to design experiments that test how different structural parameters affect the accuracy of the prediction model, and the best structural parameters are used as the model's initial structure. The method improves the prediction accuracy of short-term traffic flow, reduces the influence of noise in the training data on prediction accuracy, and runs in reasonable time.

Description

A Short-term Traffic Flow Forecasting Method Based on Neural Network Combination Model

Technical field

The invention relates to a short-term traffic flow prediction method that uses a neural network combination model to improve the prediction accuracy of short-term traffic flow and to reduce the influence of noise in the training data on that accuracy. It belongs to the interdisciplinary application field of traffic flow and neural networks.

Background

An intelligent transportation system is a transportation management system that operates in real time, accurately and efficiently, and integrates advanced information technology, data communication technology, electronic control technology and computer processing technology. Its core technologies, traffic control and traffic guidance, are the most effective means of relieving urban congestion and improving road network throughput, and have been a research focus in recent years. Both rely on real-time, accurate short-term traffic flow prediction.

Traffic flow prediction is divided by forecast horizon into long-term and short-term prediction. Long-term prediction covers months or even years and is mainly used for long-term planning of transportation systems. Short-term prediction focuses on changes in traffic flow over a short horizon, typically the flow in the next 5 minutes. Short-term traffic flow is highly nonlinear, time-varying and uncertain. The uncertainty may come from environmental factors such as road conditions and weather changes, or from unexpected events such as traffic accidents and large gatherings, which makes accurate short-term prediction difficult.

Researchers have proposed many models for short-term traffic flow prediction. Early models, such as exponential smoothing, time series and historical average models, were based mainly on calculus and mathematical statistics; they require large amounts of historical data and offer limited accuracy. Later, more modern techniques were introduced and produced models with better prediction performance, such as Kalman filter models, non-parametric regression models, grey system prediction models and neural network models.

An artificial neural network is a highly nonlinear dynamical system with strong nonlinear fitting ability, which makes it well suited to traffic flow prediction. The backpropagation neural network (BP neural network) is one of the most widely used models in traffic flow prediction; a backpropagation neural network with a single hidden layer can approximate any continuous function to arbitrary precision. However, it also has drawbacks: slow convergence, a tendency to become trapped in local minima, and a prediction accuracy that depends on the network structure. Its prediction accuracy therefore still has room for improvement. To improve it further, researchers combine neural networks with other intelligent or statistical methods to build combined prediction models, which often achieve better accuracy than a single model.

Clustering partitions a set of data objects into subsets, each called a cluster, so that objects within a cluster are similar to one another and dissimilar to objects in other clusters. Clustering methods fall roughly into four categories: partitioning methods, hierarchical methods, density-based methods and grid-based methods. K-means is a typical partitioning method: the number of clusters must be specified in advance, and the clusters are mutually exclusive. However, an object need not belong to exactly one cluster; it may belong to several at once. Applying fuzzy set concepts to clustering gives fuzzy clustering. Fuzzy C-means is a typical fuzzy clustering method in which each object belongs to each cluster with a certain degree of membership. It is also a partitioning method and requires the number of clusters to be specified.

The Taguchi method is a low-cost, high-benefit quality engineering method which holds that product quality is improved through design rather than inspection. During the initial development and design stage, design parameters are chosen around the target value and variation is minimized through experiments, so that quality is built into the product, all units produced have the same stable quality, and cost and loss are greatly reduced. In quality control, the Taguchi method has been applied successfully to the low-cost design of reliable, high-quality products such as automobiles and consumer electronics.

Summary of the invention

Technical problem: the purpose of the invention is to provide a short-term traffic flow prediction method based on a neural network combination model that addresses the low prediction accuracy of short-term traffic flow, the strong influence of noise in the training data on prediction accuracy, and unreasonable running time.

Technical solution: the invention constructs a backpropagation neural network combination prediction model and, on that basis, proposes the short-term traffic flow prediction algorithm TFBCM. Based on the characteristics of traffic flow, the algorithm first clusters the traffic flow with the fuzzy C-means clustering algorithm, builds a backpropagation neural network prediction model for each resulting cluster, and takes the membership-weighted sum of the models' predictions as the final prediction. To improve accuracy further, the Taguchi method is used to design experiments that test how different structural parameters affect prediction accuracy, and the best structural parameters are used as the initial structure of the prediction model.

In the short-term traffic flow prediction method based on a neural network combination model according to the invention, a neural network combination prediction model is constructed: using the characteristics of traffic flow, the fuzzy C-means clustering algorithm divides the traffic flow into a small number of patterns, and a neural network prediction model is built for each flow pattern.

The short-term traffic flow prediction method based on the neural network combination model comprises the following steps:

Step 1) Set the structural parameters for short-term traffic flow prediction, as follows:

Step 1.1) Take the number of clusters as factor A, so that factor A represents the classification of the traffic flow, and set its levels: the lowest level is 3, the highest is 5 and the middle level is 4. A level is the amount or state value that a factor takes.

Step 1.2) Take the number of input nodes as factor B and set its three levels to 1, 3 and 5, which represent few, a moderate number of, and many input nodes respectively.

Step 1.3) Take the number of hidden-layer nodes as factor C, determined by the empirical formula $l = \sqrt{n + m} + \delta$, where l is the number of hidden-layer nodes, n is factor B (the number of input nodes), m is the number of output nodes, and δ is a constant between 1 and 10. The three levels of factor C are set to 2, 3 and 7.

Step 1.4) Take the learning rate as factor D. Factor D takes values between 0.01 and 0.1, and its three levels are set to 0.02, 0.04 and 0.07.

Step 1.5) Run experiments designed with the Taguchi method to find, for factors A, B, C and D, the level with the highest effect value, and use these levels as the initial values of the structural parameters. The effect value of each design factor is evaluated with the smaller-the-better characteristic η, computed from the mean relative error over the k runs of each trial group. Here $e_{MRE} = \frac{1}{N} \sum_{T=1}^{N} \frac{|\hat{\theta}(T) - \theta(T)|}{\theta(T)}$ is the mean relative error, $\hat{\theta}(T)$ is the predicted traffic flow, θ(T) is the actual traffic flow supplied by the user, T is a prediction time point, N is the total number of prediction time points, and p and q are index variables. The Taguchi method is a quality engineering method that selects design parameters around a set target value and minimizes variation through experiments, so that all products have the same stable quality.
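The error metric and effect value in step 1.5) can be sketched as follows. The smaller-the-better ratio is written in its common Taguchi form, η = −10·log10((1/k)·Σ e²), which is an assumption here rather than a formula taken from the text; the function names and sample numbers are illustrative only.

```python
import math


def mean_relative_error(predicted, actual):
    """Mean relative error over N prediction time points."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def smaller_the_better(errors):
    """Assumed Taguchi smaller-the-better ratio over the k runs of one trial group."""
    k = len(errors)
    return -10.0 * math.log10(sum(e * e for e in errors) / k)


# Illustrative values, not data from the source.
actual = [100.0, 120.0, 80.0]
predicted = [110.0, 114.0, 84.0]
e_mre = mean_relative_error(predicted, actual)
eta = smaller_the_better([e_mre, e_mre])  # two identical runs of one trial group
print(e_mre, eta)
```

A lower error gives a higher η, which is why the level with the highest effect value is the one selected.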

Step 2) Cluster the traffic flow. The user inputs the data set X = {x₁, x₂, ..., x_n} of data objects, the threshold ε and the maximum number of iterations maxt. Let the cluster centers be V = {v₁, v₂, ..., v_c}, and let u_ij denote the degree of membership of the j-th data object x_j in the i-th class v_i; all u_ij form the membership matrix U. The objective function J is the membership-weighted sum of squared distances, $J = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}$, where n is the total number of data objects, c is the number of clusters, m is a weighting parameter not less than 1, d_ij is the distance between object x_j and cluster center v_i, and i and j are index variables. Let t be the iteration counter. Step 2) proceeds as follows:

Step 2.1) Set c to the level of factor A obtained in step 1). The user inputs m. Assign each u_ij a random number between 0 and 1 to form the initial membership matrix U, such that $\sum_{i=1}^{c} u_{ij} = 1$ for every j, and set the iteration counter t to 0.

Step 2.2) Compute the current value of each cluster center in V using $v_{i}^{(t)} = \frac{\sum_{j=1}^{n} (u_{ij}^{(t)})^{m} x_{j}}{\sum_{j=1}^{n} (u_{ij}^{(t)})^{m}}$, where $u_{ij}^{(t)}$ is the value of u_ij at iteration t and $v_{i}^{(t)}$ is the value of v_i at iteration t.

Step 2.3) Compute J^t using $J^{t} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij}^{(t)})^{m} d_{ij}^{2}$, where J^t is the value of J at iteration t and d_ij is the Euclidean distance between data object x_j and class v_i, that is, the straight-line distance between two points in space.

Step 2.4) Correct the membership matrix U according to the distance from each object to the cluster centers, using $u_{ij}^{(t+1)} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{\frac{2}{m-1}} \right]^{-1}$, where $u_{ij}^{(t+1)}$ is the value of u_ij after iteration t and d_kj is the Euclidean distance between data object x_j and class v_k.

Step 2.5) Compute J^{t+1} using $J^{t+1} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij}^{(t+1)})^{m} d_{ij}^{2}$, where J^{t+1} is the value of J after iteration t.

Step 2.6) Check whether |J^{t+1} − J^t| ≤ ε or t equals maxt. If so, go to step 3); otherwise set t = t + 1 and return to step 2.2).
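Steps 2.1) to 2.6) can be sketched in plain Python as follows. The update rules are the standard fuzzy C-means ones, assumed to match the formulas the steps refer to. The sketch handles one-dimensional samples only and assumes m > 1, because the membership update divides by m − 1; the function name, the `seed` parameter and the sample data are illustrative.

```python
import random


def fuzzy_c_means(data, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Cluster 1-D samples. Returns (centers, U), U[i][j] = membership of x_j in cluster i."""
    rng = random.Random(seed)
    n = len(data)
    # Step 2.1: random memberships, normalised so each column sums to 1.
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        col = sum(U[i][j] for i in range(c))
        for i in range(c):
            U[i][j] /= col

    def objective(centers):
        return sum(U[i][j] ** m * (data[j] - centers[i]) ** 2
                   for i in range(c) for j in range(n))

    for _ in range(max_iter):
        # Step 2.2: cluster centers as membership-weighted means.
        centers = [sum(U[i][j] ** m * data[j] for j in range(n)) /
                   sum(U[i][j] ** m for j in range(n)) for i in range(c)]
        j_old = objective(centers)  # Step 2.3
        # Step 2.4: update memberships from the distances to the centers.
        for j in range(n):
            d = [abs(data[j] - centers[i]) for i in range(c)]
            if min(d) == 0.0:
                # The sample coincides with a center: full membership there.
                hits = [i for i in range(c) if d[i] == 0.0]
                for i in range(c):
                    U[i][j] = 1.0 / len(hits) if i in hits else 0.0
            else:
                for i in range(c):
                    U[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                        for k in range(c))
        j_new = objective(centers)  # Step 2.5
        if abs(j_new - j_old) <= eps:  # Step 2.6
            break
    return centers, U


# Illustrative flows forming two obvious groups (not data from the source).
centers, U = fuzzy_c_means([1.0, 2.0, 3.0, 101.0, 102.0, 103.0], c=2)
print(sorted(centers))
```

Each column of U sums to 1, so a sample that lies between two groups receives partial membership in both rather than a hard assignment.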

Step 3) Train the backpropagation neural networks. A backpropagation neural network, also called a BP neural network, is a multilayer feedforward neural network trained with the error backpropagation algorithm. Step 3) proceeds as follows:

Step 3.1) Use the level of factor B obtained in step 1) as the number of input-layer nodes n, the level of factor C as the number of hidden-layer nodes l, and the level of factor D as the learning rate λ; use m as the number of output-layer nodes. The user sets initial values for the connection weights ω_rs between input-layer and hidden-layer neurons and ω_su between hidden-layer and output-layer neurons, inputs the initial hidden-layer threshold a₀ and output-layer threshold b₀, resets the maximum number of iterations maxt, and sets the iteration counter t to 0.

Step 3.2) From the user-supplied data set X, compute the hidden-layer output H_s with the hidden-layer output formula, s = 1, 2, ..., l.

Step 3.3) Compute the predicted output O_u of the backpropagation neural network with the output-layer formula, u = 1, 2, ..., m.

Step 3.4) The user inputs the expected output Y_u; compute the prediction error e_u = Y_u − O_u, u = 1, 2, ..., m.

Step 3.5) Update the network connection weights ω_rs and ω_su from the prediction error e_u using $\omega_{rs} = \omega_{rs} + \lambda H_{s} (1 - H_{s}) x_{r} \sum_{u=1}^{m} \omega_{su} e_{u}$ and $\omega_{su} = \omega_{su} + \lambda H_{s} e_{u}$, where r = 1, 2, ..., n; s = 1, 2, ..., l; u = 1, 2, ..., m.

Step 3.6) Update the network node thresholds a_s and b_u from the prediction error e_u, using the hidden-layer threshold update for a_s and b_u = b_u + e_u for the output layer, where s = 1, 2, ..., l and u = 1, 2, ..., m.

Step 3.7) Check whether the iteration counter t has reached the maximum number of iterations maxt. If so, go to step 4); otherwise set t = t + 1 and return to step 3.2).
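The training loop in steps 3.1) to 3.7) can be sketched for a single output node as follows. Several choices are assumptions rather than details confirmed by the text: the hidden layer uses a sigmoid activation, the output layer is linear, thresholds are added as biases, weights are initialised at random instead of being user-supplied, and the output threshold update is scaled by λ for stability, whereas the text states b_u = b_u + e_u. The function name and sample data are illustrative.

```python
import math
import random


def train_bp(samples, n_in, n_hidden, lr, epochs, seed=0):
    """samples: list of (inputs, target) pairs with len(inputs) == n_in. Returns a predictor."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
            for _ in range(n_in)]                                # omega_rs
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]    # omega_su with m = 1
    a = [0.0] * n_hidden                                         # hidden thresholds a_s
    b = [0.0]                                                    # output threshold b_u

    def forward(x):
        # Step 3.2: hidden outputs (assumed sigmoid); step 3.3: linear output.
        h = [1.0 / (1.0 + math.exp(-(sum(w_in[r][s] * x[r] for r in range(n_in)) + a[s])))
             for s in range(n_hidden)]
        return h, sum(h[s] * w_out[s] for s in range(n_hidden)) + b[0]

    for _ in range(epochs):
        for x, y in samples:
            h, o = forward(x)
            e = y - o                                        # step 3.4
            for s in range(n_hidden):
                g = lr * h[s] * (1.0 - h[s]) * w_out[s] * e  # uses the pre-update omega_su
                for r in range(n_in):
                    w_in[r][s] += g * x[r]                   # step 3.5, omega_rs
                a[s] += g                                    # step 3.6, a_s
                w_out[s] += lr * h[s] * e                    # step 3.5, omega_su
            b[0] += lr * e                                   # step 3.6, b_u (scaled by lr here)
    return lambda x: forward(x)[1]


# Illustrative target y = 0.2 + 0.6 x on [0, 1] (not data from the source).
samples = [([k / 10.0], 0.2 + 0.6 * k / 10.0) for k in range(11)]
model = train_bp(samples, n_in=1, n_hidden=3, lr=0.04, epochs=500)
print(model([0.5]))
```

In the combined model, one such network would be trained per cluster on the training data assigned to that cluster.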

Step 4) Predict the short-term traffic flow. The user inputs the prediction time points T in sequence. From the predicted outputs of the backpropagation neural networks obtained in step 3), take the c predicted values O₁, O₂, ..., O_c. Using the memberships of the prediction time point in each cluster, taken from the membership matrix U obtained in step 2), as weights, compute the weighted sum of the predictions of the networks corresponding to the clusters as the predicted value O_T at time point T, that is, $O_{T} = \sum_{i=1}^{c} u_{iT} O_{i}$, where T is the prediction time point, T = 1, 2, ..., n.
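The combination in step 4) is a membership-weighted sum. A minimal sketch, with a hypothetical function name and made-up numbers, assuming the memberships of the time point already sum to 1:

```python
def combine_predictions(memberships, predictions):
    """Weighted sum of per-cluster predictions, weighted by cluster membership."""
    if len(memberships) != len(predictions):
        raise ValueError("one membership is needed per cluster prediction")
    if abs(sum(memberships) - 1.0) > 1e-6:
        raise ValueError("memberships of a time point must sum to 1")
    return sum(u * o for u, o in zip(memberships, predictions))


# Three clusters: the time point belongs mostly to the first one.
print(combine_predictions([0.7, 0.2, 0.1], [120.0, 90.0, 60.0]))
```

Because every model contributes in proportion to its membership, a single poorly trained model has limited influence on the final value.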

Beneficial effects: the invention addresses short-term traffic flow prediction. Taking the membership-weighted sum of the individual models' predictions as the final predicted value mitigates, to some extent, the influence of noise in the training data on prediction accuracy. The invention has the following beneficial effects:

(1) The invention uses the Taguchi method to select suitable structural parameters and sets the structure of the prediction model accordingly, which further improves prediction accuracy.

(2) Even with manually specified structural parameters, the prediction accuracy of the invention is significantly higher than that of the traditional backpropagation neural network algorithm and the CITFF algorithm. Compared with a genetic algorithm, selecting the structural parameters with the Taguchi method reduces the prediction error more in a shorter time.

(3) The invention predicts short-term traffic flow with high accuracy and reasonable computation time.

Description of drawings

Figure 1 is a framework diagram of the short-term traffic flow prediction scheme.

Detailed description

The invention targets short-term traffic flow prediction at intersections. The implementation takes into account the diversity of traffic flow distributions, the adequacy of model training and the prediction accuracy, and reduces the need to generate these parameter values randomly or specify them by hand. The invention is described in more detail below.

In a specific implementation, a backpropagation neural network combination prediction model is first constructed. It combines a backpropagation neural network prediction model, a fuzzy C-means clustering model and a structural parameter selection model.

A backpropagation neural network is a multilayer feedforward neural network characterized by forward propagation of the signal and backward propagation of the error. In forward propagation, the input signal is processed layer by layer from the input layer through the hidden layer to the output layer, and the state of the neurons in each layer affects only the state of the neurons in the next layer. If the output layer does not produce the expected output, the network switches to backpropagation and adjusts the weights and thresholds according to the prediction error, so that the predicted output approaches the expected output. The implementation uses a backpropagation neural network with a single hidden layer to predict short-term traffic flow.

Fuzzy C-means clustering groups data points in a multidimensional space into a specified number of clusters. It first selects several cluster centers at random and assigns every data point a fuzzy degree of membership in each center, then corrects the cluster centers iteratively; the optimization objective is to minimize the membership-weighted sum of the distances from all data points to the cluster centers. The implementation uses fuzzy clustering to partition the traffic flow, so that each time point can belong to different patterns with certain degrees of membership. Because clustering accuracy has almost no effect on the accuracy of the prediction model, the fuzzy C-means algorithm, which has a relatively low computational cost, is used to improve the overall efficiency of the algorithm.

The choice of prediction model parameters is crucial to prediction accuracy, and the influence of different parameters should be verified by experiment. In experimental design, a factor is an independent variable of the experiment, that is, a condition that affects the experimental outcome, and a level is the amount or state that a factor takes. There are four main experimental design methods: trial and error, one-factor-at-a-time experiments, full factorial experiments and the Taguchi method. The Taguchi method combines the advantages of one-factor-at-a-time and full factorial experiments: it selects typical, representative combinations from all possible combinations so that the trial combinations are evenly distributed over the whole range and reflect the overall picture while keeping the number of trials as small as possible, which is why it uses orthogonal arrays. The implementation designs and runs experiments with the Taguchi method, computes the effect value of each design factor from the results, and selects reasonable parameters for the combined prediction model to improve its accuracy further.

In a specific implementation, the Taguchi method is first used to determine the structural parameters of the clustering model and the prediction model. The fuzzy C-means clustering algorithm then clusters the traffic flow, the training data are partitioned according to the resulting clusters, and one backpropagation neural network prediction model is trained per cluster. At prediction time, every prediction model produces a prediction for the prediction time point, and the weighted sum of these predictions, weighted by the membership of the time point in each model's cluster, is the final predicted value. The model framework is shown in Figure 1.

In a specific implementation, short-term traffic flow is predicted on the basis of the backpropagation neural network combination prediction model. Given the diversity of traffic flow distributions, the invention roughly partitions the flow dimension and describes different flow patterns with different prediction models; the best number of partitions is determined by experiment. To avoid the problems caused by too many patterns, the number of clusters is generally limited to at most 5. Because short-term traffic flow is uncertain, the invention uses the degree of membership to express the likelihood that each time point belongs to a given pattern, and takes the membership-weighted sum of the models' predictions as the final predicted value. To improve accuracy further, experiments designed with the Taguchi method determine reasonable structural parameters for the clustering and prediction models in a small number of trials; initializing the model structure with these parameters makes the predictions more accurate.

1. Selection of structural parameters

The structure of a backpropagation neural network and its initial weights and thresholds affect its accuracy, its convergence speed and whether it becomes trapped in a local minimum. The implementation therefore considers how to select the structural parameters of the backpropagation neural network.

In a specific implementation, design factor A directly affects the output of the clustering model, and design factors B, C and D determine the structure of the neural network prediction model. The design factors and their levels are chosen as follows:

1) Design factor A: the number of clusters determines how many classes the fuzzy C-means algorithm groups the traffic flow into, and therefore how many prediction models must be built. This parameter strongly affects prediction accuracy, so its best value must be determined by experiment. By common sense, traffic flow can be divided by magnitude into at least three classes (peak, off-peak and ordinary flow), so the lowest level is set to 3. Considering efficiency and the problems of the CITFF algorithm, the number of clusters should not be too large, so the highest level is set to 5 and the second level takes the middle value 4.

2) Design factor B: the number of input nodes determines how much historical traffic flow information is provided to the backpropagation neural network prediction model. Too few input nodes give the model too little information and reduce prediction accuracy; too many introduce noise, which also reduces accuracy. The implementation sets the three levels of the number of input nodes to 1, 3 and 5, covering few, a moderate number of, and many input nodes.

3) Design factor C: the number of hidden-layer nodes affects the learning ability of the neural network. With too few nodes, the network cannot form enough combinations of connection weights to learn the samples; with too many, its generalization after training degrades. The number of hidden-layer nodes is determined by the formula $l = \sqrt{n + m} + \delta$, where l is the number of hidden-layer nodes, n is factor B (the number of input nodes), m is the number of output nodes, and δ is a constant between 1 and 10. Since the number of input nodes is set to 1, 3 and 5 and the number of output nodes is fixed at 1, the three levels of the number of hidden-layer nodes can be set to 2, 3 and 7 according to this formula.

4) Design factor D: the BP algorithm learns by backpropagating the error and correcting the weights and thresholds, and the learning rate strongly affects both convergence speed and the training result. If the learning rate is too small, learning is too slow; if it is too large, training may oscillate or diverge. Empirically, a learning rate between 0.01 and 0.1 is appropriate, so the three levels of the learning rate are set to 0.02, 0.04 and 0.07.
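As a rough check on the levels chosen for factor C, the sketch below computes the range of hidden-layer node counts for each input-node level. It assumes the widely used empirical rule l = sqrt(n + m) + δ with δ between 1 and 10, and it assumes the result is rounded down to an integer; both are assumptions, and the function name is hypothetical.

```python
import math


def hidden_node_range(n_inputs, n_outputs=1, delta_min=1, delta_max=10):
    """Smallest and largest integer node counts allowed by the assumed rule."""
    base = math.sqrt(n_inputs + n_outputs)
    return math.floor(base + delta_min), math.floor(base + delta_max)


for n in (1, 3, 5):  # the three levels of factor B
    print(n, hidden_node_range(n))
```

Under these assumptions the levels 2, 3 and 7 all fall inside the ranges obtained for the three input-node levels.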

Since there are 4 design factors, each with 3 levels, the orthogonal array shown in Table 1 is used to design the experiments. In the specific implementation, the present invention uses the mean relative error $e_{MRE} = \frac{1}{N}\sum_{T=1}^{N} \frac{|\hat{\theta}(T) - \theta(T)|}{\theta(T)}$ to express the prediction error, where $\hat{\theta}(T)$ is the predicted traffic flow, θ(T) is the actual traffic flow entered by the user, T is a prediction time point, and N is the total number of prediction time points. The experimental error is a smaller-the-better characteristic, so in the specific implementation the smaller-the-better characteristic η of the Taguchi method is used to evaluate the effect value of each design factor, i.e. $\eta_q = -10 \log_{10}\left(\frac{1}{k}\sum_{p=1}^{k} e_{MRE,qp}^{2}\right)$, where k is the number of times each group of experiments in the orthogonal array is executed, and p and q are index variables (p over the k repetitions, q over the groups of experiments).

In the specific implementation, the experiments are executed and η is computed for each group of experiments. For a given factor, averaging the η values of the experiments in which the factor takes a given level yields the effect value of that level. For each factor, the level with the highest effect value is selected; together these levels form the best combination of initial values of the structural parameters.
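The effect-value calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: Table 1 is not reproduced in this text, so the standard L9(3^4) orthogonal array is assumed, and the function names (`mean_relative_error`, `sn_smaller_is_better`, `level_effects`, `best_levels`) are hypothetical.

```python
import math

# Assumption: the standard L9(3^4) orthogonal array (9 trials, 4 factors A-D,
# levels coded 0..2). Table 1 of the patent is not available in this text.
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]


def mean_relative_error(predicted, actual):
    """e_MRE = (1/N) * sum(|predicted - actual| / actual)."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def sn_smaller_is_better(errors):
    """eta = -10 * log10((1/k) * sum(e_p^2)) over the k repetitions of one trial."""
    return -10.0 * math.log10(sum(e * e for e in errors) / len(errors))


def level_effects(etas, design=L9, n_levels=3):
    """For each factor, average eta over the trials run at each of its levels."""
    effects = []
    for f in range(len(design[0])):
        per_level = []
        for lvl in range(n_levels):
            vals = [eta for row, eta in zip(design, etas) if row[f] == lvl]
            per_level.append(sum(vals) / len(vals))
        effects.append(per_level)
    return effects


def best_levels(effects):
    """Index of the level with the highest effect value for each factor."""
    return [max(range(len(e)), key=lambda i: e[i]) for e in effects]
```

Given one list of k error values per trial, `etas = [sn_smaller_is_better(errs) for errs in trial_errors]` followed by `best_levels(level_effects(etas))` returns the level index to use for factors A to D.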

2. Clustering of traffic flow

In the specific implementation, the present invention uses the fuzzy C-means clustering algorithm to cluster the traffic flow; the number of clusters c is determined in advance by the experiments designed with the Taguchi method. Given a data set X = {x₁, x₂, ..., x_n}, fuzzy C-means clustering partitions X into c classes with cluster centers V = {v₁, v₂, ..., v_c}. Unlike hard clustering, fuzzy clustering does not assign each object strictly to one class; instead it computes the degree of membership of each object in every class. If u_ij denotes the degree of membership of the j-th object in the i-th class, then u_ij ∈ [0,1] and $\sum_{i=1}^{c} u_{ij} = 1$, j = 1, 2, ..., n, i.e. the membership degrees of each object over all classes sum to 1. The goal of fuzzy C-means clustering is still to minimize the intra-class distance and maximize the inter-class distance, so its objective function can be set to the membership-weighted sum of squared distances, $J = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}$, where n is the total number of objects, c is the number of clusters, m is a weighting parameter not less than 1, and d_ij is the distance between object x_j and cluster center v_i, computed here as the Euclidean distance.

Fuzzy C-means clustering iteratively adjusts the membership matrix U and the cluster centers V so as to minimize the objective function. The specific steps are as follows:

(2.1) Select the number of clusters c and the weighting parameter m, and initialize the membership matrix U⁰ with random numbers between 0 and 1 so that it satisfies the constraint $\sum_{i=1}^{c} u_{ij} = 1$. Set the iteration step t to 0.

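(2.2) Use $v_{i}^{t} = \frac{\sum_{j=1}^{n} (u_{ij}^{t})^{m} x_{j}}{\sum_{j=1}^{n} (u_{ij}^{t})^{m}}$ to compute the current value of each element of the cluster centers V, where the superscript t denotes the value at the t-th iteration.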

(2.3) Use $J^{t} = \sum_{i=1}^{c}\sum_{j=1}^{n} (u_{ij}^{t})^{m} d_{ij}^{2}$ to compute J^t, where J^t is the value of J at the t-th iteration and d_ij is the Euclidean distance between data object x_j and class v_i, i.e. the straight-line distance between two points in space.

(2.4) Use $u_{ij}^{t+1} = \left[\sum_{k=1}^{c} \left(\frac{d_{ij}}{d_{kj}}\right)^{\frac{2}{m-1}}\right]^{-1}$ to update the membership matrix U according to the distances from the objects to the cluster centers, where $u_{ij}^{t+1}$ is the value of u_ij after the t-th iteration and d_kj is the Euclidean distance between data object x_j and class v_k.

(2.5) Use $J^{t+1} = \sum_{i=1}^{c}\sum_{j=1}^{n} (u_{ij}^{t+1})^{m} d_{ij}^{2}$ to compute $J^{t+1}$, the value of J after the t-th iteration.

(2.6) Check whether $|J^{t+1} - J^{t}| \le \varepsilon$ holds or t equals maxt. If so, continue with the operations below; otherwise set t = t + 1 and return to (2.2).

According to the maximum-membership principle, each object belongs to the class in which its degree of membership is largest.
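The iteration in steps (2.1) to (2.6) can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions rather than the patented implementation: the function names `fuzzy_c_means` and `hard_labels` are hypothetical, m > 1 is required so that the exponent 2/(m - 1) is defined, distances are clamped to a small positive value to avoid division by zero, and the cluster centers are recomputed after each membership update before J is re-evaluated.

```python
import math
import random


def fuzzy_c_means(X, c, m=2.0, eps=1e-6, maxt=100, seed=0):
    """Fuzzy C-means on a list of equal-length float vectors.

    Returns (U, V, J): U[i][j] is the membership of object j in class i,
    V[i] is the centre of class i, and J is the final objective value.
    Requires m > 1 so that the exponent 2 / (m - 1) is defined.
    """
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])

    def dist(x, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, v)))

    def centres(U):
        # (2.2) v_i = sum_j u_ij^m x_j / sum_j u_ij^m
        V = []
        for i in range(c):
            w = [U[i][j] ** m for j in range(n)]
            total = sum(w)
            V.append([sum(w[j] * X[j][d] for j in range(n)) / total
                      for d in range(dim)])
        return V

    def objective(U, V):
        # (2.3)/(2.5) J = sum_i sum_j u_ij^m d_ij^2
        return sum(U[i][j] ** m * dist(X[j], V[i]) ** 2
                   for i in range(c) for j in range(n))

    # (2.1) random memberships, normalised so every column sums to 1
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        col = sum(U[i][j] for i in range(c))
        for i in range(c):
            U[i][j] /= col

    V = centres(U)
    J = objective(U, V)
    for _ in range(maxt):
        # (2.4) u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1))
        for j in range(n):
            d = [max(dist(X[j], V[i]), 1e-12) for i in range(c)]
            for i in range(c):
                U[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
        V = centres(U)
        J_new = objective(U, V)
        converged = abs(J_new - J) <= eps
        J = J_new
        if converged:  # (2.6) stop on a small change in J, or after maxt passes
            break
    return U, V, J


def hard_labels(U):
    """Maximum-membership principle: class with the largest u_ij per object."""
    c, n = len(U), len(U[0])
    return [max(range(c), key=lambda i: U[i][j]) for j in range(n)]
```

For traffic flow, each element of `X` would be the flow vector observed at one time slot; the columns of the returned `U` are the memberships later used as combination weights.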

3. Training of the backpropagation neural network

After clustering, the traffic flow data form c clusters, corresponding to c patterns of traffic flow. To build a prediction model for each pattern, the training sets for these models must first be separated. The data in the original training set are divided along the time dimension: for each cluster, all training data whose time dimension matches that of the objects in the cluster are grouped into one sub-training set. This yields c sub-training sets, and each sub-training set is used to train one backpropagation neural network. The training process of a backpropagation neural network comprises the following steps:

(3.1) Network initialization. Using the previously obtained number of input-layer nodes n, number of hidden-layer nodes l, number of output-layer nodes m and learning rate λ, assign the connection weights ω_rs and ω_su between the neurons of the input, hidden and output layers, then assign the hidden-layer threshold a₀ and the output-layer threshold b₀, reset the maximum number of iterations maxt, and set the iteration step t to 0.

(3.2) Hidden-layer output calculation. From the input variable X, compute the hidden-layer output $H_{s} = f\left(\sum_{r=1}^{n} \omega_{rs} x_{r} - a_{s}\right)$, s = 1, 2, ..., l, where f is the hidden-layer activation function.

(3.3) Output-layer output calculation. From the hidden-layer output, compute the predicted output of the backpropagation neural network, $O_{u} = \sum_{s=1}^{l} H_{s} \omega_{su} - b_{u}$, u = 1, 2, ..., m.

(3.4) Error calculation. The user inputs the expected output Y_u, and the prediction error $e_{u} = Y_{u} - O_{u}$, u = 1, 2, ..., m, is computed.

(3.5) Weight update. Use the prediction error e_u to update the network connection weights ω_rs and ω_su, i.e. $\omega_{rs} = \omega_{rs} + \lambda H_{s}(1 - H_{s}) x_{r} \sum_{u=1}^{m} \omega_{su} e_{u}$ and $\omega_{su} = \omega_{su} + \lambda H_{s} e_{u}$, r = 1, 2, ..., n, s = 1, 2, ..., l, u = 1, 2, ..., m.

(3.6) Threshold update. Use the prediction error e_u to update the network node thresholds a_s and b_u, i.e. $a_{s} = a_{s} + \lambda H_{s}(1 - H_{s}) \sum_{u=1}^{m} \omega_{su} e_{u}$ and $b_{u} = b_{u} + e_{u}$, s = 1, 2, ..., l, u = 1, 2, ..., m.

(3.7) Check whether the iteration step t has reached the maximum number of iterations maxt. If so, go to step 4); otherwise set t = t + 1 and return to (3.2).
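Steps (3.1) to (3.7) can be illustrated with the following plain-Python sketch of a network with one hidden layer. Several points are assumptions of this sketch and not statements of the patent: the class name `BPNetwork` is hypothetical, the hidden activation is taken to be the logistic sigmoid, weights start as small random numbers with zero thresholds, samples are processed one at a time, and both thresholds are updated with the gradient-descent sign and scaled by λ, which differs from the update b_u = b_u + e_u written in step (3.6).

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNetwork:
    """n inputs, l sigmoid hidden nodes, m linear outputs, learning rate lam."""

    def __init__(self, n, l, m, lam, seed=0):
        rng = random.Random(seed)
        self.n, self.l, self.m, self.lam = n, l, m, lam
        # (3.1) small random weights and zero thresholds (sketch assumption)
        self.w_rs = [[rng.uniform(-0.5, 0.5) for _ in range(l)] for _ in range(n)]
        self.w_su = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(l)]
        self.a = [0.0] * l
        self.b = [0.0] * m

    def forward(self, x):
        # (3.2) H_s = f(sum_r w_rs x_r - a_s)
        H = [sigmoid(sum(self.w_rs[r][s] * x[r] for r in range(self.n)) - self.a[s])
             for s in range(self.l)]
        # (3.3) O_u = sum_s H_s w_su - b_u
        O = [sum(H[s] * self.w_su[s][u] for s in range(self.l)) - self.b[u]
             for u in range(self.m)]
        return H, O

    def train(self, samples, maxt):
        """samples: list of (x, y) pairs; runs maxt passes over them (3.7)."""
        for _ in range(maxt):
            for x, y in samples:
                H, O = self.forward(x)
                e = [y[u] - O[u] for u in range(self.m)]  # (3.4)
                # error signal propagated back to each hidden node
                g = [H[s] * (1.0 - H[s]) *
                     sum(self.w_su[s][u] * e[u] for u in range(self.m))
                     for s in range(self.l)]
                # (3.5) weight updates
                for s in range(self.l):
                    for u in range(self.m):
                        self.w_su[s][u] += self.lam * H[s] * e[u]
                for r in range(self.n):
                    for s in range(self.l):
                        self.w_rs[r][s] += self.lam * g[s] * x[r]
                # (3.6) thresholds, gradient-descent sign (sketch assumption)
                for s in range(self.l):
                    self.a[s] -= self.lam * g[s]
                for u in range(self.m):
                    self.b[u] -= self.lam * e[u]

    def mse(self, samples):
        total = 0.0
        for x, y in samples:
            _, O = self.forward(x)
            total += sum((y[u] - O[u]) ** 2 for u in range(self.m))
        return total / len(samples)
```

One such network would be trained per sub-training set, with n, l and λ taken from the levels selected by the Taguchi experiments and m = 1.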

In the above specific implementation of the present invention, the Taguchi method is used to determine suitable structural parameters. For example, the experiments are designed with the orthogonal array of Table 1, each group of experiments is executed 5 times, and the effect values of the 3 levels of each of the four design factors are computed; for each factor the level with the highest effect value is selected, and together these levels form the best combination of initial values of the structural parameters. The design factors and the experimental results for their levels are shown in Table 2. Next, the traffic flow of one day in the training set (any day will do) is selected and clustered with the fuzzy C-means clustering algorithm, forming multiple sub-training sets; each sub-training set is used to train one backpropagation neural network, and the networks correspond in turn to the different traffic flow patterns. In the final stage of the specific implementation, the predicted value at a specific prediction time point is computed from the prediction outputs of the backpropagation neural networks and the membership matrix U.
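The final combination step is a membership-weighted sum of the per-cluster network outputs, O_T = Σ_i u_iT · O_i, as stated in claim 1. A minimal sketch follows; the function name `combined_prediction` and the numbers in the usage note are illustrative only.

```python
def combined_prediction(memberships, cluster_predictions):
    """Weighted sum O_T = sum_i u_iT * O_i.

    memberships: degrees of membership u_iT of time point T in each of the
        c clusters (one column of the membership matrix U).
    cluster_predictions: outputs O_1..O_c of the c trained networks.
    """
    if len(memberships) != len(cluster_predictions):
        raise ValueError("one membership value is needed per cluster prediction")
    return sum(u * o for u, o in zip(memberships, cluster_predictions))
```

For instance, with illustrative memberships 0.2, 0.5 and 0.3 and network outputs 100, 200 and 300 vehicles, the combined prediction is 0.2·100 + 0.5·200 + 0.3·300 = 210.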

Table 1 is an example of an orthogonal array.

Table 1

Table 2 is an example of the design factors.

Table 2

Claims

1. A short-term traffic flow prediction method based on a neural network combination model, characterized in that the method comprises the following steps:

Step 1) setting the structural parameters for short-term traffic flow prediction;

Step 2) clustering the traffic flow: the user inputs a data set X = {x₁, x₂, ..., x_n} composed of data objects, a threshold ε and a maximum number of iterations maxt; the cluster centers, one per class, are V = {v₁, v₂, ..., v_c}; u_ij denotes the degree of membership of the j-th data object x_j in the i-th class v_i, and all u_ij form the membership matrix U; the objective function J is set to the membership-weighted sum of squared distances; n is the total number of data objects, c is the number of clusters, m is a weighting parameter not less than 1, d_ij is the distance between object x_j and cluster center v_i, i and j are index variables, and an iteration step t is set;

Step 3) training with backpropagation neural networks, a backpropagation neural network, also called a BP neural network, being a multilayer feedforward neural network trained by the error backpropagation algorithm;

Step 4) performing short-term traffic flow prediction: the user inputs the prediction time points T in turn; c predicted values O₁, O₂, ..., O_c are taken from the prediction outputs of the backpropagation neural networks obtained in step 3); following step 3), the degrees of membership of the prediction time point in each cluster, taken from the membership matrix U, are used as weights, and the weighted sum of the predicted values of the backpropagation neural networks corresponding to the clusters is taken as the predicted value O_T at prediction time point T, i.e. $O_{T} = \sum_{i=1}^{c} u_{iT} O_{i}$, where T is the prediction time point, T = 1, 2, ..., n.

2. The short-term traffic flow prediction method based on a neural network combination model according to claim 1, characterized in that setting the structural parameters for short-term traffic flow prediction in step 1) specifically comprises the following steps:

Step 1.1) taking the number of clusters as factor A, so that factor A represents the classification of the traffic flow, and setting the levels of factor A, the lowest level being 3, the highest level being 5 and the middle level being 4, where a level denotes the amount or state value taken by a factor;

Step 1.2) taking the number of input nodes as factor B and setting the three levels of factor B to 1, 3 and 5, these three levels representing few, a moderate number of, and many nodes, respectively;

Step 1.3) taking the number of hidden-layer nodes as factor C, factor C being determined by the empirical formula $l = \sqrt{n + m} + \delta$, where l is the number of hidden-layer nodes, n is factor B, i.e. the number of input nodes, m is the number of output nodes, and δ is a constant with a value between 1 and 10, the three levels of factor C being set to 2, 3 and 7;

Step 1.4) taking the learning rate as factor D, the value of factor D lying between 0.01 and 0.1 and its three levels being set to 0.02, 0.04 and 0.07;

Step 1.5) carrying out experiments with the Taguchi method to obtain the levels of factor A, factor B, factor C and factor D at which the effect value is highest, and using these levels as the initial values of the structural parameters. In the Taguchi method, $\eta_q = -10 \log_{10}\left(\frac{1}{k}\sum_{p=1}^{k} e_{MRE,qp}^{2}\right)$ is used to evaluate the effect value of each design factor, where k is the number of times each group of experiments is executed, e_MRE is the mean relative error, $e_{MRE} = \frac{1}{N}\sum_{T=1}^{N} \frac{|\hat{\theta}(T) - \theta(T)|}{\theta(T)}$, $\hat{\theta}(T)$ is the predicted traffic flow, θ(T) is the actual traffic flow input by the user, T is a prediction time point, N is the total number of prediction time points, and p and q are index variables; the Taguchi method is a quality engineering method that selects design parameters around a set target value and reduces variation to a minimum through experiments, so that all products produced have identical, stable quality.

3. The short-term traffic flow prediction method based on a neural network combination model according to claim 1, characterized in that step 2) specifically comprises the following steps:

Step 2.1) setting c to the level of factor A obtained in step 1); the user inputs m; each u_ij is assigned a random number between 0 and 1, all u_ij forming the initial membership matrix U such that $\sum_{i=1}^{c} u_{ij} = 1$ is satisfied; the iteration step t is set to 0;

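Step 2.2) using $v_{i}^{t} = \frac{\sum_{j=1}^{n} (u_{ij}^{t})^{m} x_{j}}{\sum_{j=1}^{n} (u_{ij}^{t})^{m}}$ to compute the current value of each element of the cluster centers V, where $u_{ij}^{t}$ is the value of u_ij at the t-th iteration and $v_{i}^{t}$ is the value of v_i at the t-th iteration;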

Step 2.3) using $J^{t} = \sum_{i=1}^{c}\sum_{j=1}^{n} (u_{ij}^{t})^{m} d_{ij}^{2}$ to compute J^t, where J^t is the value of J at the t-th iteration and d_ij is the Euclidean distance between data object x_j and class v_i, the Euclidean distance being the straight-line distance between two points in space;

Step 2.4) using $u_{ij}^{t+1} = \left[\sum_{k=1}^{c} \left(\frac{d_{ij}}{d_{kj}}\right)^{\frac{2}{m-1}}\right]^{-1}$ to update the membership matrix U according to the distances from the objects to the cluster centers, where $u_{ij}^{t+1}$ is the value of u_ij after the t-th iteration and d_kj is the Euclidean distance between data object x_j and class v_k;

Step 2.5) using $J^{t+1} = \sum_{i=1}^{c}\sum_{j=1}^{n} (u_{ij}^{t+1})^{m} d_{ij}^{2}$ to compute $J^{t+1}$, where $J^{t+1}$ is the value of J after the t-th iteration;

Step 2.6) judging whether $|J^{t+1} - J^{t}| \le \varepsilon$ is satisfied or t equals maxt; if so, going to step 3); otherwise setting t = t + 1 and returning to step 2.2).

4. The short-term traffic flow prediction method based on a neural network combination model according to claim 1, characterized in that step 3) specifically comprises the following steps:

Step 3.1) using the level of factor B obtained in step 1) as the number of input-layer nodes n, the level of factor C as the number of hidden-layer nodes l, and the level of factor D as the learning rate λ, with m the number of output-layer nodes; the user sets initial values for the connection weights ω_rs and ω_su between the neurons of the input, hidden and output layers, then inputs the initial hidden-layer threshold a₀ and output-layer threshold b₀, resets the maximum number of iterations maxt, and sets the iteration step t to 0;

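Step 3.2) from the data set variable X input by the user, using the formula $H_{s} = f\left(\sum_{r=1}^{n} \omega_{rs} x_{r} - a_{s}\right)$ to compute the hidden-layer output H_s, s = 1, 2, ..., l, where f is the hidden-layer activation function;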

Step 3.3) using $O_{u} = \sum_{s=1}^{l} H_{s} \omega_{su} - b_{u}$ to compute the predicted output O_u of the backpropagation neural network, u = 1, 2, ..., m;

Step 3.4) the user inputs the expected output Y_u, and $e_{u} = Y_{u} - O_{u}$ is used to compute the prediction error e_u, u = 1, 2, ..., m;

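Step 3.5) using $\omega_{rs} = \omega_{rs} + \lambda H_{s}(1 - H_{s}) x_{r} \sum_{u=1}^{m} \omega_{su} e_{u}$ and $\omega_{su} = \omega_{su} + \lambda H_{s} e_{u}$ to update the network connection weights ω_rs and ω_su with the prediction error e_u, r = 1, 2, ..., n, s = 1, 2, ..., l, u = 1, 2, ..., m;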

Step 3.6) using $a_{s} = a_{s} + \lambda H_{s}(1 - H_{s}) \sum_{u=1}^{m} \omega_{su} e_{u}$ and $b_{u} = b_{u} + e_{u}$ to update the network node thresholds a_s and b_u with the prediction error e_u, s = 1, 2, ..., l, u = 1, 2, ..., m;

Step 3.7) judging whether the iteration step t has reached the maximum number of iterations maxt; if so, going to step 4); otherwise setting t = t + 1 and returning to step 3.2).