CN109584552B - Bus arrival time prediction method based on network vector autoregressive model - Google Patents

Bus arrival time prediction method based on network vector autoregressive model

Info

Publication number
CN109584552B
Authority
CN
China
Prior art keywords
travel
bus
stations
station
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201811430278.7A
Other languages
Chinese (zh)
Other versions
CN109584552A (en)
Inventor
吴舜尧
刘殿中
张齐
余翔
宋涛涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University
Original Assignee
Qingdao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University
Priority to CN201811430278.7A
Publication of CN109584552A
Application granted
Publication of CN109584552B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation
    • G08G1/123 Traffic control systems for road vehicles indicating the position of vehicles, e.g. scheduled vehicles; Managing passenger vehicles circulating according to a fixed timetable, e.g. buses, trains, trams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/147 Network analysis or design for predicting network behaviour

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a bus arrival time prediction method based on a network vector autoregressive model. Taking bus stations and intersections as nodes, the method builds an urban traffic network from urban road traffic information and bus route planning, and extracts or infers data such as the number of public facilities, inter-station travel speed, and traffic congestion level from an intelligent transportation system database. It extracts the low-dimensional latent factors of the inter-station travel speed matrix and of the urban traffic network, builds a regression relationship between the latent factors, and predicts the travel speed of the corresponding road segments. It then learns from historical data with an extended network vector space autoregressive model, predicts inter-station travel speeds for a future time period, and estimates inter-station travel time from the inter-station distance and the predicted travel speed. The method takes the topological associations of the urban traffic network into account, makes full use of data such as bus arrival times and bus GPS positions, and effectively improves prediction performance.

Description

A method for predicting bus arrival time based on a network vector autoregressive model

Technical field:

The invention relates to the technical field of urban intelligent public transport information processing, and in particular to a method for predicting bus arrival time based on a network vector autoregressive model.

Background:

In recent years, China's rapid economic development and technological progress have greatly improved urban public transportation. Buses are an important part of urban public transportation and have become an indispensable means of transport in modern life. As urbanization advances and cities expand rapidly, problems such as fast growth in total passenger numbers, wide swings in passenger flow intensity, and large differences in service performance across time periods have become increasingly prominent. Accurately predicting bus arrival time is an important means of relieving pressure on urban public transport. On the one hand, arrival time prediction provides decision support for passenger flow guidance, bus safety management, and operational coordination, which helps improve the operating efficiency of the urban bus network and reduce traffic congestion. On the other hand, it lets passengers query bus arrival times, helps them plan, and eases the anxiety of waiting.

Bus arrival time prediction means modeling the data collected by an intelligent transportation system to predict when a bus will arrive at a station. Existing modeling methods fall roughly into two strategies: time series analysis and machine learning. The time series strategy extracts historical travel times between stations on a bus route as a time series, tests it for stationarity, randomness, and similar properties, and then selects a suitable time series model for prediction based on the test results. The machine learning strategy treats each inter-station trip as a sample and the inter-station travel time as the variable to be predicted; it extracts features such as segment length, congestion level, nearby weather, POI conditions, and upstream segment travel time, and then builds a model with random forests, support vector machines, neural networks, or similar methods. In general, existing methods do not fully account for the effect of topological associations within the urban road traffic network on bus travel time. In addition, collected bus arrival times often contain many missing values, and existing work usually discards the missing data rather than handling it properly.

Because inter-station travel speed reflects traffic conditions and is directly affected by travel speeds in adjacent areas, the invention converts bus arrival time prediction into inter-station travel speed prediction. On this basis, a regression relationship is built between the urban traffic network and the station travel speed matrix in order to fill in missing historical data. The network vector autoregressive model is then extended with a partially linear single-index model to predict inter-station travel speed. Finally, inter-station travel time is estimated from the inter-station travel speed, which gives the predicted time at which the bus reaches the target station.

Summary of the invention:

To overcome the defects of the prior art, the invention takes the topological associations of the urban traffic network into account and makes full use of data such as bus arrival times and bus GPS positions. It proposes a bus arrival time prediction method based on a network vector autoregressive model that effectively improves prediction performance.

The method comprises the following steps:

A. Data preprocessing for the intelligent transportation system: taking bus stations and intersections as nodes, build an urban traffic network from urban road traffic information and bus route planning, and extract or infer data such as the number of public facilities, inter-station travel speed, and traffic congestion level from the intelligent transportation system database;

B. Filling missing inter-station travel speeds based on singular value decomposition: for a time period in which travel speeds are missing, extract the low-dimensional latent factors of that period's inter-station travel speed matrix and of the urban traffic network, build a regression relationship between the latent factors, and predict the travel speed of the corresponding road segments;

C. Inter-station travel speed prediction based on the network vector partially linear autoregressive model: learn from historical data with the extended network vector space autoregressive model and predict inter-station travel speeds for a future time period;

D. Bus arrival time prediction and correction: estimate inter-station travel time from the inter-station distance and the predicted travel speed, sum the estimated travel times of the segments between the bus and the target station, and correct the result with reference to historical data.
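The accumulation in step D can be sketched as follows. This is a minimal illustration rather than the patented procedure: the function name `arrival_time_seconds`, the units (metres and metres per second), and the example numbers are assumptions, and the historical-data correction is left out.

```python
def arrival_time_seconds(segment_lengths_m, predicted_speeds_mps):
    """Sum distance / speed over the segments between the bus and the target station.

    Both arguments are hypothetical inputs: one length and one predicted
    speed per road segment, listed in travel order.
    """
    if len(segment_lengths_m) != len(predicted_speeds_mps):
        raise ValueError("one predicted speed is needed per segment")
    total = 0.0
    for length, speed in zip(segment_lengths_m, predicted_speeds_mps):
        if speed <= 0:
            raise ValueError("predicted speeds must be positive")
        total += length / speed  # travel time of this segment
    return total


# Three segments of 400 m, 600 m and 500 m at 5, 6 and 10 m/s.
eta = arrival_time_seconds([400.0, 600.0, 500.0], [5.0, 6.0, 10.0])
print(eta)
```

Working in speed rather than time keeps each segment's estimate comparable across segments of different length, which is the motivation the text gives for predicting speed first.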

In step A, the inter-station travel relationship network is inferred from the urban road traffic network and bus route planning, and inter-station distances are calculated from the angular relationship between stations.

In step A, bus GPS data are used to infer the congestion level between stations.

In step B, the topological association between the inter-station travel speed matrix and the inter-station travel relationship network is constructed in order to fill in missing inter-station travel speeds.

In step C, the network vector space autoregressive model is extended with a partially linear single-index model so that it can handle nonlinear relationships between the independent and dependent variables.

Compared with the prior art, the invention rests on a sound principle, takes the topological associations of the urban traffic network into account, makes full use of data such as bus arrival times and bus GPS positions, effectively improves prediction performance, predicts arrival times accurately, and is friendly to its application environment.

Description of drawings:

Fig. 1 is a flow chart of bus arrival time prediction based on the network vector autoregressive model.

Fig. 2 is a flow chart of filling missing values by singular value decomposition.

Fig. 3 shows the three cases of the angular relationship between stations described in Example 1.

Detailed description:

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.

Example 1:

The scheme of this embodiment comprises the following steps:

A. Data preprocessing for the intelligent transportation system

(1) Build the inter-station travel relationship network

First, intersections are placed as nodes in the geodetic coordinate system according to their latitude and longitude, and the nodes are connected according to the urban road plan. The result is described by a network $G = (V, L)$, where $V = \{v_1, v_2, \dots, v_n\}$ is the set of intersections, $n = |V|$ is the total number of intersections, and $L = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V,\ 1 \le h, l \le n\}$ is the set of road segments between intersections. Because the nodes of $G$ are fixed by latitude and longitude, the network reflects actual urban road conditions more faithfully;

Then, bus stations are added to the urban road traffic network according to bus route planning. The node set is redefined as $V = V_1 \cup V_2$, where $V_1$ is the set of intersections and $V_2$ is the set of stations. Station coordinates, distances between intersections, and the distance from each station to its nearest intersection are obtained from the intelligent transportation system database. Inter-station travel distances are then calculated. When two stations are both adjacent to the same intersection, their exact positions relative to the intersection must be determined. Given two stations $i$ and $j$, their azimuth $Az_{ij}$ ($0^\circ < Az_{ij} < 360^\circ$) is calculated by the following model:

Figure BDA0001882532670000041

Figure BDA0001882532670000042

where $W_i$ and $J_i$ are the latitude and longitude of node $i$, and $W_j$ and $J_j$ are the latitude and longitude of node $j$. Inter-station distances are calculated from the angular relationship, which has the three cases shown in Fig. 3:

Fig. 3-a: stations A and B have unequal azimuths relative to the intersection, and the distance $D_{AB}$ between them is the sum of their distances to the intersection. Fig. 3-b: stations A and B have unequal azimuths relative to the intersection that sum to 360°, and $D_{AB}$ is again the sum of their distances to the intersection. Fig. 3-c: stations A and B have equal azimuths relative to the intersection, and $D_{AB}$ is the absolute value of the difference between their distances to the intersection. The inter-station distances are calculated for these three cases, and the inter-station travel distance matrix $D$ is then generated according to bus route planning;
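The three cases of Fig. 3 reduce to one rule: equal azimuths mean the stations lie on the same side of the intersection, so the distances subtract; otherwise they add. The sketch below illustrates this under stated assumptions. The azimuth uses the standard initial-bearing formula on a sphere, which is swapped in here and is not necessarily the patent's own azimuth model, and the tolerance `tol_deg` and both function names are hypothetical.

```python
import math


def azimuth_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees in [0, 360).

    Standard great-circle bearing, used as a stand-in for the azimuth model.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def station_distance(az_a, dist_a, az_b, dist_b, tol_deg=1.0):
    """Distance between stations A and B that share a neighbouring intersection.

    az_* are azimuths of the stations as seen from the intersection, dist_*
    their distances to it. tol_deg (an assumed tolerance) decides when two
    azimuths count as equal.
    """
    diff = abs(az_a - az_b) % 360.0
    diff = min(diff, 360.0 - diff)       # smallest angle between the two bearings
    if diff <= tol_deg:                  # Fig. 3-c: same side of the intersection
        return abs(dist_a - dist_b)
    return dist_a + dist_b               # Fig. 3-a and 3-b: different sides


print(station_distance(90.0, 120.0, 270.0, 80.0))   # opposite sides
print(station_distance(45.0, 120.0, 45.3, 80.0))    # same side
```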

Finally, the inter-station travel relationship network $G_{Bus} = (V_{Bus}, L_{Bus})$ is extracted from $G$, where $V_{Bus} = \{v_1, \dots, v_N\}$ is the set of bus stations, $N = |V_{Bus}|$ is the number of bus stations in the network, and $L_{Bus} = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V_{Bus},\ 1 \le h, l \le N\}$ is the set of segments between adjacent stations. The adjacency matrix of $G_{Bus}$, generated from the inter-station travel distance matrix, is $A = (a_{ij}) \in \mathbb{R}^{N \times N}$, where $a_{ij} = 1$ if $\langle v_i, v_j \rangle \in L_{Bus}$ and $a_{ij} = 0$ otherwise;
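A small sketch of the adjacency matrix definition, assuming stations are indexed from 0 and that the segment set is given as a list of ordered (from, to) index pairs. The function name `adjacency_matrix` is hypothetical.

```python
def adjacency_matrix(n_stations, segments):
    """Build A = (a_ij) with a_ij = 1 when the ordered pair (i, j) is a segment."""
    a = [[0] * n_stations for _ in range(n_stations)]
    for src, dst in segments:
        a[src][dst] = 1  # directed: a route may serve i -> j without j -> i
    return a


# Three stations on a one-way route 0 -> 1 -> 2.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
for row in A:
    print(row)
```

Keeping the pairs ordered preserves the direction of travel, which matters when the two directions of a route use different stations or streets.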

(2) Extract inter-station travel speeds

Traffic congestion on a road segment is affected by neighboring segments, and travel time does not directly reflect congestion. This embodiment therefore extracts travel times from the intelligent transportation system database, converts them to travel speeds, and builds the prediction model on travel speed;

(2-1) Take the initial moment of the extracted data as the start time and divide the data into $T$ periods of fixed length;

(2-2) $Y_t \in \mathbb{R}^{N \times N}$ is the travel time matrix for period $t$, and its element
Figure BDA0001882532670000051
represents the average travel time from station i to station j during period t; therefore,
Figure BDA0001882532670000052
forms a T-dimensional vector;

(2-3) Obtain travel speed data

Given an inter-station travel time
Figure BDA0001882532670000061
the inter-station travel speed is calculated by the following model:

Figure BDA0001882532670000062

The inter-station travel time matrices are converted in order into inter-station travel speed matrices, from which the high-dimensional travel speed vector is generated:
Figure BDA0001882532670000063
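The conversion from a travel time matrix to a travel speed matrix can be sketched as below. It assumes speed is the inter-station distance divided by the average travel time, and that missing entries are encoded as `None`; both the encoding and the function name `speed_matrix` are assumptions made for illustration.

```python
def speed_matrix(distance_m, travel_time_s):
    """Element-wise speed = distance / time for one time period.

    Entries stay None (missing) when either the distance or the travel time
    is unavailable, so they can be filled in by a later imputation step.
    """
    n = len(distance_m)
    speeds = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d, y = distance_m[i][j], travel_time_s[i][j]
            if d is not None and y is not None and y > 0:
                speeds[i][j] = d / y
    return speeds


D = [[None, 600.0], [600.0, None]]     # metres between stations 0 and 1
Y_t = [[None, 120.0], [None, None]]    # seconds; the 1 -> 0 observation is missing
print(speed_matrix(D, Y_t))
```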

(3) Extract relevant covariates

Predicting inter-station travel speed requires not only the topological associations of the urban road traffic network but also other factors that affect speed. This embodiment uses public infrastructure (POI, Point Of Interest) and traffic congestion level as covariates;

(3-1) Public infrastructure

POI (Point Of Interest) denotes the number of public facilities (for example schools, hospitals, shopping malls, and cinemas) in the area between bus stations. In this embodiment,
Figure BDA0001882532670000064
records the number of public facilities near the trip from station i to station j; (
Figure BDA0001882532670000065
records the number of public facilities near the trip from station i to station j)

(3-2) Traffic congestion level

This embodiment uses bus GPS data to assess the congestion level of a travel segment. Given two adjacent stations i and j, the number of buses between them during period t
Figure BDA0001882532670000066
is counted from the GPS data. Based on the segment's historical bus-count series
Figure BDA0001882532670000067
and its minimum
Figure BDA0001882532670000068
first quartile
Figure BDA0001882532670000069
median
Figure BDA00018825326700000610
third quartile
Figure BDA00018825326700000611
and maximum
Figure BDA00018825326700000612
the traffic congestion level is divided into four levels:

Figure BDA0001882532670000071

where 1 means free-flowing, 2 means fairly free-flowing, 3 means fairly congested, and 4 means congested.

In summary, the covariate matrix $Z$ can be expressed as

$Z = (Z_{POI}, Z_{TPI})^{\mathsf{T}}$ (4).
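The quartile-based congestion grading that feeds $Z_{TPI}$ can be sketched as follows. Which side of each quartile boundary is inclusive is an assumption here, as are the function name `congestion_level` and the choice of the "inclusive" quantile method.

```python
import statistics


def congestion_level(count, history):
    """Grade a bus count against the quartiles of the segment's history.

    Returns 1 (free-flowing) to 4 (congested). Boundaries are treated as
    belonging to the lower level, which is an assumed convention.
    """
    q1, median, q3 = statistics.quantiles(history, n=4, method="inclusive")
    if count <= q1:
        return 1
    if count <= median:
        return 2
    if count <= q3:
        return 3
    return 4


history = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical past bus counts on one segment
print([congestion_level(c, history) for c in (2, 4, 6, 9)])
```

Grading against each segment's own history, rather than a global threshold, lets a busy trunk road and a quiet side street both use the full 1 to 4 range.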

B. Filling missing inter-station travel speeds based on singular value decomposition

For the travel speed matrix $S_t \in \mathbb{R}^{N \times N}$ of period $t$, this embodiment extracts the low-dimensional latent factors of the travel speed matrix and of the adjacency matrix of the inter-station travel relationship network, and builds a regression relationship between the latent factors to fill in the missing data in $S_t$. The procedure has three steps;

(1) Extract low-dimensional latent factors

This embodiment extracts the low-dimensional latent factors with a latent space network model:

Figure BDA0001882532670000072

where
Figure BDA0001882532670000073
$E_t$ is an $n \times n$ white-noise matrix, $\mu_t$ is the overall mean, $a_t$ and $b_t$ are the sender and receiver effects of the nodes, and $U_t$ and $V_t$ are the interaction effects. These parameters make up the low-dimensional latent factor
Figure BDA0001882532670000074
which can be estimated with the SVD model

Figure BDA0001882532670000075

where
Figure BDA0001882532670000081
and
Figure BDA0001882532670000082
are $N \times k$ nonsingular matrices,
Figure BDA0001882532670000083
is a $k \times k$ diagonal matrix with nonzero diagonal elements,
Figure BDA0001882532670000084
and the $n$-dimensional vectors
Figure BDA0001882532670000085
are the column means of
Figure BDA0001882532670000086
and
Figure BDA0001882532670000087
respectively. The low-dimensional latent factor of the travel speed matrix $S_t$ is thereby extracted as
Figure BDA0001882532670000088
Similarly, the low-dimensional latent factor of the adjacency matrix $A$ of the inter-station travel relationship network can be extracted as $N_A = [a_A, b_A, U_A, V_A]$;

(2) Build the regression model between low-dimensional latent factors

First, the row and column numbers of the missing values in $S_t$ are obtained, and the corresponding rows and columns are deleted from $S_t$ and the adjacency matrix $A$, giving $S_t'$ and $A'$. Their low-dimensional latent factors
Figure BDA00018825326700000821
and
Figure BDA00018825326700000822
are then extracted, and the following regression model is built:

Figure BDA0001882532670000089

where the model $f(\cdot)$ can be a linear, nonlinear, or nonparametric model. This embodiment uses the random forest algorithm with the number of decision trees set to 200;
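The first part of this step, removing the rows and columns that contain missing speeds from both matrices, can be sketched as below. The sketch assumes missing entries are encoded as `None` and that diagonal entries carry a number. The function name `drop_incomplete` is hypothetical, and the latent factor extraction and random forest fit are not shown.

```python
def drop_incomplete(speed, adjacency):
    """Delete each row and column of `speed` that holds a missing entry,
    and delete the same rows and columns from `adjacency`.

    Returns the two reduced matrices plus the kept row and column indices.
    """
    n = len(speed)
    bad_rows = {i for i in range(n) for j in range(n) if speed[i][j] is None}
    bad_cols = {j for i in range(n) for j in range(n) if speed[i][j] is None}
    rows = [i for i in range(n) if i not in bad_rows]
    cols = [j for j in range(n) if j not in bad_cols]
    speed_sub = [[speed[i][j] for j in cols] for i in rows]
    adjacency_sub = [[adjacency[i][j] for j in cols] for i in rows]
    return speed_sub, adjacency_sub, rows, cols


S_t = [[0.0, 5.0, None],
       [4.0, 0.0, 6.0],
       [3.0, 2.0, 0.0]]
A = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
S_sub, A_sub, rows, cols = drop_incomplete(S_t, A)
print(S_sub, A_sub, rows, cols)
```

The kept indices are returned so that predictions made later for the removed rows and columns can be written back to the right positions of the original matrix.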

(3) Predict and fill in missing values

First, the row and column numbers of the missing values in $S_t$ are obtained, and the corresponding rows and columns of $S_t$ and the adjacency matrix $A$ are extracted, giving $S_t''$ and $A''$. The low-dimensional latent factor of $A''$, $N_{A''} = [a_{A''}, b_{A''}, U_{A''}, V_{A''}]$, is then extracted. Substituting $N_{A''}$ into model (7) gives the corresponding low-dimensional latent factor
Figure BDA00018825326700000810
Finally, for
Figure BDA00018825326700000811
the column mean
Figure BDA00018825326700000812
and for
Figure BDA00018825326700000813
the column mean
Figure BDA00018825326700000814
are obtained and substituted into
Figure BDA00018825326700000815
to obtain the overall mean
Figure BDA00018825326700000816
which is then substituted into the following model

Figure BDA00018825326700000817

This gives
Figure BDA00018825326700000818
According to the row and column numbers, the data at the corresponding positions of
Figure BDA00018825326700000819
are substituted into $S_t$, which yields the inter-station travel speed matrix without missing values
Figure BDA00018825326700000820

C. Inter-station travel speed prediction based on the network vector partially linear autoregressive model

The network vector partially linear autoregressive model used in this embodiment is

Figure BDA0001882532670000091

where
Figure BDA0001882532670000092
represents the effect of the nonlinear variables (time-independent feature variables such as the number of public facilities and the congestion level) on the dependent variable; in
Figure BDA0001882532670000093
the term
Figure BDA0001882532670000094
is the covariate vector for the trip from station i to station j; in $g(z_{ij}\gamma)$, $\gamma = (\gamma_1, \gamma_2)^{\mathsf{T}}$ is the covariate coefficient, that is, the node effect coefficient;
Figure BDA0001882532670000095
is the total number of nodes connected to node i; the term
Figure BDA0001882532670000096
represents the average effect of the other stations k on station i at time t-1; the term
Figure BDA0001882532670000097
represents the effect of the previous travel speed on the current travel speed for the segment from station i to station j, that is, the dependent variable at time t-1 affects its value at time t;
Figure BDA00018825326700000915
is the error term, which is independent of the covariate $z_{ij}$ and follows a normal distribution, with expectation and variance
Figure BDA0001882532670000098
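The three components described above (a nonlinear single-index term, the average lagged speed of connected nodes, and the unit's own lagged speed) can be combined into a one-step prediction as sketched below. This is an illustration under assumptions, not the model as claimed: each unit is treated as a node of the relationship network, `beta1` and `beta2` are assumed to weight the network and own-lag terms respectively, and `g` is supplied by the caller, whereas the method estimates it from data.

```python
def predict_speed(i, prev_speed, adjacency, z_i, gamma, beta1, beta2, g):
    """One-step-ahead speed for unit i, ignoring the error term.

    prev_speed : speeds of all units at time t-1
    adjacency  : 0/1 matrix; adjacency[i][k] == 1 when i is connected to k
    z_i, gamma : covariates of unit i and their coefficients (single index)
    g          : link function of the single index (assumed known here)
    """
    neighbours = [k for k, a in enumerate(adjacency[i]) if a == 1]
    if neighbours:
        # average speed at t-1 over the n_i units connected to i
        network_term = sum(prev_speed[k] for k in neighbours) / len(neighbours)
    else:
        network_term = 0.0
    index = sum(z * c for z, c in zip(z_i, gamma))
    return g(index) + beta1 * network_term + beta2 * prev_speed[i]


A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
speed_prev = [10.0, 6.0, 8.0]
pred = predict_speed(0, speed_prev, A, z_i=[3.0, 2.0], gamma=[0.5, -1.0],
                     beta1=0.5, beta2=0.25, g=lambda u: u * u)
print(pred)
```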

Let $\beta = (\beta_1, \beta_2)^{\mathsf{T}}$,
Figure BDA0001882532670000099

Model (9) is rewritten as:

Figure BDA00018825326700000910

Let $\mu_i = z_{ij}\gamma$,
Figure BDA00018825326700000911
which gives:

Figure BDA00018825326700000912

The unknown parameter $\xi = (\gamma^{\mathsf{T}}, \beta^{\mathsf{T}})^{\mathsf{T}}$ is estimated as follows:

(1) Estimate $g(\cdot)$

For a given
Figure BDA00018825326700000913
local linear regression is used to minimize the following objective function:

Figure BDA00018825326700000914

where
Figure BDA0001882532670000101
$K(\cdot)$ is the kernel function and $h$ is the bandwidth; $K(\cdot)$ is a bounded, nonnegative, Lipschitz-continuous density function with compact support that is symmetric about 0.

This gives the estimator:

Figure BDA0001882532670000102

where

Figure BDA0001882532670000103
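A generic local linear estimator of this kind can be sketched as follows. The Epanechnikov kernel is chosen because it satisfies the stated kernel conditions, but the choice is an assumption, as are the function names. Here `y` stands for the response after the linear autoregressive part has been removed.

```python
def epanechnikov(u):
    """Bounded, nonnegative, symmetric, compactly supported kernel density."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(u0, u, y, h):
    """Local linear estimate of g(u0) and g'(u0) with bandwidth h.

    Minimises sum_i K_h(u_i - u0) * (y_i - a - b * (u_i - u0))**2 over (a, b)
    through the 2x2 weighted normal equations.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for ui, yi in zip(u, y):
        d = ui - u0
        w = epanechnikov(d / h) / h
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few points inside the bandwidth")
    a = (s2 * t0 - s1 * t1) / det   # estimate of g(u0)
    b = (s0 * t1 - s1 * t0) / det   # estimate of g'(u0)
    return a, b


grid = [k / 10 for k in range(11)]
values = [2.0 * x + 1.0 for x in grid]     # exactly linear test data
print(local_linear(0.5, grid, values, h=0.3))
```

On exactly linear data the local linear fit reproduces the line, which makes a convenient sanity check before alternating it with the parameter update.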

(2) Estimate $\xi$

After
Figure BDA0001882532670000104
is obtained in step (1),
Figure BDA0001882532670000105
is obtained by minimizing the following profile least squares function:

Figure BDA0001882532670000106

This gives
Figure BDA0001882532670000107
Step (1) is then repeated to obtain
Figure BDA0001882532670000108
and step (2) is repeated to obtain
Figure BDA0001882532670000109
The iteration continues until
Figure BDA00018825326700001010

D. Prediction and correction of bus arrival time

To improve prediction accuracy and offset the error introduced as the prediction horizon lengthens, this embodiment adds a correction coefficient α (0 ≤ α ≤ 1) that adjusts the prediction result.

The section from station i to station j covers l time periods. Travel time data are extracted from the intelligent transportation system to form an l-dimensional vector
Figure BDA00018825326700001011
Then
Figure BDA00018825326700001012
is split into two h-dimensional vectors
Figure BDA00018825326700001013
and
Figure BDA00018825326700001014
where
Figure BDA00018825326700001015
Next, according to model (2),
Figure BDA00018825326700001016
is converted into an inter-station travel speed vector and fed into the inter-station travel speed prediction model to obtain the estimated inter-station travel speed vector
Figure BDA0001882532670000111
and the inter-station travel times are computed in sequence according to the following formula:

Figure BDA0001882532670000112

This yields the travel time prediction vector

Figure BDA0001882532670000113
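
The conversion from predicted speeds back to travel times can be sketched as follows. The formula image is not reproduced here; the sketch assumes the usual relation time = distance / speed, with the section length in kilometres, speeds in km/h, and times in minutes. The function name and the numbers are hypothetical.

```python
def travel_times(distance_km, speeds_kmh):
    """Travel time in minutes for one section, for each predicted speed (one per period)."""
    times = []
    for v in speeds_kmh:
        if v <= 0:
            raise ValueError("predicted speed must be positive")
        times.append(60.0 * distance_km / v)
    return times


# One 1.5 km section with predicted speeds for three consecutive periods.
predicted_minutes = travel_times(1.5, [30.0, 45.0, 18.0])
```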

Finally, let

Figure BDA0001882532670000114

and find the optimal correction coefficient $\alpha_0$ according to formula (15):

Figure BDA0001882532670000115

The corrected travel time output for the section from station i to station j in period t is then

Figure BDA0001882532670000116
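
Formula (15) and the correction model are only available as images, so their exact form is not known from this text. Purely as an illustration of choosing a coefficient in [0, 1], the sketch below assumes a hypothetical correction of the form α · predicted + (1 − α) · reference, where the reference is some historical travel time, and picks α by a grid search that minimizes the squared error against observed travel times. The blend, the grid search, and every name here are assumptions, not the patent's formula.

```python
def corrected(alpha, predicted, reference):
    """Hypothetical correction: convex blend of model prediction and a historical reference."""
    return [alpha * p + (1.0 - alpha) * r for p, r in zip(predicted, reference)]


def best_alpha(predicted, reference, observed, steps=100):
    """Grid search over alpha in [0, 1] for the smallest sum of squared errors."""
    best, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        blend = corrected(alpha, predicted, reference)
        err = sum((c - o) ** 2 for c, o in zip(blend, observed))
        if err < best_err:
            best, best_err = alpha, err
    return best


# Toy travel times in minutes (hypothetical values).
alpha0 = best_alpha(predicted=[4.0, 6.0], reference=[2.0, 2.0], observed=[3.0, 4.0])
```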

After the data are obtained from the intelligent transportation system, the travel times of all road sections between station m and station n are computed by the steps above and summed; the sum is then added to the departure time from station m and output, which completes the bus arrival time prediction.
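
The final aggregation step can be sketched in a few lines, assuming the corrected section travel times are expressed in minutes. The function name, date, and values are hypothetical.

```python
from datetime import datetime, timedelta


def arrival_time(departure, section_minutes):
    """Departure time from station m plus the summed section travel times up to station n."""
    return departure + timedelta(minutes=sum(section_minutes))


# Three sections between station m and station n, departing at 08:00.
eta = arrival_time(datetime(2018, 11, 28, 8, 0), [3.0, 2.5, 4.5])
```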

Claims (1)

1. A bus arrival time prediction method based on a network vector autoregressive model, characterized in that the method mainly comprises the following steps:

A. Data preprocessing for the intelligent transportation system: taking bus stations and intersections as nodes, an urban traffic network is constructed from urban road traffic information and bus route planning, and data on the number of public facilities, the inter-station travel speed and the degree of traffic congestion are extracted or inferred from the intelligent transportation system database; this specifically comprises the following steps:

(1) Construct the inter-station travel relationship network

First, intersections are placed as nodes in the geodetic coordinate system according to their latitude and longitude, and the nodes are connected according to the urban road plan; the result is described by the network $G = (V, L)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of intersections, $n = |V|$ is the total number of intersections, and $L = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V,\ 1 < h, l < n\}$ is the set of road segments between intersections. The nodes in G are fixed by latitude and longitude and reflect the urban road layout. Bus stations are then added to the urban road traffic network according to the bus route plan, and the node set is redefined as $V = V_1 \cup V_2$, where $V_1$ is the set of intersections and $V_2$ is the set of stations. The latitude and longitude of the stations, the distances between intersections, and the distance from each station to its nearest intersection are obtained from the database of the intelligent transportation system. Next, the inter-station travel distance is calculated. When two stations are both adjacent to the same intersection, the exact positional relationship between the two stations and the intersection is determined; given two stations i and j, their azimuth $Az_{ij}$ ($0^\circ < Az_{ij} < 360^\circ$) is obtained from the following model:

$\cos(c) = \cos(90 - W_i) \times \cos(90 - W_j) + \sin(90 - W_i) \times \sin(90 - W_j) \times \cos(J_i - J_j)$

Figure FDA0002853901640000011
where $W_i$ is the latitude of node i, $J_i$ is the longitude of node i, $W_j$ is the latitude of node j, and $J_j$ is the longitude of node j. The inter-station distance is calculated from the angular relationship, in three cases: if the azimuths of station A and station B relative to the intersection are not equal, the distance $D_{AB}$ between them is the sum of the two stations' distances to the intersection; if the azimuths are not equal but sum to 360°, $D_{AB}$ is likewise the sum of the two stations' distances to the intersection; if the azimuths are equal, $D_{AB}$ is the absolute value of the difference between the two stations' distances to the intersection. The inter-station distances are computed for these three cases, and the inter-station travel distance matrix D is then generated according to the bus route plan.

Finally, the inter-station travel relationship network $G_{Bus} = (V_{Bus}, L_{Bus})$ is extracted from G, where $V_{Bus} = \{v_1, \ldots, v_N\}$ is the set of bus stations, $N = |V_{Bus}|$ is the number of bus stations in the inter-station travel relationship network, and $L_{Bus} = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V_{Bus},\ 1 < h, l < N\}$ is the set of adjacent road sections between stations. From the inter-station travel distance matrix, the adjacency matrix of $G_{Bus}$ is generated as $A = (a_{ij}) \in R^{N \times N}$, where $a_{ij} = 1$ when $(v_h, v_l) \in L_{Bus}$ and $a_{ij} = 0$ otherwise;

(2) Extract the inter-station travel speed

Travel times are extracted from the intelligent transportation system database and converted into travel speeds, and the travel speeds are modeled and predicted. The initial time of the extracted data is taken as the start time, and T periods are divided at fixed intervals. $Y_t \in R^{N \times N}$ is the travel time matrix of period t, whose element
Figure FDA0002853901640000021
represents the average travel time from station i to station j in period t; therefore
Figure FDA0002853901640000022
constitutes a T-dimensional high-dimensional vector. Given an inter-station travel time
Figure FDA0002853901640000023
the inter-station travel speed is obtained from the following model:
Figure FDA0002853901640000024
The inter-station travel time matrices are converted in sequence into inter-station travel speed matrices, which generates the high-dimensional travel speed vector
Figure FDA0002853901640000025
(3) Extract relevant covariates

Let
Figure FDA0002853901640000026
record the number of public facilities near the travel route from station i to station j.

Bus GPS data are used to assess the congestion level of a travel section. Given two adjacent stations i and j, the number of buses between them in period t,
Figure FDA0002853901640000031
is counted from the GPS data. Based on the minimum ($count_{ij}^{min}$), first quartile ($count_{ij}^{0.25}$), median ($count_{ij}^{median}$), third quartile ($count_{ij}^{0.75}$) and maximum ($count_{ij}^{max}$) of the section's historical bus count sequence
Figure FDA0002853901640000032
the degree of traffic congestion is divided into four levels:
Figure FDA0002853901640000033
where 1 denotes free-flowing, 2 fairly free-flowing, 3 fairly congested and 4 congested; the covariate matrix Z can then be expressed as

$Z = (Z_{POI}, Z_{TPI})^T$ (4);

B. Filling missing inter-station travel speeds based on singular value matrix decomposition: for a period in which travel speeds are missing, the low-dimensional latent factors of that period's inter-station travel speed matrix and of the urban traffic network are extracted, a regression relationship between the latent factors is built, and the travel speed of the corresponding road section is predicted; this specifically comprises the following steps:

For the travel speed matrix $S_t \in R^{N \times N}$ of period t, the low-dimensional latent factors of the travel speed matrix and of the adjacency matrix of the inter-station travel relationship network are extracted, and a regression relationship between the latent factors is built to fill the missing data in $S_t$, in the following three steps:

(1) Extract low-dimensional latent factors

A latent space network model is used to extract the low-dimensional latent factors; the latent space network model is
Figure FDA0002853901640000035
where
Figure FDA0002853901640000036
$E_t$ is an n×n white-noise matrix, $\mu_t$ is the overall mean, $a_t$ and $b_t$ are the sender and receiver effects of the nodes, and $U_t$ and $V_t$ are the interaction effects. These parameters constitute the low-dimensional latent factor
Figure FDA0002853901640000034
which can be estimated through the SVD model
Figure FDA0002853901640000041
Figure FDA0002853901640000042
Figure FDA0002853901640000043
where
Figure FDA0002853901640000044
and
Figure FDA0002853901640000045
are N×k nonsingular matrices,
Figure FDA0002853901640000046
is a (k×k) diagonal matrix whose diagonal elements are non-zero,
Figure FDA0002853901640000047
and the n-dimensional vectors
Figure FDA0002853901640000048
are the column means of
Figure FDA0002853901640000049
and
Figure FDA00028539016400000410
respectively. The low-dimensional latent factors of the travel speed matrix $S_t$ are extracted as
Figure FDA00028539016400000411
and the low-dimensional latent factors of the adjacency matrix A of the inter-station travel relationship network are then extracted, $N_A = [a_A, b_A, U_A, V_A]$;
(2) Build the regression model between low-dimensional latent factors

First, the row and column numbers of the missing values in $S_t$ are obtained; the corresponding rows and columns are then deleted from $S_t$ and from the adjacency matrix A, and the results are denoted $S_t'$ and $A'$. Their low-dimensional latent factors
Figure FDA00028539016400000412
and
Figure FDA00028539016400000413
are extracted, and the following regression model is built:
Figure FDA00028539016400000414
where the model f(·) is one or more of a linear model, a nonlinear model or a nonparametric model; the random forest algorithm is adopted, with the number of decision trees set to 200;

(3) Predict and fill the missing values

First, the row and column numbers of the missing values in $S_t$ are obtained; the corresponding rows and columns of $S_t$ and of the adjacency matrix A are then extracted and denoted $S_t''$ and $A''$. The low-dimensional latent factors of $A''$, $N_{A''} = [a_{A''}, b_{A''}, U_{A''}, V_{A''}]$, are extracted and substituted into model (7) to obtain the corresponding low-dimensional latent factors
Figure FDA00028539016400000415
Finally, the column mean
Figure FDA00028539016400000417
of
Figure FDA00028539016400000416
and the column mean
Figure FDA00028539016400000419
of
Figure FDA00028539016400000418
are computed and substituted into
Figure FDA00028539016400000420
to obtain the overall mean
Figure FDA00028539016400000421
which is then substituted into the following model
Figure FDA00028539016400000422
to obtain
Figure FDA00028539016400000423
According to the row and column numbers, the data at the corresponding positions of
Figure FDA00028539016400000424
are substituted into $S_t$, which gives the complete inter-station travel speed matrix
Figure FDA0002853901640000051
C. Inter-station travel speed prediction based on the network vector partially linear autoregressive model: historical data are learned with the extended network vector space autoregressive model in order to predict the inter-station travel speed in a future period; this specifically comprises the following steps:

The network vector partially linear autoregressive model used is
Figure FDA0002853901640000052
Figure FDA0002853901640000053
where $g(z_{ij}\gamma)$ represents the effect on the dependent variable of the time-independent nonlinear variables, namely the number of public facilities and the congestion level. In $g(z_{ij}\gamma)$,
Figure FDA0002853901640000054
is the vector of relevant covariates between station i and station j, and $\gamma = (\gamma_1, \gamma_2)^T$ is the covariate coefficient, i.e. the node effect coefficient.
Figure FDA0002853901640000055
denotes the total number of nodes to which node i is connected. In the model, the term
Figure FDA0002853901640000056
represents the average effect of the other stations k on station i at time t-1, and the term
Figure FDA0002853901640000057
represents the effect of the previous period's travel speed on the current travel speed for the section from station i to station j; that is, the value of the dependent variable at time t-1 influences its value at time t.
Figure FDA0002853901640000058
is the error term, which is independent of the covariate $z_{ij}$ and follows a normal distribution; the expectation and variance of
Figure FDA0002853901640000059
are
Figure FDA00028539016400000510
Let
Figure FDA00028539016400000511
Model (9) is then rewritten as:
Figure FDA00028539016400000512
Let $\mu_i = z_{ij}\gamma$,
Figure FDA00028539016400000513
which gives:
Figure FDA00028539016400000514
Estimating the unknown parameter $\xi = (\gamma^T, \beta^T)^T$ comprises:

(1) Estimate g(·)

For a given
Figure FDA0002853901640000061
local linear regression is used to minimize the following objective function:
Figure FDA0002853901640000062
where
Figure FDA0002853901640000063
K(·) is the kernel function and h is the bandwidth; K(·) is a bounded, non-negative, compactly supported, Lipschitz-continuous density function that is symmetric about 0.

This yields the estimator:
Figure FDA0002853901640000064
where:
Figure FDA0002853901640000065
(2) Estimate ξ

After
Figure FDA0002853901640000066
has been obtained in step (1),
Figure FDA0002853901640000067
is obtained by minimizing the following profile least-squares function:
Figure FDA0002853901640000068
This yields
Figure FDA0002853901640000069
Step (1) is then repeated to obtain
Figure FDA00028539016400000610
after which step (2) is repeated to obtain
Figure FDA00028539016400000611
and the two steps are alternated until
Figure FDA00028539016400000612
D. Prediction and correction of bus arrival time: the inter-station travel time is estimated from the inter-station distance and the predicted travel speed; the travel times of the road sections between the bus and the target station are then accumulated and corrected with reference to historical data; the specific steps are:

The section from station i to station j covers l time periods. Travel time data are extracted from the intelligent transportation system to form an l-dimensional vector
Figure FDA00028539016400000613
Then
Figure FDA00028539016400000614
is split into two h-dimensional vectors
Figure FDA0002853901640000071
and
Figure FDA0002853901640000072
where
Figure FDA0002853901640000073
Next, according to model (2),
Figure FDA0002853901640000074
is converted into an inter-station travel speed vector and fed into the inter-station travel speed prediction model to obtain the estimated inter-station travel speed vector
Figure FDA0002853901640000075
and the inter-station travel times are computed in sequence according to the following formula:
Figure FDA0002853901640000076
This yields the travel time prediction vector
Figure FDA0002853901640000077
Finally, let
Figure FDA0002853901640000078
Figure FDA0002853901640000079
and find the optimal correction coefficient $\alpha_0$ according to model (15):
Figure FDA00028539016400000710
The corrected travel time output for the section from station i to station j in period t is then
Figure FDA00028539016400000711
which is the bus arrival time.
CN201811430278.7A 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model Expired - Fee Related CN109584552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811430278.7A CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811430278.7A CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Publications (2)

Publication Number Publication Date
CN109584552A CN109584552A (en) 2019-04-05
CN109584552B true CN109584552B (en) 2021-04-30

Family

ID=65925127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811430278.7A Expired - Fee Related CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Country Status (1)

Country Link
CN (1) CN109584552B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021051329A1 (en) * 2019-09-19 2021-03-25 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for determining estimated time of arrival in online to offline services
CN111464937B (en) * 2020-03-23 2021-06-22 北京邮电大学 A positioning method and device based on multipath error compensation
CN111667689B (en) * 2020-05-06 2022-06-03 浙江师范大学 Method, device and computer device for vehicle travel time prediction
CN112632462B (en) * 2020-12-22 2022-03-18 天津大学 Synchronous measurement missing data restoration method and device based on time sequence matrix decomposition
CN113239198B (en) * 2021-05-17 2023-10-31 中南大学 Subway passenger flow prediction method and device and computer storage medium
CN113470365B (en) * 2021-09-01 2022-01-14 北京航空航天大学杭州创新研究院 Bus arrival time prediction method oriented to missing data
CN113487872B (en) * 2021-09-07 2021-11-16 南通飞旋智能科技有限公司 Bus transit time prediction method based on big data and artificial intelligence
CN114446039B (en) * 2021-12-31 2023-05-19 深圳云天励飞技术股份有限公司 Passenger flow analysis method and related equipment
CN115018454B (en) * 2022-05-24 2024-04-05 北京交通大学 A method for calculating passenger travel time value based on travel mode recognition

Citations (3)

Publication number Priority date Publication date Assignee Title
CN104064028A (en) * 2014-06-23 2014-09-24 银江股份有限公司 Bus arrival time predicting method and system based on multivariate information data
CN105243868A (en) * 2015-10-30 2016-01-13 青岛海信网络科技股份有限公司 Bus arrival time forecasting method and device
CN108831181A (en) * 2018-05-04 2018-11-16 东南大学 A kind of method for establishing model and system for Forecasting of Travel Time for Public Transport Vehicles

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN102074124B (en) * 2011-01-27 2013-05-08 山东大学 A Dynamic Bus Arrival Time Prediction Method Based on SVM and H∞ Filter
CN102708701B (en) * 2012-05-18 2015-01-28 中国科学院信息工程研究所 System and method for predicting arrival time of buses in real time
US10225161B2 (en) * 2016-10-31 2019-03-05 Accedian Networks Inc. Precise statistics computation for communication networks

Non-Patent Citations (3)

Title
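Network vector autoregression; Hansheng Wang et al.; Social Science Electronic Publishing; 2016-12-31; see full text, pp. 1-30 *
Research on bus arrival time prediction methods (公交车辆到站时间预测方法研究); 赵衍青; China Master's Theses Full-text Database (中国优秀硕士学位论文全文数据库); 2017-06-15; Chapter 3 *
Formal description of dynamic networking operations for a vector-space-based multi-subnet composite complex network model (急于向量空间的多子网复合复杂网络模型动态组网运算的形式描述); 隋毅 et al.; Journal of Software (软件学报); 2015-12-31; full text *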

Also Published As

Publication number Publication date
CN109584552A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109584552B (en) Bus arrival time prediction method based on network vector autoregressive model
WO2023029234A1 (en) Method for bus arrival time prediction when lacking data
CN112150207B (en) Online ride-hailing order demand prediction method based on spatiotemporal contextual attention network
CN114330106A (en) Urban public traffic planning method
CN114463977A (en) Path planning method based on vehicle-road collaborative multi-source data fusion traffic flow prediction
JP7625140B2 (en) Distributed Multi-Task Machine Learning for Traffic Forecasting
CN110274609B (en) A real-time route planning method based on travel time prediction
CN106652534A (en) Method for predicting arrival time of bus
CN106652441A (en) Urban road traffic condition prediction method based on spatial-temporal data
Zhao et al. Agent-based model (ABM) for city-scale traffic simulation: A case study on San Francisco
CN110009906A (en) Dynamic route planning method based on traffic prediction
CN111695225A (en) Bus composite complex network model and bus scheduling optimization method thereof
CN115204478A (en) A public transport flow prediction method combining urban points of interest and spatiotemporal causality
CN109269516A (en) A kind of dynamic route guidance method based on multiple target Sarsa study
CN109064750B (en) Urban road network traffic estimation method and system
CN109583708B (en) A method for establishing a multi-agent microscopic traffic distribution model
Lam et al. Prediction of bus arrival time using real-time on-line bus locations
Zhu et al. Large-scale travel time prediction for urban arterial roads based on Kalman filter
CN114692980B (en) A method for predicting short-term passenger flow at new line stations of urban rail transit
CN114993336B (en) A commuter route optimization method and system based on PM2.5 pollutant exposure risk
CN115146840A (en) A data-driven method for predicting passenger flow of new rail transit lines
CN114330871A (en) Method for predicting urban road conditions by combining public transport operation data with GPS data
CN111275970A (en) An optimal route planning method considering real-time bus information
Matsumoto et al. Multi-agent Simulation with Mathematical Optimization of Urban Traffic Using Open Geographic Data
Wu et al. Bus Arrival Time Estimation Based on GPS Data by the Artificial Bee Colony Optimization BP Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210430