CN109584552B - Bus arrival time prediction method based on network vector autoregressive model - Google Patents

Bus arrival time prediction method based on network vector autoregressive model

Info

Publication number
CN109584552B
Authority
CN
China
Prior art keywords
travel
bus
station
model
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201811430278.7A
Other languages
Chinese (zh)
Other versions
CN109584552A (en
Inventor
吴舜尧
刘殿中
张齐
余翔
宋涛涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University
Original Assignee
Qingdao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University
Priority to CN201811430278.7A
Publication of CN109584552A
Application granted
Publication of CN109584552B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
      • G08 SIGNALLING
        • G08G TRAFFIC CONTROL SYSTEMS
          • G08G1/00 Traffic control systems for road vehicles
            • G08G1/01 Detecting movement of traffic to be counted or controlled
              • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
                • G08G1/0125 Traffic data processing
                  • G08G1/0133 Traffic data processing for classifying traffic situation
            • G08G1/123 Traffic control systems for road vehicles indicating the position of vehicles, e.g. scheduled vehicles; Managing passenger vehicles circulating according to a fixed timetable, e.g. buses, trains, trams
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
            • H04L41/14 Network analysis or design
              • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
              • H04L41/147 Network analysis or design for predicting network behaviour

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting bus arrival time based on a network vector autoregressive model. Taking bus stops and intersections as nodes, the method constructs an urban traffic network from urban road traffic information and bus route plans, and extracts and derives data such as the number of public facilities, the travel speed between stops, and the traffic congestion degree from an intelligent transportation system database. It extracts low-dimensional hidden factors of the inter-stop travel speed matrix and of the urban traffic network, constructs a regression relationship between the hidden factors, and uses it to predict the travel speed of the corresponding road segments. It then learns from historical data with an extended network vector autoregressive model to predict the travel speed between stops in a future period, and estimates the travel time between stops from the distance between stops and the predicted travel speed. The method takes the topological correlation of the urban traffic network into account, makes full use of data such as bus arrival times and GPS positioning information, and effectively improves the prediction effect.

Description

Bus arrival time prediction method based on network vector autoregressive model
Technical field:
The invention relates to the technical field of urban intelligent public transport information processing, and in particular to a bus arrival time prediction method based on a network vector autoregressive model.
Background art:
In recent years, China's rapid economic development and advances in science and technology have greatly raised the level of urban public transport. Buses are an important component of urban public transport and have become an essential means of transport in modern life. With continuing urbanization and the rapid expansion of cities, problems such as the growing total passenger volume, large fluctuations in bus passenger flow intensity, and large differences in passenger transport performance between time periods have become increasingly prominent. Accurate prediction of bus arrival time is an important means of relieving the pressure on urban public transport. On the one hand, it provides decision support for passenger flow guidance, bus safety management, and operation coordination, which helps improve the operating efficiency of the urban bus network and reduce traffic congestion. On the other hand, it enables a bus arrival time query service for passengers, which helps them plan their trips and relieves the anxiety of waiting for a bus.
Bus arrival time prediction means modeling the data acquired by an intelligent transportation system in order to predict the time at which a bus arrives at a station. The modeling methods fall roughly into two strategies: time series analysis and machine learning. The time series strategy extracts the historical travel times between the stations of a bus line as a time series, tests properties such as its stationarity and randomness, and then selects a suitable time series model for prediction according to the test results. The machine learning strategy treats each inter-station trip as an object and the inter-station travel time as the variable to be predicted; it extracts features such as the length of the road segment between the stations, the congestion degree, nearby weather conditions, POI conditions, and the travel time of the upstream segment, and then builds a model with a random forest, a support vector machine, a neural network, or a similar method. In summary, existing methods do not fully consider the influence of the topological correlation of the urban road traffic network on bus travel time. In addition, the collected bus arrival times usually contain a large number of missing values, and existing work usually discards the missing data instead of handling it properly.
Because the travel speed between stations reflects the traffic condition and is directly influenced by the travel speed in adjacent areas, the method converts the prediction of bus arrival time into the prediction of inter-station travel speed. On this basis, a regression relationship is constructed from the urban traffic network and the inter-station travel speed matrix in order to fill in missing historical data. Furthermore, the network vector autoregressive model is extended on the basis of a partial linear single index model to predict the travel speed between stations. Finally, the travel time between stations is estimated from the travel speed between stations, and the time at which the bus reaches the target station is predicted.
Summary of the invention:
To overcome the defects of the prior art, the invention takes the topological correlation of the urban traffic network into account, makes full use of data such as bus arrival times and bus GPS positioning information, and provides a bus arrival time prediction method based on a network vector autoregressive model that effectively improves the prediction effect.
The bus arrival time prediction method based on a network vector autoregressive model comprises the following steps:
A. Data preprocessing for the intelligent transportation system: taking bus stops and intersections as nodes, construct an urban traffic network from urban road traffic information and bus route plans, and extract and derive data such as the number of public facilities, the travel speed between stations, and the traffic congestion degree from the intelligent transportation system database;
B. Filling of missing inter-station travel speeds based on singular value matrix decomposition: for a time period with missing travel speeds, extract the low-dimensional hidden factors of that period's inter-station travel speed matrix and of the urban traffic network, construct a regression relationship between the hidden factors, and predict the travel speed of the corresponding road segments;
C. Inter-station travel speed prediction based on a network vector partial linear autoregressive model: learn from historical data with the extended network vector autoregressive model in order to predict the travel speed between stations in a future period;
D. Bus arrival time prediction and correction: estimate the travel time between stations from the distance between stations and the predicted travel speed, then estimate and accumulate the travel time of each road segment between the bus and the target station, and correct the result with reference to historical data.
In step A, an inter-station travel relationship network is derived from the urban road traffic network and the bus route plan, and the distance between stations is calculated from the included angle relationship between the stations.
In step A, the congestion degree between stations is derived from bus GPS data.
In step B, the topological correlation between the inter-station travel speed matrix and the inter-station travel relationship network is constructed in order to fill in the missing inter-station travel speeds.
In step C, the network vector autoregressive model is extended on the basis of a partial linear single index model, so that nonlinear correlation between the independent and dependent variables can be handled.
Compared with the prior art, the method is based on a reliable principle, takes the topological correlation of the urban traffic network into account, makes full use of data such as bus arrival times and bus GPS positioning information, effectively improves the prediction effect, predicts the time accurately, and is environmentally friendly in application.
Description of the drawings:
FIG. 1: flow diagram of the bus arrival time prediction based on a network vector autoregressive model according to the invention.
FIG. 2: flow chart of missing value filling based on the singular value matrix decomposition method according to the invention.
FIG. 3: the three cases of the inter-station included angle relationship in Example 1.
Detailed description of the embodiments:
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Example 1:
The scheme of this embodiment comprises the following steps:
A. Data preprocessing for the intelligent transportation system
(1) Building an inter-site travel relationship network
First, intersections are placed as nodes in a geodetic coordinate system according to their longitude and latitude and are connected according to the urban road plan, giving a network $G = (V, L)$. Here $V = \{v_1, v_2, \ldots, v_n\}$ is the set of intersections, $n = |V|$ is the total number of intersections, and $L = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V,\ 1 \le h, l \le n\}$ is the set of links between intersections. Because the nodes of $G$ are fixed at their longitude and latitude, the network reflects the actual urban road conditions more faithfully;
Then, bus stations are added to the urban road traffic network according to the bus route plan. On this basis, the node set is redefined as $V = V_1 \cup V_2$, where $V_1$ is the set of intersections and $V_2$ is the set of stations. The longitude and latitude of each station, the distances between intersections, and the distance from each station to its nearest intersection are obtained from the intelligent transportation system database. The travel distance between stations is then calculated. When two stations are both adjacent to the same intersection, the exact positions of the stations relative to the intersection must be determined; the azimuth $Az_{ij}$ ($0° < Az_{ij} < 360°$) between nodes $i$ and $j$ is calculated from the following model
$$\cos(c) = \cos(90^\circ - W_i)\cos(90^\circ - W_j) + \sin(90^\circ - W_i)\sin(90^\circ - W_j)\cos(J_i - J_j)$$
Figure BDA0001882532670000042
where $W_i$ and $J_i$ are the latitude and longitude of node $i$, and $W_j$ and $J_j$ are the latitude and longitude of node $j$. The distance between stations can then be calculated from the included angle relationship; the three cases are shown in FIG. 3.
FIG. 3-a shows the case where station A and station B have unequal azimuths relative to the intersection; the distance $D_{AB}$ between station A and station B is the sum of the distances from the two stations to the intersection. FIG. 3-b shows the case where the azimuths of station A and station B relative to the intersection are unequal but sum to 360°; $D_{AB}$ is again the sum of the distances from the two stations to the intersection. FIG. 3-c shows the case where station A and station B have equal azimuths relative to the intersection; $D_{AB}$ is the absolute value of the difference between the distances from the two stations to the intersection. The distance between stations is calculated according to these three cases; then the inter-station travel distance matrix $D$ is generated according to the bus route plan;
Finally, the inter-station travel relationship network $G_{Bus} = (V_{Bus}, L_{Bus})$ is extracted from $G$, where $V_{Bus} = \{v_1, \ldots, v_N\}$ is the set of bus stations, $N = |V_{Bus}|$ is the number of bus stations in the inter-station travel relationship network, and $L_{Bus} = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V_{Bus},\ 1 \le h, l \le N\}$ is the set of adjacent road segments between stations. At the same time, the adjacency matrix of $G_{Bus}$ is generated according to the inter-station travel distance matrix as $A = (a_{ij}) \in R^{N \times N}$, where $a_{ij} = 1$ when $\langle v_i, v_j \rangle \in L_{Bus}$ and $a_{ij} = 0$ otherwise;
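The three distance cases and the adjacency matrix described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the tolerance `tol`, the 0-based station indices, and the treatment of segments as directed pairs are all assumptions made for the example.

```python
def inter_station_distance(az_a, dist_a, az_b, dist_b, tol=1e-6):
    """Distance D_AB between two stations adjacent to the same intersection.

    az_a, az_b     : azimuth of each station relative to the intersection (degrees)
    dist_a, dist_b : distance from each station to the intersection
    tol            : hypothetical tolerance for treating two azimuths as equal
    """
    # Equal azimuths: both stations lie on the same side of the intersection,
    # so the distance is the absolute difference of the two distances.
    if abs(az_a - az_b) < tol:
        return abs(dist_a - dist_b)
    # Unequal azimuths (including the case where they sum to 360 degrees):
    # the distance is the sum of the two distances.
    return dist_a + dist_b


def adjacency_matrix(n_stations, segments):
    """Build A = (a_ij) with a_ij = 1 if (i, j) is an adjacent segment, else 0."""
    a = [[0] * n_stations for _ in range(n_stations)]
    for i, j in segments:
        a[i][j] = 1
    return a
```

For example, `inter_station_distance(45, 300, 45, 200)` falls into the equal-azimuth case and returns the difference of the two distances.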
(2) Extracting travel speeds between stops
The traffic jam condition of a certain road section is influenced by the adjacent road sections, and the travel time cannot directly reflect the traffic jam condition; for this purpose, the embodiment extracts the travel time from the intelligent transportation system database, then converts the travel time into the travel speed, and models and predicts the travel speed;
(2-1) taking the initial time of the extracted data as the starting time, and dividing T time periods at intervals of a fixed time period;
(2-2) $Y_t \in R^{N \times N}$ is the travel time matrix of time period $t$, whose elements
Figure BDA0001882532670000051
Represents the average travel time from site i to site j for time period t, and therefore,
Figure BDA0001882532670000052
forms a $T$-dimensional vector;
(2-3) acquiring travel speed data
Given travel time between certain sites
Figure BDA0001882532670000061
The travel speed between stations can be calculated according to the following model
Figure BDA0001882532670000062
Sequentially converting the travel time matrix among the sites into a travel speed matrix among the sites to generate a high-dimensional travel speed vector
Figure BDA0001882532670000063
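The conversion from travel time to travel speed can be sketched as below. The sketch assumes that model (2) is the usual ratio of inter-station distance to travel time, which is consistent with step D estimating travel time from distance and predicted speed; the function name and the use of NaN to mark missing entries are choices made for this example only.

```python
import math


def travel_speed_matrix(distance, travel_time):
    """Convert an N x N travel time matrix into an N x N travel speed matrix.

    distance    : N x N list of inter-station travel distances
    travel_time : N x N list of average travel times; None or NaN marks a
                  missing observation
    Entries with a missing or non-positive travel time stay NaN so that they
    can be filled in by a later step.
    """
    n = len(distance)
    speed = [[math.nan] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            t = travel_time[i][j]
            if t is not None and not math.isnan(t) and t > 0:
                speed[i][j] = distance[i][j] / t
    return speed
```

Keeping missing entries as NaN rather than zero avoids confusing "no observation" with "no movement" when the speed matrix is processed later.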
(3) Extracting related covariates
Predicting the travel speed between stations requires considering not only the topological correlation of the urban road traffic network but also other factors that influence speed; for this purpose, this embodiment selects the public infrastructure condition (POI, Point Of Interest) and the traffic congestion degree as covariates;
(3-1) public infrastructure situation
POI (Point Of Interest) represents the number of public facilities (e.g., schools, hospitals, malls, movie theaters) in the area between bus stations; this embodiment uses
Figure BDA0001882532670000064
to record the number of public facilities near the route from station i to station j;
(3-2) degree of traffic congestion
The method adopts the bus GPS data to evaluate the congestion degree of the travel road section; giving two adjacent stations i and j, and counting the number of buses between the adjacent stations i and j in a t period according to GPS data
Figure BDA0001882532670000066
And based on the historical number sequence of the buses in the road section
Figure BDA0001882532670000067
specifically its minimum value $count_{ij}^{min}$, first quartile $count_{ij}^{0.25}$, median $count_{ij}^{median}$, third quartile $count_{ij}^{0.75}$, and maximum value $count_{ij}^{max}$, the traffic congestion degree is divided into four levels:
Figure BDA0001882532670000071
where 1 denotes unobstructed, 2 fairly unobstructed, 3 fairly congested, and 4 congested.
In summary, the covariate matrix $Z$ can be expressed as
$$Z = (Z_{POI}, Z_{TPI})^T \quad (4)$$
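The four-level congestion classification can be illustrated with a short sketch. The exact thresholds of model (3) are not recoverable from the text, so the binning below (quartile boundaries of the historical bus counts as cut points, boundaries inclusive on the lower level) is an assumption, as are the function name and the choice of the "inclusive" quantile method.

```python
import statistics


def congestion_level(count, history):
    """Map the bus count of the current period to a congestion level 1..4.

    count   : number of buses on the segment in the current time period
    history : historical sequence of bus counts for the same segment
    Assumed rule: level 1 up to the first quartile, level 2 up to the median,
    level 3 up to the third quartile, level 4 above it.
    """
    q1, median, q3 = statistics.quantiles(history, n=4, method="inclusive")
    if count <= q1:
        return 1  # unobstructed
    if count <= median:
        return 2  # fairly unobstructed
    if count <= q3:
        return 3  # fairly congested
    return 4      # congested
```

With the history `[1, 2, ..., 9]` the cut points are 3, 5, and 7, so a count of 6 maps to level 3.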
B. Filling of missing inter-station travel speeds based on singular value matrix decomposition
For the travel speed matrix $S_t \in R^{N \times N}$ of time period $t$, this embodiment extracts the low-dimensional hidden factors of the travel speed matrix and of the adjacency matrix of the inter-station travel relationship network, and constructs a regression relationship between the hidden factors to fill in the missing data in $S_t$. This involves the following three steps;
(1) extracting low dimensional implicit factors
This embodiment uses a latent space network model to extract the low-dimensional hidden factors. The latent space network model is
Figure BDA0001882532670000072
Wherein,
Figure BDA0001882532670000073
$E_t$ is an $N \times N$ white noise matrix, $\mu_t$ is the overall mean, $a_t$ and $b_t$ represent the sending and receiving effects of the nodes, and $U_t$ and $V_t$ represent interaction effects. These parameters constitute the low-dimensional hidden factors
Figure BDA0001882532670000074
It can be estimated by SVD model
Figure BDA0001882532670000075
Wherein,
Figure BDA0001882532670000081
and
Figure BDA0001882532670000082
is an N x k non-singular matrix,
Figure BDA0001882532670000083
is a $k \times k$ diagonal matrix whose diagonal elements are non-zero,
Figure BDA0001882532670000084
n-dimensional vector
Figure BDA0001882532670000085
Are respectively
Figure BDA0001882532670000086
And
Figure BDA0001882532670000087
column means. Further, for the travel speed matrix $S_t$, the low-dimensional hidden factors
Figure BDA0001882532670000088
are extracted. Similarly, the low-dimensional hidden factors $N_A = [a_A, b_A, U_A, V_A]$ of the adjacency matrix $A$ of the inter-station travel relationship network can be extracted;
(2) Construction of a model of regression relationships between low-dimensional hidden factors
First, the row and column numbers of the missing values in $S_t$ are obtained, and the corresponding rows and columns are deleted from $S_t$ and from the adjacency matrix $A$; the results are denoted $S_t'$ and $A'$. Further, extract their low-dimensional hidden factors
Figure BDA00018825326700000821
And
Figure BDA00018825326700000822
and constructing a regression model
Figure BDA0001882532670000089
The model $f(\cdot)$ can be a linear, nonlinear, or nonparametric model; this embodiment uses a random forest algorithm with the number of decision trees set to 200;
(3) predicting and filling missing values
First, the row and column numbers of the missing values in $S_t$ are obtained, and the corresponding rows and columns are extracted from $S_t$ and from the adjacency matrix $A$; the results are denoted $S_t''$ and $A''$. The low-dimensional hidden factors $N_{A''} = [a_{A''}, b_{A''}, U_{A''}, V_{A''}]$ of $A''$ are then extracted. Substituting $N_{A''}$ into model (7) gives the corresponding low-dimensional hidden factors
Figure BDA00018825326700000810
Finally, obtain
Figure BDA00018825326700000811
Column mean of
Figure BDA00018825326700000812
And
Figure BDA00018825326700000813
column mean of
Figure BDA00018825326700000814
Substitution into
Figure BDA00018825326700000815
Deriving an ensemble mean
Figure BDA00018825326700000816
Then substituted into the following model
Figure BDA00018825326700000817
To obtain
Figure BDA00018825326700000818
According to the row and column numbers, the data in
Figure BDA00018825326700000819
are substituted into the corresponding positions of $S_t$ to obtain the completed inter-station travel speed matrix
Figure BDA00018825326700000820
C. Inter-site travel speed prediction based on network vector partial linear autoregressive model
The present embodiment adopts a network vector partial linear autoregressive model of
Figure BDA0001882532670000091
Wherein,
Figure BDA0001882532670000092
representing the influence of the nonlinear variable (characteristic variables such as the number of public facilities and the degree of congestion which are independent of time) on the dependent variable,
Figure BDA0001882532670000093
in (1)
Figure BDA0001882532670000094
represents the covariate vector associated with station i and station j; in $g(z_{ij}\gamma)$, $\gamma = (\gamma_1, \gamma_2)^T$ are the covariate coefficients or node effect coefficients,
Figure BDA0001882532670000095
represents the total number of nodes connected to node i; in the model,
Figure BDA0001882532670000096
Representing the average effect of other sites k on site i at time t-1, in the model
Figure BDA0001882532670000097
The influence of the traveling speed at the moment before the road section from the station i to the station j on the current traveling speed is shown, namely the dependent variable at the time t-1 has influence on the value of the dependent variable at the time t,
Figure BDA00018825326700000915
is an error term that is independent of the covariate $z_{ij}$ and follows a normal distribution; its expectation and variance are respectively
Figure BDA0001882532670000098
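The structure of the model described above (a nonlinear covariate term, a network term averaging the previous speeds of connected units, and an autoregressive term) can be sketched as a one-step prediction. This is a simplified illustration under stated assumptions: each road segment is treated as one unit with a hypothetical neighbour list, the link function `g` is supplied by the caller, the error term is omitted, and all names are invented for the example.

```python
def predict_speed(prev_speed, neighbours, z, gamma, beta, g):
    """One-step prediction for every unit u (assumed form):

        s_u(t) = g(z_u . gamma) + beta1 * mean_{k in N(u)} s_k(t-1) + beta2 * s_u(t-1)

    prev_speed : dict unit -> travel speed at time t-1
    neighbours : dict unit -> list of connected units (hypothetical structure)
    z          : dict unit -> covariate vector (e.g. POI count, congestion level)
    gamma      : covariate coefficients
    beta       : (beta1, beta2) network and autoregressive coefficients
    g          : callable link function applied to the single index z_u . gamma
    """
    beta1, beta2 = beta
    pred = {}
    for u, s_prev in prev_speed.items():
        # Single index: inner product of the covariates with gamma.
        index = sum(zk * gk for zk, gk in zip(z[u], gamma))
        # Network term: average previous speed over the connected units.
        nbrs = neighbours.get(u, [])
        network = sum(prev_speed[k] for k in nbrs) / len(nbrs) if nbrs else 0.0
        pred[u] = g(index) + beta1 * network + beta2 * s_prev
    return pred
```

In practice `g`, `gamma`, and `beta` would come from the estimation procedure; here they are plain inputs so that the roles of the three terms stay visible.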
Let $\beta = (\beta_1, \beta_2)^T$,
Figure BDA0001882532670000099
The model (9) is rewritten as:
Figure BDA00018825326700000910
Let $\mu_i = z_{ij}\gamma$,
Figure BDA00018825326700000911
The following can be obtained:
Figure BDA00018825326700000912
The unknown parameter $\xi = (\gamma^T, \beta^T)^T$ is estimated in the following steps:
(1) estimate g (·)
For a given
Figure BDA00018825326700000913
The objective function model is minimized using a local linear regression method as follows:
Figure BDA00018825326700000914
wherein,
Figure BDA0001882532670000101
$K(\cdot)$ is a kernel function and $h$ is the bandwidth; $K(\cdot)$ is a bounded, non-negative, compactly supported density function that is symmetric about 0 and Lipschitz continuous.
Obtaining an estimator:
Figure BDA0001882532670000102
wherein,
Figure BDA0001882532670000103
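Local linear regression with a kernel of the kind described above can be sketched generically as follows. The sketch is not the patent's estimator: it smooths a response `y` against a scalar index `u` only, and the Epanechnikov kernel is chosen here as one example of a bounded, non-negative, compactly supported, symmetric, Lipschitz continuous density; the function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: a density supported on [-1, 1], symmetric about 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(u0, u, y, h, kernel=epanechnikov):
    """Local linear estimate of g(u0) from samples (u_i, y_i) with bandwidth h.

    Minimises sum_i K_h(u_i - u0) * (y_i - a - b * (u_i - u0))^2 over (a, b)
    and returns the intercept a, which estimates g(u0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for ui, yi in zip(u, y):
        d = ui - u0
        w = kernel(d / h) / h  # K_h(d) = K(d / h) / h
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det == 0.0:
        raise ValueError("not enough distinct points inside the bandwidth")
    # Closed-form solution of the 2 x 2 weighted normal equations.
    return (s2 * t0 - s1 * t1) / det
```

A useful sanity check is that a local linear fit reproduces an exactly linear relationship, whatever the bandwidth, as long as at least two distinct points carry weight.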
(2) estimate $\xi$
In obtaining (1)
Figure BDA0001882532670000104
Then, it is obtained by minimizing the following profile least square function
Figure BDA0001882532670000105
Figure BDA0001882532670000106
Get a pair
Figure BDA0001882532670000107
Repeating the step (1) to obtain
Figure BDA0001882532670000108
Then repeating the step (2) again to obtain
Figure BDA0001882532670000109
Continuously repeat until
Figure BDA00018825326700001010
D. Bus arrival time prediction and correction
To correct the interference that a longer prediction horizon causes in the prediction result, this embodiment adds a correction coefficient $\alpha$ ($0 \le \alpha \le 1$) that adjusts the prediction result and thereby improves the prediction accuracy.
The total time interval from the station i to the station j is l, the travel time data is extracted from the intelligent transportation system, and a vector with l dimension is formed
Figure BDA00018825326700001011
Further, will
Figure BDA00018825326700001012
Split into two h-dimensional vectors
Figure BDA00018825326700001013
And
Figure BDA00018825326700001014
wherein,
Figure BDA00018825326700001015
then according to the model (2) will
Figure BDA00018825326700001016
is converted into an inter-station travel speed vector, which is substituted into the inter-station travel speed prediction model to obtain the inter-station travel speed estimation vector
Figure BDA0001882532670000111
And calculating the travel time between the stations according to the following formula
Figure BDA0001882532670000112
Deriving travel time prediction vectors
Figure BDA0001882532670000113
Finally, let
Figure BDA0001882532670000114
and find the optimal correction coefficient $\alpha_0$ according to formula (15)
Figure BDA0001882532670000115
The travel time correction model output from the station i to the station j in the time period t is
Figure BDA0001882532670000116
Data are obtained from the intelligent transportation system, and the travel times of all road segments between station m and station n are calculated according to the steps above and summed. Finally, the sum is added to the departure time at station m and output, which completes the bus arrival time prediction.
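The final accumulation step can be sketched as below: each segment's travel time is estimated as distance divided by predicted speed, the segment times are summed, and the sum is added to the departure time. The function name, the units (metres and metres per second), and the omission of the correction coefficient are assumptions made to keep the example short.

```python
from datetime import datetime, timedelta


def predict_arrival(departure, distances, speeds):
    """Predicted arrival time at the target station.

    departure : datetime at which the bus leaves station m
    distances : lengths of the consecutive segments from m to n (metres)
    speeds    : predicted travel speeds on those segments (metres per second)
    The correction step with the coefficient alpha is not applied here.
    """
    total_seconds = sum(d / s for d, s in zip(distances, speeds))
    return departure + timedelta(seconds=total_seconds)
```

For instance, two segments of 600 m and 900 m at predicted speeds of 5.0 and 7.5 m/s take 120 s each, so a bus leaving at 08:00:00 is predicted to arrive at 08:04:00.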

Claims (1)

1. A bus arrival time prediction method based on a network vector autoregressive model is characterized by mainly comprising the following steps:
A. Data preprocessing for the intelligent transportation system: taking bus stops and intersections as nodes, constructing an urban traffic network based on urban road traffic information and bus route plans, and extracting and deriving data on the number of public facilities, the travel speed between stations, and the traffic congestion degree from the intelligent transportation system database; the method specifically comprises the following steps:
(1) building an inter-site travel relationship network
First, intersections are placed as nodes in a geodetic coordinate system according to their longitude and latitude and are connected according to the urban road plan, giving a network $G = (V, L)$; here $V = \{v_1, v_2, \ldots, v_n\}$ is the set of intersections, $n = |V|$ is the total number of intersections, and $L = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V,\ 1 \le h, l \le n\}$ is the set of links between intersections; the nodes of $G$ are fixed at their longitude and latitude so as to reflect the urban road conditions; then bus stations are added to the urban road traffic network according to the bus route plan; the node set is redefined as $V = V_1 \cup V_2$, where $V_1$ is the set of intersections and $V_2$ is the set of stations; the longitude and latitude of each station, the distances between intersections, and the distance from each station to its nearest intersection are obtained from the intelligent transportation system database; further, the travel distance between stations is calculated; when two stations are both adjacent to the same intersection, the exact positions of the two stations relative to the intersection are determined, and the azimuth $Az_{ij}$ ($0° < Az_{ij} < 360°$) between nodes $i$ and $j$ is obtained from the following model:
$$\cos(c) = \cos(90^\circ - W_i)\cos(90^\circ - W_j) + \sin(90^\circ - W_i)\sin(90^\circ - W_j)\cos(J_i - J_j)$$
Figure FDA0002853901640000011
where $W_i$ and $J_i$ are the latitude and longitude of node $i$, and $W_j$ and $J_j$ are the latitude and longitude of node $j$; the distance between stations can be calculated from the included angle relationship in three cases: when the azimuths of station A and station B relative to the intersection are unequal, the distance $D_{AB}$ between station A and station B is the sum of the distances from the two stations to the intersection; when the azimuths of station A and station B relative to the intersection are unequal but sum to 360°, $D_{AB}$ is the sum of the distances from the two stations to the intersection; when the azimuths of station A and station B relative to the intersection are equal, $D_{AB}$ is the absolute value of the difference between the distances from the two stations to the intersection; the distance between stations is calculated according to these three cases; then the inter-station travel distance matrix $D$ is generated according to the bus route plan;
finally, the inter-station travel relationship network $G_{Bus} = (V_{Bus}, L_{Bus})$ is extracted from $G$, where $V_{Bus} = \{v_1, \ldots, v_N\}$ is the set of bus stations, $N = |V_{Bus}|$ is the number of bus stations in the inter-station travel relationship network, and $L_{Bus} = \{\langle v_h, v_l \rangle \mid v_h, v_l \in V_{Bus},\ 1 \le h, l \le N\}$ is the set of adjacent road segments between stations; the adjacency matrix of $G_{Bus}$ is generated according to the inter-station travel distance matrix as $A = (a_{ij}) \in R^{N \times N}$, where $a_{ij} = 1$ when $\langle v_i, v_j \rangle \in L_{Bus}$ and $a_{ij} = 0$ otherwise;
(2) Extracting travel speeds between stops
Extracting the travel time from the intelligent transportation system database, converting the travel time into travel speed, and modeling and predicting the travel speed; taking the initial time of the extracted data as the starting time, and dividing it into $T$ time periods at a fixed interval; $Y_t \in R^{N \times N}$ is the travel time matrix of time period $t$, whose elements
Figure FDA0002853901640000021
represent the average travel time from station i to station j in time period $t$; therefore,
Figure FDA0002853901640000022
a high-dimensional vector of a T dimension is formed; given travel time between certain sites
Figure FDA0002853901640000023
The travel speed between stations can be obtained according to the following model
Figure FDA0002853901640000024
Sequentially converting the travel time matrix among the sites into a travel speed matrix among the sites to generate a high-dimensional travel speed vector
Figure FDA0002853901640000025
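The conversion from travel time to travel speed can be sketched as follows. Model (2) is only available as an image here, so the sketch assumes the usual relation speed = distance / time; the function name `time_to_speed` and the use of NaN for unobserved entries are likewise assumptions.

```python
import numpy as np

def time_to_speed(D, Y_t):
    """Convert a travel time matrix Y_t into a travel speed matrix.

    D: N x N inter-station travel distance matrix.
    Y_t: N x N average travel times for period t (NaN or 0 where unobserved).
    Entries without a valid travel time are returned as NaN.
    """
    D = np.asarray(D, dtype=float)
    Y_t = np.asarray(Y_t, dtype=float)
    S_t = np.full(Y_t.shape, np.nan)
    valid = np.isfinite(Y_t) & (Y_t > 0)
    S_t[valid] = D[valid] / Y_t[valid]  # assumed form of model (2)
    return S_t
```

Leaving unobserved entries as NaN keeps the missing positions visible for the imputation step B.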
(3) Extracting related covariates
Let
Figure FDA0002853901640000026
denote the number of public facilities near the trip from station i to station j;
the congestion degree of the travel road section is evaluated using bus GPS data; given two adjacent stations i and j, the number of buses between them in time period t is counted from the GPS data as
Figure FDA0002853901640000031
and, based on the historical bus-count sequence of the road section
Figure FDA0002853901640000032
with its minimum (count_ij^min), first quartile (count_ij^0.25), median (count_ij^median), third quartile (count_ij^0.75), and maximum (count_ij^max), the degree of traffic congestion is classified into four levels:
Figure FDA0002853901640000033
where 1 represents unobstructed, 2 represents relatively unobstructed, 3 represents relatively congested, and 4 represents congested; the covariate matrix Z can then be represented as
Z = (Z_POI, Z_TPI)^T (4);
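The quartile-based congestion levels can be illustrated with a short sketch. Model (3) is only available as an image here, so the interval boundaries below (each level closed on the right) are an assumption, as is the function name `congestion_level`.

```python
import numpy as np

def congestion_level(count_t, history):
    """Map the bus count of period t to a level 1 (unobstructed) .. 4 (congested).

    history: historical bus counts for the same road section.
    """
    q1, median, q3 = np.percentile(history, [25, 50, 75])
    if count_t <= q1:
        return 1  # between the minimum and the first quartile
    if count_t <= median:
        return 2  # between the first quartile and the median
    if count_t <= q3:
        return 3  # between the median and the third quartile
    return 4      # above the third quartile
```

With a history of counts 1 through 9, the quartiles are 3, 5, and 7, so a count of 4 falls in level 2.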
B. Imputing missing inter-station travel speeds based on singular value decomposition: for a time period with missing travel speeds, the low-dimensional hidden factors of that period's inter-station travel speed matrix and of the urban traffic network are extracted, a regression relation between the hidden factors is constructed, and the travel speeds of the corresponding road sections are predicted; specifically comprising the following steps:
for the travel speed matrix S_t ∈ R^{N×N} of time period t, the low-dimensional hidden factors of the travel speed matrix and of the adjacency matrix of the inter-station travel relationship network are extracted, and a regression relation between the hidden factors is constructed to fill the missing data in S_t, specifically comprising the following three steps:
(1) Extracting low-dimensional hidden factors
The low-dimensional hidden factors are extracted using a latent space network model, namely
Figure FDA0002853901640000035
Wherein,
Figure FDA0002853901640000036
E_t is an n × n white noise matrix, μ_t is the overall mean, a_t and b_t represent the sending and receiving effects of the nodes, and U_t and V_t represent the interaction effects; together these parameters constitute the low-dimensional hidden factors
Figure FDA0002853901640000034
which can be estimated by the SVD model
Figure FDA0002853901640000041
Figure FDA0002853901640000042
Figure FDA0002853901640000043
Wherein,
Figure FDA0002853901640000044
and
Figure FDA0002853901640000045
are N × k non-singular matrices,
Figure FDA0002853901640000046
is a k × k diagonal matrix whose diagonal elements are non-zero,
Figure FDA0002853901640000047
and the n-dimensional vectors
Figure FDA0002853901640000048
are the column means of
Figure FDA0002853901640000049
and
Figure FDA00028539016400000410
respectively; the low-dimensional hidden factors of the travel speed matrix S_t are thus extracted as
Figure FDA00028539016400000411
and the low-dimensional hidden factors of the adjacency matrix A of the inter-station travel relationship network, N_A = [a_A, b_A, U_A, V_A], are then extracted;
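The idea of splitting a matrix into an overall mean, node effects, and a rank-k interaction term can be sketched with NumPy. Models (5) and (6) are only available as images here, so this is a generic additive-plus-multiplicative decomposition rather than the patent's exact estimator; the names `hidden_factors` and `reconstruct`, and the use of row and column means for the node effects, are assumptions.

```python
import numpy as np

def hidden_factors(S, k):
    """Decompose S into mu, sending effects a, receiving effects b, and U, V."""
    S = np.asarray(S, dtype=float)
    mu = S.mean()                       # overall mean
    a = S.mean(axis=1) - mu             # sending (row) effects, assumed form
    b = S.mean(axis=0) - mu             # receiving (column) effects, assumed form
    R = S - mu - a[:, None] - b[None, :]
    U_full, d, Vt = np.linalg.svd(R, full_matrices=False)
    U = U_full[:, :k] * np.sqrt(d[:k])  # N x k interaction factors
    V = Vt[:k, :].T * np.sqrt(d[:k])    # N x k interaction factors
    return mu, a, b, U, V

def reconstruct(mu, a, b, U, V):
    """Rebuild the matrix from its hidden factors."""
    return mu + a[:, None] + b[None, :] + U @ V.T
```

With k equal to N the reconstruction is exact; a smaller k keeps only the dominant interaction structure.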
(2) Construction of a model of regression relationships between low-dimensional hidden factors
First obtain the row and column numbers of S_t at which missing values exist, then delete the corresponding rows and columns from S_t and from the adjacency matrix A, denoting the results S_t' and A'; then extract their low-dimensional hidden factors
Figure FDA00028539016400000412
And
Figure FDA00028539016400000413
and constructing a regression model
Figure FDA00028539016400000414
The model f(·) is one or more of a linear model, a nonlinear model, or a nonparametric model; here a random forest algorithm is adopted, with the number of decision trees set to 200;
(3) predicting and filling missing values
First obtain the row and column numbers of S_t at which missing values exist, then extract the corresponding rows and columns of S_t and of the adjacency matrix A, denoted S_t'' and A''; extract the low-dimensional hidden factors of A'', N_A'' = [a_A'', b_A'', U_A'', V_A'']; substitute N_A'' into model (7) to obtain the corresponding low-dimensional hidden factors
Figure FDA00028539016400000415
Finally, for
Figure FDA00028539016400000416
obtain the column mean
Figure FDA00028539016400000417
and for
Figure FDA00028539016400000418
the column mean
Figure FDA00028539016400000419
substitute them into
Figure FDA00028539016400000420
to derive the overall mean
Figure FDA00028539016400000421
and then substitute into the following model
Figure FDA00028539016400000422
to obtain
Figure FDA00028539016400000423
According to the row and column numbers, the data at the corresponding positions of
Figure FDA00028539016400000424
are substituted into S_t to obtain the inter-station travel speed matrix
Figure FDA0002853901640000051
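The final substitution step, in which predicted values replace only the missing positions of S_t, can be sketched as follows. Representing missing entries as NaN and the function name `fill_missing` are assumptions for illustration.

```python
import numpy as np

def fill_missing(S_t, S_hat):
    """Replace the missing (NaN) entries of S_t with the predictions in S_hat."""
    S_t = np.asarray(S_t, dtype=float)
    S_hat = np.asarray(S_hat, dtype=float)
    # Observed speeds are kept; only the missing positions take predicted values.
    return np.where(np.isnan(S_t), S_hat, S_t)
```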
C. Inter-station travel speed prediction based on a network vector partially linear autoregressive model: historical data are learned with an extended network vector space autoregressive model so as to predict the inter-station travel speed in a future time period; specifically comprising the following steps:
the network vector partially linear autoregressive model adopted is
Figure FDA0002853901640000052
Figure FDA0002853901640000053
wherein g(z_ij γ) represents the time-independent influence of the number of public facilities, the degree of congestion, and the nonlinear variables on the dependent variable; in g(z_ij γ),
Figure FDA0002853901640000054
represents the covariate vector associated with station i and station j, and γ = (γ_1, γ_2)^T are the covariate coefficients or node effect coefficients;
Figure FDA0002853901640000055
represents the total number of other nodes connected to node i; in the model,
Figure FDA0002853901640000056
represents the average effect of the other stations k on station i at time t-1; in the model,
Figure FDA0002853901640000057
represents the influence of the travel speed at the previous moment on the road section from station i to station j on the current travel speed, that is, the value of the dependent variable at time t-1 influences its value at time t;
Figure FDA0002853901640000058
is the error term, which is independent of the covariate z_ij and follows a normal distribution; the expectation and variance of
Figure FDA0002853901640000059
are respectively
Figure FDA00028539016400000510
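The structure described for model (9), a covariate term plus a neighbor-average term plus an own-lag term, can be illustrated with a one-step prediction. Model (9) is only available as an image here, so the sketch assumes the generic form y_t = g + β_1 W y_{t-1} + β_2 y_{t-1}, where W is the row-normalized adjacency matrix; the function name `nar_predict` and the vector layout are hypothetical.

```python
import numpy as np

def nar_predict(y_prev, A, g, beta1, beta2):
    """One-step prediction y_t = g + beta1 * W y_{t-1} + beta2 * y_{t-1}.

    y_prev: values at time t-1, one entry per node.
    A: adjacency matrix; g: covariate effect per node (e.g. g(z * gamma)).
    """
    A = np.asarray(A, dtype=float)
    y_prev = np.asarray(y_prev, dtype=float)
    n = A.sum(axis=1)  # n_i: number of nodes connected to node i
    # Row-normalize A; rows with no neighbors stay all zero.
    W = np.divide(A, n[:, None], out=np.zeros_like(A), where=n[:, None] > 0)
    return np.asarray(g, dtype=float) + beta1 * (W @ y_prev) + beta2 * y_prev
```

The error term is omitted because a point prediction uses its expectation.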
Let
Figure FDA00028539016400000511
The model (9) is rewritten as:
Figure FDA00028539016400000512
Let μ_i = z_ij γ,
Figure FDA00028539016400000513
The following can be obtained:
Figure FDA00028539016400000514
The unknown parameter ξ = (γ^T, β^T)^T is estimated by the following steps:
(1) Estimate g(·)
For a given
Figure FDA0002853901640000061
The following objective function model is minimized using local linear regression:
Figure FDA0002853901640000062
wherein,
Figure FDA0002853901640000063
K(·) is a kernel function and h is the bandwidth; K(·) is a bounded, non-negative, compactly supported density function that is symmetric about 0 and Lipschitz continuous.
The resulting estimator is:
Figure FDA0002853901640000064
wherein:
Figure FDA0002853901640000065
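Local linear regression with a kernel of the kind described (bounded, non-negative, compactly supported, symmetric about 0, Lipschitz continuous) can be sketched as below. The objective (12) and estimator (13) are only available as images here, so this is the textbook weighted least squares form; the Epanechnikov kernel and the function names are choices made for this illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear(x0, x, y, h):
    """Local linear estimate of g(x0) from samples (x, y) with bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = epanechnikov((x - x0) / h)                # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w                                 # X^T W without forming W
    coef = np.linalg.solve(XtW @ X, XtW @ y)      # weighted least squares
    return coef[0]                                # intercept estimates g(x0)
```

The solve step fails if fewer than two distinct sample points fall inside the kernel window, so the bandwidth h has to be large enough for the data density.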
(2) Estimate ξ
After obtaining
Figure FDA0002853901640000066
in step (1), ξ is estimated by minimizing the following profile least squares function
Figure FDA0002853901640000067
Figure FDA0002853901640000068
This yields
Figure FDA0002853901640000069
Step (1) is then repeated to obtain
Figure FDA00028539016400000610
and step (2) is repeated again to obtain
Figure FDA00028539016400000611
The iteration continues until
Figure FDA00028539016400000612
D. Predicting and correcting the bus arrival time: the inter-station travel time is estimated from the inter-station distance and the predicted travel speed, the travel times of the road sections between the bus and the target station are then accumulated, and the result is corrected with reference to historical data; the specific steps are as follows:
the total number of time periods from station i to station j is l; the travel time data are extracted from the intelligent transportation system to form an l-dimensional vector
Figure FDA00028539016400000613
Then
Figure FDA00028539016400000614
is split into two h-dimensional vectors
Figure FDA0002853901640000071
and
Figure FDA0002853901640000072
wherein
Figure FDA0002853901640000073
then, according to model (2),
Figure FDA0002853901640000074
is converted into an inter-station travel speed vector, which is substituted into the inter-station travel speed prediction model to obtain the inter-station travel speed estimation vector
Figure FDA0002853901640000075
and the inter-station travel time is calculated according to the following formula
Figure FDA0002853901640000076
giving the travel time prediction vector
Figure FDA0002853901640000077
Finally, let
Figure FDA0002853901640000078
Figure FDA0002853901640000079
and find the optimal correction coefficient α_0 according to model (15)
Figure FDA00028539016400000710
The corrected travel time from station i to station j in time period t, output by the correction model, is
Figure FDA00028539016400000711
that is, the bus arrival time.
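The accumulation and correction in step D can be illustrated as follows. Models (14) to (16) are only available as images here, so the sketch assumes travel time = distance / speed per road section and a hypothetical correction that blends predicted and historical times as α·predicted + (1 − α)·historical, with α fitted by least squares; all function names are illustrative.

```python
import numpy as np

def segment_times(distances, speeds):
    """Travel time of each road section, assuming time = distance / speed."""
    return np.asarray(distances, dtype=float) / np.asarray(speeds, dtype=float)

def fit_alpha(pred, hist, actual):
    """Least-squares alpha for the assumed blend alpha*pred + (1-alpha)*hist."""
    pred, hist, actual = (np.asarray(v, dtype=float) for v in (pred, hist, actual))
    diff = pred - hist
    denom = float(diff @ diff)
    if denom == 0.0:
        return 1.0  # prediction and history coincide; alpha is arbitrary
    return float(diff @ (actual - hist)) / denom

def corrected_time(pred, hist, alpha):
    """Corrected travel time under the assumed blending form."""
    return alpha * pred + (1.0 - alpha) * hist
```

Summing `segment_times` over the road sections between the bus and the target station gives the uncorrected arrival time; for sections of 600 m and 900 m at 5 m/s and 6 m/s this is 120 s + 150 s = 270 s.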
CN201811430278.7A 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model Expired - Fee Related CN109584552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811430278.7A CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811430278.7A CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Publications (2)

Publication Number Publication Date
CN109584552A CN109584552A (en) 2019-04-05
CN109584552B true CN109584552B (en) 2021-04-30

Family

ID=65925127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811430278.7A Expired - Fee Related CN109584552B (en) 2018-11-28 2018-11-28 Bus arrival time prediction method based on network vector autoregressive model

Country Status (1)

Country Link
CN (1) CN109584552B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114365205A (en) * 2019-09-19 2022-04-15 北京嘀嘀无限科技发展有限公司 System and method for determining estimated time of arrival in online-to-offline service
CN111464937B (en) * 2020-03-23 2021-06-22 北京邮电大学 Positioning method and device based on multipath error compensation
CN111667689B (en) * 2020-05-06 2022-06-03 浙江师范大学 Method, device and computer device for predicting vehicle travel time
CN112632462B (en) * 2020-12-22 2022-03-18 天津大学 Synchronous measurement missing data restoration method and device based on time sequence matrix decomposition
CN113239198B (en) * 2021-05-17 2023-10-31 中南大学 Subway passenger flow prediction method and device and computer storage medium
CN113470365B (en) * 2021-09-01 2022-01-14 北京航空航天大学杭州创新研究院 Bus arrival time prediction method oriented to missing data
CN113487872B (en) * 2021-09-07 2021-11-16 南通飞旋智能科技有限公司 Bus transit time prediction method based on big data and artificial intelligence
CN114446039B (en) * 2021-12-31 2023-05-19 深圳云天励飞技术股份有限公司 Passenger flow analysis method and related equipment
CN115018454B (en) * 2022-05-24 2024-04-05 北京交通大学 Passenger travel time value calculation method based on travel mode identification

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104064028A (en) * 2014-06-23 2014-09-24 银江股份有限公司 Bus arrival time predicting method and system based on multivariate information data
CN105243868A (en) * 2015-10-30 2016-01-13 青岛海信网络科技股份有限公司 Bus arrival time forecasting method and device
CN108831181A (en) * 2018-05-04 2018-11-16 东南大学 A kind of method for establishing model and system for Forecasting of Travel Time for Public Transport Vehicles

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102074124B (en) * 2011-01-27 2013-05-08 山东大学 Dynamic bus arrival time prediction method based on support vector machine (SVM) and H-infinity filtering
CN102708701B (en) * 2012-05-18 2015-01-28 中国科学院信息工程研究所 System and method for predicting arrival time of buses in real time
US10225161B2 (en) * 2016-10-31 2019-03-05 Accedian Networks Inc. Precise statistics computation for communication networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
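Network vector autoregression; Hansheng Wang et al.; Social Science Electronic Publishing; 20161231; see full text, pages 1-30 *
Research on bus arrival time prediction methods; Zhao Yanqing; China Master's Theses Full-text Database; 20170615; Chapter 3 *
Formal description of dynamic networking operations of a multi-subnet composite complex network model based on vector space; Sui Yi et al.; Journal of Software; 20151231; full text *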

Also Published As

Publication number Publication date
CN109584552A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109584552B (en) Bus arrival time prediction method based on network vector autoregressive model
WO2023029234A1 (en) Method for bus arrival time prediction when lacking data
CN108629979B (en) Congestion prediction algorithm based on history and peripheral intersection data
Velaga et al. Developing an enhanced weight-based topological map-matching algorithm for intelligent transport systems
CN112489426B (en) Urban traffic flow space-time prediction scheme based on graph convolution neural network
US7953544B2 (en) Method and structure for vehicular traffic prediction with link interactions
CN107103392A (en) A kind of identification of bus passenger flow influence factor and Forecasting Methodology based on space-time Geographical Weighted Regression
CN110274609B (en) Real-time path planning method based on travel time prediction
CN105809962A (en) Traffic trip mode splitting method based on mobile phone data
CN103295414A (en) Bus arrival time forecasting method based on mass historical GPS (global position system) trajectory data
Hao et al. Modal activity-based stochastic model for estimating vehicle trajectories from sparse mobile sensor data
CN105512741A (en) Bus passenger traffic combined prediction method
CN111695225A (en) Bus composite complex network model and bus scheduling optimization method thereof
CN110414795B (en) Newly-increased high-speed rail junction accessibility influence method based on improved two-step mobile search method
CN112633602B (en) Traffic congestion index prediction method and device based on GIS map information
CN109269516A (en) A kind of dynamic route guidance method based on multiple target Sarsa study
Kannan et al. Predictive indoor navigation using commercial smart-phones
CN110084491A (en) Based on the optimal air route blockage percentage appraisal procedure for passing through path under the conditions of convection weather
CN109064750B (en) Urban road network traffic estimation method and system
CN109033239A (en) A kind of road network structure generating algorithm based on Least-squares minimization
Chen et al. Local path searching based map matching algorithm for floating car data
CN111008730B (en) Crowd concentration prediction model construction method and device based on urban space structure
CN113903171B (en) Vehicle crowd sensing node optimization method based on spatial-temporal characteristics of highway network
Zhu et al. Large-scale travel time prediction for urban arterial roads based on Kalman filter
CN116597666A (en) Urban road network non-detector road section flow real-time estimation method based on transfer learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210430