CN113470365A - Bus arrival time prediction method oriented to missing data - Google Patents

Info

Publication number
CN113470365A
Authority
CN
China
Prior art keywords
bus
time
geographic
information
route
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111022557.1A
Other languages
Chinese (zh)
Other versions
CN113470365B (en)
Inventor
马佳曼
罗喜伶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Innovation Research Institute of Beihang University
Original Assignee
Hangzhou Innovation Research Institute of Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Innovation Research Institute of Beihang University filed Critical Hangzhou Innovation Research Institute of Beihang University
Priority to CN202111022557.1A priority Critical patent/CN113470365B/en
Publication of CN113470365A publication Critical patent/CN113470365A/en
Priority to PCT/CN2021/132828 priority patent/WO2023029234A1/en
Application granted granted Critical
Publication of CN113470365B publication Critical patent/CN113470365B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125Traffic data processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0137Measuring and analyzing of parameters relative to traffic conditions for specific applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a bus arrival time prediction method oriented to missing data. The invention first designs a density-based clustering method to find important geographic positions, uses the mined geographic positions to represent the geographic structure of each route and form the node information, and constructs weighted edge information reflecting traffic importance according to the similarity of the geographic structures, thereby forming a public transport network graph. Based on the public transport network graph, a multi-head spatio-temporal graph attention network prediction model is built to learn the correlation among bus routes from the perspectives of space and time; the spatial attention module is a masked multi-head graph attention module, and the temporal attention module is built from an LSTM layer and a Transformer layer. The invention can accurately and effectively learn and predict the travel time of each route in the public transport network: it improves the accuracy of arrival time prediction for bus routes with few current travel records, and it can also provide predicted travel times for bus routes that are still in the design stage and have no historical records.

Description

Bus arrival time prediction method oriented to missing data
Technical Field
The invention relates to the field of bus arrival time prediction, and in particular to a bus arrival time prediction method oriented to missing data, which provides accurate arrival time prediction for bus routes affected by missing data, for example because of uncertain departure times or GPS equipment problems.
Background
The bus network is of great importance to rapidly developing urban traffic, and because buses are economical and environmentally friendly, they remain a main mode of urban travel. The main factors that discourage passengers from choosing the bus are long waiting times and uncertain journey times, so accurately predicting bus arrival times is very important for solving this problem. Accurate prediction may also help reduce traffic congestion and support other integrated intelligent transportation applications, such as trip planning.
Existing methods for predicting bus arrival or travel time depend on a large number of historical travel records and on wide coverage of the public transport routes. They therefore cannot, or can only with difficulty, accurately predict the arrival time of buses with missing data, mainly for the following reasons:
1) data sparsity: for suburban routes (few records) and for routes under consideration or design (no records), the main problem is that a large amount of data is missing, so the exact operating pattern cannot be learned from history and the arrival time cannot be predicted;
2) independent travel patterns: from the spatial perspective, bus routes in an urban bus network should not repeat or overlap too much, so as to enlarge the service coverage area, which makes it difficult for routes to learn operating patterns from one another. Even if some routes share parts of the same road, the routes in the traffic network have relatively different and independent travel patterns because of different demand, different numbers of passengers, and different stop sequences and positions;
3) complex traffic conditions: the travel patterns of public transport routes are more complex than those of private vehicles because they also carry traffic information unique to public transport. In addition to being influenced by road length, direction and the number of intersections, public transport routes are also influenced by departure times and by the number and locations of stops.
Therefore, it is necessary and important to propose a bus arrival time prediction method for routes with missing historical data.
Disclosure of Invention
The invention provides a bus arrival time prediction method oriented to missing data, which can accurately and effectively learn and predict the travel time of each route in a public transport network, improve the accuracy of arrival time prediction for bus routes with few current travel records, and provide predicted travel times for bus routes in the design stage that have no historical records.
The technical scheme of the invention is as follows:
The invention provides a bus arrival time prediction method oriented to missing data, which comprises the following steps:
1) integrating historical bus trajectory GPS information, public transport stop position information and city point-of-interest (POI) data (the data comprise building longitude and latitude and the urban function category of each building); extracting the important geographic positions in the public transport network from the integrated data through a density-based clustering method, representing the geographic structure of each public transport route by the obtained geographic positions, and performing node extraction and edge extraction to construct the public transport network graph;
2) establishing, according to the constructed public transport network graph, a multi-head spatio-temporal graph attention network prediction model to learn the correlation between bus routes from the spatial and temporal perspectives; the model comprises a spatial attention module and a temporal attention module connected in sequence; the spatial attention module is a masked multi-head graph attention network module, in which a graph attention network is applied to learn the spatial dependencies among different nodes and masked multi-head graph attention blocks are used to learn the global and local spatial dependencies among bus routes under different conditions; the temporal attention module comprises an LSTM layer and a Transformer layer, used for local and global temporal dependency learning respectively;
3) predicting the arrival time of buses with missing data by using the multi-head spatio-temporal graph attention network prediction model.
Further, extracting the important geographic positions in the public transport network from the integrated data through the density-based clustering method specifically comprises:
setting the parameters of the density-based clustering method according to the number of GPS points with a speed of 0 in the integrated data, obtaining the important geographic positions in the public transport network through the density-based clustering method, and determining the weight of each geographic position according to the number of GPS points it contains.
Further, representing the geographic structure of each route by the obtained important geographic positions specifically means: the weighted positions of intersections and stops are used to represent the geographic structure of each bus route.
Further, the node extraction specifically comprises: selecting a geographic position between every two adjacent stops of each bus route, and using the information of the two stops and of the selected geographic position between them as the information of a node s, which represents the road segment between the two adjacent stops.
Further, selecting a geographic position between every two adjacent stops to form the node information specifically comprises: selecting, on the bus route, the intersection that is farthest from the two adjacent stops and has the largest flow of people, and forming the node information from the intersection information and the geographic position information of the two adjacent stops.
Further, the edge extraction specifically comprises:
establishing an edge graph to represent the relationships between road segments, wherein the edge weights encode the strength of spatial correlation or similarity; and constructing three adjacency matrices A, which respectively represent the geographic structure similarity relationship, the distance relationship among bus routes and the urban functional zoning relationship, to obtain three public transport network graphs representing different relationships.
Preferably, the method for constructing the adjacency matrices A specifically comprises: establishing a geographic structure similarity edge graph by extracting the position information and segment length information of each node from the three important geographic positions contained in the extracted node, comparing similarity with the DTW algorithm, and establishing the geographic structure similarity adjacency matrix $A_g$ between the nodes; then, according to the building category information contained in the city point-of-interest data near the three geographic positions in each node, extracting the urban function category of each node and, according to the similarity of urban functions, establishing the adjacency matrix $A_f$ of the urban functional zoning relationship among the nodes; finally, designing a third adjacency matrix $A_d$, the geographic distance adjacency matrix, according to the distance relationship between the bus routes; the edge weights in the adjacency matrices are normalized to the range 0 to 1.
Further, step 3) specifically comprises: for a bus route with missing data, using the multi-head spatio-temporal graph attention network prediction model to learn the operating patterns of buses with complete historical data in the previous h time periods; and then, according to the similarity of the operating patterns, predicting the bus arrival time for the route with missing data.
Compared with the prior art, the method locates important travel locations (such as stops and intersections) on the routes by a density-based clustering method, based on the geographic structure of each public transport route and the spatio-temporal dependencies between routes, and constructs a weighted public transport network graph reflecting traffic importance from the mined locations. Based on the constructed traffic network graph, a multi-head attention neural network is proposed to learn the correlation among bus routes from the perspectives of space and time. In the spatial attention module, a masked multi-head graph attention network (multi-head GAT) is established, which can learn the importance of global and local routes from three views, namely urban function, route distance and bus route structure similarity, and effectively combines the various influencing factors when learning travel patterns. In the temporal attention module, long short-term memory (LSTM) and Transformer layers are proposed to learn bus operating patterns from the distant and recent past. By combining the spatial and temporal attention modules, the travel (arrival) times of bus routes with sparse or no historical data are accurately learned and inferred. The invention can not only improve the accuracy of arrival time prediction for bus routes with few current travel records, but also provide predicted inter-stop travel times for routes in the design stage that have no historical records.
Drawings
FIG. 1 is a flow chart of a bus arrival time prediction method oriented to missing data;
FIG. 2 is a pseudo-code diagram of a density-based clustering method according to an embodiment of the present invention;
FIG. 3 is a pseudo-code diagram of a method of selecting a geographic location between sites in an embodiment of the invention;
FIG. 4 is a graph of the experimental results of the method of the invention and of the comparative methods.
Detailed Description
The invention will be further illustrated and described with reference to specific embodiments.
The flow chart of the method of the invention is shown in FIG. 1. The method consists of two main parts: the construction of the public transport network graph and the construction of the multi-head graph attention network model. First, historical bus trajectory GPS information, public transport stop position information and city point-of-interest (POI) data are integrated; important geographic positions (including stops and intersections) are then discovered automatically by a density-based clustering method, and the mined geographic positions are used to represent the geographic structure of each route. Three public transport network graphs are constructed according to geographic structure similarity (such as the distance between bus stops and the number of intersections), the distance between bus routes, and the functional zoning of the city. Based on the constructed public transport network graphs, the invention establishes a multi-head spatio-temporal graph attention network prediction model, comprising a spatial attention module and a temporal attention module, to predict arrival times. In the spatial attention module, the invention designs masked multi-head graph attention blocks to learn the global and local spatial dependencies between bus routes under different conditions. The multi-head mechanism jointly learns the travel times of adjacent bus route segments that have complete historical travel records. The designed mask mechanism masks the weighted adjacency matrix formed by the different neighboring bus route segments, so that each attention block concentrates on the selected neighbors and the computation time is reduced. For example, when the target route has only a few, unstable historical records, the input historical travel time of the target route itself needs to be ignored.
In the temporal attention module, in order to accurately model the temporal influence of past historical data under different conditions (such as normal conditions and abnormal conditions caused by traffic congestion or weather), the invention builds the module from an LSTM layer and a Transformer layer so as to obtain global (distant-past) and local (recent) temporal patterns for travel time prediction. Through temporal and spatial attention, the operating patterns of buses with complete historical data in the previous $h$ time periods, $(X_{t-h}, \ldots, X_t)$, can be learned; according to the similarity of the operating patterns, the arrival time $X_{t+1}$ of buses lacking historical data is then predicted.
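The pairing of a history window $(X_{t-h}, \ldots, X_t)$ with the target $X_{t+1}$ can be illustrated with a short sketch. The function name `make_windows` and the sample travel times are hypothetical and are not taken from the patent.

```python
def make_windows(series, h):
    """Pair each window (X_{t-h}, ..., X_t) with the next value X_{t+1}."""
    samples = []
    for t in range(h, len(series) - 1):
        # The window holds h + 1 values: X_{t-h} up to and including X_t.
        samples.append((series[t - h:t + 1], series[t + 1]))
    return samples


# Hypothetical travel times (seconds) of one segment, one value per time period.
travel_times = [310, 305, 320, 400, 390, 330]
samples = make_windows(travel_times, h=2)
```

Each element of `samples` is one training pair; a model would consume the window and be scored against the paired target.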
The construction of the public transport network graph and the construction of the multi-head spatio-temporal graph attention network prediction model of the invention are explained below.
I. Construction of the public transport network graph
Since the structure of public transport is usually represented by a series of connected road segments or stops, existing grid-based map partitioning is not suitable for travel time prediction. The invention therefore constructs the traffic network $G$ from a new graph perspective by taking into account the spatial and temporal characteristics of public transport, and proposes a density-based clustering method (DBSCAN) to automatically find important geographic positions (stops and intersections) and to represent the geographic structure (Geo-structure) of each route by the obtained positions. Then, based on the discovered geographic structure of the bus routes, the invention constructs the public transport network $G$, where the nodes $S$ in $G$ represent the travel segments between every two adjacent stops.
(1) Public transport geographic structure mining based on DBSCAN
First, the invention uses a density-based clustering algorithm (DBSCAN) to extract the geographic structure of public transport, including the exact location and density of intersections and stops. The process does not use the labeled stop information directly but extracts the positions with the DBSCAN algorithm, for two reasons. First, because the proposed travel segments are stop-based, two routes may share the same stops but follow different paths, so extracting the geographic structure of a route from stop locations alone is inaccurate. Second, the waiting time at each stop and intersection is typically affected by the number of passengers and by traffic lights, which can lead to large differences in travel time between routes. The invention therefore searches for dense areas (Geo-structure discovery) with a density-based clustering method (DBSCAN), which does not require the number of clusters or a fixed cluster shape to be set; the specific steps are shown in FIG. 2 (Algorithm 1). In FIG. 2, a GPS point $p$ contains longitude and latitude information (long, lat), and $c$ is a key point in the public transport network obtained after clustering, which comprises longitude, latitude, number of GPS points and stop information (long, lat, num, st). The density-based clustering algorithm requires two parameters: the scan radius eps and the minimum number of points minPts. The parameters are set according to the number of GPS points in the database with speed values around 0, so as to find all important geographic positions along the route, including stops and intersections. The weights of the stops and intersections are then determined from the number of GPS points within each mined geographic position.
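The clustering step can be sketched as follows. This is a minimal pure-Python version of standard DBSCAN, not the patent's Algorithm 1 (FIG. 2 is not reproduced in this text). It assumes planar coordinates in meters rather than raw longitude and latitude, and the `eps` and `min_pts` values, the helper names and the sample points are all hypothetical.

```python
import math


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Standard DBSCAN; returns one label per point (-1 means noise)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id          # noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:     # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def key_locations(points, labels):
    """Summarize each cluster as (x, y, num): centroid and GPS point count (weight)."""
    groups = {}
    for (x, y), lab in zip(points, labels):
        if lab >= 0:
            groups.setdefault(lab, []).append((x, y))
    result = []
    for lab in sorted(groups):
        pts = groups[lab]
        n = len(pts)
        result.append((sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n, n))
    return result


# Hypothetical near-zero-speed GPS fixes (meters): two dense spots and one stray point.
stops = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100), (101, 100), (100, 101), (500, 500)]
labels = dbscan(stops, eps=2.0, min_pts=3)
locations = key_locations(stops, labels)
```

The `num` field of each summarized location plays the role of the weight described above; the stop flag `st` is omitted from this sketch.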
(2) Node extraction and edge extraction for public transport network graph
2.1) node extraction
After the important geographic positions of the public transport network have been extracted, the invention provides a node information representation method: one important geographic position is selected between every two adjacent stops of each public transport route, and the set consisting of the information of the two stops and the information of the important geographic position between them is used as a node $s$, representing the travel segment from one stop to the next. The method of selecting the geographic position between stops is shown in FIG. 3 (Algorithm 2). In FIG. 3, step 3 traverses all clusters $C$ obtained from DBSCAN; in steps 4-6, among the clusters $c$ lying between two adjacent clusters that contain stops, the position $c_{i+1}$ that is farthest from the stops and has the largest flow of people is found as one item of information in $s$; step 7 aggregates the two stops and $c_{i+1}$ as the information of $s$.
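A rough sketch of the node extraction idea follows. It is not the patent's Algorithm 2, which is not reproduced in this text. How the two criteria (farthest from the stops, largest flow) are combined is not specified here, so the sketch assumes candidates are ranked by GPS count first and by distance to the nearer stop second. The `KeyLocation` type, its one-dimensional `pos` field and the sample route are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class KeyLocation(NamedTuple):
    pos: float      # distance along the route in meters (illustrative 1-D position)
    num: int        # number of GPS points in the cluster (its weight)
    is_stop: bool   # True if the cluster corresponds to a bus stop


Node = Tuple[KeyLocation, KeyLocation, KeyLocation]


def extract_nodes(route: List[KeyLocation]) -> List[Node]:
    """Build one node (stop, chosen intermediate location, next stop) per adjacent stop pair.

    `route` must be ordered by position along the route.
    """
    nodes = []
    stop_idx = [i for i, c in enumerate(route) if c.is_stop]
    for a, b in zip(stop_idx, stop_idx[1:]):
        between = route[a + 1:b]
        if not between:
            continue  # no intermediate key location was mined for this segment
        first, last = route[a], route[b]
        # Assumed ranking: larger GPS count wins, ties go to the location
        # that lies farther from its nearer stop.
        chosen = max(between,
                     key=lambda c: (c.num, min(c.pos - first.pos, last.pos - c.pos)))
        nodes.append((first, chosen, last))
    return nodes


route = [
    KeyLocation(0, 40, True),
    KeyLocation(150, 12, False),
    KeyLocation(300, 25, False),
    KeyLocation(600, 35, True),
    KeyLocation(900, 8, False),
    KeyLocation(1000, 30, True),
]
nodes = extract_nodes(route)
```

Each resulting tuple holds the three key locations that make up one node $s$.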
2.2) Adjacency matrix (edge) establishment
The invention establishes edge graphs from the node information to represent the relationships between road segments (nodes). Because several factors can influence the travel patterns of the public transport system, the invention constructs three adjacency matrices $A$ to represent different relationships. First, a geographic structure similarity edge graph is established: the position information and segment length information of each road segment (node) are extracted from the three important geographic positions contained in the node, similarity is compared with the DTW (dynamic time warping) algorithm, and the geographic structure similarity adjacency matrix $A_g$ between the nodes is established. Then, according to the building category information contained in the city point-of-interest data near the three positions in each node (the set of urban function categories of buildings within a radius of 100 meters of the node's longitude and latitude), the urban function category of each road segment (node) is extracted, and the adjacency matrix $A_f$ of the urban functional zoning relationship among the nodes is established according to the similarity of urban functions. Finally, traffic conditions on roads within the same geographic area may be similar; on this basis, the invention designs a third adjacency matrix $A_d$, the geographic distance adjacency matrix, according to the distance relationship between the bus routes. The edge weights in the adjacency matrices are normalized to the range 0 to 1.
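The DTW comparison and the normalization of edge weights to [0, 1] can be sketched as below. The DTW recurrence is the classic one. The per-node feature sequences and the conversion from distance to similarity (one minus the distance divided by the largest distance) are assumptions made for illustration; the text does not state which feature encoding or normalization the invention uses.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity_matrix(features):
    """Turn pairwise DTW distances into edge weights in [0, 1] (1 = identical)."""
    n = len(features)
    dist = [[dtw_distance(features[i], features[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in dist) or 1.0   # avoid dividing by zero
    return [[1.0 - dist[i][j] / d_max for j in range(n)] for i in range(n)]


# Hypothetical node features: gaps (meters) between the three key locations of each node.
segments = [[200, 300], [210, 290], [800, 100]]
A_g = similarity_matrix(segments)
```

With these sample features the first two segments receive a weight close to 1, while the third is far from both.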
II. Construction of the multi-head spatio-temporal graph attention network prediction model
After the bus network graph is constructed, the invention provides a multi-head spatio-temporal graph attention network prediction model, which predicts bus travel times for the whole city from limited data. The model can effectively learn and predict the travel time of each bus route in the city, particularly for suburban routes and routes without any historical travel records. It can help update existing public transport systems, adjust outdated schedules for routes in developing areas, and help select and design new routes by providing the travel time of each route under normal and abnormal traffic conditions. The model contains two modules, a spatial attention module and a temporal attention module.
2.1) spatial attention Module
In the spatial attention module, the invention applies a graph attention network (GAT) to learn the spatial dependencies between different travel segments. Compared with a graph convolutional network (GCN), GAT's ability to handle dynamic graphs and to learn inductively makes it better suited to city-wide travel time prediction with limited data. The graph attention layer is the basic building block of the GAT: it learns the correlation between each pair of nodes and updates the hidden features of the nodes. The feature of node $s_i$ in time interval $t$ is denoted $h^{t}_{i}$. In the first layer, $h^{t}_{i}$ is the input travel time record of segment $s_i$ together with the embedded time information. The attention coefficient $e^{t}_{ij}$ between $s_i$ and $s_j$ can be expressed as:
$$e^{t}_{ij} = a\left(W h^{t}_{i},\, W h^{t}_{j}\right)$$
where $W$ denotes the learnable parameters of layer $l$ and $a(\cdot)$ is a function that computes the correlation. The invention uses the LeakyReLU activation function to train the feedforward neural network. For each layer, the output is normalized to [0,1] by the softmax function:
$$\alpha^{t}_{ij} = \mathrm{softmax}_{j}\left(e^{t}_{ij}\right) = \frac{\exp\left(e^{t}_{ij}\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(e^{t}_{ik}\right)}$$
where $\mathcal{N}_i$ denotes the neighbors of node $s_i$.
To obtain richer combinations of travel patterns, so that bus routes with missing data can accurately learn the operating patterns of bus routes with complete historical data, the invention extends the spatial attention to a masked multi-head attention mechanism, which concatenates $K$ learnable attention heads to obtain the final spatial attention result:
$$h^{t\prime}_{i} = \Big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha^{t,k}_{ij} W^{k} h^{t}_{j}\Big)$$
where $\Vert$ denotes concatenation, $\alpha^{t,k}_{ij}$ and $W^{k}$ are the normalized attention coefficient and the learnable parameters of the $k$-th head, $\mathcal{N}_i$ is the set of neighbors of node $i$, and $\sigma$ is the softmax function. In each attention head, the invention adds a masked attention mechanism to the adjacency matrix so that attention is paid to the bus routes with complete historical operating data and their operating patterns are learned. The mask $m$ at layer $l$ is expressed as:
[Equation image not reproduced in this text: definition of the mask $m$ at layer $l$.]
where $\gamma$ is the attention between node $i$ and node $j$ and $a$ is a threshold. The masked output $X$ of layer $l$ is expressed as:
[Equation image not reproduced in this text: masked output of layer $l$.]
where $X$ is the input travel time data, $X'$ is the output after the attention mechanism is added, $X^{l}$ is the output of layer $l$ after the masking mechanism is added, and $A \odot M$ is the Hadamard product of the adjacency matrix and the mask matrix.
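A single masked attention head of the kind described in this section can be sketched as follows. This is an illustration under stated assumptions, not the invention's implementation. The scoring function $a(\cdot)$ is taken to be a dot product with a vector `a` applied to the concatenated transformed features, followed by LeakyReLU, as in the standard GAT formulation. The 0/1 matrix `adj_mask` is assumed to be the adjacency matrix already combined with the mask, and every node is assumed to have at least one allowed neighbor. All names and values are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def masked_attention(h, W, a, adj_mask):
    """One graph-attention head over node features h.

    h: list of node feature vectors; W: weight matrix (rows = output dims);
    a: scoring vector of length 2 * out_dim; adj_mask[i][j] is 1 if node j may
    be attended to from node i, else 0.
    """
    z = [[sum(w * x for w, x in zip(row, hi)) for row in W] for hi in h]  # W h_i
    out = []
    for i, zi in enumerate(z):
        scores = {}
        for j, zj in enumerate(z):
            if adj_mask[i][j]:
                # e_ij = LeakyReLU(a . [W h_i || W h_j])
                scores[j] = leaky_relu(sum(p * q for p, q in zip(a, zi + zj)))
        top = max(scores.values())
        exps = {j: math.exp(e - top) for j, e in scores.items()}  # stable softmax
        total = sum(exps.values())
        alpha = {j: v / total for j, v in exps.items()}
        out.append([sum(alpha[j] * z[j][d] for j in alpha) for d in range(len(zi))])
    return out


h = [[1.0], [2.0], [10.0]]          # hypothetical travel-time features of three segments
W = [[1.0]]                         # identity transform, for readability
a = [0.0, 1.0]                      # the score depends only on the neighbor's feature
adj_mask = [[1, 1, 0],
            [1, 1, 0],
            [0, 0, 1]]              # node 2 is masked out for nodes 0 and 1
out = masked_attention(h, W, a, adj_mask)
```

A multi-head variant would call `masked_attention` once per head with separate `W` and `a` and concatenate the per-node outputs.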
2.2) temporal attention Module
After learning the spatial dependencies, the invention connects a temporal attention module. The travel time of each bus trip is strongly affected by real-time traffic conditions. For example, when traffic conditions are normal, the travel times recorded for the target route in the same time period in the distant past may be highly similar to the travel time in the current period. When traffic congestion occurs, however, the travel pattern may be unstable, yet it may still resemble the pattern of the most recent time periods. Both global (distant-past) and local (recent) temporal travel patterns therefore need to be considered for different traffic conditions.
2.2.1 local time-dependent learning
A recurrent neural network (RNN) is an artificial neural network that is particularly well suited to capturing temporal dependencies in sequence learning. However, previous studies have shown that RNNs are often difficult to train on long sequences because of vanishing and exploding gradients. To overcome these disadvantages, LSTM (Long Short-Term Memory) automatically determines the optimal time lag by introducing an input gate and a forget gate. The invention therefore builds an LSTM-based model to focus on local temporal information. The LSTM input gate $i_t$ can be expressed as:
$$i_t = \sigma\left(W_{ix} x_t + W_{ih} h_{t-1} + b_i\right)$$
where $x_t$ is the current input, $h_{t-1}$ is the vector passed into the LSTM unit from the previous time step, $W_{ix}$, $W_{ih}$ and $b_i$ are the learnable parameter matrices and bias vector of the recurrent layer, and $\sigma$ is the standard sigmoid function.
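As a small numeric illustration of the input gate, the sketch below assumes the standard LSTM formulation $i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + b_i)$, with $x_t$ the current input and $h_{t-1}$ the previous hidden state. The weights and inputs are made-up values, not parameters from the invention.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def input_gate(x_t, h_prev, W_ix, W_ih, b_i):
    """Compute i_t = sigmoid(W_ix x_t + W_ih h_{t-1} + b_i), one value per hidden unit."""
    gate = []
    for row_x, row_h, b in zip(W_ix, W_ih, b_i):
        pre = (sum(w * x for w, x in zip(row_x, x_t))
               + sum(w * h for w, h in zip(row_h, h_prev))
               + b)
        gate.append(sigmoid(pre))
    return gate


# One hidden unit, two input features; the pre-activation is approximately 0.
i_t = input_gate(x_t=[1.0, 2.0], h_prev=[0.5],
                 W_ix=[[0.1, -0.2]], W_ih=[[0.6]], b_i=[0.0])
```

A gate value near 0.5 lets about half of the candidate update through; values near 1 or 0 open or close the gate.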
2.2.2 Global time dependent learning
To discover bus travel pattern information in time from a global perspective, the invention introduces a Transformer layer in the temporal attention module. For a single attention layer, each node in the public transport network graph typically has three types of vectors: query $Q$, key $K$ and value $V$. The hidden subspace learning process may be expressed as:
$$Q = X W^{Q}, \quad K = X W^{K}, \quad V = X W^{V}$$
where $X$ denotes the features input to the layer and $W^{Q}$, $W^{K}$, $W^{V}$ are learnable parameters. The output of the global temporal attention, Attention, is computed by scaled dot-product attention, where $d_K$ is the scaling factor:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_K}}\right) V$$
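Scaled dot-product attention over a short sequence of time steps can be sketched in plain Python as follows. This is the standard formulation, assumed here to be what the Transformer layer uses; the identity projection matrices and the toy input are purely illustrative.

```python
import math


def softmax(row):
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [v / total for v in exps]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over the time steps (rows) of X."""
    Q, K, V = matmul(X, W_q), matmul(X, W_k), matmul(X, W_v)
    d_k = len(K[0])
    scores = matmul(Q, [list(col) for col in zip(*K)])            # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # three time steps, two features each
I2 = [[1.0, 0.0], [0.0, 1.0]]               # identity projections, for readability
out, weights = attention(X, I2, I2, I2)
```

Row `i` of `weights` shows how strongly time step `i` attends to every other step, which is how distant but similar periods can contribute to the prediction.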
2.3) Bus arrival time prediction layer
After the high-dimensional spatio-temporal features are obtained, the invention uses linear layers for prediction. The multi-head space-time graph attention network prediction model is trained by minimizing the mean square error (MSE) loss L between the predicted value $\hat{X}_{t+1}$ and the true value $X_{t+1}$, which can be expressed as:
$$L(\theta) = \frac{1}{N} \sum_{n=1}^{N} \left( \hat{X}_{t+1}^{(n)} - X_{t+1}^{(n)} \right)^2$$
where N is the number of training samples and $\theta$ denotes the learnable parameters in the model.
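A minimal sketch of the prediction step and its training objective follows: a single linear layer maps features to a predicted travel time, and the mean square error is computed against the true values. The feature values, weights and bias are invented for illustration only.

```python
def linear(features, weights, bias):
    """One linear output unit: dot(features, weights) + bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def mse_loss(predicted, actual):
    """Mean of squared differences between predictions and true values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical spatio-temporal features for three road segments.
features = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.0]]
weights = [10.0, 20.0]   # assumed learnable parameters
bias = 5.0

predicted = [linear(f, weights, bias) for f in features]
actual = [50.0, 40.0, 31.0]   # hypothetical observed travel times

loss = mse_loss(predicted, actual)
print(predicted, loss)
```

During training, an optimizer would adjust `weights` and `bias` (together with all upstream parameters) to reduce `loss`.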
In the experimental verification of the method, bus trajectory data and POI (point of interest) data obtained from the traffic department of a certain city are used. Each bus trajectory record includes location, timestamp, speed and bus ID. The average sampling interval is 30 seconds per point, and the 278 bus lines together produce about 300,000 points per day. The POI data set consists of building locations and categories (social functions). Using a cross-validation approach, three lines are selected as target lines to evaluate the performance and robustness of the method. The three test bus routes are located in different areas of the city: a developed central area, a remote area, and a path connecting the center and the suburbs. The results are then averaged. To test prediction accuracy on routes with different degrees of sparse records, 40%, 60% and 80% of the GPS points in each trajectory of the three bus lines are deleted at random. In addition, all historical records of the three routes are removed, so that they are treated as routes under design (stop locations and paths are designed, but no historical travel time records exist), in order to test travel time estimation for new routes. This helps to evaluate the proposed model, here named MAGTE.
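The random deletion of GPS points described above could be simulated as in the following sketch. The record fields mirror those listed in the text (timestamp, speed, bus ID), while the function name, the fixed seed and the synthetic trajectory are assumptions for illustration.

```python
import random


def drop_gps_points(trajectory, missing_rate, seed=0):
    """Randomly remove a fraction of GPS records, keeping the original order."""
    if not 0.0 <= missing_rate < 1.0:
        raise ValueError("missing_rate must be in [0, 1)")
    rng = random.Random(seed)                     # seeded for reproducibility
    n_keep = len(trajectory) - round(len(trajectory) * missing_rate)
    kept = sorted(rng.sample(range(len(trajectory)), n_keep))
    return [trajectory[i] for i in kept]


# Synthetic trajectory: one point every 30 seconds (values are made up).
trajectory = [
    {"bus_id": "B1", "timestamp": 30 * i, "speed": 20.0} for i in range(100)
]

sparse = {rate: drop_gps_points(trajectory, rate) for rate in (0.4, 0.6, 0.8)}
for rate, points in sparse.items():
    print(rate, len(points))
```

Removing all records of a route, as in the "route under design" setting, corresponds to simply excluding that route's trajectories from the training data.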
Comparative methods:
Historical average model (HA): the travel time is predicted by calculating the historical average travel time of the bus route in each time period (15 minutes).
Spatio-temporal artificial neural network (ST-ANN).
Kalman filtering (KF).
Support vector regression (SVR).
E-knn: a model based on the weighted enhanced k-NN method, which uses the k records most similar to the current traffic conditions to identify the traffic conditions and predict the travel time. Here, the similarity of the travel pattern is set to 90% or more, and the k neighbors are selected for the target link.
RnTTE: a model based on an LSTM neural network, comprising a fully connected LSTM layer with 128 hidden units.
DeepTTE: a model that combines geo-conv and LSTM layers to predict travel time.
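Of the baselines listed above, the historical average model is simple enough to sketch directly: travel times are averaged per 15-minute period and the average of the matching period is returned as the prediction. The record layout (departure second of day, travel time in seconds) and the sample values are assumptions for this example.

```python
from collections import defaultdict

SLOT_SECONDS = 15 * 60  # 15-minute time periods, as stated for the HA baseline


def fit_historical_average(records):
    """Average travel time per time slot.

    records: iterable of (departure_second_of_day, travel_time_seconds).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for departure, travel_time in records:
        slot = int(departure // SLOT_SECONDS)
        totals[slot][0] += travel_time
        totals[slot][1] += 1
    return {slot: total / count for slot, (total, count) in totals.items()}


def predict_ha(model, departure):
    """Return the historical mean for the slot, or None if the slot is unseen."""
    return model.get(int(departure // SLOT_SECONDS))


# Hypothetical history for one route around 08:00.
history = [
    (8 * 3600 + 60, 600.0),
    (8 * 3600 + 500, 660.0),
    (8 * 3600 + 1000, 720.0),
]
ha_model = fit_historical_average(history)

print(predict_ha(ha_model, 8 * 3600 + 300))
print(predict_ha(ha_model, 9 * 3600))
```

The `None` result for an unseen slot shows why this baseline cannot serve routes without history, which is the setting the invention targets.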
The experimental results are shown in Fig. 4. They show that, for travel time prediction on routes with sparse records and on routes still under design, the method of the invention performs better than the other advanced methods. This demonstrates that the proposed MAGTE can effectively predict public transportation travel time across the whole city with the lowest MAPE (mean absolute percentage error), and can be used to support the updating and development of bus routes.
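The MAPE metric mentioned above is not defined in the text. The sketch below uses the usual definition, the mean of |actual - predicted| / |actual| expressed in percent, and the travel times are invented for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (usual definition, assumed here)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical travel times in seconds.
actual = [600.0, 300.0, 450.0]
predicted = [660.0, 270.0, 450.0]

error = mape(actual, predicted)
print(round(error, 2))
```

Because the error is relative to the true travel time, MAPE allows short and long road segments to be compared on the same scale.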
The above examples show only some embodiments of the present invention. Their description is specific and detailed, but it should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the appended claims.

Claims (8)

1. A bus arrival time prediction method oriented to missing data is characterized by comprising the following steps:
1) performing data integration on historical bus trajectory GPS information, bus stop location information and city point-of-interest data; extracting important geographic positions in the public transportation network from the historical bus GPS data through a density-based clustering method, representing the geographic structure of each bus route by using the obtained geographic positions, designing a node extraction method and an edge weight representation method for the public transportation network graph according to the stop location information and the city point-of-interest data, and constructing the public transportation network graph;
2) establishing a multi-head space-time graph attention network prediction model according to the constructed public transportation network graph, to learn the correlation between bus routes from the spatial and temporal perspectives; the multi-head space-time graph attention network prediction model comprises a space attention module and a time attention module which are connected in sequence; the space attention module is a masked multi-head graph attention module, wherein the multi-head design is used for learning the global and local spatial dependencies between bus routes under different conditions, and the mask design is used for attending to the important travel patterns of bus routes with complete historical data; the time attention module comprises an LSTM layer and a transformer layer, which are used for local time-dependent learning and global time-dependent learning respectively;
3) predicting the arrival time of buses with missing data by using the multi-head space-time graph attention network prediction model.
2. The method for predicting the arrival time of the bus facing the missing data according to claim 1, wherein the important geographic position in the bus network is extracted from the integrated data by a density-based clustering method, and specifically comprises the following steps:
setting parameters of a density-based clustering method according to the number of GPS points with the speed of 0 in the integrated data, obtaining important geographic positions in the public transportation network through the density-based clustering method, and determining the weight of the geographic positions according to the number of the GPS points contained in each geographic position.
3. The method for predicting the arrival time of the bus with the missing data according to claim 1, wherein the obtained important geographic position is used for representing the geographic structure of each route, and specifically comprises the following steps: the locations of intersections and stops are represented by weights to represent the geographic structure of each bus route.
4. The method for predicting the arrival time of the bus with the missing data according to claim 1, wherein the node extraction specifically comprises: selecting a geographical position between every two adjacent stops of each bus route, and using information of the two stops and the selected geographical position information between the two stops as information of a node s to represent a road section between the two adjacent stops.
5. The method for predicting the arrival time of the bus with the missing data according to claim 4, wherein a geographical position is selected between every two adjacent stations to form node information, and the method specifically comprises the following steps: and selecting an intersection which is farthest away from the positions of the two adjacent stops and has the largest pedestrian flow in the bus route, and forming node information by the intersection information and the geographical position information of the two adjacent stops.
6. The method for predicting the arrival time of the bus with the missing data according to claim 5, wherein the edge weight representation method specifically comprises:
establishing an edge graph to represent the relationship between the road sections, wherein the edge weights encode the spatial correlation strength or similarity; and constructing three adjacency matrices A, which respectively represent the geographic structure similarity relation, the distance relation between bus routes and the urban functional area division relation, to obtain three public transportation network graphs representing different relations.
7. The method for predicting the arrival time of the bus with the missing data according to claim 6, wherein the construction method of the adjacency matrix A specifically comprises the following steps: establishing a geographic structure similarity edge graph by extracting, according to the three important geographic positions contained in each extracted node, the position information and length information of the nodes, comparing their similarity using the DTW (dynamic time warping) algorithm, and establishing a geographic structure similarity adjacency matrix $A_g$ between the nodes; then, according to the building category information contained in the city point-of-interest data near the three geographic positions in each node, extracting the urban function category of each node, and establishing an adjacency matrix $A_f$ of the urban functional area division relation between the nodes according to the similarity of urban functions; finally, designing a third adjacency matrix $A_d$, the geographic distance adjacency matrix, according to the distance relation between bus routes; the edge weights in the adjacency matrices are normalized and range from 0 to 1.
8. The method for predicting the arrival time of the bus with the missing data according to claim 1, wherein the step 3) is specifically as follows: for a bus route lacking data, using the multi-head space-time graph attention network prediction model to learn, according to the similarity of bus travel patterns, the travel patterns of bus routes with complete historical data in the previous h time periods; and then predicting the bus arrival time for the bus route lacking data.
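For illustration only, the DTW-based similarity comparison and the normalization of edge weights to the range 0 to 1 recited in the claims above can be sketched as follows. Each node is represented here by three (x, y) positions, and the mapping from DTW distance to similarity (1 minus distance divided by the largest distance) is an assumption of this sketch, not a formula given in the claims.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])   # Euclidean point distance
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def similarity_adjacency(nodes):
    """Turn pairwise DTW distances into edge weights in [0, 1] (assumed mapping)."""
    k = len(nodes)
    dist = [[dtw_distance(nodes[i], nodes[j]) for j in range(k)] for i in range(k)]
    d_max = max(max(row) for row in dist) or 1.0   # avoid division by zero
    return [[1.0 - d / d_max for d in row] for row in dist]


# Hypothetical nodes: stop, intermediate intersection, stop (planar coordinates).
nodes = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)],
    [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)],
]

A_g = similarity_adjacency(nodes)
for row in A_g:
    print([round(w, 3) for w in row])
```

Nodes with nearly identical geometry receive a weight close to 1, while the most dissimilar pair receives 0.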
CN202111022557.1A 2021-09-01 2021-09-01 Bus arrival time prediction method oriented to missing data Active CN113470365B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111022557.1A CN113470365B (en) 2021-09-01 2021-09-01 Bus arrival time prediction method oriented to missing data
PCT/CN2021/132828 WO2023029234A1 (en) 2021-09-01 2021-11-24 Method for bus arrival time prediction when lacking data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111022557.1A CN113470365B (en) 2021-09-01 2021-09-01 Bus arrival time prediction method oriented to missing data

Publications (2)

Publication Number Publication Date
CN113470365A true CN113470365A (en) 2021-10-01
CN113470365B CN113470365B (en) 2022-01-14

Family

ID=77867139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111022557.1A Active CN113470365B (en) 2021-09-01 2021-09-01 Bus arrival time prediction method oriented to missing data

Country Status (2)

Country Link
CN (1) CN113470365B (en)
WO (1) WO2023029234A1 (en)

Also Published As

Publication number Publication date
WO2023029234A1 (en) 2023-03-09
CN113470365B (en) 2022-01-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant