CN114912669A - Public transport passenger flow combined graph neural network prediction method based on multi-source data - Google Patents
- Publication number
- CN114912669A (application CN202210436660.9A)
- Authority
- CN
- China
- Prior art keywords
- bus
- attribute
- passenger flow
- data
- node
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention relates to a combined graph neural network method for predicting public transport passenger flow based on multi-source data, and belongs to the technical field of public transport passenger flow analysis. The method comprises the following steps: A, collecting multi-source data related to bus passenger flow; B, constructing an attribute graph and generating graph structure data; C, building a combined graph neural network model for bus passenger flow prediction; and D, training the model to obtain a prediction result. By combining a graph neural network with deep learning, the invention captures the temporal and spatial correlation of the multi-source data related to bus passenger flow.
Description
Technical Field
The invention relates to a combined graph neural network method for predicting public transport passenger flow based on multi-source data, and belongs to the technical field of public transport passenger flow analysis.
Background
In urban public transport systems, the short-term volume and variation of passenger flow at bus stops reflect the actual demand of bus passengers and its instability under external influences. Accurately predicting the short-term passenger flow at bus stops reveals the travel patterns and characteristics of passengers, and helps bus management departments and operating enterprises manage and serve passenger travel. To improve the accuracy of short-term passenger flow prediction at bus stops, prediction algorithms have developed along two major trends: first, the application of artificial intelligence technologies such as deep learning; and second, the fusion of multi-source data. For example, the patent "a method, an apparatus, an electronic device, and a storage medium for predicting bus traffic" (publication No. CN112862187A) discloses a method for predicting bus passenger flow using a convolutional neural network. The patent "public transport passenger flow prediction method and system based on adaptive graph learning" (publication No. CN113537580A) discloses a method for predicting public transport passenger flow in which a graph learning module generates a relationship matrix. Both patents apply deep learning to bus passenger flow prediction, but use only a single data source, the bus passenger flow data. The patent "public transport passenger flow prediction method and system" (publication No. CN112766597A) discloses a public transport passenger flow prediction method based on LSTM, an attention mechanism and time-sharing graph convolution, which integrates historical public transport passenger flow data, public transport route data, weather and holiday information. The method proposed in that patent involves both deep learning and multi-source data; however, the integrated data sources are still insufficient.
With the construction of intelligent transportation and smart cities, more and richer data resources help improve the accuracy of short-term bus passenger flow prediction. Besides the historical bus passenger flow data directly related to the prediction, bus network data, operation information of other urban public transport modes, land-use information around bus stops (POI, Point of Interest), weather information, road traffic operation information and the like can all serve short-term bus passenger flow prediction. Such multi-source information exhibits not only correlation along the time series but also connectivity in the spatial topology, so a prediction method is needed that integrates spatio-temporal characteristics and reflects the different influence weights of the various data sources.
Disclosure of Invention
The invention aims to overcome the problems in the prior art by providing a combined graph neural network method for predicting public transport passenger flow based on multi-source data, which captures the spatio-temporal association of bus passenger flow data by combining a graph neural network with deep learning.
To solve the above problems, the combined graph neural network method for predicting public transport passenger flow based on multi-source data comprises the following steps:
step A: collecting multi-source data related to bus passenger flow;
step B: constructing an attribute graph and generating graph structure data;
step C: building a combined graph neural network model for bus passenger flow prediction;
step D: training the model to obtain a prediction result.
Further, step A specifically includes the following steps:
step A1: obtaining static information, including the road topology of the bus network, basic information of bus stops and land-use information around the stops (such as POI, Point of Interest);
step A2: acquiring historical and real-time dynamic information, including the boarding and alighting passenger flow at bus stops, the operating state of the roads along bus routes, weather and calendar;
step A3: generating a feature vector set of the multi-source data.
Further, step B specifically includes the following steps:
step B1: constructing an attribute graph by simplifying the topological structure of a bus route into a directed graph comprising nodes and edges; a node represents a bus stop; an edge represents the association between different stops; an upstream bus stop is considered to be associated with all of its downstream bus stops; an edge connecting an upstream bus stop with its adjacent downstream bus stop is called a real edge, and all other edges are called virtual edges;
step B2: generating graph structure data by encoding the multi-source data into the attribute graph, wherein the graph structure multi-source data of the t-th time period is $G_t = (N, E, V_t, A_t, u_t)$, where $N$ is the node set, corresponding to the bus stops; $E$ is the edge set, corresponding to the associations among the stops; $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; $A_t$ is the edge attribute of the t-th time period, comprising the association between bus stops and the various factors influencing that association; $u_t$ is the global attribute of the t-th time period, referring to the factors shared by the passenger flow data of all bus stops;
step B3: dividing the data set, wherein the feature vector set of the multi-source data is divided into a training data set, a validation data set and a test data set according to a given proportion.
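The construction in step B1 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the names `AttributeGraph` and `build_route_graph` are hypothetical, a single route with stops listed in driving order is assumed, and the attribute containers are left empty.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeGraph:
    """Hypothetical container mirroring G_t = (N, E, V_t, A_t, u_t)."""
    nodes: list                                      # N: bus stops in driving order
    edges: list                                      # E: (upstream, downstream, is_real)
    node_attrs: dict = field(default_factory=dict)   # V_t, filled per time period
    edge_attrs: dict = field(default_factory=dict)   # A_t, filled per time period
    global_attr: list = field(default_factory=list)  # u_t, filled per time period


def build_route_graph(stops):
    """Connect every upstream stop to all of its downstream stops."""
    edges = []
    for i in range(len(stops)):
        for j in range(i + 1, len(stops)):
            # Adjacent stops give a real edge; all other pairs give a virtual edge.
            edges.append((i, j, j == i + 1))
    return AttributeGraph(nodes=list(stops), edges=edges)


graph = build_route_graph(["S1", "S2", "S3", "S4"])
real_edges = [(i, j) for i, j, is_real in graph.edges if is_real]
virtual_edges = [(i, j) for i, j, is_real in graph.edges if not is_real]
print(real_edges)     # [(0, 1), (1, 2), (2, 3)]
print(virtual_edges)  # [(0, 2), (0, 3), (1, 3)]
```

For a route with n stops this yields n-1 real edges and (n-1)(n-2)/2 virtual edges, so the edge count grows quadratically with route length.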
Further, in step B2, $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; if there are $n_{no}$ nodes, the node attribute of the t-th time period is defined as the set of the attributes of these $n_{no}$ nodes, where the attribute of the i-th node consists of the feature vectors of the boarding and the alighting passenger flow data of the i-th bus stop in the t-th time period;
$A_t$ is the edge attribute of the t-th time period, comprising the association between bus stops and the various factors influencing that association; if there are $n_{ed}$ edges, the edge attribute of the t-th time period is the set of the attributes of these $n_{ed}$ edges, where each edge attribute is composed of feature vectors of the factors influencing the association of the boarding and alighting passenger flow data between bus stops, which may include but are not limited to the following four types of factors:
(1) spatial proximity, measured by the travel distance along the bus line between bus stops; if $dis(i, j)$ denotes the road driving distance between upstream bus stop i and downstream bus stop j in the same driving direction, the spatial proximity of the edge connecting i and j is determined from $dis(i, j)$;
(2) temporal influence, measured by the similarity between the passenger flow data of the upstream bus stop and that of the downstream bus stop; given the boarding and alighting passenger flow data of each bus stop over $\tau$ time periods, the passenger flow at upstream bus stop i forms a time series of $\tau$ values and the passenger flow at downstream bus stop j forms a time series of $\tau$ values, and the temporal influence $S_{ti(i,j)}$ of stop i on stop j is calculated from $D(i, j)$ using DTW (Dynamic Time Warping), where $D(i, j)$ is the DTW-based distance measure between the two time series;
(3) semantic similarity, measured by the land use of the area surrounding the bus stops; the POIs in the area surrounding a bus stop characterize the land use of that area and are used to calculate the semantic distance between two bus stops; the density distribution of the different POI categories is computed from the POIs of the land around the bus stop; supposing there are $n_{poi}$ POI categories, the land use of bus stop i can be represented as a vector $p_i$ of length $n_{poi}$, in which each dimension is the density of nearby POIs of one category, and the semantic similarity of the edge connecting bus stops i and j can be calculated as $S_{se(i,j)} = p_i \cdot p_j$;
(4) traffic influence, measured by the operating state of the road between two bus stops; the road operating state between bus stop i and bus stop j can be represented by the average travel speed between them; combining the above four types of factors, the attribute of each edge in the t-th time period is composed of these four factors, and $n_{ed}$ is the number of edges;
$u_t$ is the global attribute of the t-th time period, referring to the factors shared by the passenger flow data of all bus stops; weather conditions and calendar features are regarded as global attributes closely related to bus stop passenger flow data; the calendar features specifically refer to the time of day, the day of the week, the day of the month, the month of the year, and the date type; the global attribute of the t-th time period is composed of a feature representing the weather condition and a feature representing the calendar.
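Two of the edge factors above lend themselves to a small worked example: the DTW distance $D(i, j)$ behind the temporal influence, and the dot product $S_{se(i,j)} = p_i \cdot p_j$ for semantic similarity. The sketch below is illustrative only; the function names are hypothetical, the absolute difference is assumed as the local DTW cost, and the sample series and POI densities are made up.

```python
def dtw_distance(series_a, series_b):
    """Classic dynamic-programming DTW with |a - b| as the local cost (an assumption)."""
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def semantic_similarity(p_i, p_j):
    """Dot product of two POI density vectors, S_se(i,j) = p_i . p_j."""
    return sum(a * b for a, b in zip(p_i, p_j))


# Made-up passenger flow series: the downstream stop repeats the upstream pattern with a delay.
upstream_flow = [1, 2, 3, 2, 1]
downstream_flow = [1, 1, 2, 3, 2, 1]
d_ij = dtw_distance(upstream_flow, downstream_flow)
print(d_ij)  # 0.0, because warping absorbs the delay

# Made-up POI density vectors over three hypothetical categories.
s_se = semantic_similarity([0.5, 0.2, 0.3], [0.1, 0.6, 0.3])
print(round(s_se, 2))  # 0.26
```

A small DTW distance indicates similar flow patterns, so any mapping from $D(i, j)$ to the temporal influence would presumably decrease with distance; the exact mapping is not specified here.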
Further, step C specifically includes the following steps:
C1: building an input layer model and inputting the graph structure data;
C2: building a hidden layer model that models the spatio-temporal relationship of the graph structure data;
C3: building an output layer model that outputs the final bus passenger flow prediction result;
step C2 specifically includes the following steps:
C2.1: quantifying the contribution of each factor using an attention mechanism; each type of attribute in the graph structure $G_t = (N, E, V_t, A_t, u_t)$, for example the node attribute $v_t$, is composed of a plurality of feature vectors;
C2.2: constructing a combined unit of the graph neural network (GNN) and the long short-term memory network (LSTM), in which the matrix multiplication in the LSTM unit is replaced with GNN convolution, wherein the GNN convolution process comprises first updating the edge attributes, then updating the node attributes, and finally updating the global attribute.
Further, in step C2.1, the attribute $v_t$ encodes $n_{fa}$ types of factors, where the k-th factor has its own feature vector at time t, and the attention mechanism quantifies the contribution of each factor using the learnable parameters $z_l$, $\beta_l$, $\theta_l$ and $b_l$ and the hidden state $h_{t-1}$;
the calculation is described for the node attribute and applies equally to the edge attribute and the global attribute; for the edge attribute, the hidden state of the edge connecting node i and node j is the sum of the two corresponding hidden states $h_i$ and $h_j$; for the global attribute, the hidden state is the sum of the hidden states of all nodes.
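As an illustration only, one common additive-attention form that uses the same kinds of quantities (a scoring vector $z_l$, weights $\beta_l$ and $\theta_l$, a bias $b_l$ and the previous hidden state $h_{t-1}$) is sketched below. This form is an assumption, not necessarily the formula of the invention; $x_t^k$, $e_t^k$ and $\alpha_t^k$ are hypothetical symbols for the feature vector, the score and the normalized weight of the k-th factor.

```latex
e_t^k = z_l^{\top} \tanh\left(\beta_l h_{t-1} + \theta_l x_t^k + b_l\right),
\qquad
\alpha_t^k = \frac{\exp\left(e_t^k\right)}{\sum_{m=1}^{n_{fa}} \exp\left(e_t^m\right)},
\qquad
\tilde{x}_t^k = \alpha_t^k \, x_t^k
```

Under this reading the weights $\alpha_t^k$ sum to one over the $n_{fa}$ factors, so each weight can be interpreted as the relative contribution of one factor at time t.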
Further, in step C2.2, the edge attribute is updated as follows:
$a'_{i,j} = \phi_a(a_{i,j}, v_i, v_j, u)$
where $a_{i,j}$ is the attribute of the edge from upstream node i to downstream node j; $v_i$ and $v_j$ are the node attributes of node i and node j; $u$ is the global attribute; $R_i$ is the number of edges connecting node i; $\phi_a$ is the update function of the edge attribute, which computes the updated attribute of each edge; the updated edge attribute $a'_{i,j}$ represents the influence of the upstream node on the downstream node; the aggregation function $\rho_{a \to v}$ aggregates the updated attributes of all $R_i$ edges connected to node i into one vector; the aggregation function $\rho_{a \to u}$ aggregates the set of all updated edge attributes in the graph structure into one aggregated edge attribute.
Further, in step C2.2, the node attribute is updated as follows: the update function $\phi_v$ of the node attribute obtains the updated attribute $v'_i$ of node i from the aggregated updated edge attribute of node i, the attribute of node i and the global attribute; the set of updated node attributes is aggregated into one vector by the aggregation function $\rho_{v \to u}$;
in step C2.2, the global attribute is updated as follows: the update function $\phi_u$ of the global attribute obtains the updated global attribute $u'$ from the aggregated updated edge attribute and the aggregated updated node attribute.
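The three update stages of step C2.2 (edges, then nodes, then the global attribute) can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: attributes are scalars, both aggregation functions are assumed to be the mean, edges touching a node in either direction are counted as connecting it, the global update is assumed to also receive the previous `u`, and the names `gn_block`, `phi_a`, `phi_v` and `phi_u` are hypothetical.

```python
def gn_block(edge_attr, node_attr, u, phi_a, phi_v, phi_u):
    """One graph-network pass: update edges, then nodes, then the global attribute."""
    # 1. Edge update: a'_ij = phi_a(a_ij, v_i, v_j, u)
    new_edge = {
        (i, j): phi_a(a, node_attr[i], node_attr[j], u)
        for (i, j), a in edge_attr.items()
    }

    # 2. Node update: aggregate the updated edges of each node (rho_{a->v}, mean assumed).
    new_node = {}
    for n, v in node_attr.items():
        incident = [a for (i, j), a in new_edge.items() if n in (i, j)]
        aggregated = sum(incident) / len(incident) if incident else 0.0
        new_node[n] = phi_v(aggregated, v, u)

    # 3. Global update: aggregate all edges and all nodes (rho_{a->u}, rho_{v->u}).
    agg_edges = sum(new_edge.values()) / len(new_edge)
    agg_nodes = sum(new_node.values()) / len(new_node)
    new_u = phi_u(agg_edges, agg_nodes, u)
    return new_edge, new_node, new_u


# Toy update functions (plain sums) standing in for learned networks.
phi_a = lambda a, v_i, v_j, u: a + v_i + v_j + u
phi_v = lambda agg, v, u: agg + v + u
phi_u = lambda agg_e, agg_v, u: agg_e + agg_v + u

edges_in = {(0, 1): 0.5, (1, 2): 0.5, (0, 2): 0.5}
nodes_in = {0: 1.0, 1: 2.0, 2: 3.0}
new_edge, new_node, new_u = gn_block(edges_in, nodes_in, 1.0, phi_a, phi_v, phi_u)
print(new_edge)  # {(0, 1): 4.5, (1, 2): 6.5, (0, 2): 5.5}
print(new_node)  # {0: 7.0, 1: 8.5, 2: 10.0}
print(new_u)     # 15.0
```

In a trained model the three `phi` functions would be learned networks and the attributes would be vectors; the ordering of the stages is the part taken from the description.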
Further, in the process of updating the global attribute, the LSTM model integrated with GNN convolution operates through its input gate, forget gate and output gate, with the hidden state given by:
$h_t = o \odot \tanh(c_t)$
where the convolution operator denotes convolution over the graph structure data, $\odot$ denotes the Hadamard product, and $\sigma(\cdot)$ denotes the sigmoid function; $i$, $f$ and $o$ are the input gate, the forget gate and the output gate; $h$ is the hidden state, $c$ is the cell state, and $W$ is the weight; all nodes in the graph structure share the LSTM layer.
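For orientation, the standard convolutional-LSTM gate equations, with the matrix products replaced by a graph convolution written here as $*$, would take the form below. This is the textbook form and an assumption about the intended equations; the weight subscripts and the bias terms $b$ are hypothetical notation. The last line matches the hidden-state equation given above.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * G_t + W_{hi} * h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * G_t + W_{hf} * h_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * G_t + W_{ho} * h_{t-1} + b_o\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} * G_t + W_{hc} * h_{t-1} + b_c\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```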
Further, step D includes the following steps:
D1. using historical multi-source data over a period of time, the whole combined graph neural network model is trained by backpropagation through time, with the objective of minimizing the bus stop passenger flow prediction error, to obtain a learning function that maps the input multi-source data to the future bus passenger flow; the process is expressed as: given the graph structure historical multi-source data of $\tau$ time periods $[G_{t-\tau+1}, G_{t-\tau+2}, \ldots, G_t]$, and denoting the bus passenger flow output by the combined model for time period $t+l$ as $Y_{t+l}$, the learning function $f(\cdot)$ is:
$Y_{t+l} = f([G_{t-\tau+1}, G_{t-\tau+2}, \ldots, G_t])$
given the real passenger flow in time period $t+l$, the loss during training is calculated from the error between $Y_{t+l}$ and the real passenger flow together with penalty terms,
where $W_1$ and $W_2$ are the weight matrices of the combined model and of the attention mechanism, respectively, and $\lambda_1$ and $\lambda_2$ are penalty factors;
D2. inputting real-time data to obtain the bus passenger flow prediction result: after the multi-source data collected at the current moment are input into the trained combined graph neural network model, the predicted passenger flow of the bus stops at the future moment is obtained.
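The training objective of step D1 combines a prediction error with penalties on the two weight groups $W_1$ and $W_2$. The sketch below shows one plausible instance; the mean squared error and the squared L2 penalty are assumptions, since the exact loss is not reproduced here, and `regularized_loss` and the sample numbers are hypothetical.

```python
def regularized_loss(predicted, actual, w1, w2, lam1, lam2):
    """Mean squared prediction error plus L2 penalties on two weight groups.

    Both the squared error and the squared-norm penalty are assumptions made
    for illustration; w1 and w2 are flattened weight lists.
    """
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    penalty_model = lam1 * sum(w * w for w in w1)      # lambda_1 * ||W_1||^2
    penalty_attention = lam2 * sum(w * w for w in w2)  # lambda_2 * ||W_2||^2
    return mse + penalty_model + penalty_attention


# Made-up predicted and real passenger flows for two bus stops.
loss = regularized_loss(
    predicted=[10.0, 12.0],
    actual=[11.0, 14.0],
    w1=[1.0, 2.0],
    w2=[3.0],
    lam1=0.1,
    lam2=0.01,
)
print(round(loss, 2))  # 3.09 = 2.5 (error) + 0.5 + 0.09 (penalties)
```

Larger penalty factors shrink the corresponding weights more strongly, which trades some fit on the training data for reduced overfitting.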
The invention has the following beneficial effects: (1) it constructs the spatial correlation of bus stop passenger flow data, by building a graph structure and graph structure data from multi-source data related to bus passenger flow;
(2) it improves prediction accuracy and integrates multi-source data: by combining the graph neural network (GNN) with the deep learning LSTM, multi-source data influencing bus passenger flow, including weather, calendar, road operating state, the association between upstream and downstream stops, POI and the like, are integrated;
(3) it quantifies the contribution of different factors to prediction accuracy: the added attention mechanism improves the accuracy of bus passenger flow prediction.
Drawings
FIG. 1 is a logic block diagram of the combined graph neural network method for predicting public transport passenger flow based on multi-source data according to the present invention;
FIG. 2 is a diagram of the attribute graph construction of the present invention;
FIG. 3 is a structural diagram of the combined graph neural network model for bus passenger flow prediction.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
As shown in FIG. 1, the combined graph neural network method for predicting public transport passenger flow based on multi-source data comprises the following steps:
step A: collecting multi-source data related to bus passenger flow;
step A specifically comprises the following steps:
step A1: obtaining static information, including the road topology of the bus network, basic information of bus stops and land-use information around the stops (such as POI);
step A2: acquiring historical and real-time dynamic information, including the boarding and alighting passenger flow at bus stops, the operating state of the roads along bus routes, weather and calendar;
step A3: generating a feature vector set of the multi-source data.
step B: constructing an attribute graph and generating graph structure data;
step B specifically comprises the following steps:
step B1: constructing an attribute graph as shown in FIG. 2 by simplifying the topological structure of a bus route into a directed graph comprising nodes and edges; a node represents a bus stop; an edge represents the association between different stops; an upstream bus stop is considered to be associated with all of its downstream bus stops; an edge connecting an upstream bus stop with its adjacent downstream bus stop is called a real edge, and all other edges are called virtual edges;
step B2: generating graph structure data by encoding the multi-source data into the attribute graph, wherein the graph structure multi-source data of the t-th time period is $G_t = (N, E, V_t, A_t, u_t)$, where $N$ is the node set, corresponding to the bus stops; $E$ is the edge set, corresponding to the associations among the stops; $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; $A_t$ is the edge attribute of the t-th time period, comprising the association between bus stops and the various factors influencing that association; $u_t$ is the global attribute, referring to the factors shared by the passenger flow data of all bus stops;
step B3: dividing the data set, wherein the feature vector set of the multi-source data is divided into a training data set, a validation data set and a test data set according to a given proportion.
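Step B3 leaves the proportion open. A minimal sketch of one way to divide the time periods is given below; the 70/10/20 percentages, the chronological (unshuffled) ordering and the name `chronological_split` are all assumptions chosen for illustration.

```python
def chronological_split(samples, train_pct=70, val_pct=10):
    """Split time-ordered samples into training, validation and test sets.

    Percentages are integers to avoid floating-point rounding; whatever
    remains after the training and validation parts becomes the test set.
    """
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# Twenty consecutive time periods, identified here only by their index.
periods = list(range(20))
train_set, val_set, test_set = chronological_split(periods)
print(len(train_set), len(val_set), len(test_set))  # 14 2 4
```

Keeping the order chronological means the model is always evaluated on periods later than those it was trained on, which mirrors how the prediction is used in step D2.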
As shown in FIG. 3, step C: building a combined graph neural network model for bus passenger flow prediction;
step C specifically comprises the following steps:
C1: building an input layer model and inputting the graph structure data;
C2: building a hidden layer model that models the spatio-temporal relationship of the graph structure data; the influence of different factors on the prediction is quantified by adding an attention mechanism to the attributes of the graph structure data, the spatial correlation is constructed by the graph neural network (GNN), and the GNN is then integrated into the long short-term memory network (LSTM) to model the spatio-temporal relationship;
C3: building an output layer model that outputs the final bus passenger flow prediction result;
step C2 specifically includes the following steps:
C2.1: quantifying the contribution of each factor using an attention mechanism; each type of attribute in the graph structure $G_t = (N, E, V_t, A_t, u_t)$, for example the node attribute $v_t$, is composed of a plurality of feature vectors;
in step C2.1, the attribute $v_t$ encodes $n_{fa}$ types of factors, where the k-th factor has its own feature vector at time t, and the attention mechanism quantifies the contribution of each factor using the learnable parameters $z_l$, $\beta_l$, $\theta_l$ and $b_l$ and the hidden state $h_{t-1}$;
the calculation is described for the node attribute and applies equally to the edge attribute and the global attribute; for the edge attribute, the hidden state of the edge connecting node i and node j is the sum of the two corresponding hidden states $h_i$ and $h_j$; for the global attribute, the hidden state is the sum of the hidden states of all nodes.
C2.2: constructing a combined unit of the graph neural network (GNN) and the LSTM, in which the matrix multiplication in the LSTM unit is replaced with GNN convolution, wherein the GNN convolution process comprises first updating the edge attributes, then updating the node attributes, and finally updating the global attribute.
In step C2.2, the edge attribute is updated as follows:
$a'_{i,j} = \phi_a(a_{i,j}, v_i, v_j, u)$
where $a_{i,j}$ is the attribute of the edge from upstream node i to downstream node j; $v_i$ and $v_j$ are the node attributes of node i and node j; $u$ is the global attribute; $R_i$ is the number of edges connecting node i; $\phi_a$ is the update function of the edge attribute, which computes the updated attribute of each edge; the updated edge attribute $a'_{i,j}$ represents the influence of the upstream node on the downstream node; the aggregation function $\rho_{a \to v}$ aggregates the updated attributes of all $R_i$ edges connected to node i into one vector; the aggregation function $\rho_{a \to u}$ aggregates the set of all updated edge attributes in the graph structure into one aggregated edge attribute.
In step C2.2, the node attribute is updated as follows: the update function $\phi_v$ of the node attribute obtains the updated attribute $v'_i$ of node i from the aggregated updated edge attribute of node i, the attribute of node i and the global attribute; the set of updated node attributes is aggregated into one vector by the aggregation function $\rho_{v \to u}$.
In step C2.2, the global attribute is updated as follows: the update function $\phi_u$ of the global attribute obtains the updated global attribute $u'$ from the aggregated updated edge attribute and the aggregated updated node attribute.
In the process of updating the global attribute, the LSTM model integrated with GNN convolution operates through its input gate, forget gate and output gate, with the hidden state given by:
$h_t = o \odot \tanh(c_t)$
where the convolution operator denotes convolution over the graph structure data, $\odot$ denotes the Hadamard product, and $\sigma(\cdot)$ denotes the sigmoid function; $i$, $f$ and $o$ are the input gate, the forget gate and the output gate; $h$ is the hidden state, $c$ is the cell state, and $W$ is the weight; all nodes in the graph structure share the LSTM layer.
Step D: training the model to obtain a prediction result.
Step D comprises the following steps:
D1. using historical multi-source data over a period of time, the whole combined graph neural network model is trained by backpropagation through time, with the objective of minimizing the bus stop passenger flow prediction error, to obtain a learning function that maps the input multi-source data to the future bus passenger flow; the process is expressed as: given the graph structure historical multi-source data of $\tau$ time periods $[G_{t-\tau+1}, G_{t-\tau+2}, \ldots, G_t]$, and denoting the bus passenger flow output by the combined model for time period $t+l$ as $Y_{t+l}$, the learning function $f(\cdot)$ is:
$Y_{t+l} = f([G_{t-\tau+1}, G_{t-\tau+2}, \ldots, G_t])$
given the real passenger flow in time period $t+l$, the loss during training is calculated from the error between $Y_{t+l}$ and the real passenger flow together with penalty terms,
where $W_1$ and $W_2$ are the weight matrices of the combined model and of the attention mechanism, respectively, and $\lambda_1$ and $\lambda_2$ are penalty factors;
D2. inputting real-time data to obtain the bus passenger flow prediction result: after the multi-source data collected at the current moment are input into the trained combined graph neural network model, the predicted passenger flow of the bus stops at the future moment is obtained.
In step B2, $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; if there are $n_{no}$ nodes, the node attribute of the t-th time period is defined as the set of the attributes of these $n_{no}$ nodes, where the attribute of the i-th node consists of the feature vectors of the boarding and the alighting passenger flow data of the i-th bus stop in the t-th time period;
$A_t$ is the edge attribute of the t-th time period, comprising the association between bus stops and the various factors influencing that association; if there are $n_{ed}$ edges, the edge attribute of the t-th time period is the set of the attributes of these $n_{ed}$ edges, where each edge attribute is composed of feature vectors of the factors influencing the association of the boarding and alighting passenger flow data between bus stops, which may include but are not limited to the following four types of factors:
(1) spatial proximity, measured by the travel distance along the bus line between bus stops; if $dis(i, j)$ denotes the road driving distance between upstream bus stop i and downstream bus stop j in the same driving direction, the spatial proximity of the edge connecting i and j is determined from $dis(i, j)$;
(2) temporal influence, measured by the similarity between the passenger flow data of the upstream bus stop and that of the downstream bus stop; given the boarding and alighting passenger flow data of each bus stop over $\tau$ time periods, the passenger flow at upstream bus stop i forms a time series of $\tau$ values and the passenger flow at downstream bus stop j forms a time series of $\tau$ values, and the temporal influence $S_{ti(i,j)}$ of stop i on stop j is calculated from $D(i, j)$ using DTW (Dynamic Time Warping), where $D(i, j)$ is the DTW-based distance measure between the two time series;
(3) semantic similarity, measured by the land use of the area surrounding the bus stops; the POIs in the area surrounding a bus stop characterize the land use of that area and are used to calculate the semantic distance between two bus stops; the density distribution of the different POI categories is computed from the POIs of the land around the bus stop; supposing there are $n_{poi}$ POI categories, the land use of bus stop i can be represented as a vector $p_i$ of length $n_{poi}$, in which each dimension is the density of nearby POIs of one category, and the semantic similarity of the edge connecting bus stops i and j can be calculated as $S_{se(i,j)} = p_i \cdot p_j$;
(4) traffic influence, measured by the operating state of the road between two bus stops; the road operating state between bus stop i and bus stop j can be represented by the average travel speed between them; combining the above four types of factors, the attribute of each edge in the t-th time period is composed of these four factors, and $n_{ed}$ is the number of edges;
$u_t$ is the global attribute of the t-th time period, referring to the factors shared by the passenger flow data of all bus stops; weather conditions and calendar features are regarded as global attributes closely related to bus stop passenger flow data; the calendar features specifically refer to the time of day, the day of the week, the day of the month, the month of the year, and the date type (i.e., weekday or holiday); the global attribute of the t-th time period is composed of a feature representing the weather condition and a feature representing the calendar.
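The calendar part of the global attribute can be derived directly from a timestamp. The sketch below is one possible encoding and not the one prescribed by the invention: the name `calendar_features`, the dictionary keys, the optional `holidays` set, and the choice to treat Saturdays and Sundays as the holiday date type are all assumptions.

```python
from datetime import date, datetime


def calendar_features(ts, holidays=frozenset()):
    """Derive the five calendar features named above from a timestamp.

    `holidays` is an optional set of `date` objects for public holidays;
    weekends are also counted as holidays here (an assumption).
    """
    is_holiday = ts.date() in holidays or ts.weekday() >= 5
    return {
        "hour_of_day": ts.hour,
        "day_of_week": ts.weekday(),  # Monday is 0, Sunday is 6
        "day_of_month": ts.day,
        "month_of_year": ts.month,
        "date_type": "holiday" if is_holiday else "weekday",
    }


monday_morning = calendar_features(datetime(2022, 4, 25, 8, 30))
print(monday_morning["day_of_week"], monday_morning["date_type"])  # 0 weekday

# A weekday listed in the holiday set is labelled as a holiday.
labour_day = calendar_features(datetime(2022, 5, 2, 8, 30), holidays={date(2022, 5, 2)})
print(labour_day["date_type"])  # holiday
```

The weather feature would be concatenated with these values to form the global attribute shared by all bus stops in a time period.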
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.
Claims (10)
1. A public transport passenger flow combined graph neural network prediction method based on multi-source data, characterized by comprising the following steps:
step A: collecting multi-source data related to bus passenger flow;
step B: constructing an attribute graph and generating graph structure data;
step C: building a combined graph neural network model for bus passenger flow prediction;
step D: training the model to obtain a prediction result.
2. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 1, wherein step A specifically comprises the following steps:
step A1: obtaining static information, including the road topology of the bus network, basic information of the bus stops, and information on the land around the stops;
step A2: acquiring historical and real-time dynamic information, wherein the dynamic information comprises the boarding and alighting passenger flow at bus stops, the running state of the roads on which the bus routes lie, the weather, and the calendar;
step A3: generating a feature vector set of the multi-source data.
3. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 1, wherein step B specifically comprises the following steps:
step B1: constructing an attribute graph by simplifying the topological structure of a bus route into a directed graph comprising nodes and edges; a node represents a bus stop; an edge represents the association between different stops, each upstream bus stop being considered associated with all of its downstream bus stops; an edge connecting an adjacent upstream stop and downstream stop is called a real edge, and the remaining edges are called virtual edges;
step B2: generating graph structure data by encoding the multi-source data into the attribute graph, wherein the graph-structured multi-source data of the t-th time period is $G_t = (N, E, V_t, A_t, u_t)$, in which N is the node data set, corresponding to the bus stops; E is the edge data set, corresponding to the associations among the stops; $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; $A_t$ is the edge attribute of the t-th time period, comprising the associations between bus stops and the various factors influencing those associations; $u_t$ is the global attribute of the t-th time period, that is, the factors shared by the passenger flow data of all bus stops;
step B3: dividing the data set, wherein the feature vector set of the multi-source data is divided into a training data set, a verification data set and a test data set according to a certain proportion.
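Steps B1 and B3 can be sketched as follows for a single route in one driving direction. The function names, the stop identifiers, the 70/10/20 ratios, and the chronological (unshuffled) split are assumptions; the claim only states that every upstream stop is linked to all of its downstream stops and that the data are split "according to a certain proportion".

```python
def build_edges(route):
    """Split the directed edges of one route into real and virtual edges.

    route: ordered list of stop ids along one driving direction.
    Real edges join adjacent stops; virtual edges join every other
    upstream/downstream pair.
    """
    real, virtual = [], []
    for a in range(len(route)):
        for b in range(a + 1, len(route)):
            edge = (route[a], route[b])  # upstream stop -> downstream stop
            if b == a + 1:
                real.append(edge)
            else:
                virtual.append(edge)
    return real, virtual

def chronological_split(samples, ratios=(0.7, 0.1, 0.2)):
    """Split time-ordered samples into train / validation / test parts.

    Assumption: hypothetical 70/10/20 ratios, no shuffling.
    """
    n_train = round(len(samples) * ratios[0])
    n_val = round(len(samples) * ratios[1])
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

real, virtual = build_edges(["s1", "s2", "s3", "s4"])
print(real)
print(virtual)
```

A route with n stops yields n - 1 real edges and (n - 1)(n - 2) / 2 virtual edges, so the edge count grows quadratically with route length.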
4. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 3, characterized in that: in step B2, $V_t$ is the node attribute of the t-th time period, comprising the boarding and alighting passenger flow data of the bus stops; if there are $n_{no}$ nodes, the node attribute of the t-th time period is defined over those $n_{no}$ nodes, its elements being the feature vectors of the boarding passenger flow data and the alighting passenger flow data of the i-th bus stop in the t-th time period;
$A_t$ is the edge attribute of the t-th time period, comprising the associations between bus stops and the various factors influencing those associations; if there are $n_{ed}$ edges, the edge attribute of the t-th time period is defined over those $n_{ed}$ edges, each element being a feature vector of the factors influencing the relevance of the boarding and alighting passenger flow data between bus stops, which may specifically include but is not limited to the following four types of factors:
(1) the spatial proximity is measured by the travel distance along the bus lines between bus stops; if dis(i, j) denotes the road driving distance between upstream bus stop i and downstream bus stop j in the same driving direction, the spatial proximity of the edge connecting i and j is determined from dis(i, j);
(2) the time influence degree is measured by the similarity between the passenger flow data of the upstream bus stop and that of the downstream bus stop; given the boarding and alighting passenger flow data of each bus stop over τ time periods, the passenger flow at upstream bus stop i forms a time series of τ values and the passenger flow at downstream bus stop j forms a time series of τ values, and the time influence degree $S_{ti(i,j)}$ of stop i on stop j is calculated from $D(i,j)$ using DTW, where $D(i,j)$ is the DTW-based distance measure between the two time series;
(3) the semantic similarity is measured by the land utilization condition of the area around the bus stop; the POIs in the area around a bus stop represent the land utilization condition of that area and are used for calculating the semantic distance between two bus stops, the density distribution of the different POI categories being calculated from the POIs of the land around the bus stop; suppose there are $n_{poi}$ categories of POI, then the land utilization of bus stop i may be represented as a vector $p_i$ of length $n_{poi}$, where each dimension represents the density of nearby POIs of a particular category, and the semantic similarity of the edge connecting bus stops i and j may be calculated as $S_{se(i,j)} = p_i \cdot p_j$;
(4) the traffic influence degree is measured by the running state of the road between two bus stops; the road running state between two bus stops can be represented by the average travel speed, i.e. the average driving speed between bus stop i and bus stop j; combining the above four types of factors that influence the edge attribute, the edge attribute $A_t$ is obtained for the t-th time period, where $n_{ed}$ is the number of edges;
$u_t$ is the global attribute of the t-th time period, that is, the factors shared by the passenger flow data of all bus stops; weather conditions and calendar features are regarded as global attributes closely related to bus stop passenger flow data; calendar features specifically refer to the time of day, the day of the week, the day of the month, the month of the year, and the date type; the global attribute of the t-th time period consists of a weather condition feature and a calendar feature.
5. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 1, characterized in that step C specifically comprises the following steps:
step C1: building an input layer model and inputting the graph structure data;
step C2: building a hidden layer model and modeling the spatio-temporal relations of the graph structure data;
step C3: building an output layer model and outputting the final bus passenger flow prediction result;
wherein step C2 specifically comprises the following steps:
step C2.1: quantifying the contribution of each factor using an attention mechanism, each type of attribute (e.g. $v_t$) in the graph structure $G_t = (N, E, V_t, A_t, u_t)$ being composed of a plurality of feature vectors;
step C2.2: constructing a combined unit of the graph neural network (GNN) and the long short-term memory network (LSTM) by replacing the matrix multiplication in the LSTM unit with GNN convolution, wherein the GNN convolution process updates the edge attributes first, then the node attributes, and finally the global attribute.
6. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 5, characterized in that: in step C2.1, the attribute $v_t$ encodes $n_{fa}$ types of factors, each factor k having its own feature vector at the t-th moment, and the contribution of each factor is quantified by the attention mechanism as follows:
wherein $z_l$, $\beta_l$, $\theta_l$ and $b_l$ are learnable parameters and $h_{t-1}$ is the hidden state; the calculation process takes the node attribute as an example and applies equally to the edge attribute and the global attribute; for the edge attribute, the hidden state of the edge connecting node i and node j is the sum of the two corresponding node hidden states $h_i$ and $h_j$; for the global attribute, the hidden state is the sum of the hidden states of all nodes.
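Since the attention formula itself did not survive in this text, the sketch below shows one common additive-attention form that uses the same kinds of quantities: learnable z, β, θ, b and the previous hidden state. The exact functional form, the tensor shapes, and the function name are assumptions, not the claimed equation.

```python
import numpy as np

def factor_attention(x, h_prev, z, beta, theta, b):
    """Weight n_fa factor feature vectors by additive attention.

    Assumed form: e_k = z . tanh(beta h_prev + theta x_k + b),
    alpha = softmax(e).
    Shapes: x (n_fa, d_x), h_prev (d_h,), z (d_a,), beta (d_a, d_h),
    theta (d_a, d_x), b (d_a,).
    """
    scores = np.tanh(h_prev @ beta.T + x @ theta.T + b) @ z  # (n_fa,)
    scores = scores - scores.max()            # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha, alpha[:, None] * x          # weights, re-weighted factors

def edge_hidden_state(h, i, j):
    """Hidden state used for the edge (i, j): the sum h_i + h_j."""
    return h[i] + h[j]

def global_hidden_state(h):
    """Hidden state used for the global attribute: sum over all nodes."""
    return h.sum(axis=0)

# Hypothetical sizes: 4 factors of dimension 3, hidden size 5.
rng = np.random.default_rng(0)
x, h_prev = rng.normal(size=(4, 3)), rng.normal(size=5)
alpha, weighted = factor_attention(
    x, h_prev, rng.normal(size=6), rng.normal(size=(6, 5)),
    rng.normal(size=(6, 3)), rng.normal(size=6))
print(alpha.sum())
```

The two helper functions mirror the rule stated in the claim for choosing the hidden state when the same attention is applied to edge and global attributes.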
7. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 5, characterized in that: in step C2.2, the edge attributes are updated as follows:
$a'_{i,j} = \phi_a(a_{i,j}, v_i, v_j, u)$
where $a_{i,j}$ is the attribute of the edge from upstream node i to downstream node j; $v_i$ and $v_j$ are the node attributes of node i and node j; u is the global attribute; the updated attributes of all edges connecting node i form a set, $R_i$ being the number of edges connecting node i; $\phi_a$ is the update function for edge attributes and computes the updated attribute of each edge; $a'_{i,j}$ is the updated edge attribute, representing the influence of the upstream node on the downstream node; the aggregation function $\rho_{a \to v}$ aggregates all updated edge attributes connected with node i into one vector; the aggregation function $\rho_{a \to u}$ aggregates the set of all updated edge attributes in the graph structure into one aggregated edge attribute.
8. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 1, characterized in that: in step C2.2, the node attributes are updated as follows:
where the update function $\phi_v$ for node attributes obtains the updated attribute $v'_i$ of node i from the aggregated updated edge attribute of node i, the attribute of node i, and the global attribute; the set of updated node attributes is aggregated into one vector by the aggregation function $\rho_{v \to u}$.
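The edge, node, and global update order of the GNN convolution can be sketched as a single pass. Only the update order and the arguments of the update functions come from the claims; modelling each update function as one linear layer with tanh, using the mean as every aggregation function, aggregating over incoming edges only, and the name `gn_block` are all assumptions.

```python
import numpy as np

def gn_block(V, A, u, senders, receivers, Wa, Wv, Wu):
    """One GNN convolution pass: edges, then nodes, then global attribute.

    V (n_no, d_v) node attrs, A (n_ed, d_a) edge attrs, u (d_u,) global.
    senders / receivers: integer arrays giving each edge's upstream and
    downstream node index.
    """
    n_no, n_ed = V.shape[0], A.shape[0]
    # 1) a'_ij = phi_a(a_ij, v_i, v_j, u)
    edge_in = np.concatenate(
        [A, V[senders], V[receivers], np.tile(u, (n_ed, 1))], axis=1)
    A_new = np.tanh(edge_in @ Wa)
    # 2) rho_{a->v}: mean of the updated edges arriving at each node
    agg = np.zeros((n_no, A_new.shape[1]))
    counts = np.zeros(n_no)
    np.add.at(agg, receivers, A_new)
    np.add.at(counts, receivers, 1)
    agg = agg / np.maximum(counts, 1)[:, None]
    # 3) v'_i = phi_v(aggregated edges of i, v_i, u)
    node_in = np.concatenate([agg, V, np.tile(u, (n_no, 1))], axis=1)
    V_new = np.tanh(node_in @ Wv)
    # 4) u' = phi_u(rho_{a->u}(A'), rho_{v->u}(V'), u)
    global_in = np.concatenate([A_new.mean(axis=0), V_new.mean(axis=0), u])
    u_new = np.tanh(global_in @ Wu)
    return A_new, V_new, u_new

# Hypothetical 3-stop route: real edges 0->1, 1->2 and virtual edge 0->2.
rng = np.random.default_rng(1)
V, A, u = rng.normal(size=(3, 2)), rng.normal(size=(3, 4)), rng.normal(size=3)
senders, receivers = np.array([0, 1, 0]), np.array([1, 2, 2])
Wa, Wv, Wu = (rng.normal(size=(11, 4)), rng.normal(size=(9, 2)),
              rng.normal(size=(9, 3)))
A2, V2, u2 = gn_block(V, A, u, senders, receivers, Wa, Wv, Wu)
print(A2.shape, V2.shape, u2.shape)
```

Because each stage consumes the output of the previous one, information from an upstream stop reaches the downstream node attribute and the global attribute within a single pass.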
9. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 8, characterized in that: in the process of updating the global attribute, the LSTM model integrated with GNN convolution operates as follows:
$h_t = o \odot \tanh(c_t)$
wherein $*$ denotes convolution over the graph-structured data, $\odot$ denotes the Hadamard product, and $\sigma(\cdot)$ denotes the sigmoid function; i, f and o are the input gate, the forget gate and the output gate; h is the hidden state, c is the cell state, and W is the weight; all nodes in the graph structure share the LSTM layer.
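For orientation, a textbook LSTM cell whose matrix multiplications are replaced by a graph convolution would take the form below. Only the last line appears in this text; the gate equations, the weight names $W_{x\cdot}$ and $W_{h\cdot}$, the biases $b_\cdot$, and the use of $*$ for graph convolution are assumptions based on the standard LSTM, not a reproduction of the patent's equations.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here $x_t$ stands for the attention-weighted attributes of the graph at time period t, and the same weights are applied at every node, which matches the statement that all nodes share the LSTM layer.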
10. The multi-source data-based bus passenger flow combined graph neural network prediction method according to claim 1, characterized in that step D comprises the following steps:
D1. training the whole combined graph neural network model on historical multi-source data over a period of time, using backpropagation through time with the objective of minimizing the bus stop passenger flow prediction error, to obtain a learning function that maps the input multi-source data to future bus passenger flow; the process is expressed as: given τ time periods of graph-structured historical multi-source data $[G_{t-\tau+1}, G_{t-\tau+2}, \ldots, G_t]$, with $Y_{t+l}$ the bus passenger flow output by the combined model for time period t+l, the learning function f(·) is:
given the real passenger flow in time period t+l, the loss during training is then calculated by:
wherein $W_1$ and $W_2$ are the weight matrices of the combined model and of the attention mechanism, respectively, and $\lambda_1$ and $\lambda_2$ are penalty factors;
D2. inputting real-time data to obtain a bus passenger flow prediction result; after the multi-source data collected at the current moment are input into the trained combined graph neural network model, the predicted passenger flow of the bus stops at a future moment is obtained.
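The training setup of step D1 can be sketched in two parts: cutting the history into (τ-period input, t+l target) pairs, and a regularized loss. The loss formula is missing from this text, so the mean squared error plus L2 penalties on $W_1$ and $W_2$ shown here is an assumed, commonly used form; the function names are hypothetical.

```python
import numpy as np

def make_windows(graphs, tau, l):
    """Pair each window [G_{t-tau+1}, ..., G_t] with the target index t + l.

    graphs: time-ordered list of per-period graph data (any objects).
    Returns a list of (window, target_index) tuples.
    """
    pairs = []
    for start in range(len(graphs) - tau - l + 1):
        t = start + tau - 1                   # last period in the window
        pairs.append((graphs[start:start + tau], t + l))
    return pairs

def regularized_loss(y_pred, y_true, W1, W2, lam1, lam2):
    """Assumed loss: MSE + lam1 * ||W1||^2 + lam2 * ||W2||^2."""
    y_pred, y_true = np.asarray(y_pred, float), np.asarray(y_true, float)
    mse = np.mean((y_pred - y_true) ** 2)
    penalty = lam1 * np.sum(np.square(W1)) + lam2 * np.sum(np.square(W2))
    return float(mse + penalty)

# Hypothetical usage: 6 periods of history, tau = 3, horizon l = 1.
print(make_windows(list(range(6)), tau=3, l=1))
print(regularized_loss([1.0, 2.0], [1.0, 4.0],
                       np.array([[1.0, 1.0]]), np.array([2.0]), 0.5, 0.25))
```

Separate penalty factors let the attention parameters be regularized more or less strongly than the GNN-LSTM weights.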
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210436660.9A CN114912669A (en) | 2022-04-24 | 2022-04-24 | Public transport passenger flow combined graph neural network prediction method based on multi-source data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210436660.9A CN114912669A (en) | 2022-04-24 | 2022-04-24 | Public transport passenger flow combined graph neural network prediction method based on multi-source data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114912669A (en) | 2022-08-16 |
Family
ID=82764528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210436660.9A Pending CN114912669A (en) | 2022-04-24 | 2022-04-24 | Public transport passenger flow combined graph neural network prediction method based on multi-source data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114912669A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115580547A (en) * | 2022-11-21 | 2023-01-06 | 中国科学技术大学 | Website fingerprint identification method and system based on time-space correlation between network data streams |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||