CN111242395B - Method and device for constructing prediction model for OD (origin-destination) data - Google Patents
- Publication number
- CN111242395B (application CN202010336521.XA)
- Authority
- CN
- China
- Prior art keywords
- time
- prediction
- space
- data set
- interval
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G06Q50/40—
Abstract
The invention belongs to the technical field of artificial intelligence prediction and discloses a method and a device for constructing a prediction model for OD data. The method comprises the following steps: step S1: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set; step S2: constructing an OD sparse space-time residual network model according to the processed data; step S3: training the OD sparse space-time residual network model by using the back propagation rule of a deep neural network and the Adam algorithm; step S4: verifying and outputting the OD sparse space-time residual network model. Because the OD sparse space-time residual network describes the complex space-time dependence and distribution characteristics of OD data, the OD data can be predicted accurately.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence prediction, and particularly relates to a method and a device for constructing a prediction model for OD data.
Background
OD data refers to the magnitude of a social or economic interaction variable from an origin (point o) to a destination (point d) within a certain period of time, and is also called the OD quantity. Such interactions between spatial geographic units within a time period include the scale of economic trade, the frequency of social communication, and the volume of passenger and freight transport between regions. The set of OD quantities between all origins (o) and destinations (d) is called the OD matrix. Prediction of OD data is an important basis for planning and deciding on socio-economic activities between regions. Generally, the task of OD data prediction is to predict the OD data of a subsequent period given historical OD data (and the corresponding environmental information), thereby supporting the organization of socio-economic activities and management decisions.
Taking passenger transport as an example, for a specific rail transit network, the passenger volume between a specific origin (point o) and a specific destination (point d) is called the OD passenger flow. The set of OD passenger flows between all origins (o) and destinations (d) is called the OD passenger flow matrix. The distribution and evolution of OD data are complex space-time phenomena, and accurate passenger flow demand prediction remains challenging, mainly in the following respects.
1) Multiple time dependence
OD data exhibits several kinds of time dependence simultaneously. Considering the time dependence of only one dimension rarely yields ideal prediction results.
Current day trend dependence: for example, the OD passenger flow demand in the period under study tends to depend on the OD passenger flow in the several periods immediately preceding it on the same day.
Daily interval period dependence: for example, the OD passenger flow at a particular time depends not only on the historical passenger flow at nearby times, but is also correlated with the historical passenger flow at the same time one day (or several days) earlier.
Weekly interval period dependence: the OD passenger flow at a particular time is correlated with the historical passenger flow at the same time one week (or several weeks) earlier.
2) Complex spatial dependence
OD data also has complex spatial dependence, which can be divided by cause into origin dependence and destination dependence. These two spatial dependencies differ significantly from the spatial dependence found in conventional spatial data. Identifying them accurately is an important basis for accurately predicting OD passenger flow.
Origin dependence: there is a correlation between OD quantities that start from adjacent origins and end at the same destination. For example, in a subway network, the travel volumes of passengers who depart from adjacent stations near a residential area and arrive at the same station in the city center follow the same pattern of change.
Destination dependence: there is a correlation between OD quantities that start from the same origin and end at adjacent destinations. For example, in a subway network, the travel volumes of passengers who depart from one station in a residential area and arrive at adjacent stations in the city center are strongly positively correlated.
3) Sparse distribution
Apart from a few elements, most elements of the OD matrix are 0, so the OD matrix is clearly sparse. For example, owing to factors such as the nature of each station's service area, residents' travel behavior, route accessibility, and competition from other modes of transport, most origin-destination station pairs in a rail transit network have no passenger travel demand. When OD data is predicted from temporal and spatial dependence alone, without considering the sparse distribution of passenger flow demand, the predicted value of an OD element whose actual value is 0 is often greater than 0. This is called the space-time dependence overflow phenomenon of the prediction model. When it occurs, the predicted OD values deviate substantially from the actual values, which reduces the prediction accuracy of the model and increases the risk of overfitting.
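The overflow phenomenon described above can be made concrete with a small sketch. The function names `sparsity` and `overflow_count` and the toy matrices are illustrative assumptions, not part of the patent; the code only measures the share of zero-valued cells and counts the cells whose prediction is positive although the actual value is 0.

```python
from typing import List

Matrix = List[List[float]]


def sparsity(actual: Matrix) -> float:
    """Fraction of OD cells whose actual value is exactly 0."""
    cells = [v for row in actual for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)


def overflow_count(actual: Matrix, predicted: Matrix) -> int:
    """Number of cells with actual value 0 but predicted value > 0."""
    return sum(
        1
        for a_row, p_row in zip(actual, predicted)
        for a, p in zip(a_row, p_row)
        if a == 0 and p > 0
    )


# Toy 3-station example (hypothetical numbers).
actual = [[0, 12, 0], [9, 0, 0], [0, 0, 0]]
predicted = [[0.4, 11.0, 0.0], [8.5, 0.2, 0.0], [0.0, 0.3, 0.0]]
print(sparsity(actual))                   # share of zero cells
print(overflow_count(actual, predicted))  # cells that overflowed
```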
Accurately predicting OD demand while accounting for the complex space-time correlation and sparse distribution of OD data is a problem that urgently needs to be solved. Many scholars have proposed OD data prediction models based on traditional time series and machine learning methods, mainly of the following types: (a) linear prediction models, such as time series prediction models and Kalman filtering models; (b) nonlinear prediction models, such as wavelet prediction models, chaotic prediction models, and nonparametric prediction models; (c) simulation prediction models, such as cellular automaton prediction methods and traffic simulation prediction methods; (d) shallow machine learning prediction models, such as support vector machines and shallow neural networks.
These research results are significant for OD data prediction, but they have limitations. Linear prediction models, when applied to prediction problems with strong randomness and nonlinearity, struggle to fully reflect the distribution and evolution of short-term passenger flow data; nonlinear prediction models can describe the nonlinear characteristics of OD data, but their prediction accuracy on massive, fine-grained, short-term passenger flow data still needs improvement; simulation prediction models generally have high modeling costs, and their computational efficiency rarely meets timeliness requirements; and shallow machine learning prediction models are prone to overfitting or underfitting when processing big data.
In recent years, the successful application of deep learning in various fields has stimulated research into its application in traffic and transportation. For example, some studies treat the road network traffic of an entire city as a heat map (where each pixel value represents the traffic within the corresponding region) and model the nonlinear spatial dependence with convolutional neural networks (CNNs). Other researchers have proposed using recurrent neural networks (RNNs) to build nonlinear time-dependent models for traffic flow prediction. Subsequent research has fused CNNs and RNNs into comprehensive prediction models that consider both temporal and spatial dependence, further improving prediction accuracy. However, most of these studies take urban road traffic flow as their object and do not sufficiently consider the complex space-time dependence and sparsity peculiar to OD data, so they have difficulty predicting OD data accurately.
As described above, many prediction models used in conventional space-time prediction methods can output only the predicted value of a single object and are difficult to apply to OD data prediction for an entire system. For example, such methods are difficult to apply to OD passenger flow prediction for a whole rail transit network, whereas optimizing, adjusting, and scheduling real-time rail transit transportation plans often requires predicting the OD passenger flow matrix for all stations within a certain spatial range, or even the whole network, for a future period. The few space-time prediction methods that do suit an entire system fail to consider the complex temporal or spatial dependence specific to OD passenger flow, and also ignore the sparse distribution of OD passenger flow demand. It is therefore necessary to provide a method and a device for constructing a prediction model designed specifically around the characteristics of OD data.
Disclosure of Invention
In view of the above problems, the present invention provides a method for constructing a prediction model for OD data, comprising:
step S1: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;
step S2: constructing an OD sparse space-time residual network model according to the processed data;
step S3: training the OD sparse space-time residual network model by using the back propagation rule of a deep neural network and the Adam algorithm;
step S4: verifying and outputting the OD sparse space-time residual network model.
In the above method for constructing a prediction model, step S1 includes:
selecting at least one interval historical period data set and a current day trend data set by taking different time periods as prediction targets according to historical data, wherein the at least one interval historical period data set comprises: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.
In the above method for constructing a prediction model, step S2 includes:
step S21: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;
step S22: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;
step S23: constructing an OD residual convolution unit, and extracting spatial features from the feature map and the current day trend data set;
step S24: constructing a simplified time sequence processing unit to extract nonlinear time sequence correlation features from the plurality of spatial features, thereby completing space-time feature aggregation and obtaining a historical space-time prediction;
step S25: obtaining a preliminary prediction according to the historical space-time prediction and an external environment prediction.
In the above method for constructing a prediction model, step S2 further includes:
step S26: introducing a sparsity-aware non-zero element attention mechanism to obtain a final prediction from the preliminary prediction.
In the above method for constructing a prediction model, in step S3, a training sample set is formed according to the constructed current day trend data set and at least one interval historical period data set, and model training is performed according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm.
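Step S3 names the Adam algorithm without spelling out its update rule. The following is a minimal sketch of the standard Adam update on a single scalar parameter. The function name `adam_minimize`, the toy loss, and the hyperparameter values (commonly used defaults) are assumptions for illustration and are not taken from the patent. In an actual network the gradient would come from back propagation rather than from a closed-form expression.

```python
import math


def adam_minimize(grad, w0, steps=500, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar function with Adam, given its gradient `grad`."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g      # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)         # bias correction for m
        v_hat = v / (1 - beta2 ** t)         # bias correction for v
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy loss (w - 3)^2 with gradient 2 * (w - 3); hypothetical, for illustration only.
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```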
In the above method for constructing a prediction model, step S23 includes:
step S231: constructing a plurality of depthwise separable one-dimensional convolution layers, and extracting origin dependence and destination dependence features by stacking these layers;
step S232: constructing a one-dimensional point-by-point convolution layer, extracting the spatial correlation structure through it, and aggregating the two spatial dependencies to output spatial features.
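The idea behind step S23 can be sketched in plain Python under strong simplifying assumptions that are not stated in the patent: neighbouring station indices are treated as spatially adjacent stations, every kernel has odd length with zero padding, the point-by-point aggregation is reduced to a per-cell weighted sum of the two feature maps, and a residual connection adds the input back. All names (`depthwise_conv1d_rows`, `od_spatial_features`, the weights) are hypothetical, and no activation or training is shown.

```python
from typing import List

Matrix = List[List[float]]


def depthwise_conv1d_rows(x: Matrix, kernels: List[List[float]]) -> Matrix:
    """Slide a separate odd-length kernel down each column of x (zero padding).

    Column j acts as one channel with its own kernel kernels[j]. As in most
    deep learning libraries, this is cross-correlation (no kernel flip).
    """
    n_rows, n_cols = len(x), len(x[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        half = len(kernels[j]) // 2
        for i in range(n_rows):
            acc = 0.0
            for u, w in enumerate(kernels[j]):
                src = i + u - half
                if 0 <= src < n_rows:
                    acc += w * x[src][j]
            out[i][j] = acc
    return out


def transpose(x: Matrix) -> Matrix:
    return [list(col) for col in zip(*x)]


def od_spatial_features(x: Matrix, origin_kernels: List[List[float]],
                        dest_kernels: List[List[float]],
                        w_origin: float, w_dest: float) -> Matrix:
    """Rows of x are origins, columns are destinations."""
    # Origin dependence: mix neighbouring origins for each fixed destination.
    origin_feat = depthwise_conv1d_rows(x, origin_kernels)
    # Destination dependence: mix neighbouring destinations for each fixed origin.
    dest_feat = transpose(depthwise_conv1d_rows(transpose(x), dest_kernels))
    # Per-cell aggregation of the two feature maps plus a residual connection.
    return [
        [x[i][j] + w_origin * origin_feat[i][j] + w_dest * dest_feat[i][j]
         for j in range(len(x[0]))]
        for i in range(len(x))
    ]


x = [[0.0, 4.0, 0.0], [2.0, 0.0, 6.0], [0.0, 8.0, 0.0]]
smooth = [[1.0, 1.0, 1.0]] * 3  # hypothetical kernel: sum over a 3-station window
print(od_spatial_features(x, smooth, smooth, w_origin=0.5, w_dest=0.5))
```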
In the above method for constructing a prediction model, step S24 includes:
step S241: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
step S242: extracting nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through a plurality of consecutive two-dimensional point-by-point convolution layers;
step S243: obtaining a current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation features;
step S244: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the at least one interval period space-time predicted value.
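A point-by-point (1×1) convolution over time-stacked feature maps treats each time step as an input channel and combines the channels independently at every OD cell. The sketch below illustrates this for step S24 under assumptions that the patent does not specify: a single output channel, a ReLU activation in the hidden layer, and hand-picked weights. The name `pointwise_conv_over_time` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def pointwise_conv_over_time(stack: List[Matrix], weights: List[float],
                             bias: float = 0.0, relu: bool = True) -> Matrix:
    """1x1 convolution whose input channels are the time steps of `stack`."""
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    out = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            value = bias + sum(w * frame[i][j] for w, frame in zip(weights, stack))
            row.append(max(0.0, value) if relu else value)
        out.append(row)
    return out


# Three time-stacked 2x2 spatial feature maps (hypothetical numbers).
stack = [
    [[1.0, 0.0], [2.0, 4.0]],
    [[3.0, 0.0], [2.0, 0.0]],
    [[5.0, 0.0], [2.0, 8.0]],
]
# Two consecutive point-by-point layers: a hidden layer, then a linear output.
hidden = pointwise_conv_over_time(stack, weights=[0.25, 0.25, 0.5])
prediction = pointwise_conv_over_time([hidden], weights=[1.0], relu=False)
print(prediction)
```

Because the weights are shared across all OD cells, the number of parameters grows with the number of time steps rather than with the size of the OD matrix, which is consistent with the stated goal of using fewer parameters than a recurrent network.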
In the above method for constructing a prediction model, step S26 includes:
step S261: obtaining an OD matrix mean value according to at least one interval historical period data set and a current day trend data set;
step S262: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
step S263: filtering the preliminary predicted value through the non-zero element attention matrix to obtain the final predicted value of the OD matrix for a given period.
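Steps S261 to S263 can be sketched as follows. The exact non-zero activation function is not given in this passage, so the sketch assumes a simple step function (1 where the historical mean exceeds a threshold, 0 otherwise); the function names and the `threshold` parameter are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def mean_matrix(history: List[Matrix]) -> Matrix:
    """Element-wise mean of the historical OD matrices (step S261)."""
    n = len(history)
    rows, cols = len(history[0]), len(history[0][0])
    return [[sum(m[i][j] for m in history) / n for j in range(cols)]
            for i in range(rows)]


def nonzero_attention(mean: Matrix, threshold: float = 0.0) -> Matrix:
    """Assumed step activation: 1.0 where the mean is above threshold (step S262)."""
    return [[1.0 if v > threshold else 0.0 for v in row] for row in mean]


def apply_attention(preliminary: Matrix, attention: Matrix) -> Matrix:
    """Element-wise filtering of the preliminary prediction (step S263)."""
    return [[p * a for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(preliminary, attention)]


history = [[[0.0, 2.0], [0.0, 0.0]], [[0.0, 4.0], [1.0, 0.0]]]
preliminary = [[0.3, 2.5], [0.7, 0.2]]
attention = nonzero_attention(mean_matrix(history))
print(apply_attention(preliminary, attention))  # [[0.0, 2.5], [0.7, 0.0]]
```

Cells that were always 0 in the history are forced to 0 in the final prediction, which suppresses the overflow of space-time dependence into OD pairs with no demand.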
In the above prediction model construction method, in step S4, the OD sparse spatiotemporal residual network model is verified based on the final prediction value.
The invention also provides a device for constructing a prediction model for OD data, comprising:
a preprocessing unit: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;
a model construction unit: constructing an OD sparse space-time residual network model according to the processed data;
a model training unit: training the OD sparse space-time residual network model by using the back propagation rule of a deep neural network and the Adam algorithm;
a verification output unit: verifying and outputting the OD sparse space-time residual network model.
In the above apparatus for constructing a prediction model, the preprocessing unit selects at least one interval history cycle data set and a current day trend data set according to the history data and using different time periods as prediction targets, where the at least one interval history cycle data set includes: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.
In the above device for constructing a prediction model, the model construction unit includes:
a time sliding processing unit: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;
a feature map obtaining unit: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on the at least one sliding interval historical period data set and sliding day interval historical period data to obtain a feature map;
an OD residual convolution unit: extracting spatial features from the feature map and the current day trend data set;
a simplified time sequence processing unit: extracting nonlinear time sequence correlation features from the plurality of spatial features, thereby completing space-time feature aggregation and obtaining a historical space-time prediction;
a preliminary prediction obtaining unit: obtaining a preliminary prediction according to the historical space-time prediction and an external environment prediction.
In the above device for constructing a prediction model, the model construction unit further includes:
a final prediction obtaining unit: introducing a sparsity-aware non-zero element attention mechanism to obtain a final prediction from the preliminary prediction.
In the prediction model construction device, the model training unit forms a training sample set according to the constructed current day trend data set and at least one interval historical period data set, and performs model training according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm.
In the above device for constructing a prediction model, the OD residual convolution unit includes:
a spatial feature extraction module: constructing a plurality of depthwise separable one-dimensional convolution layers, and extracting origin dependence and destination dependence features by stacking these layers;
a spatial feature output module: constructing a one-dimensional point-by-point convolution layer, extracting the spatial correlation structure through it, and aggregating the two spatial dependencies to output spatial features.
In the above device for constructing a prediction model, the simplified time sequence processing unit includes:
a spatial feature set obtaining module: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
a time sequence correlation feature extraction module: extracting nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through a plurality of consecutive two-dimensional point-by-point convolution layers;
a space-time predicted value obtaining module: obtaining a current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation features;
a historical space-time predicted value obtaining module: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the at least one interval period space-time predicted value.
In the above device for constructing a prediction model, the final prediction obtaining unit includes:
an OD matrix mean value obtaining module: obtaining an OD matrix mean value according to the space-time predicted value of the at least one interval period and the current day trend data set;
a conversion module: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
a final predicted value output module: filtering the preliminary predicted value through the non-zero element attention matrix to obtain the final predicted value of the OD matrix for a given period.
In the prediction model construction device, the verification output unit verifies the OD sparse spatiotemporal residual network model based on the final predicted value.
Compared with the prior art, the invention has the following effects: by constructing an OD sparse space-time residual network, OD-SparsesSTnet, that describes the complex space-time dependence and distribution characteristics of OD data, the OD data can be predicted accurately. In OD-SparsesSTnet, to address the complex spatial correlation of OD data, a residual convolution unit, OD_ResUnit, is constructed that identifies origin dependence and destination dependence features simultaneously, thereby describing the complex spatial dependence accurately; to address the large training sample and computing resource requirements of existing recurrent neural networks when predicting an OD matrix, a simplified time sequence processing unit, Simp_SeqUnit, is constructed that extracts and predicts the nonlinear temporal features of all OD data with fewer parameters; and to address the sparse distribution of OD data, a non-zero element attention mechanism is designed, which further improves the prediction accuracy for the non-zero part of the OD matrix.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a predictive model construction method of the present invention;
FIG. 2 is a flowchart illustrating the substeps of step S2 in FIG. 1;
FIG. 3 is a flowchart illustrating the substeps of step S23 in FIG. 2;
FIG. 4 is a flowchart illustrating the substeps of step S24 in FIG. 2;
FIG. 5 is a flowchart illustrating the substeps of step S26 in FIG. 2;
FIG. 6 is a schematic diagram of an OD sparse spatiotemporal residual network model;
FIG. 7 is a schematic diagram of a sliding time window mechanism for periodic historical feature extraction;
FIG. 8 is a schematic diagram of the structure of an OD residual convolution unit;
FIG. 9 is a schematic diagram of spatial feature extraction;
FIG. 10 is a schematic diagram of the simplified time sequence processing unit;
FIG. 11 is a schematic diagram of a sequential stacked feature matrix processing procedure;
FIG. 12 is a schematic diagram of a non-zero element attention mechanism considering sparsity;
FIG. 13 is a schematic structural diagram of a prediction model construction apparatus.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.
References to "a plurality" herein include "two" and "more than two".
The prediction model constructed by the invention can be applied to predicting the interaction relation between the spatial and geographic units in a certain time period, such as the OD passenger flow of a rail transit network, the economic trade scale between regions, the social communication frequency and the passenger and goods transportation volume.
Referring to fig. 1, fig. 1 is a flowchart of a prediction model construction method according to the present invention. As shown in fig. 1, the method for constructing a prediction model of the present invention includes:
step S1: preprocessing the historical data, and selecting at least one interval historical period data set and a current day trend data set.
Specifically, at least one interval history cycle data set and a current day trend data set are selected from the history data with different time periods as prediction targets in step S1, and the at least one interval history cycle data set includes: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.
Definition 1 (OD passenger flow): for a rail transit network containing N stations, let S = {1, 2, 3, …, N} be the set of all stations and $P_t$ the set of all trips in the t-th time period under study. For stations i ∈ S and j ∈ S, the OD passenger flow $x_t^{i,j}$ from station i to station j in the t-th time period can be defined as:
$$x_t^{i,j} = \left|\{\, p \in P_t : p_o = i \wedge p_d = j \,\}\right| \tag{1}$$
where $p_o$ and $p_d$ denote the origin and destination of trip p in the t-th time period, and |·| denotes the cardinality of a set, i.e. the number of elements it contains. In equation (1), $|\{p_o = i \wedge p_d = j\}|$ is therefore the number of trips in $P_t$ that start at station i and end at station j.
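Definition 2 (OD passenger flow matrix): in the rail transit network under study, the OD passenger flows $x_t^{i,j}$ between all pairs of stations in the t-th time period form the OD passenger flow matrix $X_t$ of the network for that period, with $X_t \in \mathbb{R}^{N \times N}$, i.e. $X_t$ is a real matrix of dimension N × N, as shown in equation (2).
$$X_t = \begin{bmatrix} x_t^{1,1} & x_t^{1,2} & \cdots & x_t^{1,N} \\ x_t^{2,1} & x_t^{2,2} & \cdots & x_t^{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_t^{N,1} & x_t^{N,2} & \cdots & x_t^{N,N} \end{bmatrix} \tag{2}$$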
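As a non-limiting illustration of Definitions 1 and 2, the following minimal Python sketch counts trips per origin-destination pair to form the OD passenger flow matrix of one time period. The function name `build_od_matrix` and the representation of a trip as an `(origin, destination)` pair of station numbers 1..N are assumptions made for this example only.

```python
def build_od_matrix(trips, n_stations):
    """Return the N x N OD matrix of one time period as nested lists.

    trips: iterable of (origin, destination) pairs, stations numbered 1..N
           (hypothetical input format chosen for this sketch).
    """
    matrix = [[0] * n_stations for _ in range(n_stations)]
    for origin, destination in trips:
        # Entry (i, j) counts the trips with origin i and destination j.
        matrix[origin - 1][destination - 1] += 1
    return matrix


# Three stations, four trips observed in the period.
trips_t = [(1, 2), (1, 2), (3, 1), (2, 3)]
X_t = build_od_matrix(trips_t, 3)
```

Most entries of such a matrix are zero in a real network, which is the sparsity that the later non-zero element attention mechanism is meant to preserve.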
For a particular rail transit network, given the OD passenger flow matrices {$X_t$ | t = 1, 2, 3, …, m−1} of the first m−1 time periods, the task is to predict the OD passenger flow matrix $X_m$ of the m-th time period. The passenger flow prediction problem for the rail transit network can be expressed as follows:
where Ω(·) is the prediction model or prediction function, and $E_t$ is the environmental variable of the t-th time period, which describes environmental information such as weather conditions (sunny, cloudy, rain, snow, fog), air temperature, wind speed, and whether the day is a holiday.
Step S2: construct an OD sparse spatiotemporal residual network model from the processed data.
Referring to fig. 2, fig. 2 is a flowchart of the substeps of step S2 in fig. 1. As shown in fig. 2, step S2 includes:
Step S21: process the at least one interval historical period data set based on a time sliding mechanism to obtain at least one corresponding sliding interval historical period data set.
Specifically, referring to fig. 6 and 7, fig. 6 is a schematic diagram of the OD sparse spatiotemporal residual network model. As shown in fig. 6, the OD passenger flow matrix to be predicted for the t-th time period is not strongly correlated with all of the OD passenger flow matrices of the preceding t−1 time periods. For example, the OD passenger flow at the 8:00 morning peak on a given Tuesday is strongly correlated with the 7:00-8:00 period of the same day, and also with historical data around the 8:00 morning peak on the preceding several working days and the preceding several Tuesdays, while it depends much less on passenger flow data from other time periods. Therefore, the invention selects a weekly interval historical period data set, a daily interval historical period data set, and a current day trend data set from the historical data as the input variables of the OD sparse spatiotemporal residual network model.
The current day trend data set $S_{tr}$ is constructed from the OD passenger flow matrices of the $q_{tr}$ time periods immediately preceding the period t to be predicted, as shown in equation (4):
$$S_{tr} = \left[X_{t-q_{tr}}, X_{t-q_{tr}+1}, \dots, X_{t-1}\right] \tag{4}$$
where $X_{t-k_{tr}}$ is the OD passenger flow matrix $k_{tr}$ time periods before the period t to be predicted, and $S_{tr} \in \mathbb{R}^{q_{tr} \times N \times N}$, i.e. $S_{tr}$ is a real array of dimension $q_{tr}$ × N × N.
Because the evolution of OD passenger flow differs somewhat from cycle to cycle, the OD passenger flow in the period to be predicted is strongly correlated with the observation an integer number of historical cycles earlier, and also depends to some extent on the observations in the periods around it. For example, the OD passenger flow at the 8:00 morning peak on a given Tuesday is strongly correlated with the observations at the 8:00 morning peak on the previous Tuesday and in the adjacent time periods. Therefore, a time sliding mechanism needs to be considered when constructing the weekly and daily interval historical period data sets. The sliding historical data set $S_w^{k_w}$ for the $k_w$-th weekly interval and the sliding historical data set $S_d^{k_d}$ for the $k_d$-th daily interval can be expressed by equation (5):
$$S_w^{k_w} = \left[X_{t-7Ik_w-p_w}, \dots, X_{t-7Ik_w}, \dots, X_{t-7Ik_w+p_w}\right], \quad S_d^{k_d} = \left[X_{t-Ik_d-p_d}, \dots, X_{t-Ik_d}, \dots, X_{t-Ik_d+p_d}\right] \tag{5}$$
where $S_w^{k_w} \in \mathbb{R}^{(2p_w+1) \times N \times N}$ and $S_d^{k_d} \in \mathbb{R}^{(2p_d+1) \times N \times N}$; $q_w$ and $q_d$ are the numbers of historical cycles in the weekly interval and daily interval historical period data sets, respectively; $p_w$ and $p_d$ are the corresponding time sliding window sizes; and I is the number of time periods in a day, which depends on the length of the study period. The weekly and daily interval historical period data sets $S_w$ and $S_d$ are constructed from the sliding weekly interval historical period data sets $S_w^{k_w}$ and the sliding daily interval historical period data sets $S_d^{k_d}$, as shown in equation (6):
$$S_w = \left\{S_w^{k_w} \mid k_w = 1, 2, \dots, q_w\right\}, \quad S_d = \left\{S_d^{k_d} \mid k_d = 1, 2, \dots, q_d\right\} \tag{6}$$
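The index arithmetic behind the current day trend data set and the sliding interval data sets can be sketched as follows. This is an illustrative sketch only: the helper names `trend_indices` and `sliding_indices` are hypothetical, and it assumes that one daily cycle spans I time periods and one weekly cycle spans 7·I time periods, with a symmetric window of p periods on each side of the cycle point.

```python
def trend_indices(t, q_tr):
    """Periods t-q_tr .. t-1 that form the current day trend data set."""
    return list(range(t - q_tr, t))


def sliding_indices(t, k, cycle_len, p):
    """Periods in a window of half-width p centred k cycles before period t."""
    centre = t - k * cycle_len
    return list(range(centre - p, centre + p + 1))


periods_per_day = 96   # I = 1440 / h with h = 15 minutes
t = 2000               # index of the period to predict (arbitrary example)

trend = trend_indices(t, q_tr=4)
day_1 = sliding_indices(t, k=1, cycle_len=periods_per_day, p=1)
week_1 = sliding_indices(t, k=1, cycle_len=7 * periods_per_day, p=1)
```

Each sliding window holds 2p + 1 period indices, which matches the (2p + 1) × N × N dimension stated for the sliding historical data sets.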
Step S22: construct a two-dimensional point-by-point (1 × 1) convolution layer, and perform feature extraction and aggregation on the at least one sliding interval historical period data set to obtain a feature map.
Referring to fig. 7, fig. 7 is a schematic diagram of the sliding time window mechanism for periodic historical feature extraction. As shown in fig. 7 and in part a of fig. 6, the invention uses the two-dimensional point-by-point convolution layer 1x1_cov2 to perform feature extraction and aggregation on the sliding historical data sets $S_w^{k_w}$ and $S_d^{k_d}$, obtaining the feature maps $M_w^{k_w}$ and $M_d^{k_d}$, as shown in equation (7):
$$M_c^{k_c} = f\left(W_c * S_c^{k_c} + b_c\right), \quad c \in \{w, d\} \tag{7}$$
where $W_c$ and $b_c$ are learnable parameters, f is the activation function, and * denotes the convolution operation. The ReLU function is chosen as f to ensure the sparsity of the features, but the invention is not limited thereto.
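The effect of a point-by-point convolution followed by ReLU on one sliding window can be illustrated with a small standard-library Python sketch. It assumes a single output channel and treats the time periods of the window as input channels; the function name `pointwise_conv2d` and the weight and bias values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def pointwise_conv2d(stack, weights, bias):
    """1x1 convolution with one output channel, followed by ReLU.

    stack:   list of C matrices (each N x N); the channel axis is time.
    weights: list of C scalars, one per input channel.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            # Each OD pair (i, j) is mixed across time only, never across space.
            value = bias + sum(w * m[i][j] for w, m in zip(weights, stack))
            out[i][j] = max(0.0, value)  # ReLU
    return out


# A window of three periods for a two-station network.
window = [
    [[1, 0], [0, 4]],
    [[2, 0], [0, 6]],
    [[3, 0], [1, 8]],
]
feature_map = pointwise_conv2d(window, weights=[0.25, 0.5, 0.25], bias=-0.5)
```

In this example the OD pairs with little or no flow in the window map to exactly zero after ReLU, while the busy pairs keep a positive weighted average.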
Step S23: construct an OD residual convolution unit, and extract spatial features from the feature map and the current day trend data set.
Referring to fig. 3, fig. 3 is a flowchart of the substeps of step S23 in fig. 2. As shown in fig. 3, step S23 includes:
Step S231: construct a plurality of depthwise separable one-dimensional convolution layers, and extract origin-dependent and destination-dependent features by stacking these layers;
Step S232: construct a one-dimensional point-by-point convolution layer, extract the spatial correlation structure through it, and aggregate the two spatial dependencies to output spatial features.
Please refer to fig. 8 and 9. Fig. 8 is a schematic diagram of the structure of the OD residual convolution unit, and fig. 9 is a schematic diagram of spatial feature extraction. In part b of fig. 6, the invention proposes an OD residual convolution unit, OD_result, for extracting both origin-dependent and destination-dependent features; its structure is shown in fig. 8. By stacking a plurality of depthwise separable one-dimensional convolution layers Ds_cov1, the OD residual convolution unit extracts origin-dependent or destination-dependent features. It also extracts the spatial correlation structure with the one-dimensional point-by-point convolution layer 1x1_cov1 and aggregates the two spatial dependencies for output. The unit further introduces residual connections to prevent vanishing gradients. Spatial features are extracted separately from the current day trend OD passenger flow matrices, the weekly interval feature maps, and the daily interval feature maps, as shown in equation (8).
Fig. 9 illustrates the principle of spatial feature extraction based on Ds_cov1 and 1x1_cov1, taking destination-dependent feature extraction as an example. With the rows of an OD passenger flow matrix taken as the channel dimension, the matrix can be converted into a 1 × N picture containing N channels, each channel representing the OD passenger flow structure originating from a particular station. Ds_cov1 can exploit this 1 × N structure: the invention uses a one-dimensional convolution kernel to extract the correlation between the OD passenger flows from the same station to adjacent stations, but the invention is not limited thereto.
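The combination of a depthwise one-dimensional convolution, a point-by-point convolution, and a residual connection can be sketched in plain Python as follows. This is a simplified, single-layer illustration rather than the unit of fig. 8: the function names, the kernel size of 3, the zero padding, and all weight values are assumptions made for this example.

```python
def depthwise_conv1d(channels, kernels):
    """Convolve each channel with its own kernel (zero 'same' padding)."""
    out = []
    for signal, kernel in zip(channels, kernels):
        half = len(kernel) // 2
        padded = [0.0] * half + list(signal) + [0.0] * half
        out.append([
            sum(k * padded[pos + offset] for offset, k in enumerate(kernel))
            for pos in range(len(signal))
        ])
    return out


def pointwise_conv1d(channels, mixing):
    """1x1 convolution: mixing[o][c] weights input channel c into output o."""
    length = len(channels[0])
    return [
        [sum(w * ch[pos] for w, ch in zip(row, channels)) for pos in range(length)]
        for row in mixing
    ]


# Rows of the OD matrix act as channels: channel i holds flows leaving station i.
od = [[0, 2, 4],
      [1, 0, 3],
      [5, 1, 0]]
# Assumed kernel: add the flow to each destination and to the next destination.
kernels = [[0.0, 1.0, 1.0]] * 3
local = depthwise_conv1d(od, kernels)      # per-origin, adjacent destinations
mixed = pointwise_conv1d(local, [[1, 0, 0], [0, 1, 1], [1, 1, 1]])
# Residual connection: add the unit's input back onto its output.
residual = [[x + y for x, y in zip(r_in, r_out)] for r_in, r_out in zip(od, mixed)]
```

The depthwise step never mixes different origin stations, so any interaction between origins enters only through the point-by-point layer. Transposing the OD matrix before the same two steps would give the complementary view with columns as channels.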
Step S24: construct a simplified time sequence processing unit to extract nonlinear time sequence correlation features from a plurality of spatial features, complete the spatiotemporal feature aggregation, and obtain the historical spatiotemporal prediction.
Referring to fig. 4, fig. 4 is a flowchart of the substeps of step S24 in fig. 2. As shown in fig. 4, step S24 includes:
Step S241: stack the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
Step S242: construct a plurality of two-dimensional point-by-point convolution layers, and extract nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through these consecutive layers;
Step S243: obtain a current day trend spatiotemporal predicted value and at least one interval period spatiotemporal predicted value from the extracted time sequence correlation features;
Step S244: obtain a historical spatiotemporal predicted value from the current day trend spatiotemporal predicted value and the at least one interval period spatiotemporal predicted value.
Referring to fig. 10 and 11, fig. 10 is a schematic diagram of the structure of the simplified time sequence processing unit, and fig. 11 is a schematic diagram of the processing of the time-stacked feature matrices. A conventional recurrent neural network (RNN), such as a long short-term memory network (LSTM), can automatically learn long-term dependencies in a time series. However, predicting the OD passenger flow matrix of an entire transit network requires the RNN to have high hidden-layer and output-layer feature dimensions, which in turn demand huge numbers of training samples and computing resources that are difficult to provide in most application scenarios.
Fig. 10 shows the complete structure of the simplified time sequence processing unit. In this unit, a plurality of consecutive two-dimensional point-by-point convolution layers 1x1_cov2 extract nonlinear time sequence correlation features from the spatial features of a plurality of time periods, and a final 1x1_cov2 layer with a single output channel completes the spatiotemporal feature aggregation. The detailed principle of the simplified time sequence processing unit is shown in fig. 11. Compared with a conventional RNN, when the number of time periods considered is small, the simplified time sequence processing unit can extract nonlinear temporal features and predict the OD passenger flows of all stations with fewer parameters, and is more flexible and efficient on small data sets.
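The parameter argument can be made concrete with a short calculation. The sketch below is an illustration under assumed sizes, not a measurement from the invention: it assumes a stack of point-by-point layers with channel widths 4 → 8 → 1, a network of 100 stations, and an LSTM with 256 hidden units that receives the flattened OD matrix and has one bias vector per gate.

```python
def pointwise_stack_params(channels):
    """Weights plus biases of consecutive 1x1 conv layers, e.g. [4, 8, 1]."""
    return sum(c_in * c_out + c_out for c_in, c_out in zip(channels, channels[1:]))


def lstm_params(input_dim, hidden_dim):
    """Four gates, each with input weights, recurrent weights and one bias."""
    return 4 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


n_stations = 100
simplified = pointwise_stack_params([4, 8, 1])   # 4 periods -> 8 -> 1
recurrent = lstm_params(n_stations ** 2, 256)    # flattened N x N input
```

The point-by-point stack has 49 parameters regardless of the number of stations, because the same weights are shared by every OD pair. The assumed LSTM layer has about 10.5 million parameters before any output layer mapping back to N × N values is added.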
Specifically, part c of fig. 6 shows the process of extracting the time-dependent features of OD passenger flow with the simplified time sequence processing unit. The spatial features obtained from the OD residual convolution units are stacked in time to obtain the current day trend spatial feature set $U_{tr}$, the weekly interval period spatial feature set $U_w$, and the daily interval period spatial feature set $U_d$, as shown in equation (9).
The simplified time sequence processing unit is applied to $U_{tr}$, $U_w$, and $U_d$ separately to extract time sequence features, yielding the current day trend spatiotemporal predicted value $Z_{tr}$, the weekly interval period spatiotemporal predicted value $Z_w$, and the daily interval period spatiotemporal predicted value $Z_d$, as shown in equation (10):
$$Z_{tr} = \Phi(U_{tr}; \theta_{tr}), \quad Z_w = \Phi(U_w; \theta_w), \quad Z_d = \Phi(U_d; \theta_d) \tag{10}$$
where Φ is the processing function of the simplified time sequence processing unit, and $\theta_{tr}$, $\theta_w$, and $\theta_d$ are the learnable parameters of the unit for the three branches. The three spatiotemporal predicted values are summed to obtain the historical spatiotemporal predicted value $Z_{st}$, as shown in equation (11):
$$Z_{st} = Z_{tr} + Z_w + Z_d \tag{11}$$
Additionally, on top of the current day trend dependence, daily interval period dependence, and weekly interval period dependence considered herein, monthly interval (4 weeks) period dependence and annual interval (52 weeks) period dependence may be further considered. In step S21 of the method of the invention, monthly interval and annual interval historical period data sets are added, and the corresponding spatiotemporal features are extracted with a processing flow similar to that used for the daily interval and weekly interval historical period data sets; the resulting spatiotemporal features are used for the final OD passenger flow prediction, which may be a further embodiment of the invention. When the monthly interval period dependence and the annual interval period dependence are considered, the calculation of the historical spatiotemporal predicted value in equation (11) is updated to equation (11-1), where $Z_m$ and $Z_y$ are the spatiotemporal predicted values of the monthly interval period and the annual interval period, respectively:
$$Z_{st} = Z_{tr} + Z_w + Z_d + Z_m + Z_y \tag{11-1}$$
Step S25: obtain a preliminary prediction from the historical spatiotemporal prediction and the external environment prediction.
Specifically, to account for the influence of external environmental factors, the historical spatiotemporal predicted value $Z_{st}$ and the external environment predicted value $Z_{E_t}$ (derived from $E_t$ by an ordinary linear layer) are summed to obtain the preliminary predicted value for period t, as shown in equation (12).
Step S26: introduce a non-zero element attention mechanism that accounts for sparsity, and obtain the final prediction from the preliminary prediction.
Referring to fig. 5, fig. 5 is a flowchart of the substeps of step S26 in fig. 2. As shown in fig. 5, step S26 includes:
Step S261: obtain an OD matrix mean value from the at least one interval historical period data set and the current day trend data set;
Step S262: convert the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
Step S263: filter the preliminary predicted value through the non-zero element attention matrix to obtain the final predicted value of the OD matrix for a given period.
Referring to fig. 12, fig. 12 is a schematic diagram of the non-zero element attention mechanism that accounts for sparsity. To describe the sparsity of OD data in its spatial distribution, in part d of fig. 6 the invention also introduces a non-zero element attention mechanism (Non-zero Activation), whose principle is shown in fig. 12. The sparsity of the OD passenger flow matrix remains stable over time. Therefore, the mean of the OD matrices over all input data sets is taken as the known sparse distribution and is converted into a non-zero element attention matrix by the non-zero activation function Λ. The preliminary predicted value is filtered with this attention matrix to obtain the final predicted value of the OD passenger flow matrix for period t, which ensures the sparsity of the final prediction result, as shown in equation (13).
The non-zero activation function Λ used herein is given by equation (14), and the parameter λ can be set according to the actual data distribution. When λ is large enough, Λ(x) approaches 1 even for very small values of x. This preserves the sparse structure of the prediction and at the same time restricts the spatiotemporal feature extraction of the OD-SparsesSTnet network to the neighbourhood of non-zero elements, thereby improving the prediction accuracy of the non-zero part of the OD matrix.
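The filtering step can be illustrated with a small Python sketch. The exact form of Λ in equation (14) is not assumed here: the sketch uses tanh(λx) purely as a stand-in that is 0 at x = 0 and approaches 1 for small positive x when λ is large, and it assumes that filtering means an element-wise product. The function names and all numeric values are hypothetical.

```python
import math


def nonzero_attention(mean_matrix, lam):
    """Map a mean OD matrix to weights in [0, 1]; zero entries stay zero.

    tanh(lam * x) is a stand-in chosen for this sketch, not equation (14).
    """
    return [[math.tanh(lam * x) for x in row] for row in mean_matrix]


def filter_prediction(preliminary, attention):
    """Element-wise product of the preliminary prediction and the attention."""
    return [[p * a for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(preliminary, attention)]


mean_od = [[0.0, 0.2], [3.0, 0.0]]        # mean of the input OD matrices
preliminary = [[0.4, 1.1], [2.9, -0.3]]   # prediction before filtering
attention = nonzero_attention(mean_od, lam=50.0)
final = filter_prediction(preliminary, attention)
```

OD pairs that never carried flow in the inputs are forced to zero in the final prediction, while pairs with even a small mean flow pass through almost unchanged.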
Step S3: train the OD sparse spatiotemporal residual network model using the back propagation rule of deep neural networks and the Adam algorithm.
In step S3, a training sample set may be formed from the current day trend data set, the weekly interval period data set, and the daily interval period data set, and model training is performed on this training sample set using the back propagation rule of deep neural networks and the Adam algorithm.
The model training objective is to minimize the root mean square error (RMSE) loss, which is shown in equation (15).
However, for a sparse OD passenger flow matrix, the commonly used prediction deviation index, mean absolute percentage error (MAPE), may have a denominator of 0. The invention therefore provides a total absolute percentage error (GAPE) to evaluate the prediction accuracy of the entire OD passenger flow matrix, as shown in the following formula:
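The two evaluation indexes can be sketched as follows. RMSE is the standard definition over all matrix entries. For GAPE the sketch assumes one plausible form, the total absolute error divided by the total observed flow, which stays defined when individual entries are zero; this form and the function names are assumptions of the example, not a statement of the formula used by the invention.

```python
import math


def rmse(actual, predicted):
    """Root mean square error over all entries of two equal-sized matrices."""
    errors = [(a - p) ** 2
              for a_row, p_row in zip(actual, predicted)
              for a, p in zip(a_row, p_row)]
    return math.sqrt(sum(errors) / len(errors))


def gape(actual, predicted):
    """Assumed form: total absolute error divided by total observed flow."""
    abs_error = sum(abs(a - p)
                    for a_row, p_row in zip(actual, predicted)
                    for a, p in zip(a_row, p_row))
    total_flow = sum(sum(row) for row in actual)
    return abs_error / total_flow


actual = [[0, 4], [6, 0]]
predicted = [[0, 3], [8, 0]]
rmse_value = rmse(actual, predicted)
gape_value = gape(actual, predicted)
```

A per-entry percentage error would divide by zero for the two empty OD pairs in this example, whereas the aggregated ratio only requires the total flow of the matrix to be non-zero.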
Specifically, the inputs for training the OD sparse spatiotemporal residual network model are: the set of normalized historical OD passenger flow matrix observations {$X_1$, $X_2$, …, $X_{m-1}$}; the set of external environment information observations {$E_1$, $E_2$, …, $E_{m-1}$}; the sequence lengths $q_w$, $q_d$, and $q_{tr}$ of the weekly dependence historical data, daily dependence historical data, and current day trend historical data; the sliding windows $p_w$ and $p_d$ of the weekly dependence and daily dependence historical data; and the study period length h (in minutes, so that the total number of time periods in a day is I = 1440/h).
Step S4: verify and output the OD sparse spatiotemporal residual network model.
Specifically, the OD sparse spatiotemporal residual network model is verified based on the final predicted value.
The invention verifies the advantages of the proposed OD sparse spatiotemporal residual network model by using 7 existing models as references. The basic settings of the 7 reference models are as follows:
Historical average (HA): the input weekly interval period data set and daily interval period data set are directly averaged over the same historical periods, and the result is used as the predicted value of the OD passenger flow matrix at the prediction time.
Autoregressive integrated moving average (ARIMA): a classical model for time series prediction, used here to predict the current day trend time series.
Three-dimensional convolutional network (CNN3): a three-dimensional convolutional network is applied directly to the current day trend time series to extract spatial and temporal features simultaneously (without separate processing of origin and destination dependence) and to make the prediction.
General recurrent neural network (RNN): features of each OD passenger flow matrix in the current day trend time series are first extracted, and prediction is then performed with the RNN.
Long short-term memory network (LSTM): features of each OD passenger flow matrix in the current day trend time series are first extracted, and prediction is then performed with the LSTM.
Deep spatiotemporal network (DeepST): a spatiotemporal deep neural network prediction model for spatiotemporal data, mainly used for predicting aggregated crowd flows at city scale.
Spatiotemporal residual network (ST-ResNet): a spatiotemporal residual network model for spatiotemporal data that stacks a plurality of CNNs with corresponding residual connections for spatial feature identification, mainly used for urban traffic flow prediction.
Table 1: Comparison of the prediction results of the 7 reference models with the proposed OD-SparsesSTnet
Table 1 compares the prediction results of the 7 reference models and the proposed OD-SparsesSTnet model for the morning peak period 8:00-8:15 (or 8:00-8:05). Because the HA method only partially considers the weekly and daily interval historical period dependence and does not consider spatial dependence, its prediction results have the highest RMSE and GAPE values and the largest prediction deviation. ARIMA, RNN, and LSTM are all essentially time series models: they capture the current day trend dependence well but cannot recognize the complex spatial dependence of OD passenger flow. Their prediction indexes are better than those of the HA method. However, because RNN and LSTM need a large number of hidden-layer parameters, especially when the output feature dimension is large, they are poorly suited to predicting the OD passenger flow matrix of an entire transit network and still show a large prediction deviation.
CNN3 treats the time variation of the OD passenger flow matrix as a third dimension and can extract the current day trend dependence and space dependence features to a certain extent; its prediction results are better than those of the three time series models and the HA method. As typical spatiotemporal data processing models, DeepST and ST-ResNet can simultaneously extract multiple temporal correlations and external factor features, and their internal CNNs can also identify spatial dependence to a certain extent. The prediction deviation of these two models is greatly reduced relative to the other reference models. However, DeepST and ST-ResNet still cannot finely describe the origin and destination dependence specific to OD passenger flow, and they do not consider the sparse distribution of the data.
On top of the complex time dependence (weekly interval period dependence, daily interval period dependence, and current day trend dependence) and space dependence (origin dependence and destination dependence) specific to the OD passenger flow of a transit network, the proposed OD-SparsesSTnet further introduces a non-zero activation mechanism to describe the sparse distribution of OD passenger flow. Its prediction deviation is significantly lower than that of the existing spatial model (CNN3), time series models (ARIMA, RNN, and LSTM), and spatiotemporal models (DeepST and ST-ResNet), with a reduction of more than 14.89%.
Referring to fig. 13, fig. 13 is a schematic structural diagram of a prediction model construction device. As shown in fig. 13, the prediction model building apparatus of the present invention includes:
the preprocessing unit 11: preprocesses the historical data, wherein the preprocessing unit 11 selects, from the historical data and with different time periods as prediction targets, at least one interval historical period data set and a current day trend data set, the at least one interval historical period data set comprising at least one of a weekly interval historical period data set, a daily interval historical period data set, a monthly interval historical period data set, and an annual interval historical period data set;
the model construction unit 12: constructs an OD sparse spatiotemporal residual network model from the processed data;
the model training unit 13: trains the OD sparse spatiotemporal residual network model using the back propagation rule of deep neural networks and the Adam algorithm, wherein the model training unit 13 forms a training sample set from the constructed current day trend data set and at least one interval historical period data set, and performs model training on the training sample set using the back propagation rule of deep neural networks and the Adam algorithm;
the verification output unit 14: obtains a prediction result from real-time data through the OD sparse spatiotemporal residual network model, wherein the OD sparse spatiotemporal residual network model is verified based on the final predicted value.
Further, the model building unit 12 includes:
time slide processing unit 121: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;
the feature map obtaining unit 122: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;
OD residual convolution unit 123: extracting spatial features from the feature map and the current day trend data set;
the simplified timing processing unit 124: extracting nonlinear time sequence correlation characteristics from a plurality of spatial characteristics to complete space-time characteristic aggregation and then obtaining historical space-time prediction;
the preliminary prediction obtaining unit 125: obtaining preliminary prediction according to historical space-time prediction and external environment prediction;
final prediction obtaining unit 126: introducing a non-zero element attention mechanism that accounts for sparsity, and obtaining the final prediction from the preliminary prediction.
Still further, the OD residual convolution unit 123 includes:
spatial feature extraction module 1231: constructing a plurality of depthwise separable one-dimensional convolution layers, and extracting origin-dependent or destination-dependent features by stacking these layers;
spatial feature output module 1232: constructing a one-dimensional point-by-point convolution layer, extracting the spatial correlation structure through it, and aggregating the two spatial dependencies to output spatial features.
Still further, the simplified timing processing unit 124 includes:
the spatial feature set obtaining module 1241: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
the timing correlation feature extraction module 1242: extracting nonlinear time sequence associated features from a current day trend space feature set and at least one interval period space feature set through a plurality of continuous two-dimensional point-by-point convolution layers;
the space-time prediction value obtaining module 1243: obtaining a time-space predicted value of the current day trend and at least one interval period time-space predicted value according to the extracted time sequence correlation characteristics;
historical spatiotemporal prediction value obtaining module 1244: obtaining a historical spatiotemporal predicted value from the current day trend spatiotemporal predicted value and the at least one interval period spatiotemporal predicted value.
Further, the final prediction obtaining unit 126 includes:
the OD matrix mean obtaining module 1261: obtaining an OD matrix mean value from the at least one interval historical period data set and the current day trend data set;
a conversion module 1262: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
the final prediction value output module 1263: filtering the preliminary predicted value through the non-zero element attention matrix to obtain the final predicted value of the OD matrix for a given period.
In conclusion, the prediction model constructed according to the invention can predict the OD matrix using fewer network parameters and training resources. By means of the OD sparse spatiotemporal residual network model, the complex time dependence (current day trend, weekly interval, and daily interval), space dependence (origin dependence and destination dependence), and sparse distribution characteristics specific to OD data are considered at the same time, which improves the descriptive capacity and prediction accuracy of the model. The origin dependence and destination dependence are identified simultaneously by the OD residual convolution unit, which gives an accurate description of the complex space dependence. The simplified time sequence processing unit avoids the huge amount of training samples and computing resources that existing recurrent neural networks need when predicting the OD matrix. The non-zero element attention mechanism accounts for the sparse distribution of OD data, so that the feature extraction of the model is carried out only around non-zero elements, which further improves the prediction accuracy of the non-zero part of the OD matrix.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (6)
1. A method for constructing a prediction model for OD data is characterized by comprising the following steps:
step S1: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;
step S2: constructing an OD sparse spatiotemporal residual network model according to the processed data;
step S3: training the OD sparse spatiotemporal residual network model by using a back propagation rule of a deep neural network and an Adam algorithm;
step S4: verifying and outputting the OD sparse spatiotemporal residual network model;
wherein, the step S2 includes:
step S21: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;
step S22: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;
step S23: constructing an OD residual convolution unit, and extracting spatial features from the feature map and the current day trend data set;
step S24: constructing a simplified time sequence processing unit to extract nonlinear time sequence correlation characteristics from a plurality of spatial characteristics to complete time-space characteristic aggregation and obtain historical time-space prediction;
step S25: obtaining preliminary prediction according to historical space-time prediction and external environment prediction;
step S26: introducing a non-zero element attention mechanism that accounts for sparsity, and obtaining the final prediction according to the preliminary prediction;
in the step S3, a training sample set is formed according to the constructed current day trend data set and at least one interval historical period data set, and model training is performed according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm;
the step S23 includes:
step S231: constructing a plurality of depthwise separable one-dimensional convolution layers, and extracting origin-dependent and destination-dependent features by stacking the depthwise separable one-dimensional convolution layers;
step S232: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial characteristics;
the step S24 includes:
step S241: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
step S242: extracting nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through a plurality of consecutive two-dimensional point-by-point convolution layers;
step S243: obtaining a current day trend spatiotemporal predicted value and at least one interval period spatiotemporal predicted value according to the extracted time sequence correlation features;
step S244: obtaining a historical spatiotemporal predicted value according to the current day trend spatiotemporal predicted value and the at least one interval period spatiotemporal predicted value;
the step S26 includes:
step S261: obtaining an OD matrix mean value according to at least one interval historical period data set and a current day trend data set;
step S262: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
step S263: filtering the preliminary predicted value through the non-zero element attention matrix to obtain a final predicted value of the OD matrix for a given period.
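Steps S261 to S263 can be illustrated with a minimal sketch. The claim does not specify the non-zero activation function, so the binary step used here (1.0 where the mean is above a threshold, 0.0 elsewhere) is an assumption, as are the function names `mean_matrix`, `nonzero_attention`, and `filter_prediction` and the toy 2x2 OD matrices.

```python
def mean_matrix(matrices):
    """Element-wise mean of equally shaped OD matrices (lists of rows)."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(cols)]
            for i in range(rows)]


def nonzero_attention(mean, threshold=0.0):
    """Assumed non-zero activation: 1.0 where the mean exceeds threshold, else 0.0."""
    return [[1.0 if v > threshold else 0.0 for v in row] for row in mean]


def filter_prediction(preliminary, attention):
    """Element-wise product of the preliminary prediction and the attention matrix."""
    return [[p * a for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(preliminary, attention)]


# Toy historical OD matrices (rows: origins, columns: destinations).
history = [
    [[0, 3], [0, 0]],  # e.g. an interval historical period sample
    [[0, 5], [2, 0]],  # e.g. a current day trend sample
]
preliminary = [[0.4, 4.1], [1.2, 0.3]]

attention = nonzero_attention(mean_matrix(history))
final = filter_prediction(preliminary, attention)
print(final)  # [[0.0, 4.1], [1.2, 0.0]]
```

In this toy case, OD pairs that never carried flow in the historical samples are forced to zero in the final prediction, which is one plausible reading of how the attention matrix preserves the sparsity of the OD matrix.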
2. The prediction model construction method according to claim 1, wherein the step S1 includes:
selecting at least one interval historical period data set and a current day trend data set by taking different time periods as prediction targets according to historical data, wherein the at least one interval historical period data set comprises: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.
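As a rough illustration of the selection in claim 2, the sketch below picks the time-slot indices that would feed the prediction of slot `t`. The function name `select_history`, the window lengths, and the choice of daily and weekly intervals are illustrative assumptions; the claim only requires at least one interval historical period data set plus a current day trend data set.

```python
def select_history(t, slots_per_day, trend_len=3, daily_len=2, weekly_len=2):
    """Return the slot indices used to predict slot t.

    trend:  up to trend_len slots immediately before t, restricted to the same day
    daily:  the same slot on each of the previous daily_len days
    weekly: the same slot on each of the previous weekly_len weeks
    Indices that fall before the start of the data (index 0) are dropped.
    """
    day_start = (t // slots_per_day) * slots_per_day
    trend = list(range(max(day_start, t - trend_len), t))
    daily = [t - k * slots_per_day for k in range(daily_len, 0, -1)]
    weekly = [t - k * 7 * slots_per_day for k in range(weekly_len, 0, -1)]
    return {
        "trend": trend,
        "daily": [i for i in daily if i >= 0],
        "weekly": [i for i in weekly if i >= 0],
    }


sets = select_history(t=400, slots_per_day=24)
print(sets)
# {'trend': [397, 398, 399], 'daily': [352, 376], 'weekly': [64, 232]}
```

Monthly or annual interval sets would follow the same pattern with a different stride, subject to how calendar months and years are mapped onto slot indices.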
3. The prediction model construction method of claim 1, wherein in the step S4, the OD sparse spatiotemporal residual network model is verified based on the final prediction value.
4. A prediction model construction device for OD data is characterized by comprising:
a preprocessing unit: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;
a model construction unit: constructing an OD sparse space-time residual error network model according to the processed data;
a model training unit: training an OD sparse space-time residual error network model by using a back propagation rule of a deep neural network and an Adam algorithm;
a verification output unit: verifying and outputting an OD sparse space-time residual error network model;
wherein the model construction unit includes:
a time-sliding processing unit: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;
a feature map obtaining unit: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;
OD residual convolution unit: extracting spatial features from the feature map and the current day trend data set;
the simplified time sequence processing unit: extracting nonlinear time sequence correlation characteristics from a plurality of spatial characteristics to complete space-time characteristic aggregation and then obtaining historical space-time prediction;
a preliminary prediction obtaining unit: obtaining preliminary prediction according to historical space-time prediction and external environment prediction;
the model construction unit further includes:
a final prediction obtaining unit: introducing a non-zero attention mechanism of sparsity to obtain final prediction according to the preliminary prediction;
the model training unit forms a training sample set according to the constructed current day trend data set and at least one interval historical period data set, and performs model training according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm;
the OD residual convolution unit includes:
the spatial feature extraction module: constructing a plurality of depth separation one-dimensional convolution layers, and extracting the characteristics of departure place dependence and destination dependence by stacking the depth separation one-dimensional convolution layers;
a spatial feature output module: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial characteristics;
the simplified timing processing unit includes:
a spatial feature set obtaining module: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;
the time sequence correlation characteristic extraction module: extracting nonlinear time sequence associated features from a current day trend space feature set and at least one interval period space feature set through a plurality of continuous two-dimensional point-by-point convolution layers;
a space-time prediction value obtaining module: obtaining a current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation characteristics;
a historical space-time prediction value obtaining module: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of at least one interval period;
the final prediction obtaining unit includes:
an OD matrix mean value obtaining module: obtaining an OD matrix mean value according to the space-time predicted value of at least one interval period and the current day trend data set;
a conversion module: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;
a final predicted value output module: filtering the preliminary predicted value through the non-zero element attention matrix to obtain a final predicted value of the OD matrix for a given period.
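The spatial feature extraction and output modules of the OD residual convolution unit can be sketched as follows, reading "depth separation one-dimensional convolution" as a depthwise one-dimensional convolution (one kernel per channel) and "one-dimensional point-by-point convolution" as a pointwise (kernel size 1) convolution that mixes channels. This reading, the function names, the zero padding, and the toy kernels are assumptions; the residual connection and the learned weights of the actual model are omitted.

```python
def depthwise_conv1d(x, kernels):
    """Apply one odd-length kernel per channel.

    x: list of channels, each a list of floats over zones.
    Zero padding keeps the output length equal to the input length.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        padded = [0.0] * half + list(channel) + [0.0] * half
        out.append([
            sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(channel))
        ])
    return out


def pointwise_conv1d(x, weights):
    """weights[o][c] mixes input channel c into output channel o at every position."""
    length = len(x[0])
    return [
        [sum(w * x[c][i] for c, w in enumerate(row)) for i in range(length)]
        for row in weights
    ]


# Two input channels over four zones.
x = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0]]
kernels = [[1.0, 1.0, 1.0],  # moving sum on channel 0
           [0.0, 1.0, 0.0]]  # identity on channel 1

dw = depthwise_conv1d(x, kernels)         # per-channel dependence along the zone axis
pw = pointwise_conv1d(dw, [[1.0, 1.0]])   # aggregate both channels into one feature
print(dw)  # [[3.0, 6.0, 9.0, 7.0], [0.0, 1.0, 0.0, 1.0]]
print(pw)  # [[3.0, 7.0, 9.0, 8.0]]
```

Splitting the convolution this way lets the depthwise stage capture dependence along one axis of the OD matrix (origin or destination) while the pointwise stage aggregates the resulting channels, with fewer weights than a full convolution of the same receptive field.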
5. The prediction model construction apparatus according to claim 4, wherein the preprocessing unit selects, according to the historical data and taking different time periods as prediction targets, at least one interval historical period data set and a current day trend data set, the at least one interval historical period data set comprising: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.
6. The prediction model construction apparatus of claim 4, wherein the verification output unit verifies the OD sparse spatiotemporal residual network model based on the final prediction value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010336521.XA CN111242395B (en) | 2020-04-26 | 2020-04-26 | Method and device for constructing prediction model for OD (origin-destination) data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010336521.XA CN111242395B (en) | 2020-04-26 | 2020-04-26 | Method and device for constructing prediction model for OD (origin-destination) data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111242395A | 2020-06-05 |
CN111242395B | 2020-07-31 |
Family
ID=70875578
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010336521.XA Active CN111242395B (en) | 2020-04-26 | 2020-04-26 | Method and device for constructing prediction model for OD (origin-destination) data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111242395B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508303B (en) * | 2020-12-22 | 2022-09-27 | 西南交通大学 | OD passenger flow prediction method, device, equipment and readable storage medium |
US20240044663A1 (en) * | 2021-03-02 | 2024-02-08 | Grabtaxi Holdings Pte. Ltd. | System and method for predicting destination location |
CN114139836B (en) * | 2022-01-29 | 2022-05-31 | 北京航空航天大学杭州创新研究院 | Urban OD (origin-destination) people flow prediction method based on gravimetry multi-layer three-dimensional residual error network |
CN114819366A (en) * | 2022-05-06 | 2022-07-29 | 华侨大学 | OD passenger flow short-time prediction method, device, equipment and storage medium thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106205126A (en) * | 2016-08-12 | 2016-12-07 | 北京航空航天大学 | Large-scale traffic network congestion prediction method and device based on convolutional neural networks |
CN108647834A (en) * | 2018-05-24 | 2018-10-12 | 浙江工业大学 | Traffic flow prediction method based on convolutional neural network structure |
CN109598381A (en) * | 2018-12-05 | 2019-04-09 | 武汉理工大学 | Short-term traffic flow prediction method based on state frequency memory neural network |
CN110991775A (en) * | 2020-03-02 | 2020-04-10 | 北京全路通信信号研究设计院集团有限公司 | Deep learning-based rail transit passenger flow demand prediction method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7058638B2 (en) * | 2002-09-03 | 2006-06-06 | Research Triangle Institute | Method for statistical disclosure limitation |
Also Published As
Publication number | Publication date |
---|---|
CN111242395A (en) | 2020-06-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bi et al. | Daily tourism volume forecasting for tourist attractions | |
CN111242292B (en) | OD data prediction method and system based on deep space-time network | |
CN111210633B (en) | Short-term traffic flow prediction method based on deep learning | |
CN111242395B (en) | Method and device for constructing prediction model for OD (origin-destination) data | |
Pan et al. | Predicting bike sharing demand using recurrent neural networks | |
Yang et al. | MF-CNN: traffic flow prediction using convolutional neural network and multi-features fusion | |
CN113487066B (en) | Long-time-sequence freight volume prediction method based on multi-attribute enhanced graph convolution-Informer model | |
CN115204478A (en) | Public traffic flow prediction method combining urban interest points and space-time causal relationship | |
Venkatesh et al. | Rainfall prediction using generative adversarial networks with convolution neural network | |
Zhang et al. | Multistep speed prediction on traffic networks: A graph convolutional sequence-to-sequence learning approach with attention mechanism | |
CN115206092B (en) | Traffic prediction method of BiLSTM and LightGBM models based on attention mechanism | |
Lee et al. | Long short-term memory recurrent neural network for urban traffic prediction: A case study of seoul | |
Liu et al. | A method for short-term traffic flow forecasting based on GCN-LSTM | |
CN112488185A (en) | Method, system, electronic device and readable storage medium for predicting vehicle operating parameters including spatiotemporal characteristics | |
Xiong et al. | DCAST: a spatiotemporal model with DenseNet and GRU based on attention mechanism | |
Bao et al. | Forecasting network-wide multi-step metro ridership with an attention-weighted multi-view graph to sequence learning approach | |
Liu et al. | A MRT daily passenger flow prediction model with different combinations of influential factors | |
Agga et al. | Short-term load forecasting: based on hybrid CNN-LSTM neural network | |
Zhang et al. | Short-term Traffic Flow Prediction With Residual Graph Attention Network. | |
Wang et al. | Metroeye: A weather-aware system for real-time metro passenger flow prediction | |
Lai et al. | Short‐term passenger flow prediction for rail transit based on improved particle swarm optimization algorithm | |
Qu et al. | Improving parking occupancy prediction in poor data conditions through customization and learning to learn | |
Li et al. | Hydropower generation forecasting via deep neural network | |
Ye et al. | Demand forecasting of online car‐hailing by exhaustively capturing the temporal dependency with TCN and Attention approaches | |
ABBAS | A survey of research into artificial neural networks for crime prediction | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||