CN111242395B

CN111242395B - Method and device for constructing prediction model for OD (origin-destination) data

Info

Publication number: CN111242395B
Application number: CN202010336521.XA
Authority: CN
Inventors: 韦伟; 刘岭; 王舟帆; 张�杰; 白光禹
Original assignee: CRSC Research and Design Institute Group Co Ltd
Current assignee: CRSC Research and Design Institute Group Co Ltd
Priority date: 2020-04-26
Filing date: 2020-04-26
Publication date: 2020-07-31
Anticipated expiration: 2040-04-26
Also published as: CN111242395A

Abstract

The invention belongs to the technical field of artificial intelligence prediction and discloses a method and a device for constructing a prediction model for OD data. The method for constructing the prediction model comprises the following steps: step S1: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set; step S2: constructing an OD sparse spatiotemporal residual network model according to the processed data; step S3: training the OD sparse spatiotemporal residual network model by using the back-propagation rule of deep neural networks and the Adam algorithm; step S4: verifying and outputting the OD sparse spatiotemporal residual network model. Because the OD sparse spatiotemporal residual network describes the complex spatio-temporal dependence and the distribution characteristics of OD data, the OD data can be predicted accurately.

Description

Method and device for constructing prediction model for OD (origin-destination) data

Technical Field

The invention belongs to the technical field of artificial intelligence prediction, and particularly relates to a method and a device for constructing a prediction model for OD data.

Background

OD data refers to the magnitude of a social or economic interaction variable from an origin (point o) to a destination (point d) within a certain period of time, and is also called the OD quantity. Such interactions between spatial geographic units in a given time period include the scale of economic trade, the frequency of social communication, or the volume of passenger and goods transport between areas. The combination of the OD quantities between all origin (o) and destination (d) points is called the OD matrix. Prediction of OD data is often an important basis for planning and deciding socio-economic activities between regions. Generally, the task of OD data prediction is to predict the OD data of subsequent periods given historical OD data (and the corresponding environmental information), so as to support the organization of socio-economic activities and management decisions.

Taking passenger transportation as an example, for a specific rail transit network, the passenger volume between a specific origin (point o) and a specific destination (point d) is also called the OD traffic. The combination of the OD traffic between all origin (o) and destination (d) points may be referred to as an OD traffic matrix. Generally, the distribution and evolution of OD data are complex spatio-temporal phenomena, and accurate prediction of passenger flow demand remains challenging, mainly in the following respects.

1) Multiple time dependence

OD data exhibits several kinds of time dependence concurrently. Considering the time dependence of only one dimension rarely yields ideal prediction results.

Current-day trend dependence: for example, the OD traffic demand in a study period tends to depend on the OD traffic in the several periods immediately preceding it on the same day.

Daily-interval period dependence: for example, the OD traffic at a particular time may depend not only on the historical traffic at nearby times, but may also be correlated with the historical traffic at the same time one day (or several days) earlier.

Weekly-interval period dependence: the OD traffic at a particular time is correlated with the historical traffic at the same time one week (or several weeks) earlier.

2) Complex spatial dependence

OD data has a complex spatial dependence, which can be further divided by cause into origin dependence and destination dependence. These two spatial dependencies differ significantly from the spatial dependency relationship embodied in conventional spatial data. Identifying them accurately is an important basis for accurately predicting the OD passenger flow.

Origin dependence: there is a correlation between OD quantities that start from adjacent origins and end at a specific destination. For example, in a subway network, the travel volumes of passengers who depart from adjacent stations near a certain residential area and arrive at the same station in the city center follow the same pattern of change.

Destination dependence: there is a correlation between OD quantities that start from a specific origin and end at adjacent destinations. For example, in a subway network, the travel volumes of passengers who depart from a station in a certain residential area and arrive at adjacent stations in the city center have a strong positive correlation.

3) Sparse distribution

Apart from a few elements, most elements in the OD matrix take the value 0, so the OD matrix has an obviously sparse distribution. For example, owing to factors such as the nature of a station's service area, residents' travel behavior, route accessibility and competition from other modes of transport, there is no passenger travel demand for most combinations of departure station and destination station in a rail transit network. When OD data is predicted on the basis of temporal and spatial dependence without considering the sparse distribution of passenger flow demand, the predicted value is often greater than 0 where the actual OD value is 0; this is called the spatio-temporal dependence overflow phenomenon of the prediction model. When this overflow occurs, the predicted OD values deviate considerably from the actual values, so that the prediction accuracy of the model is reduced and the risk of overfitting is increased.

How to predict OD demand accurately while taking into account the complex spatio-temporal correlation and sparse distribution of OD data is a problem that urgently needs to be solved. Many scholars have proposed OD data prediction models based on traditional time series and machine learning methods, mainly of the following types: (a) linear prediction models, such as time series prediction models and Kalman filtering models; (b) nonlinear prediction models, such as wavelet prediction models, chaotic prediction models and nonparametric prediction models; (c) simulation prediction models, such as cellular automaton prediction methods and traffic simulation prediction methods; (d) shallow machine learning prediction models, such as support vector machines and shallow neural networks.

The research results have important significance for OD data prediction, but have limitations. When the linear prediction model is used for processing the prediction problem with strong randomness and nonlinear characteristics, the development distribution and evolution rule of short-time passenger flow data are difficult to fully reflect; the nonlinear prediction model can describe the nonlinear characteristics of OD data, but in the face of mass small-granularity short-time passenger flow data, the prediction precision of the nonlinear prediction model needs to be further improved; the simulation prediction model is generally higher in modeling cost, and the model calculation efficiency is difficult to meet the timeliness requirement; and the shallow machine learning prediction model is easy to generate over-fitting and under-fitting problems when processing big data.

In recent years, the successful application of deep learning in various fields has stimulated research into its application in traffic and transportation. For example, some studies treat the road network traffic of a whole city as a heat map (where each pixel value represents the traffic within the corresponding region) and model the nonlinear spatial dependence using Convolutional Neural Networks (CNNs). In addition, some researchers have proposed using Recurrent Neural Networks (RNNs) to build nonlinear time-dependent models for traffic flow prediction. Subsequent research has fused CNNs and RNNs into comprehensive prediction models that consider both temporal and spatial dependence, further improving prediction accuracy. However, most of these studies take urban road traffic flow as the study target and do not sufficiently consider the complex spatio-temporal dependence and sparsity peculiar to OD data, so they have difficulty predicting OD data accurately.

As described above, many prediction models used in conventional spatio-temporal prediction methods can output only the predicted value of a single object, and are difficult to apply to OD data prediction for an entire system. For example, such methods are difficult to apply to OD passenger flow prediction for a whole rail transit network, whereas optimizing, adjusting and scheduling a real-time rail transit transportation plan often requires predicting the future OD passenger flow matrices of all stations within a certain spatial range or even the whole network. The few spatio-temporal prediction methods suitable for an entire system fail to consider the complex temporal or spatial dependence specific to OD passenger flow, and also ignore the sparse distribution of OD passenger flow demand. Therefore, it is necessary to provide a method and a device for constructing a prediction model for OD data that are designed specifically for the characteristics of OD data.

Disclosure of Invention

In view of the above problems, the present invention provides a method for constructing a prediction model for OD data, including:

step S1: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;

step S2: constructing an OD sparse spatiotemporal residual network model according to the processed data;

step S3: training the OD sparse spatiotemporal residual network model by using the back-propagation rule of deep neural networks and the Adam algorithm;

step S4: verifying and outputting the OD sparse spatiotemporal residual network model.

In the above method for constructing a prediction model, step S1 includes:

selecting at least one interval historical period data set and a current day trend data set by taking different time periods as prediction targets according to historical data, wherein the at least one interval historical period data set comprises: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.

In the above method for constructing a prediction model, step S2 includes:

step S21: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;

step S22: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;

step S23: constructing an OD residual convolution unit, and extracting spatial features from the feature map and the current day trend data set;

step S24: constructing a simplified time sequence processing unit to extract nonlinear time sequence correlation characteristics from a plurality of spatial characteristics to complete time-space characteristic aggregation and obtain historical time-space prediction;

step S25: obtaining a preliminary prediction according to the historical space-time prediction and an external environment prediction.

In the above method for constructing a prediction model, step S2 further includes:

step S26: introducing a non-zero element attention mechanism that accounts for sparsity to obtain a final prediction from the preliminary prediction.

In the above method for constructing a prediction model, in step S3, a training sample set is formed according to the constructed current day trend data set and at least one interval historical period data set, and model training is performed according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm.
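As a minimal sketch of the Adam update named above (the standard published update rule, not the training code of the invention), the following NumPy function performs one parameter update. The function name `adam_step`, the hyperparameter defaults and the quadratic toy loss are illustrative assumptions.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count (hypothetical helper)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage (assumed loss): minimise ||w - target||^2, gradient 2 * (w - target).
target = np.array([1.0, -2.0])
w = np.zeros(2)
m = np.zeros(2)
v = np.zeros(2)
for step in range(1, 2001):
    grad = 2.0 * (w - target)
    w, m, v = adam_step(w, grad, m, v, step, lr=0.05)
```

In a real model the gradient would come from back-propagation through the network rather than from a closed-form expression.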

In the above method for constructing a prediction model, step S23 includes:

step S231: constructing a plurality of depth separation one-dimensional convolution layers, and extracting the characteristics of departure place dependence and destination dependence by stacking the depth separation one-dimensional convolution layers;

step S232: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial features.

In the above method for constructing a prediction model, step S24 includes:

step S241: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;

step S242: extracting nonlinear time sequence associated features from a current day trend space feature set and at least one interval period space feature set through a plurality of continuous two-dimensional point-by-point convolution layers;

step S243: obtaining a current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation features;

step S244: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the at least one interval period space-time predicted value.
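The time sequence processing described in the steps from S241 onward can be sketched as follows, assuming the spatial features of one feature set are stacked along a leading time axis and that ReLU is used between the point-by-point layers. The names `pointwise_conv2d` and `simp_seq_unit`, the layer sizes and the random weights are hypothetical; how the trend and interval predictions are fused afterwards is not specified here and is left out.

```python
import numpy as np

def pointwise_conv2d(x, weight, bias):
    """1x1 2-D convolution: mixes channels independently at every (i, j) cell.

    x: (C_in, N, N), weight: (C_out, C_in), bias: (C_out,).
    """
    return np.einsum("oc,cij->oij", weight, x) + bias[:, None, None]

def simp_seq_unit(features, layers):
    """Apply consecutive point-by-point layers, with ReLU on hidden layers."""
    h = features
    for idx, (w, b) in enumerate(layers):
        h = pointwise_conv2d(h, w, b)
        if idx < len(layers) - 1:
            h = np.maximum(h, 0.0)  # assumed activation between layers
    return h

rng = np.random.default_rng(0)
T, N, hidden = 4, 5, 8
stacked = rng.random((T, N, N))   # spatial features stacked in time
layers = [
    (rng.normal(size=(hidden, T)), np.zeros(hidden)),
    (rng.normal(size=(1, hidden)), np.zeros(1)),
]
prediction = simp_seq_unit(stacked, layers)[0]   # (N, N) space-time prediction
```

Because the same weights are shared by every OD pair, the number of parameters depends only on the number of time steps and hidden channels, not on N.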

In the above method for constructing a prediction model, step S26 includes:

step S261: obtaining an OD matrix mean value according to at least one interval historical period data set and a current day trend data set;

step S262: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;

step S263: filtering the preliminary predicted value through the non-zero element attention matrix to obtain a final predicted value of the OD matrix in a certain period.
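Steps S261 to S263 can be illustrated with a small sketch. The exact form of the non-zero activation function is not given in this passage; the code below assumes a simple step function (1 where the historical mean exceeds a threshold, 0 elsewhere), and the function name `nonzero_attention` and the toy numbers are hypothetical.

```python
import numpy as np

def nonzero_attention(history, preliminary, threshold=0.0):
    """history: (T, N, N) stacked historical OD matrices; preliminary: (N, N)."""
    mean_od = history.mean(axis=0)                    # S261: OD matrix mean
    attention = (mean_od > threshold).astype(float)   # S262: assumed step activation
    return attention * preliminary                    # S263: element-wise filtering

history = np.array([
    [[0, 3], [0, 1]],
    [[0, 5], [0, 0]],
], dtype=float)
preliminary = np.array([[0.4, 4.2], [0.1, 0.6]])
final = nonzero_attention(history, preliminary)
```

In this toy case the OD pairs that never carried traffic in the history are forced to 0 in the final prediction, which is the kind of overflow suppression the mechanism is meant to provide.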

In the above prediction model construction method, in step S4, the OD sparse spatiotemporal residual network model is verified based on the final prediction value.

The invention also provides a device for constructing a prediction model for OD data, which comprises:

a preprocessing unit: preprocessing historical data, and selecting at least one interval historical period data set and a current day trend data set;

a model construction unit: constructing an OD sparse spatiotemporal residual network model according to the processed data;

a model training unit: training the OD sparse spatiotemporal residual network model by using the back-propagation rule of deep neural networks and the Adam algorithm;

a verification output unit: verifying and outputting the OD sparse spatiotemporal residual network model.

In the above apparatus for constructing a prediction model, the preprocessing unit selects at least one interval history cycle data set and a current day trend data set according to the history data and using different time periods as prediction targets, where the at least one interval history cycle data set includes: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.

The above prediction model constructing apparatus, wherein the model constructing unit includes:

a time-slip processing unit: processing at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set;

a feature map obtaining unit: constructing a two-dimensional point-by-point convolutional layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set and sliding day interval historical period data to obtain a feature map;

OD residual convolution unit: extracting spatial features from the feature map and the current day trend data set;

the simplified time sequence processing unit: extracting nonlinear time sequence correlation characteristics from a plurality of spatial characteristics to complete space-time characteristic aggregation and then obtaining historical space-time prediction;

a preliminary prediction obtaining unit: obtaining a preliminary prediction according to the historical space-time prediction and an external environment prediction.

The above prediction model constructing apparatus, wherein the model constructing unit further includes:

a final prediction obtaining unit: introducing a non-zero element attention mechanism that accounts for sparsity to obtain a final prediction from the preliminary prediction.

In the prediction model construction device, the model training unit forms a training sample set according to the constructed current day trend data set and at least one interval historical period data set, and performs model training according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm.

The above prediction model construction apparatus, wherein the OD residual convolution unit includes:

the spatial feature extraction module: constructing a plurality of depth separation one-dimensional convolution layers, and extracting the characteristics of departure place dependence and destination dependence by stacking the depth separation one-dimensional convolution layers;

a spatial feature output module: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial features.

The above prediction model construction apparatus, wherein the simplified time-series processing unit includes:

a spatial feature set obtaining module: stacking the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;

the time sequence correlation characteristic extraction module: extracting nonlinear time sequence associated features from a current day trend space feature set and at least one interval period space feature set through a plurality of continuous two-dimensional point-by-point convolution layers;

a space-time prediction value obtaining module: obtaining a time-space predicted value of the current day trend and at least one interval period time-space predicted value according to the extracted time sequence correlation characteristics;

a historical space-time prediction value obtaining module: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of at least one interval period.

The above prediction model construction apparatus, wherein the final prediction obtaining unit includes:

an OD matrix mean value obtaining module: obtaining an OD matrix mean value according to the space-time predicted value of at least one interval period and the current day trend data set;

a conversion module: converting the OD matrix mean value into a non-zero element attention matrix through a non-zero activation function;

a final predicted value output module: filtering the preliminary predicted value through the non-zero element attention matrix to obtain a final predicted value of the OD matrix in a certain period.

In the prediction model construction device, the verification output unit verifies the OD sparse spatiotemporal residual network model based on the final predicted value.

Compared with the prior art, the invention has the following effects: by constructing an OD sparse spatiotemporal residual network, OD-SparsesSTnet, that describes the complex spatio-temporal dependence and distribution characteristics of OD data, the OD data can be predicted accurately. In OD-SparsesSTnet, to address the complex spatial correlation of OD data, a residual convolution unit OD_ResUnit is constructed that identifies origin-dependence and destination-dependence features at the same time, so that the complex spatial dependence is described accurately; to address the problem that existing recurrent neural networks need huge training samples and computing resources when predicting the OD matrix, a simplified time sequence processing unit Simp_SeqUnit is constructed, which extracts and predicts the nonlinear temporal features of all OD data using fewer parameters; to address the sparse distribution of OD data, a non-zero element attention mechanism is designed, which further improves the prediction accuracy of the non-zero part of the OD matrix.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a flow chart of a predictive model construction method of the present invention;

FIG. 2 is a flowchart illustrating the substeps of step S2 in FIG. 1;

FIG. 3 is a flowchart illustrating the substeps of step S23 in FIG. 2;

FIG. 4 is a flowchart illustrating the substeps of step S24 in FIG. 2;

FIG. 5 is a flowchart illustrating the substeps of step S26 in FIG. 2;

FIG. 6 is a schematic diagram of an OD sparse spatiotemporal residual network model;

FIG. 7 is a schematic diagram of a sliding time window mechanism for periodic historical feature extraction;

FIG. 8 is a schematic diagram of the structure of an OD residual convolution unit;

FIG. 9 is a schematic diagram of spatial feature extraction;

FIG. 10 is a simplified schematic diagram of a sequential processing unit;

FIG. 11 is a schematic diagram of a sequential stacked feature matrix processing procedure;

FIG. 12 is a schematic diagram of a non-zero element attention mechanism considering sparsity;

FIG. 13 is a schematic structural diagram of a prediction model construction apparatus.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.

References to "a plurality" herein include "two" and "more than two".

The prediction model constructed by the invention can be applied to predicting the interaction relation between the spatial and geographic units in a certain time period, such as the OD passenger flow of a rail transit network, the economic trade scale between regions, the social communication frequency and the passenger and goods transportation volume.

Referring to fig. 1, fig. 1 is a flowchart of a prediction model construction method according to the present invention. As shown in fig. 1, the method for constructing a prediction model of the present invention includes:

step S1: preprocessing the historical data, and selecting at least one interval historical period data set and a current day trend data set.

Specifically, at least one interval history cycle data set and a current day trend data set are selected from the history data with different time periods as prediction targets in step S1, and the at least one interval history cycle data set includes: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.

Definition 1: for a rail transit network including N stations, let the set of all stations be $S = \{1, 2, 3, \dots, N\}$ and the set of all trips in the t-th time period of the study be $P_t$. For stations $i \in S$ and $j \in S$, the OD passenger flow from station i to station j in the t-th time period, $x_t^{i,j}$, can be defined as:

$$x_t^{i,j} = \left|\{\, p \in P_t : p_o = i \wedge p_d = j \,\}\right| \tag{1}$$

where $p_o$ and $p_d$ represent the origin and destination of the p-th trip in the t-th time period, and $|\cdot|$ denotes the cardinality of a set, that is, the number of elements in the set. In equation (1), $|\{p_o = i \wedge p_d = j\}|$ denotes the number of elements in the set $\{p_o = i \wedge p_d = j\}$.

Definition 2 (OD traffic matrix): in the studied rail transit network, the OD passenger flows $x_t^{i,j}$ between all stations in the t-th time period form the OD passenger flow matrix $X_t$ of the rail transit network in the t-th time period, where $X_t \in \mathbb{R}^{N \times N}$ and $\mathbb{R}$ denotes the real numbers; that is, $X_t$ is a real matrix of dimension N × N, as shown in the following equation.

$$X_t = \begin{bmatrix} x_t^{1,1} & \cdots & x_t^{1,N} \\ \vdots & \ddots & \vdots \\ x_t^{N,1} & \cdots & x_t^{N,N} \end{bmatrix} \tag{2}$$
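Definitions 1 and 2 amount to counting trips per (origin, destination) pair. A minimal sketch, assuming each trip is recorded as a pair of 1-based station numbers (the function name `od_matrix` and the sample trips are hypothetical):

```python
import numpy as np

def od_matrix(trips, n_stations):
    """trips: iterable of (origin, destination) pairs with 1-based station ids."""
    x = np.zeros((n_stations, n_stations), dtype=np.int64)
    for origin, destination in trips:
        x[origin - 1, destination - 1] += 1   # count one trip from i to j
    return x

# Trips observed in one time period t (illustrative data).
trips_t = [(1, 2), (1, 2), (3, 1), (2, 3)]
X_t = od_matrix(trips_t, n_stations=3)
```

Even in this tiny example most entries of `X_t` are 0, which mirrors the sparse distribution described in the background.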

For a particular rail transit network, given the OD passenger flow matrices $\{X_t \mid t = 1, 2, 3, \dots, m-1\}$ over the first m-1 time periods, the task is to predict the OD passenger flow matrix $X_m$ in the m-th time period. The problem of predicting the passenger flow of a rail transit network can be expressed as follows:

（3）

where $\Omega(\cdot)$ is the prediction model or prediction function, and $E_t$ denotes the environmental variables in the t-th time period, which describe environmental information such as weather conditions (sunny, cloudy, rain, snow, fog), air temperature, wind speed, and whether the day is a holiday.

Step S2: constructing an OD sparse spatiotemporal residual network model according to the processed data.

Referring to fig. 2, fig. 2 is a flowchart illustrating a substep of step S2 in fig. 1. As shown in fig. 2, the step S2 includes:

step S21: processing the at least one interval historical period data set based on a time sliding mechanism to correspondingly obtain at least one sliding interval historical period data set.

Specifically, referring to fig. 6 and 7, fig. 6 is a schematic diagram of the OD sparse spatiotemporal residual network model. As shown in FIG. 6, the OD traffic matrix of the t-th time period to be predicted is not strongly correlated with all of the OD traffic matrices of the preceding t-1 time periods. For example, the OD passenger flow at 8:00 in the morning peak of a certain Tuesday is strongly correlated not only with 7:00-8:00 of the same day, but also with the historical data around the 8:00 morning peak on several preceding working days and on several preceding Tuesdays; it depends much less on the passenger flow data of other time periods. Therefore, the invention respectively selects a weekly interval historical period data set, a daily interval historical period data set and a current day trend data set from the historical data as input variables of the OD sparse spatiotemporal residual network model.

The current day trend data set $S_{tr}$ is constructed from the OD traffic matrices of the $q_{tr}$ time periods immediately preceding the period t to be predicted, as shown in formula (4), where $X_{t-k_{tr}}$ is the OD traffic matrix $k_{tr}$ time periods before the period t to be predicted, and $S_{tr} \in \mathbb{R}^{q_{tr} \times N \times N}$, that is, $S_{tr}$ is a real array of dimension $q_{tr} \times N \times N$.

$$S_{tr} = \left[ X_{t-q_{tr}},\ X_{t-(q_{tr}-1)},\ \dots,\ X_{t-1} \right] \tag{4}$$
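The construction of the current day trend data set can be sketched as a slice over a time-indexed array of OD matrices. The function name `trend_dataset`, the 0-based time index and the oldest-first ordering are assumptions made for illustration.

```python
import numpy as np

def trend_dataset(od_series, t, q_tr):
    """Stack the q_tr OD matrices immediately preceding period t.

    od_series: (T, N, N) array indexed by time period (0-based).
    Returns an array of shape (q_tr, N, N), oldest period first.
    """
    if t - q_tr < 0:
        raise ValueError("not enough history before period t")
    return od_series[t - q_tr:t]

# Illustrative series: 6 periods of 2 x 2 OD matrices.
series = np.arange(6 * 2 * 2).reshape(6, 2, 2)
S_tr = trend_dataset(series, t=5, q_tr=3)
```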

Because the evolution of OD passenger flow differs somewhat from period to period, the OD passenger flow in the period to be predicted is strongly correlated with the observations an integer number of historical cycles earlier, and also depends to a certain extent on the observations in the periods surrounding them. For example, the OD traffic of the 8:00 morning peak on a certain Tuesday is strongly correlated with the observations of the 8:00 morning peak on the previous Tuesday and of the adjacent time periods. Therefore, a time sliding mechanism needs to be considered when constructing the weekly and daily interval historical period data sets. For the $k_w$-th weekly interval and the $k_d$-th daily interval, the sliding history data sets $S_w^{k_w}$ and $S_d^{k_d}$ can be represented by formula (5).

$$\begin{aligned} S_w^{k_w} &= \left[ X_{t-7Ik_w-p_w},\ \dots,\ X_{t-7Ik_w},\ \dots,\ X_{t-7Ik_w+p_w} \right] \\ S_d^{k_d} &= \left[ X_{t-Ik_d-p_d},\ \dots,\ X_{t-Ik_d},\ \dots,\ X_{t-Ik_d+p_d} \right] \end{aligned} \tag{5}$$

where $S_w^{k_w} \in \mathbb{R}^{(2p_w+1) \times N \times N}$, that is, $S_w^{k_w}$ is a real array of dimension $(2p_w+1) \times N \times N$, and $S_d^{k_d} \in \mathbb{R}^{(2p_d+1) \times N \times N}$, that is, $S_d^{k_d}$ is a real array of dimension $(2p_d+1) \times N \times N$. $q_w$ and $q_d$ are the numbers of historical cycles in the weekly interval and daily interval historical period data sets, respectively; $p_w$ and $p_d$ are the respective time sliding window sizes; and I is the number of time periods in a day, which depends on the length of the study period. The weekly and daily interval historical period data sets $S_w$ and $S_d$ are constructed from the sliding weekly interval historical period data sets $S_w^{k_w}$ and the sliding daily interval historical period data sets $S_d^{k_d}$, as shown in formula (6).

$$S_w = \left\{ S_w^{k_w} \mid k_w = 1, 2, \dots, q_w \right\}, \qquad S_d = \left\{ S_d^{k_d} \mid k_d = 1, 2, \dots, q_d \right\} \tag{6}$$
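The time sliding mechanism can be sketched as collecting, for each historical cycle, a window of 2p + 1 periods centred on the period one or more whole cycles before t. This is an illustration under stated assumptions: the function name `sliding_interval_dataset` is hypothetical, the time index is 0-based, a day is taken to contain I periods and a week 7·I periods, and p is assumed smaller than the cycle length.

```python
import numpy as np

def sliding_interval_dataset(od_series, t, period, q, p):
    """Collect q windows of width 2p+1 centred `k * period` steps before t.

    od_series: (T, N, N); period: number of time periods per cycle
    (I for a day, 7 * I for a week); returns shape (q, 2p+1, N, N).
    """
    windows = []
    for k in range(1, q + 1):
        centre = t - k * period
        if centre - p < 0:
            raise ValueError("not enough history for cycle k=%d" % k)
        windows.append(od_series[centre - p:centre + p + 1])
    return np.stack(windows)

# Illustrative series: period index stored in every cell, I = 4 periods per day.
I = 4
od_series = np.arange(40, dtype=float)[:, None, None] * np.ones((1, 2, 2))
S_d = sliding_interval_dataset(od_series, t=30, period=I, q=2, p=1)
S_w = sliding_interval_dataset(od_series, t=30, period=7 * I, q=1, p=1)
```

With these toy values the first daily window covers periods 25 to 27 and the weekly window covers periods 1 to 3.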

step S22: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map.

Referring to fig. 7, fig. 7 is a schematic diagram of the sliding time window mechanism for periodic historical feature extraction. Referring to FIG. 7 together with part a of FIG. 6, the present invention applies the two-dimensional point-by-point convolutional layer 1x1_cov2 to the sliding history data sets $S_w^{k_w}$ (weekly interval) and $S_d^{k_d}$ (daily interval) to carry out feature extraction and aggregation, obtaining the feature maps $F_w^{k_w}$ and $F_d^{k_d}$, as shown in equation (7).

$$F_w^{k_w} = f\left( W_w * S_w^{k_w} + b_w \right), \qquad F_d^{k_d} = f\left( W_d * S_d^{k_d} + b_d \right) \tag{7}$$

where $W_w$, $W_d$, $b_w$ and $b_d$ are learnable parameters, f is an activation function, and $*$ represents the convolution operation. The ReLU function is chosen as f to ensure sparsity of the features, but the invention is not limited thereto.
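To make the role of the point-by-point layer concrete: a 1 × 1 two-dimensional convolution reduces, for each OD pair (i, j), to a weighted sum over the window slots of the sliding data set, followed by the activation. Written element-wise for the weekly case, with notation assumed here for illustration ($S_w^{k_w}$ for the sliding weekly data set, $F_w^{k_w}$ for its feature map, hypothetical scalar weights $w_c$ and bias $b$, and a single output channel):

$$F_w^{k_w}[i,j] = f\Big( \sum_{c=1}^{2p_w+1} w_c \, S_w^{k_w}[c,i,j] + b \Big), \qquad f(z) = \max(0, z)$$

Because the same $2p_w + 1$ weights are shared by all N × N OD pairs, the layer adds very few parameters; the daily case is analogous with $p_d$.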

Step S23: Construct an OD residual convolution unit, and extract spatial features from the feature map and the current day trend data set.

Referring to fig. 3, fig. 3 is a flowchart illustrating a substep of step S23 in fig. 2. As shown in fig. 3, the step S23 includes:

Refer to FIG. 8 and FIG. 9. FIG. 8 is a schematic diagram of the structure of the OD residual convolution unit, and FIG. 9 is a schematic diagram of spatial feature extraction. As shown in part b of FIG. 6, the present invention proposes an OD residual convolution unit OD_result for extracting both origin-dependence and destination-dependence features; its structure is shown in FIG. 8. By stacking a plurality of depthwise separable one-dimensional convolution layers Ds_cov1, the OD residual convolution unit extracts the origin-dependence or destination-dependence features. It then uses the one-dimensional point-by-point convolution layer 1x1_cov1 to extract the spatial correlation structure, and aggregates the two spatial dependencies for output. The OD residual convolution unit also introduces residual connections to prevent vanishing gradients. Spatial features are extracted separately from the current day trend data set and from the weekly-interval and daily-interval feature maps to obtain the corresponding spatial features, as shown in formula (8).

（8）

FIG. 9 illustrates the principle of spatial feature extraction based on Ds_cov1 and 1x1_cov1, taking destination-dependence feature extraction as an example. Taking the rows of the OD passenger flow matrix as the channel dimension, the OD passenger flow matrix can be converted into a 1 × N-dimensional picture containing N channels, each channel representing the OD passenger flow structure departing from a particular station. Ds_cov1 can exploit this 1 × N structure, wherein the present invention utilizes a convolution kernel of 1 × to extract the correlation between the OD passenger flows from the same station to adjacent stations, but the invention is not limited thereto.
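The row-as-channel idea can be sketched as follows. This is an illustration only, not the structure of FIG. 8: it assumes a kernel of width 3 (the kernel size is not legible in this text), zero padding, and one plausible ordering of depthwise convolution, ReLU, point-by-point convolution and skip connection. All function names are hypothetical.

```python
def depthwise_conv1d(x, kernels):
    """Apply one kernel per channel along the station axis (zero padding)."""
    n = len(x[0])
    out = []
    for row, ker in zip(x, kernels):
        half = len(ker) // 2
        conv = []
        for j in range(n):
            acc = 0.0
            for d, w in enumerate(ker):
                idx = j + d - half
                if 0 <= idx < n:
                    acc += w * row[idx]
            conv.append(acc)
        out.append(conv)
    return out


def pointwise_conv1d(x, mix):
    """1x1 convolution: mix the channels independently at every position."""
    n = len(x[0])
    return [[sum(w * ch[j] for w, ch in zip(w_row, x)) for j in range(n)]
            for w_row in mix]


def od_residual_unit(x, kernels, mix):
    """Depthwise conv -> ReLU -> pointwise conv, plus the skip connection."""
    h = depthwise_conv1d(x, kernels)
    h = [[max(v, 0.0) for v in row] for row in h]
    h = pointwise_conv1d(h, mix)
    return [[a + b for a, b in zip(r_in, r_out)] for r_in, r_out in zip(x, h)]


# Rows are origins (channels), columns are destinations; hypothetical flows.
x = [[0.0, 5.0, 1.0], [4.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
identity_mix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
identity_kernels = [[0.0, 1.0, 0.0]] * 3
zero_kernels = [[0.0, 0.0, 0.0]] * 3
doubled = od_residual_unit(x, identity_kernels, identity_mix)
passthrough = od_residual_unit(x, zero_kernels, identity_mix)
```

With all-zero kernels the unit returns its input unchanged, which shows the role of the skip connection. Under the same assumptions, passing the transposed matrix would treat destinations as channels and so address the other spatial dependence.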

Step S24: Construct a simplified time sequence processing unit to extract nonlinear time sequence correlation features from the plurality of spatial features, complete the space-time feature aggregation, and obtain the historical space-time prediction.

Referring to fig. 4, fig. 4 is a flowchart illustrating a sub-step of step S24 in fig. 2. As shown in fig. 4, the step S24 includes:

Step S242: Construct a plurality of two-dimensional point-by-point convolution layers, and extract nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through the consecutive two-dimensional point-by-point convolution layers;

Step S243: Obtain the current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation features;

Step S244: Obtain the historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of the at least one interval period.

Refer to FIG. 10 and FIG. 11. FIG. 10 is a schematic diagram of the structure of the simplified time sequence processing unit, and FIG. 11 is a schematic diagram of the processing of the time-stacked feature matrix. A conventional recurrent neural network (RNN), such as a long short-term memory network (LSTM), can automatically learn long-term dependencies in a time sequence. However, predicting the OD passenger flow matrix of an entire traffic network requires the RNN to have high hidden-layer and output-layer feature dimensions, which demands a huge number of training samples and computing resources that are often unavailable in most application scenarios.

FIG. 10 shows the complete structure of the simplified time sequence processing unit. In the simplified time sequence processing unit, a plurality of consecutive two-dimensional point-by-point convolution layers 1x1_cov2 extract nonlinear time sequence correlation features from the spatial features of a plurality of time periods, and a 1x1_cov2 layer with one output channel completes the final space-time feature aggregation. The detailed principle of the simplified time sequence processing unit is shown in FIG. 11. Compared with a conventional RNN, when the number of time periods considered is small, the simplified time sequence processing unit can use fewer parameters to extract nonlinear temporal features and predict the OD passenger flows of all stations, and is more flexible and efficient on small data sets.
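A minimal sketch of such a unit is given below, assuming that the time periods are treated as input channels, that every layer except the last uses ReLU, and that the last layer has a single output channel. The layer sizes and weights are hypothetical and are not taken from FIG. 10 or FIG. 11.

```python
def conv1x1(channels, weights, bias, activation=None):
    """1x1 convolution over a stack of equally sized 2-D maps."""
    n_rows, n_cols = len(channels[0]), len(channels[0][0])
    out = []
    for w_row, b in zip(weights, bias):
        fmap = []
        for i in range(n_rows):
            row = []
            for j in range(n_cols):
                v = sum(w * ch[i][j] for w, ch in zip(w_row, channels)) + b
                row.append(activation(v) if activation else v)
            fmap.append(row)
        out.append(fmap)
    return out


def simplified_temporal_unit(features, layers):
    """features: T spatial feature maps stacked in time.

    layers: list of (weights, bias); the last layer must have 1 output channel.
    """
    h = features
    for weights, bias in layers[:-1]:
        h = conv1x1(h, weights, bias, activation=lambda v: max(v, 0.0))
    weights, bias = layers[-1]
    if len(weights) != 1:
        raise ValueError("final layer must have exactly one output channel")
    return conv1x1(h, weights, bias)[0]


# T = 3 time periods of a 2 x 2 spatial feature map (hypothetical values).
u = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 3.0], [1.0, 4.0]],
]
layers = [
    ([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]], [0.0, 0.0]),  # 3 -> 2 channels, ReLU
    ([[0.5, 0.5]], [0.0]),                              # 2 -> 1 channel
]
z = simplified_temporal_unit(u, layers)
print(z)
```

In this sketch the number of parameters (11 here) depends only on the number of time periods and hidden channels, not on the number of stations N, which is one way to read the claim about needing fewer parameters than an RNN.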

Specifically, part c of FIG. 6 shows the process of extracting the time-dependence features of OD passenger flow using the simplified time sequence processing unit. The spatial features obtained by the OD residual convolution unit are stacked in time to obtain the current day trend spatial feature set U_tr, the weekly interval period spatial feature set U_w, and the daily interval period spatial feature set U_d, as shown in formula (9).

（9）

The simplified time sequence processing unit is applied to U_tr, U_w and U_d respectively to extract time sequence features, yielding the current day trend space-time predicted value Z_tr, the weekly interval period space-time predicted value Z_w, and the daily interval period space-time predicted value Z_d, as shown in formula (10).

（10）

Where φ is the processing function of the simplified time sequence processing unit, and the parameter sets appearing in formula (10) are the learnable parameters of the simplified time sequence processing unit. The three space-time predicted values are summed to obtain the historical space-time predicted value Z_st, as shown in formula (11).

Z_st = Z_tr + Z_w + Z_d （11）

In addition to the current day trend dependence, daily interval period dependence and weekly interval period dependence considered here, monthly interval (4 weeks) period dependence and yearly interval (52 weeks) period dependence may be further considered. In step S21 of the method of the present invention, monthly-interval and yearly-interval period historical data sets are added, and the corresponding space-time features are extracted with a processing flow similar to that of the daily-interval and weekly-interval period historical data sets and used for the final OD passenger flow prediction; this may serve as a further embodiment of the invention. When the monthly interval period dependence and the yearly interval period dependence are considered, the historical space-time predicted value of formula (11) is updated to formula (11-1), where Z_m and Z_y are the space-time predicted values of the monthly interval period and the yearly interval period, respectively.

Z_st = Z_tr + Z_w + Z_d + Z_m + Z_y （11-1）

Specifically, to account for the influence of external environmental factors, the historical space-time predicted value Z_st and the external environment predicted value Z_Et (obtained from E_t through an ordinary linear layer) are summed to obtain the preliminary prediction for period t, as shown in formula (12).

（12）
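As a worked instance of the two summations described in the text (the sum of the three space-time predicted values, then the addition of the external environment predicted value), the following sketch uses small hypothetical matrices. In the patent Z_Et is produced by a linear layer; here it is simply given as data.

```python
def add_matrices(*mats):
    """Element-wise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


# Hypothetical 2 x 2 predicted values for one period t.
z_tr = [[1.0, 0.0], [2.0, 0.5]]
z_w = [[0.5, 0.0], [1.0, 0.5]]
z_d = [[0.5, 0.0], [1.0, 0.0]]
z_et = [[0.25, 0.0], [-0.5, 0.0]]

z_st = add_matrices(z_tr, z_w, z_d)     # historical space-time predicted value
prelim = add_matrices(z_st, z_et)       # preliminary prediction for period t
print(z_st)
print(prelim)
```

Adding monthly or yearly terms, as in the further embodiment, would only mean passing more matrices to the same sum.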

Referring to fig. 5, fig. 5 is a flowchart illustrating a sub-step of step S26 in fig. 2. As shown in fig. 5, the step S26 includes:

Refer again to FIG. 12, which is a schematic diagram of the non-zero element attention mechanism that takes sparsity into account. To describe the sparsity of the spatial distribution of the OD data, in part d of FIG. 6 the present invention also introduces a non-zero element attention mechanism (Non-zero Activation), whose principle is shown in FIG. 12. The sparsity of the OD passenger flow matrix remains stable over time. Therefore, the mean of the OD matrices of all input data sets is used here as the known sparse distribution; through the non-zero activation function Λ, this mean matrix is converted into a non-zero element attention matrix. The attention matrix is used to filter the preliminary predicted value, giving the final predicted value of the OD passenger flow matrix for period t and thereby ensuring the sparsity of the final prediction result, as shown in formula (13).

（13）

The non-zero activation function Λ used here is given by formula (14). The parameter λ can be set according to the actual data distribution; when λ is large enough, Λ(x) approaches 1 even for very small values of x. This sparse structure also restricts the space-time feature extraction of the OD-SparsesSTnet network to the neighbourhood of non-zero elements, thereby improving the prediction accuracy of the final predicted value.

（14）
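The filtering step can be sketched as follows. Formula (14) is not reproduced in this text, so the sketch substitutes an assumed function, tanh(λx), chosen only because it has the two properties the text states: it is 0 at x = 0 and approaches 1 for small positive x when λ is large. It should not be read as the patent's actual Λ. The attention matrix is applied by element-wise multiplication, which is also an assumption.

```python
import math


def nonzero_activation(x, lam):
    """Assumed stand-in for Lambda: tanh(lam * x), 0 at x = 0, near 1 when lam * x is large."""
    return math.tanh(lam * x)


def matrix_mean(mats):
    """Element-wise mean of a list of equally sized OD matrices."""
    count = len(mats)
    return [[sum(vals) / count for vals in zip(*rows)] for rows in zip(*mats)]


def apply_nonzero_attention(prelim, mean, lam):
    """Filter the preliminary prediction with the attention matrix Lambda(mean)."""
    return [[nonzero_activation(m, lam) * p for m, p in zip(m_row, p_row)]
            for m_row, p_row in zip(mean, prelim)]


# Two hypothetical historical OD matrices and a preliminary prediction.
history = [
    [[0.0, 0.02], [0.5, 0.0]],
    [[0.0, 0.00], [0.7, 0.0]],
]
prelim = [[0.3, 0.04], [0.6, -0.1]]
mean = matrix_mean(history)
final = apply_nonzero_attention(prelim, mean, lam=1000.0)
print(final)
```

OD pairs whose historical mean is zero are forced to zero in the final prediction, while pairs with even a small historical mean pass through almost unchanged.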

Step S3: Train the OD sparse space-time residual error network model by using the back propagation rule of the deep neural network and the Adam algorithm.

In step S3, a training sample set may be formed according to the current day trend data set, the weekly interval period data set, and the daily interval period data set, and model training may be performed according to the training sample set by using a back propagation rule of the deep neural network and the Adam algorithm.
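For readers unfamiliar with Adam, the update rule can be sketched on a toy problem. This is the standard Adam algorithm applied to a one-parameter least-squares fit, not the patent's training code: the data, learning rate and step count are hypothetical, and the gradient that back propagation would supply in a real network is written out by hand.

```python
import math


def adam_minimize(grad_fn, w0, lr=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Minimise a scalar-parameter loss with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for step in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1.0 - beta1 ** step)          # bias correction
        v_hat = v / (1.0 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]    # toy targets generated by y = 2x


def mse_grad(w):
    """Derivative of mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


w_fit = adam_minimize(mse_grad, w0=0.0)
print(w_fit)
```

In the model itself the same update would be applied to every learnable parameter, with the gradients of the loss obtained by back propagation through the network.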

The training objective of the model is to minimize the root mean square error (RMSE) loss, which is shown in formula (15).

（15）

However, for the predicted sparse OD passenger flow matrix, the commonly used prediction deviation index, mean absolute percentage error (MAPE), may have a denominator of 0. The present invention therefore provides a total absolute percentage error (GAPE) to evaluate the prediction accuracy of the entire OD passenger flow matrix, as shown in the following formula:

（16）
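The two indices can be sketched on a small example. RMSE is written in its standard form. Formula (16) is not reproduced in this text, so the GAPE function below is an assumed form (sum of absolute errors divided by the sum of observed flows over the whole matrix), chosen because it stays defined when individual OD pairs are zero; the patent's exact definition may differ.

```python
import math


def flatten(matrix):
    """List all entries of a matrix row by row."""
    return [v for row in matrix for v in row]


def rmse(pred, true):
    """Root mean square error over all OD pairs."""
    diffs = [(p - t) ** 2 for p, t in zip(flatten(pred), flatten(true))]
    return math.sqrt(sum(diffs) / len(diffs))


def gape(pred, true):
    """Assumed form of GAPE: sum of absolute errors / sum of observed flows."""
    abs_err = sum(abs(p - t) for p, t in zip(flatten(pred), flatten(true)))
    return abs_err / sum(abs(t) for t in flatten(true))


# Hypothetical sparse OD matrix: two of the four OD pairs have zero flow.
true = [[0.0, 10.0], [5.0, 0.0]]
pred = [[0.0, 8.0], [6.0, 1.0]]
print(rmse(pred, true))
print(gape(pred, true))
```

A per-element MAPE would divide by the zero entries of the observed matrix, whereas the aggregate ratio only requires the total observed flow to be non-zero.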

Specifically, the inputs are: the normalized historical OD passenger flow matrix observation set {X_1, X_2, …, X_{m-1}}; the external environment information observation set {E_1, E_2, …, E_{m-1}}; the sequence lengths of the weekly dependence historical data, daily dependence historical data and current day trend historical data, q_w, q_d and q_tr; the sliding windows of the weekly dependence and daily dependence historical data, p_w and p_d; and the study period length h (unit: minutes; the total number of time periods in a day is I = 1440/h). With these inputs, the OD sparse space-time residual error network model is trained.

Specifically, the OD sparse spatiotemporal residual network model is verified based on the final predicted value.

The invention verifies the advantage of the proposed OD sparse space-time residual error network model by introducing 7 existing models as references. The basic settings of the 7 reference models are as follows:

Historical average (HA): the input weekly interval period data set and daily interval period data set are directly averaged over the same historical period, and the result is used as the predicted value of the OD passenger flow matrix at the prediction time.

Autoregressive integrated moving average (ARIMA): a classical time series prediction model, used here to predict the current day trend time series.

Three-dimensional convolutional network (CNN3): a three-dimensional convolutional network is applied directly to the current day trend time series to extract spatial and temporal features simultaneously (without separate processing of origin and destination dependence) and to predict.

General recurrent neural network (RNN): features are first extracted from each OD passenger flow matrix in the current day trend time series, and the RNN is then used for prediction.

Long short-term memory network (LSTM): features are first extracted from each OD passenger flow matrix in the current day trend time series, and the LSTM is then used for prediction.

Deep space-time network (DeepST): a space-time deep neural network prediction model for space-time data, mainly used for crowd flow aggregation prediction at the city scale.

Space-time residual network (ST-ResNet): a space-time residual network model for space-time data, which stacks a plurality of CNNs with corresponding residual connections for spatial feature identification and is mainly used for urban traffic flow prediction.
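The simplest of these baselines, HA, can be sketched in a few lines. The sketch assumes that the weekly-interval and daily-interval observations for the same time slot are pooled and averaged element by element; the sample matrices are hypothetical.

```python
def historical_average(matrices):
    """Element-wise mean of the historical OD matrices for the same time slot."""
    count = len(matrices)
    return [[sum(vals) / count for vals in zip(*rows)]
            for rows in zip(*matrices)]


# Hypothetical observations of the same slot in two earlier cycles.
same_slot_history = [
    [[2.0, 0.0], [4.0, 6.0]],
    [[4.0, 0.0], [0.0, 6.0]],
]
ha_prediction = historical_average(same_slot_history)
print(ha_prediction)
```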

Table 1: Comparison of the prediction results of the 7 reference models with the proposed OD-SparsesSTnet

Table 1 shows the comparison of the prediction results of the 7 reference models and the proposed OD-SparsesSTnet model in the morning-peak period 8:00-8:15 (or 8:00-8:05). Because the HA method only partially considers the historical periodic dependence at weekly and daily intervals and does not consider spatial dependence, its prediction results have the highest RMSE and GAPE values and the largest prediction deviation. ARIMA, RNN and LSTM are all essentially time sequence models: they can better identify the current day trend dependence but cannot identify the complex spatial dependence of OD passenger flow, and the prediction indices of the three are better than those of the HA method. In particular, when the output feature dimension is large, RNN and LSTM need a large number of hidden-layer parameters, so they are difficult to apply to OD passenger flow matrix prediction for the entire traffic network and show large prediction deviations.

CNN3 can regard the time variation of the OD passenger flow matrix as a third dimension and can, to a certain extent, extract the current day trend dependence and spatial dependence features; its prediction results are superior to those of the three time sequence models and the HA method. As typical space-time data processing models, DeepST and ST-ResNet can simultaneously extract multiple time correlations and external factor features, and their internal CNNs can also identify spatial dependence to a certain extent. The prediction deviation of these two models is greatly reduced relative to the other reference models. However, DeepST and ST-ResNet still cannot finely describe the origin dependence and destination dependence specific to OD passenger flow, and they do not consider the sparse distribution of the data.

On the basis of considering the complex time dependence (weekly interval period dependence, daily interval period dependence and current day trend dependence) and space dependence (origin dependence and destination dependence) specific to the OD passenger flow of a traffic network, the proposed OD-SparsesSTnet further introduces a non-zero activation mechanism to describe the sparse distribution characteristic of OD passenger flow. The prediction deviation of OD-SparsesSTnet is significantly lower than that of the existing spatial model (CNN3), time sequence models (ARIMA, RNN and LSTM) and space-time models (DeepST and ST-ResNet), with the prediction deviation reduced by more than 14.89%.

Referring to fig. 13, fig. 13 is a schematic structural diagram of a prediction model construction device. As shown in fig. 13, the prediction model building apparatus of the present invention includes:

The preprocessing unit 11: preprocesses the historical data, wherein the preprocessing unit 11 selects at least one interval historical period data set and a current day trend data set from the historical data with different time periods as prediction targets; the data set construction unit selects a plurality of data sets from the historical data with different time periods as the prediction targets, and the at least one interval historical period data set comprises at least one of a weekly interval historical period data set, a daily interval historical period data set, a monthly interval historical period data set, and an annual interval historical period data set;

The model construction unit 12: constructs an OD sparse space-time residual error network model according to the processed data;

The model training unit 13: trains the OD sparse space-time residual error network model by using the back propagation rule of the deep neural network and the Adam algorithm, wherein the model training unit 13 forms a training sample set from the constructed current day trend data set and at least one interval period data set, and performs model training on the training sample set by using the back propagation rule of the deep neural network and the Adam algorithm;

The verification output unit 14: obtains a prediction result from real-time data through the OD sparse space-time residual error network model, wherein the OD sparse space-time residual error network model is verified based on the final predicted value.

Further, the model building unit 12 includes:

The time sliding processing unit 121: processes the at least one interval historical period data set based on a time sliding mechanism to obtain the corresponding at least one sliding interval historical period data set;

The feature map obtaining unit 122: constructs a two-dimensional point-by-point convolution layer, and performs feature extraction and aggregation on the at least one sliding interval historical period data set to obtain a feature map;

The OD residual convolution unit 123: extracts spatial features from the feature map and the current day trend data set;

The simplified time sequence processing unit 124: extracts nonlinear time sequence correlation features from the plurality of spatial features, completes the space-time feature aggregation, and obtains the historical space-time prediction;

The preliminary prediction obtaining unit 125: obtains the preliminary prediction according to the historical space-time prediction and the external environment prediction;

The final prediction obtaining unit 126: introduces a non-zero element attention mechanism that takes sparsity into account, and obtains the final prediction according to the preliminary prediction.

Still further, the OD residual convolution unit 123 includes:

The spatial feature extraction module 1231: constructs a plurality of depthwise separable one-dimensional convolution layers, and extracts the origin-dependence or destination-dependence features by stacking the depthwise separable one-dimensional convolution layers;

The spatial feature output module 1232: constructs a one-dimensional point-by-point convolution layer, extracts the spatial correlation structure through the one-dimensional point-by-point convolution layer, and aggregates the two spatial dependencies to output the spatial features.

Still further, the simplified timing processing unit 124 includes:

The spatial feature set obtaining module 1241: stacks the spatial features in time to obtain a current day trend spatial feature set and at least one interval period spatial feature set;

The time sequence correlation feature extraction module 1242: extracts nonlinear time sequence correlation features from the current day trend spatial feature set and the at least one interval period spatial feature set through a plurality of consecutive two-dimensional point-by-point convolution layers;

The space-time predicted value obtaining module 1243: obtains the current day trend space-time predicted value and at least one interval period space-time predicted value according to the extracted time sequence correlation features;

The historical space-time predicted value obtaining module 1244: obtains the historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of the at least one interval period.

Further, the final prediction obtaining unit 126 includes:

The OD matrix mean obtaining module 1261: obtains an OD matrix mean according to the space-time predicted value of the at least one interval period and the current day trend data set;

The conversion module 1262: converts the OD matrix mean into a non-zero element attention matrix through a non-zero activation function;

The final predicted value output module 1263: filters the preliminary predicted value through the non-zero element attention matrix to obtain the final predicted value of the OD matrix in a given period.

In conclusion, the prediction model constructed according to the invention can predict the OD matrix while using fewer network parameters and training resources; the OD sparse space-time residual error network model simultaneously considers the complex time dependence (current day trend, weekly interval and daily interval), space dependence (origin dependence and destination dependence) and sparse distribution characteristics specific to OD data, which improves the descriptive capacity and prediction accuracy of the model; the origin dependence and the destination dependence are identified simultaneously through the residual convolution unit, which gives an accurate description of the complex space dependence; the simplified time sequence processing unit solves the problem that existing recurrent neural networks need a huge number of training samples and computing resources when predicting the OD matrix; and the non-zero element attention mechanism takes the sparse distribution characteristic of OD data into account, so that the feature extraction of the model is carried out only around non-zero elements, which further improves the prediction accuracy of the non-zero part of the OD matrix.

Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A method for constructing a prediction model for OD data is characterized by comprising the following steps:

step S4: verifying and outputting an OD sparse space-time residual error network model;

wherein, the step S2 includes:

step S25: obtaining preliminary prediction according to historical space-time prediction and external environment prediction;

step S26: introducing a non-zero attention mechanism of sparsity to obtain final prediction according to the preliminary prediction;

in the step S3, a training sample set is formed according to the constructed current day trend data set and at least one interval historical period data set, and model training is performed according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm;

the step S23 includes:

step S232: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial characteristics;

the step S24 includes:

step S243: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of at least one interval period;

the step S26 includes:

2. The prediction model construction method according to claim 1, wherein the step S1 includes:

3. The prediction model construction method of claim 1, wherein in the step S4, the OD sparse spatiotemporal residual network model is verified based on the final prediction value.

4. A prediction model construction device for OD data is characterized by comprising:

a verification output unit: verifying and outputting an OD sparse space-time residual error network model;

wherein the model construction unit includes:

a feature map obtaining unit: constructing a two-dimensional point-by-point convolution layer, and performing feature extraction and aggregation on at least one sliding interval historical period data set to obtain a feature map;

a preliminary prediction obtaining unit: obtaining preliminary prediction according to historical space-time prediction and external environment prediction;

the model building unit further comprises:

a final prediction obtaining unit: introducing a non-zero attention mechanism of sparsity to obtain final prediction according to the preliminary prediction;

the model training unit forms a training sample set according to the constructed current day trend data set and at least one interval historical period data set, and performs model training according to the training sample set by using a back propagation rule of a deep neural network and an Adam algorithm;

the OD residual convolution unit includes:

a spatial feature output module: constructing a one-dimensional point-by-point convolutional layer, extracting a spatial correlation structure through the one-dimensional point-by-point convolutional layer, and aggregating the two spatial dependencies to output spatial characteristics;

the simplified timing processing unit includes:

a historical space-time prediction value obtaining module: obtaining a historical space-time predicted value according to the current day trend space-time predicted value and the space-time predicted value of at least one interval period;

the final prediction obtaining unit includes:

5. The prediction model construction apparatus according to claim 4, wherein the preprocessing unit selects at least one interval history cycle data set and a current day trend data set for the prediction target at different time periods based on the history data, the at least one interval history cycle data set comprising: at least one of a weekly interval historical cycle data set, a daily interval historical cycle data set, a monthly interval historical cycle data set, and an annual interval historical cycle data set.

6. The prediction model construction apparatus of claim 4, wherein the verification output unit verifies the OD sparse spatiotemporal residual network model based on the final prediction value.