CN113435587A - Time-series-based task quantity prediction method and device, electronic equipment and medium - Google Patents


Info

Publication number
CN113435587A
Authority
CN
China
Prior art keywords: task; prediction; differential equation; neural network; predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110892935.5A
Other languages
Chinese (zh)
Inventor
胡文波 (Hu Wenbo)
崔鹏 (Cui Peng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd
Priority to CN202110892935.5A
Publication of CN113435587A
Legal status: Pending

Classifications

    • G06N 3/045 Combinations of networks
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/006 Artificial life, i.e. computing arrangements simulating life, based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N 3/08 Learning methods
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks


Abstract

The invention provides a time-series-based task quantity prediction method, a device, electronic equipment and a readable storage medium. The method first acquires historical time-series data of a task and establishes a neural network comprising an initialization layer, a random differential equation layer in which a random differential equation is arranged, and a prediction layer. The neural network is then trained with a training data set of the task to obtain a trained neural network. The historical time-series data of the task are input into the trained neural network, which outputs the mean and the variance of the future data, from which the prediction interval of the future data is finally obtained. The invention establishes the relation between the mean and the variance through the random differential equation layer, thereby ensuring that the mean and the variance converge to their optimal values simultaneously and obtaining a more reliable prediction interval.

Description

Time-series-based task quantity prediction method and device, electronic equipment and medium
Technical Field
The invention relates to the technical field of machine learning, in particular to a time series prediction method and device, electronic equipment and a readable storage medium.
The present application is a divisional application of the patent application having the application number of 2021103165792.
Background
Time series prediction is a class of tasks common in machine learning and is widely applied in scenarios such as finance, industry, manufacturing and transportation, for example power consumption prediction, stock analysis, traffic flow prediction and weather forecasting.
With the rapid development of deep learning in recent years, deep neural networks have become very important machine learning tools, exceeding human-level performance on many tasks. Accordingly, a number of neural time series prediction methods have been developed, but existing networks for time series prediction often produce unreliable, overconfident predictions.
Moreover, when a machine learning system is deployed in real-world applications, point prediction alone does not meet the requirements: a prediction task needs a reliable and accurate prediction interval to support the user's decision making.
Disclosure of Invention
In view of the above, the present invention provides a time series prediction method, a time series prediction apparatus, an electronic device and a readable storage medium.
The invention is realized by the following steps:
in a first aspect, the present invention provides a time series prediction method, including the following steps:
acquiring historical time sequence data of tasks;
establishing a neural network; the neural network comprises an initialization layer, a random differential equation layer and a prediction layer, wherein a random differential equation is arranged in the random differential equation layer; the initialization layer is used for extracting an initialization feature map of the historical time-series data; the random differential equation layer is used for obtaining a mean feature and a variance feature of the historical time-series data via the random differential equation, whose initial value is the initialization feature map, the mean feature being the solution of the random differential equation and the variance feature being the diffusion coefficient corresponding to that solution; the prediction layer is used for predicting the mean and the variance of future data from the mean feature and the variance feature;
training the neural network by utilizing a training data set of a task to obtain a trained neural network;
inputting historical time sequence data of the task into a trained neural network, and acquiring the mean value and the variance of future data through the neural network;
and obtaining a prediction interval of the future data according to the mean value and the variance of the future data.
Specifically, in the time series prediction method, the random differential equation is as follows:

dz_t = f(z_t, t; Θ_f) dt + g(z_t, t; Θ_g) dB_t

where f(·) denotes the drift coefficient of the random differential equation; g(·) denotes the diffusion coefficient of the random differential equation; z_t denotes the state or output of the hidden layer of the neural network at time t; Θ_f denotes the neural network parameters of the drift term; Θ_g denotes the neural network parameters of the diffusion term; and B_t denotes the state of the Brownian motion at time t in the random differential equation.
Specifically, in the time series prediction method, the stochastic differential equation is solved iteratively by the Euler method, and the update formula of the iterative solution is as follows:

z_{s+1} = z_s + f(z_s, s; Θ_f) Δt + g(z_s, s; Θ_g) √Δt · W_s

where s denotes the s-th iteration; z_s denotes the solution of the random differential equation at the s-th iteration; z_{s+1} denotes the solution at the (s+1)-th iteration; Δt denotes the iteration step size; and W_s denotes a random variable sampled from a standard normal distribution.
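As a minimal runnable sketch (not the patented implementation), the Euler iteration above can be written with toy drift and diffusion functions standing in for the parameterized networks f(·; Θ_f) and g(·; Θ_g); the concrete functions used below are illustrative assumptions:

```python
import numpy as np

def euler_maruyama(z0, f, g, dt, n_steps, rng):
    """Iterate z_{s+1} = z_s + f(z_s)*dt + g(z_s)*sqrt(dt)*W_s with W_s ~ N(0, 1)."""
    z = np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        w = rng.standard_normal(z.shape)  # W_s sampled from a standard normal
        z = z + f(z) * dt + g(z) * np.sqrt(dt) * w
    return z

# Toy stand-ins for the drift and diffusion networks (illustrative assumptions)
rng = np.random.default_rng(0)
z_T = euler_maruyama([1.0], f=lambda z: -0.5 * z,
                     g=lambda z: 0.1 * np.ones_like(z),
                     dt=0.01, n_steps=100, rng=rng)
```

With g(·) identically zero, the update reduces to the plain Euler method for the deterministic equation dz/dt = f(z).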
In some embodiments, when the neural network is trained with a training data set of a task, the loss function of the neural network is composed of the mean and the variance of the future data, which ensures that the mean and the variance of the future data converge to an optimal point simultaneously. The formula is as follows:

L(Θ_f; Θ_g) = Σ_{i=1}^{N} [ (y_i - μ(X_i))² / (2σ²(X_i)) + (1/2) log σ²(X_i) ] + constant

where μ(·) denotes the mean of the future data output by the neural network; σ²(·) denotes the variance of the future data output by the neural network; L(Θ_f; Θ_g) denotes the negative log-likelihood loss function to be minimized; constant denotes a constant term; (X_i, y_i) is a set of real sample point data, in which X_i is the input data to the neural network in the training data set and y_i is the real data in the training data set corresponding to the future data output by the neural network; and N denotes the number of sample points, 1 ≤ i ≤ N.
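The loss above is a Gaussian negative log-likelihood. Dropping the constant term, it can be sketched as follows (averaged over samples rather than summed, which only rescales the objective):

```python
import numpy as np

def gaussian_nll(mu, sigma2, y):
    """Mean negative log-likelihood of y under N(mu, sigma2), constant term dropped."""
    mu, sigma2, y = (np.asarray(a, dtype=float) for a in (mu, sigma2, y))
    return float(np.mean(0.5 * np.log(sigma2) + (y - mu) ** 2 / (2.0 * sigma2)))
```

Because σ² appears in both terms, shrinking the variance inflates the squared-error term and enlarging it inflates the log term, so the mean and variance are optimized jointly rather than independently.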
Preferably, in some embodiments, the time series prediction method further includes the step of obtaining an uncertainty according to the mean and the variance of the future data, wherein the uncertainty is used for representing the degree of reliability of the prediction interval output by the neural network.
In some embodiments, obtaining the prediction interval of the future data according to the mean and the variance of the future data includes the following steps:
obtaining the prediction distribution of the future data according to the mean value and the variance of the future data;
and acquiring a prediction interval of the future data according to the prediction distribution of the future data.
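Assuming a Gaussian predictive distribution N(μ, σ²) (the document does not fix the distribution family, so this is an illustrative assumption), the two steps above can be sketched as:

```python
import math

def prediction_interval(mu, sigma2, confidence=0.95):
    """Two-sided interval mu +/- z * sigma under a Gaussian predictive distribution."""
    z_table = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}  # standard normal quantiles
    half = z_table[confidence] * math.sqrt(sigma2)
    return mu - half, mu + half
```

For example, a predicted mean of 10 and variance of 4 gives the 95% interval (6.08, 13.92).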
Preferably, acquiring the historical time-series data of the task comprises the following steps: acquiring real-time sequence data of the task;
normalizing the real-time sequence data to obtain normalized time sequence data;
and converting the normalized time sequence data into historical time sequence data of the task by using a sliding window algorithm.
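A minimal sketch of this preprocessing, assuming min-max normalization (the normalization formula is not fixed by the document) and a hypothetical `make_windows` helper:

```python
import numpy as np

def make_windows(series, window, horizon=1):
    """Min-max normalize a raw series, then slide a window over it:
    each row of X holds `window` past points, each row of y the next `horizon` points."""
    s = np.asarray(series, dtype=float)
    s = (s - s.min()) / (s.max() - s.min())  # min-max normalization (assumed)
    n = len(s) - window - horizon + 1
    X = np.array([s[i:i + window] for i in range(n)])
    y = np.array([s[i + window:i + window + horizon] for i in range(n)])
    return X, y
```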
In some embodiments, there is also provided a time series prediction apparatus, the apparatus comprising:
the first acquisition module is used for acquiring historical time sequence data of the task;
the neural network module comprises an initialization unit, a random differential equation unit and a prediction unit, wherein a random differential equation is arranged in the random differential equation unit; the initialization unit is used for extracting initialization characteristic mapping of historical time series data; the random differential equation unit is used for acquiring a mean characteristic and a variance characteristic of the historical sequence data by using a random differential equation with an initial value as initialization feature mapping, wherein the mean characteristic is a solution of the random differential equation, and the variance characteristic is a diffusion coefficient corresponding to the solution of the random differential equation;
the prediction unit is used for predicting the mean value and the variance of future data according to the mean value characteristic and the variance characteristic;
the training module is used for training the neural network by utilizing a training data set of a task to obtain a trained neural network;
the input module is used for inputting the historical time sequence data of the task into the trained neural network;
the second acquisition module is used for acquiring the mean value and the variance of future data through a neural network;
and the third acquisition module acquires the prediction interval of the future data according to the mean value and the variance of the future data.
In some embodiments, there is also provided an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the time series prediction method when executing the computer program.
In some embodiments, a readable storage medium is also provided, on which a computer program is stored, which computer program, when being executed by a processor, is adapted to carry out the steps of the time series prediction method.
The invention has the following beneficial effects. The time series prediction method establishes a relatively simple neural network model: no posterior distribution needs to be computed, fewer computing resources are required during training, fewer network models are built, training is fast, and the computational cost is low. A random differential equation layer is arranged in the neural network, and the interaction between the mean and the variance is established through the random differential equation, so that the mean feature and the variance feature output by this layer are mutually related, and the mean and the variance of future data that the prediction layer optimizes from these features are closely connected. During training, the mean and the variance of the future data to be predicted can therefore converge simultaneously to the point that is optimal for both, and the diffusion term simulates real-world interference and disturbance. As a result, the mean and variance of the future data output by the prediction layer are more certain and more reliable, and the generated prediction interval of the future data has higher precision and reliability.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a basic flowchart of a time series prediction method according to an embodiment of the present invention;
FIG. 2 is an architecture diagram of a neural network constructed according to an embodiment of the present invention;
FIG. 3 is a basic flowchart of training a neural network to obtain a trained neural network according to an embodiment of the present invention;
FIG. 4 is a basic flowchart of an embodiment of the present invention for obtaining historical time series data of a task;
FIG. 5 is a basic flowchart of an embodiment of the present invention for obtaining a prediction interval of future data according to the mean and variance of the future data;
FIG. 6 is a schematic diagram of the 95%-confidence prediction interval obtained for a pick-up data set by the time series prediction method according to the embodiment of the present invention;
FIG. 7 is a schematic diagram of an evaluation of the calibration loss for the pick-up data set in an embodiment of the invention;
FIG. 8 is a schematic structural diagram of a time series prediction apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
In the figure:
s1201, initializing a layer; s1202, a random differential equation layer; s1203, predicting a layer; 201. a first acquisition module; 202. a neural network module; 203. a training module; 204. an input module; 205. a second acquisition module; 206. a third obtaining module; 301. a memory; 302. a processor.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "first", "second", "third", and the like are used only for distinguishing the description, and are not intended to indicate or imply relative importance.
In deep learning, existing uncertainty estimation and interval prediction algorithms can generally be classified into three categories: the Bayesian neural network (BNN), the deep ensemble method, and the heteroscedastic neural network (HNN).
In contrast to standard deep neural networks, the Bayesian neural network (BNN) provides uncertainty estimates by introducing prior probabilities on the weights of the network and representing its parameters as probability distributions. At the same time, by representing the parameters as a prior probability distribution, the mean is computed over many models during training, which regularizes the network and prevents overfitting. The Bayesian neural network obtains a prediction distribution by modeling uncertainty over the parameters, then computes a prediction interval and quantifies the prediction uncertainty.
The deep ensemble method is a model ensembling technique that can be regarded as an empirical Bayes method: several deep neural networks are built simultaneously and each is initialized separately, so that each network has different initial weights, which enhances model diversity. Through this ensembling, the final prediction distribution and uncertainty can be obtained; compared with the Bayesian neural network, the ensemble method is simpler to use and deploy and often performs well.
In the heteroscedastic neural network, a variance term is introduced into the last layer of the network, and the negative log-likelihood loss (NLL loss) is optimized by stochastic gradient descent. Training the network in this way lets the deep neural network learn the mean and variance of the future data, thereby establishing the prediction distribution; the uncertainty of the network's prediction is quantified and the prediction interval is computed. This approach is direct, simple and convenient.
On the time series prediction task, the three methods proceed in basically similar ways: fit and learn the data, obtain the optimal mean and variance through a neural network, estimate the prediction distribution under a variance assumption, quantify the uncertainty of the network output, and compute the prediction interval of the required confidence from the prediction distribution.
However, while the Bayesian neural network prevents overfitting by introducing uncertainty to regularize the network weights and its modeling is simple, performing Bayesian inference in a neural network is very challenging. On one hand, because the posterior distribution must be computed, a large amount of computing resources is often needed and the training process is slow in practical applications; on the other hand, the computed prediction uncertainty tends to be inaccurate due to model misspecification and approximate inference.
The deep ensemble method needs to integrate several neural network models, so it often requires long training times and large amounts of computing resources, and the computation becomes very expensive as model complexity grows. This brings many difficulties to practical applications (for example, collecting traffic flow in real time and predicting the traffic flow within the next hour).
Although the heteroscedastic neural network has no expensive Bayesian inference process and low model complexity, it directly outputs only one probability distribution at the last layer of the network, corresponding to the mean and variance of the network output. Throughout information propagation and network optimization, the mean and the variance undergo relatively independent learning processes, with no explicit connection or interaction between them. Due to this limitation, the mean and variance to be optimized cannot be tightly connected within the network, so the mean and variance predicted by the neural network may fail to converge simultaneously to the point that is optimal for both. Simply letting the last layer output the mean and variance therefore tends to yield a prediction distribution with high error; the resulting prediction interval is unreliable, and the uncertainty quantified from that mean and variance is likewise not reliable enough.
In view of the above problems, the inventors of the present application adopt the following time series prediction method, which balances the complexity of the neural network against the reliability of the prediction result. As shown in fig. 1, the time series prediction method comprises the following steps:
s101, acquiring historical time sequence data of a task;
s102, establishing a neural network; as shown in fig. 2, the neural network includes an initialization layer S1201, a random differential equation layer S1202 and a prediction layer S1203, and a random differential equation is set in the random differential equation layer S1202; the initialization layer S1201 is used for extracting initialization characteristic mapping of historical time series data; the random differential equation layer S1202 is configured to obtain a mean characteristic and a variance characteristic of the historical sequence data by using a random differential equation whose initial value is initialized feature mapping, where the mean characteristic is a solution of the random differential equation, the variance characteristic is a diffusion coefficient corresponding to the solution of the random differential equation, and the prediction layer S1203 is configured to predict a mean and a variance of future data according to the mean characteristic and the variance characteristic;
s103, training the neural network by using a training data set of a task to obtain a trained neural network;
s104, inputting historical time series data of the task into a trained neural network, and acquiring the mean value and the variance of future data through the neural network;
and S105, acquiring a prediction interval of the future data according to the mean value and the variance of the future data.
In step S101, the historical time-series data of the task refers to historical data in the several time periods closest to the future period to be predicted. Taking the traffic flow prediction task as an example, step S101 acquires the historical time-series data of the traffic flow prediction task. Specifically, the historical time-series data of the traffic flow in the entrance lane of an intersection may be acquired by a traffic flow acquisition device arranged at that entrance lane. For example, if the future data to be predicted are the traffic flow data of the entrance lane for the 10-minute period 12:00-12:10, the historical time-series data are the data of a plurality of 10-minute intervals prior to 12:00.
In step S102, specifically, the initialization layer S1201 of the neural network extracts, by upsampling or downsampling, an initialization feature map that is easy for the neural network to learn from the historical time-series data.
The prediction layer S1203 performs linear transformation on the mean feature and the variance feature, and predicts a mean and a variance of future data.
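Putting the three layers together, a toy forward pass might look as follows; all weights, dimensions and coefficient functions here are illustrative assumptions, not the patented architecture:

```python
import numpy as np

def forward(history, rng, dt=0.1, n_steps=10):
    """Toy forward pass: initialization layer -> SDE layer -> prediction layer."""
    # Initialization layer: map the history to an initial feature vector z_0
    w_init = np.linspace(0.1, 1.0, history.size)  # fixed toy weights
    z = np.array([float(history @ w_init), float(history.mean())])
    diffusion = np.full_like(z, 0.01)             # default if n_steps == 0
    # SDE layer: Euler iteration; the final z is the mean feature and the
    # final diffusion coefficient is the variance feature
    for _ in range(n_steps):
        drift = np.tanh(-z)
        diffusion = 0.05 * np.abs(z) + 0.01
        z = z + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal(z.shape)
    # Prediction layer: linear transforms of the mean and variance features
    mu = float(z.mean())
    sigma2 = float((diffusion ** 2).mean())
    return mu, sigma2

mu, sigma2 = forward(np.array([0.2, 0.4, 0.6, 0.8]), np.random.default_rng(1))
```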
In step S103, specifically, taking the Traffic flow prediction task as an example, the training data sets Metro-Traffic and Traffic flow are public data sets of the Traffic flow prediction task.
As shown in fig. 3, step S103 specifically includes the following steps:
s1031, randomly initializing neural network parameters, wherein the parameters comprise weight and bias;
s1032, initializing hyper-parameters of the neural network optimizer and the Euler equation, wherein the hyper-parameters comprise learning rate, weight attenuation, batch size, iteration step length and iteration step number;
s1033, inputting the processed training data into a neural network, and iteratively solving a random differential equation parameterized by the neural network by using an Euler method;
s1034, optimizing parameters of the neural network by using a gradient descent updating algorithm until the neural network converges;
and S1035, obtaining the finally optimized neural network parameters.
The optimization of the neural network is mainly parameter optimization, and a trained neural network is obtained.
Specifically, in step S1034, the neural network converges when its loss function becomes stable, i.e. |L_{m+1} - L_m| < τ, where τ is a preset loss-function threshold, L_m is the value of the loss function after the m-th update of the neural network parameters, and L_{m+1} is the value of the loss function after the (m+1)-th update.
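The training steps and the stopping rule can be illustrated on the simplest possible case: jointly fitting a constant mean and variance by gradient descent on the negative log-likelihood. This is a sketch under illustrative assumptions, not the patent's optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(3.0, 0.5, size=200)  # toy stand-in for training targets
mu, log_s2 = 0.0, 0.0               # parameters: mean and log-variance
lr, tau, prev = 0.1, 1e-10, np.inf
for m in range(10000):
    s2 = np.exp(log_s2)
    loss = np.mean(0.5 * log_s2 + (y - mu) ** 2 / (2.0 * s2))
    if abs(prev - loss) < tau:      # stop when |L_{m+1} - L_m| < tau
        break
    prev = loss
    # analytic gradients of the NLL with respect to mu and log(sigma^2)
    grad_mu = np.mean(-(y - mu) / s2)
    grad_ls = np.mean(0.5 - (y - mu) ** 2 / (2.0 * s2))
    mu -= lr * grad_mu
    log_s2 -= lr * grad_ls
# at convergence mu approaches the sample mean and exp(log_s2) the sample variance
```

Note that the single loss drives both parameters at once, which is exactly the "mean and variance converge together" behaviour the document describes.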
in step S104, taking a traffic flow prediction task as an example, historical time series data of the traffic flow acquired in real time is input to a trained neural network, and a mean and a variance of future data of the traffic flow are acquired through the neural network.
Compared with a Bayesian neural network and a deep integration method, the time sequence prediction method disclosed by the invention does not need to calculate posterior distribution, and requires less computing resources in the training process; compared with a deep integration method, the number of the established network models is small, the training process is fast, and the calculation cost is low; compared with the traditional heteroscedastic neural network, the generated future data has higher prediction interval precision and higher reliability.
It should be noted that, when the time series prediction method of the present invention is applied to an actual prediction task, the prediction horizon needs to be adjusted to the actual situation. The method essentially exploits the continuity of the process being predicted: it statistically analyses past historical time-series data to estimate the development trend of the task, so the closer the future time is to the historical time, the more accurate the obtained prediction interval. Under normal conditions, the traffic flow prediction interval is on the order of minutes.
Specifically, the reason why the prediction interval generated by the present invention has higher accuracy and reliability is the random differential equation layer S1202.
In this embodiment, the random differential equation in the random differential equation layer S1202 is as follows:

dz_t = f(z_t, t; Θ_f) dt + g(z_t, t; Θ_g) dB_t

where f(·) denotes the drift coefficient of the random differential equation; g(·) denotes the diffusion coefficient of the random differential equation; z_t denotes the state or output of the hidden layer of the neural network at time t; Θ_f denotes the neural network parameters of the drift term; Θ_g denotes the neural network parameters of the diffusion term; and B_t denotes the state of the Brownian motion at time t in the random differential equation.
In this embodiment, specifically, the Euler method is used to iteratively solve the random differential equation whose initial value is the initialization feature map. The solution of the random differential equation is the value at the termination time, taken as the mean feature; the diffusion coefficient corresponding to the solution is the diffusion coefficient at the termination time, taken as the variance feature. The connection between the mean feature and the variance feature of the historical time-series data is thereby established through the random differential equation.
The update formula of the Euler iterative solution is as follows:

z_{s+1} = z_s + f(z_s, s; Θ_f) Δt + g(z_s, s; Θ_g) √Δt · W_s

where s denotes the s-th iteration; z_s denotes the solution of the random differential equation at the s-th iteration; z_{s+1} denotes the solution at the (s+1)-th iteration; Δt denotes the iteration step size; and W_s denotes a random variable sampled from a standard normal distribution.
In physics there is a well-known relationship between particle motion and ambient disturbance: Brownian motion. Physicists first established an equation to model this relationship, which later developed into the random differential equation. The diffusion term of the random differential equation represents the disturbance and interference applied to the current state of the system, while the solution of the equation represents the current state itself (for example, the velocity of a moving particle), so the solution at the termination time of the random differential equation better reflects real physical-world observations. From the practical meaning of mean and variance, the variance of a system also represents the degree of disturbance and interference acting on it: the smaller the variance, the narrower the corresponding normal distribution, the smaller the range in which the predicted value of the future data varies, the more certain the prediction, and the more stable the state of the system, i.e. the less the system is disturbed. The solution correspondingly represents the current state of the system, for example the current traffic flow of a traffic system at the current time.
Inspired by the physical meaning of Brownian motion and the modeling form of the random differential equation, the diffusion term of the random differential equation is associated with the variance, and the solution of the random differential equation is associated with the mean, so that the random differential equation establishes the interaction between the mean and the variance. The mean feature and variance feature output by the random differential equation layer S1202 are thus connected with each other, and the mean and variance of future data that the prediction layer S1203 obtains from these features are closely coupled; during training, the mean and variance of the future data can therefore converge jointly to their optimal points. Because the diffusion term simulates real interference and disturbance, the mean and variance of the future data output by the prediction layer S1203 are more definite and more reliable.
Whether the trained neural network is reliable hinges on whether, during training, the loss function of the neural network ensures that the mean and variance of the future data can converge jointly to their optimal points.
In this embodiment, specifically, when the neural network is trained on the training data set of a task, the loss function of the neural network is composed of the mean and variance of the future data, ensuring that the two converge jointly to their optimal points. The loss function of the neural network is given by:
L(θ_f; θ_g) = (1/N)·Σ_{i=1}^{N} [ (y_i − μ(X_i))² / (2σ²(X_i)) + (1/2)·log σ²(X_i) ] + constant

wherein μ(·) represents the mean of the future data output by the neural network; σ²(·) represents the variance of the future data output by the neural network; L(θ_f; θ_g) represents the negative log-likelihood loss function to be minimized, θ_f and θ_g being the neural network parameters of the drift and diffusion terms; constant represents a constant term; (X_i, y_i) is a set of real sample point data, where X_i is the input data of the training data set and y_i is the real data corresponding to the future data output by the network; N represents the number of sample points, 1 ≤ i ≤ N.
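The loss above is the standard Gaussian negative log-likelihood. A small NumPy sketch (illustrative only; in the patent, μ and σ² are produced by the prediction layer rather than passed in directly):

```python
import numpy as np

def gaussian_nll(mu, var, y):
    """Negative log-likelihood of y under N(mu, var), averaged over sample
    points. The additive 0.5*log(2*pi) plays the role of the constant term
    in the loss formula above."""
    mu, var, y = map(np.asarray, (mu, var, y))
    return float(np.mean((y - mu) ** 2 / (2.0 * var)
                         + 0.5 * np.log(var)
                         + 0.5 * np.log(2.0 * np.pi)))

# The loss is minimized when the mean matches the data and the variance
# matches the residual spread; a biased mean is penalized heavily.
y = np.array([1.0, 2.0, 3.0])
good = gaussian_nll(y, np.full(3, 0.1), y)        # perfect mean, small variance
bad = gaussian_nll(y + 1.0, np.full(3, 0.1), y)   # biased mean, same variance
```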
The neural network's prediction of future data carries a certain error, which is determined by the uncertainty of the neural network itself, so the time series prediction method further comprises the following step: acquiring an uncertainty according to the mean and variance of the future data, wherein the uncertainty is used to represent the reliability of the prediction interval output by the neural network.
The uncertainty is a quantification of the neural network's own uncertainty, and there are various ways to quantify it. In this embodiment, since the variance of the future data represents the degree of interference and disturbance acting on the system, the uncertainty is represented by the variance of the future data output by the neural network.
Specifically, as shown in fig. 4, the step S101 specifically includes the following steps:
S1011, collecting real time series data of the task;
S1012, normalizing the real time series data to obtain normalized time series data;
S1013, converting the normalized time series data into historical time series data of the task by using a sliding window algorithm.
After data normalization, the optimization toward the neural network's optimal solution becomes smoother and converges to the optimal solution more easily, while the sliding window algorithm reduces the amount of computation in data preprocessing.
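Steps S1011–S1013 can be sketched as follows in Python (the z-score normalization and the window/horizon sizes of 5 and 1 here are illustrative assumptions, matching the experimental setup described later):

```python
import numpy as np

def make_windows(series, window=5, horizon=1):
    """Normalize a raw series (S1012) and slice it into (input, target)
    pairs with a sliding window (S1013)."""
    series = np.asarray(series, dtype=float)
    mean, std = series.mean(), series.std()
    norm = (series - mean) / std                         # normalization
    xs, ys = [], []
    for i in range(len(norm) - window - horizon + 1):    # sliding window
        xs.append(norm[i:i + window])
        ys.append(norm[i + window:i + window + horizon])
    return np.stack(xs), np.stack(ys), (mean, std)

# Toy series: 10 collected values (S1011) -> 5 windows of length 5
X, y, stats = make_windows(np.arange(10.0), window=5, horizon=1)
```

The `(mean, std)` statistics are returned so predictions can be mapped back to the original scale.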
As shown in fig. 5, step S105 specifically includes the following steps:
S1051, obtaining the prediction distribution of the future data according to the mean and variance of the future data;
S1052, obtaining the prediction interval of the future data according to the prediction distribution of the future data. There are various types of prediction distribution, such as the normal distribution and the Poisson distribution; the normal distribution is a more commonly used prediction distribution in time series prediction and is the one specifically adopted in this embodiment.
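Under the normal predictive distribution, the prediction interval follows directly from the mean and variance. A small sketch (the z-values below are the standard normal quantiles for common confidence levels; in practice `scipy.stats.norm.ppf` would handle arbitrary levels):

```python
import math

def prediction_interval(mu, var, confidence=0.95):
    """Two-sided interval under the normal predictive distribution
    N(mu, var); 0.95 gives the familiar mu +/- 1.96*sigma band."""
    # Standard normal quantiles for a few common confidence levels;
    # other levels would need the inverse normal CDF.
    z = {0.90: 1.6449, 0.95: 1.9600, 0.99: 2.5758}[confidence]
    sigma = math.sqrt(var)
    return mu - z * sigma, mu + z * sigma

lo, hi = prediction_interval(10.0, 4.0)  # sigma = 2, so the band is 10 +/- 3.92
```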
In the present embodiment, a method is proposed to evaluate and monitor the reliability of the prediction results based on the confidence weighted calibration loss (CWCE) and its variant, the coefficient-of-determination-driven calibration loss (R-CWCE).
The confidence weighted calibration loss (CWCE) is calculated as follows:
Figure BDA0003196813660000141
where n is the number of sampled confidence levels; E(P_k) is the empirical coverage calculated over the prediction interval obtained by the time series prediction method of the present invention; P_k is the desired, i.e., true, confidence — for example, for a prediction interval at a desired confidence of 95%, P_k is 0.95; 1 ≤ k ≤ n.
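The empirical coverage E(P_k) is well defined: the fraction of true points falling inside the interval built at confidence P_k. Since the CWCE formula itself appears only as an image, the sketch below implements one plausible form — the confidence-weighted mean absolute gap between expected and empirical coverage — and should be read as an assumption, not the patent's exact definition:

```python
import numpy as np

def empirical_coverage(y_true, lo, hi):
    """E(P_k): fraction of true values inside the prediction interval."""
    y_true = np.asarray(y_true)
    return float(np.mean((y_true >= lo) & (y_true <= hi)))

def cwce(y_true, interval_fn, confidences):
    """Assumed form of a confidence-weighted calibration loss: each
    confidence level weights its own |empirical - expected| coverage gap."""
    gaps = []
    for p in confidences:
        lo, hi = interval_fn(p)  # interval at desired confidence p
        gaps.append(p * abs(empirical_coverage(y_true, lo, hi) - p))
    return float(np.mean(gaps))
```

A perfectly calibrated model has E(P_k) = P_k at every level, so every gap — and hence the loss — is zero.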
The calculation formula of the coefficient-of-determination-driven calibration loss (R-CWCE) is as follows:
Figure BDA0003196813660000151
wherein y_i is the real data in the training data set corresponding to the future data output by the neural network, ŷ_i is the predicted value of the future data output by the neural network after learning the training data set, and ȳ is the average of the real sample point data in the training data set.
The reliability of the neural network's prediction interval can be monitored by calculating CWCE and R-CWCE; the smaller the CWCE and R-CWCE, the more reliable the prediction result.
The time series prediction algorithm provided by the present embodiment can be applied to various tasks, such as a traffic flow prediction task, a passenger flow prediction task, a power consumption prediction task, and a stock price prediction task.
Experimental data of a traffic flow prediction task, a passenger flow prediction task, a power consumption prediction task, and a stock price prediction task are provided below, respectively.
Metro-Traffic and Traffic are public data sets of traffic flow; Pickups is a public data set of passenger flow; Electricity is a public data set of electricity usage; and Stock is a public data set of stock prices. These data sets are widely used.
In the experiments, the sliding window length and prediction horizon were set to 5 and 1 respectively; that is, for traffic flow prediction, the traffic flow value for the next hour was predicted using the traffic flow data of the past five hours.
For fair comparison, the experimental setup was the same for all methods. RMSE (root mean square error) and R² (R-squared, the coefficient of determination) are indices generally used to evaluate prediction accuracy.
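The two accuracy indices can be computed as follows (a generic sketch of the standard definitions, not code from the patent):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
```

Lower RMSE and higher R² (closer to 1) both indicate better prediction accuracy, matching the way Table 1 is read below.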
The results of the experiments are shown in Table 1, where MCD stands for the Monte Carlo dropout method; DGP for the deep Gaussian process method; BNN for the Bayesian neural network method; Deep ensemble for the deep ensemble method; HNN for the heteroscedastic neural network method; and SDE-HNN for the improved heteroscedastic neural network method based on the random differential equation described in this embodiment.
TABLE 1
Figure BDA0003196813660000161
Table 1 shows the prediction accuracy and reliability evaluation results of each method on the different time series prediction tasks; starting from the third column, each column gives the performance of one method across the prediction tasks. Four indices are reported for each task: for R², a larger value represents better prediction accuracy; RMSE, CWCE and R-CWCE represent prediction error, and for these three a smaller value represents better prediction performance and higher reliability.
As can be seen from Table 1, the R² index of the time series prediction method of the neural network established in this embodiment is larger than that of the other five methods, so the time series prediction method described in this embodiment has higher prediction accuracy. Its RMSE, CWCE and R-CWCE indices are all smaller than those of the other five methods, so the reliability of the prediction results obtained by the time series prediction method of this embodiment is also higher.
In particular, compared with the existing heteroscedastic neural network method, the CWCE and R-CWCE indices of the time series prediction method according to this embodiment are both greatly reduced; taking the Metro-Traffic data set as an example, CWCE is reduced by 69% and R-CWCE by 76%.
Fig. 6 is a schematic diagram of the 95%-confidence prediction interval on the Pickups data set obtained by the time series prediction method according to the embodiment of the present invention; fig. 7 is a schematic diagram of the calibration loss evaluation on the Pickups data set in an embodiment of the invention.
In fig. 6, the dark shaded portion represents the 95%-confidence prediction interval obtained by the time series prediction method described in this embodiment, the light shaded portion represents the 95%-confidence prediction interval obtained by the heteroscedastic neural network method, and the points on the broken line represent the real data.
In fig. 7, the horizontal axis represents the expected (true) confidence and the vertical axis the actually observed confidence; the closer a curve lies to the diagonal, the smaller the calibration loss and the more reliable the result. As can be seen from the figure, the curve of the method of this embodiment almost coincides with the diagonal and is better than the other existing methods, which shows that the variance of the future data obtained by the method described in this embodiment can produce reliable uncertainty estimates and prediction intervals.
As shown in fig. 8, the present invention further provides a time series prediction apparatus, including:
a first obtaining module 201, configured to obtain historical time-series data of a task;
the neural network module 202, the neural network includes an initialization unit, a random differential equation unit and a prediction unit, and a random differential equation is set in the random differential equation unit; the initialization unit is used for extracting initialization characteristic mapping of historical time series data; the random differential equation unit is used for acquiring a mean characteristic and a variance characteristic of the historical sequence data by using a random differential equation with an initial value as initialization feature mapping, wherein the mean characteristic is a solution of the random differential equation, and the variance characteristic is a diffusion coefficient corresponding to the solution of the random differential equation;
the prediction unit is used for predicting the mean value and the variance of future data according to the mean value characteristic and the variance characteristic;
the training module 203, the training module 203 is configured to train the neural network by using a training data set of a task to obtain a trained neural network;
an input module 204, configured to input historical time-series data of the task into a trained neural network;
a second obtaining module 205, configured to obtain a mean and a variance of future data through a neural network;
the third obtaining module 206 obtains the prediction interval of the future data according to the mean and the variance of the future data.
As shown in fig. 9, the present invention further provides an electronic device, which includes a memory 301, a processor 302, and a computer program stored in the memory 301 and executable on the processor 302, wherein the processor 302 implements the steps of the time series prediction method according to this embodiment when executing the computer program.
The memory 301 may include both read-only memory and random access memory, and provides instructions and data to the processor 302. A portion of the memory 301 may also include non-volatile random access memory (NVRAM).
The invention also provides a readable storage medium having stored thereon a computer program for performing the steps of the time series prediction method when being executed by the processor 302.
Specifically, the storage medium can be a general-purpose storage medium such as a removable disk or a hard disk, and when the computer program on the storage medium is executed, the above-described time series prediction method can be performed, thereby improving the reliability of the prediction interval.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division into modules is only a division by logical function, and other divisions are possible in actual implementation: multiple units or modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through communication interfaces, and may be electrical, mechanical or in other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portions thereof that substantially contribute to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The task quantity prediction method based on the time series is characterized by comprising the following steps of:
acquiring historical task quantity of a system task to be predicted in a plurality of historical time periods before the time period to be predicted; the task to be predicted has development continuity; the historical task quantity represents the state of the task to be predicted in a corresponding time period;
inputting the historical task quantities in the plurality of historical time periods into a pre-trained neural network prediction model, and processing the historical task quantities in the plurality of historical time periods through the neural network prediction model to obtain the predicted task quantities of the tasks to be predicted in the time periods to be predicted;
wherein the neural network prediction model comprises an initialization layer, a random differential equation layer and a prediction layer; the initialization layer is used for extracting initialization characteristics of real-time historical task quantities in the plurality of historical time periods; the random differential equation layer is used for taking the initialization characteristic as the initial state of the system, applying a disturbance to the current state of the system to establish the relationship between the current state of the system and the disturbance and acquiring the current state of the system and the disturbance at the termination moment; and the prediction layer is used for predicting the prediction task amount and the prediction error of the task to be predicted in the time period to be predicted according to the current state and the disturbance of the system at the termination time.
2. The time-series-based task quantity prediction method according to claim 1, wherein the system task to be predicted is any one of a passenger flow quantity prediction task, a power consumption quantity prediction task, or a weather prediction task.
3. The time-series-based task quantity prediction method according to claim 1, wherein the neural network prediction model includes an initialization layer, a random differential equation layer and a prediction layer, and a random differential equation is set in the random differential equation layer; the initialization layer is used for extracting initialization characteristic mapping of historical time sequence data, and the historical time sequence data are real-time historical task quantities in the plurality of historical time periods; the random differential equation layer is used for acquiring a mean characteristic and a variance characteristic of historical sequence data by using a random differential equation with an initial value as initialization feature mapping, the initial value is an initial state of a random differential equation characterization system with the initialization feature mapping, the mean characteristic is a solution of the random differential equation, the solution of the random differential equation is a current state of the system, the variance characteristic is a diffusion coefficient corresponding to the solution of the random differential equation, and the diffusion coefficient is a disturbance applied to the current state of the system; the prediction layer is used for predicting the mean value and the variance of future data according to the mean value characteristic and the variance characteristic, the mean value is the predicted task amount of the task to be predicted in the time period to be predicted, and the variance is the prediction error of the task to be predicted in the time period to be predicted.
4. The method for predicting task volume based on time series according to claim 1, wherein the prediction layer is further configured to: and obtaining the value range of the predicted task quantity of the system task to be predicted in the time period to be predicted according to the predicted task quantity and the prediction error.
5. The method for predicting task volume based on time series according to claim 1, wherein the prediction layer is further configured to: and obtaining uncertainty according to the predicted task quantity and the prediction error, wherein the uncertainty is used for representing the reliability of the predicted task quantity of the system task to be predicted in the time period to be predicted, which is obtained by predicting the system task quantity by a system task quantity prediction model.
6. The time-series-based task amount prediction method according to claim 2,
the random differential equation is as follows:
dZ_t = f(Z_t; θ_f)·dt + g(Z_t; θ_g)·dB_t

wherein f(·) represents the drift coefficient of the random differential equation; g(·) represents the diffusion coefficient of the random differential equation; Z_t represents the state or output of the t-th hidden layer of the neural network; θ_f represents the neural network parameters of the drift term; θ_g represents the neural network parameters of the diffusion term; and B_t represents the Brownian motion state at time t in the random differential equation.
7. The time-series-based task quantity prediction method according to claim 2, wherein the random differential equation is iteratively solved by using the Euler method, and the formula for the iterative solution by the Euler method is as follows:

Z_{S+1} = Z_S + f(Z_S; θ_f)·Δt + g(Z_S; θ_g)·√Δt·W_S

wherein S denotes the S-th iteration, Z_S represents the solution of the random differential equation at the S-th iteration, Z_{S+1} represents the solution of the random differential equation at the (S+1)-th iteration, W_S represents a random variable sampled from a standard normal distribution, and Δt is the iteration step size.
8. A time-series-based task amount prediction apparatus, comprising:
the first acquisition module is used for acquiring historical time sequence data of the task;
the neural network module comprises an initialization unit, a random differential equation unit and a prediction unit, wherein a random differential equation is arranged in the random differential equation unit; the initialization unit is used for extracting initialization characteristic mapping of historical time series data; the random differential equation unit is used for acquiring a mean characteristic and a variance characteristic of the historical sequence data by using a random differential equation with an initial value as initialization feature mapping, wherein the mean characteristic is a solution of the random differential equation, and the variance characteristic is a diffusion coefficient corresponding to the solution of the random differential equation;
the prediction unit is used for predicting the mean value and the variance of future data according to the mean value characteristic and the variance characteristic;
the training module is used for training the neural network by utilizing a training data set of a task to obtain a trained neural network;
the input module is used for inputting the historical time sequence data of the task into the trained neural network;
the second acquisition module is used for acquiring the mean value and the variance of future data through a neural network;
and the third acquisition module acquires the prediction interval of the future data according to the mean value and the variance of the future data.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A readable storage medium, having stored thereon a computer program which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 7.
CN202110892935.5A 2021-03-25 2021-03-25 Time-series-based task quantity prediction method and device, electronic equipment and medium Pending CN113435587A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110892935.5A CN113435587A (en) 2021-03-25 2021-03-25 Time-series-based task quantity prediction method and device, electronic equipment and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110316579.2A CN112699998B (en) 2021-03-25 2021-03-25 Time series prediction method and device, electronic equipment and readable storage medium
CN202110892935.5A CN113435587A (en) 2021-03-25 2021-03-25 Time-series-based task quantity prediction method and device, electronic equipment and medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202110316579.2A Division CN112699998B (en) 2021-03-25 2021-03-25 Time series prediction method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN113435587A true CN113435587A (en) 2021-09-24

Family

ID=75515656

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202110892935.5A Pending CN113435587A (en) 2021-03-25 2021-03-25 Time-series-based task quantity prediction method and device, electronic equipment and medium
CN202110316579.2A Active CN112699998B (en) 2021-03-25 2021-03-25 Time series prediction method and device, electronic equipment and readable storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110316579.2A Active CN112699998B (en) 2021-03-25 2021-03-25 Time series prediction method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (2) CN113435587A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114066089A (en) * 2021-11-25 2022-02-18 中国工商银行股份有限公司 Batch job operation time-consuming interval determining method and device
CN115527059A (en) * 2022-08-16 2022-12-27 贵州博睿科讯科技发展有限公司 Road-related construction element detection system and method based on AI (Artificial Intelligence) identification technology

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN113689456B (en) * 2021-08-18 2023-07-25 山东大学 Exosome particle size analysis device and exosome particle size analysis method based on deep learning
CN113965467B (en) * 2021-08-30 2023-10-10 国网山东省电力公司信息通信公司 Power communication system reliability assessment method and system based on neural network
CN113820615B (en) * 2021-09-30 2024-05-07 国网福建省电力有限公司龙岩供电公司 Battery health degree detection method and device

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN107590570A (en) * 2017-09-28 2018-01-16 清华大学 A kind of bearing power Forecasting Methodology and system
US20190188611A1 (en) * 2017-12-14 2019-06-20 Business Objects Software Limited Multi-step time series forecasting with residual learning
CN111950763B (en) * 2020-07-02 2023-12-05 江苏能来能源互联网研究院有限公司 Method for predicting output power of distributed wind power station
CN112232495B (en) * 2020-12-10 2022-03-04 北京瑞莱智慧科技有限公司 Prediction model training method, device, medium and computing equipment
CN112541839B (en) * 2020-12-23 2022-02-11 国能大渡河大数据服务有限公司 Reservoir storage flow prediction method based on neural differential equation

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN114066089A (en) * 2021-11-25 2022-02-18 中国工商银行股份有限公司 Batch job operation time-consuming interval determining method and device
CN115527059A (en) * 2022-08-16 2022-12-27 贵州博睿科讯科技发展有限公司 Road-related construction element detection system and method based on AI (Artificial Intelligence) identification technology
CN115527059B (en) * 2022-08-16 2024-04-09 贵州博睿科讯科技发展有限公司 System and method for detecting road construction elements based on AI (advanced technology) recognition technology

Also Published As

Publication number Publication date
CN112699998B (en) 2021-09-07
CN112699998A (en) 2021-04-23

Similar Documents

Publication Publication Date Title
CN112699998B (en) Time series prediction method and device, electronic equipment and readable storage medium
US11593611B2 (en) Neural network cooperation
CN110472675B (en) Image classification method, image classification device, storage medium and electronic equipment
CN109902801A (en) A kind of flood DATA PROCESSING IN ENSEMBLE PREDICTION SYSTEM method based on variation reasoning Bayesian neural network
Anand et al. Fractional-Iterative BiLSTM Classifier: A Novel Approach to Predicting Student Attrition in Digital Academia
CN111352965B (en) Training method of sequence mining model, and processing method and equipment of sequence data
CN110910004A (en) Reservoir dispatching rule extraction method and system with multiple uncertainties
Jank Implementing and diagnosing the stochastic approximation EM algorithm
CN113326852A (en) Model training method, device, equipment, storage medium and program product
CN112529638B (en) Service demand dynamic prediction method and system based on user classification and deep learning
CN113902129A (en) Multi-mode unified intelligent learning diagnosis modeling method, system, medium and terminal
CN113946218A (en) Activity recognition on a device
CN116306902A (en) Time sequence data environment analysis and decision method, device, equipment and storage medium
CN115759748A (en) Risk detection model generation method and device and risk individual identification method and device
CN116910573B (en) Training method and device for abnormality diagnosis model, electronic equipment and storage medium
CN112819256A (en) Convolution time sequence room price prediction method based on attention mechanism
CN112949433A (en) Method, device and equipment for generating video classification model and storage medium
CN114925938B (en) Electric energy meter running state prediction method and device based on self-adaptive SVM model
Xu et al. Residual autoencoder-LSTM for city region vehicle emission pollution prediction
CN111161238A (en) Image quality evaluation method and device, electronic device, and storage medium
Alam Recurrent neural networks in electricity load forecasting
CN116758432A (en) Natural geological disaster classification and identification method and system based on improved Resnet neural network
CN115391523A (en) Wind power plant multi-source heterogeneous data processing method and device
CN111402042B (en) Data analysis and display method for stock market big disk shape analysis
Zhang et al. Flexible and efficient spatial extremes emulation via variational autoencoders

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination