CN112988840A - Time series prediction method, device, equipment and storage medium - Google Patents

Publication number
CN112988840A
Authority
CN
China
Prior art keywords
sequence
seasonal
historical time
time point
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110298232.XA
Other languages
Chinese (zh)
Inventor
鄂潇原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd filed Critical Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202110298232.XA priority Critical patent/CN112988840A/en
Publication of CN112988840A publication Critical patent/CN112988840A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474Sequence data queries, e.g. querying versioned data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Abstract

The embodiment of the invention discloses a time sequence prediction method, a time sequence prediction device, time sequence prediction equipment and a storage medium, wherein the method comprises the following steps: acquiring a historical time sequence; for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identification of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs; fitting a seasonal periodic item sequence corresponding to the historical time sequence by using each historical service data and each regression variable value based on a preset kernel regression algorithm; and predicting the service data of the subsequent time point based on the seasonal periodic item sequence. According to the technical scheme of the embodiment of the invention, the seasonal periodic item sequence can be fitted through the preset kernel regression algorithm, so that the service data of the subsequent time point can be predicted, and the accuracy of predicting the seasonal periodic time sequence with the non-fixed length is improved.

Description

Time series prediction method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a time series prediction method, a time series prediction device, time series prediction equipment and a storage medium.
Background
The time sequence predicting method is a quantitative predicting technology for analyzing a time sequence formed by arranging certain data according to time sequence, establishing a predicting model, inputting historical data of a past period of time into the model and outputting a predicted value of the data in a future period of time. The technology has very wide application, and is common in the fields of commodity sales prediction, macroscopic economy prediction, environmental variable prediction and the like.
Time series prediction models commonly used in practice include the classical exponential smoothing models (Brown and Holt, 1950s), ARIMA models (Box and Jenkins, 1970s) and their derivatives such as SARIMAX and ARIMAX, machine learning models such as random forest (Breiman, 2001) and XGBoost (Tianqi Chen, 2016), and the Prophet model (Taylor and Letham, 2017) proposed in recent years by Facebook. Models that use deep learning techniques for time series prediction include LSTM (Hochreiter and Schmidhuber, 1995) and DeepAR (Salinas et al., 2017), among others. Each model has advantages and disadvantages under different scenarios and requirements.
In the process of implementing the invention, at least the following technical problems are found in the prior art:
Time series generally exhibit seasonal variation, so a time series analysis method capable of handling seasonality is needed; however, a prediction method that is both highly accurate and able to model seasonal periods of non-fixed length is currently lacking.
Disclosure of Invention
The embodiment of the invention provides a time series prediction method, a time series prediction device, time series prediction equipment and a storage medium, which predict time series whose seasonal periods have non-fixed length and improve prediction accuracy.
In a first aspect, an embodiment of the present invention provides a time series prediction method, where the method includes:
acquiring a historical time sequence; wherein the historical time series comprises historical traffic data for each historical time point within at least one seasonal period length;
for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identification of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs; the regression variable value is within a preset value range;
fitting a seasonal periodic item sequence corresponding to the historical time sequence by using each historical service data and each regression variable value based on a preset kernel regression algorithm; wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data which represent the variation rule of the historical service data in a seasonal period;
and predicting the service data of the subsequent time point based on the seasonal periodic item sequence.
In a second aspect, an embodiment of the present invention further provides a time series prediction apparatus, where the apparatus includes:
the historical time sequence acquisition module is used for acquiring a historical time sequence; wherein the historical time series comprises historical traffic data for each historical time point within at least one seasonal period length;
the regression variable value determining module is used for determining the regression variable value corresponding to the current historical time point according to the position identification of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs; the regression variable value is within a preset value range;
the seasonal periodic item sequence fitting module is used for fitting a seasonal periodic item sequence corresponding to the historical time sequence by using each historical service data and each regression variable value based on a preset kernel regression algorithm; wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data which represent the variation rule of the historical service data in a seasonal period;
and the service data prediction module is used for predicting the service data of the subsequent time point based on the seasonal periodic item sequence.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the time series prediction method according to any one of the embodiments of the present invention.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the time series prediction method according to any one of the embodiments of the present invention.
According to the technical scheme of the embodiment of the invention, the regression variable values corresponding to the historical time points are determined, so that the historical time sequence is suitable for the preset kernel regression algorithm, the seasonal periodic item sequence corresponding to the historical time sequence is fitted based on the preset kernel regression algorithm, and the service data of the subsequent time points are predicted based on the seasonal periodic item sequence, so that the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, a brief description is given below of the drawings used in describing the embodiments. It should be clear that the described figures are only views of some of the embodiments of the invention to be described, not all, and that for a person skilled in the art, other figures can be derived from these figures without inventive effort.
Fig. 1 is a schematic flowchart of a time series prediction method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a time series prediction method according to a second embodiment of the present invention;
fig. 3 is a schematic flowchart of a time series prediction method according to a third embodiment of the present invention;
fig. 4 is a schematic flowchart of a time series prediction method according to a fourth embodiment of the present invention;
FIG. 5 is a schematic diagram of seasonal periodic item data prediction at a subsequent point in time provided by a fourth embodiment of the present invention;
fig. 6 is a schematic structural diagram of a time series prediction apparatus according to a fifth embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a time series prediction method according to an embodiment of the present invention, where the embodiment is applicable to a case of predicting a time series of a non-fixed-length seasonal period, and the method may be executed by a time series prediction apparatus, and the time series prediction apparatus may be implemented in a form of software and/or hardware, where the hardware may be an electronic device, and optionally, the electronic device may be a PC terminal, and the like.
As shown in fig. 1, the method of this embodiment specifically includes the following steps:
S110, acquiring a historical time sequence.
Wherein the historical time series comprises historical traffic data for historical time points within at least one seasonal period length. Seasonal periods may be predictable repeating portions of a time series, for example, monthly data may cycle with seasons and/or years. The seasonal period length may be non-fixed, for example: the seasonal period is a month, and the seasonal period may be 30 days, 31 days, 28 days, or 29 days in length. Alternatively, the seasonal period may be month, week, quarter, year, etc. If the seasonal period is a month, the historical time point is a day; if the seasonal period is quarterly, the historical time point may be a day or a month, etc. The historical service data may be data values corresponding to historical time points in a historical time sequence, for example: the historical business data may be daily commodity sales data, monthly commodity sales data, or the like.
Specifically, the historical time series may be acquired from pre-stored data. The historical service data in the historical time series may also be preprocessed, for example by removing abnormal data points; the preprocessing is not specifically limited in this embodiment. The historical time series may also be split by seasonal period to obtain a historical time subsequence for each seasonal period.
Illustratively, if the seasonal period is a month, the historical service data of each month in the historical time series follow similar patterns. In this case, daily commodity sales data over three months may be used as the historical service data, and the historical service data that change over time within those three months form the historical time series. For example, the historical time series may be the historical service data from January 1, 2018 to March 31, 2018, and the historical time subsequences may be the historical service data from January 1, 2018 to January 31, 2018, from February 1, 2018 to February 28, 2018, and from March 1, 2018 to March 31, 2018.
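The splitting of a daily series into one subsequence per natural month can be sketched as follows. This is a minimal illustration only: the function name `split_by_month` and the sample values are hypothetical and do not come from the patent.

```python
from collections import OrderedDict
from datetime import date, timedelta


def split_by_month(points):
    """Group (date, value) pairs into per-month subsequences, keeping order."""
    groups = OrderedDict()
    for day, value in points:
        groups.setdefault((day.year, day.month), []).append((day, value))
    return list(groups.values())


# Hypothetical daily series from 2018-01-01 to 2018-03-31 (values are made up).
start = date(2018, 1, 1)
series = [(start + timedelta(days=i), float(i)) for i in range(90)]
subsequences = split_by_month(series)
lengths = [len(s) for s in subsequences]
print(lengths)
```

The three subsequences have different lengths, which is the non-fixed seasonal period length the method is meant to handle.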
In a commodity sales scenario, sales volume generally rises continuously and substantially over the last few days of each natural month and is high on the last day of the month, so the sales data of different natural months follow similar patterns. In this case, the sales data in this scenario can be considered seasonal, and the natural month can be regarded as the seasonal period.
S120, for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identifier of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs.
The regression variable value may be the value obtained by mapping the historical time point into a preset value range. The preset value range can be set according to actual requirements, such as [0, 1], [-1, 1], [0, +∞) and the like. The position identifier represents the position of the historical time point within the seasonal period, i.e., which time point in the seasonal period the current historical time point is. For example, if the seasonal period is a month and the historical time series consists of daily historical service data over three months, the day of the month can be used as the position identifier of the historical time point.
It should be noted that, the same manner is adopted for determining the regression variable value of each historical time point, and for clarity, the technical solution of the present embodiment is described by taking the determination of the regression variable value of one of the historical time points as an example.
Specifically, for the current historical time point, its position identifier within the seasonal period to which it belongs may be determined, that is, which time point in the corresponding seasonal period the current historical time point is. The number of historical time points within that seasonal period may also be determined. From the position identifier and the number of historical time points, the value of the position identifier mapped into the preset value range can be calculated and used as the regression variable value.
Illustratively, if the seasonal period is a month, the position identifier of the current historical time point in the seasonal period is 14, the number of the historical time points in the seasonal period to which the current historical time point belongs is 31, and the preset value range is [0, 1], then the regression variable value at the current historical time point can be determined as
Figure BDA0002985113260000071
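One way to map a date into the range [0, 1] is sketched below. The formula in the text above is only available as an image, so the ratio used here (position identifier divided by the number of time points in the period) is an assumption, and `regression_value` is a hypothetical name.

```python
import calendar
from datetime import date


def regression_value(day):
    """Assumed mapping: day of month divided by the number of days in that month."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.day / days_in_month


x = regression_value(date(2018, 1, 14))    # position 14 in a 31-day period
x_end = regression_value(date(2018, 2, 28))  # last day of a 28-day period
print(x, x_end)
```

Under this assumed mapping, the last day of every month maps to the same value regardless of the month's length, which is what lets periods of different lengths share one regression variable.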
S130, fitting a seasonal periodic item sequence corresponding to the historical time sequence by using the historical service data and the regression variable values based on a preset kernel regression algorithm.
The seasonal periodic item sequence comprises a plurality of seasonal periodic item data representing the variation pattern of the historical service data within the seasonal period. The preset kernel regression algorithm is a kernel regression algorithm selected in advance; a kernel regression algorithm generally estimates how the data vary by taking all selected samples into account. The preset kernel regression algorithm may be a beta kernel regression algorithm, a gamma kernel regression algorithm, a spline regression algorithm, or the like.
It should be noted that the preset value range of the regression variable value may depend on the preset kernel regression algorithm. For example, the preset value range corresponding to the beta kernel regression algorithm is [0, 1], that corresponding to the gamma kernel regression algorithm is [0, +∞), and that corresponding to the spline regression algorithm is (-∞, +∞).
Specifically, each historical service data and each regression variable value are processed according to a preset kernel regression algorithm, so that seasonal periodic item data corresponding to each historical service data can be obtained. Furthermore, a seasonal periodic item sequence is composed from each seasonal periodic item data.
S140, predicting the service data of the subsequent time point based on the seasonal periodic item sequence.
Specifically, the seasonal periodic item sequence may be input to a preset kernel regression algorithm to obtain a predicted value of the service data at a subsequent time point, that is, predicted service data.
It should be noted that, if the service data at a plurality of subsequent time points are to be predicted, the predicted values may be determined one by one. For example, the historical service data and the predicted service data corresponding to all time points before the time point to be predicted are fitted by the preset kernel regression algorithm to determine a new seasonal periodic item sequence, from which the predicted service data of the time point to be predicted are then determined.
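The one-by-one prediction of several subsequent time points can be sketched as a loop that feeds each prediction back into the history. The predictor passed in below is a toy stand-in (mean of the last three values), not the kernel regression of this embodiment; `forecast_recursively` and `mean_of_last_three` are hypothetical names.

```python
def forecast_recursively(history, steps, predict_next):
    """Predict `steps` future values one at a time, appending each to the history."""
    extended = list(history)
    predictions = []
    for _ in range(steps):
        next_value = predict_next(extended)
        predictions.append(next_value)
        extended.append(next_value)  # later predictions see earlier ones
    return predictions


def mean_of_last_three(values):
    """Toy predictor used only to make the sketch runnable."""
    tail = values[-3:]
    return sum(tail) / len(tail)


result = forecast_recursively([1.0, 2.0, 3.0], 2, mean_of_last_three)
print(result)
```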
According to the technical scheme of the embodiment of the invention, the regression variable values corresponding to the historical time points are determined, so that the historical time sequence is suitable for the preset kernel regression algorithm, the seasonal periodic item sequence corresponding to the historical time sequence is fitted based on the preset kernel regression algorithm, and the service data of the subsequent time points are predicted based on the seasonal periodic item sequence, so that the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
Example two
Fig. 2 is a schematic flowchart of a time series prediction method according to a second embodiment of the present invention. On the basis of the foregoing embodiment, this embodiment details how the seasonal periodic item sequence is obtained by fitting and how the service data at a subsequent time point are predicted. Terms that are the same as or correspond to those in the above embodiment are not explained again here.
As shown in fig. 2, the method of this embodiment specifically includes the following steps:
S210, acquiring a historical time sequence.
S220, for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identification of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs.
S230, fitting a seasonal periodic item sequence corresponding to the historical time sequence by using the historical service data and the regression variable values based on a preset kernel regression algorithm.
Specifically, the preset kernel regression algorithm can be used to fit the seasonal periodic item sequence corresponding to the historical time sequence from the historical service data and the regression variable values; that is, the seasonal periodic item sequence can be obtained in a single pass of data processing through the preset kernel regression algorithm.
In order to improve the prediction accuracy and the model prediction effect, an iterative algorithm and a preset kernel regression algorithm can be combined, and the specific steps are as follows:
Step one, obtaining a trend item sequence in the (i-1)-th iteration, and obtaining a first intermediate time sequence based on the historical time sequence and the trend item sequence in the (i-1)-th iteration.
Wherein the trend item sequence may be a data sequence describing the trend of the data remaining after the seasonal periodic item sequence is removed from the historical time sequence. Optionally, locally weighted regression (LOWESS) may be used to fit the trend item sequence, and the proportion of local data may be set according to actual requirements, for example 0.25. Alternatively, the optimal ARIMA model may be determined by searching over different ARIMA model orders and comparing model information criteria, and sequence prediction may then be performed using the optimal ARIMA model, such as: arima. The first intermediate time series may be the historical time series with the trend removed. i denotes the iteration number and has an initial value of 1, and the initial value of the trend item sequence is 0.
Specifically, the trend item sequence in the (i-1)-th iteration is obtained and subtracted from the historical time sequence to obtain the first intermediate time sequence of the i-th iteration.
Step two, fitting the seasonal periodic item sequence in the i-th iteration by using the current first intermediate time sequence and each regression variable value based on a preset kernel regression algorithm.
Specifically, the current first intermediate time sequence and the regression variable values are input into a preset kernel regression algorithm for processing, and a seasonal periodic term sequence in the ith iteration can be fitted.
Step three, obtaining a second intermediate time sequence based on the historical time sequence and the seasonal periodic item sequence in the i-th iteration, and fitting a trend item sequence in the (i+1)-th iteration by using the current second intermediate time sequence based on a locally weighted regression algorithm.
Wherein the second intermediate time series may be the historical time series with the seasonal periodic items removed.
Specifically, the seasonal periodic item sequence in the i-th iteration is subtracted from the historical time series to obtain the second intermediate time series of the i-th iteration. The LOWESS method can then be used to fit the trend item sequence in the (i+1)-th iteration from the current second intermediate time sequence.
Step four, determining whether the current value of i is smaller than L; if so, adding 1 to i and returning to the step of obtaining the trend item sequence in the (i-1)-th iteration; otherwise, determining the most recently fitted seasonal periodic item sequence and trend item sequence as the seasonal periodic item sequence and the trend item sequence corresponding to the historical time sequence, respectively.
Wherein L is a preset iteration number.
Specifically, whether the current iteration number reaches a preset iteration number is judged. And if the current iteration times do not meet the preset iteration times, namely i is less than L, adding 1 to the iteration times i, and returning to execute the step I, namely executing the iterative computation. And if the current iteration times meet the preset iteration times, namely i is larger than or equal to L, determining the fitted seasonal periodic item sequence and trend item sequence, namely the seasonal periodic item sequence in the ith iteration and the trend item sequence in the (i + 1) th iteration, further determining the fitted seasonal periodic item sequence as the seasonal periodic item sequence corresponding to the historical time sequence, and determining the fitted trend item sequence as the trend item sequence corresponding to the historical time sequence.
It should be noted that the iteration may also be stopped when the L2 norm of the difference between the seasonal periodic item sequence in the i-th iteration and that in the (i-1)-th iteration is smaller than a preset norm threshold, in which case the most recently fitted seasonal periodic item sequence and trend item sequence are determined as the seasonal periodic item sequence and the trend item sequence corresponding to the historical time sequence, respectively. The preset number of iterations may be set according to actual requirements, for example, L = 4. The preset norm threshold may be chosen by weighing prediction error against computing resource consumption, and is not specifically limited in this embodiment.
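Steps one to four, together with the L2-norm stopping rule, can be sketched as the loop below. Several parts are assumptions made only to keep the sketch self-contained: the trend is fitted with a centred moving average in place of LOWESS, the seasonal fit uses an unnormalised beta kernel with shape parameters x/b + 1 and (1 - x)/b + 1 (a common form, not necessarily the one in this patent), and all names and default values are hypothetical.

```python
import math


def kernel_smooth(xs, ys, b):
    """Beta-kernel weighted average of ys, evaluated at every x in xs."""
    fitted = []
    for x in xs:
        # Unnormalised Beta(x/b + 1, (1 - x)/b + 1) density; the constant cancels.
        raw = [u ** (x / b) * (1 - u) ** ((1 - x) / b) for u in xs]
        total = sum(raw)
        fitted.append(sum(r * y for r, y in zip(raw, ys)) / total)
    return fitted


def moving_average(values, window):
    """Centred moving average, used here as a stand-in for LOWESS."""
    half = window // 2
    out = []
    for t in range(len(values)):
        chunk = values[max(0, t - half): t + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def decompose(ys, xs, b=0.05, window=31, max_iter=4, tol=1e-6):
    """Alternate seasonal and trend fits for at most max_iter iterations."""
    trend = [0.0] * len(ys)      # initial trend item sequence is 0
    seasonal = [0.0] * len(ys)
    for _ in range(max_iter):
        first = [y - t for y, t in zip(ys, trend)]          # de-trended series
        new_seasonal = kernel_smooth(xs, first, b)
        second = [y - s for y, s in zip(ys, new_seasonal)]  # de-seasonalised series
        trend = moving_average(second, window)
        change = math.sqrt(sum((a - c) ** 2 for a, c in zip(new_seasonal, seasonal)))
        seasonal = new_seasonal
        if change < tol:         # L2-norm stopping rule
            break
    return seasonal, trend


# Made-up data: three periods of 10 points with a jump at the end of each period.
xs = [(k + 1) / 10 for _ in range(3) for k in range(10)]
ys = [5.0 + (3.0 if x > 0.8 else 0.0) for x in xs]
seasonal, trend = decompose(ys, xs, window=11)
```

Because the seasonal fit depends only on the regression variable value, time points at the same relative position in different periods receive the same seasonal value in this sketch.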
Optionally, if the preset kernel regression algorithm is a beta kernel regression algorithm, fitting the seasonal periodic item sequence in the i-th iteration from the current first intermediate time sequence and each regression variable value based on the preset kernel regression algorithm may be: inputting each data item in the current first intermediate time sequence as a response variable value and each regression variable value as an independent variable value into the beta kernel regression function to obtain the seasonal periodic item sequence in the i-th iteration.
In the fitting process, the beta kernel regression function in the beta kernel regression algorithm is as follows:
[Equations: Figures BDA0002985113260000111 to BDA0002985113260000115]
where [Figure BDA0002985113260000116] is the seasonal periodic item data corresponding to the $t$-th historical service data in the fitted seasonal periodic item sequence, $y_j$ is the $j$-th historical service data, $w_j(x_t)$ is the kernel regression weight, $x_t$ is the regression variable value corresponding to the $t$-th historical service data, $x_{t-j}$ is the regression variable value corresponding to the $j$-th historical service data before the $t$-th historical service data, $n$ is the number of historical service data contained in the historical time sequence, $b$ is the hyperparameter of the beta kernel regression, [Figure BDA0002985113260000117] is the probability density function of the beta distribution, $B$ is the beta function, and $I$ is the indicator function.
In addition, $a(x_t)$ is an intermediate expression defined only to simplify the notation and has no practical meaning. The beta kernel regression algorithm is commonly used for curve estimation in statistics, such as estimation of probability density functions; the curve it estimates has low estimation error, particularly near the two ends of the curve, but it has not been applied to time series prediction. Because the algorithm can estimate a relatively smooth curve from sample points in a coordinate system, a modified local linear regression algorithm based on the beta kernel function is used here to perform periodic modeling for time series prediction.
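A kernel-weighted regression with a beta kernel can be sketched as below. The patent's own formulas are available only as images, so this sketch assumes a plain weighted average with the beta density of shape parameters x/b + 1 and (1 - x)/b + 1 as the kernel; it does not reproduce the intermediate expression $a(x_t)$ or any boundary modification, and all function names are hypothetical.

```python
import math


def beta_pdf(u, a, c):
    """Density of the Beta(a, c) distribution at u in [0, 1]."""
    log_beta = math.lgamma(a) + math.lgamma(c) - math.lgamma(a + c)
    return u ** (a - 1) * (1 - u) ** (c - 1) / math.exp(log_beta)


def beta_kernel_weights(x, xs, b):
    """Normalised weights for the sample points xs when estimating at x."""
    raw = [beta_pdf(u, x / b + 1, (1 - x) / b + 1) for u in xs]
    total = sum(raw)
    return [r / total for r in raw]


def beta_kernel_regress(x, xs, ys, b):
    """Weighted average of the responses ys, with weights concentrated near x."""
    return sum(w * y for w, y in zip(beta_kernel_weights(x, xs, b), ys))


# Made-up sample: responses lie on the line y = 2x.
xs = [i / 10 for i in range(1, 11)]
ys = [2.0 * x for x in xs]
weights = beta_kernel_weights(0.5, xs, 0.05)
estimate = beta_kernel_regress(0.5, xs, ys, 0.05)
print(estimate)
```

The kernel's shape changes with the evaluation point x, so it never places weight outside [0, 1]; this is the usual reason given for preferring a beta kernel on a bounded regression variable.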
It should be further noted that the beta kernel regression algorithm involves two hyperparameters: the number n of historical service data contained in the historical time series, and the hyperparameter b of the beta kernel regression. n determines how much historical service data is used for the weighted regression, that is, how much historical service data is used to learn the shape characteristics of the historical time series. If n is too large, the beta kernel regression algorithm learns from historical service data that are too old and possibly no longer useful; if n is too small, the algorithm has difficulty learning how the seasonal periodic items change over time. Taking a monthly seasonal period as an example, n should be no less than 31, and its value range may be set to [90, 120]; the specific value of n may be determined according to actual requirements and is not specifically limited in this embodiment. The hyperparameter b of the beta kernel regression can be regarded as the kernel regression bandwidth: it determines how smooth the beta kernel estimate is, and the larger b is, the smoother the estimate. An overly smooth estimate loses local features, whereas an insufficiently smooth estimate may overfit and reduce the generalization ability of the algorithm.
The b value can be determined by cross validation. The cross-validation-based b value selection method comprises the following steps: acquiring a cross-validation period M (e.g., 30 days); in the process of fitting the seasonal periodic items, each time M pieces of historical service data have elapsed, re-determining the optimal b value by cross validation, and using that b value to fit the seasonal periodic item sequence of the subsequent M pieces of historical service data; in the prediction process, using the b value determined in the last cross-validation period M. To avoid the computational cost of cross validation, a preset value of b may be used instead, for example b = 0.01.
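The selection of b by cross validation can be illustrated with a minimal sketch. The embodiment does not specify the cross-validation scheme or reproduce the regression formula in text, so the following rests on assumptions: a plain kernel-weighted average whose weights are the Beta(x0/b + 1, (1 - x0)/b + 1) density, leave-one-out squared error as the score, and a caller-supplied candidate list. The names `beta_pdf`, `beta_kernel_estimate` and `select_bandwidth` are hypothetical.

```python
import math


def beta_pdf(u, p, q):
    """Density of the Beta(p, q) distribution at u in [0, 1]."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_norm) * u ** (p - 1) * (1 - u) ** (q - 1)


def beta_kernel_estimate(xs, ys, x0, b):
    """Kernel-weighted average of ys; weights are beta densities shaped by x0 and b.

    Assumption: a simple weighted average, not the modified local linear
    estimator referred to in the text. Intended for moderate b (about 0.005 or more).
    """
    weights = [beta_pdf(x, x0 / b + 1, (1 - x0) / b + 1) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; try a larger b")
    return sum(w * y for w, y in zip(weights, ys)) / total


def select_bandwidth(xs, ys, candidates):
    """Return the candidate b with the smallest leave-one-out squared error."""
    best_b, best_err = None, float("inf")
    for b in candidates:
        err = 0.0
        for i in range(len(xs)):
            # Hold out point i and estimate it from the remaining points.
            rest_x = xs[:i] + xs[i + 1:]
            rest_y = ys[:i] + ys[i + 1:]
            err += (ys[i] - beta_kernel_estimate(rest_x, rest_y, xs[i], b)) ** 2
        if err < best_err:
            best_b, best_err = b, err
    return best_b
```

In this sketch `xs` holds regression variable values in [0, 1] and `ys` the corresponding data; re-running `select_bandwidth` on the most recent window every M points would correspond to the periodic re-selection described above.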
And S240, based on a preset kernel regression algorithm, predicting seasonal periodic item data corresponding to the subsequent time points by using the seasonal periodic item sequence and the regression variable values.
Specifically, if the preset kernel regression algorithm is a beta kernel regression algorithm, the seasonal periodic item data in the seasonal periodic item sequence may be used as the response variable value, the regression variable value corresponding to each historical time point and the regression variable value corresponding to the subsequent time point may be used as the argument value, and the argument values are input to the beta kernel regression function to obtain the predicted seasonal periodic item data at the subsequent time point.
In the prediction process, the beta kernel regression function in the beta kernel regression algorithm comprises the following steps:
Figure BDA0002985113260000131
Figure BDA0002985113260000132
Figure BDA0002985113260000133
Figure BDA0002985113260000134
Figure BDA0002985113260000135
wherein
Figure BDA0002985113260000136
is the predicted seasonal periodic item data at the subsequent time point t+1,
Figure BDA0002985113260000137
is the (t+1-j)-th seasonal periodic item data in the fitted seasonal periodic item sequence, w_j(x_{t+1}) is the kernel regression weight, x_{t+1} is the regression variable value corresponding to the subsequent time point t+1, x_{t+1-j} is the regression variable value corresponding to the j-th historical service data before the subsequent time point t+1, n is the number of historical service data contained in the historical time sequence, b is the hyperparameter of the beta kernel regression,
Figure BDA0002985113260000138
is the probability density function of the beta distribution, B is the beta function, and I is the indicator function.
In addition, a(x_{t+1}) is an intermediate expression defined only to make the function easier to write; it has no practical significance of its own.
It should be further noted that the time point t+1 is the time point whose seasonal periodic item data is to be predicted, and the time point t is the time point immediately before t+1. That is: if the seasonal periodic item data of the first subsequent time point is predicted, t is the last of the historical time points; if the seasonal periodic item data of the second or a later subsequent time point is predicted, t is the already predicted time point immediately preceding the time point to be predicted.
Illustratively, the historical time points range from January 1, 2018 to March 31, 2018. If the seasonal periodic item data of April 1, 2018 is predicted, then t+1 is April 1, 2018 and t is March 31, 2018. If the seasonal periodic item data of April 2, 2018 is predicted, then t+1 is April 2, 2018 and t is April 1, 2018, which indicates that the predicted seasonal periodic item data of April 1, 2018 can be used for predicting the seasonal periodic item data of April 2, 2018.
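The way a predicted point feeds the next prediction can be sketched as follows. This is only an illustration of the rolling window described above: `forecast_recursive` and its parameter `predict_one` are hypothetical names, and `predict_one` stands for any single-point predictor (in the embodiment, the beta kernel regression function) that takes the window's regression variable values, the window's seasonal periodic item data, and the regression variable value of the point to be predicted.

```python
def forecast_recursive(seasonal, xs, future_xs, n, predict_one):
    """Predict subsequent points one at a time, feeding each prediction back.

    seasonal    : fitted seasonal periodic item data at the historical time points
    xs          : regression variable values of the historical time points
    future_xs   : regression variable values of the subsequent time points
    n           : number of most recent points used for each prediction
    predict_one : callable(window_x, window_y, x_next) -> predicted value
    """
    values, positions = list(seasonal), list(xs)
    predictions = []
    for x_next in future_xs:
        # The window holds the n latest points, historical or already predicted.
        s_next = predict_one(positions[-n:], values[-n:], x_next)
        predictions.append(s_next)
        positions.append(x_next)
        values.append(s_next)
    return predictions


# Stand-in predictor (an assumption for illustration): the mean of the window.
window_mean = lambda window_x, window_y, x_next: sum(window_y) / len(window_y)
print(forecast_recursive([1.0, 2.0, 3.0], [0.0, 0.5, 1.0], [0.0, 0.5], 2, window_mean))
```

With the stand-in predictor, the second prediction is computed from a window that already contains the first prediction, mirroring the April 1 and April 2 example.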
And S250, acquiring a trend item sequence corresponding to the fitted historical time sequence.
Wherein the trend item sequence comprises trend item data representing the overall change trend of the historical business data.
Specifically, the trend item data corresponding to the historical time sequence determined through iteration in S230 may be obtained as the trend item sequence, or the historical time sequence and the seasonal periodic sequence may be subjected to difference calculation without using an iterative algorithm, and the difference is processed by using local weighted regression, so as to obtain the trend item sequence.
When the fitting is performed by using a locally weighted regression (LOWESS) algorithm, the input of the algorithm is the difference between the historical time series and the seasonal periodic series, and the output is the fitted value of each data item in the trend item sequence.
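As a rough illustration of the locally weighted fit, the sketch below performs a local linear regression with tricube weights at every index of the input sequence. It is a simplified version under stated assumptions: the time index 0, 1, 2, ... is used as the regressor, the robustness iterations of the standard LOWESS procedure are omitted, and the function name `lowess_fit` and the parameter `frac` are hypothetical.

```python
def lowess_fit(ys, frac=0.3):
    """Fit a local linear trend at each index using tricube weights."""
    n = len(ys)
    k = max(2, min(n, int(round(frac * n))))  # number of neighbours per local fit
    fitted = []
    for i in range(n):
        # The k-th smallest distance from index i sets the local window width.
        dists = sorted(abs(i - j) for j in range(n))
        h = max(dists[k - 1], 1)
        sw = swx = swy = swxx = swxy = 0.0
        for j in range(n):
            d = abs(i - j) / h
            if d >= 1.0:
                continue  # outside the local window
            w = (1 - d ** 3) ** 3  # tricube weight
            sw += w
            swx += w * j
            swy += w * ys[j]
            swxx += w * j * j
            swxy += w * j * ys[j]
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Only one effective point: fall back to the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * i)
    return fitted
```

Here the input would be the historical time series minus the seasonal periodic item sequence, and the returned list would play the role of the trend item sequence.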
And S260, predicting trend item data corresponding to the subsequent time points by using the trend item sequence.
Specifically, the trend item sequence may be processed by an auto.
It should be noted that other algorithms may also be used to predict the trend item data of the subsequent time point from the trend item sequence, for example exponential smoothing, or machine learning algorithms such as random forest and XGBoost.
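As an example of one of the alternatives named above, the following is a minimal sketch of simple exponential smoothing applied to a trend item sequence. The function name, the smoothing factor `alpha` and the `horizon` parameter are assumptions made for illustration; simple exponential smoothing produces a flat forecast, so a trend-aware variant may be preferable for a sequence with a strong slope.

```python
def exponential_smoothing_forecast(series, alpha=0.3, horizon=1):
    """Forecast the next `horizon` values by simple exponential smoothing."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    # Simple exponential smoothing repeats the final level for every step ahead.
    return [level] * horizon
```

For instance, `exponential_smoothing_forecast(trend, alpha=0.3, horizon=30)` would give 30 identical trend values for the next 30 subsequent time points.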
And S270, predicting the service data of the subsequent time points according to the seasonal periodic item data corresponding to the subsequent time points and the trend item data corresponding to the subsequent time points.
Specifically, the seasonal period item data corresponding to the subsequent time point and the trend item data corresponding to the subsequent time point may be added, and the sum may be predicted as the service data of the subsequent time point.
To improve the prediction accuracy of the service data at the subsequent time point, the residual sequence that remains after the seasonal periodic item sequence and the trend item sequence are removed may also be fitted and predicted, and the service data may then be predicted by combining the seasonal periodic item data, the trend item data and the residual item data. This specifically comprises the following steps:
determining a residual error item sequence corresponding to the historical time sequence according to the historical time sequence, the seasonal periodic item sequence corresponding to the historical time sequence and the trend item sequence corresponding to the historical time sequence; predicting residual item data corresponding to a subsequent time point by using the residual item sequence; and obtaining the predicted service data of the subsequent time points according to the seasonal periodic item data corresponding to the subsequent time points, the trend item data corresponding to the subsequent time points and the residual error item data.
Specifically, the seasonal periodic item sequence and then the trend item sequence may be subtracted from the historical time sequence, and the resulting sequence is used as the residual item sequence corresponding to the historical time sequence. The residual error item sequence can be processed by an auto. Other algorithms may also be used to predict the residual item data of the subsequent time point from the residual item sequence, for example exponential smoothing, or machine learning algorithms such as random forest and XGBoost. After the residual item data corresponding to the subsequent time point is obtained, the residual item data, the seasonal periodic item data and the trend item data corresponding to the subsequent time point may be added, and the sum is used as the predicted service data of the subsequent time point.
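The residual computation and the additive combination described above amount to the following arithmetic. The function names are hypothetical, and the predicted component values are assumed to come from whichever prediction methods are chosen for each component.

```python
def residual_sequence(history, seasonal, trend):
    """Residual item sequence: history minus seasonal minus trend, point by point."""
    if not (len(history) == len(seasonal) == len(trend)):
        raise ValueError("all three sequences must have the same length")
    return [y - s - t for y, s, t in zip(history, seasonal, trend)]


def combine_forecast(seasonal_next, trend_next, residual_next=0.0):
    """Predicted service data as the sum of the predicted components."""
    return seasonal_next + trend_next + residual_next
```

Leaving `residual_next` at its default of 0.0 corresponds to the variant that adds only the seasonal periodic item data and the trend item data.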
According to the technical scheme of the embodiment of the invention, the seasonal periodic item sequence and the trend item sequence corresponding to each historical time point are determined, so that the seasonal periodic item data and the trend item data corresponding to the subsequent time points are predicted, and the service data of the subsequent time points are predicted according to the seasonal periodic item data corresponding to the subsequent time points and the trend item data corresponding to the subsequent time points, so that the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
Example three
Fig. 3 is a flowchart of a time series prediction method according to a third embodiment of the present invention. On the basis of the above embodiments, this embodiment further describes the technical solution for determining the regression variable value. The same or corresponding terms as those in the above embodiments are not explained in detail herein.
As shown in fig. 3, the method of this embodiment specifically includes the following steps:
S310, acquiring a historical time sequence.
S320, determining a first difference value between the position sequence number of the current historical time point in the belonged seasonal period and 1.
Specifically, 1 is subtracted from the position serial number of the current historical time point in the seasonal period to which it belongs, and the result is used as the first difference. It may be determined by the following formula: first difference = t - 1, where t is the position serial number of the current historical time point within the seasonal period.
S330, determining a second difference value between the number of the historical time points contained in the seasonal period i to which the current historical time point belongs and 1.
Specifically, 1 is subtracted from the number of historical time points included in the seasonal period i to which the current historical time point belongs, and the result is used as the second difference. It may be determined by the following formula: second difference = M_i - 1, where M_i is the number of historical time points contained in the seasonal period i to which the current historical time point belongs.
S340, determining a regression variable value corresponding to the current historical time point according to the ratio of the first difference value to the second difference value.
Specifically, the ratio of the first difference to the second difference may be calculated and taken as the regression variable value corresponding to the current historical time point. It may be determined by the following formula:
$$x_{it} = \frac{t - 1}{M_i - 1}$$
where x_{it} is the regression variable value corresponding to the current historical time point.
It should be noted that, when the regression variable value corresponding to the current historical time point is determined according to steps S320-S340, the preset value range corresponding to the regression variable value is [0, 1], that is, the historical time points are distributed over [0, 1]. For example, when the seasonal period is a month: for the historical time series of January 2018, the regression variable value of January 1, 2018 is determined to be 0 and that of January 31, 2018 to be 1; for the historical time series of February 2018, the regression variable value of February 1, 2018 is determined to be 0 and that of February 28, 2018 to be 1. If a business data graph is drawn from the historical business data and the regression variable values, the abscissa is mapped into [0, 1].
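Steps S320-S340 can be sketched for the monthly case as follows. The helper names are hypothetical; the handling of a period that contains a single time point (where the ratio is undefined) is an assumption added for robustness and is not stated in the text.

```python
import calendar
from datetime import date


def position_to_unit_interval(t, m):
    """Map position t (1-based) in a period of m time points to [0, 1]."""
    if not 1 <= t <= m:
        raise ValueError("t must lie between 1 and m")
    if m == 1:
        return 0.0  # assumption: a one-point period maps to 0
    return (t - 1) / (m - 1)


def regression_value(day):
    """Regression variable value of a calendar day within its natural month."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return position_to_unit_interval(day.day, days_in_month)


print(regression_value(date(2018, 1, 1)))   # first day of January 2018
print(regression_value(date(2018, 2, 28)))  # last day of February 2018
```

Because both the first and the last day of every month map to the same endpoints, months of 28, 29, 30 and 31 days share a common abscissa.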
And S350, fitting a seasonal periodic item sequence corresponding to the historical time sequence by using the historical service data and the regression variable values based on a preset kernel regression algorithm.
And S360, predicting the service data of the subsequent time points based on the seasonal periodic item sequence.
According to the technical scheme of the embodiment of the invention, the first difference value is determined according to the position identification of the current historical time point in the seasonal period to which the current historical time point belongs, the second difference value is determined according to the number of the historical time points contained in the seasonal period to which the current historical time point belongs, and then the regression variable value corresponding to the current historical time point is determined according to the first difference value and the second difference value, so that the historical time sequence is suitable for the preset kernel regression algorithm, the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
Example four
As an alternative implementation of the foregoing embodiments, fig. 4 is a schematic flow chart of a time series prediction method provided in the fourth embodiment of the present invention. The same or corresponding terms as those in the above embodiments are not explained in detail herein.
As shown in fig. 4, if the seasonal period is month and the preset kernel regression algorithm is a beta kernel regression algorithm, the method of this embodiment specifically includes the following steps:
a. Input a historical time series Y_t.
b. Perform data preprocessing on the historical time series Y_t, and take the preprocessed sequence as the new historical time series y_t.
Specifically, the data preprocessing may include removal of abnormal points and the like; the historical time sequence may also be split by natural month to obtain a plurality of subsequences, such as January 1 to January 31, 2018, February 1 to February 28, 2018, etc.
c. Initialize the seasonal periodic term as
Figure BDA0002985113260000188
the trend term as
Figure BDA0002985113260000181
and the iteration counter as i = 1.
d. Remove the trend, i.e.
Figure BDA0002985113260000182
Input the de-trended sequence into the beta kernel regression function, and fit to obtain the seasonal periodic term sequence of the i-th iteration
Figure BDA0002985113260000183
e. Remove the seasonal periodicity, i.e.
Figure BDA0002985113260000184
Input the de-seasonalized sequence into the locally weighted regression algorithm (LOWESS), and fit to obtain the trend term sequence of the (i+1)-th iteration
Figure BDA0002985113260000185
f. Judge whether the convergence condition is met; if so, execute step g; if not, add 1 to the iteration number i and return to step d.
g. Output the seasonal periodic item sequence
Figure BDA0002985113260000186
and the trend item sequence
Figure BDA0002985113260000187
h. Taking the seasonal periodic item sequence S_t as input, calculate the seasonal periodic item data corresponding to the subsequent time points by using the beta kernel regression algorithm.
i. Taking the trend item sequence T_t as input, calculate the trend item data corresponding to the subsequent time points by using an auto.
j. Determine the residual item sequence y_t - S_t - T_t and, taking the residual item sequence as input, calculate the residual item data corresponding to the subsequent time points by an auto.
k. Add the seasonal periodic item data, the trend item data and the residual item data corresponding to the subsequent time points, and take the sum as the final prediction result, namely the service data of the subsequent time points.
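The alternating fit in steps c to g can be sketched as a generic loop. The sketch makes several assumptions: the two smoothers are passed in as callables (`fit_seasonal` standing for the beta kernel regression fit, `fit_trend` for the locally weighted regression fit), both components start at zero, and convergence is judged by the largest change between iterations falling below a tolerance `tol`, which the text does not specify. All names are hypothetical.

```python
def decompose(y, fit_seasonal, fit_trend, max_iter=10, tol=1e-6):
    """Alternate seasonal and trend fits until they stop changing.

    Returns (seasonal, trend, residual), each a list as long as y.
    """
    n = len(y)
    seasonal = [0.0] * n  # step c: initial seasonal periodic term (assumed zero)
    trend = [0.0] * n     # step c: initial trend term
    for _ in range(max_iter):
        # Step d: remove the trend, then fit the seasonal periodic term.
        detrended = [yi - ti for yi, ti in zip(y, trend)]
        new_seasonal = fit_seasonal(detrended)
        # Step e: remove the seasonal term, then fit the trend term.
        deseasonalized = [yi - si for yi, si in zip(y, new_seasonal)]
        new_trend = fit_trend(deseasonalized)
        # Step f: convergence check on the largest change (assumed criterion).
        change = max(
            max(abs(a - b) for a, b in zip(new_seasonal, seasonal)),
            max(abs(a - b) for a, b in zip(new_trend, trend)),
        )
        seasonal, trend = new_seasonal, new_trend
        if change < tol:
            break
    # Step j: residual item sequence y_t - S_t - T_t.
    residual = [yi - si - ti for yi, si, ti in zip(y, seasonal, trend)]
    return seasonal, trend, residual
```

Passing the smoothers as arguments keeps the loop independent of the particular kernel regression or trend algorithm, so a fixed iteration count can be obtained simply by setting `tol` to 0.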
Taking the sales data of a certain commodity of a merchant as an example, fig. 5 shows a schematic diagram of seasonal periodic item data prediction at subsequent time points.
The plotted points in fig. 5 represent 90 days of historical business data from January, February and March of 2020, where the ordinate is the historical business data and the abscissa is the regression variable value corresponding to each historical time point, that is, each date is mapped into the [0, 1] interval. In fig. 5, the thicker line is the real business data of April 2020, and the thinner line is the seasonal periodic item data of April 2020.
As can be seen from fig. 5, the method maps each natural month into the [0, 1] interval, thereby solving the problem that the seasonal period length of a natural month is not fixed. In fig. 5, the sales of the commodity rise markedly in the last few days of each natural month; the beta kernel regression algorithm can learn this end-of-month feature of the sequence and take it as a seasonal periodic term.
It should be noted that the period of the seasonal periodic item is not necessarily a natural month, but may also be a week, a quarter, a year, etc.; the period length may be fixed or not fixed, as long as the sequence shows a stable seasonal periodic pattern as the seasons change.
According to the technical scheme of the embodiment of the invention, the input historical time sequence is subjected to data preprocessing, the seasonal periodic item sequence and the trend item sequence are determined through iterative calculation, the seasonal periodic item data, the trend item data and the residual error item data corresponding to the subsequent time points are further calculated, and the sum of the seasonal periodic item data, the trend item data and the residual error item data is used as the service data of the subsequent time points, so that the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
The following is an embodiment of a time-series prediction apparatus provided in an embodiment of the present invention, which belongs to the same inventive concept as the time-series prediction methods of the above embodiments, and details that are not described in detail in the embodiment of the time-series prediction apparatus may refer to the embodiment of the time-series prediction method.
Example five
Fig. 6 is a schematic structural diagram of a time series prediction apparatus according to a fifth embodiment of the present invention, including: a historical time series acquisition module 510, a regression variable value determination module 520, a seasonal periodic term sequence fitting module 530, and a traffic data prediction module 540.
The historical time sequence obtaining module 510 is configured to obtain a historical time sequence; wherein the historical time series comprises historical traffic data for each historical time point within at least one seasonal period length; a regression variable value determining module 520, configured to determine, for each historical time point, a regression variable value corresponding to the current historical time point according to the location identifier of the current historical time point in the seasonal period to which the current historical time point belongs and the number of historical time points included in the seasonal period to which the current historical time point belongs; the regression variable value is within a preset value range; a seasonal periodic item sequence fitting module 530, configured to fit a seasonal periodic item sequence corresponding to the historical time sequence by using each historical service data and each regression variable value based on a preset kernel regression algorithm; wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data which represent the variation rule of the historical service data in a seasonal period; and a service data prediction module 540, configured to predict service data at a subsequent time point based on the seasonal periodic item sequence.
Optionally, the service data predicting module 540 is specifically configured to predict seasonal periodic item data corresponding to a subsequent time point by using the seasonal periodic item sequence and each regression variable value based on a preset kernel regression algorithm; acquiring a trend item sequence corresponding to the fitted historical time sequence; wherein the trend item sequence comprises trend item data representing the overall change trend of the historical business data; predicting trend item data corresponding to the subsequent time points by using the trend item sequence; and predicting the service data of the subsequent time points according to the seasonal periodic item data corresponding to the subsequent time points and the trend item data corresponding to the subsequent time points.
Optionally, the service data prediction module 540 is further configured to: obtain the trend item sequence of the (i-1)-th iteration, and obtain a first intermediate time sequence based on the historical time sequence and the trend item sequence of the (i-1)-th iteration, wherein the initial value of i is 1 and the initial value of the trend item sequence is 0; fit the seasonal periodic item sequence of the i-th iteration by using the current first intermediate time sequence and each regression variable value based on the preset kernel regression algorithm; obtain a second intermediate time sequence based on the historical time sequence and the seasonal periodic item sequence of the i-th iteration, and fit the trend item sequence of the (i+1)-th iteration by using the current second intermediate time sequence based on a locally weighted regression algorithm; determine whether the current value of i is smaller than L, and if so, add 1 to i and return to the step of obtaining the trend item sequence of the (i-1)-th iteration; otherwise, determine the most recently fitted seasonal periodic item sequence and trend item sequence as the seasonal periodic item sequence and the trend item sequence corresponding to the historical time sequence, respectively; wherein L is a preset number of iterations.
Optionally, the preset kernel regression algorithm is a beta kernel regression algorithm; the service data prediction module 540 is further configured to input each data in the current first intermediate time sequence as a response variable value and each regression variable value as an argument value into the beta kernel regression function to obtain a seasonal periodic term sequence in the ith iteration.
Optionally, the preset kernel regression algorithm is a beta kernel regression algorithm; the service data prediction module 540 is further configured to input the beta kernel regression function with the seasonal periodic item data in the seasonal periodic item sequence as response variable values, the regression variable values corresponding to the historical time points, and the regression variable values corresponding to the subsequent time points as argument values, and obtain the predicted seasonal periodic item data at the subsequent time points.
Optionally, the service data prediction module 540 is further configured to determine a residual error item sequence corresponding to the historical time sequence according to the historical time sequence, the seasonal periodic item sequence corresponding to the historical time sequence, and the trend item sequence corresponding to the historical time sequence; predicting residual item data corresponding to the subsequent time points by using the residual item sequence; and obtaining the predicted service data of the subsequent time point according to the seasonal periodic item data corresponding to the subsequent time point, the trend item data corresponding to the subsequent time point and the residual error item data.
Optionally, the regression variable value determining module 520 is specifically configured to determine a first difference between a position serial number of the current historical time point in the belonged seasonal period and 1; determining a second difference value between the number of the historical time points contained in the seasonal period i to which the current historical time point belongs and 1; and determining the regression variable value corresponding to the current historical time point according to the ratio of the first difference value to the second difference value.
Optionally, the preset kernel regression algorithm is a beta kernel regression algorithm, and the preset numerical range is [0, 1].
Optionally, the seasonal period is a month, and the historical time point is a day.
According to the technical scheme of the embodiment of the invention, the regression variable values corresponding to the historical time points are determined, so that the historical time sequence is suitable for the preset kernel regression algorithm, the seasonal periodic item sequence corresponding to the historical time sequence is fitted based on the preset kernel regression algorithm, and the service data of the subsequent time points are predicted based on the seasonal periodic item sequence, so that the problem that the time sequence of the seasonal period with the non-fixed length cannot be accurately predicted in the prior art is solved, and the technical effect of accurately predicting the time sequence of the seasonal period with the non-fixed length is realized.
The time sequence prediction device provided by the embodiment of the invention can execute the time sequence prediction method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, the units and modules included in the time series prediction apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
Example six
Fig. 7 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention. FIG. 7 illustrates a block diagram of an exemplary electronic device 60 suitable for use in implementing embodiments of the present invention. The electronic device 60 shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the electronic device 60 is in the form of a general purpose computing device. The components of the electronic device 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 that couples various system components including the system memory 602 and the processing unit 601.
Bus 603 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 60 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 60 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 602 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)604 and/or cache memory 605. The electronic device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 606 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, commonly referred to as a "hard drive"). Although not shown in FIG. 7, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 603 by one or more data media interfaces. System memory 602 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 608 having a set (at least one) of program modules 607 may be stored, for example, in system memory 602. Such program modules 607 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 607 generally perform the functions and/or methods of the described embodiments of the invention.
Electronic device 60 may also communicate with one or more external devices 609 (e.g., keyboard, pointing device, display 610, etc.), with one or more devices that enable a user to interact with electronic device 60, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 60 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 611. Also, the electronic device 60 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 612. As shown, the network adapter 612 communicates with the other modules of the electronic device 60 via the bus 603. It should be appreciated that although not shown in FIG. 7, other hardware and/or software modules may be used in conjunction with electronic device 60, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 601 executes various functional applications and data processing by executing programs stored in the system memory 602, for example, implementing the time series prediction method provided by the embodiment of the present invention.
Example seven
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for time series prediction, the method including:
acquiring a historical time sequence, wherein the historical time sequence comprises historical service data for each historical time point within at least one seasonal period length;
for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identifier of the current historical time point in the seasonal period to which it belongs and the number of historical time points contained in that seasonal period, wherein the regression variable value is within a preset numerical range;
fitting, based on a preset kernel regression algorithm, a seasonal periodic item sequence corresponding to the historical time sequence by using each item of historical service data and each regression variable value, wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data representing the variation pattern of the historical service data within a seasonal period;
and predicting the service data of a subsequent time point based on the seasonal periodic item sequence.
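The four steps above can be illustrated with a minimal Python sketch. It is not the implementation of this embodiment. It assumes that the regression variable value is the ratio (position - 1) / (count - 1), that the kernel regression is a weighted average using beta-density weights with shape parameters x/b + 1 and (1 - x)/b + 1, and that the bandwidth b is 0.1. The function names, the sample data and the handling of a one-point period are all hypothetical.

```python
import math


def regression_value(position, n_points):
    """Map a 1-based position within its seasonal period to [0, 1]."""
    if n_points <= 1:
        return 0.0  # assumption: a single-point period maps to 0
    return (position - 1) / (n_points - 1)


def beta_pdf(u, p, q):
    """Density of the Beta(p, q) distribution at u in [0, 1], for p, q >= 1."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_norm) * u ** (p - 1) * (1 - u) ** (q - 1)


def beta_kernel_regression(x_train, y_train, x_query, bandwidth=0.1):
    """Kernel-weighted average of y_train at each query point in [0, 1]."""
    fitted = []
    for x in x_query:
        # Assumed kernel shape: Beta(x/b + 1, (1 - x)/b + 1).
        p = x / bandwidth + 1.0
        q = (1.0 - x) / bandwidth + 1.0
        weights = [beta_pdf(u, p, q) for u in x_train]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("no training point carries weight at x=%r" % x)
        fitted.append(sum(w * y for w, y in zip(weights, y_train)) / total)
    return fitted


# Hypothetical history: two seasonal periods with 4 and 5 time points.
period_lengths = [4, 5]
history = [10.0, 14.0, 13.0, 9.0, 11.0, 15.0, 16.0, 12.0, 8.0]

# Step 2: one regression variable value per historical time point.
x_hist = [regression_value(pos, n)
          for n in period_lengths
          for pos in range(1, n + 1)]

# Step 3: seasonal periodic item sequence, one fitted value per point.
seasonal = beta_kernel_regression(x_hist, history, x_hist)

# Step 4: subsequent time point, assumed to open a new 4-point period.
x_next = regression_value(1, 4)
forecast = beta_kernel_regression(x_hist, seasonal, [x_next])[0]
print(forecast)
```

Because periods of different lengths are mapped onto the same interval, points from a 4-point period and a 5-point period can be pooled in one regression, which is presumably why the regression variable value is normalized by the period length.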
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (12)

1. A method for time series prediction, comprising:
acquiring a historical time sequence; wherein the historical time sequence comprises historical service data for each historical time point within at least one seasonal period length;
for each historical time point, determining a regression variable value corresponding to the current historical time point according to the position identifier of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs; wherein the regression variable value is within a preset numerical range;
fitting a seasonal periodic item sequence corresponding to the historical time sequence by using each item of historical service data and each regression variable value based on a preset kernel regression algorithm; wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data which represent the variation pattern of the historical service data within a seasonal period;
and predicting the service data of a subsequent time point based on the seasonal periodic item sequence.
2. The method of claim 1, wherein predicting the service data of the subsequent time point based on the seasonal periodic item sequence comprises:
based on a preset kernel regression algorithm, predicting seasonal periodic item data corresponding to a subsequent time point by using the seasonal periodic item sequence and each regression variable value;
acquiring a fitted trend item sequence corresponding to the historical time sequence; wherein the trend item sequence comprises trend item data representing the overall change trend of the historical service data;
predicting trend item data corresponding to the subsequent time points by using the trend item sequence;
and predicting the service data of the subsequent time points according to the seasonal periodic item data corresponding to the subsequent time points and the trend item data corresponding to the subsequent time points.
3. The method of claim 2, wherein fitting the seasonal periodic item sequence and the trend item sequence corresponding to the historical time sequence comprises:
acquiring the trend item sequence in the (i-1)th iteration, and obtaining a first intermediate time sequence based on the historical time sequence and the trend item sequence in the (i-1)th iteration; wherein the initial value of i is 1, and the initial value of the trend item sequence is 0;
fitting the seasonal periodic item sequence in the ith iteration by using the current first intermediate time sequence and each regression variable value based on the preset kernel regression algorithm;
obtaining a second intermediate time sequence based on the historical time sequence and the seasonal periodic item sequence in the ith iteration, and fitting the trend item sequence in the (i+1)th iteration by using the current second intermediate time sequence based on a locally weighted regression algorithm;
determining whether the current value of i is smaller than L; if so, adding 1 to i and returning to the step of acquiring the trend item sequence in the (i-1)th iteration; otherwise, determining the most recently fitted seasonal periodic item sequence and trend item sequence as the seasonal periodic item sequence and the trend item sequence corresponding to the historical time sequence, respectively; wherein L is a preset number of iterations.
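The alternating fit described in claim 3 can be sketched as a short loop. This is only an illustration under stated assumptions. Each intermediate sequence is assumed to be formed by subtraction (an additive decomposition). The two smoothers are simple stand-ins: `mean_by_position` takes the place of the preset kernel regression and `moving_average` takes the place of the locally weighted regression. All function names and the sample data are hypothetical.

```python
def mean_by_position(values, x_values):
    """Stand-in seasonal smoother: average the values sharing a regression value."""
    groups = {}
    for v, x in zip(values, x_values):
        groups.setdefault(round(x, 9), []).append(v)
    means = {k: sum(g) / len(g) for k, g in groups.items()}
    return [means[round(x, 9)] for x in x_values]


def moving_average(values, window=5):
    """Stand-in trend smoother: centered moving average, truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def decompose(history, x_values, fit_seasonal, fit_trend, n_iter=3):
    """Alternate seasonal and trend fits for n_iter rounds (L in claim 3)."""
    trend = [0.0] * len(history)      # initial trend item sequence is 0
    seasonal = [0.0] * len(history)
    for _ in range(n_iter):
        # First intermediate sequence: history with the current trend removed.
        first = [y - t for y, t in zip(history, trend)]
        seasonal = fit_seasonal(first, x_values)
        # Second intermediate sequence: history with the seasonal part removed.
        second = [y - s for y, s in zip(history, seasonal)]
        trend = fit_trend(second)
    return seasonal, trend


# Hypothetical data: three periods of four points, rising level plus a pattern.
period = 4
x_values = [pos / (period - 1) for pos in range(period)] * 3
history = [10 + 0.5 * t + [2, -1, 0, -1][t % period] for t in range(12)]

seasonal, trend = decompose(history, x_values, mean_by_position, moving_average)
print(seasonal[:period])
print(trend)
```

Passing the smoothers in as arguments keeps the loop independent of the particular regression used, so either stand-in can be swapped for a beta kernel regression or a locally weighted regression without changing `decompose`.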
4. The method of claim 3, wherein the preset kernel regression algorithm is a beta kernel regression algorithm; and fitting the seasonal periodic item sequence in the ith iteration by using the current first intermediate time sequence and each regression variable value based on the preset kernel regression algorithm comprises:
inputting each item of data in the current first intermediate time sequence as a response variable value, and each regression variable value as an independent variable value, into a beta kernel regression function to obtain the seasonal periodic item sequence in the ith iteration.
5. The method of claim 2, wherein the pre-set kernel regression algorithm is a beta kernel regression algorithm; the predicting seasonal periodic item data corresponding to a subsequent time point by using the seasonal periodic item sequence and the regression variable values based on the preset kernel regression algorithm comprises the following steps:
and inputting the seasonal periodic item data in the seasonal periodic item sequence as response variable values, the regression variable values corresponding to the historical time points and the regression variable values corresponding to the subsequent time points as independent variable values into a beta kernel regression function to obtain the predicted seasonal periodic item data of the subsequent time points.
6. The method according to claim 2, wherein the predicting the service data of the subsequent time point according to the seasonal periodic item data corresponding to the subsequent time point and the trend item data corresponding to the subsequent time point comprises:
determining a residual item sequence corresponding to the historical time sequence according to the historical time sequence, the seasonal periodic item sequence corresponding to the historical time sequence and the trend item sequence corresponding to the historical time sequence;
predicting residual item data corresponding to the subsequent time point by using the residual item sequence;
and obtaining the predicted service data of the subsequent time point according to the seasonal periodic item data corresponding to the subsequent time point, the trend item data corresponding to the subsequent time point and the residual item data corresponding to the subsequent time point.
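As an illustration of claim 6, the sketch below derives a residual item sequence and combines the three predicted components. It assumes an additive model, so the residual is the history minus the seasonal and trend items. It also assumes the residual is predicted as the mean of the most recent residuals, since the claim does not name a residual predictor. The function names and numbers are hypothetical.

```python
def residual_sequence(history, seasonal, trend):
    """Residual item sequence under an assumed additive decomposition."""
    return [y - s - t for y, s, t in zip(history, seasonal, trend)]


def predict_residual(residuals, window=3):
    """Assumed residual predictor: mean of the last `window` residuals."""
    recent = residuals[-window:]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)


def predict_service_data(seasonal_next, trend_next, residual_next):
    """Combine the three predicted components into one forecast."""
    return seasonal_next + trend_next + residual_next


# Hypothetical fitted components for four historical time points.
history = [12.0, 9.0, 14.0, 11.0]
seasonal = [1.0, -1.0, 1.0, -1.0]
trend = [10.0, 10.5, 12.0, 12.5]

residuals = residual_sequence(history, seasonal, trend)
residual_next = predict_residual(residuals)

# Hypothetical predictions of the other two components for the next point.
forecast = predict_service_data(1.0, 13.0, residual_next)
print(residuals, forecast)
```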
7. The method of claim 1, wherein determining the regression variable value corresponding to the current historical time point according to the position identifier of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points included in the seasonal period to which the current historical time point belongs comprises:
determining a first difference value between the position sequence number of the current historical time point in the seasonal period to which the current historical time point belongs and 1;
determining a second difference value between the number of the historical time points contained in the seasonal period to which the current historical time point belongs and 1;
and determining the regression variable value corresponding to the current historical time point according to the ratio of the first difference value to the second difference value.
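Written as a formula, the three steps of claim 7 reduce to a single ratio. The symbols are introduced here for illustration only and do not appear in the source: $p_t$ is the position sequence number of historical time point $t$ within its seasonal period, counted from 1, and $n_t$ is the number of historical time points in that period. The sketch assumes the regression variable value equals the ratio itself, with no further transformation.

$$x_t = \frac{p_t - 1}{n_t - 1}, \qquad 1 \le p_t \le n_t, \quad n_t \ge 2.$$

The first point of a period maps to 0 and the last to 1, so every value lies in $[0, 1]$. For example, the 15th day of a 31-day month gives $14/30 \approx 0.467$, while the 15th day of a 28-day month gives $14/27 \approx 0.519$. The ratio is undefined for a period containing a single point.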
8. The method of claim 7, wherein the preset kernel regression algorithm is a beta kernel regression algorithm and the preset numerical range is [0, 1].
9. The method of any of claims 1-8, wherein the seasonal period is a month and the historical time point is a day.
10. A time-series prediction apparatus, comprising:
the historical time sequence acquisition module is used for acquiring a historical time sequence; wherein the historical time sequence comprises historical service data for each historical time point within at least one seasonal period length;
the regression variable value determining module is used for determining, for each historical time point, the regression variable value corresponding to the current historical time point according to the position identifier of the current historical time point in the seasonal period to which the current historical time point belongs and the number of the historical time points contained in the seasonal period to which the current historical time point belongs; wherein the regression variable value is within a preset numerical range;
the seasonal periodic item sequence fitting module is used for fitting a seasonal periodic item sequence corresponding to the historical time sequence by using each historical service data and each regression variable value based on a preset kernel regression algorithm; wherein the seasonal periodic item sequence comprises a plurality of seasonal periodic item data which represent the variation rule of the historical service data in a seasonal period;
and the service data prediction module is used for predicting the service data of the subsequent time point based on the seasonal periodic item sequence.
11. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the time series prediction method according to any one of claims 1-9.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method for time series prediction according to any one of claims 1 to 9.
CN202110298232.XA 2021-03-19 2021-03-19 Time series prediction method, device, equipment and storage medium Pending CN112988840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110298232.XA CN112988840A (en) 2021-03-19 2021-03-19 Time series prediction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110298232.XA CN112988840A (en) 2021-03-19 2021-03-19 Time series prediction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112988840A true CN112988840A (en) 2021-06-18

Family

ID=76334232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110298232.XA Pending CN112988840A (en) 2021-03-19 2021-03-19 Time series prediction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112988840A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469461A (en) * 2021-07-26 2021-10-01 北京沃东天骏信息技术有限公司 Method and device for generating information
WO2023005635A1 (en) * 2021-07-26 2023-02-02 北京沃东天骏信息技术有限公司 Method and apparatus for generating information
CN114779731A (en) * 2022-06-22 2022-07-22 江苏翔晟信息技术股份有限公司 Intelligent manufacturing-oriented production data dynamic monitoring and analyzing system and method
CN114779731B (en) * 2022-06-22 2022-09-23 江苏翔晟信息技术股份有限公司 Intelligent manufacturing-oriented production data dynamic monitoring and analyzing system and method
CN116562471A (en) * 2023-07-10 2023-08-08 安徽中科海奥电气股份有限公司 STL-SARIMA-GRU power prediction method based on STL data decomposition
CN116562471B (en) * 2023-07-10 2023-10-24 安徽中科海奥电气股份有限公司 STL-SARIMA-GRU power prediction method based on STL data decomposition

Similar Documents

Publication Publication Date Title
US11586880B2 (en) System and method for multi-horizon time series forecasting with dynamic temporal context learning
TWI788529B (en) Credit risk prediction method and device based on LSTM model
CN112988840A (en) Time series prediction method, device, equipment and storage medium
CN107220217A (en) Characteristic coefficient training method and device based on logistic regression
US20210303970A1 (en) Processing data using multiple neural networks
Stone Calibrating rough volatility models: a convolutional neural network approach
CN113177700B (en) Risk assessment method, system, electronic equipment and storage medium
CN112163963A (en) Service recommendation method and device, computer equipment and storage medium
CN114782201A (en) Stock recommendation method and device, computer equipment and storage medium
CN113902260A (en) Information prediction method, information prediction device, electronic equipment and medium
CN116542673B (en) Fraud identification method and system applied to machine learning
US11682069B2 (en) Extending finite rank deep kernel learning to forecasting over long time horizons
CN112348590A (en) Method and device for determining value of article, electronic equipment and storage medium
CN110851600A (en) Text data processing method and device based on deep learning
Hwang et al. A two-stage probit model for predicting recovery rates
CN116228284A (en) Goods demand prediction method, training device, computer system and medium
CN114118526A (en) Enterprise risk prediction method, device, equipment and storage medium
CN114897607A (en) Data processing method and device for product resources, electronic equipment and storage medium
CN113256191A (en) Classification tree-based risk prediction method, device, equipment and medium
CN114266414A (en) Loan amount prediction method, loan amount prediction device, loan amount prediction electronic device, and loan amount prediction medium
CN111126629A (en) Model generation method, system, device and medium for identifying brushing behavior
Dwarakanath et al. Optimal Stopping with Gaussian Processes
CN116340864B (en) Model drift detection method, device, equipment and storage medium thereof
CN113033658A (en) Merchant classification method and device, electronic equipment and medium
CN117979089A (en) Live video processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination