WO2024104276A1

WO2024104276A1 - Time series prediction optimization method, device, and storage medium

Info

Publication number: WO2024104276A1
Application number: PCT/CN2023/131132
Authority: WO
Inventors: 谢娟
Original assignee: 杭州阿里云飞天信息技术有限公司
Priority date: 2022-11-17
Filing date: 2023-11-10
Publication date: 2024-05-23
Also published as: CN118052310A

Abstract

Embodiments of the present application provide a time series prediction optimization method, a device, and a storage medium. A feature conversion module and a prediction module are constructed in a time series prediction model, and a time-invariant feature can be introduced into a time series prediction process by means of the feature conversion module to serve as an important influencing factor. In addition, it is also proposed that a volatility evaluation index for the time-invariant feature is used as a first loss function corresponding to the feature conversion module, an evaluation index of a prediction sequence based on dynamic time warping (DTW) is used as a second loss function corresponding to the prediction module, and tuning is performed on the time series prediction model at least based on the two loss functions. In this way, the extraction of the time-invariant feature is more accurate, and the DTW can be introduced into the time series prediction process by means of the second loss function to serve as another important influencing factor, thereby improving the accuracy of sudden change prediction. Therefore, the performance of the time series prediction model can be effectively optimized, and the prediction accuracy is improved.

Description

A time series prediction optimization method, device and storage medium

This application claims priority to the Chinese patent application filed with the China Patent Office on November 17, 2022, with application number 202211441398.3 and application name “A Timing Prediction Optimization Method, Device and Storage Medium”, all contents of which are incorporated by reference in this application.

Technical Field

The present application relates to the field of data processing technology, and in particular to a time series prediction optimization method, device and storage medium.

Background Art

Time series forecasting is widely used in sectors such as manufacturing, agriculture, water services, and finance. Existing time series forecasting solutions increasingly rely on machine learning models, in particular deep learning neural networks.

At present, these models usually use the mean absolute error (MAE) constructed based on the predicted results and the actual measurement results as the loss function. In practice, it is found that the performance of the time series prediction model constructed according to this loss function is poor and the accuracy of the prediction results is insufficient.

Summary of the invention

Multiple aspects of the present application provide a time series prediction optimization method, device and storage medium to optimize the prediction performance of a time series prediction model.

The present application provides a time series prediction optimization method, including:

Inputting the sample feature sequence into a feature conversion module constructed in the time series prediction model, wherein the feature conversion module is used to convert the received sequence into a converted sequence carrying time-varying features and time-invariant features;

Inputting the converted sequence into a prediction module constructed in the time series prediction model, wherein the prediction module is used to perform time series prediction based on the time-varying features and the time-invariant features carried by the converted sequence to generate a prediction sequence;

Calculating an evaluation index for volatility for the time-invariant feature corresponding to the sample feature sequence to determine a first loss function value corresponding to the feature conversion module;

Calculating an evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine a second loss function value corresponding to the prediction module;

Jointly tuning the feature conversion module and the prediction module based on the first loss function value and the second loss function value, so as to update relevant model parameters in the time series prediction model.

The present application also provides a time series prediction optimization method, including:

Receiving a time series prediction request;

Acquiring a historical feature sequence according to the time series prediction request;

Inputting the historical feature sequence into a feature conversion module in a time series prediction model to generate a converted sequence for the historical feature sequence using the feature conversion module, wherein the feature conversion module is used to extract time-varying features and time-invariant features from the received sequence to generate a converted sequence;

Inputting the converted sequence corresponding to the historical feature sequence into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence;

The prediction module uses an evaluation index based on dynamic time warping (DTW) as a loss function.

The embodiment of the present application also provides a computing device, including a memory and a processor;

The memory is used to store one or more computer instructions;

The processor is coupled to the memory and is used to execute the one or more computer instructions to perform the aforementioned time series prediction optimization method.

An embodiment of the present application also provides a computer-readable storage medium storing computer instructions. When the computer instructions are executed by one or more processors, the one or more processors are caused to perform the aforementioned time series prediction optimization method.

In an embodiment of the present application, it is proposed to construct a feature conversion module and a prediction module in a time series prediction model. The feature conversion module can be used to convert the received feature sequence into a converted sequence carrying time-varying features and time-invariant features, and the converted sequence output by the module is used as the input of the prediction module. The prediction module can be used to perform time series prediction and generate a prediction sequence. In this way, the time-invariant feature can be introduced into the time series prediction process as an important influencing factor through the feature conversion module. In addition, it is also proposed to use the volatility evaluation index for the time-invariant feature as the first loss function corresponding to the feature conversion module, and the evaluation index of the prediction sequence based on dynamic time warping (DTW) as the second loss function corresponding to the prediction module, and the time series prediction model is tuned based on at least these two loss functions. This makes the extraction of time-invariant features more accurate, and DTW can be introduced into the time series prediction process through the second loss function as another important influencing factor to improve the accuracy of sudden change prediction. Therefore, the performance of the time series prediction model can be effectively optimized and the prediction accuracy can be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawings described herein are used to provide a further understanding of the present application and constitute a part of the present application. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation on the present application. In the drawings:

FIG. 1 is a flow chart of a time series prediction optimization method provided by an exemplary embodiment of the present application;

FIG. 2 is a schematic diagram of the logical structure of a time series prediction model provided by an exemplary embodiment of the present application;

FIG. 3 is a schematic diagram of an optional implementation of a feature conversion module provided by an exemplary embodiment of the present application;

FIG. 4 is a flow chart of another time series prediction optimization method provided by an exemplary embodiment of the present application;

FIG. 5 is a schematic diagram of the structure of a computing device provided by another exemplary embodiment of the present application.

Detailed Description

In order to make the purpose, technical solution and advantages of this application clearer, the technical solution of this application will be described clearly and completely in combination with the specific embodiments of this application and the corresponding drawings. Obviously, the described embodiments are only part of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of this application.

At present, the performance of time series prediction models is poor, and the accuracy of their prediction results is insufficient. To this end, some embodiments of the present application propose constructing a feature conversion module and a prediction module in the time series prediction model. The feature conversion module can be used to convert the received feature sequence into a converted sequence carrying time-varying features and time-invariant features, and the converted sequence it outputs is used as the input of the prediction module. The prediction module can be used to perform time series prediction and generate a prediction sequence. In this way, the time-invariant feature can be introduced into the time series prediction process as an important influencing factor through the feature conversion module. In addition, it is also proposed to use the volatility evaluation index for the time-invariant feature as the first loss function corresponding to the feature conversion module, and the evaluation index of the prediction sequence based on dynamic time warping (DTW) as the second loss function corresponding to the prediction module, and the time series prediction model is tuned based on at least these two loss functions. This makes the extraction of time-invariant features more accurate, and DTW can be introduced into the time series prediction process through the second loss function as another important influencing factor to improve the accuracy of sudden change prediction. Therefore, the performance of the time series prediction model can be effectively optimized and the prediction accuracy can be improved.

The technical solutions provided by various embodiments of the present application are described in detail below in conjunction with the accompanying drawings.

FIG. 1 is a flow chart of a time series prediction optimization method provided by an exemplary embodiment of the present application. The method may be executed by a data processing device, which may be implemented in software and/or hardware and may be integrated in a computing device. Referring to FIG. 1, the method may include:

Step 100: Input the sample feature sequence into a feature conversion module constructed in the time series prediction model, where the feature conversion module is used to convert the received sequence into a converted sequence carrying time-varying features and time-invariant features;

Step 101: Input the converted sequence into a prediction module constructed in the time series prediction model, where the prediction module is used to perform time series prediction based on the time-varying features and time-invariant features carried by the converted sequence to generate a prediction sequence;

Step 102: Calculate an evaluation index for volatility for the time-invariant feature corresponding to the sample feature sequence to determine a first loss function value corresponding to the feature conversion module;

Step 103: Calculate an evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine a second loss function value corresponding to the prediction module;

Step 104: Based on the first loss function value and the second loss function value, jointly tune the feature conversion module and the prediction module to update relevant model parameters in the time series prediction model.

The time series prediction optimization method provided in this embodiment can be applied to various scenarios that require time series prediction, such as water conservancy and digital agriculture. This embodiment does not limit the application scenarios. It should be understood that in different application scenarios, the feature sequence used as the basis for prediction may have different contents, and the final prediction sequence may also have different contents. The sequence content can be set adaptively according to the needs of the scenario, and this embodiment does not limit this. Here, time series prediction can be understood as using a historical time series to predict a future time series. For example, the temperature series of the past 12 hours can be used to predict the temperature series of the next 6 hours.
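As a minimal sketch of the 12-hour/6-hour example above, the following splits one long series into (history, future) pairs. The function name `make_windows`, the parameter names, and the temperature readings are illustrative assumptions and do not come from the application.

```python
def make_windows(series, history_len=12, horizon=6):
    """Split a series into (history, future) pairs using a sliding window."""
    pairs = []
    last_start = len(series) - history_len - horizon
    for start in range(last_start + 1):
        split = start + history_len
        history = series[start:split]          # basis for prediction
        future = series[split:split + horizon]  # values to be predicted
        pairs.append((history, future))
    return pairs


# Made-up hourly temperature readings covering one day.
hourly_temps = [15.0 + 0.5 * hour for hour in range(24)]
pairs = make_windows(hourly_temps)
```

Each pair can then play the role of a sample feature sequence and its associated sample future sequence during training.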

FIG. 2 is a schematic diagram of the logical structure of a time series prediction model provided by an exemplary embodiment of the present application. Referring to FIG. 2, in this embodiment, the time series prediction model may include a feature conversion module and a prediction module. It should be understood that both the feature conversion module and the prediction module may adopt various feasible neural network learning models, such as RNN, CNN, Transformer, etc. This embodiment does not limit the model type adopted by the feature conversion module and the prediction module. Referring to FIG. 2, the feature conversion module and the prediction module are connected in series, that is, the output of the feature conversion module is used as the input of the prediction module.

In this embodiment, the feature conversion module can be used to convert the received feature sequence into a converted sequence carrying time-varying features and time-invariant features, and the prediction module can be used to perform time series prediction based on the received converted sequence to generate a predicted sequence. That is, the input of the feature conversion module may include the historical feature sequence, and the output includes the converted sequence; and the input of the prediction module may include the converted sequence provided by the feature conversion module, and the output may include the predicted sequence. Among them, the time series prediction logic in the prediction module can adopt single-step time series prediction logic or multi-step time series prediction logic, which is not limited in this embodiment. In addition, the time-varying features mentioned above refer to features that change with time, and the time-invariant features are features that do not change with time.

The following first describes the training process of the time series prediction model.

Referring to FIG. 1, in step 100, the sample feature sequence can be input into the feature conversion module constructed in the time series prediction model. As mentioned above, the feature conversion module can convert the feature sequence it receives into a converted sequence. Therefore, in step 100, the feature conversion module can be used to output the converted sequence corresponding to the sample feature sequence. Here, the sample feature sequence refers to the historical feature sequence used as the basis for prediction. As a sample, the sample feature sequence is usually associated with a corresponding sample future sequence. The sample future sequence is usually the real sequence corresponding to the predicted sequence generated by time series prediction based on the sample feature sequence. That is, the sample future sequence usually consists of actually collected values, and similarly, the sample feature sequence usually consists of actually collected values. During the model training process, the time series prediction model can generate a predicted sequence based on the sample feature sequence, and the real sequence associated with the sample feature sequence can be used to evaluate the accuracy of the predicted sequence, thereby guiding the tuning of the time series prediction model.

FIG. 3 is a schematic diagram of an optional implementation of a feature conversion module provided by an exemplary embodiment of the present application. In this optional implementation, the feature conversion module may adopt an encoder-decoder model, and the processing logic of the feature conversion module may be:

Use the encoding unit in the feature conversion module to map the sample feature sequence to the time-varying space and the time-invariant space, so as to extract, respectively, the time-varying features and the time-invariant features corresponding to the sample feature sequence;

Reconstruct the time-varying features and the time-invariant features into a fused sequence;

Decode the fused sequence using the decoding unit in the feature conversion module to generate a converted sequence.

Referring to FIG. 3, Y_label[t-n:t-1] may represent a sample feature sequence, u[t-k:t-1] represents the dimension of a single time-varying feature, and s[t-k:t-1] represents the dimension of a single time-invariant feature. It should be understood that each sample element in Y_label[t-n:t-1] may generate a time-varying feature u and a time-invariant feature s. h_t may represent the fused sequence generated by reconstructing the time-varying features and the time-invariant features. The elements of the fused sequence are aligned with the elements of Y_label[t-n:t-1], and each element of the fused sequence is the result of reconstructing the time-varying feature u and the time-invariant feature s generated by the corresponding element of Y_label[t-n:t-1]. An exemplary reconstruction logic is $h_t = [S^T, U^T]^T$, where S represents the matrix composed of the time-invariant features generated from Y_label[t-n:t-1], and U represents the matrix composed of the time-varying features generated from Y_label[t-n:t-1]. In this way, by decoding the fused sequence h_t, the converted sequence Y_t[t-n:t-1] corresponding to the sample feature sequence can be generated.
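The exemplary reconstruction logic, in which the transposes of S and U are placed side by side and the result is transposed again, amounts to stacking the rows of S on top of the rows of U. The sketch below illustrates only this stacking step. It assumes that S and U are stored as lists of rows with one column per sequence element; the names `fuse` and `column` and all numbers are hypothetical.

```python
def fuse(S, U):
    """Stack time-invariant rows S on top of time-varying rows U."""
    if not S or not U or len(S[0]) != len(U[0]):
        raise ValueError("S and U must be non-empty with equal column counts")
    return [list(row) for row in S] + [list(row) for row in U]


def column(matrix, j):
    """Return the fused feature vector aligned with sequence element j."""
    return [row[j] for row in matrix]


# One time-invariant dimension and two time-varying dimensions, 4 elements.
S = [[0.9, 0.9, 0.9, 0.9]]
U = [[0.1, 0.4, -0.2, 0.3],
     [1.0, 0.8, 0.7, 0.5]]
h = fuse(S, U)
```

Under this layout, column j of the fused matrix holds the time-invariant and time-varying features derived from element j of the input, which matches the element-wise alignment described above.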

In addition, in this optional implementation, before mapping the sample feature sequence to the time-invariant space and the time-varying space, the encoding unit may also first standardize the sample feature sequence, and then map the standardized sample feature sequence to the time-invariant space and the time-varying space. For example, the sample feature sequence may be standardized to [-1, 1]. Of course, this embodiment does not limit the range used in the standardization step. Similarly, after decoding the fused sequence, the decoded sequence may be de-standardized to generate the converted sequence. The standardization step can effectively reduce the amount of calculation inside the feature conversion module.
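The application does not state which standardization formula is used. As one possible reading, the sketch below applies min-max scaling to map a sequence onto [-1, 1] and then inverts the mapping. The function names and the choice of min-max scaling are assumptions made for illustration.

```python
def standardize(seq, low=-1.0, high=1.0):
    """Min-max scale seq onto [low, high]; also return the stats to invert it."""
    seq_min, seq_max = min(seq), max(seq)
    if seq_max == seq_min:
        # A constant sequence has no spread, so map it to the midpoint.
        return [(low + high) / 2.0 for _ in seq], (seq_min, seq_max)
    scale = (high - low) / (seq_max - seq_min)
    return [low + (x - seq_min) * scale for x in seq], (seq_min, seq_max)


def destandardize(scaled, stats, low=-1.0, high=1.0):
    """Invert standardize() using the stored (min, max) statistics."""
    seq_min, seq_max = stats
    if seq_max == seq_min:
        return [seq_min for _ in scaled]
    scale = (seq_max - seq_min) / (high - low)
    return [seq_min + (z - low) * scale for z in scaled]


scaled, stats = standardize([10.0, 15.0, 20.0])
restored = destandardize(scaled, stats)
```

The statistics are returned alongside the scaled values so that the decoded sequence can be mapped back to the original units.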

In this embodiment, the time-invariant features include the autocorrelation information between the elements in the feature sequence and other information that does not change with time. In this way, in this embodiment, the time-invariant features such as the autocorrelation information of the historical feature sequence can be introduced into the time series prediction process. As an important influencing factor, the subsequent prediction module can smoothly capture the time-invariant features in the historical feature sequence, thereby more accurately predicting the time series. For example, if the values of the 5th and 6th elements in the historical feature sequence do not change with time, and an element in the predicted sequence can be inferred from these two elements in the historical feature sequence, then after the time-invariant features corresponding to these two elements in the historical feature sequence are introduced into the time series prediction process, it can have a positive impact on the predicted sequence and improve the prediction accuracy. Of course, this is only exemplary.

Continuing to refer to FIG. 1, in step 101, the converted sequence can be input into the prediction module constructed in the time series prediction model. As mentioned above, the prediction module can be used to generate a prediction sequence by performing time series prediction based on the time-varying features and time-invariant features carried by the converted sequence. It should be understood that in practical applications, the input of the prediction module may also include other contents besides the converted sequence provided by the feature conversion module, for example, multimodal data that affects the variables in the sample feature sequence. For example, the sample feature sequence may be a temperature sequence, and the input of the prediction module may include, in addition to the converted sequence corresponding to the temperature sequence, weather data, seasonal data, geographic location data, human data and other multimodal data that affect the temperature, which can also serve as the basis for time series prediction. That is, this embodiment does not limit other inputs of the prediction module, and it can be accessed on demand according to the needs of the application scenario.

In this way, in step 101, a prediction sequence corresponding to the sample feature sequence can be generated.

On this basis, referring to Figure 1, in step 102, an evaluation index for volatility can be calculated for the time-invariant feature corresponding to the sample feature sequence to determine the first loss function value corresponding to the feature conversion module. In step 103, an evaluation index based on dynamic time warping DTW can be calculated for the prediction sequence to determine the second loss function value corresponding to the prediction module.

In this embodiment, a first loss function can be configured for the feature conversion module, and the first loss function can adopt a volatility evaluation index for time-invariant features; a second loss function can also be configured for the prediction module, and the second loss function can adopt an evaluation index for the prediction sequence based on dynamic time warping DTW.

In an exemplary solution, the mean square error of the time-invariant features extracted by the feature conversion module can be used as the evaluation index for volatility, from which the first loss function is constructed. In this case, the first loss function can be characterized as:

Here, s(t-i) and s(t-i-1) are the time-invariant features corresponding to any two adjacent elements of the sample feature sequence, and K is the number of time-invariant features, which is usually consistent with the sequence length of the sample feature sequence.
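One form consistent with these definitions is $L_s = \frac{1}{K}\sum_i \left(s(t-i) - s(t-i-1)\right)^2$, that is, the squared differences of adjacent time-invariant features averaged over K. The summation range and the normalization by K are assumptions here, since the text only names the quantities involved. A minimal sketch for scalar features, with a hypothetical function name:

```python
def volatility_loss(s):
    """Sum of squared differences of adjacent features, divided by K = len(s)."""
    if len(s) < 2:
        return 0.0  # a single feature cannot fluctuate
    squared_diffs = [(s[i] - s[i - 1]) ** 2 for i in range(1, len(s))]
    return sum(squared_diffs) / len(s)


flat = volatility_loss([1.0, 1.0, 1.0])   # features that do not change
jumpy = volatility_loss([0.0, 2.0])       # features that change sharply
```

A perfectly constant feature track gives a loss of zero, so minimizing this quantity pushes the extracted time-invariant features toward not changing over time.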

In another exemplary solution, the shape deviation and the time deviation under dynamic time warping (DTW) may be calculated for the predicted sequence and used to construct the second loss function. The second loss function can be represented as:
$$L_{DTW} = \alpha_1 L_{shape} + \alpha_2 L_{tdi}$$

Here, $L_{shape}$ represents the shape deviation, and $L_{tdi}$ represents the time deviation.

It is worth noting that, in this embodiment, knowledge related to dynamic time warping (DTW) can be found in existing and future public information. In this embodiment, the two objects processed by DTW are the predicted sequence and its corresponding real sequence, and the specific dynamic time warping logic can be found in the relevant literature, so it is not explained in detail here. This embodiment does not change the DTW technique itself.

Based on this, in this exemplary solution, the process of calculating the evaluation index based on dynamic time warping DTW for the prediction sequence to determine the second loss function value corresponding to the prediction module may be:

Perform dynamic time warping on the predicted sequence and its corresponding real sequence, calculate the shortest path, and generate the time matching relationship between the two;

Based on the time matching relationship, the shape deviation between the predicted sequence and the true sequence after dynamic time warping is calculated;

Based on the time matching relationship, the time deviation between the predicted sequence and the real sequence after dynamic time warping is calculated;

A second loss function value is determined based on the shape deviation and the time deviation.
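The first step above (warping, shortest path, time matching relationship) can be sketched with the classic dynamic-programming form of DTW. This is a textbook version for scalar sequences, not necessarily the variant used in the application; the squared-difference local cost and the function name `dtw_path` are assumptions.

```python
def dtw_path(pred, label):
    """Return (total cost, warping path) aligning pred with label."""
    n, m = len(pred), len(label)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost aligning pred[:i] with label[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (pred[i - 1] - label[j - 1]) ** 2
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    # Backtrack from the end to recover the matched index pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# The prediction rises one step earlier than the real sequence.
total, path = dtw_path([0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0])
```

Each pair (i, j) in the returned path says that element i of the predicted sequence is matched to element j of the real sequence, which is the time matching relationship the later steps rely on.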

Optionally, a shape deviation calculation scheme may be: for each element in the predicted sequence, determine the corresponding matching element in the real sequence after dynamic time warping; for each element in the predicted sequence, calculate the distance between it and the corresponding matching element; calculate the square root of the sum of the distances as the shape deviation. In this calculation scheme, the shape deviation can be characterized as:

Here, π is the shortest path after dynamic time warping, Y_pred[i] is the value of the i-th element in the predicted sequence, Y_label[j] is the value of the j-th element in the corresponding real sequence, and i and j are aligned. In addition, to make the loss differentiable for backpropagation, the shape deviation can also be converted into a log-sum-exp form.

Optionally, a time deviation calculation scheme may be: for each element in the predicted sequence, determine the corresponding matching element in the real sequence after dynamic time warping; for each element in the predicted sequence, calculate the time difference between it and the corresponding matching element; calculate the mean square sum of the time differences as the time deviation. In this calculation scheme, the time deviation can be characterized as:

Here, π is the shortest path after dynamic time warping, i is the time corresponding to the i-th element in the predicted sequence, j is the time corresponding to the j-th element in the corresponding real sequence, and i and j are aligned. In addition, to make the loss differentiable for backpropagation, the time deviation can also be converted mathematically into a log-sum-exp form.
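The text does not spell out the log-sum-exp conversion. One common way to make a DTW-style quantity differentiable is to replace the hard minimum over candidate paths with a smooth minimum, as in soft-DTW-style relaxations. The sketch below shows only that smooth minimum; treating it as the intended conversion is an assumption, and `soft_min` and `gamma` are hypothetical names.

```python
import math


def soft_min(values, gamma=0.1):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    smallest = min(values)
    # Shift by the smallest value so every exponent is <= 0 (no overflow).
    total = sum(math.exp(-(v - smallest) / gamma) for v in values)
    return smallest - gamma * math.log(total)


sharp = soft_min([1.0, 2.0, 3.0], gamma=0.01)   # close to the hard minimum
smooth = soft_min([1.0, 2.0, 3.0], gamma=1.0)   # noticeably below it
```

As gamma shrinks, the value approaches the ordinary minimum, while larger gamma gives a smoother function whose gradient is spread over all candidates.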

It is worth noting that the above-mentioned calculation scheme for shape deviation and time deviation is only exemplary, and the present embodiment is not limited thereto. For example, in addition to using the square root of the sum of distances, other logics such as the mean variance of distances can also be used to calculate shape deviation. In addition to using the mean square sum, other logics such as diagonal offset can also be used to calculate time deviation. These are not exhaustively listed here.
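The two exemplary calculation schemes can be sketched as follows, given a time matching relationship expressed as a list of index pairs. The sketch assumes the absolute difference as the distance between matched elements and reads "mean square sum" as the mean of the squared index differences; both readings, the function names, the default weights, and the hand-written path are assumptions.

```python
import math


def shape_deviation(pred, label, path):
    """Square root of the summed distances between matched elements."""
    return math.sqrt(sum(abs(pred[i] - label[j]) for i, j in path))


def time_deviation(path):
    """Mean of the squared time differences between matched elements."""
    return sum((i - j) ** 2 for i, j in path) / len(path)


def dtw_loss(pred, label, path, alpha1=0.5, alpha2=0.5):
    """Weighted sum of shape deviation and time deviation."""
    return (alpha1 * shape_deviation(pred, label, path)
            + alpha2 * time_deviation(path))


pred = [0.0, 2.0, 1.0, 0.0]
label = [0.0, 0.0, 1.0, 0.0]
# Hand-written alignment used only for illustration.
path = [(0, 0), (0, 1), (1, 2), (2, 2), (3, 3)]
l_shape = shape_deviation(pred, label, path)
l_tdi = time_deviation(path)
l_dtw = dtw_loss(pred, label, path)
```

The shape term grows when matched values differ, while the time term grows when matched elements sit far apart in time, so a prediction with the right shape but a lag is still penalized.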

Continuing to refer to FIG. 1, in step 104, the feature conversion module and the prediction module may be jointly tuned based on the first loss function value and the second loss function value to update relevant model parameters in the time series prediction model.

In this way, based on the first loss function, the feature conversion module can optimize its extraction of the time-invariant features, so that the extracted time-invariant features fluctuate less and are therefore captured more accurately. Based on the second loss function, the prediction module can fully account for deviations in predicting the temporal pattern of the series, especially the lag or deviation that may occur when the time series changes suddenly, so as to make time series prediction more accurate.

In addition to the first loss function and the second loss function mentioned above, in this embodiment, other loss functions can be configured for the time series prediction model during the training process. A typical loss function can be the mean absolute error MAE. For this reason, in this embodiment, a third loss function can also be configured for the prediction module. The third loss function uses the mean absolute error MAE for the prediction sequence. The third loss function can be characterized as:

Among them, Y _pred[i] represents the i-th element value in the predicted sequence, Y _label[i] represents the i-th element value in the corresponding true sequence, and n is the length of the predicted sequence.
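With these definitions, the mean absolute error takes its standard form $L_{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| Y_{pred}[i] - Y_{label}[i] \right|$. A minimal sketch, where the function name `mae` is chosen only for illustration:

```python
def mae(pred, label):
    """Mean absolute error between two equal-length sequences."""
    if not pred or len(pred) != len(label):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(p - y) for p, y in zip(pred, label)) / len(pred)


l_mae = mae([1.0, 2.0, 3.0], [1.0, 1.0, 5.0])
```

Unlike the DTW-based terms, this compares elements strictly position by position, without any warping.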

Thus, in this embodiment, the loss functions mentioned above can be combined to generate a hybrid loss function, which can be characterized as:
$$L_{loss} = \alpha_1 L_{shape} + \alpha_2 L_{tdi} + \alpha_3 L_{MAE} + \beta_1 L_s$$

On this basis, in this embodiment, a hybrid loss function can be used to jointly tune the feature conversion module and the prediction module in the time series prediction model to jointly optimize the performance of the feature conversion module and the prediction module.

Returning to the sample feature sequence, the mean absolute error (MAE) between the predicted sequence corresponding to the sample feature sequence and the corresponding real sequence can be calculated to determine the third loss function value corresponding to the prediction module; the first loss function value, the second loss function value and the third loss function value are then weighted and summed to jointly optimize the feature conversion module and the prediction module.
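The weighted summation can be sketched directly from the hybrid loss expression, with the shape and time deviations weighted separately alongside the MAE and volatility terms. The function name, argument names, and all weight values below are placeholders chosen for illustration, not values from the application.

```python
def hybrid_loss(l_shape, l_tdi, l_mae, l_s,
                alphas=(0.3, 0.3, 0.3), beta1=1.0):
    """alpha1*L_shape + alpha2*L_tdi + alpha3*L_MAE + beta1*L_s."""
    alpha1, alpha2, alpha3 = alphas
    return alpha1 * l_shape + alpha2 * l_tdi + alpha3 * l_mae + beta1 * l_s


# Example loss values for one training iteration (made up).
total_loss = hybrid_loss(1.0, 0.4, 1.0, 2.0,
                         alphas=(0.5, 0.25, 0.25), beta1=1.0)
```

Because a single scalar is produced, one backward pass through it can update the parameters of both modules at once, which is what makes the tuning joint.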

In addition, in this embodiment, automatic differentiation of the hybrid loss function can also be used to determine the weight values corresponding to the first loss function, the second loss function, and the third loss function. During the training process, each weight parameter in the hybrid loss function can be set as an adjustable weight parameter, with the weights summing to less than 1, and defined as a differentiable function. In this way, the final weight values can be found automatically through automatic gradient computation. When validating the model, these weight values in the hybrid loss function can be fixed. Here, L_s can use an independent weight parameter, which can be set to 1 or another value, while the weight coefficients of the other loss functions are determined together through automatic gradient computation. Of course, this is only an exemplary method, and this embodiment does not limit this.
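The application does not say how the weights are kept differentiable while summing to less than 1. One possible parametrization, shown here purely as an assumption, is a softmax over the trainable values plus one extra fixed logit whose share is discarded. The name `bounded_weights` is hypothetical, and an automatic differentiation framework would be needed to actually learn the logits.

```python
import math


def bounded_weights(logits):
    """Map unconstrained reals to positive weights that sum to less than 1."""
    # Shift by the largest exponent (including the fixed logit 0) for stability.
    shift = max(max(logits), 0.0)
    exps = [math.exp(z - shift) for z in logits]
    slack = math.exp(0.0 - shift)  # share of the extra fixed logit, discarded
    total = sum(exps) + slack
    return [e / total for e in exps]


# Three trainable logits, one per weighted loss term other than L_s.
weights = bounded_weights([0.0, 0.0, 0.0])
```

Since the mapping is smooth in every logit, gradients of the hybrid loss can flow back to the logits, and the discarded share keeps the sum below 1 in exact arithmetic.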

It is worth noting that the above explains the training scheme of the time series prediction model from the perspective of a single iteration. It should be understood that in this embodiment, several sample feature sequences can be used to iterate the time series prediction model multiple times, and tuning stops once the performance of the time series prediction model meets the specified requirements. That is, in this embodiment, parameters such as the number of iterations and the learning rate can be set, and the time series prediction model can be iteratively tuned according to the first loss function and the second loss function customized in this embodiment to complete the training of the time series prediction model.

After the time series prediction model is trained, it can be used to provide time series prediction services.

In this regard, in this embodiment, a time series prediction request can also be received, and a historical feature sequence can be obtained according to the time series prediction request. The historical feature sequence is input into the feature conversion module in the time series prediction model, so that the feature conversion module converts the historical feature sequence into a converted sequence carrying time-varying features and time-invariant features. The converted sequence corresponding to the historical feature sequence is then input into the prediction module, so that the prediction module performs time series prediction and generates a prediction sequence corresponding to the historical feature sequence. The historical feature sequence can be carried directly in the time series prediction request, or it can be obtained from other channels. Based on the training of the feature conversion module and the prediction module described above, the feature conversion module can accurately extract time-varying features and time-invariant features from the historical feature sequence, and these features are brought into the prediction module through the converted sequence. The prediction module can fully capture the influence of the time-invariant features and of the relevant indicators of dynamic time warping (DTW) on the prediction results, so that time series prediction is performed more accurately and the output prediction sequence has higher accuracy.

In summary, in this embodiment, it is proposed to construct a feature conversion module and a prediction module in the time series prediction model. The feature conversion module can be used to convert the received feature sequence into a converted sequence carrying time-varying features and time-invariant features, and the converted sequence it outputs is used as the input of the prediction module. The prediction module can be used to perform time series prediction and generate a prediction sequence. In this way, the time-invariant feature can be introduced into the time series prediction process as an important influencing factor through the feature conversion module. In addition, it is also proposed to use the volatility evaluation index for the time-invariant feature as the first loss function corresponding to the feature conversion module, and the evaluation index of the prediction sequence based on dynamic time warping (DTW) as the second loss function corresponding to the prediction module, and the time series prediction model is tuned based on at least these two loss functions. This makes the extraction of time-invariant features more accurate, and DTW can be introduced into the time series prediction process through the second loss function as another important influencing factor to improve the accuracy of sudden change prediction. Therefore, the performance of the time series prediction model can be effectively optimized and the prediction accuracy can be improved.

FIG. 4 is a flow chart of another time series prediction optimization method provided by an exemplary embodiment of the present application. Referring to FIG. 4, the method may include:

Step 400: receive a time series prediction request;

Step 401: obtain a historical feature sequence according to the time series prediction request;

Step 402: input the historical feature sequence into a feature conversion module in the time series prediction model, so as to generate a converted sequence for the historical feature sequence using the feature conversion module, wherein the feature conversion module is used to extract time-varying features and time-invariant features from the received sequence to generate a converted sequence;

Step 403: input the converted sequence corresponding to the historical feature sequence into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence;

Wherein, the prediction module uses an evaluation index based on dynamic time warping (DTW) as its loss function.
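Steps 400 to 403 can be sketched as a small request-handling routine. The request layout (a dictionary with an optional "history" key or a "series_id" key), the `history_store` lookup, and the two toy modules are hypothetical placeholders introduced only to make the flow runnable; they do not represent the trained feature conversion module or prediction module of this embodiment.

```python
def toy_feature_conversion(history):
    """Placeholder module: treats the mean as the time-invariant part
    and the residuals as the time-varying part."""
    mean = sum(history) / len(history)
    return {"time_invariant": mean,
            "time_varying": [x - mean for x in history]}


def toy_prediction(converted, horizon=2):
    """Placeholder module: repeats mean plus the last residual."""
    last = converted["time_varying"][-1]
    return [converted["time_invariant"] + last] * horizon


def handle_prediction_request(request, feature_conversion, prediction,
                              history_store=None):
    # Step 401: the history may be carried in the request or fetched elsewhere.
    history = request.get("history")
    if history is None:
        history = history_store[request["series_id"]]
    # Step 402: generate the converted sequence.
    converted = feature_conversion(history)
    # Step 403: generate the prediction sequence.
    return prediction(converted)


# Step 400: two illustrative requests.
forecast_inline = handle_prediction_request(
    {"history": [1.0, 2.0, 3.0]}, toy_feature_conversion, toy_prediction)
forecast_lookup = handle_prediction_request(
    {"series_id": "cpu"}, toy_feature_conversion, toy_prediction,
    history_store={"cpu": [4.0, 4.0]})
```

Passing the two modules as callables keeps the request flow independent of how the modules are implemented or trained, which matches the point that this embodiment constrains only the model structure and module functions.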

In this embodiment, a time series prediction model is proposed, which includes a feature conversion module and a prediction module whose functions are as described above. The feature conversion module can fully capture the time-varying features and time-invariant features in the historical feature sequence and bring them into the prediction module through the converted sequence. In this way, the time-invariant features in the historical feature sequence can be used as influencing factors of time series prediction, which can effectively improve the accuracy of time series prediction.

In addition, the prediction module also uses an evaluation index based on dynamic time warping (DTW) as a loss function, which allows the time series prediction process of the prediction module to fully consider the time warping deviation as another important influencing factor. This can effectively improve prediction accuracy, especially for prediction when the time series changes suddenly.

This embodiment imposes requirements only on the structure and module functions of the time series prediction model and does not limit the specific training process. Of course, the time series prediction model in this embodiment can be tuned using the training scheme provided in the embodiment related to FIG. 1, but it should be understood that this embodiment is not limited to this, and any training scheme that can ensure the model structure and module functions required by this embodiment can be used.

Regarding the model training scheme, please refer to the exemplary scheme provided in the relevant embodiment of Figure 1, which will not be described here in order to save space.

It should be noted that in some of the processes described in the above embodiments and the accompanying drawings, multiple operations that appear in a specific order are included, but it should be clearly understood that these operations may not be executed in the order in which they appear in this article or may be executed in parallel. The serial numbers of the operations, such as 101, 102, etc., are only used to distinguish between different operations, and the serial numbers themselves do not represent any execution order. In addition, these processes may include more or fewer operations, and these operations may be executed in sequence or in parallel. It should be noted that the descriptions of "first", "second", etc. in this article are used to distinguish different loss functions, etc., and do not represent the order of precedence, nor do they limit "first" and "second" to be different types.

Fig. 5 is a schematic diagram of a computing device provided by another exemplary embodiment of the present application. As shown in Fig. 5 , the computing device includes: a memory 50 and a processor 51 .

The processor 51 is coupled to the memory 50 and is used to execute the computer program in the memory 50 to:

Input the sample feature sequence into a feature conversion module constructed in the time series prediction model, wherein the feature conversion module is used to convert the received sequence into a converted sequence carrying time-varying features and time-invariant features;

Input the converted sequence into a prediction module constructed in the time series prediction model, wherein the prediction module is used to perform time series prediction based on the time-varying features and time-invariant features carried by the converted sequence to generate a prediction sequence;

Calculate an evaluation index for volatility for the time-invariant feature corresponding to the sample feature sequence to determine a first loss function value corresponding to the feature conversion module;

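Calculate the evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine the second loss function value corresponding to the prediction module;

Based on the first loss function value and the second loss function value, jointly tune the feature conversion module and the prediction module to update relevant model parameters in the time series prediction model.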

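In an optional embodiment, the processor 51, in the process of calculating the evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine the second loss function value corresponding to the prediction module, may be used to:

Perform dynamic time warping on the predicted sequence and its corresponding real sequence, calculate the shortest path, and generate a time matching relationship between the two;

Based on the time matching relationship, calculate the shape deviation between the predicted sequence and the real sequence after dynamic time warping processing;

Based on the time matching relationship, calculate the time deviation between the predicted sequence and the real sequence after dynamic time warping processing;

Determine the second loss function value according to the shape deviation and the time deviation.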

In an optional embodiment, the processor 51, in the process of calculating the shape deviation between the predicted sequence and the true sequence after the dynamic time warping process, may be used to:

For each element in the predicted sequence, determine the corresponding matching element in the real sequence after dynamic time warping processing;

For each element in the predicted sequence, calculate the distance between it and the corresponding matching element;

The square root of the sum of the distances is calculated as the shape deviation.
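The shape deviation computation can be sketched as follows, under stated assumptions: the local cost used to find the warping path is the absolute difference, the "distance" between a predicted element and its matching element is taken as the squared difference, and every pair on the warping path contributes (so a predicted element matched to several real elements is counted once per match). None of these choices is fixed by this embodiment, and the function names are illustrative.

```python
import math


def dtw_path(predicted, actual):
    """Classic dynamic-programming DTW; returns the warping path as
    (predicted_index, actual_index) pairs."""
    n, m = len(predicted), len(actual)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(predicted[i - 1] - actual[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the matching.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return path


def shape_deviation(predicted, actual, path):
    """Square root of the summed squared differences over matched pairs."""
    total = sum((predicted[i] - actual[j]) ** 2 for i, j in path)
    return math.sqrt(total)


# A prediction that has the right shape but is shifted in time.
predicted = [0.0, 0.0, 1.0, 2.0]
actual = [0.0, 1.0, 2.0, 2.0]
path = dtw_path(predicted, actual)
shifted_shape = shape_deviation(predicted, actual, path)    # 0.0

# A prediction with a wrong amplitude.
spike_pred, flat_true = [0.0, 3.0], [0.0, 0.0]
spike_shape = shape_deviation(spike_pred, flat_true,
                              dtw_path(spike_pred, flat_true))  # 3.0
```

The first example illustrates why a separate time deviation is also computed: a pure shift in time leaves the shape deviation at zero.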

In an optional embodiment, the processor 51, in the process of calculating the time deviation between the predicted sequence and the real sequence after the dynamic time warping process, may be used to:

For each element in the predicted sequence, calculate the time difference between it and the corresponding matching element;

Calculate the mean square sum of the time differences as the time deviation.
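The time deviation can be sketched in the same spirit. The "mean square sum of the time differences" is interpreted here as the mean of the squared index differences over the matched pairs, which is an assumption about the intended formula. The blending of shape and time deviation through a coefficient `alpha` is likewise hypothetical; this embodiment only states that the second loss function value is determined according to both deviations.

```python
def time_deviation(path):
    """Mean of squared time-index differences over matched (i, j) pairs."""
    if not path:
        raise ValueError("path must not be empty")
    return sum((i - j) ** 2 for i, j in path) / len(path)


def dtw_loss(shape_dev, time_dev, alpha=0.5):
    """Hypothetical combination of the two deviations into one loss value."""
    return alpha * shape_dev + (1.0 - alpha) * time_dev


# Matching relationship for a prediction that lags the real sequence by one step
# (pairs of predicted index and matched real index; illustrative values).
path = [(0, 0), (1, 0), (2, 1), (3, 2), (3, 3)]
t_dev = time_deviation(path)          # squared differences 0, 1, 1, 1, 0
second_loss = dtw_loss(0.0, t_dev)    # shape deviation of 0.0 assumed here
```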

In an optional embodiment, in the process of jointly tuning the feature conversion module and the prediction module based on the first loss function value and the second loss function value, the processor 51 may also be used to:

Calculate the mean absolute error (MAE) between the predicted sequence and its corresponding true sequence to determine the third loss function value corresponding to the prediction module;

The first loss function value, the second loss function value, and the third loss function value are weighted and summed to jointly tune the feature conversion module and the prediction module.

In an optional embodiment, the processor 51 may also be configured to:

The first loss function, the second loss function, and the third loss function are respectively configured with weight parameters and then combined to generate a mixed loss function;

The gradient of the mixed loss function is automatically derived to determine the weight values corresponding to the first loss function, the second loss function, and the third loss function.

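In an optional embodiment, the feature conversion module adopts an encoding-decoding model, and the processor 51 can also be used to:

Map the sample feature sequence to a time-invariant space and a time-varying space using the encoding unit in the feature conversion module, so as to respectively extract the time-varying features and the time-invariant features corresponding to the sample feature sequence;

Reconstruct the time-varying features and the time-invariant features into a fused sequence;

Decode the fused sequence using the decoding unit in the feature conversion module to generate the converted sequence.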

In an optional embodiment, the processor 51, in the process of calculating the evaluation index for volatility for the time-invariant feature corresponding to the sample feature sequence, may be used to:

The mean square error of the time-invariant features corresponding to the sample feature sequence is calculated as an evaluation indicator for volatility.
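A minimal sketch of such a volatility index is given below. It assumes that the time-invariant features are available as one vector per time step and that the mean square error is measured against each feature dimension's mean over time; this embodiment does not state the reference value, so that choice is an assumption. A value of zero then means the extracted features do not change over time at all.

```python
def volatility_index(features):
    """Mean squared deviation of each feature dimension from its mean over time.

    `features` is a list of equal-length vectors, one per time step.
    """
    steps = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / steps for d in range(dims)]
    squared = sum((f[d] - means[d]) ** 2
                  for f in features for d in range(dims))
    return squared / (steps * dims)


stable = volatility_index([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])   # 0.0
drifting = volatility_index([[0.0], [2.0]])                        # 1.0
```

Used as the first loss function, a smaller value pushes the feature conversion module toward features that stay constant across time steps.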

In an optional embodiment, the time-invariant features include autocorrelation information.

In an optional embodiment, after completing the tuning of the time series prediction model, the processor 51 may also be used to:

Receive a time series prediction request;

Obtain a historical feature sequence according to the time series prediction request;

Inputting the historical feature sequence into a feature conversion module in the time series prediction model, so as to utilize the feature conversion module to convert the historical feature sequence into a converted sequence carrying time-varying features and time-invariant features;

The converted sequence corresponding to the historical feature sequence is input into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence.

In some other possible aspects, based on the computing device shown in FIG. 5 , the processor 51 in the computing device may also be used to execute the computer program in the memory 50 to:

Receive a time series prediction request;

Obtain a historical feature sequence according to the time series prediction request;

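Input the historical feature sequence into a feature conversion module in the time series prediction model to generate a converted sequence for the historical feature sequence using the feature conversion module, wherein the feature conversion module is used to extract time-varying features and time-invariant features from the received sequence to generate a converted sequence;

Input the converted sequence corresponding to the historical feature sequence into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence;

Wherein, the prediction module uses an evaluation index based on dynamic time warping (DTW) as its loss function.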

Furthermore, as shown in Fig. 5, the computing device also includes other components such as a communication component 52 and a power component 53. Fig. 5 only schematically shows some components, which does not mean that the computing device only includes the components shown in Fig. 5.

It is worth noting that the technical details in the above-mentioned embodiments of the computing device can refer to the relevant description in the embodiment of the method. In order to save space, they will not be repeated here, but this should not cause loss of the protection scope of this application.

Accordingly, an embodiment of the present application further provides a computer-readable storage medium storing a computer program, which, when executed, can implement the steps that can be executed by a computing device in the above method embodiment.

The memory in FIG. 5 is used to store computer programs and can be configured to store various other data to support operations on the computing platform. Examples of such data include instructions for any application or method operating on the computing platform, contact data, phone book data, messages, pictures, videos, etc. The memory can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.

The communication component in FIG. 5 is configured to facilitate wired or wireless communication between the device where the communication component is located and other devices. The device where the communication component is located can access a wireless network based on a communication standard, such as WiFi, a mobile communication network such as 2G, 3G, 4G/LTE, or 5G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

The power component in FIG. 5 provides power to the various components of the device where the power component is located. The power component may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device where the power component is located.

Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented in one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) that include computer-usable program code.

The present application is described with reference to the flowchart and/or block diagram of the method, device (system) and computer program product according to the embodiment of the present application. It should be understood that each process and/or box in the flowchart and/or block diagram, and the combination of the process and/or box in the flowchart and/or block diagram can be realized by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the function specified in one process or multiple processes in the flowchart and/or one box or multiple boxes in the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

In a typical configuration, a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent storage in a computer-readable medium, in the form of random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media that can be used to store information by any method or technology. Information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory media such as modulated data signals and carrier waves.

It should also be noted that the terms "include", "comprises" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such process, method, commodity or device. In the absence of further restrictions, the elements defined by the sentence "comprises a ..." do not exclude the existence of other identical elements in the process, method, commodity or device including the elements.

The above is only the embodiment of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various changes and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

A time series prediction optimization method, comprising:

Inputting the sample feature sequence into a feature conversion module constructed in the time series prediction model, wherein the feature conversion module is used to convert the received sequence into a converted sequence carrying time-varying features and time-invariant features;

Inputting the converted sequence into a prediction module constructed in the time series prediction model, wherein the prediction module is used to perform time series prediction based on the time-varying features and the time-invariant features carried by the converted sequence to generate a prediction sequence;

Calculating an evaluation index for volatility for the time-invariant feature corresponding to the sample feature sequence to determine a first loss function value corresponding to the feature conversion module;

Calculating an evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine a second loss function value corresponding to the prediction module;

Based on the first loss function value and the second loss function value, the feature conversion module and the prediction module are jointly tuned to update relevant model parameters in the time series prediction model.
The method according to claim 1, wherein calculating an evaluation index based on dynamic time warping (DTW) for the prediction sequence to determine a second loss function value corresponding to the prediction module comprises:

Performing dynamic time warping on the predicted sequence and its corresponding real sequence, calculating the shortest path, and generating a time matching relationship between the two;

Based on the time matching relationship, calculating the shape deviation between the predicted sequence and the real sequence after dynamic time warping processing;

Based on the time matching relationship, calculating the time deviation between the predicted sequence and the real sequence after dynamic time warping processing;

The second loss function value is determined according to the shape deviation and the time deviation.
The method according to claim 2, wherein calculating the shape deviation between the predicted sequence and the real sequence after dynamic time warping processing comprises:

For each element in the predicted sequence, determine the corresponding matching element in the real sequence after dynamic time warping processing;

For each element in the predicted sequence, respectively, calculate the distance between it and the corresponding matching element;

The square root of the sum of the distances is calculated as the shape deviation.
The method according to claim 2, wherein calculating the time deviation between the predicted sequence and the real sequence after dynamic time warping processing comprises:

For each element in the predicted sequence, determine the corresponding matching element in the real sequence after dynamic time warping processing;

For each element in the predicted sequence, respectively, calculate the time difference between it and the corresponding matching element;

The mean square sum of the time differences is calculated as the time deviation.
The method according to claim 1, wherein jointly tuning the feature conversion module and the prediction module based on the first loss function value and the second loss function value further comprises:

Calculating the mean absolute error (MAE) between the predicted sequence and its corresponding true sequence to determine a third loss function value corresponding to the prediction module;

A weighted sum is performed on the first loss function value, the second loss function value, and the third loss function value to jointly tune the feature conversion module and the prediction module.
The method according to claim 5, further comprising:

The first loss function, the second loss function, and the third loss function are respectively configured with weight parameters and then combined to generate a mixed loss function;

Automatically deriving the gradient of the mixed loss function to determine the weight values corresponding to the first loss function, the second loss function, and the third loss function.
The method according to claim 1, wherein the feature conversion module adopts an encoding-decoding model, and the method further comprises:

Mapping the sample feature sequence to a time-invariant space and a time-varying space using the encoding unit in the feature conversion module, so as to respectively extract the time-varying features and the time-invariant features corresponding to the sample feature sequence;

reconstructing the time-varying features and the time-invariant features into a fused sequence;

The fused sequence is decoded by using a decoding unit in the feature conversion module to generate the converted sequence.
The method according to claim 1, wherein calculating an evaluation index for volatility for the time-invariant features corresponding to the sample feature sequence comprises:

The mean square error of the time-invariant features corresponding to the sample feature sequence is calculated as the evaluation index for volatility.
The method according to claim 1, wherein the time-invariant features contain autocorrelation information.
The method according to claim 1, wherein after the tuning of the time series prediction model is completed, the method further comprises:

Receiving a time series prediction request;

Acquire a historical feature sequence according to the time series prediction request;

Inputting the historical feature sequence into a feature conversion module in the time series prediction model, so as to utilize the feature conversion module to convert the historical feature sequence into a converted sequence carrying time-varying features and time-invariant features;

The converted sequence corresponding to the historical feature sequence is input into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence.
A time series prediction optimization method, comprising:

Receiving a time series prediction request;

Acquire a historical feature sequence according to the time series prediction request;

The historical feature sequence is input into a feature conversion module in a time series prediction model, so as to generate a converted sequence for the historical feature sequence using the feature conversion module, wherein the feature conversion module is used to extract time-varying features and time-invariant features from the received sequence to generate a converted sequence;

Inputting the converted sequence corresponding to the historical feature sequence into the prediction module, so as to use the prediction module to perform time series prediction and generate a prediction sequence corresponding to the historical feature sequence;

The prediction module uses an evaluation index based on dynamic time warping (DTW) as a loss function.
A computing device comprising a memory and a processor;

The memory is used to store one or more computer instructions;

The processor is coupled to the memory and is used to execute the one or more computer instructions to perform the time series prediction optimization method according to any one of claims 1-11.
A computer-readable storage medium storing computer instructions, wherein the computer instructions, when executed by one or more processors, cause the one or more processors to perform the time series prediction optimization method according to any one of claims 1-11.