CN114565158A

CN114565158A - Data prediction method and device, electronic equipment and storage medium

Info

Publication number: CN114565158A
Application number: CN202210189872.1A
Authority: CN
Inventors: 苑宝鑫
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2022-02-28
Filing date: 2022-02-28
Publication date: 2022-05-31

Abstract

The present disclosure provides a data prediction method, apparatus, electronic device, storage medium, and program product, which can be used in the financial field and in other fields. The data prediction method includes: acquiring a first historical data sequence; decomposing the first historical data sequence to obtain a trend characteristic value sequence, a periodic characteristic value sequence, and a residual characteristic value sequence; training an ARIMA model with the trend characteristic value sequence to obtain a trained ARIMA model; and obtaining all predicted values and prediction intervals in the time period to be predicted by using the trained ARIMA model, the periodic characteristic value sequence, and the residual characteristic value sequence. Because the ARIMA model is trained on the trend characteristic value sequence obtained by decomposing the historical data, and the trained model is then combined with the periodic and residual characteristic value sequences from the same decomposition to obtain the predicted values and prediction intervals, the accuracy of data prediction can be effectively improved.

Description

Data prediction method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of computer data processing technologies, and in particular, to a data prediction method, an apparatus, an electronic device, a storage medium, and a program product.

Background

With the continuous development of information technology, IT financial systems built around banking business have increasingly complex logic and an increasing number of links. The transaction channels, types, and fields involved change day by day, and the data volume shows a ten-million-level growth trend. Extracting valuable information from massive transaction data is therefore of great significance for providing business decision services to banks and for promptly discovering and resolving anomalies in a bank's real-time transaction information systems. Predicting future transaction data from historically generated bank transaction data is likewise of great significance to a bank's business decision services.

The time series prediction method is in essence a regression prediction method and belongs to quantitative prediction. Its basic principle is as follows: on one hand, it acknowledges the continuity of the development of an object and estimates the object's development trend through statistical analysis of past time series data; on the other hand, it takes into account the randomness caused by accidental factors and, to eliminate the influence of random fluctuation, performs statistical analysis on historical data and processes the data appropriately before predicting the trend.

Different application scenarios place different processing requirements on time series data, and traditional time series methods cannot solve these problems well. Traditional time series prediction methods have difficulty modeling highly complex, nonlinear time series data and capturing the dependence of the data on time, and for time series that are non-stationary or change abruptly they have difficulty avoiding the influence of abnormal data.

Disclosure of Invention

In view of the above, the present disclosure provides a data prediction method, apparatus, electronic device, storage medium, and program product.

According to a first aspect of the present disclosure, there is provided a data prediction method, the method comprising:

acquiring a first historical data sequence;

decomposing the first historical data sequence to obtain a trend characteristic value sequence, a period characteristic value sequence and a residual characteristic value sequence;

training the ARIMA model by using the trend characteristic value sequence to obtain a trained ARIMA model;

and obtaining all predicted values and prediction intervals in the time period to be predicted by using the trained ARIMA model, the periodic characteristic value sequence and the residual characteristic value sequence.

In an embodiment of the present disclosure, the acquiring the first historical data sequence specifically includes:

determining a termination time and a period, the period comprising at least one time segment;

calculating the average value a_i of all data in each time segment within the one period before the termination time;

calculating the average value b_i of all data in the first time segment of each of the M periods before the termination time;

calculating the average of the average values a_i to obtain a continuous average value A₁;

calculating the average of the average values b_i to obtain a same-period average value A₂;

judging whether the same-period average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%];

if the same-period average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%], selecting all data in the first time segment of each of the M periods before the termination time and all data in the N consecutive time segments before the termination time as the first historical data sequence;

if the same-period average value A₂ does not belong to the interval [A₁ - A₁*10%, A₁ + A₁*10%], selecting all data in the first time segment of each of the M periods before the termination time as the first historical data sequence;

wherein M is greater than or equal to 3, N is greater than or equal to 3, a_i represents the average value of all data in the i-th time segment of the period, and b_i represents the average value of the data in the first time segment of the i-th period before the termination time.

In an embodiment of the present disclosure, before the decomposing the first historical data sequence, the method further includes:

when there is missing data in the first history data sequence, acquiring a time point of the missing data for each missing data, acquiring all first history data corresponding to the time point in the first history data sequence, calculating a mean value of all the first history data, and taking the mean value as the missing data;

and acquiring a difference sequence of the first historical data sequence, if the variation amplitude of any first historical data in the first historical data sequence exceeds a preset threshold value, calculating the average value of the previous first historical data and the next first historical data of the first historical data, and replacing the first historical data with the average value.
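The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the function names `fill_missing` and `smooth_spikes` are hypothetical, missing values are assumed to be marked with `None`, "the time point" is assumed to mean the same position within each period, and the "variation amplitude" is assumed to be the absolute first difference.

```python
from statistics import mean


def fill_missing(series, period):
    """Replace each None with the mean of the known values at the same
    position within the period (assumed reading of "the time point")."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            peers = [series[j] for j in range(i % period, len(series), period)
                     if series[j] is not None]
            if peers:  # leave the gap if no peer value exists
                filled[i] = mean(peers)
    return filled


def smooth_spikes(series, threshold):
    """Replace a value whose first difference exceeds the threshold with
    the mean of its previous and next values."""
    smoothed = list(series)
    for i in range(1, len(series) - 1):
        # Compare against the already-smoothed previous value so that a
        # single spike does not also flag the point that follows it.
        if abs(series[i] - smoothed[i - 1]) > threshold:
            smoothed[i] = (smoothed[i - 1] + series[i + 1]) / 2
    return smoothed
```

Comparing each value against the already-smoothed predecessor is a design choice of this sketch; the text does not say which values the difference sequence is computed from after a replacement.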

In an embodiment of the disclosure, the training of the ARIMA model by using the trend eigenvalue sequence to obtain a trained ARIMA model includes:

and inputting the trend characteristic value sequence into the ARIMA model to obtain an autoregressive term number p and a moving average term number q of the ARIMA model, and obtaining the trained ARIMA model according to the autoregressive term number p and the moving average term number q.

In an embodiment of the present disclosure, the obtaining all predicted values and prediction intervals in a time period to be predicted by using the trained ARIMA model, the periodic eigenvalue sequence, and the residual eigenvalue sequence specifically includes:

generating all prediction trend characteristic values in a time period to be predicted by using the trained ARIMA model;

acquiring a time point corresponding to the predicted trend characteristic value for each predicted trend characteristic value, acquiring all cycle characteristic values corresponding to the time point from the cycle characteristic value sequence, calculating a mean value of all the cycle characteristic values corresponding to the time point, and adding the mean value to the predicted trend characteristic value to obtain a predicted value corresponding to the predicted trend characteristic value;

for each predicted value, obtaining a time point corresponding to the predicted value, obtaining all residual eigenvalues corresponding to the time point from the residual eigenvalue sequence, calculating a mean value of all residual eigenvalues corresponding to the time point, adding the predicted value to the mean value to obtain one end point of a predicted interval, and subtracting the mean value from the predicted value to obtain the other end point of the predicted interval.
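The combination of the three components can be sketched as below. This is only an illustration under assumptions: the function name `forecast_with_interval` and its parameters are hypothetical, the component sequences are assumed to be aligned from index 0, "the time point" is assumed to mean the same position within each period, and the trend forecasts are assumed to come from an already trained ARIMA model that is not shown here.

```python
from statistics import mean


def _phase_mean(values, phase, period):
    """Mean of the known values that share the given position in the period."""
    return mean(v for v in values[phase::period] if v is not None)


def forecast_with_interval(trend_forecast, cycle_values, residual_values,
                           period, start_index):
    """Return (predicted, lower, upper) for each forecast time point.

    trend_forecast:  predicted trend values for consecutive time points,
                     the first one at absolute index start_index
    cycle_values:    historical periodic characteristic values
    residual_values: historical residual characteristic values
    """
    results = []
    for k, trend in enumerate(trend_forecast):
        phase = (start_index + k) % period
        predicted = trend + _phase_mean(cycle_values, phase, period)
        spread = _phase_mean(residual_values, phase, period)
        # The text gives the endpoints as predicted + mean and predicted - mean;
        # sorting keeps lower <= upper even when the mean residual is negative.
        lower, upper = sorted((predicted - spread, predicted + spread))
        results.append((predicted, lower, upper))
    return results
```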

In an embodiment of the present disclosure, the method further includes:

acquiring all second historical data in a second preset time period to obtain a second historical data sequence;

acquiring all predicted values and prediction intervals in the second preset time period;

calculating a difference value between each second historical data in the second historical data sequence and the corresponding predicted value to obtain a predicted difference value;

calculating the average value of the prediction difference values corresponding to all the second historical data in the second historical data sequence to obtain a corrected average value;

adding the corrected average value to the two endpoint values of the prediction interval in the time period to be predicted to obtain a corrected prediction interval;

and adding the correction average value to each predicted value in the time period to be predicted to obtain a corrected predicted value.
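The correction step above amounts to shifting every forecast and both interval endpoints by the mean error observed over the second preset time period. A minimal sketch, with hypothetical function names `correction_mean` and `apply_correction` and the difference assumed to be taken as actual minus predicted:

```python
from statistics import mean


def correction_mean(actual, predicted):
    """Mean of (second historical data - corresponding predicted value)."""
    return mean([a - p for a, p in zip(actual, predicted)])


def apply_correction(predictions, intervals, correction):
    """Add the corrected average value to each predicted value and to both
    endpoints of each prediction interval."""
    corrected_predictions = [p + correction for p in predictions]
    corrected_intervals = [(low + correction, high + correction)
                           for low, high in intervals]
    return corrected_predictions, corrected_intervals
```

For example, if the model under-predicted by 1 on average over the second preset time period, every new predicted value and interval endpoint is raised by 1.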

A second aspect of the present disclosure provides a data prediction apparatus, including:

the acquisition module is used for acquiring a first historical data sequence;

the decomposition module is used for decomposing the first historical data sequence to obtain a trend characteristic value sequence, a period characteristic value sequence and a residual characteristic value sequence;

the training module is used for training the ARIMA model by utilizing the trend characteristic value sequence to obtain the trained ARIMA model;

and the first generation module is used for obtaining all predicted values and prediction intervals in a time period to be predicted by utilizing the trained ARIMA model, the periodic characteristic value sequence and the residual characteristic value sequence.

In an embodiment of the present disclosure, the apparatus further includes:

the first acquisition module is used for acquiring all second historical data in a second preset time period to obtain a second historical data sequence;

the second obtaining module is used for obtaining all predicted values and prediction intervals in the second preset time period;

a first calculating module, configured to calculate, for each second historical data in the second historical data sequence, a difference between the second historical data and a corresponding predicted value to obtain a predicted difference;

the second calculation module is used for calculating the average value of the prediction difference values corresponding to all the second historical data in the second historical data sequence to obtain a corrected average value;

the first correction module is used for adding the corrected average value to two endpoint values of the prediction interval in the time period to be predicted to obtain a corrected prediction interval;

and the second correction module is used for adding the correction average value to the predicted value to obtain a corrected predicted value aiming at each predicted value in the time period to be predicted.

A third aspect of the present disclosure provides an electronic device, comprising:

one or more processors;

a memory for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-described data prediction method.

The fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described data prediction method.

A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above-described data prediction method.

The data prediction method provided by the present disclosure includes: acquiring a first historical data sequence; decomposing the first historical data sequence to obtain a trend characteristic value sequence, a period characteristic value sequence and a residual characteristic value sequence; training the ARIMA model by using the trend characteristic value sequence to obtain the trained ARIMA model; and obtaining all predicted values and prediction intervals in the time period to be predicted by utilizing the trained ARIMA model, the periodic characteristic value sequence and the residual characteristic value sequence. The method includes the steps of obtaining a trend characteristic value sequence, a periodic characteristic value sequence and a residual characteristic value sequence by decomposing historical data, training an ARIMA model by using the trend characteristic value sequence, obtaining a predicted value and a predicted interval of a time period to be predicted by using the trained ARIMA model and combining the periodic characteristic value sequence and the residual characteristic value sequence, and effectively improving accuracy of data prediction.

Drawings

To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 schematically illustrates an application scenario of a data prediction method according to an embodiment of the present disclosure;

Fig. 2A schematically illustrates a flowchart of a data prediction method provided by an embodiment of the present disclosure;

Fig. 2B schematically illustrates a flowchart of a method for obtaining a first historical data sequence according to an embodiment of the present disclosure;

Fig. 2C schematically illustrates a flowchart of a method for obtaining all predicted values and prediction intervals in a time period to be predicted according to an embodiment of the present disclosure;

Fig. 3 schematically illustrates a flowchart of another data prediction method provided by an embodiment of the present disclosure;

Fig. 4 schematically illustrates a flowchart of another data prediction method provided by an embodiment of the present disclosure;

Fig. 5 schematically illustrates a block diagram of a data prediction apparatus provided in an embodiment of the present disclosure;

Fig. 6 schematically illustrates a block diagram of an obtaining module of a data prediction apparatus according to an embodiment of the present disclosure;

Fig. 7 schematically illustrates a block diagram of a first generation module of a data prediction apparatus according to an embodiment of the present disclosure;

Fig. 8 schematically illustrates a block diagram of another data prediction apparatus provided in an embodiment of the present disclosure;

Fig. 9 schematically illustrates a block diagram of another data prediction apparatus provided in an embodiment of the present disclosure; and

Fig. 10 schematically illustrates a block diagram of an electronic device suitable for implementing a data prediction method in accordance with an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" should be understood to include the possibility of "A" or "B", or "A and B".

Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable medium having instructions stored thereon for use by or in connection with an instruction execution system. In the context of this disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, the computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer readable medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.

The present disclosure provides a data prediction method, comprising: acquiring a first historical data sequence; decomposing the first historical data sequence to obtain a trend characteristic value sequence, a period characteristic value sequence and a residual characteristic value sequence; training the ARIMA model by using the trend characteristic value sequence to obtain the trained ARIMA model; and obtaining all predicted values and prediction intervals in the time period to be predicted by utilizing the trained ARIMA model, the periodic characteristic value sequence and the residual characteristic value sequence. The method includes the steps that a trend characteristic value sequence, a periodic characteristic value sequence and a residual characteristic value sequence are obtained by decomposing a historical data sequence, an ARIMA model is trained by the trend characteristic value sequence, a predicted value and a prediction interval of a time period to be predicted are obtained by the trained ARIMA model in combination with the periodic characteristic value sequence and the residual characteristic value sequence, and accuracy of data prediction can be effectively improved.

The present disclosure provides a data prediction method, apparatus, electronic device, storage medium, and program product. The following description is made by way of example with reference to the accompanying drawings. It should be noted that the sequence numbers of the respective operations in the following methods are merely used as representations of the operations for description, and should not be construed as representing the execution order of the respective operations. The method need not be performed in the exact order shown, unless explicitly stated.

It should be noted that the data prediction method, the data prediction apparatus, the electronic device, the storage medium, and the program product provided by the present disclosure may be used in the financial field, and may also be used in any field other than the financial field.

In addition, in the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the related data all comply with the relevant laws and regulations, necessary confidentiality measures are taken, and public order and good customs are not violated.

In the technical scheme of the disclosure, before the data is acquired or collected, the authorization or the consent of the data owner is acquired.

Fig. 1 schematically shows an application scenario of a data prediction method according to an embodiment of the present disclosure. As shown in Fig. 1, an application scenario 100 according to this embodiment may comprise terminal devices 101, 102, 103, a network 104 and a server/server cluster 105. Network 104 is used to provide a medium of communication links between terminal devices 101, 102, 103 and server/server cluster 105. Network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server/server cluster 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various client applications installed thereon, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only).

The terminal devices 101, 102, 103 may interact with the server/server cluster 105 through various client applications to send various requests to the server/server cluster 105 or to receive results returned by the server/server cluster 105.

The terminal devices 101, 102, 103 may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop computers, desktop computers, and the like.

The server/server cluster 105 may be a server that provides various services, such as a background management server (for example only) that provides support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and otherwise process received data such as user requests, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.

It should be noted that the data prediction method provided by the embodiments of the present disclosure may generally be executed by the server/server cluster 105. Accordingly, the data prediction apparatus provided by the embodiments of the present disclosure may generally be disposed in the server/server cluster 105. The data prediction method provided by the embodiments of the present disclosure may also be executed by a server or a server cluster that is different from the server/server cluster 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server/server cluster 105. Correspondingly, the data prediction apparatus provided in the embodiments of the present disclosure may also be disposed in a server or a server cluster that is different from the server/server cluster 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server/server cluster 105.

It should be understood that the numbers of terminal devices, networks, and servers/server clusters in Fig. 1 are illustrative only. There may be any number of terminal devices, networks, and servers/server clusters, as desired.

A data prediction method of the disclosed embodiment will be described in detail below with reference to Figs. 2A to 4, based on the scenario described in Fig. 1, so that those skilled in the art can more clearly understand the technical solution of the present disclosure. It should be understood that the following description is only exemplary to assist those skilled in the art in understanding the aspects of the present disclosure, and is not intended to limit the scope of the present disclosure.

Fig. 2A schematically illustrates a flow chart of a data prediction method according to an embodiment of the present disclosure.

As shown in fig. 2A, in an embodiment of the present disclosure, the method includes operations S210 to S240.

In operation S210, a first historical data sequence is acquired.

Fig. 2B schematically illustrates a flowchart of a method for acquiring a first historical data sequence according to an embodiment of the present disclosure. As shown in fig. 2B, in an embodiment of the present disclosure, the operation S210 includes operations S211 to S218.

In operation S211, a termination time and a period are determined, the period including at least one time period.

In operation S212, the average value a_i of all data in each time period within the one period before the termination time is calculated, where a_i represents the average value of all data in the i-th time period of that period.

In operation S213, the average value b_i of all data in the first time period of each of the M periods before the termination time is calculated, where M is greater than or equal to 3 and b_i represents the average value of the data in the first time period of the i-th period before the termination time.

In operation S214, the average of the average values a_i is calculated to obtain a continuous average value A₁.

In operation S215, the average of the average values b_i is calculated to obtain a same-period average value A₂.

In operation S216, it is judged whether the same-period average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%].

In operation S217, if the same-period average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%], all data in the first time period of each of the M periods before the termination time and all data in the N consecutive time periods before the termination time are selected as the first historical data sequence, where N is greater than or equal to 3.

In operation S218, if the same-period average value A₂ does not belong to the interval [A₁ - A₁*10%, A₁ + A₁*10%], all data in the first time period of each of the M periods before the termination time are selected as the first historical data sequence.

The termination time may be a day or an hour, such as 2021-12-15, or 13. The period refers to the cycle of the historical data; for example, one set of historical data may have a period of 7 days, and another a period of 10 hours. The time period refers to the unit of time within the period; for example, when the period is 7 days the time period is 1 day, and when the period is 10 hours the time period is 1 hour. In this embodiment, assume that the termination time is 2021-08-02 (a Monday), the period is 7 days, and the time period is 1 day. Operation S212 then calculates the average value of all data for each of the 7 days before the termination time (2021-07-26 to 2021-08-01), giving a₁, a₂, a₃, a₄, a₅, a₆, a₇. Assuming that M is 4, operation S213 calculates the average value of all data in the first day of each of the 4 periods before the termination time (2021-07-05, 2021-07-12, 2021-07-19, 2021-07-26), giving b₁, b₂, b₃, b₄. The continuous average value A₁ in operation S214 is the average of a₁, a₂, a₃, a₄, a₅, a₆, a₇, and the same-period average value A₂ in operation S215 is the average of b₁, b₂, b₃, b₄. If A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%], the periodicity of the data is weak, and all data in the first day of each of the 4 periods before the termination time, together with all data in the 3 consecutive days before the termination time, are selected as the first historical data sequence. If A₂ does not belong to the interval [A₁ - A₁*10%, A₁ + A₁*10%], the periodicity of the data is strong, and all data in the first day of each of the 4 periods before the termination time are selected as the first historical data sequence. Selecting data according to how the data differ better preserves the characteristics of the acquired data, so that the ARIMA model trained on the acquired data predicts more accurately and the accuracy of data prediction is improved.

It should be understood that the illustrations of the termination time, the period, the time period, M, N, and the like in the present embodiment are only exemplary to help those skilled in the art understand the technical solutions of the present disclosure, and are not intended to limit the scope of the present disclosure. The termination time, period, time period, M and N, etc. may be selected according to actual needs.
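Operations S212 to S218 can be sketched as follows. This is a minimal illustration under assumptions: the function name `select_first_history` and its parameter names are hypothetical, each time period's data is assumed to arrive as a list of numbers, and the order in which the two groups of data are concatenated is a choice of this sketch, since the text does not specify it.

```python
from statistics import mean


def select_first_history(last_period_segments, first_segments_of_m_periods,
                         last_n_segments, tolerance=0.10):
    """Choose the first historical data sequence.

    last_period_segments:        data of each time period in the one period
                                 before the termination time (gives a_i)
    first_segments_of_m_periods: data of the first time period of each of
                                 the M periods before it (gives b_i)
    last_n_segments:             data of the N consecutive time periods
                                 before the termination time
    """
    a = [mean(segment) for segment in last_period_segments]         # S212
    b = [mean(segment) for segment in first_segments_of_m_periods]  # S213
    a1 = mean(a)                                                    # S214
    a2 = mean(b)                                                    # S215
    # S216: interval [A1 - A1*10%, A1 + A1*10%]; sorted() keeps the bounds
    # ordered if A1 happens to be negative.
    low, high = sorted((a1 - a1 * tolerance, a1 + a1 * tolerance))
    same_period = [x for segment in first_segments_of_m_periods for x in segment]
    if low <= a2 <= high:                                           # S217
        consecutive = [x for segment in last_n_segments for x in segment]
        return same_period + consecutive
    return same_period                                              # S218
```

With the example above, `last_period_segments` would hold the 7 daily lists for 2021-07-26 to 2021-08-01, `first_segments_of_m_periods` the 4 Monday lists, and `last_n_segments` the lists for the 3 days before 2021-08-02.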

In operation S220, the first historical data sequence is decomposed to obtain a trend eigenvalue sequence, a periodic eigenvalue sequence, and a residual eigenvalue sequence.
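This passage does not name a decomposition algorithm. Purely as an illustration, the sketch below assumes a classical additive decomposition (centered moving average for the trend, per-position means for the periodic component, and the remainder as the residual). The function name `decompose_additive` is hypothetical, and another method could be substituted.

```python
from statistics import mean


def decompose_additive(series, period):
    """Split series into (trend, periodic, residual) lists of equal length.

    Trend and residual are None at the edges where the centered moving
    average is undefined. Assumes len(series) >= 2 * period.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2:  # odd period: plain centered mean
            trend[t] = mean(window)
        else:           # even period: half weight on the two end points
            trend[t] = (0.5 * window[0] + sum(window[1:-1])
                        + 0.5 * window[-1]) / period
    # Average the detrended values at each position within the period.
    position_means = []
    for position in range(period):
        values = [series[t] - trend[t] for t in range(position, n, period)
                  if trend[t] is not None]
        position_means.append(mean(values))
    offset = mean(position_means)  # center the periodic component on zero
    periodic = [position_means[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None
                else series[t] - trend[t] - periodic[t] for t in range(n)]
    return trend, periodic, residual
```

The three returned lists add back up to the original series wherever the trend is defined, which is the property the later recombination step relies on.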

In operation S230, the ARIMA model is trained by using the trend eigenvalue sequence, so as to obtain an ARIMA model after training.

ARIMA stands for Autoregressive Integrated Moving Average model, which is one of the methods of time series prediction analysis. In ARIMA(p, d, q), AR denotes autoregression and p is the number of autoregressive terms; MA denotes moving average and q is the number of moving average terms; d is the number of differencing operations needed to make the time series stationary. In this embodiment, the trend characteristic value sequence is input into the ARIMA model to obtain the number of autoregressive terms p and the number of moving average terms q of the ARIMA model, and the trained ARIMA model is obtained according to p and q.
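
To make the roles of d and p concrete, the sketch below implements only the special case ARIMA(1, d, 0): the series is differenced d times, a single autoregressive coefficient is fitted by least squares, and the forecasts are integrated back to the original scale. It is a deliberately reduced illustration, not the fitting procedure of the disclosure (which does not describe one here); it has no moving average part (q = 0), and the function names are hypothetical. A practical system would more likely rely on an existing time series library.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; needs at least 2 points."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    phi = cov / var if var else 0.0  # constant series: no autoregressive signal
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_arima_1_d_0(series, d, steps):
    """Forecast `steps` values with an ARIMA(1, d, 0) model (sketch)."""
    # levels[k] is the series differenced k times.
    levels = [list(series)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    c, phi = fit_ar1(levels[-1])
    # last[k] tracks the most recent value of the k-times differenced series.
    last = [lvl[-1] for lvl in levels]
    out = []
    for _ in range(steps):
        last[d] = c + phi * last[d]        # one AR(1) step on the differenced scale
        for k in range(d - 1, -1, -1):     # undo the differencing level by level
            last[k] = last[k] + last[k + 1]
        out.append(last[0])
    return out


print(forecast_arima_1_d_0([3, 5, 7, 9, 11, 13], d=1, steps=3))
```

With d = 1 the straight-line trend becomes a constant difference of 2, so the forecast simply continues the line.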

In operation S240, all predicted values and prediction intervals in the time period to be predicted are obtained by using the trained ARIMA model, the periodic characteristic value sequence, and the residual characteristic value sequence.

Fig. 2C schematically illustrates a flowchart of a method for obtaining all predicted values and prediction intervals in a time period to be predicted according to an embodiment of the present disclosure. As shown in fig. 2C, in an embodiment of the present disclosure, the operation S240 includes operations S241 to S243.

In operation S241, all the feature values of the predicted trend in the time period to be predicted are generated by using the trained ARIMA model.

To obtain the predicted values and prediction intervals, the time period to be predicted is input into the trained ARIMA model; for example, the time period to be predicted may be the 10 minutes after the current time point. After receiving the time period to be predicted, the trained ARIMA model generates all predicted trend characteristic values within that time period.

In operation S242, for each predicted trend feature value, a time point corresponding to the predicted trend feature value is obtained, all the cycle feature values corresponding to the time point are obtained from the cycle feature value sequence, a mean value of all the cycle feature values corresponding to the time point is calculated, and the mean value is added to the predicted trend feature value to obtain a predicted value corresponding to the predicted trend feature value.

There may be multiple predicted trend characteristic values in the time period to be predicted, and the time point corresponding to each of them needs to be determined. For example, the time period to be predicted is the 10 minutes after the current time point. Assuming that the current time point is 22:30, the time period to be predicted is 22:30 to 22:40. Assume further that there are 3 predicted trend characteristic values d, e and f in this time period, and that the time point corresponding to d is 22:33. To obtain the predicted value corresponding to d, all periodic characteristic values whose time point is 22:33 are extracted from the periodic characteristic value sequence, their average value is calculated, and d is added to that average value. The predicted values corresponding to e and f are calculated in the same way.
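
The calculation of operation S242 can be sketched as follows. The data layout is an assumption of the example: the periodic characteristic value sequence is represented as a list of (time point, value) pairs keyed by time of day, and the predicted trend characteristic values as a dictionary keyed by the same time points. The function names and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_by_time_point(sequence):
    """Average all values that share the same time point, e.g. "22:33"."""
    groups = defaultdict(list)
    for time_point, value in sequence:
        groups[time_point].append(value)
    return {tp: mean(values) for tp, values in groups.items()}


def predicted_values(predicted_trend, periodic_sequence):
    """S242: predicted value = predicted trend value + mean periodic value."""
    periodic_mean = mean_by_time_point(periodic_sequence)
    return {tp: trend + periodic_mean[tp] for tp, trend in predicted_trend.items()}


# Periodic characteristic values observed at 22:33 on two earlier days,
# plus one value for another time point (illustrative numbers).
periodic_sequence = [("22:33", 2.0), ("22:36", -1.0), ("22:33", 4.0)]
predicted_trend = {"22:33": 50.0}  # the value "d" of the example above
print(predicted_values(predicted_trend, periodic_sequence))
```

Here the two periodic characteristic values at 22:33 average to 3.0, so the predicted value for 22:33 is 53.0.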

In operation S243, for each predicted value, a time point corresponding to the predicted value is obtained, all residual characteristic values corresponding to the time point are obtained from the residual characteristic value sequence, a mean value of all residual characteristic values corresponding to the time point is calculated, one end point of the prediction interval is obtained by adding the mean value to the predicted value, and the other end point of the prediction interval is obtained by subtracting the mean value from the predicted value.

After the predicted values corresponding to all predicted trend characteristic values within the time period to be predicted have been calculated, the time point corresponding to each predicted value needs to be determined, that is, the time point of the predicted trend characteristic value from which the predicted value was obtained. Taking the predicted trend characteristic value d as an example, the time point corresponding to its predicted value is 22:33. All residual characteristic values whose time point is 22:33 are extracted from the residual characteristic value sequence, and their average value is calculated. Adding this average value to the predicted value gives one end point of the prediction interval, and subtracting it from the predicted value gives the other end point; the two end points determine the prediction interval. Compared with generating the predicted value and the prediction interval using only the trained ARIMA model, combining the trained ARIMA model with the periodic characteristic value sequence and the residual characteristic value sequence of the historical data makes the obtained predicted value and prediction interval more accurate and effectively improves the accuracy of the prediction result.
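
Operation S243 can be sketched in the same style. The list of (time point, value) pairs standing in for the residual characteristic value sequence, the function name and the numbers are assumptions of the example. The text speaks only of "one end point" and "the other end point"; the sketch additionally sorts the two end points so that the interval stays ordered when the mean residual is negative, which is an assumption beyond the text.

```python
from statistics import mean


def prediction_interval(time_point, predicted_value, residual_sequence):
    """S243: interval end points are predicted value +/- mean residual."""
    matching = [value for tp, value in residual_sequence if tp == time_point]
    mean_residual = mean(matching)
    # Sort so the lower end point comes first even for a negative mean residual.
    return tuple(sorted((predicted_value - mean_residual,
                         predicted_value + mean_residual)))


residual_sequence = [("22:33", 1.5), ("22:36", 9.0), ("22:33", 2.5)]
print(prediction_interval("22:33", 53.0, residual_sequence))
```

With a mean residual of 2.0 at 22:33 and a predicted value of 53.0, the prediction interval is (51.0, 55.0).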

It should be understood that the illustration of the time period to be predicted and the like in the present embodiment is only an example to help those skilled in the art understand the technical solution of the present disclosure, and is not intended to limit the protection scope of the present disclosure. The time period to be predicted and the like may be selected according to actual needs.

Fig. 3 schematically illustrates a flow chart of another data prediction method provided by an embodiment of the present disclosure.

As shown in fig. 3, in the present embodiment, the method includes operations S301 to S310. Operations S301 to S304 are implemented in the same manner as operations S210 to S240, respectively, and the repeated parts are not described again.

In operation S305, all the second history data in the second preset time period are acquired, and a second history data sequence is obtained.

In operation S306, all the prediction values and the prediction intervals within the second preset time period are obtained.

In operation S307, for each second history data in the second history data sequence, a difference between the second history data and the corresponding predicted value is calculated, so as to obtain a predicted difference.

In operation S308, an average of the predicted differences corresponding to all the second history data in the second history data sequence is calculated to obtain a corrected average.

In operation S309, the corrected average value is added to both end points of the prediction interval in the time period to be predicted, so as to obtain a corrected prediction interval.

In operation S310, for each predicted value in the time period to be predicted, the corrected average value is added to the predicted value to obtain a corrected predicted value.

All predicted values and prediction intervals in a given time period can be obtained using the trained ARIMA model, and the predicted values and prediction intervals in the time period to be predicted can be corrected using the real values, predicted values and prediction intervals of that given time period. In this embodiment, all the second historical data (i.e. real values) and all the predicted values and prediction intervals in the second preset time period are acquired first. The difference between each second historical data and its corresponding predicted value in the second preset time period is then calculated; this difference is the prediction difference. The average of all prediction differences is then calculated to obtain the corrected average value. For example, the second preset time period is the 10 minutes before the current time point, three second historical data X, Y and Z are collected in total, and the trained ARIMA model gives the predicted values x, y and z for the 10 minutes before the current time point. The prediction differences for the second historical data X, Y and Z are then X - x, Y - y and Z - z respectively, and the corrected average value is ((X - x) + (Y - y) + (Z - z))/3. The corrected average value may also be obtained as the average of the difference between the sum of all second historical data and the sum of all predicted values, for example ((X + Y + Z) - (x + y + z))/3.
After the corrected average value is calculated, the predicted values and the prediction interval in the time period to be predicted can be corrected according to it. The time period to be predicted in this embodiment refers to a period of time after the current time point. The corrected prediction interval is obtained by adding the corrected average value to both end points of the prediction interval in the time period to be predicted, and the corrected predicted value is obtained by adding the corrected average value to each predicted value in the time period to be predicted. For example, assuming that the current time point is 10:30, the corrected average value for the 10 minutes before the current time point is 0.8, the prediction interval for the 10 minutes after the current time point is [10, 35], and the predicted value is 20, the corrected prediction interval is [10.8, 35.8] and the corrected predicted value is 20.8. To make the prediction result more accurate, the corrected average value may be generated periodically and used to correct the predicted values and the prediction interval in the time period to be predicted; for example, the correction method is executed every 10 minutes after the historical data is acquired, which ensures the timeliness of the correction. Because the predicted values and the prediction interval are corrected according to real values, the finally obtained predicted values and prediction interval are closer to the real values, and the data prediction result is more accurate.
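
The correction of operations S307 to S310 amounts to a simple bias adjustment, sketched below. The function names and the numbers are hypothetical and were chosen so that the arithmetic is exact; they are not taken from the disclosure.

```python
from statistics import mean


def corrected_average(actual, predicted):
    """S307-S308: mean of (real value - predicted value) over the window."""
    return mean([a - p for a, p in zip(actual, predicted)])


def apply_correction(predicted_value, interval, correction):
    """S309-S310: shift the predicted value and both interval end points."""
    low, high = interval
    return predicted_value + correction, (low + correction, high + correction)


actual = [21.0, 19.5, 22.0]     # second historical data X, Y, Z
predicted = [20.0, 19.0, 22.0]  # predicted values x, y, z
correction = corrected_average(actual, predicted)
# Equivalent form: (sum of real values - sum of predicted values) / count.
alternative = (sum(actual) - sum(predicted)) / len(actual)
print(correction, alternative)
print(apply_correction(20.0, (10.0, 35.0), correction))
```

Both ways of computing the corrected average value give 0.5 here, and the prediction interval [10, 35] with predicted value 20 becomes [10.5, 35.5] with predicted value 20.5.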

It should be understood that the illustration of the second preset time period and the like in the present embodiment is only an example to help those skilled in the art understand the technical solution of the present disclosure, and is not intended to limit the protection scope of the present disclosure. The second preset time period can be set according to actual needs.

Fig. 4 schematically illustrates a flow chart of another data prediction method provided by an embodiment of the present disclosure.

As shown in fig. 4, in the present embodiment, the method includes operations S410 to S470. Operation S410 and operations S450 to S470 are implemented in the same manner as operations S210 to S240, respectively, and the repeated parts are not described again.

In operation S420, it is determined whether there is missing data in the first history data sequence.

In operation S430, when there is missing data in the first history data sequence, for each missing data, a time point of the missing data is obtained, all first history data corresponding to the time point in the first history data sequence are obtained, a mean value of all the first history data is calculated, and the mean value is used as the missing data.

In order to ensure the accuracy of the prediction result, the first historical data in the first historical data sequence used for training the ARIMA model needs to be continuous, that is, no data may be missing from the first historical data sequence. When the first historical data sequence is acquired, all the first historical data in it are first checked to determine whether any data is missing. When data is missing, the time point corresponding to the missing first historical data is determined first, and the missing data is then filled according to that time point. For example, if the time point of the missing data is 13:23, all first historical data whose time point is 13:23 are acquired from the first historical data sequence. Assuming that the first historical data sequence contains three first historical data H, J and K whose time point is 13:23, the average of H, J and K is calculated and filled in as the missing data.
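
The filling rule of operation S430 can be sketched as follows. The representation is an assumption of the example: the first historical data sequence is a list of (time point, value) pairs keyed by time of day, with `None` marking a missing value. The function name and the numbers are hypothetical, and the sketch assumes that at least one non-missing value exists for every time point that needs filling.

```python
from statistics import mean


def fill_missing(sequence):
    """S430: replace each missing value by the mean of the values at the same time point."""
    filled = []
    for time_point, value in sequence:
        if value is None:
            same_time = [v for tp, v in sequence
                         if tp == time_point and v is not None]
            value = mean(same_time)  # raises StatisticsError if nothing to average
        filled.append((time_point, value))
    return filled


sequence = [
    ("13:23", 4.0),   # H
    ("13:24", 9.0),
    ("13:23", 6.0),   # J
    ("13:23", None),  # the missing data
    ("13:23", 8.0),   # K
]
print(fill_missing(sequence)[3])
```

The missing 13:23 entry is filled with the mean of H, J and K, which is 6.0 for these numbers; the other entries are left unchanged.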

In operation S440, a difference sequence of the first historical data sequence is obtained, and if the variation amplitude of any first historical data in the first historical data sequence exceeds a preset threshold, the mean value of the previous first historical data and the next first historical data is calculated, and that first historical data is replaced by the mean value.

In order to further improve the accuracy of the prediction result, it is necessary not only to make the first historical data in the first historical data sequence continuous, but also to remove abnormal data from the first historical data sequence. Specifically, for example, when the variation amplitude of a certain first historical data in the first historical data sequence is greater than the preset threshold, that first historical data is determined to be abnormal. Using it for prediction may reduce the accuracy of the prediction result, so it is replaced by the mean value of the previous and the next first historical data.
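
A minimal sketch of the replacement in operation S440 is given below. The text does not define "variation amplitude" precisely; the sketch assumes it is the absolute first difference from the preceding (already cleaned) value, computed on the fly rather than stored as a separate difference sequence. The function name, the threshold and the numbers are hypothetical. The first and last points are left untouched because they lack one of the two neighbours.

```python
def replace_outliers(series, threshold):
    """S440 (sketch): replace a value whose jump from its predecessor exceeds
    `threshold` by the mean of its previous and next values."""
    cleaned = list(series)
    for i in range(1, len(cleaned) - 1):
        # Assumed definition of variation amplitude: absolute first difference.
        if abs(cleaned[i] - cleaned[i - 1]) > threshold:
            cleaned[i] = (cleaned[i - 1] + series[i + 1]) / 2
    return cleaned


print(replace_outliers([10, 11, 50, 12, 13], threshold=5))
```

The spike of 50 is replaced by 11.5, the mean of its neighbours 11 and 12. Because the comparison uses the cleaned predecessor, the drop back from the spike is not flagged as a second anomaly.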

It should be understood that the illustration of the filling method of missing data, the culling method of abnormal data, and the like in this embodiment is only an example to help those skilled in the art understand the technical solution of the present disclosure, and is not intended to limit the protection scope of the present disclosure.

Based on the data prediction method, the disclosure also provides a data prediction device. The apparatus will be described in detail below with reference to fig. 5 to 9.

Fig. 5 schematically shows a block diagram of a data prediction apparatus according to an embodiment of the present disclosure.

As shown in fig. 5, in an embodiment of the present disclosure, the apparatus 500 includes: an obtaining module 510, a decomposition module 520, a training module 530, and a first generating module 540.

An obtaining module 510 is configured to obtain a first historical data sequence. In an embodiment, the obtaining module 510 may be configured to perform the operation S210 described above, which is not described herein again.

A decomposition module 520, configured to decompose the first historical data sequence to obtain a trend characteristic value sequence, a periodic characteristic value sequence, and a residual characteristic value sequence. In an embodiment, the decomposition module 520 may be configured to perform the operation S220 described above, which is not described herein again.

A training module 530, configured to train the ARIMA model by using the trend characteristic value sequence to obtain the trained ARIMA model. In an embodiment, the training module 530 may be configured to perform the operation S230 described above, which is not described herein again.

A first generating module 540, configured to obtain all predicted values and prediction intervals in the time period to be predicted by using the trained ARIMA model, the periodic characteristic value sequence, and the residual characteristic value sequence. In an embodiment, the first generating module 540 may be configured to perform the operation S240 described above, which is not described herein again.

Fig. 6 schematically shows a block diagram of an obtaining module of a data prediction apparatus according to an embodiment of the present disclosure.

As shown in fig. 6, in this embodiment, the obtaining module 510 includes: a determination module 511, a third calculation module 512, a fourth calculation module 513, a fifth calculation module 514, a sixth calculation module 515, a judgment module 516, a third generation module 517, and a fourth generation module 518.

The determining module 511 is configured to determine the termination time and the period, where the period includes at least one time period. In an embodiment, the determining module 511 may be configured to perform the operation S211 described above, which is not described herein again.

A third calculating module 512, configured to calculate the average value aᵢ of all data in each time period within one period before the termination time, where aᵢ represents the average value of all data in the i-th time period of the period. In an embodiment, the third calculating module 512 may be configured to perform the operation S212 described above, which is not described herein again.

A fourth calculating module 513, configured to calculate the average value bᵢ of all data in the first time period of each of the M periods before the termination time, where M is not less than 3 and bᵢ represents the average value of the data in the first time period of the i-th period before the termination time. In an embodiment, the fourth calculating module 513 may be configured to perform the operation S213 described above, which is not described herein again.

A fifth calculating module 514, configured to calculate the mean of the average values aᵢ of all data in each time period within one period before the termination time, to obtain a continuous average value A₁. In an embodiment, the fifth calculating module 514 may be configured to perform the operation S214 described above, which is not described herein again.

A sixth calculating module 515, configured to calculate the mean of the average values bᵢ of all data in the first time period of each of the M periods before the termination time, to obtain the same average value A₂. In an embodiment, the sixth calculating module 515 may be configured to perform the operation S215 described above, which is not described herein again.

A judging module 516, configured to judge whether the same average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%]. In an embodiment, the judging module 516 may be configured to perform the operation S216 described above, which is not described herein again.

A third generating module 517, configured to, when the same average value A₂ belongs to the interval [A₁ - A₁*10%, A₁ + A₁*10%], select all data in the first time period of each of the M periods before the termination time and all data in the N consecutive time periods before the termination time as the first historical data sequence, where N is not less than 3. In an embodiment, the third generating module 517 may be configured to perform the operation S217 described above, which is not described herein again.

A fourth generating module 518, configured to, when the same average value A₂ does not belong to the interval [A₁ - A₁*10%, A₁ + A₁*10%], select all data in the first time period of each of the M periods before the termination time as the first historical data sequence. In an embodiment, the fourth generating module 518 may be configured to perform the operation S218 described above, which is not described herein again.

Fig. 7 schematically shows a block diagram of a first generation module of a data prediction apparatus according to an embodiment of the present disclosure.

As shown in fig. 7, in the present embodiment, the first generating module 540 includes: a first generation submodule 541, a first prediction module 542, and a second prediction module 543.

The first generation submodule 541 generates all the predicted trend feature values in the time period to be predicted by using the trained ARIMA model. In an embodiment, the first generation submodule 541 may be configured to perform the operation S241 described above, which is not described herein again.

The first prediction module 542 acquires a time point corresponding to the predicted trend feature value for each predicted trend feature value, acquires all cycle feature values corresponding to the time point from the cycle feature value sequence, calculates a mean value of all cycle feature values corresponding to the time point, and adds the mean value to the predicted trend feature value to obtain a predicted value corresponding to the predicted trend feature value. In an embodiment, the first prediction module 542 may be configured to perform the operation S242 described above, which is not described herein again.

The second prediction module 543 obtains, for each predicted value, a time point corresponding to the predicted value, obtains all residual characteristic values corresponding to the time point from the residual characteristic value sequence, calculates a mean value of all residual characteristic values corresponding to the time point, adds the mean value to the predicted value to obtain one end point of the prediction interval, and subtracts the mean value from the predicted value to obtain the other end point of the prediction interval. In an embodiment, the second prediction module 543 may be configured to perform the operation S243 described above, which is not described herein again.

Fig. 8 schematically shows a block diagram of a structure of another data prediction apparatus provided in an embodiment of the present disclosure.

As shown in fig. 8, in the present embodiment, the apparatus 800 includes: a third obtaining module 810, a second determining module 820, a filling module 830, a replacing module 840, a second decomposing module 850, a second training module 860 and a second generating module 870.

The third obtaining module 810, the second decomposing module 850, the second training module 860 and the second generating module 870 have functions similar to or the same as those of the obtaining module 510, the decomposition module 520, the training module 530 and the first generating module 540, respectively, and the repeated parts are not described again.

A second determining module 820, configured to determine whether there is missing data in the first historical data sequence. In an embodiment, the second determining module 820 may be configured to perform the operation S420 described above, which is not described herein again.

A filling module 830, configured to, when there is missing data in the first historical data sequence, obtain, for each missing data, a time point of the missing data, obtain all first historical data in the first historical data sequence corresponding to the time point, calculate a mean value of all the first historical data, and take the mean value as the missing data. In an embodiment, the padding module 830 may be configured to perform the operation S430 described above, and details are not repeated herein.

A replacing module 840, configured to obtain a difference sequence of the first historical data sequence, and for each historical data in the first historical data sequence, when a variation width of the historical data exceeds a preset threshold, calculate a mean value of a previous historical data and a next historical data of the historical data, and replace the historical data with the mean value. In an embodiment, the replacing module 840 may be configured to perform the operation S440 described above, which is not described herein again.

Fig. 9 schematically shows a block diagram of another data prediction apparatus provided in an embodiment of the present disclosure.

As shown in fig. 9, in the present embodiment, the apparatus 900 includes: a first obtaining module 910, a second obtaining module 920, a first calculating module 930, a second calculating module 940, a first correcting module 950, and a second correcting module 960.

The first obtaining module 910 is configured to obtain all second historical data in a second preset time period, to obtain a second historical data sequence. In an embodiment, the first obtaining module 910 may be configured to perform the operation S305 described above, which is not described herein again.

A second obtaining module 920, configured to obtain all predicted values and prediction intervals within the second preset time period. In an embodiment, the second obtaining module 920 may be configured to perform the operation S306 described above, which is not described herein again.

A first calculating module 930, configured to calculate, for each second historical data in the second historical data sequence, a difference between the second historical data and the corresponding predicted value to obtain a prediction difference. In an embodiment, the first calculating module 930 may be configured to perform the operation S307 described above, which is not described herein again.

A second calculating module 940, configured to calculate an average of the prediction differences corresponding to all the second historical data in the second historical data sequence, so as to obtain a corrected average. In an embodiment, the second calculating module 940 may be configured to perform the operation S308 described above, and is not described herein again.

The first correcting module 950 is configured to add the corrected average value to both end points of the prediction interval in the time period to be predicted to obtain a corrected prediction interval. In an embodiment, the first correcting module 950 may be configured to perform the operation S309 described above, which is not described herein again.

The second correcting module 960 is configured to add, for each predicted value in the time period to be predicted, the corrected average value to the predicted value to obtain a corrected predicted value. In an embodiment, the second correcting module 960 may be configured to perform the operation S310 described above, which is not described herein again.

Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.

For example, any number of the obtaining module 510, the decomposing module 520, the training module 530, and the first generating module 540 may be combined in one module to be implemented, or any one of them may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the disclosure, at least one of the obtaining module 510, the decomposing module 520, the training module 530, and the first generating module 540 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any of them. Alternatively, at least one of the acquisition module 510, the decomposition module 520, the training module 530 and the first generation module 540 may be at least partially implemented as a computer program module, which when executed may perform a corresponding function.

FIG. 10 schematically illustrates a block diagram of an electronic device adapted to implement a method of data prediction according to an embodiment of the present disclosure.

As shown in fig. 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. Processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1001 may also include on-board memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the present disclosure.

In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, ROM 1002, and RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the programs may also be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.

Electronic device 1000 may also include an input/output (I/O) interface 1005, the input/output (I/O) interface 1005 also being connected to bus 1004, according to an embodiment of the present disclosure. Electronic device 1000 may also include one or more of the following components connected to I/O interface 1005: an input portion 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage portion 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is mounted into the storage section 1008 as necessary.

The present disclosure also provides a computer-readable storage medium storing a computer program that, when executed, implements the data prediction method described above. The computer-readable storage medium may be included in the apparatus/device described in the above embodiments, or it may exist separately without being assembled into that apparatus/device. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.

According to embodiments of the present disclosure, a computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1002 and/or the RAM 1003 described above and/or one or more memories other than the ROM 1002 and the RAM 1003.

Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code causes the computer system to implement the data prediction method provided by the embodiments of the disclosure.

The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 1001. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, downloaded and installed via the communication section 1009, and/or installed from the removable medium 1011. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to wireless media, wired media, or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1009 and/or installed from the removable medium 1011. The computer program performs the above-described functions defined in the system of the embodiment of the present disclosure when executed by the processor 1001. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, C, and the like. The program code may execute entirely on the user computing device, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.

Claims

1. A method of data prediction, comprising:

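acquiring a first historical data sequence;

decomposing the first historical data sequence to obtain a trend characteristic value sequence, a period characteristic value sequence, and a residual characteristic value sequence;

training an ARIMA model by using the trend characteristic value sequence to obtain the trained ARIMA model; and

obtaining all predicted values and prediction intervals within a time period to be predicted by using the trained ARIMA model, the period characteristic value sequence, and the residual characteristic value sequence.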
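
By way of illustration only, the following Python sketch shows one way a historical data sequence could be split into the trend, period, and residual characteristic value sequences referred to in the abstract, the trend part being what the ARIMA model is then trained on. The classical additive decomposition used here (a centered moving average for the trend and per-position means for the period component), the function name decompose_additive, and the restriction to an odd period length are all assumptions of this sketch; the claims do not fix a particular decomposition technique.

```python
from statistics import mean


def decompose_additive(series, period):
    """Split `series` into trend, period, and residual sequences.

    Assumption: additive model, series = trend + period + residual,
    with an odd `period` so the moving-average window is symmetric.
    """
    if period < 3 or period % 2 == 0:
        raise ValueError("this sketch assumes an odd period of at least 3")
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    n = len(series)

    # Trend: centered moving average; the edges have no full window.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = mean(series[t - half:t + half + 1])

    # Period component: mean detrended value at each position in the period,
    # shifted so that the pattern averages to zero.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    pattern = [r - offset for r in raw]
    seasonal = [pattern[t % period] for t in range(n)]

    # Residual: whatever trend and period component do not explain.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual
```

The None entries at both ends of the trend and residual sequences mark time points for which the moving-average window is incomplete; a real implementation would need to decide how to handle them before model training.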

2. The data prediction method according to claim 1, wherein acquiring the first historical data sequence specifically comprises:

determining a termination time and a period, the period comprising at least one time segment;

if the average value A₂ falls within the interval [A₁ − A₁ × 10%, A₁ + A₁ × 10%], selecting, as the first historical data sequence, all data in the first time segment of each of the M periods before the termination time together with all data in the N consecutive time segments before the termination time;

if the average value A₂ does not fall within the interval [A₁ − A₁ × 10%, A₁ + A₁ × 10%], selecting, as the first historical data sequence, all data in the first time segment of each of the M periods before the termination time;

wherein M ≥ 3, N ≥ 3, a_i represents the average of all data in the i-th time segment in the period, and b_i represents the average of the data in the first time segment of the i-th period before the termination time.
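
The selection rule of claim 2 may be sketched, under stated assumptions, as follows. The two average values A₁ and A₂ are taken as inputs (a1 and a2) because their derivation from a_i and b_i is not reproduced in the claim text above. The function name select_first_history, the list-of-lists layout of the inputs, and the order in which the two groups of data are concatenated are hypothetical choices made for illustration.

```python
def select_first_history(a1, a2, first_segments, recent_segments, tolerance=0.10):
    """Pick the data that make up the first historical data sequence.

    a1, a2: the two average values compared in claim 2 (inputs here).
    first_segments: M lists, the data of the first time segment of each of
        the M periods before the termination time.
    recent_segments: N lists, the data of the N consecutive time segments
        before the termination time.
    """
    if len(first_segments) < 3 or len(recent_segments) < 3:
        raise ValueError("claim 2 requires M >= 3 and N >= 3")

    # Interval [A1 - A1*10%, A1 + A1*10%]; sorted() keeps the bounds in
    # order even if a1 is negative.
    low, high = sorted((a1 - a1 * tolerance, a1 + a1 * tolerance))

    selected = [x for segment in first_segments for x in segment]
    if low <= a2 <= high:
        # A2 is close to A1: the recent segments are used as well.
        selected += [x for segment in recent_segments for x in segment]
    return selected
```

For example, with a1 = 100 the interval is [90, 110], so a2 = 105 selects both groups of data while a2 = 120 selects only the first time segments of the M periods.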

3. The data prediction method of claim 1, wherein prior to said decomposing the first historical data sequence, the method further comprises:

judging whether missing data exists in the first historical data sequence;

when missing data exists in the first historical data sequence, for each missing data item, acquiring the time point of the missing data item, acquiring all first historical data corresponding to that time point in the first historical data sequence, calculating the average value of all such first historical data, and using the average value in place of the missing data item;

and obtaining a difference sequence of the first historical data sequence, and, if the variation amplitude of any first historical data in the first historical data sequence exceeds a preset threshold value, calculating the mean value of the first historical data immediately preceding it and the first historical data immediately following it, and replacing that first historical data with the mean value.
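
The two preprocessing steps of claim 3 may be sketched as below. Several points are assumptions of this sketch rather than statements of the disclosure: missing data are represented by None; "the first historical data corresponding to the time point" is read as the observed values at the same position in the other periods; the variation amplitude is taken to be the absolute first difference; and points are processed from left to right so that a replaced value is used when checking the next point. The function names are hypothetical.

```python
from statistics import mean


def fill_missing(series, period):
    """Replace each None with the mean of observed values at the same
    position in the other periods (left as None if there are none)."""
    filled = list(series)
    for t, value in enumerate(series):
        if value is None:
            peers = [
                series[k]
                for k in range(t % period, len(series), period)
                if series[k] is not None
            ]
            if peers:
                filled[t] = mean(peers)
    return filled


def smooth_spikes(series, threshold):
    """Replace a point whose change from the previous point exceeds
    `threshold` with the mean of its two neighbours."""
    smoothed = list(series)
    for t in range(1, len(smoothed) - 1):
        # First difference against the (possibly already corrected)
        # previous value.
        if abs(smoothed[t] - smoothed[t - 1]) > threshold:
            smoothed[t] = (smoothed[t - 1] + smoothed[t + 1]) / 2
    return smoothed
```

Checking each point against the already corrected previous value is a design choice: it prevents the point that follows an isolated spike from being flagged merely because the series returns to its normal level.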

4. The data prediction method of claim 1, wherein training the ARIMA model by using the trend characteristic value sequence to obtain the trained ARIMA model comprises:

inputting the trend characteristic value sequence into the ARIMA model to obtain the number of autoregressive terms p and the number of moving-average terms q of the ARIMA model, and obtaining the trained ARIMA model according to the number of autoregressive terms p and the number of moving-average terms q.
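
Claim 4 does not state how p and q are determined. One common approach, assumed here purely for illustration, is a grid search that fits a candidate model for each (p, q) pair and keeps the pair with the lowest value of an information criterion such as AIC. In the sketch below, select_order, the score callable, the search bounds max_p and max_q, and the fixed differencing order d are all hypothetical; score stands in for whatever routine fits an ARIMA model of the given order and returns its criterion value.

```python
from itertools import product


def select_order(trend, score, max_p=3, max_q=3, d=1):
    """Return the (p, q) pair with the lowest score.

    score(trend, (p, d, q)) is a caller-supplied function (an assumption
    of this sketch) that fits a model of that order and returns a number
    where lower is better, for example an AIC value.
    """
    candidates = [
        (p, q)
        for p, q in product(range(max_p + 1), range(max_q + 1))
        if (p, q) != (0, 0)  # skip the model with no AR or MA terms
    ]
    if not candidates:
        raise ValueError("max_p and max_q leave no candidate orders")
    return min(candidates, key=lambda pq: score(trend, (pq[0], d, pq[1])))
```

If the statsmodels package is available, score could for instance be lambda y, order: ARIMA(y, order=order).fit().aic with ARIMA imported from statsmodels.tsa.arima.model; that choice of library is likewise an assumption and is not named in the disclosure.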

5. The data prediction method according to claim 1, wherein obtaining all predicted values and prediction intervals within the time period to be predicted by using the trained ARIMA model, the period characteristic value sequence, and the residual characteristic value sequence specifically comprises:

generating all predicted trend characteristic values within the time period to be predicted by using the trained ARIMA model;

for each predicted trend characteristic value, acquiring the time point corresponding to the predicted trend characteristic value, acquiring all period characteristic values corresponding to that time point from the period characteristic value sequence, calculating the mean value of all period characteristic values corresponding to that time point, and adding the mean value to the predicted trend characteristic value to obtain the predicted value corresponding to the predicted trend characteristic value;

and for each predicted value, acquiring the time point corresponding to the predicted value, acquiring all residual characteristic values corresponding to that time point from the residual characteristic value sequence, calculating the mean value of all residual characteristic values corresponding to that time point, adding the mean value to the predicted value to obtain one end point of the prediction interval, and subtracting the mean value from the predicted value to obtain the other end point of the prediction interval.
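
The combination step of claim 5 may be sketched as follows, assuming the predicted trend characteristic values have already been produced by the trained ARIMA model. The sketch reads "the characteristic values corresponding to the time point" as the historical values at the same position within the period; that reading, the function names, and the start argument (the index of the first time point to be predicted) are assumptions made for illustration.

```python
from statistics import mean


def positional_means(values, period):
    """Mean of the values at each position within the period (None skipped).
    Every position must have at least one value."""
    buckets = [[] for _ in range(period)]
    for t, v in enumerate(values):
        if v is not None:
            buckets[t % period].append(v)
    return [mean(b) for b in buckets]


def build_forecast(predicted_trend, period_values, residual_values, period, start):
    """Return (predicted value, lower end point, upper end point) tuples."""
    period_mean = positional_means(period_values, period)
    residual_mean = positional_means(residual_values, period)
    forecasts = []
    for h, trend_value in enumerate(predicted_trend):
        pos = (start + h) % period
        # Predicted value = predicted trend + mean period characteristic value.
        value = trend_value + period_mean[pos]
        # End points = predicted value plus/minus mean residual value;
        # sorted() orders them even when the mean residual is negative.
        r = residual_mean[pos]
        lower, upper = sorted((value - r, value + r))
        forecasts.append((value, lower, upper))
    return forecasts
```

Because the interval is built from the mean residual at a given position, its half-width is the absolute value of that mean, which is why the two end points are sorted rather than assumed to be in order.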

6. The data prediction method of claim 1, further comprising:

calculating the average value of the prediction difference values corresponding to all the second historical data in the second historical data sequence to obtain a corrected average value; adding the corrected average value to the two endpoint values of the prediction interval in the time period to be predicted to obtain a corrected prediction interval;
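
The correction described in claims 6 and 8 may be sketched as below. The prediction difference is taken here as the second historical data minus the corresponding predicted value; this sign convention, the function name correct_forecast, and the tuple layout (predicted value, lower end point, upper end point) are assumptions of the sketch.

```python
from statistics import mean


def correct_forecast(second_history, past_predictions, forecasts):
    """Shift predicted values and interval end points by the mean
    prediction difference observed on the second historical data."""
    if len(second_history) != len(past_predictions):
        raise ValueError("each second historical datum needs a predicted value")
    # Prediction difference for each second historical datum
    # (assumed sign: actual minus predicted).
    differences = [
        actual - predicted
        for actual, predicted in zip(second_history, past_predictions)
    ]
    corrected_average = mean(differences)
    return [
        (value + corrected_average, low + corrected_average, high + corrected_average)
        for value, low, high in forecasts
    ]
```

Adding the same corrected average value to the predicted value and to both end points shifts the whole interval without changing its width.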

7. A data prediction apparatus, comprising:

the acquisition module is used for acquiring a first historical data sequence;

8. The data prediction apparatus of claim 7, wherein the apparatus further comprises:

the first calculation module is used for calculating a difference value between each second historical data in the second historical data sequence and the corresponding predicted value to obtain a predicted difference value;

and the second correction module is used for adding the corrected average value to each predicted value in the time period to be predicted to obtain a corrected predicted value.

9. An electronic device, characterized in that the electronic device comprises:

one or more processors;

a memory for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform a method of data prediction in accordance with any of claims 1-6.

10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform a method of data prediction according to any one of claims 1 to 6.

11. A computer program product comprising a computer program which, when executed by a processor, implements a data prediction method according to any one of claims 1 to 6.