CN107346464B - Service index prediction method and device - Google Patents
- Publication number
- CN107346464B (CN201610296883.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- prediction
- data
- service index
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0637—Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
- G06Q10/06375—Prediction of business process outcome or impact based on a proposed change
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Educational Administration (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a service index prediction method and a service index prediction device, and belongs to the technical field of data processing. The method comprises the following steps: for a service index, training at least one first model for predicting the service index according to a first historical time sequence of the service index; for each first model, predicting the service index based on the first model and a second historical time sequence to obtain a preliminary prediction result; and carrying out statistical analysis on the preliminary prediction results output by the at least one first model to obtain a final prediction result of the service index. Because multiple models are fused, the change rule and the long-term trend of the index can still be captured even when the service index fluctuates strongly or is affected by seasonal and similar factors, so the service index is predicted more accurately and the prediction effect is good.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for predicting a service index.
Background
A service index is usually represented as a time series generated by a service run by an enterprise, company or other organization. A time series is a data sequence formed by arranging the values of the same statistical index in the order in which they occurred, for example the daily income of a certain game, the number of visitors to a certain website, or the daily income gained by placing a certain advertisement. Time series analysis plays an increasingly important role in daily life and production, and is mainly applied to prediction, such as weather forecasting, market prediction, population prediction, flood prediction and yield prediction. In the internet industry, for example, time series analysis is used to predict service indexes such as website visits, advertisement income and game user visits, so that problems can be found as early as possible, which helps guarantee service stability, a good user experience and the effective use of funds.
The prior art generally predicts a service index in one of the following ways. The first way is the simple moving average method: a time window is shifted item by item along the time series, the average of the data inside the window is calculated at each position, and the service index is predicted from that average. For a time window of length k, the simple moving average is calculated as follows:
$$\hat{V}_n = \frac{1}{k}\sum_{i=1}^{k} V_{n-i}$$
where k, i and n are positive integers, $\hat{V}_n$ denotes the predicted value, and $V_{n-i}$ denotes an actual observed value.
The second way is the weighted moving average method: the data at different times within the same sliding window are given different weights according to how strongly they influence the predicted value, and the weighted data are then averaged to predict a future value. The method treats the data in the sliding window differently because more recent data have a greater influence on the predicted value, so a larger weight is given to more recent data and a smaller weight to older data. The calculation formula is as follows:
$$\hat{V}_n = \frac{\sum_{i=1}^{k} w_i V_{n-i}}{\sum_{i=1}^{k} w_i}$$
where $w_i$ is the weight given to the observed value $V_{n-i}$, and $w_i$ decreases as i increases.
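The two moving averages described above can be sketched in Python as follows. This is a minimal illustration and not code from the patent: the function names and the sample data are hypothetical, and dividing by the sum of the weights in the weighted variant is an assumption.

```python
def simple_moving_average(values, k):
    """Forecast the next value as the plain mean of the last k observations."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    return sum(values[-k:]) / k


def weighted_moving_average(values, weights):
    """Forecast the next value as a weighted mean of the last len(weights) observations.

    weights[0] applies to the most recent observation, weights[1] to the one
    before it, and so on. Normalizing by sum(weights) is an assumption.
    """
    k = len(weights)
    if not 1 <= k <= len(values):
        raise ValueError("need between 1 and len(values) weights")
    recent_first = values[-k:][::-1]  # most recent observation first
    return sum(w * v for w, v in zip(weights, recent_first)) / sum(weights)


# Hypothetical daily income figures, oldest first.
daily_income = [100.0, 120.0, 110.0, 130.0, 140.0]
sma = simple_moving_average(daily_income, 3)
wma = weighted_moving_average(daily_income, [3, 2, 1])
print(sma, wma)
```

On this rising sample series the weighted forecast is higher than the simple one, because the most recent (largest) observations carry more weight.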
In the process of implementing the invention, the inventors found that the prior art has at least the following problems:
When the service index fluctuates strongly or is affected by seasonal and similar factors, neither of the two moving average methods can effectively capture how the index changes. In addition, the length of the sliding window strongly affects the prediction result: a shorter window cannot capture the long-term trend of the index, while a longer window is insensitive to changes in the index. The prediction effect of these methods is therefore poor.
Disclosure of Invention
In order to solve the technical problem, embodiments of the present invention provide a method and an apparatus for predicting a service index. The technical scheme is as follows:
in a first aspect, a method for predicting a service index is provided, where the method includes:
for a business index, training at least one first model for predicting the business index according to a first historical time sequence of the business index;
for each first model, predicting the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result;
and carrying out statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index.
In a second aspect, a service index prediction apparatus is provided, the apparatus includes:
the training module is used for training at least one first model for predicting the business indexes according to the first historical time sequence of the business indexes;
the prediction module is used for predicting the service index based on the first model and the second historical time sequence for each first model to obtain a preliminary prediction result;
and the statistical analysis module is used for performing statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
A plurality of models are trained based on the historical data, and each model predicts the same service index according to the historical data, so that a plurality of prediction results are obtained. Weighted average statistical analysis is then carried out on the plurality of prediction results, and the result is taken as the prediction result of the service index. Because a plurality of models are fused, the change rule and the long-term trend of the index can still be effectively captured even if the service index fluctuates strongly or is affected by seasonal and similar factors, so the service index is predicted more accurately and the prediction effect is better.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a service index prediction method according to an embodiment of the present invention;
fig. 2A is a flowchart of a method for predicting a service index according to an embodiment of the present invention;
fig. 2B is a network structure diagram of an RNN model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a service index prediction apparatus according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Fig. 1 is a flowchart of a service index prediction method according to an embodiment of the present invention. Referring to fig. 1, a method flow provided by the embodiment of the present invention includes:
101. For a service index, at least one first model for predicting the service index is trained based on a first historical time series of the service index.
102. And for each first model, predicting the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result.
103. And carrying out statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index.
According to the method provided by the embodiment of the invention, a plurality of models are trained based on historical data, and each model predicts the same service index according to the historical data, so that a plurality of prediction results are obtained. And then, carrying out weighted average statistical analysis on the plurality of prediction results, taking the finally obtained result as the prediction result of the service index, and fusing a plurality of models, so that even if the service index has larger fluctuation and is influenced by seasonal factors and the like, the change rule and the long-term trend of the index can still be effectively grasped by the plurality of models, the prediction of the service index is more accurate, and the prediction effect is better.
In another embodiment, training at least one first model for predicting the service index according to the first historical time series of the service index comprises:
determining training data and test data, respectively, in a first historical time series based on a time sequence;
training each parameter in at least one second model according to the training data to obtain at least one training model;
and optimizing each parameter in the at least one training model according to the test data to obtain at least one first model.
In another embodiment, the determining training data and test data, respectively, in a first historical time series based on a chronological order includes:
taking the service data of which the time variable in the first historical time sequence is earlier than a preset time threshold as training data;
and taking the service data of which the time variable is later than a preset time threshold value in the first historical time sequence as test data.
In another embodiment, the performing statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index includes:
and carrying out weighted average processing on the preliminary prediction result output by the at least one first model to obtain a final prediction result.
In another embodiment, the performing statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index includes:
based on the MAPE (Mean Absolute Percentage Error) algorithm, selecting specified preliminary prediction results from all output preliminary prediction results, wherein the prediction error on the service index of the model corresponding to each specified preliminary prediction result is smaller than a specified value;
and carrying out weighted average processing on the specified preliminary prediction results to obtain a final prediction result.
In another embodiment, the at least one first model includes an EWMA (Exponentially Weighted Moving Average) model, an ARMA (Auto-Regressive Moving Average) model, and an RNN (Recurrent Neural Network) model.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 2A is a flowchart of a service index prediction method according to an embodiment of the present invention. Referring to fig. 2A, a method flow provided by the embodiment of the present invention includes:
201. For a service index, at least one first model for predicting the service index is trained according to a first historical time series of the service index.
The service index may be the daily income of a game, the daily number of visitors to a website, the daily income from an advertisement, and the like, which is not specifically limited in the embodiment of the present invention. The first historical time series refers to historical data of the service index. In the embodiment of the present invention, the first historical time series may be abstracted as a set of two-tuples, $T = \{(t_1, v_1), (t_2, v_2), (t_3, v_3), \ldots, (t_n, v_n)\}$.
Here $t_i$ is a time variable satisfying $t_i < t_{i+1}$ for $i \in \{1, 2, 3, \ldots, n-1\}$, and $v_i$ is the actual observed value of the service index at time $t_i$, that is, the actual statistical value of the service index as opposed to a predicted value. The at least one first model includes an EWMA model, an ARMA model, an RNN model, and the like, which is not specifically limited in the embodiment of the present invention. Each of these models usually considers only one aspect that influences the service index and cannot cover all influencing factors. For example, when performing prediction the EWMA model can only capture the long-term trend of the data, the ARMA model can only capture some of its stationary patterns, and the RNN model can only capture certain regular patterns of change. In order to improve the prediction accuracy of the service index, the embodiment of the invention fits the service index with multiple models: the same service index is predicted by multiple models and the resulting predictions are weighted and averaged, giving a more accurate prediction result. Details are given in the following steps.
The EWMA model is a commonly used prediction method, and its calculation formula is as follows:
$$\hat{V}_n = \alpha V_{n-1} + (1-\alpha)\hat{V}_{n-1} \tag{1}$$
where $\alpha$ is the smoothing coefficient, with $0 < \alpha < 1$. Equation (1) is a recursive definition. Substituting the k predicted values in a sliding window into equation (1) gives:
$$\hat{V}_n = \alpha\sum_{i=0}^{k-1}(1-\alpha)^i V_{n-1-i} + (1-\alpha)^k \hat{V}_{n-k} \tag{2}$$
As can be seen from equation (2), in the predicted value at time n, the actual observed values at time n-1 and at all earlier times are weighted by $(1-\alpha)^i$ with increasing i, which is the meaning of "exponentially" in the exponentially weighted moving average method. In words, formula (1) and formula (2) give a larger weight to an observed value closer to the current prediction time and a smaller weight to an observed value farther from it, and the weights decrease exponentially from the nearest observation to the farthest, which is why the method is called the exponentially weighted moving average method.
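The relationship between the recursive form (1) and the expanded form (2) can be checked numerically. The following Python sketch is illustrative only: the function names are hypothetical, and seeding the recursion with the first observation is an assumption, since the text does not say how the first predicted value is chosen.

```python
def ewma_forecast(values, alpha, initial=None):
    """One-step-ahead EWMA forecast obtained by applying recursion (1) to every observation.

    The recursion is seeded with values[0] unless `initial` is given (an assumption).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    pred = values[0] if initial is None else initial
    for v in values:
        pred = alpha * v + (1 - alpha) * pred
    return pred


def ewma_weights(alpha, k):
    """Weights alpha * (1 - alpha)**i on the k most recent observations, newest first."""
    return [alpha * (1 - alpha) ** i for i in range(k)]


observations = [10.0, 20.0, 30.0]  # hypothetical values, oldest first
alpha = 0.5

recursive = ewma_forecast(observations, alpha)

# Expanded form (2) with k = 3; the remaining term uses the seed prediction.
weights = ewma_weights(alpha, 3)
expanded = sum(w * v for w, v in zip(weights, reversed(observations)))
expanded += (1 - alpha) ** 3 * observations[0]

print(recursive, expanded)  # 22.5 22.5
```

The weights 0.5, 0.25 and 0.125 fall off geometrically with the age of the observation, which is the exponential decay described above.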
The ARMA model is another important method for studying time series. It combines an autoregressive (AR) model and a moving average (MA) model. The p-order autoregressive model is calculated as follows:
$$V_n = c + \sum_{i=1}^{p}\varphi_i V_{n-i} + \varepsilon_n \tag{3}$$
where $\varphi_i$, $i = 1, 2, \ldots, p$, are the parameters to be estimated; c is a constant; and $\varepsilon_n$ is white noise that obeys a normal distribution with mean 0 and variance $\sigma^2$. The autoregressive model describes the relationship between current data and historical data. Unlike the moving average methods, the weights $\varphi_i$ in the autoregressive model are learned from historical data.
The q-order moving average model is calculated as follows:
$$V_n = \mu + \varepsilon_n + \sum_{i=1}^{q}\theta_i \varepsilon_{n-i} \tag{4}$$
where $\theta_i$, $i = 1, 2, \ldots, q$, are the parameters to be estimated and $\mu$ is the mean of the index. The moving average model describes the accumulation of the error terms of the autoregressive part.
The ARMA(p, q) model contains p autoregressive terms and q moving average terms, and can be expressed as the following equation (5):
$$V_n = c + \varepsilon_n + \sum_{i=1}^{p}\varphi_i V_{n-i} + \sum_{i=1}^{q}\theta_i \varepsilon_{n-i} \tag{5}$$
Given an ARMA model and a time series over a period of time, the model parameters and the white noise variance can be estimated by the maximum likelihood principle.
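To make the roles of the autoregressive and moving average terms concrete, the sketch below computes a one-step-ahead ARMA(p, q) forecast for parameters that are already known. It does not perform the maximum likelihood estimation mentioned above. The function name and the sample parameters are hypothetical, and treating values and residuals before the start of the series as contributing nothing is a simplifying assumption.

```python
def arma_forecast(values, c, phi, theta):
    """One-step-ahead ARMA(p, q) forecast with known parameters.

    phi[i] multiplies the observation i + 1 steps back and theta[i] multiplies
    the residual i + 1 steps back. Residuals are rebuilt recursively as
    observation minus forecast; terms that would reach before the start of
    the series are skipped (a simplifying assumption).
    """
    p, q = len(phi), len(theta)
    residuals = []
    for n in range(len(values) + 1):
        ar = sum(phi[i] * values[n - 1 - i] for i in range(p) if n - 1 - i >= 0)
        ma = sum(theta[i] * residuals[n - 1 - i] for i in range(q) if n - 1 - i >= 0)
        forecast = c + ar + ma
        if n == len(values):
            return forecast  # forecast for the first unseen time step
        residuals.append(values[n] - forecast)


# Hypothetical ARMA(1, 1) parameters and observations.
next_value = arma_forecast([1.0, 2.0, 3.0], c=0.5, phi=[0.5], theta=[0.5])
print(next_value)  # 2.5625
```

In practice the parameters would come from a fitting routine rather than being written by hand; the example only shows how a fitted model turns past observations and past errors into a forecast.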
An RNN is a special neural network whose network structure contains loops: the output of one neuron may be the input of another neuron. Its network structure is shown in Fig. 2B. As can be seen from Fig. 2B, the input of the hidden layer includes not only the input from the input layer at the current time but also the output of the hidden layer itself at the previous time. The final result $O^{(t)}$ of the output layer at time t is therefore the combined result of the input at that time and all historical outputs. Such a network structure is well suited to modeling time series. In this network, the activation functions of each hidden-layer unit and output-layer unit can be expressed as the following formula (6) and formula (7):
$$h_t = \sigma(W_{hx} x_t + W_{hh} h_{t-1} + b_h) \tag{6}$$
$$y_t = \mathrm{softmax}(W_{yh} h_t + b_y) \tag{7}$$
where $\sigma(\cdot)$ is the activation function, for which this embodiment uses the sigmoid function. $W_{hx}$, $W_{hh}$ and $W_{yh}$ are parameters to be estimated in the neural network: $W_{hx}$ is the weight matrix connecting the input layer and the hidden layer, $W_{hh}$ is the weight matrix between the hidden layer and itself, and $W_{yh}$ is the weight matrix between the hidden layer and the output layer. $b_h$ and $b_y$ are the bias terms to be estimated. These parameters can be solved by the back-propagation algorithm.
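A single forward step of formulas (6) and (7) can be written in plain Python as below. This is a sketch under stated assumptions: the helper names, the layer sizes and the weight values are hypothetical, and training by back-propagation is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rnn_step(x_t, h_prev, W_hx, W_hh, W_yh, b_h, b_y):
    """One time step: hidden state from formula (6), output from formula (7)."""
    pre = [a + b + c for a, b, c in zip(matvec(W_hx, x_t), matvec(W_hh, h_prev), b_h)]
    h_t = [sigmoid(z) for z in pre]
    y_t = softmax([a + b for a, b in zip(matvec(W_yh, h_t), b_y)])
    return h_t, y_t


# Hypothetical sizes: 1 input, 2 hidden units, 2 outputs.
W_hx = [[0.1], [-0.2]]
W_hh = [[0.05, 0.0], [0.0, 0.05]]
W_yh = [[0.3, -0.1], [-0.3, 0.1]]
b_h = [0.0, 0.0]
b_y = [0.0, 0.0]

h = [0.0, 0.0]
for x in [0.5, 0.7, 0.9]:  # feed a short sequence one value at a time
    h, y = rnn_step([x], h, W_hx, W_hh, W_yh, b_h, b_y)
print(h, y)
```

The hidden state `h` is carried from one step to the next, which is how earlier inputs influence the output at the current time.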
In the embodiment of the present invention, when training at least one first model for predicting a service indicator according to a first historical time series of the service indicator, the following method may be adopted:
determining training data and test data, respectively, in a first historical time series based on a time sequence; training each parameter in at least one second model according to the training data to obtain at least one training model; and optimizing each parameter in the at least one training model according to the test data to obtain at least one first model.
It should be noted that, in the embodiment of the present invention, the first model refers to a model that has undergone a preliminary training process and a parameter optimization process; the second model refers to the original model, i.e., the model that has not been processed by the preliminary training process and the parameter optimization process.
Here, "based on a time sequence" means that the service data in the first historical time series whose time variable is earlier than a preset time threshold is used as training data, and the service data whose time variable is later than the preset time threshold is used as test data. Take as an example a first historical time series containing the number of website visitors between January 1 and December 31: all service data in this period are actual observed values of the number of website visitors. The actual observed values of the daily number of website visitors between January 1 and November 30 can then be used as training data, and those between December 1 and December 31 as test data.
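The time-based split in this example can be sketched as follows. The function name, the year 2015 and the visitor counts are hypothetical; placing the observation that falls exactly on the threshold into the test data is also an assumption, since the text only speaks of "earlier" and "later".

```python
from datetime import date, timedelta


def split_by_time(series, threshold):
    """Split (time, value) pairs into training data (before threshold) and test data."""
    train = [(t, v) for t, v in series if t < threshold]
    test = [(t, v) for t, v in series if t >= threshold]
    return train, test


# Hypothetical daily visitor counts for a non-leap year, January 1 to December 31.
start = date(2015, 1, 1)
series = [(start + timedelta(days=d), 1000 + d) for d in range(365)]

train, test = split_by_time(series, date(2015, 12, 1))
print(len(train), len(test))  # 334 31
```

Unlike a random split, this keeps every test observation later than every training observation, so the model is never evaluated on data that precedes what it was trained on.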
Before starting to train a model, all parameters involved in the model may be initialized with some different small random numbers. The small random number is used for ensuring that the model does not enter a saturation state due to overlarge parameter values, so that the training fails; "different" is used to ensure that the model can learn normally. If the parameters (e.g., weight matrices) are initialized with the same numbers, the model may not be able to learn. After training the model according to the training data in the first historical time sequence, that is, after determining each parameter in the model through the training data, in order to improve the prediction accuracy of the model and reduce the error between the predicted value predicted by the model and the actual observed value, the embodiment of the invention further comprises a process of further correcting the training model according to the test data in the first historical time sequence, that is, further optimizing each parameter in the training model, and selecting the optimal parameter for the model.
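The initialization described above can be illustrated with a short sketch. The function name, the range of ±0.1 and the seed are hypothetical choices, not values given in the text.

```python
import random


def init_matrix(rows, cols, scale=0.1, seed=None):
    """Return a rows x cols matrix of small random numbers drawn from [-scale, scale]."""
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


W = init_matrix(3, 4, seed=0)
flat = [w for row in W for w in row]
print(max(abs(w) for w in flat) <= 0.1, len(set(flat)) == len(flat))
```

Keeping the values small avoids saturating a sigmoid unit at the start of training, and drawing them independently makes the entries differ, so that units do not all receive identical updates.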
The MAPE algorithm can be used to select the optimal parameters for each trained training model. MAPE measures the average relative difference between the predicted values and the actual observed values. It is defined as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{V_i - \hat{V}_i}{V_i}\right| \tag{8}$$
where $V_i$ is an actual observed value, $\hat{V}_i$ is the corresponding predicted value, and n is the number of predictions. The smaller the MAPE value, the better the current parameters of the model. After prediction data are obtained from the test data and the current parameters of the training model, the MAPE between the prediction data and the actual test data is calculated according to formula (8), and the current parameters of the model are adjusted according to the obtained MAPE value. This process is repeated until the MAPE value is smaller than a certain value or stays constant, at which point the parameters of the model are its optimal parameters.
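As an illustration of choosing parameters by MAPE, the sketch below evaluates a few candidate smoothing coefficients of an EWMA model on held-out test data and keeps the one with the lowest error. It replaces the iterative adjustment described above with a simple search over a fixed candidate list, which is an assumption made for brevity; all function names and data are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def ewma_one_step(history, alpha):
    """One-step-ahead EWMA forecast, seeded with the first observation."""
    pred = history[0]
    for v in history:
        pred = alpha * v + (1 - alpha) * pred
    return pred


def select_alpha(train, test, candidates):
    """Return the candidate alpha with the lowest MAPE on the test data, and that MAPE."""
    best_alpha, best_error = None, float("inf")
    for alpha in candidates:
        history = list(train)
        preds = []
        for actual in test:
            preds.append(ewma_one_step(history, alpha))
            history.append(actual)  # reveal the true value before the next forecast
        error = mape(test, preds)
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


# Hypothetical steadily rising series.
best_alpha, best_error = select_alpha([10.0, 20.0, 30.0, 40.0], [50.0, 60.0], [0.1, 0.5, 0.9])
print(best_alpha, best_error)
```

On a steadily rising series the largest candidate wins, because a larger coefficient lets the forecast follow recent values more closely. Note that MAPE is undefined when an actual value is zero.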
202. And for each first model, predicting the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result.
The second historical time series may be consistent with the first historical time series, or may not be consistent with the first historical time series, which is not specifically limited in the embodiment of the present invention. After at least one first model is trained according to the steps, namely, the optimal parameters are respectively selected for the at least one first model, the step is based on the second historical time sequence, and each model is respectively used for predicting the service index. Taking at least one first model as an EWMA model, an ARMA model and an RNN model as an example, performing primary prediction on the service index by using the EWMA model based on a second historical time sequence in the step to obtain a primary prediction result; based on the second historical time sequence, performing primary prediction on the service index by using an ARMA model to obtain a primary prediction result; and then, based on the second historical time sequence, the RNN model is used for carrying out primary prediction on the service index to obtain a primary prediction result.
After the preliminary prediction results of each model are obtained, the obtained all preliminary prediction results can be directly weighted and averaged, and the obtained data is the final prediction result for predicting the service index. Therefore, the accuracy of model prediction can be further improved through the fusion of various models, and the phenomenon of overfitting can be effectively prevented. In addition, in order to further improve the prediction accuracy, the embodiment of the present invention further provides a method for selecting a part of models with better performance from at least one first model to perform weighted average, so as to complete the prediction of the service index, and the detailed process is as follows:
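The direct fusion described above, in which every trained model predicts from the second historical time series and all preliminary results are averaged, might look like the sketch below. The two stand-in models are hypothetical placeholders for trained EWMA, ARMA or RNN models, and equal weights are an assumption.

```python
def preliminary_predictions(models, history):
    """Run every model on the same history; models maps a name to a callable."""
    return {name: model(history) for name, model in models.items()}


def average_prediction(predictions):
    """Equal-weight average of all preliminary prediction results."""
    return sum(predictions.values()) / len(predictions)


# Hypothetical stand-ins for trained models: each maps a history to one forecast.
models = {
    "last_value": lambda h: h[-1],
    "mean_of_last_3": lambda h: sum(h[-3:]) / 3,
}

second_history = [1.0, 2.0, 3.0, 6.0]
prelim = preliminary_predictions(models, second_history)
final = average_prediction(prelim)
print(prelim, final)
```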
203. Selecting a specified number of preliminary prediction results from all the output preliminary prediction results based on the MAPE algorithm.
In the embodiment of the invention, the selection of a specified number of preliminary prediction results from all the preliminary prediction results output by the at least one first model is also based on the MAPE algorithm. After step 201, the parameters of each first model are its optimal parameters. A small portion of data may then be selected from the test data of the first historical time series or of the second historical time series, and each of the at least one first model is tested on that data. If the MAPE between the prediction data output by a model and the actual observed values is smaller than a specified value, the prediction error of that model on the service index is smaller than the specified value, and the preliminary prediction result of that model is counted among the specified preliminary prediction results.
204. And carrying out weighted average processing on the specified preliminary prediction results to obtain a final prediction result.
In the embodiment of the present invention, when performing weighted average processing on the specified preliminary prediction results, the model with the smallest prediction error may be given the largest weight, and the model with the largest prediction error may be given the smallest weight, so as to ensure the accuracy of the final prediction result.
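Steps 203 and 204 can be sketched together as follows. The text only requires that a smaller prediction error leads to a larger weight; using weights proportional to the inverse of each model's MAPE is one possible choice and is an assumption here, as are the function name and the sample numbers.

```python
def fuse_predictions(predictions, errors, max_error):
    """Keep models whose error is below max_error, then take a weighted average.

    predictions and errors are dicts keyed by model name. Weights are
    proportional to 1 / error (an assumption), so the model with the smallest
    error receives the largest weight.
    """
    selected = {m: p for m, p in predictions.items() if errors[m] < max_error}
    if not selected:
        raise ValueError("no model has an error below max_error")
    eps = 1e-12  # guards against division by zero for a perfect model
    weights = {m: 1.0 / (errors[m] + eps) for m in selected}
    total = sum(weights.values())
    return sum(weights[m] * selected[m] for m in selected) / total


# Hypothetical preliminary predictions and MAPE values for three models.
predictions = {"EWMA": 100.0, "ARMA": 110.0, "RNN": 150.0}
errors = {"EWMA": 0.10, "ARMA": 0.05, "RNN": 0.40}

final_prediction = fuse_predictions(predictions, errors, max_error=0.2)
print(final_prediction)
```

In this example the RNN result is dropped because its error exceeds the threshold, and the ARMA result counts twice as much as the EWMA result because its error is half as large.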
According to the method provided by the embodiment of the invention, a plurality of models are trained based on historical data, and each model predicts the same service index according to the historical data, so that a plurality of prediction results are obtained. And then, carrying out weighted average statistical analysis on the plurality of prediction results, taking the finally obtained result as the prediction result of the service index, and fusing a plurality of models, so that even if the service index has larger fluctuation and is influenced by seasonal factors and the like, the change rule and the long-term trend of the index can still be effectively grasped by the plurality of models, the prediction of the service index is more accurate, and the prediction effect is better.
Fig. 3 is a schematic structural diagram of a service indicator prediction apparatus according to an embodiment of the present invention. Referring to fig. 3, the apparatus includes: training module 301, prediction module 302, and statistical analysis module 303.
The training module 301 is connected to the prediction module 302, and configured to train, for a service index, at least one first model for predicting the service index according to a first historical time sequence of the service index; the prediction module 302 is connected with the statistical analysis module 303, and is configured to predict, for each first model, a service index based on the first model and the second historical time series, so as to obtain a preliminary prediction result; and the statistical analysis module 303 is configured to perform statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index.
In another embodiment, the training module 301 is configured to determine training data and test data, respectively, in a first historical time series based on a time sequence; training each parameter in at least one second model according to the training data to obtain at least one training model; and optimizing each parameter in the at least one training model according to the test data to obtain at least one first model.
In another embodiment, the training module 301 is configured to use, as training data, service data whose time variable in the first historical time series is earlier than a preset time threshold; and to use, as test data, service data whose time variable in the first historical time series is later than the preset time threshold.
In another embodiment, the statistical analysis module 303 is configured to perform weighted average processing on the preliminary prediction result output by the at least one first model to obtain a final prediction result.
In another embodiment, the statistical analysis module 303 is configured to select, based on the mean absolute percentage error (MAPE) algorithm, a specified number of preliminary prediction results from all the output preliminary prediction results, where the prediction error, with respect to the service index, of the model corresponding to each selected preliminary prediction result is smaller than a specified value; and to perform weighted average processing on the selected preliminary prediction results to obtain a final prediction result.
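The selection and weighting step can be illustrated with the following sketch. The claims state only that the model with the smallest prediction error receives the largest weight; using the reciprocal of MAPE as the weight is one possible choice and is an assumption here, as are the function names and the sample numbers.

```python
from typing import Dict, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (0.05 means 5%).

    Assumes every actual value is non-zero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def select_and_combine(preliminary: Dict[str, float],
                       errors: Dict[str, float],
                       max_error: float,
                       eps: float = 1e-9) -> float:
    """Keep models whose MAPE is below max_error, then take a weighted average."""
    selected = [name for name in preliminary if errors[name] < max_error]
    if not selected:
        raise ValueError("no model has a prediction error below the specified value")
    # Assumed weighting: inverse of MAPE, so a smaller error gives a larger weight.
    weights = {name: 1.0 / (errors[name] + eps) for name in selected}
    total = sum(weights.values())
    return sum(weights[name] * preliminary[name] for name in selected) / total


# Hypothetical preliminary predictions and per-model MAPE values.
preliminary = {"ewma": 100.0, "arma": 110.0, "rnn": 200.0}
errors = {"ewma": 0.10, "arma": 0.05, "rnn": 0.40}
final_prediction = select_and_combine(preliminary, errors, max_error=0.2)
print(round(final_prediction, 2))
```

In this example the RNN result is discarded because its error exceeds the specified value, and the ARMA result counts roughly twice as much as the EWMA result.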
With the device provided by this embodiment of the invention, a plurality of models are trained on historical data, and each model predicts the same service index from historical data, yielding a plurality of prediction results. A weighted-average statistical analysis is then performed on these prediction results, and its outcome is taken as the prediction result of the service index. Because a plurality of models are fused, the change pattern and long-term trend of the index can still be captured effectively even when the service index fluctuates widely or is affected by seasonal and similar factors, so the service index is predicted more accurately.
It should be noted that, when the service index prediction apparatus provided in the foregoing embodiment performs service index prediction, the division into the functional modules described above is merely an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the service index prediction apparatus and the service index prediction method provided by the above embodiments belong to the same concept; the specific implementation process of the apparatus is detailed in the method embodiments and is not repeated here.
Fig. 4 is a schematic structural diagram of a server according to an exemplary embodiment; the server may be used to implement the service index prediction method described in any of the above exemplary embodiments. Specifically, referring to fig. 4, the server 400 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 422 (e.g., one or more processors), a memory 432, and one or more storage media 430 (e.g., one or more mass storage devices) storing applications 442 or data 444. The memory 432 and the storage medium 430 may provide transient or persistent storage. The program stored on the storage medium 430 may include one or more modules (not shown).
The server 400 may also include one or more power supplies 428, one or more wired or wireless network interfaces 450, one or more input-output interfaces 458, and/or one or more operating systems 441, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
One or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
for a business index, training at least one first model for predicting the business index according to a first historical time sequence of the business index;
for each first model, predicting the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result;
and carrying out statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index.
In another embodiment, the training of the at least one first model for predicting the service index according to the first historical time series of the service index comprises:
determining training data and test data, respectively, in the first historical time series based on a time sequence;
training each parameter in at least one second model according to the training data to obtain at least one training model;
and optimizing each parameter in the at least one training model according to the test data to obtain the at least one first model.
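As a rough illustration of fitting on training data and then tuning against test data, the sketch below adjusts the smoothing factor of an EWMA forecaster so that MAPE on the test data is minimized. The grid search over candidate values stands in for the parameter adjustment procedure, which the text does not specify; the function names, the candidate grid, and the data are all assumptions.

```python
from typing import List, Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction; assumes non-zero actual values."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def fit_level(alpha: float, training: Sequence[float]) -> float:
    """Run exponential smoothing over the training data and return the final level."""
    level = training[0]
    for value in training[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def one_step_forecasts(alpha: float, level: float, test: Sequence[float]) -> List[float]:
    """Forecast each test point from the level held before that point is observed."""
    forecasts = []
    for value in test:
        forecasts.append(level)
        level = alpha * value + (1 - alpha) * level
    return forecasts


def tune_alpha(training: Sequence[float], test: Sequence[float],
               candidates: Sequence[float] = (0.1, 0.3, 0.5, 0.7, 0.9)) -> Tuple[float, float]:
    """Pick the candidate smoothing factor with the lowest MAPE on the test data."""
    best_alpha, best_error = candidates[0], float("inf")
    for alpha in candidates:
        level = fit_level(alpha, training)
        error = mape(test, one_step_forecasts(alpha, level, test))
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


training = [100.0, 110.0, 120.0, 130.0]
test = [140.0, 150.0, 160.0]
best_alpha, best_error = tune_alpha(training, test)
print(best_alpha, round(best_error, 4))
```

On this steadily rising series the largest candidate wins, because a larger smoothing factor lags the trend less.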
In another embodiment, the determining training data and test data in the first historical time series based on the time sequence comprises:
taking the service data of which the time variable in the first historical time sequence is earlier than a preset time threshold as the training data;
and taking the service data of which the time variable is later than the preset time threshold value in the first historical time sequence as the test data.
In another embodiment, the performing of statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index includes:
and carrying out weighted average processing on the preliminary prediction result output by the at least one first model to obtain the final prediction result.
In another embodiment, the performing of statistical analysis on the preliminary prediction result output by the at least one first model to obtain a final prediction result of the service index includes:
selecting a specified number of preliminary prediction results from all the output preliminary prediction results based on a mean absolute percentage error (MAPE) algorithm, wherein the prediction error, with respect to the service index, of the model corresponding to each selected preliminary prediction result is smaller than a specified value;
and carrying out weighted average processing on the selected preliminary prediction results to obtain the final prediction result.
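In symbols, the selection and averaging steps can be written as below. The notation is introduced here for illustration and does not appear in the original: $y_t$ is the observed value at time $t$, $\hat{y}_{i,t}$ is the value predicted by model $i$, $n$ is the number of compared points, $\varepsilon$ is the specified value, $\hat{y}_i$ is the preliminary prediction result of model $i$, and $w_i$ is its weight.

```latex
\mathrm{MAPE}_i = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_{i,t}}{y_t}\right|,
\qquad
S = \{\, i : \mathrm{MAPE}_i < \varepsilon \,\},
\qquad
\hat{y} = \frac{\sum_{i \in S} w_i\,\hat{y}_i}{\sum_{i \in S} w_i}
```

Any weights that decrease as $\mathrm{MAPE}_i$ increases are consistent with the claims' requirement that the model with the smallest prediction error has the largest weight; $w_i = 1/\mathrm{MAPE}_i$ is one such choice, given here as an assumption.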
In another embodiment, the at least one first model comprises an EWMA model, an ARMA model, and an RNN model.
The server provided by this embodiment of the invention trains a plurality of models on historical data, and each model predicts the same service index from historical data, yielding a plurality of prediction results. A weighted-average statistical analysis is then performed on these prediction results, and its outcome is taken as the prediction result of the service index. Because a plurality of models are fused, the change pattern and long-term trend of the index can still be captured effectively even when the service index fluctuates widely or is affected by seasonal and similar factors, so the service index is predicted more accurately.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description covers only preferred embodiments of the present invention and is not to be construed as limiting the invention; any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.
Claims (4)
1. A service index prediction method, applied to the field of data processing in the Internet industry and executed by a server, the method comprising:
for a service index, taking service data whose time variable in a first historical time sequence of the service index is earlier than a preset time threshold as training data; taking service data whose time variable in the first historical time sequence is later than the preset time threshold as test data; training each parameter in at least one second model according to the training data to obtain at least one training model; after obtaining prediction data according to the test data and the current parameters in the at least one training model, calculating an average value of the differences between the prediction data and the test data based on a mean absolute percentage error (MAPE) algorithm, continuing to adjust the current parameters in the at least one training model according to the obtained average value, and repeating this process until the obtained average value is smaller than a target value or remains unchanged, to obtain at least one first model for predicting the service index; wherein the service index comprises: website access amount, advertising revenue, game user access amount, or game revenue; and each first model is of a different type;
for each first model, predicting the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result;
based on the MAPE algorithm, selecting a specified number of preliminary prediction results from all output preliminary prediction results, wherein the prediction error of a model corresponding to the specified number of preliminary prediction results on the service index is smaller than a specified numerical value;
carrying out weighted average processing on the selected preliminary prediction results to obtain a final prediction result of the service index;
wherein the prediction error is the MAPE value between the output predicted value and the actual observed value; the model with the minimum prediction error has the maximum weight, and the model with the maximum prediction error has the minimum weight; and the at least one first model includes an exponentially weighted moving average (EWMA) model, an autoregressive moving average (ARMA) model, and a recurrent neural network (RNN) model.
2. A service index prediction apparatus, applied to the field of data processing in the Internet industry, the apparatus comprising:
a training module, configured to, for a service index, take service data whose time variable in a first historical time sequence of the service index is earlier than a preset time threshold as training data; take service data whose time variable in the first historical time sequence is later than the preset time threshold as test data; train each parameter in at least one second model according to the training data to obtain at least one training model; after obtaining prediction data according to the test data and the current parameters in the at least one training model, calculate an average value of the differences between the prediction data and the test data based on a mean absolute percentage error (MAPE) algorithm, continue to adjust the current parameters in the at least one training model according to the obtained average value, and repeat this process until the obtained average value is smaller than a target value or remains unchanged, to obtain at least one first model for predicting the service index; wherein the service index comprises: website access amount, advertising revenue, game user access amount, or game revenue; and each first model is of a different type;
a prediction module, configured to, for each first model, predict the service index based on the first model and the second historical time sequence to obtain a preliminary prediction result;
a statistical analysis module, configured to select, based on the MAPE algorithm, a specified number of preliminary prediction results from all the output preliminary prediction results, wherein the prediction error, with respect to the service index, of the model corresponding to each selected preliminary prediction result is smaller than a specified value; and to carry out weighted average processing on the selected preliminary prediction results to obtain a final prediction result of the service index;
wherein the prediction error is the MAPE value between the output predicted value and the actual observed value; the model with the minimum prediction error has the maximum weight, and the model with the maximum prediction error has the minimum weight; and the at least one first model includes an exponentially weighted moving average (EWMA) model, an autoregressive moving average (ARMA) model, and a recurrent neural network (RNN) model.
3. A server, comprising a processor and a memory, wherein the memory stores a program that is executed by the processor to implement the service index prediction method of claim 1.
4. A computer-readable storage medium, characterized in that the storage medium stores a program that is executed by a processor to implement the service index prediction method according to claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610296883.4A CN107346464B (en) | 2016-05-06 | 2016-05-06 | Service index prediction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610296883.4A CN107346464B (en) | 2016-05-06 | 2016-05-06 | Service index prediction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107346464A CN107346464A (en) | 2017-11-14 |
CN107346464B CN107346464B (en) | 2021-04-16 |
Family
ID=60254154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610296883.4A Active CN107346464B (en) | 2016-05-06 | 2016-05-06 | Service index prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107346464B (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108376292A (en) * | 2017-12-12 | 2018-08-07 | 广州汇智通信技术有限公司 | A kind of crowd's method for predicting, system and equipment |
CN108846525A (en) * | 2018-08-02 | 2018-11-20 | 阿里巴巴集团控股有限公司 | Dealing amount of foreign exchange prediction technique and device |
CN109492709B (en) * | 2018-12-06 | 2020-11-06 | 新奥数能科技有限公司 | Data prediction method and device based on hybrid model |
CN109615226B (en) * | 2018-12-12 | 2020-12-29 | 焦点科技股份有限公司 | Operation index abnormity monitoring method |
CN111353624B (en) * | 2018-12-21 | 2023-10-10 | 顺丰科技有限公司 | Method and device for evaluating quantity prediction effect |
CN110009384A (en) * | 2019-01-07 | 2019-07-12 | 阿里巴巴集团控股有限公司 | Predict the method and device of operational indicator |
CN110826799B (en) * | 2019-11-05 | 2022-07-08 | 广州虎牙科技有限公司 | Service prediction method, device, server and readable storage medium |
CN111353809A (en) * | 2019-12-12 | 2020-06-30 | 合肥工业大学 | Social consumer goods retail total quarterly accumulated amplification prediction method and system |
CN113052582B (en) * | 2019-12-27 | 2024-03-22 | 中移动信息技术有限公司 | Method, device, equipment and computer storage medium for checking bill |
CN111614578B (en) * | 2020-05-09 | 2021-11-02 | 北京邮电大学 | Network resource allocation method and device based on exponential weighting and inflection point detection |
CN111783356B (en) * | 2020-06-29 | 2024-03-29 | 清华大学深圳国际研究生院 | Oil yield prediction method and device based on artificial intelligence |
CN111833114A (en) * | 2020-07-27 | 2020-10-27 | 北京思特奇信息技术股份有限公司 | Intelligent prediction method, system, medium and equipment for channel business development target |
CN111984658B (en) * | 2020-09-07 | 2023-09-22 | 中国银行股份有限公司 | Report processing method and device |
CN112101652B (en) * | 2020-09-10 | 2024-05-14 | 拉扎斯网络科技(上海)有限公司 | Method and device for predicting task number, readable storage medium and electronic equipment |
CN112288158A (en) * | 2020-10-28 | 2021-01-29 | 税友软件集团股份有限公司 | Service data prediction method and related device |
CN112364077A (en) * | 2020-11-09 | 2021-02-12 | 光大理财有限责任公司 | Training sample generation method, machine learning model training method and related device |
CN112613995A (en) * | 2020-12-30 | 2021-04-06 | 中国工商银行股份有限公司 | Abnormality diagnosis method and apparatus |
CN113762585B (en) * | 2021-05-17 | 2023-08-01 | 腾讯科技(深圳)有限公司 | Data processing method, account type identification method and device |
CN114118570A (en) * | 2021-11-24 | 2022-03-01 | 泰康保险集团股份有限公司 | Service data prediction method and device, electronic equipment and storage medium |
CN115828075A (en) * | 2022-07-27 | 2023-03-21 | 京东城市(北京)数字科技有限公司 | Method and device for calculating index data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663617A (en) * | 2012-03-20 | 2012-09-12 | 亿赞普(北京)科技有限公司 | Method and system for prediction of advertisement clicking rate |
CN104125584A (en) * | 2013-04-27 | 2014-10-29 | 中国移动通信集团福建有限公司 | Service index realization prediction method aiming at network service and apparatus thereof |
CN105139079A (en) * | 2015-07-30 | 2015-12-09 | 广州时韵信息科技有限公司 | Tax revenue prediction method and device based on hybrid model |
CN105354210A (en) * | 2015-09-23 | 2016-02-24 | 深圳市爱贝信息技术有限公司 | Mobile game payment account behavior data processing method and apparatus |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8577670B2 (en) * | 2010-01-08 | 2013-11-05 | Microsoft Corporation | Adaptive construction of a statistical language model |
Also Published As
Publication number | Publication date |
---|---|
CN107346464A (en) | 2017-11-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||