CN113901294A - Information processing method and device - Google Patents


Info

Publication number
CN113901294A
Authority
CN
China
Prior art keywords
time series
data
time
prediction
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111067295.0A
Other languages
Chinese (zh)
Inventor
陆明
聂志远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN202111067295.0A
Publication of CN113901294A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Educational Administration (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses an information processing method and device. The method performs a first prediction based on actual time series data for a given period to obtain a prediction result. If the prediction result for that period differs substantially from the actual time series data, the actual time series data is corrected using a time series weight and the prediction result, and a second prediction is then performed based on the corrected time series data. Because the time series data on which the prediction is based is corrected according to the first time series weight, the weight and influence of data from different periods in the prediction can be specified. Prediction can then be performed again using the corrected data, which can make medium- and long-term prediction in particular more accurate.

Description

Information processing method and device
Technical Field
The present application relates to the field of computer information processing, and in particular, to an information processing method and apparatus.
Background
Some medium- and long-term predictions, such as capacity planning, are needed during IT operation and maintenance. Put simply, capacity planning evaluates whether the performance of the current equipment matches user demand. If the performance exceeds user demand and the equipment is often idle, some equipment can be shut down or mothballed to reduce daily power consumption and maintenance cost; if the performance falls short of user demand, additional equipment can be requested or added from stock to meet the demand.
Specifically, capacity planning often requires answers to questions such as the following: What is the maximum processing capacity of a single device? How many devices are currently in use online? Can the current number of devices support the load ahead of a large promotion? When does equipment need to be added, and how much?
However, as business activities continue to evolve and iteration in some product and service fields keeps accelerating, the time span of historical data available for medium- and long-term prediction becomes shorter and the data scarcer. Moreover, because the service environment changes frequently, the available historical data varies sharply from one period to another.
Therefore, if the traditional approach of predicting directly from the available historical data is still used, medium- and long-term predictions such as capacity planning become increasingly difficult and increasingly inaccurate.
Disclosure of Invention
The applicant creatively provides an information processing method and device.
According to a first aspect of embodiments of the present application, there is provided an information processing method, including: performing a first prediction according to first time series data to obtain a first prediction result; determining the difference between the first time series data and the first prediction result according to a first difference function to obtain a first time series difference value; determining whether the first time series difference value is smaller than a first threshold and, if not, correcting the first time series data according to a first time series weight to obtain second time series data; and performing a second prediction according to the second time series data to obtain a second prediction result.
According to an embodiment of the present application, the method further includes: correcting the first prediction result or the second prediction result according to third information to obtain a corrected prediction result.
According to an embodiment of the present application, before the first prediction is performed according to the first time-series data to obtain the first prediction result, the method further includes: and carrying out abnormal data detection and processing on the third time sequence data to obtain first time sequence data.
According to an embodiment of the present application, correcting the first time series data according to the first time series weight to obtain the second time series data includes: multiplying the first time series difference value by the first time series weight to obtain a second time series difference value; and determining the second time series data according to the first prediction result and the second time series difference value.
According to an embodiment of the present application, after multiplying the first time series difference value by the first time series weight to obtain the second time series difference value, the method further includes: normalizing the second time series difference value to obtain a normalized second time series difference value. Correspondingly, determining the second time series data according to the first prediction result and the second time series difference value includes: determining the second time series data according to the first prediction result and the normalized second time series difference value.
According to an embodiment of the present application, determining a first time series weight includes: acquiring a first time series weight curve corresponding to the time of the first time series data according to preset distribution parameters; and carrying out normalization processing on the first time series weight curve to obtain a first time series weight.
According to an embodiment of the present application, the first time series weight curve includes a distribution curve that gives a larger correction magnitude to recent data and a smaller correction magnitude to older data.
According to an embodiment of the present application, after obtaining the first time series weight, the method further includes: correcting the first time series weight according to fourth information to obtain a corrected first time series weight.
According to a second aspect of embodiments of the present application, there is provided an information processing apparatus including: a first prediction module, configured to perform a first prediction according to first time series data to obtain a first prediction result; a difference value calculation module, configured to determine the difference between the first time series data and the first prediction result according to a first difference function to obtain a first time series difference value; a data correction module, configured to determine whether the first time series difference value is smaller than a first threshold and, if not, to correct the first time series data according to a first time series weight to obtain second time series data; and a second prediction module, configured to perform a second prediction according to the second time series data to obtain a second prediction result.
According to a third aspect of embodiments herein, there is provided a computer-readable storage medium comprising a set of computer-executable instructions for performing any of the information processing methods described above when the instructions are executed.
The embodiments of the present application provide an information processing method and device. The method performs a first prediction based on actual time series data for a given period to obtain a prediction result; if the prediction result for that period differs substantially from the actual time series data, the actual time series data can be corrected using a time series weight and the prediction result; a second prediction is then performed based on the corrected time series data.
Because the time series data on which the prediction is based is corrected according to the first time series weight, the weight and influence of data from different periods in the prediction can be specified. Prediction can then be performed again using the corrected data, which can make medium- and long-term prediction in particular more accurate.
It is to be understood that the implementation of the present application does not require all of the above-described advantages to be achieved, but rather that certain technical solutions may achieve certain technical effects, and that other embodiments of the present application may also achieve other advantages not mentioned above.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Fig. 1 is a schematic flow chart of an implementation of an information processing method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of first time series data and a prediction result obtained based on the first time series data according to an embodiment of the present application;
Fig. 3 is a schematic diagram of second time series data and a prediction result obtained based on the second time series data according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of an implementation of an information processing method according to another embodiment of the present application;
Fig. 5 is a schematic diagram of the first prediction result and the residual data d1 of the first time series according to the embodiment shown in Fig. 4;
Fig. 6 is a schematic diagram of residual data d2 obtained by correcting the residual data d1 according to the embodiment shown in Fig. 4;
Fig. 7 is a schematic diagram of the structure of an information processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Fig. 1 shows an implementation flow of an information processing method according to an embodiment of the present application. Referring to fig. 1, the method includes: operation 110, performing a first prediction according to first time series data to obtain a first prediction result; operation 120, determining the difference between the first time series data and the first prediction result according to a first difference function to obtain a first time series difference value; operation 130, determining whether the first time series difference value is smaller than a first threshold and, if not, determining a first time series weight and correcting the first time series data according to the first time series weight to obtain second time series data; and operation 140, performing a second prediction according to the second time series data to obtain a second prediction result.
Here, time series data is data collected at different points in time that describes how a thing or phenomenon changes over time. For example, recording equipment condition data every hour yields daily equipment condition data; recording it every day yields weekly data, monthly data, and so on.
Time series data is often used for prediction because it often reflects the trends and regularities with which a thing or phenomenon changes over time.
In operation 110, the first forecast refers to a forecast based on time series data, such as an air temperature forecast, a capacity forecast, a population growth trend forecast, a capacity plan, and so on.
The first time-series data refers to time-series data on which the first prediction is based, and is typically historical data over a period of time recorded at a certain frequency. The time series data may be raw data acquired in real time, data obtained by cleaning or preprocessing raw data, or data obtained by correcting or adjusting certain time series data.
The first prediction result refers to trend data obtained by performing the first prediction according to the first time series data; the trend data generally represents the possible values of the predicted object within the prediction interval. To verify whether the predicted data is accurate or trustworthy, the prediction interval typically covers both the past period over which the first time series data was acquired and a further period extending into the future from it.
Fig. 2 is a schematic diagram showing the first time-series data and the prediction result obtained based on the first time-series data according to the embodiment.
In fig. 2, the horizontal axis represents time, the vertical axis represents the used capacity of the storage device A, and the data shown by the solid line is the first time series data from May 2021 to September 2021.
Based on the first time series data, the capacity usage of the storage device A can be predicted (first prediction) by any suitable prediction method (e.g., linear regression) to obtain the predicted used capacity of the storage device A (first prediction result), where the prediction interval runs from May 2021 to September 2022 (due to space limitations, the prediction data is displayed only up to October 2021).
It is assumed that the prediction result obtained by this prediction is data shown by a dotted line in fig. 2.
Since both the first time series data (shown by the solid line) and the prediction result (shown by the dotted line) exist for the period from May 2021 to September 2021, the accuracy of the prediction can be evaluated by comparing the prediction result for this period with the first time series data actually recorded. The smaller the difference between the prediction result and the actually recorded first time series data, the more accurate the first prediction result is, that is, the more reliable the used capacity of the storage device A predicted for the period after September 2021.
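The first prediction described for fig. 2 can be sketched as follows. This is a minimal illustration that assumes linear regression (which the text names only as one suitable method) and uses made-up monthly capacity values; the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(values):
    """Ordinary least-squares fit of values against their index 0..n-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


def predict(slope, intercept, n_points):
    """Evaluate the fitted line at indices 0..n_points-1."""
    return [intercept + slope * t for t in range(n_points)]


# Hypothetical used capacity (TB) of storage device A, May 2021 .. September 2021.
first_series = [40.0, 44.0, 47.0, 53.0, 56.0]

slope, intercept = fit_line(first_series)

# The prediction interval May 2021 .. September 2022 spans 17 months: the first
# 5 points overlap the recorded data, the remaining 12 extend into the future.
first_prediction = predict(slope, intercept, 17)
```

The overlapping first five predicted points are what the later comparison with the recorded data relies on.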
In operation 120, the first difference function is a function for determining a difference between the prediction result and the actually recorded first time-series data, such as a difference, a squared difference, a quotient, and the like.
The difference between the first time series data and the first prediction determined by the first difference function is also a set of time series data, i.e. the first time series difference, since there are as many pairs of data as there are differences.
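As a rough sketch of operation 120, the three example difference functions named above can be applied point by point. The sign convention (actual minus predicted) and all names are assumptions made for illustration; the sample numbers are invented.

```python
def difference(actual, predicted):
    return actual - predicted


def squared_difference(actual, predicted):
    return (actual - predicted) ** 2


def quotient(actual, predicted):
    return actual / predicted


def difference_series(actual_series, predicted_series, diff_fn=difference):
    """Apply diff_fn to each (actual, predicted) pair.

    zip() stops at the shorter input, so only the period covered by both the
    recorded data and the prediction contributes a difference value.
    """
    return [diff_fn(a, p) for a, p in zip(actual_series, predicted_series)]


# Hypothetical recorded data and a prediction that extends one step further.
actual = [40.0, 44.0, 47.0, 53.0, 56.0]
predicted = [39.8, 43.9, 48.0, 52.1, 56.2, 60.3]

d1 = difference_series(actual, predicted)                      # signed residuals
sq = difference_series(actual, predicted, squared_difference)  # squared residuals
```

The result has one value per overlapping time point, which is why the first time series difference value is itself a time series.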
Generally, different prediction requirements have different requirements on the accuracy of prediction. The first threshold is a threshold for limiting the size of the disparity value, such as the maximum acceptable disparity value or the maximum average disparity value. The first threshold is typically specified based on predicted demand and empirical values. When the requirement on the prediction accuracy degree is high, the first threshold value can be set to a small value; conversely, when the demand for the degree of prediction accuracy is low, the first threshold value may be set to a large value.
When determining whether the first time series difference value is smaller than a first threshold, if the first threshold is the maximum acceptable difference value, each of the first time series difference values may be compared with the first threshold; if the first threshold is an acceptable maximum average disparity value, then the average of all disparity values in the first time series of disparity values can be compared to the first threshold.
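The two ways of applying the first threshold can be sketched as below. Comparing absolute values is an assumption of this sketch (the text does not say how negative differences are treated), and the function name and `mode` parameter are hypothetical.

```python
def within_threshold(diff_series, threshold, mode="max"):
    """Return True if the difference series is below the threshold.

    mode="max":  every individual difference must be below the threshold.
    mode="mean": the average difference must be below the threshold.
    Absolute values are used here as an assumption.
    """
    magnitudes = [abs(d) for d in diff_series]
    if mode == "max":
        return all(m < threshold for m in magnitudes)
    if mode == "mean":
        return sum(magnitudes) / len(magnitudes) < threshold
    raise ValueError("mode must be 'max' or 'mean'")


d1 = [0.2, 0.1, -1.0, 0.9, -0.2]
ok_by_max = within_threshold(d1, 0.5, mode="max")    # one point deviates by 1.0
ok_by_mean = within_threshold(d1, 0.5, mode="mean")  # mean magnitude is about 0.48
```

The same series can pass one criterion and fail the other, so the choice of criterion is part of specifying the prediction requirement.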
If the compared value is smaller than the first threshold, the first prediction meets expectations and the prediction can end.
If the compared value is equal to or greater than the first threshold, the first prediction does not yet meet expectations and needs to be corrected.
As described above, in the prediction scenarios to which the information processing method of the present application applies, particularly medium- and long-term prediction, prediction results can deviate substantially. The main reason is that data at all time points carry the same weight in the prediction regardless of how far they are from the current time, whereas in practice data at different time points have different predictive value and different importance for a specific prediction. For example, the closer data is to the current time, the greater its predictive value, and the further away, the lower; data from a normal period has higher predictive value than data from an abnormal period.
Therefore, when correcting the first prediction, the embodiments of the present application mainly consider how to adjust the weight that data at different time points in the first time series data carry in the prediction, so that the weight of each data point is no longer inconsistent with its actual predictive value.
In operation 130, the first time-series weight is a set of time-series data obtained by giving a corresponding weight value to each time point (corresponding point on the time axis shown in fig. 2) corresponding to the first time-series data.
The weight value corresponding to each time point can be used for fine tuning the data corresponding to the time point so as to adjust the weight of the data at the time point in prediction and the influence on the prediction result.
For example, typical data obtained under normal conditions, or recent data, represents the development pattern or recent trend of a thing or phenomenon better than abnormal data from an abnormal period or data from long ago. The weights of the time points corresponding to typical or recent data can therefore be set larger, to increase the weight of that data in the prediction and its influence on the prediction result, and thereby make the prediction result more accurate.
For abnormal data obtained under abnormal conditions, or outdated data from long ago, which cannot reflect the development pattern or the latest trend, the weight can be set according to expert experience, or according to a certain distribution and the distance from the current time, so as to amplify or reduce the magnitude of data change (or another indicator that highlights the trend) at different time points. This yields data that better conforms to the development pattern and reflects the latest trend.
Thus, the second time series data obtained by correcting the first time series data according to the first time series weight can better conform to the development law of the object and can better represent the data of the latest trend, such as the data shown by the solid line in fig. 3.
Thereafter, operation 140 performs a second prediction based on the second time series data, so that a more accurate second prediction result, such as the data shown by the dotted line in fig. 3, can be obtained. In essence, the second prediction has the same prediction target and basically the same prediction method as the first prediction, but because the data on which the prediction is based has been corrected, some parameters and configurations used in the prediction process change, which leads to a different prediction result. Referring to fig. 3, the difference between the corrected second time series data (shown by the solid line in fig. 3) and the second prediction result (shown by the dotted line in fig. 3) is smaller, which indicates that the second prediction result is more accurate than the first prediction result.
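Operations 110 to 140 can be put together in a short end-to-end sketch. It assumes a least-squares line as the predictor, a mean squared difference as the quantity compared with the first threshold, and a correction of the form "first prediction plus weighted residual"; all names, the sample data, and the weights are hypothetical.

```python
def fit_and_predict(values, n_points):
    """Fit a least-squares line to values and evaluate it at 0..n_points-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return [intercept + slope * t for t in range(n_points)]


def mean_squared_difference(actual, predicted):
    pairs = list(zip(actual, predicted))  # overlapping period only
    return sum((a - p) ** 2 for a, p in pairs) / len(pairs)


def two_pass_forecast(first_series, weights, threshold, horizon):
    n_points = len(first_series) + horizon
    # Operation 110: first prediction.
    first_prediction = fit_and_predict(first_series, n_points)
    # Operations 120 and 130: compare with the threshold.
    if mean_squared_difference(first_series, first_prediction) < threshold:
        return first_series, first_prediction
    # Operation 130: scale each residual by its weight and rebuild the data.
    d1 = [a - p for a, p in zip(first_series, first_prediction)]
    d2 = [d * w for d, w in zip(d1, weights)]
    second_series = [p + d for p, d in zip(first_prediction, d2)]
    # Operation 140: second prediction on the corrected data.
    return second_series, fit_and_predict(second_series, n_points)


first_series = [40.0, 52.0, 41.0, 55.0, 50.0, 57.0, 60.0, 63.0]
weights = [0.1, 0.2, 0.3, 0.5, 0.7, 0.8, 0.9, 1.0]  # older points weigh less

second_series, second_prediction = two_pass_forecast(
    first_series, weights, threshold=1.0, horizon=4
)
```

With a least-squares line and weights between 0 and 1, the second line cannot fit the corrected series worse than the first line fit the original series, which is consistent with the smaller gap described for fig. 3. Other predictors or weights above 1 carry no such guarantee.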
It should be noted that the embodiment shown in fig. 1 is only one of the most basic embodiments of the information processing method of the present application, and an implementer can further refine and extend it.
According to an embodiment of the present application, the method further includes: and correcting the first prediction result or the second prediction result according to the third information to obtain a corrected prediction result.
In this embodiment, the third information refers to foreseeable factors in the actual application scenario that may affect the actual value. For example, work plans that will use the storage device are foreseeable factors for its used capacity. If such work plans are available, the first prediction result or the second prediction result can be corrected according to them to obtain a more accurate prediction result.
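One possible way to fold such third information into a prediction is sketched below. The representation of a work plan as "a capacity change that takes effect from a given time index onward" is purely an assumption for illustration, as are the names and numbers.

```python
def apply_planned_changes(prediction, planned_changes):
    """Shift a prediction by known, planned capacity changes.

    planned_changes maps a time index to the extra used capacity that a work
    plan is expected to add from that index onward (hypothetical format).
    """
    adjusted = list(prediction)
    for start, delta in planned_changes.items():
        for i in range(start, len(adjusted)):
            adjusted[i] += delta
    return adjusted


prediction = [60.0, 64.0, 68.0, 72.0]
# A planned data migration is expected to add 10 TB from the third month on.
corrected = apply_planned_changes(prediction, {2: 10.0})
# corrected == [60.0, 64.0, 78.0, 82.0]
```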
According to an embodiment of the present application, before the first prediction is performed according to the first time-series data to obtain the first prediction result, the method further includes: and carrying out abnormal data detection and processing on the third time sequence data to obtain first time sequence data.
Wherein the third time series data refers to the original data actually collected or recorded, and some abnormal data, such as abnormal data which suddenly increases or decreases, may be collected due to some unexpected situation, such as equipment failure, occurring in the collection process. These data become outlier data when used for prediction and interfere with the prediction results.
Therefore, in the present embodiment, the abnormal data detection is performed on the raw data actually collected or recorded. And after the abnormal data is detected, performing corresponding repair processing on the abnormal data. For example, outliers are identified using anomaly detection techniques and outlier data corrections are made based on different conditions, including but not limited to:
1) the data is processed in sequence from right to left (from the most recent to the oldest);
2) if a value is an outlier lying outside x times the standard deviation, the original value is replaced with the closest value that lies within x times the standard deviation, where x is the acceptable multiple of the standard deviation used as the threshold.
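The two conditions above can be sketched as follows. The text leaves several details open, so this sketch makes explicit assumptions: the band is the series mean plus or minus x population standard deviations, "closest" is read as the nearest more recent value that is already inside the band, and the band edge is used when no such neighbour exists. The function name is hypothetical.

```python
import statistics


def repair_outliers(third_series, x=2.0):
    """Replace values outside mean +/- x * std, scanning newest to oldest."""
    mean = statistics.mean(third_series)
    std = statistics.pstdev(third_series)
    lower, upper = mean - x * std, mean + x * std

    cleaned = list(third_series)
    # Walk from the most recent point (right) to the oldest (left).
    for i in range(len(cleaned) - 1, -1, -1):
        if lower <= cleaned[i] <= upper:
            continue
        if i + 1 < len(cleaned):
            # The neighbour to the right was processed already, so it is in band.
            cleaned[i] = cleaned[i + 1]
        else:
            # No more recent neighbour: fall back to the nearest band edge.
            cleaned[i] = upper if cleaned[i] > upper else lower
    return cleaned


third_series = [10.0, 11.0, 50.0, 12.0, 13.0, 12.0, 11.0, 10.0]
first_series = repair_outliers(third_series, x=2.0)
# first_series == [10.0, 11.0, 12.0, 12.0, 13.0, 12.0, 11.0, 10.0]
```

Because the band is computed from the raw series, a single large spike also widens the band; a more robust statistic could be substituted if that matters.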
After the outlier correction, the first time series data used for the first prediction is valid and contains no meaningless or strongly deviating abnormal data. Valid first time series data is the basis of the first prediction result, so the first prediction result, and the second time series data derived from it, are more accurate, and the second prediction result based on the second time series data is more accurate in turn.
According to an embodiment of the present application, correcting the first time series data according to the first time series weight to obtain the second time series data includes: multiplying the first time series difference value by the first time series weight to obtain a second time series difference value; and determining the second time series data according to the first prediction result and the second time series difference value.
In this embodiment, the first time-series difference value is corrected mainly by the first time-series weight, and then corrected time-series data is obtained based on the first prediction result and the corrected time-series difference value.
By adopting the embodiment, the change amplitude of the data can be enhanced or weakened by amplifying or reducing the residual error of the data, and the trend embodied by the data change can be highlighted or lightened.
In addition, because only the residual is corrected, the corrected data remains grounded in the original first time series data. This ensures, as far as possible, that the data used for prediction is still reliable and does not deviate too much.
According to an embodiment of the present application, after multiplying the first time series difference value by the first time series weight to obtain the second time series difference value, the method further includes: normalizing the second time series difference value to obtain a normalized second time series difference value. Correspondingly, determining the second time series data according to the first prediction result and the second time series difference value includes: determining the second time series data according to the first prediction result and the normalized second time series difference value.
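The residual-based correction can be illustrated with a few numbers. The sketch assumes the residual d1 is "actual minus predicted" and that the second time series data is rebuilt by adding the weighted residual d2 back onto the first prediction; the text only says the data is determined "according to" these two quantities, so this is one plausible reading. All values are invented.

```python
def correct_series(first_prediction, d1, weights):
    """Scale each residual by its weight, then rebuild the series."""
    d2 = [d * w for d, w in zip(d1, weights)]
    second_series = [p + d for p, d in zip(first_prediction, d2)]
    return d2, second_series


first_prediction = [40.0, 44.0, 48.0, 52.0]
d1 = [4.0, -4.0, 2.0, -2.0]          # residuals of the recorded data
weights = [0.25, 0.5, 1.0, 1.5]      # below 1 damps a residual, above 1 amplifies it

d2, second_series = correct_series(first_prediction, d1, weights)
# d2            == [1.0, -2.0, 2.0, -3.0]
# second_series == [41.0, 42.0, 50.0, 49.0]
```

A weight of 1 leaves a point unchanged, and a weight of 0 would place it exactly on the first prediction, so the corrected data never loses its link to the original series.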
Normalization further processes the original data, for example by converting a dimensional expression into a dimensionless one to obtain a corresponding scalar, so that features of different dimensions are on the same order of magnitude. This reduces the negative effect of features with larger variance and makes the model more accurate.
Similarly, in some embodiments, normalization may be used to further process the original data.
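The application does not state which normalization is used. As one common possibility, a min-max normalization onto a chosen interval can be sketched as below; the function name `min_max_normalize` and its parameters `lo` and `hi` are assumptions made for illustration.

```python
def min_max_normalize(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the smallest maps to lo and the largest to hi.

    Assumption: min-max scaling; the application only says "normalization".
    """
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant sequence carries no spread; map every value to lo.
        return [lo for _ in values]
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]


print(min_max_normalize([2.0, 4.0, 6.0]))            # onto [0, 1]
print(min_max_normalize([2.0, 4.0, 6.0], 1.0, 3.0))  # onto [1, 3]
```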
According to an embodiment of the present application, determining a first time series weight includes: acquiring a first time series weight curve corresponding to the time of the first time series data according to preset distribution parameters; and carrying out normalization processing on the first time series weight curve to obtain a first time series weight.
The trend of a particular thing or phenomenon usually follows some pattern. If the pattern is known, a distribution function describing it can be used to determine the first time series weight, so that the corrected data better conforms to the known pattern and becomes smoother time series data.
If the pattern is unknown, a Beta distribution can be used, since it can model a variety of possible distributions by specifying different parameters.
According to an embodiment of the present application, the first time series weight curve includes a distribution curve with a larger correction magnitude for recent data and a smaller correction magnitude for long-term data.
This embodiment takes into account that recent data points lie closer to the current time, while long-term data points lie farther from it. Therefore, by specifying the distribution parameters, a distribution curve is generated whose correction magnitude is larger for recent data and smaller for long-term data. In this way, the weight of the recent data in the prediction process and its influence on the prediction result are amplified as much as possible, and those of the long-term data are reduced.
Therefore, the recent data can play a larger role in the prediction process relative to the long-term data, and the prediction is more accurate.
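As an illustration of how a Beta distribution can produce such a weight curve, the sketch below evaluates the Beta density over the time axis and divides by its maximum. The mapping of time index to the interval (0, 1) at bin midpoints and the division by the maximum are assumptions; the names `beta_pdf` and `beta_weight_curve` are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    coef = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return coef * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_weight_curve(n, a, b):
    """Weights for n time points, oldest first, scaled so the largest is 1.

    Assumption: time point i is mapped to the midpoint (i + 0.5) / n, which
    keeps x strictly inside (0, 1) where the density is always finite.
    """
    curve = [beta_pdf((i + 0.5) / n, a, b) for i in range(n)]
    peak = max(curve)
    return [c / peak for c in curve]


recent_heavy = beta_weight_curve(5, 4, 1)  # rises toward the most recent point
old_heavy = beta_weight_curve(5, 1, 4)     # mirror image, for comparison
print(recent_heavy)
print(old_heavy)
```

Swapping the two parameters mirrors the curve, which shows how the same family can express opposite weightings of recent and long-term data.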
According to an embodiment of the present application, after obtaining the first time series weight, the method further includes: and correcting the first time sequence weight according to the fourth information to obtain the corrected first time sequence weight.
Here, the fourth information mainly refers to expert experience: the data conditions at different time points are obtained through expert labeling or an integrated expert system, and the weight values at those time points are revised accordingly. For example, if a failure occurred in a certain time period, the time series data of that period can be corrected by manually labeling the data or modifying the weight values, so that the prediction result better suits the characteristics of the service scenario.
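One simple way to picture this expert correction is a table of overrides applied on top of the computed weights. The sketch below is only an assumption about how the fourth information might be represented (a mapping from time index to replacement weight); the name `apply_expert_overrides` is hypothetical.

```python
def apply_expert_overrides(weights, overrides):
    """Return a copy of weights with expert-specified values substituted.

    Assumption: overrides maps a time index to the weight an expert assigns,
    e.g. 0.0 for a period in which a known failure distorted the data.
    """
    corrected = list(weights)
    for index, value in overrides.items():
        if not 0 <= index < len(corrected):
            raise IndexError(f"override index {index} is outside the series")
        corrected[index] = value
    return corrected


weights = [0.1, 0.4, 0.9]
# Hypothetical case: the middle time point fell inside a failure window.
revised = apply_expert_overrides(weights, {1: 0.0})
print(revised)
```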
Fig. 4 shows the implementation flow of an information processing method according to another embodiment of the present application. This embodiment predicts the capacity usage of storage device A over one year in order to plan the capacity of storage device A, and specifically includes:
Step 4010, obtaining time series data;
Assume that the obtained time series data are the used capacity of storage device A from May 2021 to September 2021.
Step 4020, calculating the difference values of adjacent data points in the time series;
Step 4030, calculating the distribution of the difference values to obtain a strategy or value for replacing outlier data;
Step 4040, identifying outliers of the difference values using an anomaly detection algorithm;
Step 4050, identifying effective outliers according to data distribution characteristics;
Step 4060, combining the processing results of step 4040 and step 4050 to generate corrected time series data S;
Steps 4020 to 4060 above perform abnormal data detection and processing on the time series data obtained in step 4010 to obtain valid time series data.
The resulting valid time series is assumed to be the data shown by the solid line in Fig. 2.
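The application does not name the anomaly detection algorithm or the replacement strategy. As a hedged sketch of steps 4020 to 4060, the code below assumes a median absolute deviation rule on the adjacent differences and replaces isolated spikes by the average of their neighbors; the function name `clean_series`, the threshold of 3.5 and the sample values are all assumptions.

```python
import statistics


def clean_series(series, threshold=3.5):
    """Detect spikes through adjacent differences and smooth them out.

    Assumptions: outliers are differences whose distance from the median
    difference exceeds threshold times the median absolute deviation, and a
    spike is replaced by the mean of its two neighbors.
    """
    if len(series) < 3:
        return list(series)
    # Step 4020: differences of adjacent data points.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Step 4030: describe the distribution of the differences.
    median = statistics.median(diffs)
    mad = statistics.median([abs(d - median) for d in diffs])
    if mad == 0:
        return list(series)
    # Step 4040: flag outlier differences.
    is_outlier = [abs(d - median) / mad > threshold for d in diffs]
    cleaned = list(series)
    for i in range(len(diffs) - 1):
        # Step 4050: a spike appears as two consecutive outlier differences
        # of opposite sign; a lone outlier difference is left untouched.
        if is_outlier[i] and is_outlier[i + 1] and diffs[i] * diffs[i + 1] < 0:
            # Step 4060: replace the spike with the mean of its neighbors.
            cleaned[i + 1] = (series[i] + series[i + 2]) / 2
    return cleaned


# Hypothetical used-capacity readings with one spike at index 4.
raw = [100, 102, 105, 107, 150, 112, 114, 117]
S = clean_series(raw)
print(S)
```

Leaving a lone outlier difference untouched means a genuine level shift, such as a one-off bulk write, is kept in the data rather than erased.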
Step 4070, generating a prediction model m1, e.g., a linear regression model;
Step 4080, performing prediction according to S and m1 to obtain a prediction result n1 (the data shown by the dotted line in Fig. 2), and calculating the residual d1 (shown in Fig. 5) between n1 and S;
Step 4090, judging whether the residual standard deviation is less than a first threshold; if yes, returning the prediction result; if not, continuing with step 4100;
Step 4100, obtaining Beta distribution parameters according to empirical values or historical parameters;
In this embodiment, the usage of the storage device follows no regular pattern, and the capacity requirement changes differently in different periods and usage scenarios, so a Beta distribution is used. By specifying different Beta distribution parameters, a distribution curve can be obtained that better fits the actual application scenario and better reflects the differing predictive value of data at different time points. For example, setting the Beta distribution parameters to Beta(4,1) yields a distribution curve with a larger correction magnitude for recent data and a smaller correction magnitude for long-term data.
Step 4110, obtaining the curve over the range of the time series data points, e.g., from May 2021 to September 2021;
Step 4120, normalizing the curve to the interval between a designated min and max to obtain a sequence b;
Here, min and max are the minimum value and the maximum value in the recent data; normalizing to this designated interval makes the long-term data conform as far as possible to the more recent development trend and distribution characteristics.
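For the Beta(4,1) example, the density on [0, 1] has the closed form 4x^3, so steps 4100 to 4120 can be sketched without a statistics library. How min and max are derived from the recent data is not spelled out here, so the sketch simply takes them as caller-supplied bounds `w_min` and `w_max`; these names, the function name `weight_sequence` and the midpoint sampling are assumptions.

```python
def weight_sequence(n, w_min, w_max):
    """Sequence b: the Beta(4, 1) curve over n time points, rescaled to [w_min, w_max].

    Assumptions: time point i (oldest first) is sampled at (i + 0.5) / n, and
    w_min / w_max are the designated bounds passed in by the caller.
    """
    if n < 2:
        return [w_max] * n
    # Beta(4, 1) has density 4 * x**3 on [0, 1], which rises monotonically.
    curve = [4 * ((i + 0.5) / n) ** 3 for i in range(n)]
    c_min, c_max = curve[0], curve[-1]
    scale = (w_max - w_min) / (c_max - c_min)
    return [w_min + (c - c_min) * scale for c in curve]


# Five monthly points, e.g. May to September, with hypothetical bounds.
b = weight_sequence(5, 0.2, 1.0)
print(b)
```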
Step 4130, computing the sequence d2 = d1 × b and normalizing it to obtain the data shown in Fig. 6;
As can be seen from a comparison of Fig. 6 and Fig. 5, the residual correction amplitude of the recent data becomes larger and that of the long-term data becomes smaller. The weight of the recent data in the prediction process and its influence on the prediction result are thus amplified as much as possible, while those of the long-term data are reduced, so that the prediction is more accurate and better matches the time characteristics and service characteristics of the data.
Step 4140, obtaining time series data d2 + n1, i.e., time series data carrying time series weight information (the data shown by the solid line in Fig. 3);
Step 4150, generating a prediction model m2 and generating a prediction result n2 (the data shown by the dashed line in Fig. 3).
Linear regression is also used to generate the prediction model m2, in the same way as for the prediction model m1; however, because the data used to generate the model has changed, some parameters of the model change and the prediction results differ.
As can be seen by comparing Fig. 2 with Fig. 3, the difference between the second time series data and the second prediction result is smaller than the difference between the first time series data and the first prediction result, which also indicates that the second prediction is more accurate than the first prediction.
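The flow from step 4070 to step 4150 can be sketched end to end as below. This is an illustration under stated assumptions rather than the implementation of the application: the model is an ordinary least squares line over the point indices, the residual is data minus prediction, the normalization of d2 is omitted, and the names `fit_line` and `predict` as well as the sample data and weights are hypothetical.

```python
import statistics


def fit_line(values):
    """Least squares line through (0, values[0]), (1, values[1]), ...; returns fitted values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * x for x in range(n)]


def predict(series, weights, threshold):
    """Return (prediction, residual standard deviation, whether a second model was fitted)."""
    n1 = fit_line(series)                         # steps 4070-4080: model m1, result n1
    d1 = [s - p for s, p in zip(series, n1)]      # residual d1
    std1 = statistics.pstdev(d1)
    if std1 < threshold:                          # step 4090: first result is good enough
        return n1, std1, False
    d2 = [d * w for d, w in zip(d1, weights)]     # step 4130: d2 = d1 x b (not normalized here)
    s2 = [p + d for p, d in zip(n1, d2)]          # step 4140: d2 + n1
    n2 = fit_line(s2)                             # step 4150: model m2, result n2
    std2 = statistics.pstdev([s - p for s, p in zip(s2, n2)])
    return n2, std2, True


series = [10.0, 14.0, 11.0, 17.0, 15.0, 21.0]    # hypothetical capacity readings
weights = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0]         # hypothetical sequence b, oldest first

_, std_first, _ = predict(series, weights, float("inf"))   # force the first model only
n2, std_second, refitted = predict(series, weights, 0.5)
print(std_first, std_second, refitted)
```

In this sketch, when every weight lies between 0 and 1 the weighted residuals are no larger than the original ones, so the residual spread of the second fit cannot exceed that of the first.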
Further, an embodiment of the present application also provides an information processing apparatus. As shown in Fig. 7, the apparatus 70 includes: a first prediction module 701, configured to perform a first prediction according to the first time series data to obtain a first prediction result; a difference value calculation module 702, configured to determine a difference between the first time series data and the first prediction result according to a first difference function, so as to obtain a first time series difference value; a data correction module 703, configured to determine whether the first time series difference value is smaller than a first threshold, and if not, determine a first time series weight and correct the first time series data according to the first time series weight to obtain second time series data; and a second prediction module 704, configured to perform a second prediction according to the second time series data to obtain a second prediction result.
According to an embodiment of the present application, the apparatus 70 further includes: and the prediction correction module is used for correcting the first prediction result or the second prediction result according to the third information to obtain a corrected prediction result.
According to an embodiment of the present application, the apparatus 70 further includes: and the abnormal data detection and processing module is used for detecting and processing the abnormal data of the third time sequence data to obtain first time sequence data.
According to an embodiment of the present application, the data modification module 703 includes: the second time series difference value calculation submodule is used for carrying out product operation on the first time series difference value and the first time series weight to obtain a second time series difference value; and the second time series data determining submodule is used for determining second time series data according to the first prediction result and the second time series difference value.
According to an embodiment of the present application, the second time-series data determining sub-module further includes: the normalization processing sub-module is used for performing normalization processing on the second time series difference value to obtain a normalized second time series difference value; correspondingly, the second time series data determination submodule is specifically configured to determine the second time series data according to the first prediction result and the normalized second time series difference value.
According to an embodiment of the present application, the data modification module 703 is specifically configured to perform a product operation on the first time series weight and the first time series data to obtain second time series data.
According to an embodiment of the present application, the data modification module 703 includes: and the time series weight determining submodule is used for determining the first time series weight.
According to an embodiment of the present application, the time-series weight determining sub-module includes: a first time series weight curve acquisition unit, configured to acquire a first time series weight curve corresponding to time of the first time series data according to a preset distribution parameter; and the first time series weight normalization processing unit is used for performing normalization processing on the first time series weight curve to obtain a first time series weight.
According to an embodiment of the present application, the time-series weight determining sub-module further includes: and the first time series weight correcting unit is used for correcting the first time series weight according to the fourth information to obtain a corrected first time series weight.
According to a third aspect of embodiments herein, there is provided a computer-readable storage medium comprising a set of computer-executable instructions for performing any of the information processing methods described above when the instructions are executed.
Here, it should be noted that the above descriptions of the information processing apparatus embodiment and the computer-readable storage medium embodiment are similar to the description of the foregoing method embodiments and have similar beneficial effects, so they are not repeated here. For technical details not disclosed in the descriptions of the information processing apparatus embodiment and the computer-readable storage medium embodiment, please refer to the description of the foregoing method embodiments of the present application.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of a unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another device, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media capable of storing program codes, such as a removable storage medium, a Read Only Memory (ROM), a magnetic disk, and an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof that contribute to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a removable storage medium, a ROM, a magnetic disk, an optical disk, or the like, which can store the program code.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An information processing method, the method comprising:
performing first prediction according to the first time series data to obtain a first prediction result;
determining a difference between the first time series data and the first prediction result according to a first difference function to obtain a first time series difference value;
judging whether the difference value of the first time series is smaller than a first threshold value, if not, determining the weight of the first time series, and correcting the first time series data according to the weight of the first time series to obtain second time series data;
and performing second prediction according to the second time series data to obtain a second prediction result.
2. The method of claim 1, further comprising:
and correcting the first prediction result or the second prediction result according to the third information to obtain a corrected prediction result.
3. The method of claim 1, prior to said first predicting from the first time-series data resulting in a first prediction result, the method further comprising:
and carrying out abnormal data detection and processing on the third time sequence data to obtain the first time sequence data.
4. The method of claim 1, wherein modifying the first time series data according to the first time series weight to obtain second time series data comprises:
performing product operation on the first time sequence difference value and the first time sequence weight to obtain a second time sequence difference value;
and determining second time series data according to the first prediction result and the second time series difference value.
5. The method of claim 4, after said multiplying said first time series difference value by said first time series weight to obtain a second time series difference value, further comprising:
normalizing the second time series difference value to obtain a normalized second time series difference value;
correspondingly, determining second time series data according to the first prediction result and the second time series difference value comprises:
and determining second time series data according to the first prediction result and the normalized second time series difference value.
6. The method of claim 1, the determining first time series weights comprising:
acquiring a first time series weight curve corresponding to the time of the first time series data according to preset distribution parameters;
and carrying out normalization processing on the first time series weight curve to obtain a first time series weight.
7. The method of claim 6, the first time series weight curve comprising a distribution curve having a larger correction magnitude for recent data and a smaller correction magnitude for long-term data.
8. The method of claim 1, after said deriving first time series weights, further comprising:
and correcting the first time sequence weight according to third information to obtain a corrected first time sequence weight.
9. An information processing apparatus, the apparatus comprising:
the first prediction module is used for performing first prediction according to the first time series data to obtain a first prediction result;
a difference value calculation module, configured to determine a difference between the first time series data and the first prediction result according to a first difference function, so as to obtain a first time series difference value;
the data correction module is used for judging whether the difference value of the first time sequence is smaller than a first threshold value or not, if not, determining the weight of the first time sequence, and correcting the first time sequence data according to the weight of the first time sequence to obtain second time sequence data;
and the second prediction module is used for performing second prediction according to the second time sequence data to obtain a second prediction result.
10. A computer-readable storage medium comprising a set of computer-executable instructions that, when executed, perform the method of any of claims 1-8.
CN202111067295.0A 2021-09-13 2021-09-13 Information processing method and device Pending CN113901294A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111067295.0A CN113901294A (en) 2021-09-13 2021-09-13 Information processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111067295.0A CN113901294A (en) 2021-09-13 2021-09-13 Information processing method and device

Publications (1)

Publication Number Publication Date
CN113901294A true CN113901294A (en) 2022-01-07

Family

ID=79027834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111067295.0A Pending CN113901294A (en) 2021-09-13 2021-09-13 Information processing method and device

Country Status (1)

Country Link
CN (1) CN113901294A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination