CN117557071A - Sparse time sequence prediction method, device, storage medium and application - Google Patents
Sparse time sequence prediction method, device, storage medium and application
- Publication number
- CN117557071A (application CN202410039467.0A)
- Authority
- CN
- China
- Prior art keywords
- consumption
- data
- time
- stage
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06315—Needs-based resource requirements planning or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/087—Inventory or stock management, e.g. order filling, procurement or balancing against orders
Abstract
The invention relates to the technical field of computer applications and discloses a method, device, storage medium and application for predicting sparse time sequences based on business features. The method comprises the following steps: collecting and processing data; constructing a two-stage prediction model. The first stage is an integrated classification model that predicts, along the warehouse and SKU dimensions, whether consumption will occur in the future; there are two classification results, one in which consumption occurs in the future and one in which no consumption occurs. The second stage is a time-series prediction model in which a TimesNet model is trained with an asymmetric loss function to predict the future consumption amount. Data classified by the first stage as having consumption enter the second stage, where the consumption amount is predicted; for the portion the first stage predicts as having no consumption, the predicted value is directly assigned zero. The method not only saves computation time but also takes business characteristics into account, improving the accuracy of sparse demand prediction.
Description
Technical Field
The invention relates to the technical field of computer applications, and in particular to a business-feature-based sparse time sequence prediction method, device, storage medium and application.
Background
Currently, demand forecasting in supply chains commonly uses traditional forecasting methods and machine-learning-based methods. Traditional methods include weighted averages, exponential smoothing, Holt-Winters forecasting, autoregressive integrated moving average (ARIMA) forecasting, etc.; common machine learning methods include long short-term memory (LSTM) networks, convolutional neural networks (CNN), decision trees, random forests, etc.
In the field of after-market service for consumer electronics, spare-part demand tends to be intermittent and sparse. The following schemes are generally employed to predict sparse demand for consumer electronics after-market parts. First, statistical methods based on historical data analyze past spare-part demand and apply techniques such as time-series analysis, seasonal decomposition, and trend analysis to predict future spare-part demand. Second, machine learning algorithms such as decision trees, random forests, and neural networks are used to build prediction models that forecast spare-part demand more accurately.
However, these methods have problems and pain points. Statistical methods based on historical data suit relatively stable demand with strong regularity, but the consumer electronics market changes rapidly, and traditional statistical forecasting struggles to cope with emergencies, new product releases, and similar situations. Machine learning prediction models require tuning many parameters and hyperparameters, consuming considerable time and resources for model selection and tuning. Given capital-flow constraints, the intermittent nature of spare-part demand and the diversity of warehouse and spare-part combinations in the supply chain mean that existing models often predict poorly. How to build a suitable model is therefore a challenging task.
Disclosure of Invention
Aiming at the problems in the prior art of low prediction accuracy or excessive prediction redundancy for after-market accessory demand of consumer electronic products, the invention provides a business-feature-based sparse time sequence prediction method that not only effectively addresses the problem of predicting sparse demand but also improves prediction accuracy.
The method for predicting a sparse time sequence based on business features according to the invention comprises the following steps:
(1) Collecting data, wherein the data comprise multiple groups of product data, each group including a data ID, an accessory model SKU, a warehouse, the SKU's time to market, and the historical consumption data of each ID;
(2) Processing data, comprising:
data interception, namely intercepting historical consumption data of a certain time length according to the demand time point to be predicted;
data aggregation, namely aggregating the consumption data by time granularity on the basis of the intercepted data;
feature construction, namely constructing the following features on the basis of the aggregated data: time to market, number of consumptions, maximum consumption, average demand interval, square of the demand coefficient of variation, last interval, and minimum interval; the time to market is the length of time from the launch date to the interception time point; the number of consumptions is the number of periods with non-zero consumption; the average demand interval is the ratio of the time to market to the number of consumptions; the square of the demand coefficient of variation is calculated as:
CV² = (σ/μ)²
where CV² represents the square of the demand coefficient of variation, σ represents the standard deviation of consumption, and μ is the mean consumption;
the last interval is the length of time from the most recent non-zero consumption to the interception time point; the minimum interval is the minimum gap between two consecutive non-zero consumptions in each ID's historical consumption data;
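The feature-construction step above can be sketched in Python. This is an illustrative implementation only (function and key names are hypothetical); since the source does not state whether σ and μ are computed over all periods or only the non-zero demands, this sketch uses the non-zero demands, the usual convention for intermittent-demand CV²:

```python
from statistics import mean, pstdev

def build_features(consumption, time_to_market):
    """Build the sparse-demand features described above for a single ID.

    consumption:    aggregated consumption per period (list of numbers)
    time_to_market: number of periods from launch to the interception point
    """
    nonzero_idx = [i for i, c in enumerate(consumption) if c != 0]
    nz = [consumption[i] for i in nonzero_idx]
    n_consumed = len(nz)                          # number of non-zero periods
    mu = mean(nz) if nz else 0.0
    sigma = pstdev(nz) if len(nz) > 1 else 0.0
    gaps = [b - a for a, b in zip(nonzero_idx, nonzero_idx[1:])]
    return {
        "time_to_market": time_to_market,
        "n_consumed": n_consumed,
        "max_consumption": max(consumption) if consumption else 0,
        # average demand interval = time to market / number of consumptions
        "avg_demand_interval": time_to_market / n_consumed if n_consumed else float("inf"),
        "cv2": (sigma / mu) ** 2 if mu else 0.0,  # CV^2 = (sigma / mu)^2
        # periods since the most recent non-zero consumption
        "last_interval": len(consumption) - 1 - nonzero_idx[-1] if nonzero_idx else time_to_market,
        # smallest gap between two consecutive non-zero consumptions
        "min_interval": min(gaps) if gaps else None,
    }
```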
(3) Constructing a two-stage prediction model;
the first stage is an integrated classification model, and whether consumption occurs in the future is predicted according to the dimensions of a warehouse and a SKU, and classification results are two types, wherein one type is consumption occurring in the future, and the other type is consumption not occurring; the integrated classification model comprises a LightGBM classifier, a boost & Bayesian classifier, a SKU & warehouse classifier and a weighted average classifier, which are constructed to respectively predict whether consumption occurs; synthesizing classifier prediction results, wherein when the number of classifiers predicted to be consumed exceeds a set threshold, the integrated classification model prediction results are consumed, otherwise, the prediction results are not consumed;
the second stage is a time sequence prediction model, and an asymmetric loss function is adopted to train a TimesNet model to predict the future consumption;
and classifying the predicted result into data with consumption in the first stage, and entering a second stage to predict the consumption, wherein the first stage predicts the part with no consumption, and the predicted value is directly assigned to zero.
Preferably, the data processing further includes data cleansing: before data interception, the IDs that have never had any consumption in their history are removed from the collected data.
Preferably, the LightGBM classifier includes:
selecting the time point to be predicted and creating a test set via the data interception, data aggregation, and feature construction described above;
selecting several groups of time points, performing data interception, data aggregation, and feature construction for each, and merging the data to create a training set;
selecting model-input features, including the SKU, warehouse, time to market, number of consumptions, maximum consumption, average demand interval, square of the demand coefficient of variation, last interval, minimum interval, and the consumption data of the most recent time points;
defining the optimization objective, which adopts a custom asymmetric loss function of the following form:
where loss1 is the custom asymmetric loss function, a and b are hyperparameters, y is the true value, and ŷ is the predicted value;
training the LightGBM model and performing feature selection, then predicting on the test set after model tuning;
setting a quantile as the threshold: when the predicted value is greater than the quantile, the classification result of the LightGBM classifier is that consumption occurs; otherwise, no consumption occurs.
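The translation does not reproduce the exact form of loss1, so the following sketch assumes a common asymmetric choice, a squared error weighted by a when over-predicting and by b when under-predicting, written as the (gradient, hessian) callable that LightGBM's scikit-learn API accepts for a custom objective:

```python
def make_asymmetric_objective(a, b):
    """Return a custom objective for an assumed asymmetric squared error:
        loss1 = a * (yhat - y)^2  if yhat > y  (over-prediction)
                b * (yhat - y)^2  otherwise   (under-prediction)
    The callable returns per-sample gradients and hessians with respect to
    yhat, as LightGBM's scikit-learn interface expects."""
    def objective(y_true, y_pred):
        grads, hesses = [], []
        for y, yh in zip(y_true, y_pred):
            w = a if yh > y else b
            grads.append(2.0 * w * (yh - y))   # d loss1 / d yhat
            hesses.append(2.0 * w)             # d^2 loss1 / d yhat^2
        return grads, hesses
    return objective
```

Under that assumption, the objective would be passed as, e.g., `lgb.LGBMRegressor(objective=make_asymmetric_objective(a, b))`, and the chosen quantile of the resulting predictions serves as the classification threshold.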
Preferably, the bootstrap & Bayesian classifier includes:
computing, for each ID via bootstrap sampling, the statistics P(X|I) and P(I), where X denotes the event that consumption occurs, I denotes the last interval, P(X|I) denotes the probability that consumption occurs given time interval I, and P(I) denotes the probability that time interval I occurs;
calculating the probability of consumption P(X) for each ID, which is approximately equal to the ratio of the number of consumptions to the time to market;
according to Bayes' theorem, calculating for each ID the probability that the interval is I given that event X occurs: P(I|X) = P(X|I)·P(I)/P(X);
setting a quantile as the threshold: when P(I|X) exceeds this quantile, the classification result of the bootstrap & Bayesian classifier is that consumption occurs; otherwise, no consumption occurs.
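A minimal sketch of the bootstrap & Bayesian step (function and data names hypothetical): resample (interval, consumed) observations to estimate P(X|I) and P(I), approximate P(X) by the consumption count over the time to market, and combine them with Bayes' theorem:

```python
import random

def posterior_interval_given_x(pairs, interval, p_x, n_boot=1000, seed=0):
    """pairs: (gap_since_last_consumption, consumed_flag) observations for
    one ID. Bootstrap-resamples them to estimate P(I) and P(X|I), then
    applies Bayes' theorem: P(I|X) = P(X|I) * P(I) / P(X)."""
    rng = random.Random(seed)
    p_i_samples, p_x_given_i_samples = [], []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]   # resample with replacement
        hits = [consumed for gap, consumed in sample if gap == interval]
        p_i_samples.append(len(hits) / len(sample))   # estimate of P(I)
        p_x_given_i_samples.append(sum(hits) / len(hits) if hits else 0.0)
    p_i = sum(p_i_samples) / n_boot
    p_x_given_i = sum(p_x_given_i_samples) / n_boot
    return p_x_given_i * p_i / p_x
```

The resulting posterior is then compared against the quantile threshold to decide consumption versus no consumption.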
Preferably, the SKU & warehouse classifier includes:
counting the consumption times of each SKU and the number of distributed warehouses;
setting thresholds and screening out a SKU list whose number of consumptions and number of warehouses both exceed the thresholds;
when the SKU corresponding to an ID is included in the SKU list, the classification result of the SKU & warehouse classifier is that consumption occurs; otherwise, no consumption occurs.
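A sketch of the SKU & warehouse screening (record layout and names hypothetical):

```python
from collections import defaultdict

def build_sku_list(records, min_consumptions, min_warehouses):
    """records: (sku, warehouse, consumptions) tuples. Screens out the SKUs
    whose total consumptions and number of distinct warehouses both exceed
    the given thresholds."""
    totals = defaultdict(int)
    warehouses = defaultdict(set)
    for sku, wh, n in records:
        totals[sku] += n
        warehouses[sku].add(wh)
    return {s for s in totals
            if totals[s] > min_consumptions and len(warehouses[s]) > min_warehouses}
```

An ID is then classified as "consumption occurs" when its SKU is in the returned list.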
Preferably, the weighted average classifier includes:
predicting future consumption with a weighted average method based on the historical consumption data; when the predicted result is greater than 0, the classification result of the weighted average classifier is that consumption occurs; otherwise, no consumption occurs.
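A sketch of the weighted average classifier; the linearly increasing recency weights are an assumption, since the source only says "weighted average method":

```python
def weighted_average_classify(history, weights=None):
    """Forecast future consumption as a weighted average of history and
    classify 'consumption occurs' when the forecast is greater than 0.
    Default weights grow linearly toward the most recent period (assumption)."""
    if weights is None:
        weights = list(range(1, len(history) + 1))
    forecast = sum(w * x for w, x in zip(weights, history)) / sum(weights)
    return forecast, forecast > 0
```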
Preferably, in the second stage of step (3), the time-series model is the TimesNet model; see Wu, Haixu, et al. "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis." arXiv preprint arXiv:2210.02186 (2022).
The asymmetric loss function is of the form:
where loss is the asymmetric loss function, a and b are hyperparameters, and N is the number of IDs.
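Since the formula itself is not reproduced in the translation, the following is a hypothetical instance of such a loss, an asymmetric squared error averaged over the N IDs with hyperparameters a and b:

```python
def asymmetric_loss(y_true, y_pred, a=1.0, b=2.0):
    """Hypothetical form (the source omits the formula): over-predictions
    are weighted by a and under-predictions by b, averaged over the N IDs.
    Setting b > a penalizes under-stocking more than over-stocking."""
    total = 0.0
    for y, yh in zip(y_true, y_pred):
        w = a if yh > y else b
        total += w * (yh - y) ** 2
    return total / len(y_true)
```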
Another object of the invention is to provide an application of the above method: it can be applied to predicting demand for after-market accessories of consumer electronic products and stocking spare parts according to the prediction results. Future consumption can be predicted more accurately, effectively improving the accuracy of accessory supply and saving resources.
It is also an object of the present invention to provide an electronic device comprising:
a processor and a memory;
the processor is configured to execute the steps of any of the methods described above by calling a program or instructions stored in the memory.
It is also an object of the present invention to provide a computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of any of the methods described above.
The invention has the positive progress effects that:
firstly, a two-stage prediction method is adopted, wherein the first stage is an integrated classification model for predicting whether consumption occurs in the future; the second stage is a time sequence prediction model, and predicts the future consumption. According to the two-stage sparse time sequence prediction method based on the business characteristics, calculation power is saved, only the data which are predicted to be consumed by the first-stage integrated classification model enter the second stage to predict the time sequence, and the data which are predicted to be not consumed in the future by the first stage are not processed, so that calculation time is saved.
Second, in the integrated classification model, the classifiers do not directly use predicted values as classification results; instead, a quantile is set as a threshold, and when the predicted value is greater than the threshold the classification result is consumption, otherwise no consumption. This greatly reduces the complexity of tuning model parameters and enhances the generalization performance of the model.
Moreover, business characteristics and data characteristics are considered together, improving the accuracy of sparse demand prediction: the first-stage model considers features such as time to market, SKU, and warehouse, while the second-stage model considers the characteristics of the time-series data; the combination of business features and data features improves prediction accuracy.
Finally, some of the classifiers in the integrated classification model, as well as the time-series prediction model, adopt an asymmetric loss function, giving the model flexibility so that it can be adjusted according to decision-makers' needs.
Drawings
FIG. 1 is a flow chart of a method for predicting a sparse time sequence based on business features according to the present invention;
FIG. 2 is a diagram illustrating an example of training-set and test-set data interception for constructing the LightGBM classifier of the present invention.
Description of the embodiments
The invention is further described below with reference to specific examples and figures. It should be understood that the following examples are illustrative of the present invention and are not intended to limit the scope of the present invention.
Examples
The method for predicting the sparse time sequence based on the service features, referring to fig. 1, comprises the following steps:
(1) Collecting data: the data comprise multiple groups of product data, each group including a data ID, an accessory model SKU, a warehouse, the SKU's time to market, and the historical consumption data of each ID. The data format is not limited; Table 1 shows a preferred data format.
(2) Processing data, comprising:
data cleansing, namely eliminating from the collected data, before data interception, the IDs that have never had any consumption in their history;
data interception, namely intercepting historical consumption data of a certain time length according to the demand time point to be predicted;
data aggregation, namely aggregating the consumption data by time granularity on the basis of the intercepted data; for example, with 7 days as the time granularity, the consumption data are summed every 7 days to form the aggregated consumption data;
feature construction, namely constructing the following features on the basis of the aggregated data: time to market, number of consumptions, maximum consumption, average demand interval, square of the demand coefficient of variation, last interval, and minimum interval; the time to market is the length of time from the launch date to the interception time point; the number of consumptions is the number of periods with non-zero consumption; the average demand interval is the ratio of the time to market to the number of consumptions; the square of the demand coefficient of variation is calculated as:
CV² = (σ/μ)²
where CV² represents the square of the demand coefficient of variation, σ represents the standard deviation of consumption, and μ is the mean consumption;
the last interval is the length of time from the most recent non-zero consumption to the interception time point; the minimum interval is the minimum gap between two consecutive non-zero consumptions in each ID's historical consumption data;
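The aggregation step above, e.g. the 7-day granularity example, can be sketched as:

```python
def aggregate(daily, granularity=7):
    """Sum raw daily consumption into buckets of `granularity` days,
    producing the aggregated series that the features are built on."""
    return [sum(daily[i:i + granularity]) for i in range(0, len(daily), granularity)]
```

A trailing partial bucket is kept as-is here; whether to keep or drop it is a design choice the source does not specify.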
(3) Constructing a two-stage prediction model;
the first stage is an integrated classification model, and whether consumption occurs in the future is predicted according to the dimensions of a warehouse and a SKU, and classification results are two types, wherein one type is consumption occurring in the future, and the other type is consumption not occurring; the integrated classification model comprises a LightGBM classifier, a boost & Bayesian classifier, a SKU & warehouse classifier and a weighted average classifier, which are constructed to respectively predict whether consumption occurs; synthesizing classifier prediction results, wherein when the number of classifiers predicted to be consumed exceeds a set threshold, the integrated classification model prediction results are consumed, otherwise, the prediction results are not consumed;
the second stage is a time sequence prediction model, and an asymmetric loss function is adopted to train a TimesNet model to predict the future consumption;
and classifying the predicted result into data with consumption in the first stage, and entering a second stage to predict the consumption, wherein the first stage predicts the part with no consumption, and the predicted value is directly assigned to zero.
Preferably, the LightGBM classifier includes:
selecting the time point to be predicted and creating a test set via the data interception, data aggregation, and feature construction described above; the data interception process is illustrated in FIG. 2.
Selecting several groups of time points, performing data interception, data aggregation, and feature construction for each, and merging the data to create a training set;
selecting model-input features, including the SKU, warehouse, time to market, number of consumptions, maximum consumption, average demand interval, square of the demand coefficient of variation, last interval, minimum interval, and the consumption data of the most recent time points; for example, the consumption data of the last 10 time points are selected.
Defining the optimization objective, which adopts a custom asymmetric loss function loss1 of the following form:
where a and b are hyperparameters, y is the true value, and ŷ is the predicted value;
training the LightGBM model and performing feature selection, then predicting on the test set after model tuning;
setting a quantile as the threshold: when the predicted value is greater than the quantile, the classification result of the LightGBM classifier is that consumption occurs; otherwise, no consumption occurs.
For example, the learning rate of the LightGBM model is set to 0.0001, the number of weak classifiers to 300, the maximum number of leaf nodes per tree to 32, and the loss function to the custom asymmetric loss function loss1 defined above.
taking 60% quantiles as a threshold, and when the predicted value of the LightGBM is larger than the threshold, the predicted result of the LightGBM classifier is that consumption occurs.
Preferably, the bootstrap & Bayesian classifier includes:
computing, for each ID via bootstrap sampling, the statistics P(X|I) and P(I), where X denotes the event that consumption occurs, I denotes the last interval, P(X|I) denotes the probability that consumption occurs given time interval I, and P(I) denotes the probability that time interval I occurs;
calculating the probability of consumption P(X) for each ID, which is approximately equal to the ratio of the number of consumptions to the time to market;
according to Bayes' theorem, calculating for each ID the probability that the interval is I given that event X occurs: P(I|X) = P(X|I)·P(I)/P(X);
setting a quantile as the threshold: when P(I|X) exceeds this quantile, the classification result of the bootstrap & Bayesian classifier is that consumption occurs; otherwise, no consumption occurs.
Preferably, the SKU & warehouse classifier includes:
counting the consumption times of each SKU and the number of distributed warehouses;
setting thresholds and screening out a SKU list whose number of consumptions and number of warehouses both exceed the thresholds;
when the SKU corresponding to an ID is included in the SKU list, the classification result of the SKU & warehouse classifier is that consumption occurs; otherwise, no consumption occurs.
Preferably, the weighted average classifier includes:
predicting future consumption with a weighted average method based on the historical consumption data; when the predicted result is greater than 0, the classification result of the weighted average classifier is that consumption occurs; otherwise, no consumption occurs.
Preferably, in the second stage of the step (3), the time series model adopts a TimesNet time series model;
the asymmetric loss function is in the form of:
where a and b are hyperparameters, and N is the number of IDs.
The above method may be applied to predicting after-market accessory demand, including but not limited to consumer electronic products, and stocking spare parts based on the prediction results. Future consumption can be predicted more accurately, effectively improving the accuracy of accessory supply and saving resources.
The embodiment of the invention also provides electronic equipment, which comprises one or more processors and a memory.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform the desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by a processor to perform the prediction method of any of the embodiments of the present invention described above and/or other desired functions. Various content such as initial parameters, thresholds, etc. may also be stored in the computer-readable storage medium.
In some examples, the electronic device may further include input devices and output devices, interconnected by a bus system and/or other forms of connection mechanisms. The input device may include, for example, a keyboard, a mouse, and the like. The output device can output various information to the outside, including prediction results, early-warning prompt information, and the like, and may include, for example, a display, speakers, a printer, as well as a communication network and the remote output devices connected thereto.
In addition, the electronic device may include any other suitable components depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps of the prediction method as set forth in any of the embodiments of the invention.
The computer program product may write program code for performing operations of embodiments of the present invention in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the C programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps of the after-market parts inventory forecasting method provided by any of the embodiments of the present invention.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While the preferred embodiments of the present invention have been illustrated and described, the present invention is not limited to the embodiments, and various equivalent modifications and substitutions can be made by one skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.
Claims (10)
1. A business feature-based sparse time sequence prediction method is characterized by comprising the following steps:
(1) Collecting data, wherein the data comprises a plurality of groups of product data, and each group of data comprises a data ID, a fitting model SKU, a warehouse, the time to market of the SKU and historical consumption data of each ID;
(2) Processing data, comprising:
intercepting data, namely intercepting historical consumption data with a certain time length according to a to-be-predicted demand time point;
data aggregation, namely, on the basis of data interception, aggregating consumed data according to time granularity;
constructing features, namely constructing the following features on the basis of the data aggregation: the time to market, the number of times of consumption, the maximum consumption, the average demand interval, the square of the demand variation coefficient, the last time interval and the minimum interval; the time to market is the time length from the time of marketing to the interception time point; the number of times of consumption is the number of times non-zero consumption occurs; the average demand interval is the ratio of the time to market to the number of times of consumption; the square of the demand variation coefficient is calculated as:

CV² = (σ / μ)²

wherein CV² represents the square of the demand variation coefficient, σ represents the standard deviation of consumption, and μ is the mean consumption;
the last time interval is the time length from the last non-zero consumption to the interception time point; the minimum interval is the minimum of the intervals between two successive non-zero consumptions in each ID's historical consumption data;
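As an illustrative, non-claimed sketch, the feature construction of step (2) for a single ID could look as follows; the function name, the toy layout (one aggregated value per time-granularity period) and the variable names are assumptions, not part of the claim:

```python
import numpy as np

def build_features(consumption, listed_at, cut_at):
    """Sketch of the claim-1 features for one ID. `consumption` holds the
    aggregated demand per period between `listed_at` (time of marketing)
    and `cut_at` (interception time point). Assumes at least one non-zero
    consumption (claim 2 removes never-consumed IDs beforehand)."""
    c = np.asarray(consumption, dtype=float)
    periods = np.arange(listed_at, cut_at)        # one entry per period
    nz = periods[c > 0]                           # periods with non-zero consumption
    time_on_market = cut_at - listed_at           # time-to-market length
    n_consumed = int((c > 0).sum())               # number of non-zero consumptions
    adi = time_on_market / n_consumed             # average demand interval
    mu, sigma = c.mean(), c.std()
    cv2 = (sigma / mu) ** 2                       # squared coefficient of variation
    last_interval = cut_at - nz[-1]               # since last non-zero consumption
    min_interval = int(np.diff(nz).min()) if len(nz) > 1 else time_on_market
    return dict(time_on_market=time_on_market, n_consumed=n_consumed,
                max_consumed=float(c.max()), adi=adi, cv2=cv2,
                last_interval=last_interval, min_interval=min_interval)
```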
(3) Constructing a two-stage prediction model;
the first stage is an integrated classification model that predicts, per warehouse and SKU, whether consumption will occur in the future; the classification result falls into two classes: consumption will occur, or consumption will not occur; the integrated classification model comprises a LightGBM classifier, a boost & Bayesian classifier, a SKU & warehouse classifier and a weighted average classifier, each constructed to predict whether consumption occurs; the classifier predictions are then combined: when the number of classifiers predicting consumption exceeds a set threshold, the prediction result of the integrated classification model is that consumption occurs, otherwise it is that no consumption occurs;
the second stage is a time sequence prediction model, and an asymmetric loss function is adopted to train a TimesNet model to predict the future consumption;
data whose first-stage classification result is that consumption occurs enters the second stage for consumption prediction; for the part the first stage predicts as non-consuming, the predicted value is directly set to zero.
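The two-stage routing described above can be sketched in a few lines; the function name, the vote-count form of the threshold and the regressor callable are illustrative assumptions:

```python
def two_stage_predict(votes, regressor, features, vote_threshold=2):
    """Sketch of the claim-1 two-stage scheme. `votes` are the binary
    outputs of the first-stage classifiers (1 = consumption predicted);
    only IDs whose vote count exceeds `vote_threshold` reach the
    second-stage regressor, the rest are assigned a hard zero."""
    if sum(votes) > vote_threshold:      # stage 1: ensemble vote
        return regressor(features)       # stage 2: forecast the amount
    return 0.0                           # predicted as non-consuming
```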
2. The method of claim 1, wherein processing the data further comprises: data cleansing, wherein, before data interception, IDs whose history contains no consumption at all are removed from the collected data.
3. The method of claim 1 or 2, wherein the LightGBM classifier comprises:
selecting the time point to be predicted, and creating a test set by means of the data interception, data aggregation and feature construction described above;
selecting several groups of historical time points, performing data interception, data aggregation and feature construction for each group, and merging the results to create a training set;
selecting model-entry features comprising SKU, warehouse, time to market, consumption times, maximum consumption, average demand interval, square of the demand variation coefficient, last time interval, minimum interval and the consumption data of the most recent time points;
defining an optimization target, wherein the optimization target adopts a self-defined asymmetric loss function loss1, in which a and b are hyperparameters, y is the true value, and ŷ is the predicted value;
training the LightGBM model with feature selection, and predicting on the test set after model tuning;
setting a quantile as a threshold; when the predicted value is greater than the quantile, the classification result of the LightGBM classifier is that consumption occurs, otherwise no consumption occurs.
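The patent does not disclose the exact form of loss1, so the asymmetric squared error below (under-prediction weighted by hyperparameter a, over-prediction by b) is purely an assumed stand-in, shown together with the gradient/Hessian pair a LightGBM custom `objective=` callable is expected to return:

```python
import numpy as np

def asymmetric_l2(y_true, y_pred, a=2.0, b=1.0):
    # Assumed form of loss1: squared error weighted by a when the model
    # under-predicts (y_pred < y_true) and by b when it over-predicts.
    r = np.asarray(y_pred) - np.asarray(y_true)
    return np.where(r < 0, a, b) * r ** 2

def asymmetric_l2_objective(y_true, y_pred, a=2.0, b=1.0):
    # Gradient and Hessian of the loss above, in the (grad, hess) shape
    # a LightGBM sklearn-API custom objective returns (a sketch, not the
    # patent's actual loss function).
    r = np.asarray(y_pred) - np.asarray(y_true)
    w = np.where(r < 0, a, b)
    return 2.0 * w * r, 2.0 * w
```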
4. The method of claim 1 or 2, wherein the boost & bayesian classifier comprises:
computing, for each ID, the statistics P(X|I) and P(I) by boost sampling, wherein X denotes the event that consumption occurs and I denotes the last time interval; P(X|I) denotes the probability that consumption occurs given time interval I, and P(I) denotes the probability that time interval I occurs;
calculating, for each ID, the probability of consumption P(X), approximated by the ratio of the number of times of consumption to the time to market;
according to Bayes' theorem, calculating for each ID the probability that the time interval is I given that event X has occurred:

P(I|X) = P(X|I) · P(I) / P(X)

setting a quantile as the threshold; when P(I|X) is greater than this quantile, the classification result of the boost & Bayesian classifier is that consumption occurs, otherwise no consumption occurs.
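The posterior computation and thresholding of claim 4 amount to a direct application of Bayes' rule; in this sketch the bootstrap estimation of P(X|I) and P(I) and the quantile choice are left out, and the function names are illustrative:

```python
def posterior_p_i_given_x(p_x_given_i, p_i, p_x):
    """Bayes' theorem as used in claim 4: P(I|X) = P(X|I) * P(I) / P(X),
    where P(X) is approximated by consumption count / time to market."""
    return p_x_given_i * p_i / p_x

def boost_bayes_classifier(p_x_given_i, p_i, p_x, quantile_threshold):
    """Predicts 'consumption occurs' when the posterior P(I|X) clears
    the quantile threshold (sketch of the boost & Bayesian classifier)."""
    return posterior_p_i_given_x(p_x_given_i, p_i, p_x) > quantile_threshold
```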
5. The method of claim 1 or 2, wherein the SKU & warehouse classifier comprises:
counting the consumption times of each SKU and the number of distributed warehouses;
setting a threshold, and screening the list of SKUs whose consumption count and number of warehouses both exceed the threshold;
when the SKU corresponding to the ID is included in the SKU list, the SKU & warehouse classifier classifies the result as consumption, otherwise, no consumption occurs.
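The screening rule of claim 5 is a simple membership test; the mapping layout and names below are illustrative assumptions:

```python
def sku_warehouse_classifier(sku_stats, count_threshold, warehouse_threshold):
    """Sketch of claim 5. `sku_stats` maps SKU -> (consumption count,
    number of distributing warehouses); SKUs exceeding both thresholds
    form the screened list, and an ID is classified as 'consuming' iff
    its SKU is on that list."""
    screened = {sku for sku, (n, w) in sku_stats.items()
                if n > count_threshold and w > warehouse_threshold}
    return lambda sku: sku in screened
```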
6. The method of claim 1 or 2, wherein the weighted average classifier comprises:
based on the historical consumption data, a weighted average method is used for predicting future consumption, when the prediction result is greater than 0, the classification result of the weighted average classifier is that consumption occurs, and otherwise, no consumption occurs.
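The weighted average classifier of claim 6 can be sketched as below; the claim does not fix the weighting scheme, so the recency-linear default weights here are an assumption:

```python
import numpy as np

def weighted_average_classifier(history, weights=None):
    """Sketch of claim 6: forecast future consumption as a weighted
    average of the history and classify 'consumption occurs' iff the
    forecast is greater than 0."""
    h = np.asarray(history, dtype=float)
    if weights is None:
        weights = np.arange(1, len(h) + 1, dtype=float)  # heavier on recent periods
    forecast = float(np.average(h, weights=weights))
    return forecast, forecast > 0
```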
7. The method of claim 1 or 2, wherein in the second stage of step (3), the time sequence model employs a TimesNet model trained with an asymmetric loss function loss, wherein a and b are hyperparameters and N is the number of IDs.
8. Use of the method according to any one of claims 1-7 for predicting after-market accessory demand for consumer electronic products and for stocking spare parts based on the prediction.
9. An electronic device, the electronic device comprising:
a processor and a memory;
the processor is adapted to perform the steps of the method according to any of claims 1-7 by invoking a program or instruction stored in the memory.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410039467.0A CN117557071A (en) | 2024-01-11 | 2024-01-11 | Sparse time sequence prediction method, device, storage medium and application |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117557071A true CN117557071A (en) | 2024-02-13 |
Family
ID=89813161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410039467.0A Pending CN117557071A (en) | 2024-01-11 | 2024-01-11 | Sparse time sequence prediction method, device, storage medium and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117557071A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010034673A1 (en) * | 2000-02-22 | 2001-10-25 | Yang Hong M. | Electronic marketplace providing service parts inventory planning and management |
US20170061374A1 (en) * | 2015-08-24 | 2017-03-02 | Toyota Motor Engineering & Manufacturing North America, Inc. | Spare Parts List Builder and Compiler Tools and Methods of Use |
CN108710905A (en) * | 2018-05-10 | 2018-10-26 | 华中科技大学 | One kind being based on the united spare part quantitative forecasting technique of multi-model and system |
KR101966557B1 (en) * | 2017-12-08 | 2019-04-05 | 세종대학교산학협력단 | Repairing-part-demand forecasting system and method using big data and machine learning |
CN109754118A (en) * | 2018-12-26 | 2019-05-14 | 复旦大学 | A kind of prediction technique of system self-adaption |
CN110705777A (en) * | 2019-09-26 | 2020-01-17 | 联想(北京)有限公司 | Method, device and system for predicting spare part reserve |
CN110728378A (en) * | 2019-09-03 | 2020-01-24 | 耀灵人工智能(浙江)有限公司 | Method for preparing parts in advance |
CN110909997A (en) * | 2019-11-13 | 2020-03-24 | 联想(北京)有限公司 | Spare part demand prediction method, spare part demand prediction device and electronic equipment |
CN115375250A (en) * | 2022-10-27 | 2022-11-22 | 河北东来工程技术服务有限公司 | Method and system for managing spare parts of ship |
CN116051160A (en) * | 2022-12-12 | 2023-05-02 | 中联重科股份有限公司 | Prediction method, prediction device and storage medium for accessory demand |
CN117010538A (en) * | 2022-04-26 | 2023-11-07 | 中国科学院沈阳自动化研究所 | Method and system for predicting agricultural machinery service resource spare parts |
Non-Patent Citations (1)
Title |
---|
李春秀 (Li Chunxiu): "备件分类与需求预测研究综述" [A Review of Research on Spare Parts Classification and Demand Forecasting], 《电子产品可靠性与环境试验》 [Electronic Product Reliability and Environmental Testing], vol. 41, no. 05, 20 October 2023 (2023-10-20), pages 116-121 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110806954B (en) | Method, device, equipment and storage medium for evaluating cloud host resources | |
US9588832B2 (en) | Data preprocessing device and method associated with a failure risk level of a target system | |
CN112529023A (en) | Configured artificial intelligence scene application research and development method and system | |
CN107943582B (en) | Feature processing method, feature processing device, storage medium and electronic equipment | |
CN110147389A (en) | Account number treating method and apparatus, storage medium and electronic device | |
CN109658921A (en) | A kind of audio signal processing method, equipment and computer readable storage medium | |
WO2024087468A1 (en) | Category prediction model training method, prediction method, device, and storage medium | |
CN112994960B (en) | Method and device for detecting business data abnormity and computing equipment | |
CN113962294A (en) | Multi-type event prediction model | |
CN111190967B (en) | User multidimensional data processing method and device and electronic equipment | |
CN111159481B (en) | Edge prediction method and device for graph data and terminal equipment | |
WO2019062404A1 (en) | Application program processing method and apparatus, storage medium, and electronic device | |
US9141686B2 (en) | Risk analysis using unstructured data | |
CN112070564B (en) | Advertisement pulling method, device and system and electronic equipment | |
CN111126501B (en) | Image identification method, terminal equipment and storage medium | |
CN117557071A (en) | Sparse time sequence prediction method, device, storage medium and application | |
CN116668321A (en) | Network traffic prediction method, device, equipment and storage medium | |
CN116915710A (en) | Traffic early warning method, device, equipment and readable storage medium | |
CN114500075B (en) | User abnormal behavior detection method and device, electronic equipment and storage medium | |
CN109614854B (en) | Video data processing method and device, computer device and readable storage medium | |
US7672912B2 (en) | Classifying knowledge aging in emails using Naïve Bayes Classifier | |
CN113379533A (en) | Method, device, equipment and storage medium for improving circulating loan quota | |
CN115408373A (en) | Data processing method and device, and computer readable storage medium | |
CN114066669B (en) | Cloud manufacturing-oriented manufacturing service discovery method | |
CN111753992A (en) | Screening method and screening system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||