CN113792931A - Data prediction method, data prediction device, logistics cargo quantity prediction method, medium and equipment

Info

Publication number: CN113792931A (granted as CN113792931B)
Application number: CN202111100746.6A
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Granted; Active
Prior art keywords: sequence data, prediction, prediction result, data, sub
Inventors: 吴盛楠, 庄晓天, 韩国帅, 佟路
Applicant and current assignee: Beijing Jingdong Zhenshi Information Technology Co Ltd

Classifications

    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F 16/2474: Sequence data queries, e.g. querying versioned data (under G06F 16/00 Information retrieval; database structures therefor)
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks (under G06N 3/02 Neural networks)
    • G06Q 10/083: Shipping (under G06Q 10/08 Logistics, e.g. warehousing, loading or distribution; inventory or stock management)

Abstract

The disclosure relates to the field of computer technology and provides a data prediction method and device, a logistics cargo quantity prediction method and device, a storage medium, and electronic equipment. The method comprises the following steps: acquiring historical time series data, and performing time series decomposition on the historical time series data to obtain trend sequence data, periodic sequence data and error sequence data; performing time series decomposition on the error sequence data, performing fitting prediction on the obtained decomposed subsequences, and determining a first prediction result corresponding to the error sequence data according to the fitting prediction results; respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and fusing the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result. The method and device improve the prediction accuracy of periodic and trend data to a certain extent.

Description

Data prediction method, data prediction device, logistics cargo quantity prediction method, medium and equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data prediction method, a data prediction apparatus, a logistics cargo quantity prediction method, a logistics cargo quantity prediction apparatus, a computer storage medium, and an electronic device.
Background
With the development of computer technology, machine learning is applied in more and more fields, and many application scenarios require predicting future data from the development rules of historical data, such as financial market forecasting and demand forecasting in the retail industry. Likewise, for the healthy and sustainable development of the logistics industry, accurate prediction of the logistics cargo quantity is very important, as it helps avoid declines in logistics service quality and logistics service rate.
In the related art, the model prediction process ignores the periodicity and trend characteristics of the data to be predicted: a single preset model is adopted for prediction, which affects the prediction accuracy on the data as a whole and leads to a large deviation between the model prediction results and the real data. In addition, in the related art, the generated random error is directly input into the prediction model as an input parameter for prediction fitting, and the direct introduction of this error term reduces the prediction accuracy of the model.
It is to be noted that the information disclosed in the background section above is only for enhancement of understanding of the background of the present disclosure, and therefore may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The purpose of the present disclosure is to provide a time-series-based data prediction method and apparatus, a time-series-based logistics cargo quantity prediction method and apparatus, a computer storage medium, and an electronic device, so as to improve the prediction accuracy of periodic and trend data at least to a certain extent.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a time series-based data prediction method, including: acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; performing time series decomposition on the error sequence data, and performing fitting prediction on a plurality of obtained decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result; respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and performing fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
In an exemplary embodiment of the disclosure, the time-series decomposition of the error sequence data and fitting prediction of the obtained multiple decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result includes: performing time series decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data; discarding the sub-random error sequence data; fusing the sub-trend sequence data and the sub-cycle sequence data to obtain sub-fusion sequence data; and fitting and predicting the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
In an exemplary embodiment of the disclosure, before performing fitting prediction on the sub-fusion sequence data by using a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, the method further includes: standardizing and normalizing the sub-fusion sequence data, and determining training set data and test set data according to the standardized and normalized sub-fusion sequence data.
In an exemplary embodiment of the disclosure, the prediction model corresponding to the sub-fusion sequence data is a sine wave sequence characterization network and long-short term memory network composite model; the fitting prediction is performed on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, and the method comprises the following steps: inputting the training set data into a sine wave sequence characterization network for sample amplification, and inputting the training set data after sample amplification into a long-short term memory network so as to train the sine wave sequence characterization network and the long-short term memory network composite model; and inputting the test set data into the trained composite model to obtain the first prediction result.
In an exemplary embodiment of the disclosure, the performing, by using prediction models corresponding to the trend sequence data and the cycle sequence data, fitting prediction on the trend sequence data and the cycle sequence data respectively to obtain a second prediction result and a third prediction result includes: respectively determining training set data and test set data corresponding to the trend sequence data and the cycle sequence data; inputting the training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting the test set data corresponding to the trend sequence data into the trained first prediction model to obtain the second prediction result; and inputting the training set data corresponding to the periodic sequence data into a corresponding second prediction model to train the second prediction model, and inputting the test set data corresponding to the periodic sequence data into the trained second prediction model to obtain the third prediction result.
In an exemplary embodiment of the disclosure, before determining the corresponding training set data and test set data from the periodic sequence data, the method further comprises: and normalizing the periodic sequence data.
In an exemplary embodiment of the present disclosure, the first prediction model is one of a differential integrated moving average autoregressive model, an exponential smoothing model, or a Theta model; the second prediction model is a sine wave sequence characterization network and long-short term memory network composite model.
In an exemplary embodiment of the present disclosure, the fusing the first prediction result, the second prediction result, and the third prediction result to obtain a target prediction result includes: carrying out inverse standardization and inverse normalization processing on the first prediction result and the third prediction result; and adding the second prediction result, the processed first prediction result and the processed third prediction result to obtain the target prediction result.
According to one aspect of the present disclosure, there is provided a time-series-based logistics cargo volume prediction method, including: acquiring historical cargo quantity time-series data, and performing time-series decomposition on the historical cargo quantity time-series data to obtain trend sequence data, periodic sequence data and error sequence data; performing time series decomposition on the error sequence data, and performing fitting prediction on a plurality of obtained decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result; respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and performing fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
In an exemplary embodiment of the present disclosure, performing time-series decomposition on the error sequence data, and performing fitting prediction on the obtained multiple decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result, includes: performing time series decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data; discarding the sub-random error sequence data; fusing the sub-trend sequence data and the sub-cycle sequence data to obtain sub-fusion sequence data; and fitting and predicting the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
In an exemplary embodiment of the disclosure, the prediction model corresponding to the sub-fusion sequence data is a sine wave sequence characterization network and long-short term memory network composite model; the fitting prediction is performed on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, and the method comprises the following steps: inputting the training set data into a sine wave sequence characterization network for sample amplification, and inputting the obtained amplified training set data into a long-short term memory network so as to train the sine wave sequence characterization network and long-short term memory network composite model; and inputting the test set data into the trained composite model to obtain the first prediction result.
In an exemplary embodiment of the disclosure, the fusing the first prediction result, the second prediction result, and the third prediction result to obtain the target prediction result of the cargo volume includes: and adding the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
According to an aspect of the present disclosure, there is provided a time series-based data prediction apparatus, the apparatus including: the data acquisition module is used for acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; the sequence decomposition module is used for performing time sequence decomposition on the error sequence data and performing fitting prediction on a plurality of obtained decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result; the fitting prediction module is used for respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and the fusion processing module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
According to an aspect of the present disclosure, there is provided a time-series-based logistics cargo amount prediction apparatus, the apparatus including: the time sequence data acquisition module is used for acquiring historical cargo quantity time sequence data, and performing time sequence decomposition on the historical cargo quantity time sequence data to obtain trend sequence data, periodic sequence data and error sequence data; the time sequence decomposition module is used for performing time sequence decomposition on the error sequence data and performing fitting prediction on the obtained decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result; the fitting prediction module is used for respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result; and the prediction result determining module is used for fusing the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
According to an aspect of the present disclosure, there is provided a computer storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing any one of the above-described time-series-based data prediction methods or any one of the above-described time-series-based logistics cargo amount prediction methods.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute any one of the time series based data prediction methods or any one of the time series based logistics cargo volume prediction methods via execution of the executable instructions.
In the data prediction method based on time series in the exemplary embodiment of the disclosure, trend sequence data, cycle sequence data and error sequence data are obtained by performing time series decomposition on time series data, prediction fitting is performed on each sequence data by adopting different prediction models, wherein time series decomposition is performed on the error sequence data again, fitting processing is performed on a plurality of obtained decomposed subsequences to obtain a first prediction result, and the first prediction result is fused with a second prediction result and a third prediction result obtained by fitting the trend sequence data and the cycle sequence data to obtain a final prediction result. On one hand, time sequence data are decomposed into a plurality of sequence data by adopting a time sequence decomposition method, different prediction models are adopted to respectively carry out fitting prediction on each sequence data, and finally a plurality of fitting prediction results are fused, so that the prediction accuracy is improved by a combined prediction method; on the other hand, time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are carried out respectively, direct introduction of error items into a prediction model is avoided, and model prediction accuracy is improved to a certain extent.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 illustrates a flow diagram for time series based data prediction according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a flow chart for obtaining a first prediction result according to an exemplary embodiment of the present disclosure;
FIG. 3 illustrates a flow diagram for determining a first prediction result based on a composite model according to an exemplary embodiment of the present disclosure;
FIG. 4 shows a flow chart of fitting predictions to trend sequence data and periodic sequence data according to an example embodiment of the present disclosure;
FIG. 5 shows a flow chart of fitting predictions to trend sequence data using ARIMA according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates a flowchart of a time series based logistics cargo quantity prediction method according to an exemplary embodiment of the present disclosure;
fig. 7 illustrates a flowchart of a first prediction result acquisition in a time-series logistics cargo quantity prediction method according to an exemplary embodiment of the present disclosure;
FIG. 8 shows a flowchart for fitting predictions to sub-fusion sequence data resulting in a first prediction result, according to an example embodiment of the present disclosure;
FIG. 9 shows a flow diagram of a method for time series based logistics cargo forecasting, according to an example embodiment of the present disclosure;
FIG. 10 is a diagram illustrating the result of STL decomposition of training set data in logistics cargo quantity prediction according to an exemplary embodiment of the present disclosure;
FIG. 11 is a diagram illustrating the result of performing STL decomposition again on the error sequence data R1 according to an exemplary embodiment of the present disclosure;
FIG. 12 is a diagram illustrating the result of fitting prediction of the sub-fusion sequence data CS2 using the sine wave sequence characterization network and long-short term memory network composite model according to an exemplary embodiment of the present disclosure;
FIG. 13 is a diagram illustrating the result of fitting prediction of the trend sequence data C1 using the ARIMA model according to an exemplary embodiment of the present disclosure;
FIG. 14 is a diagram illustrating a final target prediction by summing a first prediction, a second prediction, and a third prediction according to an exemplary embodiment of the present disclosure;
FIG. 15 is a diagram showing the results of fitting prediction of cycle sequence data S1 by the LSTM model alone and fitting prediction of cycle sequence data S1 by the composite model according to an exemplary embodiment of the present disclosure;
FIG. 16 shows an autocorrelation and partial autocorrelation plot of trend sequence data according to an exemplary embodiment of the present disclosure;
FIG. 17 is a schematic diagram illustrating a time series based data prediction apparatus according to an exemplary embodiment of the present disclosure;
FIG. 18 is a schematic diagram illustrating a time series-based logistics cargo quantity prediction apparatus according to an exemplary embodiment of the present disclosure;
FIG. 19 shows a schematic diagram of a storage medium according to an exemplary embodiment of the present disclosure; and
fig. 20 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
Exemplary embodiments will now be described more fully with reference to the accompanying drawings. The exemplary embodiments, however, may be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of exemplary embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, and thus their detailed description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software form, or in one or more software and/or hardware modules, or in different networks and/or processor devices and/or microcontroller devices.
Time series prediction plays an important role in realizing process automation. For example, to guarantee service quality, the quantity of goods is stocked in advance and preparations are made according to the prediction result; as another example, on some online shopping websites, the sales volume of each type of goods over a future period is a variable that needs to be considered in a series of decisions such as stocking and promotion. The prediction capability therefore ultimately has an important influence on stocking, transportation scheduling, sales revenue, inventory cost and the like. Accordingly, improving the prediction accuracy of data places higher demands on time series prediction technology.
In the related art in this field, a single prediction model is often adopted for prediction data with periodic and trend characteristics, which performs poorly in terms of prediction accuracy, even though a time series prediction model is well suited to such application scenarios.
A time series, also called a dynamic series or chronological series, is a series formed by arranging the data of the same statistical indicator in the chronological order in which they occur. Time series analysis organizes and analyzes such a series and, according to the development process, direction and trend it reflects, extrapolates by analogy or extension to predict the level that may be reached at the next time point or in the next time period. Time series methods can be classified into short-term prediction, medium-term prediction and long-term prediction according to the prediction time span. According to the data analysis method used, they include the simple sequential average method, the weighted sequential average method, the simple moving average method, the weighted moving average method, the exponential smoothing method, and so on.
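As a brief illustration of two of the classical methods listed above, the following sketch computes a simple moving average and an exponential smoothing over a short series; pandas is an assumption of this sketch, and the series values and the smoothing parameter are purely illustrative.

```python
# A minimal sketch of the simple moving average and exponential smoothing
# methods mentioned above; data and parameters are hypothetical.
import pandas as pd

series = pd.Series([10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0])

# Simple moving average over a 3-point window.
sma = series.rolling(window=3).mean()

# Exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
es = series.ewm(alpha=0.3, adjust=False).mean()

print(sma.iloc[-1], es.iloc[-1])
```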
However, the adoption of the time series prediction model in the related art generally inputs random errors into the prediction model for prediction, and the prediction accuracy of the model is reduced due to the direct introduction of error terms.
Based on this, in the exemplary embodiment of the present disclosure, a data prediction method based on time series is first provided. Referring to fig. 1, the data prediction method based on time series includes the following steps:
step S110: acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
step S120: performing time series decomposition on the error sequence data, and performing fitting prediction on the obtained decomposition subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
step S130: respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
step S140: and performing fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
According to the data prediction method based on the time series in the example embodiment, on one hand, the time series data is decomposed into a plurality of series data by adopting a time series decomposition method, fitting prediction is respectively carried out on each series data by adopting different prediction models, and finally, a plurality of fitting prediction results are fused, so that the prediction accuracy is improved by adopting a combined prediction method; on the other hand, time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are carried out respectively, direct introduction of error items into a prediction model is avoided, and model prediction accuracy is improved to a certain extent.
The time series-based data prediction method in the exemplary embodiment of the present disclosure is further explained below.
In step S110, historical time-series data is acquired, and time-series decomposition is performed on the historical time-series data to obtain trend series data, cycle series data, and error series data.
In an exemplary embodiment of the present disclosure, the time in the historical time series data may be in the form of a year, quarter, month, day, or any other time unit, and the granularity at which the historical time series data is obtained may be determined according to the actually observed prediction time. The obtained historical data are sorted in chronological order, with the specific time as the index and the corresponding data as the value, to form the historical time series data. STL (Seasonal-Trend decomposition using Loess) is a time series decomposition method that uses robust locally weighted regression as its smoothing method; based on stochastic process theory and mathematical statistics, it studies the statistical rules followed by random data sequences in order to solve practical problems. The present disclosure employs the STL method to perform time series decomposition on the historical time series data, resulting in trend sequence data, periodic sequence data and error sequence data. The trend sequence data part reflects the change of the data over time, the periodic sequence data reflects the periodic fluctuation of the data over time, and the error sequence data reflects the part that cannot be explained by the trend sequence data and the periodic sequence data. The STL decomposition may be performed with any period, and the decomposition period may be selected according to the type of data actually processed, the application scenario, and the like, which is not particularly limited in this disclosure.
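A minimal sketch of this decomposition step is given below. The STL implementation in statsmodels, the synthetic series and the weekly period are assumptions of the sketch; the disclosure does not prescribe a particular library or period.

```python
# Hedged sketch: STL decomposition of historical time series data into trend,
# periodic and error components, followed by a second STL pass on the error
# component as in step S120. Data, period and library choice are assumptions.
import pandas as pd
from statsmodels.tsa.seasonal import STL

history = pd.Series(
    [120.0, 135.0, 128.0, 150.0, 160.0, 155.0, 140.0] * 8,
    index=pd.date_range("2021-04-01", periods=56, freq="D"),
)

result = STL(history, period=7, robust=True).fit()
trend_seq = result.trend      # trend sequence data (C1)
period_seq = result.seasonal  # periodic sequence data (S1)
error_seq = result.resid      # error sequence data (R1)

# The error sequence is itself decomposed again; its period may equal the
# first decomposition period (e.g. T = 7) or differ from it.
sub_result = STL(error_seq, period=7, robust=True).fit()
sub_trend, sub_period = sub_result.trend, sub_result.seasonal
```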
In step S120, time-series decomposition is performed on the error sequence data, and fitting prediction is performed on the obtained plurality of decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result.
In an exemplary embodiment of the present disclosure, in order to avoid directly inputting random errors as input parameters to a prediction model for prediction, the present disclosure performs time-series decomposition on error sequence data again, and performs fitting prediction on the obtained plurality of decomposed subsequences.
Specifically, fig. 2 shows a flowchart for obtaining a first prediction result according to an exemplary embodiment of the present disclosure, and as shown in fig. 2, the process includes the following steps:
in step S210, time-series decomposition is performed on the error sequence data to obtain sub-trend sequence data, sub-cycle sequence data, and sub-random error sequence data.
In the exemplary embodiment of the present disclosure, STL decomposition of the error sequence data may also be performed with an arbitrary period, and the decomposition period may be selected according to the actual data type, application scenario, and the like. Optionally, the period of the STL decomposition of the error sequence data is the same as the period of the STL decomposition of the historical time series data, for example, T = 7 days; optionally, the period of the STL decomposition of the error sequence data may also differ from that of the historical time series data. Whether the two STL decompositions use the same period may be determined according to the actual prediction requirements, which is not particularly limited by the present disclosure.
In step S220, the sub-random error sequence data is discarded.
In the exemplary embodiment of the disclosure, the random error sequence data is random in nature. To avoid inputting the sub-random errors into the prediction model for prediction, the present disclosure discards the sub-random error sequence and no longer uses it for model fitting prediction, thereby preventing this part from being introduced into the model and lowering the model prediction accuracy.
In step S230, the sub-trend sequence data and the sub-cycle sequence data are subjected to fusion processing to obtain sub-fusion sequence data.
In an exemplary embodiment of the present disclosure, the data values at corresponding times in the sub-trend sequence data and the sub-cycle sequence data are summed to obtain the sub-fusion sequence data.
In step S240, a prediction model corresponding to the sub-fusion sequence data is used to perform fitting prediction on the sub-fusion sequence data to obtain the first prediction result.
In an exemplary embodiment of the disclosure, to facilitate data processing and model fitting prediction, before fitting prediction is performed on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data, standardization and normalization are performed on the sub-fusion sequence data, and training set data and test set data are determined from the standardized and normalized sub-fusion sequence data.
The purpose of data standardization and normalization is to scale the data so that it falls into a specific interval, converting it into dimensionless pure values and facilitating the comparison and weighting of indicators with different units or magnitudes. Data standardization methods include, but are not limited to, linear methods (e.g., the extreme-value method, the standard deviation method), broken-line methods (e.g., the three-fold-line method), and curve methods (e.g., the half-normal distribution). Data normalization performs a linear transformation on the original data so that the result falls into the [0,1] interval. The standardization and normalization processing used in the present disclosure is described below.
For example, let X denote the sub-fusion sequence data, with sequence values x_1, x_2, x_3, …, x_n. The following standardization formula (1) is used:

x'_i = (x_i - μ) / σ    (1)

where μ and σ are the mean and standard deviation of X, respectively.

The standardized sequence set is X', and the standardized data is then normalized by the following formula (2):

x''_i = (x'_i - min(X')) / (max(X') - min(X'))    (2)

where max(X') and min(X') are the maximum and minimum values in the standardized sequence set X'.
Further, training set data and test set data are determined from the standardized and normalized sub-fusion sequence data. In the disclosure, the standardized and normalized sub-fusion sequence data is divided into training set data and test set data according to a preset ratio. For example, for data from April 1 to May 30, 2021, the data from April 1 to May 23 is taken as training set data and the data from May 24 to May 30 as test set data. The division ratio can be determined according to the actual fitting prediction situation, for example 8:1 or 9:1, which is not particularly limited in this disclosure.
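The sketch below illustrates formulas (1) and (2) and the date-based split described above; the helper names and the synthetic values are assumptions used only for illustration.

```python
# Hedged sketch of standardization (formula (1)), normalization (formula (2))
# and the training/test split from the example; the data is synthetic.
import numpy as np
import pandas as pd

def standardize(x: pd.Series) -> pd.Series:
    # x'_i = (x_i - mu) / sigma
    return (x - x.mean()) / x.std()

def normalize(x: pd.Series) -> pd.Series:
    # x''_i = (x'_i - min(X')) / (max(X') - min(X'))
    return (x - x.min()) / (x.max() - x.min())

sub_fusion = pd.Series(
    np.random.default_rng(0).normal(size=60),
    index=pd.date_range("2021-04-01", periods=60, freq="D"),
)
scaled = normalize(standardize(sub_fusion))

# April 1 - May 23, 2021 as training set data, May 24 - May 30 as test set data.
train = scaled.loc["2021-04-01":"2021-05-23"]
test = scaled.loc["2021-05-24":"2021-05-30"]
```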
In an exemplary embodiment of the disclosure, the prediction model corresponding to the sub-fusion sequence data is a sine wave sequence characterization network and long-short term memory network composite model, fig. 3 shows a flowchart for determining the first prediction result based on the composite model according to an exemplary embodiment of the disclosure, and as shown in fig. 3, a process of performing fitting prediction on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result includes the following steps:
in step S310, the training set data is input to the sine wave sequence characterization network for sample amplification, and the training set data after sample amplification is input to the long-short term memory network, so as to train the complex model of the sine wave sequence characterization network and the long-short term memory network.
In an exemplary embodiment of the present disclosure, the training set data is characterized by the sine wave sequence characterization network, which is equivalent to an embedding layer for the time series. The sine wave sequence characterization network adopted by the present disclosure is expressed by the following formula (3):
[Formula (3): the sine wave sequence characterization network, reproduced only as an image in the original document]

where τ is the input sub-fusion sequence data; h_j, w_j and the other parameters appearing in formula (3) are obtained through model training and learning; and k is the output dimension of the sine wave characterization layer, which can be set according to the actual fitting prediction situation, such as 16, 32 or 64.
A Long Short-Term Memory network (LSTM) is a type of RNN (Recurrent Neural Network) that can learn long-term dependence information; it comprises an input layer, a hidden layer and an output layer.
In the process of model training, the output of the sine wave sequence characterization network is used as the input of the LSTM, the output dimension of the LSTM is 1, and the data at the next time point is predicted from the observed data of the previous T time points, that is:

y_t = LSTM(x_{t-1}, x_{t-2}, x_{t-3}, …, x_{t-T})    (4)

where y_t is the output of the LSTM and x_i is the amplified training set data, i.e., the samples input to the LSTM.
In step S320, the test set data is input to the trained composite model to obtain a first prediction result. In an exemplary embodiment of the present disclosure, the first prediction result is obtained by inputting the test set data to the trained composite model.
When the number of time series observations is small, dividing the training/test data sets is difficult: a small amount of training set data makes the model hard to train or even prone to overfitting, and a small amount of validation set data makes the model selection result unreliable. Sample amplification through the sine wave sequence characterization network alleviates these problems to a certain extent.
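A hedged sketch of such a composite model is shown below. Because formula (3) is reproduced above only as an image, the sine wave characterization layer is implemented here as a generic learnable sine embedding, which is an assumption; the class names, window length and layer sizes are illustrative.

```python
# Hedged sketch: sine wave sequence characterization layer (assumed form)
# followed by an LSTM whose output dimension is 1, as in formula (4).
import torch
import torch.nn as nn

class SineWaveCharacterization(nn.Module):
    """Maps each scalar observation to a k-dimensional sine-wave representation."""
    def __init__(self, k: int = 16):
        super().__init__()
        self.w = nn.Parameter(torch.randn(k))    # learned frequencies
        self.phi = nn.Parameter(torch.randn(k))  # learned phases

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T, 1) -> (batch, T, k); acts as a time-series embedding layer.
        return torch.sin(x * self.w + self.phi)

class CompositeModel(nn.Module):
    def __init__(self, k: int = 16, hidden: int = 32):
        super().__init__()
        self.embed = SineWaveCharacterization(k)
        self.lstm = nn.LSTM(input_size=k, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)         # output dimension of 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.embed(x))
        return self.head(h[:, -1, :])            # y_t predicted from x_{t-1} ... x_{t-T}

# Usage: a batch of 8 windows of the amplified training data, window length T = 7.
model = CompositeModel(k=16)
y_hat = model(torch.randn(8, 7, 1))              # shape (8, 1)
```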
In step S130, fitting prediction is performed on the trend sequence data and the cycle sequence data respectively by using prediction models corresponding to the trend sequence data and the cycle sequence data, so as to obtain a second prediction result and a third prediction result.
In an exemplary embodiment of the present disclosure, fitting prediction is performed on trend sequence data and cycle sequence data using different prediction models, and fig. 4 shows a flowchart of fitting prediction on trend sequence data and cycle sequence data according to an exemplary embodiment of the present disclosure, and as shown in fig. 4, the process includes the following steps:
in step S410, the trend sequence data and the cycle sequence data are respectively determined according to the training set data and the test set data, specifically, the trend sequence data and the cycle sequence data are divided into the training set data and the test set data according to a preset ratio, the division ratio can be determined according to the actual fitting prediction condition, but the division ratio needs to be consistent with the division ratio of the error sequence data.
In step S420, inputting training set data corresponding to the trend sequence data to a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data to the trained first prediction model to obtain a second prediction result; the first prediction model may be an ARIMA (differential Integrated Moving Average Autoregressive model), an exponential smoothing model (ETS model), a Theta model, or the like, and the corresponding prediction model may be selected according to actual fitting prediction needs. The following description will take the ARIMA fitting prediction process to trend sequence data as an example, and refer to fig. 5, which includes the following steps:
in step S510, stationarity checking and differential stationarity are performed.
Using a unit root inspection method to inspect the stationarity of the trend sequence data, and using ARIMA (p,0, q) as a prediction model if the trend sequence data is inspected to be a stable sequence; if the checking result is a non-stationary sequence, performing differential processing on the trend sequence data until the trend sequence data is checked to be a stationary sequence, and using ARIMA (p, d, q) as a prediction model. Wherein p is the order of an autoregressive model, q is the order of a moving average model, and d is the difference processing times; the unit root test method includes ADF test, PP test, NP test, KPSS test, ERS test, and the present disclosure does not make any special limitation on the type of the unit root test method.
In step S520, parameters p and q are determined.
Having determined the value of d in step S510, optionally, the autocorrelation (ACF) plot and the partial autocorrelation (PACF) plot of the trend sequence data may be drawn, and the values of p and q determined by observing where the ACF and PACF cut off; alternatively, the Akaike Information Criterion (AIC) may be used for the determination. The methods for determining the parameters p and q in the present disclosure include, but are not limited to, these.
In step S530, fitting prediction is performed with ARIMA(p, d, q).
Inputting training set data corresponding to the trend sequence data after the stabilization processing into a corresponding first prediction model so as to train the first prediction model, and inputting test set data corresponding to the trend sequence data after the stabilization processing into the trained first prediction model so as to obtain a second prediction result.
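A minimal sketch of steps S510 to S530 follows, assuming the ADF test and ARIMA implementation in statsmodels; the synthetic trend series is an assumption, the orders p = 1 and q = 3 follow the worked example later in the text, and d is chosen by the unit root test.

```python
# Hedged sketch: ADF unit-root test, differencing until stationary, and
# ARIMA(p, d, q) fitting prediction on trend sequence data. Library choice,
# data and orders are assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

def choose_d(series: pd.Series, alpha: float = 0.05, max_d: int = 2) -> int:
    """Difference until the ADF test rejects the unit-root hypothesis."""
    d, s = 0, series.dropna()
    while adfuller(s)[1] > alpha and d < max_d:
        s = s.diff().dropna()
        d += 1
    return d

# Training part of the trend sequence data C1 (synthetic stand-in).
trend_train = pd.Series(
    np.linspace(100.0, 160.0, 53) + np.random.default_rng(0).normal(scale=2.0, size=53),
    index=pd.date_range("2021-04-01", periods=53, freq="D"),
)

d = choose_d(trend_train)
# p and q can be read off the ACF/PACF plots or chosen by minimizing the AIC.
fitted = ARIMA(trend_train, order=(1, d, 3)).fit()
second_prediction = fitted.forecast(steps=7)  # e.g. the 7-day test window
```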
In step S430, training set data corresponding to the periodic sequence data is input to the corresponding second prediction model to train the second prediction model, and test set data corresponding to the periodic sequence data is input to the trained second prediction model to obtain the third prediction result. Before the training set data corresponding to the periodic sequence data is input into the corresponding second prediction model, the periodic sequence data is standardized and normalized; the standardization and normalization procedure is the same as that in step S240 and is not repeated here.
In an exemplary embodiment of the present disclosure, the second prediction model is a composite model of the sine wave sequence representation network and the long-short term memory network, and the fitting prediction process of the cycle sequence by using the composite model of the sine wave sequence representation network and the long-short term memory network is the same as that in step S310, and is not described herein again.
In step S140, the first prediction result, the second prediction result, and the third prediction result are fused to obtain a target prediction result.
In an exemplary embodiment of the present disclosure, since the data used when fitting prediction is performed on the error sequence data and the periodic sequence data has been standardized and normalized, in order to fuse the first prediction result, the second prediction result and the third prediction result, the first prediction result and the third prediction result are first subjected to inverse normalization and inverse standardization, and then the second prediction result and the processed first and third prediction results are added to obtain the target prediction result.
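A minimal sketch of this fusion step is given below; it assumes that the mean/standard deviation and min/max recorded during preprocessing are available, and the function and parameter names are illustrative.

```python
# Hedged sketch: invert normalization (formula (2)) and standardization
# (formula (1)) for the first and third prediction results, then sum the
# three results point-wise to obtain the target prediction result.
import pandas as pd

def inverse_normalize(x: pd.Series, x_min: float, x_max: float) -> pd.Series:
    return x * (x_max - x_min) + x_min

def inverse_standardize(x: pd.Series, mu: float, sigma: float) -> pd.Series:
    return x * sigma + mu

def fuse(first: pd.Series, second: pd.Series, third: pd.Series,
         first_stats: tuple, third_stats: tuple) -> pd.Series:
    """Each *_stats tuple is (min(X'), max(X'), mu, sigma) recorded when the
    corresponding sequence was standardized and normalized."""
    f = inverse_standardize(inverse_normalize(first, first_stats[0], first_stats[1]),
                            first_stats[2], first_stats[3])
    t = inverse_standardize(inverse_normalize(third, third_stats[0], third_stats[1]),
                            third_stats[2], third_stats[3])
    return f + second + t  # target prediction result
```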
In the present disclosure, time series decomposition is performed on the time series data to obtain trend sequence data, periodic sequence data and error sequence data, and prediction fitting is performed on each sequence with a different prediction model. Time series decomposition is performed again on the error sequence data, and the resulting decomposed subsequences are fitted to obtain the first prediction result, which is fused with the second prediction result obtained by fitting prediction of the trend sequence data and the third prediction result obtained by fitting prediction of the periodic sequence data to obtain the final prediction result. Decomposing the time series data into multiple sequences with the time series decomposition method, fitting each sequence with a different prediction model, and finally fusing the several fitting prediction results improves the prediction accuracy through combined prediction; in addition, performing time series decomposition again on the error sequence data, with separate fitting prediction and fusion of the fitting results, avoids directly introducing an error term into the prediction model and improves the model prediction accuracy to a certain extent.
In an exemplary embodiment of the present disclosure, a time-series-based logistics cargo quantity prediction method is further provided, and fig. 6 shows a flow chart of the time-series-based logistics cargo quantity prediction method according to an exemplary embodiment of the present disclosure, and as shown in fig. 6, the process includes the following steps:
in step S610, historical cargo quantity time-series data is acquired, and time-series decomposition is performed on the historical cargo quantity time-series data to obtain trend sequence data, cycle sequence data, and error sequence data;
in step S620, performing time series decomposition on the error sequence data, and performing fitting prediction on the obtained plurality of decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result;
in step S630, fitting and predicting the trend sequence data and the cycle sequence data respectively by using prediction models corresponding to the trend sequence data and the cycle sequence data to obtain a second prediction result and a third prediction result;
in step S640, the first prediction result, the second prediction result, and the third prediction result are fused to obtain a target prediction result of the cargo volume.
In the exemplary embodiment schematically shown in fig. 6, on one hand, the historical cargo time series data is decomposed into a plurality of series data by adopting a time series decomposition method, and fitting prediction is performed on each series data by adopting different prediction models, and finally, a plurality of fitting prediction results are fused, so that the prediction accuracy is improved by adopting a combined prediction method according to the periodicity and trend characteristics of the transported cargo; in addition, time sequence decomposition is carried out on the error sequence data again, fitting prediction and fusion of fitting results are carried out respectively, direct introduction of error items into a prediction model is avoided, and model prediction accuracy is improved to a certain extent.
It should be further noted that, the portions related to defining and explaining the methods from step S110 to step S140 in the foregoing data prediction method based on time series also apply to step S610 to step S640, and further description is omitted here to avoid redundant contents.
Further, referring to fig. 7, the process of performing time-series decomposition on the error sequence data and performing fitting prediction on the obtained plurality of decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result includes: step S710, performing time series decomposition on the error sequence data to obtain sub-trend sequence data, sub-cycle sequence data and sub-random error sequence data; step S720, discarding the sub-random error sequence data; step S730, fusing the sub-trend sequence data and the sub-cycle sequence data to obtain sub-fusion sequence data; step S740, fitting and predicting the sub-fusion sequence data by using a prediction model corresponding to the sub-fusion sequence data to obtain a first prediction result.
It should be noted that, the parts related to defining and explaining the methods in steps S210 to S240 in the foregoing data prediction method based on time series are also applicable to steps S710 to S740, and are not described herein again in order to avoid redundant contents.
Still further, the prediction model corresponding to the sub-fusion sequence data in the disclosure is the sine wave sequence characterization network and long-short term memory network composite model. Referring to fig. 8, the process of obtaining the first prediction result by performing fitting prediction on the sub-fusion sequence data using the prediction model corresponding to the sub-fusion sequence data includes the following steps: step S810, inputting the training set data into the sine wave sequence characterization network for sample amplification, and inputting the obtained amplified training set data into the long-short term memory network so as to train the sine wave sequence characterization network and long-short term memory network composite model; step S820, inputting the test set data into the trained composite model to obtain the first prediction result.
It should be noted that, the portions related to defining and explaining the methods in steps S310 to S320 in the data prediction method based on time series are also applicable to steps S810 to S820, and are not described herein again to avoid redundant contents.
The following describes the time-series-based logistics cargo quantity prediction method of the present disclosure, taking the daily order quantity data of a trunk line of a certain logistics company as an example. The cargo quantities belonging to the same day are aggregated into one piece of data, all cargo quantity data are sorted in date order, the actual departure date is taken as the index and the actually transported quantity as the value, forming the historical cargo quantity time series data. Cargo quantity historical data from April 1 to May 30, 2021 is selected, where the data from April 1 to May 23 is training set data and the data from May 24 to May 30 is test set data (both the training set data and the test set data are data after standardization and normalization).
Fig. 9 is a flowchart illustrating a logistic cargo amount prediction method based on a time series according to an exemplary embodiment of the present disclosure, and as shown in fig. 9, the method includes the following processes:
First, STL decomposition is performed on the training set data (logistics order quantity data from April 1 to May 23) with a period of T = 7 to obtain trend sequence data C1, periodic sequence data S1 and error sequence data R1 (see FIG. 10). In the error sequence data R1 obtained by the STL decomposition there are discrete points deviating from the baseline, i.e., the error sequence data R1 still carries information, so STL decomposition is performed again on the error sequence data R1 to obtain sub-trend sequence data C2, sub-cycle sequence data S2 (see FIG. 11) and sub-random error sequence data R2.
Next, the sub-random error sequence data R2 is discarded, and the cargo quantities at corresponding time points in the sub-trend sequence data C2 and the sub-cycle sequence data S2 are added to obtain sub-fusion sequence data CS2; the sine wave sequence characterization network and long-short term memory network composite model is then used to perform fitting prediction on the sub-fusion sequence data CS2, and the first prediction result is output (as shown in FIG. 12).
Then, the ARIMA model is used to perform fitting prediction on the trend sequence data C1 and output the second prediction result (as shown in FIG. 13), and the sine wave sequence characterization network and long-short term memory network composite model is used to perform fitting prediction on the periodic sequence data S1 and output the third prediction result.
and finally, summing the first prediction result, the second prediction result and the third prediction result to obtain a final target prediction result (as shown in fig. 14).
FIG. 15 shows schematic diagrams of the results of fitting prediction of the periodic sequence data S1 by the LSTM model alone and by the sine wave sequence characterization network and long-short term memory network composite model. Comparing the two plots shows that the fitting result of the composite model (the lower plot in FIG. 15) predicts the fluctuation pattern of the data more accurately (the dotted line is the predicted value and the solid line is the true value). For the trend sequence data C1, prediction is performed using the ARIMA model: as shown in FIG. 16, an autocorrelation plot and a partial autocorrelation plot are drawn for the trend sequence data, the parameters q = 3, p = 1 and d = 0 are determined from them, and the trend sequence data is therefore predicted using ARIMA(1,0,3) to obtain the second prediction result. Finally, the first prediction result, the second prediction result and the third prediction result are fused to obtain the target prediction result of the cargo quantity.
In an exemplary embodiment of the present disclosure, a time-series-based data prediction apparatus is also provided. As shown in fig. 17, the time-series-based data prediction apparatus 1700 may include a data acquisition module 1710, a sequence decomposition module 1720, a fitting prediction module 1730, and a fusion processing module 1740 (a minimal illustrative sketch follows this module list). Specifically:
a data obtaining module 1710, configured to obtain historical time-series data, perform time-series decomposition on the historical time-series data, and obtain trend series data, cycle series data, and error series data;
a sequence decomposition module 1720, configured to perform time series decomposition on the error sequence data, and perform fitting prediction on the obtained multiple decomposed subsequences, so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
a fitting prediction module 1730, configured to perform fitting prediction on the trend sequence data and the cycle sequence data respectively by using prediction models corresponding to the trend sequence data and the cycle sequence data to obtain a second prediction result and a third prediction result;
and a fusion processing module 1740, configured to perform fusion processing on the first prediction result, the second prediction result, and the third prediction result to obtain a target prediction result.
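Purely as an illustrative sketch (all class and method names are hypothetical, not from the disclosure), the four modules above can be grouped as follows.

```python
class TimeSeriesDataPredictor:
    """Illustrative grouping of the four modules of apparatus 1700."""

    def acquire(self, history):
        """Data acquisition module: decompose the historical series into
        trend, periodic and error components."""
        ...

    def decompose_error(self, error_seq):
        """Sequence decomposition module: decompose the error component again,
        fit the fused sub-series and return the first prediction result."""
        ...

    def fit_components(self, trend_seq, cycle_seq):
        """Fitting prediction module: return the second and third prediction
        results from the trend and periodic components."""
        ...

    def fuse(self, first, second, third):
        """Fusion processing module: combine the three results into the
        target prediction result."""
        return first + second + third
```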
In addition, in an exemplary embodiment of the present disclosure, a time-series-based logistics cargo quantity prediction apparatus is also provided. Referring to fig. 18, the time-series-based logistics cargo quantity prediction apparatus 1800 may include a time-series data acquisition module 1810, a time-series decomposition module 1820, a fitting prediction module 1830, and a prediction result determination module 1840. Specifically:
a time sequence data acquiring module 1810, configured to acquire historical cargo quantity time sequence data, perform time sequence decomposition on the historical cargo quantity time sequence data, and obtain trend sequence data, cycle sequence data, and error sequence data;
a time series decomposition module 1820, configured to perform time series decomposition on the error sequence data, and perform fitting prediction on the obtained multiple decomposed subsequences, so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
a fitting prediction module 1830, configured to perform fitting prediction on the trend sequence data and the cycle sequence data respectively by using prediction models corresponding to the trend sequence data and the cycle sequence data to obtain a second prediction result and a third prediction result;
the prediction result determining module 1840 is configured to perform fusion processing on the first prediction result, the second prediction result, and the third prediction result to obtain a target prediction result of the cargo volume.
Since each functional module of the time-series-based data prediction apparatus of the exemplary embodiment of the present disclosure is the same as in the above embodiment of the time-series-based data prediction method, and each functional module of the time-series-based logistics cargo quantity prediction apparatus is the same as in the above embodiment of the time-series-based logistics cargo quantity prediction method, no further description is given here.
It should be noted that although several modules or units of the time-series-based data prediction apparatus and the time-series-based logistics cargo quantity prediction apparatus are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into and embodied by a plurality of modules or units.
In addition, in the exemplary embodiments of the present disclosure, a computer storage medium capable of implementing the above method is also provided, on which a program product capable of implementing the above-described method of this specification is stored. In some possible embodiments, aspects of the present disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "exemplary methods" section above of this specification, when the program product is run on the terminal device.
Referring to fig. 19, a program product 1900 for implementing the above method according to an exemplary embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided. As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 2000 according to such an embodiment of the present disclosure is described below with reference to fig. 20. The electronic device 2000 shown in fig. 20 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 20, the electronic device 2000 is embodied in the form of a general purpose computing device. The components of the electronic device 2000 may include, but are not limited to: the at least one processing unit 2010, the at least one memory unit 2020, the bus 2030 connecting the various system components including the memory unit 2020 and the processing unit 2010, and the display unit 2040.
Wherein the memory unit stores program code executable by the processing unit 2010 to cause the processing unit 2010 to perform steps according to various exemplary embodiments of the present disclosure as described in the "exemplary methods" section above of this specification.
The storage unit 2020 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 2021 and/or a cache memory unit 2022, and may further include a read only memory unit (ROM) 2023.
The storage unit 2020 may also include a program/utility 2024 having a set (at least one) of program modules 2025, such program modules 2025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 2030 may be one or more of any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 2000 may also communicate with one or more external devices 2100 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 2000, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 2000 to communicate with one or more other computing devices. Such communication may occur over an input/output (I/O) interface 2050. Also, the electronic device 2000 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through the network adapter 2060. As shown, the network adapter 2060 communicates with the other modules of the electronic device 2000 via the bus 2030. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 2000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.

Claims (16)

1. A data prediction method based on time series is characterized by comprising the following steps:
acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
performing time series decomposition on the error sequence data, and performing fitting prediction on a plurality of obtained decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
and performing fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
2. The method of claim 1, wherein decomposing the time series of the error sequence data and performing a fitting prediction on the plurality of decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting prediction result comprises:
performing time series decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
discarding the sub-random error sequence data;
fusing the sub-trend sequence data and the sub-cycle sequence data to obtain sub-fusion sequence data;
and fitting and predicting the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
3. The method of claim 2, wherein prior to performing a fitting prediction on the sub-fusion sequence data using a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, the method further comprises:
and labeling and normalizing the sub-fusion sequence data, and determining training set data and test set data according to the sub-fusion sequence data after standard normalization.
4. The method according to claim 3, wherein the prediction model corresponding to the sub-fusion sequence data is a sine wave sequence characterization network and long-short term memory network composite model;
the fitting prediction is performed on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, and the method comprises the following steps:
inputting the training set data into a sine wave sequence characterization network for sample amplification, and inputting the training set data after sample amplification into a long-short term memory network so as to train the sine wave sequence characterization network and the long-short term memory network composite model;
and inputting the test set data into the trained composite model to obtain the first prediction result.
5. The method of claim 3, wherein fitting predictions to the trend sequence data and the cycle sequence data using prediction models corresponding to the trend sequence data and the cycle sequence data, respectively, to obtain a second prediction result and a third prediction result comprises:
respectively determining training set data and test set data corresponding to the trend sequence data and the cycle sequence data;
inputting training set data corresponding to the trend sequence data into a corresponding first prediction model to train the first prediction model, and inputting test set data corresponding to the trend sequence data into the trained first prediction model to obtain a second prediction result;
inputting training set data corresponding to the periodic sequence data to a corresponding second prediction model to train the second prediction model, and inputting test set data corresponding to the periodic sequence data to the trained second prediction model to obtain the third prediction result.
6. The method of claim 5, wherein prior to determining corresponding training set data and test set data from the cycle sequence data, the method further comprises:
and normalizing the periodic sequence data.
7. The method of claim 5, wherein the first predictive model is one of a differential integrated moving average autoregressive model, an exponential smoothing model, or a Theta model; the second prediction model is a sine wave sequence characterization network and long-short term memory network composite model.
8. The method according to claim 6, wherein the fusing the first prediction result, the second prediction result, and the third prediction result to obtain a target prediction result comprises:
carrying out inverse standardization and inverse normalization processing on the first prediction result and the third prediction result;
and adding the second prediction result, the processed first prediction result and the processed third prediction result to obtain the target prediction result.
9. A logistics cargo quantity prediction method based on time series is characterized by comprising the following steps:
acquiring historical cargo quantity time-series data, and performing time-series decomposition on the historical cargo quantity time-series data to obtain trend sequence data, periodic sequence data and error sequence data;
performing time series decomposition on the error sequence data, and performing fitting prediction on a plurality of obtained decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting prediction models corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
and performing fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
10. The method of claim 9, wherein time-series decomposing the error sequence data and fitting and predicting the plurality of decomposed subsequences to determine a first prediction result corresponding to the error sequence data according to the fitting and predicting result comprises:
performing time series decomposition on the error sequence data to obtain sub-trend sequence data, sub-period sequence data and sub-random error sequence data;
discarding the sub-random error sequence data;
fusing the sub-trend sequence data and the sub-cycle sequence data to obtain sub-fusion sequence data;
and fitting and predicting the sub-fusion sequence data by adopting a prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result.
11. The method according to claim 10, wherein the prediction model corresponding to the sub-fusion sequence data is a sine wave sequence characterization network and long-short term memory network composite model;
the fitting prediction is performed on the sub-fusion sequence data by using the prediction model corresponding to the sub-fusion sequence data to obtain the first prediction result, and the method comprises the following steps:
inputting the training set data into a sine wave sequence characterization network for sample amplification, and inputting the obtained amplified training set data into a long-short term memory network so as to train the sine wave sequence characterization network and long-short term memory network composite model;
and inputting the test set data into the trained composite model to obtain the first prediction result.
12. The method according to claim 9, wherein the fusing the first prediction result, the second prediction result, and the third prediction result to obtain the target prediction result of the cargo amount comprises:
and adding the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
13. A time-series-based data prediction apparatus, comprising:
the data acquisition module is used for acquiring historical time sequence data, and performing time sequence decomposition on the historical time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
the sequence decomposition module is used for performing time sequence decomposition on the error sequence data and performing fitting prediction on a plurality of obtained decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
the fitting prediction module is used for respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
and the fusion processing module is used for carrying out fusion processing on the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result.
14. A time-series-based logistics cargo quantity prediction apparatus is characterized by comprising:
the time sequence data acquisition module is used for acquiring historical cargo quantity time sequence data, and performing time sequence decomposition on the historical cargo quantity time sequence data to obtain trend sequence data, periodic sequence data and error sequence data;
the time sequence decomposition module is used for performing time sequence decomposition on the error sequence data and performing fitting prediction on the obtained decomposition subsequences so as to determine a first prediction result corresponding to the error sequence data according to a fitting prediction result;
the fitting prediction module is used for respectively performing fitting prediction on the trend sequence data and the periodic sequence data by adopting a prediction model corresponding to the trend sequence data and the periodic sequence data to obtain a second prediction result and a third prediction result;
and the prediction result determining module is used for fusing the first prediction result, the second prediction result and the third prediction result to obtain a target prediction result of the cargo quantity.
15. A storage medium having stored thereon a computer program which, when executed by a processor, implements the time-series based data prediction method according to any one of claims 1 to 8 or the time-series based logistics cargo quantity prediction method according to any one of claims 9 to 12.
16. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the time series based data prediction method of any one of claims 1 to 8 or the time series based logistics cargo volume prediction method of any one of claims 9 to 12 via execution of the executable instructions.
CN202111100746.6A 2021-09-18 2021-09-18 Data prediction method and device, logistics cargo amount prediction method, medium and equipment Active CN113792931B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111100746.6A CN113792931B (en) 2021-09-18 2021-09-18 Data prediction method and device, logistics cargo amount prediction method, medium and equipment

Publications (2)

Publication Number Publication Date
CN113792931A true CN113792931A (en) 2021-12-14
CN113792931B CN113792931B (en) 2024-06-18

Family

ID=79184130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111100746.6A Active CN113792931B (en) 2021-09-18 2021-09-18 Data prediction method and device, logistics cargo amount prediction method, medium and equipment

Country Status (1)

Country Link
CN (1) CN113792931B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203732A (en) * 2016-07-26 2016-12-07 国网重庆市电力公司 Error in dipping computational methods based on ITD and time series analysis
CN106408341A (en) * 2016-09-21 2017-02-15 北京小米移动软件有限公司 Goods sales volume prediction method and device, and electronic equipment
CN112989271A (en) * 2019-12-02 2021-06-18 阿里巴巴集团控股有限公司 Time series decomposition
WO2021129086A1 (en) * 2019-12-25 2021-07-01 中兴通讯股份有限公司 Traffic prediction method, device, and storage medium
CN111160651A (en) * 2019-12-31 2020-05-15 福州大学 STL-LSTM-based subway passenger flow prediction method
CN111161538A (en) * 2020-01-06 2020-05-15 东南大学 Short-term traffic flow prediction method based on time series decomposition
CN113379168A (en) * 2021-08-11 2021-09-10 云智慧(北京)科技有限公司 Time series prediction processing method, device and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WANG XIN et al.: "A Trend Prediction Method for Failures Time Series Data by Exploring Singular Spectrum Analysis and Support Vector Machines Regression", 2019 IEEE 7th International Conference on Computer Science and Network Technology (ICCSNT), 20 January 2020 (2020-01-20), pages 535-541 *
孟婷婷; 贾宝平: "Research on predicting the volume of sci-tech novelty search projects based on a time-series seasonal decomposition model" (基于时间序列季节性分解模型的科技查新课题量预测研究), 科技创新导报, no. 23, 11 August 2017 (2017-08-11), pages 145-147 *

Also Published As

Publication number Publication date
CN113792931B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
Lazzeri Machine learning for time series forecasting with Python
US20170076321A1 (en) Predictive analytics in an automated sales and marketing platform
US11775412B2 (en) Machine learning models applied to interaction data for facilitating modifications to online environments
Asai et al. Dynamic asymmetric leverage in stochastic volatility models
CN112116184A (en) Factory risk estimation using historical inspection data
CN113159355A (en) Data prediction method, data prediction device, logistics cargo quantity prediction method, medium and equipment
US20140379310A1 (en) Methods and Systems for Evaluating Predictive Models
Ashik et al. Time series model for stock price forecasting in India
CN111754278A (en) Article recommendation method and device, computer storage medium and electronic equipment
Gulay et al. Comparison of forecasting performances: Does normalization and variance stabilization method beat GARCH (1, 1)‐type models? Empirical Evidence from the Stock Markets
CN113837806A (en) Product traffic prediction method, device, electronic equipment and storage medium
Jang et al. Detection and prediction of house price bubbles: Evidence from a new city
CN115423040A (en) User portrait identification method and AI system of interactive marketing platform
Stødle et al. Data‐driven predictive modeling in risk assessment: Challenges and directions for proper uncertainty representation
CN113947439A (en) Demand prediction model training method and device and demand prediction method and device
US20140282034A1 (en) Data analysis in a network
Akande et al. Application of XGBoost Algorithm for Sales Forecasting Using Walmart Dataset
CN113792931B (en) Data prediction method and device, logistics cargo amount prediction method, medium and equipment
Chen et al. An integrated model for maintenance policies and production scheduling based on immune–culture algorithm
CN112328899B (en) Information processing method, information processing apparatus, storage medium, and electronic device
Sabbani et al. Business matching for event management and marketing in mass based on predictive algorithms
Rigatos et al. Forecasting of commodities prices using a multi‐factor PDE model and Kalman filtering
Shirata et al. A Proposal of a Method to Determine the Appropriate Learning Period in Stock Price Prediction Using Machine Learning
Oust et al. Assessing the explanatory power of dwelling condition in automated valuation models
US20240211250A1 (en) Software version control using forecasts as covariate for experiment variance reduction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant