WO2022053064A1 - Method and apparatus for time sequence prediction - Google Patents

Method and apparatus for time sequence prediction

Info

Publication number
WO2022053064A1
Authority
WO
WIPO (PCT)
Prior art keywords
future
historical
data sequence
time series
neural network
Prior art date
Application number
PCT/CN2021/118272
Other languages
French (fr)
Chinese (zh)
Inventor
朱云依
Original Assignee
胜斗士(上海)科技技术发展有限公司
Priority date
Filing date
Publication date
Application filed by 胜斗士(上海)科技技术发展有限公司
Publication of WO2022053064A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • the present application relates to time series forecasting, and in particular, to a method, apparatus, and computer-readable storage medium for predicting future data of an object based on historical data of the object.
  • Forecasting sales expectations for future times based on product sales over a past period of time is known as product time series forecasting.
  • the current mainstream technologies for time series forecasting include two categories: one is the traditional statistics-based forecasting algorithm represented by Arima/Prophet, and the other is the deep learning-based forecasting algorithm represented by the LSTM neural network.
  • time series prediction algorithms based on traditional statistics are linear algorithms, and it is difficult for them to capture the nonlinear and long-term patterns in a time series.
  • time series prediction algorithms based on the LSTM neural network are prone to vanishing or exploding gradients when the time series grows large, which distorts the prediction results; they also run inefficiently, with redundant data and computation.
  • the present application proposes a method, an apparatus and a computer-readable storage medium for time series prediction, which address at least one defect in the prior-art solutions by extracting patterns from the historical data of an object and combining them with future influencing factors to predict the object's future data.
  • a method for time series prediction comprising:
  • a predicted feature data sequence is generated based on the regular data sequence, the future dynamic feature sequence of the object corresponding to the future time series, and the future static features of the object, wherein the future dynamic feature sequence includes a future dynamic feature of the object corresponding to each future time in the future time series, the future dynamic feature being associated with the corresponding future time;
  • an apparatus for time series prediction comprising:
  • a historical data acquisition unit configured to acquire a historical data sequence of the object corresponding to a historical time series, the historical data in the historical data sequence including the historical dynamic features and historical values of the object corresponding to the historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
  • a regularity extraction unit configured to use a first neural network model to extract the regularity data sequence of the object corresponding to the future time sequence based on the historical data sequence
  • a prediction feature generation unit configured to generate a predicted feature data sequence based on the regular data sequence, the future dynamic feature sequence of the object corresponding to the future time series, and the future static features of the object, wherein the future dynamic feature sequence includes the future dynamic features of the object corresponding to the future times of the future time series, the future dynamic features being associated with the corresponding future times; and
  • a prediction unit configured to use a second neural network model to predict, based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, where the future data in the future data sequence includes the predicted future values of the object corresponding to the future times of the future time series.
  • a computer-readable storage medium on which a computer program is stored, the computer program including executable instructions that, when executed by at least one processor, implement the above-mentioned method.
  • an electronic device comprising a processor and a memory for storing executable instructions of the processor, wherein the processor is configured to execute the executable instructions to implement the method described above.
  • the time series prediction method and apparatus can meet the requirements of efficient computation, accurately capture the nonlinear effects of trend factors, seasonal factors, external factors and the like on the predicted object, and perform both short-range and long-range time prediction.
  • FIG. 1 is a schematic diagram of a seq2seq neural network model architecture for time series prediction according to an embodiment of the present application
  • FIG. 2 is an exemplary flowchart of a method for time series forecasting according to an embodiment of the present application
  • FIG. 3 is a schematic block diagram of an apparatus for time series prediction according to an embodiment of the present application.
  • FIG. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application.
  • although the method and apparatus for predicting the future data of an object based on the historical data of the object are introduced hereinafter with a specific neural network model structure according to an embodiment, the solution of the present application is not limited to this example; it can be extended to other neural network structures capable of realizing the time series forecasting concept of the present application, and also to other deep learning-based forecasting model structures.
  • the time series prediction method is introduced below using products sold at catering-industry sales locations as the object, but the method of the present application can be applied to any object and scenario that requires time series prediction.
  • the neural network generally refers to an artificial neural network (ANN).
  • a common convolutional neural network can be used for the neural network, and a fully convolutional neural network can be further used as the case may be.
  • other specific types and structures of neural networks that are not relevant to the time series forecasting method of the present application are not described in detail herein, to avoid confusion.
  • both the traditional statistics-based autoregressive integrated moving average (ARIMA) model forecasting algorithm and the time series model Prophet forecasting algorithm can be used to predict trends, seasonality and other time-related patterns. They first decompose the historical data corresponding to the historical time series into a linear superposition of trend factors, seasonal factors and external influencing factors, predict the impact of each of these factors on the data at future times separately, and finally superimpose the three per-factor prediction results to obtain the final prediction.
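  • as an illustration of this linear superposition (a sketch of the general idea, not code from the patent; the fitted components below are hypothetical stand-ins):

```python
import numpy as np

t = np.arange(52)                               # weekly future times
trend = 100.0 + 0.5 * t                         # fitted trend factor
seasonal = 10.0 * np.sin(2 * np.pi * t / 52.0)  # fitted seasonal factor
external = np.where(t % 13 == 0, 5.0, 0.0)      # fitted external factor, e.g. promotions

# Final prediction = linear superposition of the three per-factor predictions
forecast = trend + seasonal + external
```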
  • however, these approaches have drawbacks: linear algorithms struggle to capture the patterns that exist in time series; different time series are predicted independently, so the relationships between different time series are not considered, each individual forecast is not accurate enough, and their simple linear superposition cannot accurately reflect the real trend of the process; and when a time series is relatively short and the corresponding amount of historical data is small, a linear algorithm can neither capture long-term patterns nor learn from other series.
  • in the deep learning-based category represented by the LSTM (Long Short-Term Memory) network, the historical observations, historical influencing factors and future influencing factors of the variables corresponding to the time series are used as the input of the neural network model structure, and the future predicted values of the variables are used as its output.
  • the LSTM network is a recurrent neural network specially designed to solve the long-term dependency problem of the general RNN (recurrent neural network), and is suitable for processing and predicting events with very long intervals and delays in a time series.
  • LSTM networks generally outperform ordinary recurrent neural networks and hidden Markov models (HMMs).
  • the important structure in the LSTM network is the gate: the input gate determines whether new input is accepted into the block, the forget gate determines whether the existing information in the block memory is retained or discarded, and the output gate determines whether the information in the block memory is passed out.
  • LSTM networks are usually trained using gradient descent.
  • although the LSTM network model can overcome the poor long-term forecasting performance of the Arima/Prophet algorithms, it still cannot meet many requirements of time series forecasting.
  • time series prediction process of the present application is described below with reference to the seq2seq neural network model architecture of FIG. 1 and the method flow for time series prediction of FIG. 2 according to an embodiment of the present application.
  • the basic structure of the neural network model architecture 100 in FIG. 1 is a seq2seq (sequence to sequence) network model.
  • the seq2seq neural network model can be regarded as a transformation model.
  • the basic idea is that the former neural network model of the two neural network models connected in series is used as the encoder network, and the latter neural network model is used as the decoder network.
  • the encoder network converts a sequence of data into a vector or sequence of vectors, and the decoder network generates another sequence of data from that vector or sequence of vectors.
  • a typical usage scenario of the seq2seq network model is sentence translation, in which the encoder network converts an English sentence into semantic data or a semantic sequence, and the decoder network converts that semantic data or semantic sequence into the corresponding Chinese sentence.
  • the optimization of the seq2seq network model can use the maximum likelihood estimation method to maximize the probability of the data sequence generated by the decoder to obtain the optimal conversion effect.
  • the seq2seq neural network model architecture 100 includes a first neural network model 110 as an encoder and a second neural network model 120 as a decoder.
  • the first neural network model 110 is used for extracting information in the historical data, especially regular data reflecting the regularity in the historical data.
  • the first neural network model 110 is a WaveNet neural network.
  • the WaveNet network is designed to predict the value of the n-th element of a data sequence based on its first n-1 elements.
  • WaveNet is particularly suited to high-throughput input of one-dimensional data sequences whose elements are multi-dimensional vectors, which enables fast computation.
  • the standard WaveNet network model is a convolutional neural network in which each convolutional layer convolves the output of the previous layer; the larger the network's convolution kernels and the more layers it has, the stronger its perception ability in the time domain and the larger its receptive field.
  • each time an output node is generated, it can be appended as the last node of the input layer, so that subsequent outputs are generated iteratively.
  • the activation function of the WaveNet network can use gated units, for example.
  • the hidden layers between the input layer and the output layer of the network adopt residual and skip connections, that is, each node of a convolutional layer in a hidden layer adds its original value to the output value of the activation function and passes the sum to the next convolutional layer.
  • the operation of reducing the number of channels can be achieved through a 1x1 convolution kernel; the activation-function outputs of the hidden layers are then summed and finally passed through the output layer.
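  • the following PyTorch sketch (our own illustrative class and parameter names, not code from the patent) shows one such hidden layer, combining a gated activation, a residual connection and a 1x1-convolved skip output:

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """One hidden layer of a WaveNet-style network: gated activation,
    residual connection to the next layer, and a skip output.
    A minimal sketch; the channel width is an assumption."""

    def __init__(self, channels: int, dilation: int):
        super().__init__()
        # Two parallel dilated convolutions feed the gate: tanh(filter) * sigmoid(gate)
        self.filter_conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        self.gate_conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        # 1x1 convolutions mix/reduce channels for the residual and skip paths
        self.residual_1x1 = nn.Conv1d(channels, channels, kernel_size=1)
        self.skip_1x1 = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor):
        # Left-pad so the convolution stays causal (no future information leaks in)
        padded = nn.functional.pad(x, (self.filter_conv.dilation[0], 0))
        gated = torch.tanh(self.filter_conv(padded)) * torch.sigmoid(self.gate_conv(padded))
        residual = x + self.residual_1x1(gated)  # original value + activation output
        skip = self.skip_1x1(gated)              # summed across layers before the output layer
        return residual, skip
```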
  • the first neural network model 110 has an input layer (ie, a first convolutional layer) 112 , a hidden layer 113 and an output layer 114 .
  • the number of hidden layers 113 may be zero, one or more.
  • the input layer 112 , the hidden layer 113 and the output layer 114 each have a plurality of nodes 111 .
  • the number of nodes in the input layer 112 should at least correspond to the data length in the historical data sequence to ensure that the neural network can receive information at each historical time.
  • when the first neural network model 110 uses the ordinary WaveNet network, it is a one-dimensional causal convolutional network in which the number of usable nodes decreases by 1 with each convolutional layer. If the historical data is long, either many layers must be added to the first neural network model 110 or a large filter is required; the gradients selected during gradient descent then become too small, the training of the network becomes complicated, and the fit is poor.
  • a dilated convolutional neural network (Dilated CNN) is a convolutional network with "holes".
  • the first convolutional layer (ie, the input layer) of the dilated convolutional neural network may be a one-dimensional causal convolutional network with an expansion coefficient of 1.
  • the dilation coefficient of each convolutional layer is the dilation coefficient of the previous convolutional layer multiplied by the dilation index (Dilation Index), where the dilation index is a value not less than 2 and not greater than the size of the convolutional layer.
  • This dilated convolutional neural network configuration can be employed in both the hidden layer and the output layer of the first neural network model 110 .
  • for example, when the dilation index is 2, the second convolutional layer only uses nodes n, n-2, n-4, ... for convolution, the third convolutional layer only uses nodes n, n-4, n-8, ..., and so on.
  • the dilated neural network structure can significantly speed up the information transfer process in the neural network, avoid gradient vanishing or gradient explosion, and improve the processing speed and prediction accuracy of the first neural network model 110.
  • for example, when the convolution kernel size is 2 and the dilation index is 2, the number of convolutional layers through which information passes from the node of the first historical time to the node of the last historical time is log_2(N), where N is the length of the historical data sequence.
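  • a minimal sketch of such a stack (assuming kernel size 2, dilation index 2, and reusing the GatedResidualBlock sketched above; the channel width is our own choice) shows that log_2(128) = 7 layers already cover a history of N = 128 points:

```python
import math
import torch.nn as nn

N = 128                            # length of the historical data sequence
num_layers = int(math.log2(N))     # 7 layers for kernel size 2, dilation index 2

# Dilation coefficients 1, 2, 4, ..., 64: each layer's coefficient is the
# previous one multiplied by the dilation index (here 2).
dilations = [2 ** i for i in range(num_layers)]

blocks = nn.ModuleList(
    [GatedResidualBlock(channels=32, dilation=d) for d in dilations]
)

# Receptive field of the stack: 1 + sum of (kernel_size - 1) * dilation
receptive_field = 1 + sum((2 - 1) * d for d in dilations)
assert receptive_field >= N  # 1 + (1 + 2 + 4 + ... + 64) = 128, covers all N inputs
```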
  • the second neural network model 120 serving as the decoder may be a multi-layer perceptron (MLP) network.
  • the MLP network also includes an input layer, hidden layers and an output layer, where each neuron node has an activation function (such as a sigmoid function), and the network is trained using a loss function.
  • the MLP network predicts the future values of the object based on the historical patterns extracted by the encoder network and the future influencing factors (including dynamic and static factors).
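  • a hedged sketch of such a decoder (the layer sizes and helper name are illustrative assumptions, not taken from the patent):

```python
import torch.nn as nn

def make_mlp_decoder(in_dim: int, hidden_dim: int, out_dim: int) -> nn.Sequential:
    """Minimal MLP decoder sketch: maps one predicted-feature vector
    (regularity + future dynamic + static features) to a predicted future value."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.Sigmoid(),                 # per-node activation, e.g. sigmoid as in the text
        nn.Linear(hidden_dim, hidden_dim),
        nn.Sigmoid(),
        nn.Linear(hidden_dim, out_dim),
    )

# Applied independently at each future time t_(n+j) of the future time series
decoder = make_mlp_decoder(in_dim=24, hidden_dim=64, out_dim=1)
```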
  • the time series prediction method of the present application can use any other neural network capable of feature extraction and prediction on sequence data, such as, but not limited to, the various types of recurrent neural networks (RNNs) that can implement the time series prediction function of the present application.
  • for example, the encoder network can use an LSTM network while the decoder network uses an MLP network.
  • although the LSTM network has shortcomings, combining it with the MLP network and adjusting the input and output data sequences of the network to a certain extent can still obtain better results than the existing schemes. One can also choose the WaveNet network as the encoder network and an LSTM network or another RNN network as the decoder network, and so on.
  • the unit of historical time and/or future time can be selected from hour, day, month, year, week, quarter, etc. as required.
  • the historical time and/or the future time may be a time point (for example, the t1-th moment, the end of the first quarter, 10 a.m., etc.), or may be a continuous time period (for example, the t2-th period, week 2, month 3, October of the current year, etc.).
  • when the historical times and/or future times are time points, the time intervals between them may be the same, so that the periodic information in the historical data and future data can be extracted and predicted with that common interval as the period.
  • when they are time periods, the lengths of the time periods may likewise be the same, so that the periodic information of the above-mentioned historical data and future data can be extracted and predicted with that common length as the period.
  • the method 200 first acquires the historical data sequence 101 of the object corresponding to the historical time series T1 in step S210.
  • the historical value y_i is the measured value of the object at the historical time t_i, such as the actual sales value of a product.
  • the historical value is caused by the internal factors of the object, so it can also be called the internal factors of the object or the internal characteristic data.
  • the historical dynamic feature x_i is the dynamic feature that affects the historical value y_i of the object, for example including one or more of whether the time is a holiday, the number of working days, the number of days or weeks until a holiday, and so on.
  • historical dynamic features are associated with time, and include, for example, periodic factors that affect the object cyclically with a certain period (also called periodic historical dynamic features) and aperiodic factors that affect the object aperiodically (also called aperiodic historical dynamic features).
  • the period of a periodic factor may be determined by the common time interval between the historical time points in the historical time series, or by the common length of the historical time periods.
  • the way an aperiodic factor affects the object is tied to a specific historical time; it can be said to be random, or triggered by events, and the corresponding aperiodic factors may differ from one historical time to another.
  • the number n of historical times represents the number or length of historical data.
  • the historical value y_i may be a multidimensional variable or vector.
  • since the historical dynamic feature x_i that affects the historical value y_i includes many factors, it can be regarded as a combination of multiple historical sub-dynamic features, so x_i can also be a multidimensional variable or vector.
  • a historical dynamic feature x_i and a historical value y_i can form a two-dimensional vector (x_i, y_i)^T (also called a binary data group; hereinafter referred to uniformly as a two-dimensional vector), in which the sub-vectors x_i and y_i are themselves multidimensional vectors as described above.
  • the historical data sequence 101 can then be represented as a one-dimensional sequence of two-dimensional vectors {(x_1, y_1)^T, (x_2, y_2)^T, ..., (x_n, y_n)^T}.
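  • a small sketch of assembling this encoder input (the shapes are illustrative assumptions; the 23-dimensional dynamic feature echoes the experiment described later):

```python
import torch

# Hypothetical sizes: n historical times, dx-dimensional dynamic features,
# dy-dimensional historical values (e.g. sales of dy products).
n, dx, dy = 128, 23, 1

x_hist = torch.randn(n, dx)   # historical dynamic features x_1 ... x_n
y_hist = torch.randn(n, dy)   # historical values y_1 ... y_n

# Concatenate each (x_i, y_i) pair into one vector; the sequence of these
# vectors is the encoder input, shaped (batch=1, channels=dx+dy, length=n)
# as a 1-D convolutional network such as WaveNet expects.
hist_seq = torch.cat([x_hist, y_hist], dim=1)  # (n, dx + dy)
encoder_input = hist_seq.t().unsqueeze(0)      # (1, dx + dy, n)
```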
  • the first neural network model 110 serving as an encoder does not process historical static data, which can reduce the redundancy of data and calculation, and improve the operation speed of the network model.
  • the first neural network model 110 is used to complete the regularity extraction function of extracting the regularity data sequence 102 of the object corresponding to the future time series T2 based on the historical data sequence 101.
  • the historical data sequence 101 is used as the input of the first neural network model 110, and the extracted regular data sequence 102 is output through the transmission and calculation of each convolutional layer of the WaveNet network.
  • the dilated convolutional neural network described above can speed up the regular information extraction process of the regular data sequence 102 and improve the information extraction accuracy.
  • the historical value of the object may be affected periodically, for example with the month as the period.
  • the time interval between the future times t_(n+j) of the future time series T2 (when the future time is a time point) and/or the length of each future time (when the future time is a time period) is set to be the same as in the historical time series T1, so the periodic regularity feature c_a is the same for every future time t_(n+j).
  • the aperiodic regularity feature c_(n+j) reflects how the future value y_(n+j) of the object is affected by a specific future time, so c_(n+j) may be different for each future time t_(n+j) of the future time series T2.
  • m is the number of future times in the future time series T2, indicating the number or length of future data to be predicted.
  • the periodic regularity feature c_a also includes multiple sub-periodic regularity features, and can be expressed as a multi-dimensional vector.
  • the dimension of the periodic regularity feature c_a may be the same as the number of sub-periodic historical dynamic features in the periodic historical dynamic feature, or smaller than the latter to reduce the amount of computation.
  • likewise, an aperiodic historical dynamic feature may have multiple sub-aperiodic dynamic features, so the aperiodic regularity feature c_(n+j) also includes multiple sub-aperiodic regularity features and can be represented as a multi-dimensional vector.
  • the dimension of the aperiodic regularity feature c_(n+j) may be the same as the number of sub-aperiodic historical dynamic features in the aperiodic historical dynamic feature, or smaller than the latter to reduce the amount of computation.
  • the regular data sequence 102 can be represented as a one-dimensional sequence of two-dimensional vectors whose elements consist of the two multi-dimensional sub-vectors, the periodic regularity feature c_a and the aperiodic regularity feature c_(n+j): {(c_a, c_(n+1))^T, (c_a, c_(n+2))^T, ..., (c_a, c_(n+m))^T}.
  • x_(n+j) in FIG. 1 is the future dynamic feature that affects the future value y_(n+j) of the object corresponding to the future time t_(n+j) in the future time series T2.
  • the future dynamic feature x_(n+j) may include, for example, one or more of promotional activities at a certain future time, whether it is a holiday, the number of working days, the number of days or weeks until a holiday, and so on.
  • future dynamic features are likewise associated with time, and each may be a multi-dimensional vector that includes multiple sub-future dynamic features.
  • the future dynamic features x_(n+j) form a one-dimensional sequence of multi-dimensional vectors {x_(n+1), x_(n+2), ..., x_(n+m)}.
  • the future static features x_s may include properties of the object (which are generally relevant only to the object itself, not to future time) and other features that are not time-dependent.
  • for example, the future static features x_s can be the category of a product, the temperature of a product, the sales location of a product (for example, represented by the location of its distribution center), etc.; these features are associated only with the object and do not vary over time.
  • the future static feature x_s can be a multi-dimensional vector composed of multiple sub-features.
  • the future static feature x_s can be further processed.
  • the future static features x_s can be divided into different types, with different correlations between the types. An embedding operation can transform sparse discrete variables into continuous variables. Embedding the future static features by type, for example dividing them into two groups x_s1 and x_s2 according to location-related features and product-attribute-related features, makes the different groups uncorrelated, that is, keeps them orthogonal. This avoids treating each specific static influencing factor as a separate variable or vector dimension, thereby reducing the overall dimension of the future static features x_s and the computational load of the model (a code sketch of such grouped embeddings follows the static-feature bullets below).
  • the future static feature group x_s1 includes the multi-dimensional future static feature e_1, and the group x_s2 includes the multi-dimensional future static feature e_2; both can affect the object at every future time t_(n+j).
  • the number of future static features or future static feature groups may be 0, 1 or more.
  • the number of specific features included in each future static feature determines its dimension, which can be one or more.
  • the future static features x_s form a one-dimensional sequence of length m whose elements are the (0-, 1- or multi-dimensional) vector x_s: {x_s, x_s, ..., x_s}.
  • with the two feature groups above, this one-dimensional sequence can be expressed as {(e_1, e_2)^T, (e_1, e_2)^T, ..., (e_1, e_2)^T}.
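  • a hedged sketch of the grouped embedding (the cardinalities loosely echo the experiment's ~20 distribution centers and ~200 products, but the ids and embedding sizes are our own assumptions):

```python
import torch
import torch.nn as nn

# x_s1 = location-related features, x_s2 = product-attribute-related features
location_embed = nn.Embedding(num_embeddings=20, embedding_dim=4)   # ~20 distribution centers
product_embed = nn.Embedding(num_embeddings=200, embedding_dim=8)   # ~200 products

location_id = torch.tensor([3])    # sparse discrete id of the sales location
product_id = torch.tensor([57])    # sparse discrete id of the product category

e1 = location_embed(location_id)   # dense, continuous e_1: shape (1, 4)
e2 = product_embed(product_id)     # dense, continuous e_2: shape (1, 8)

m = 4                              # number of future times to predict
# Static features do not vary with time, so tile (e_1, e_2) across all m steps
static_seq = torch.cat([e1, e2], dim=1).expand(m, -1)  # (m, 12)
```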
  • the predicted feature data sequence 103 may be generated based on the regular data sequence 102, the future dynamic feature sequence of the object corresponding to the future time series, and the future static feature of the object.
  • the generating process can be completed by splicing the one-dimensional sequences of the four influencing factors to form a one-dimensional predicted feature data sequence 103 whose elements are vectors spliced from the four factors (which can also be referred to as quaternary data groups), as shown in FIG. 1.
  • the one-dimensional sequence 103 can be represented as {(c_a, c_(n+1), e_1, e_2, x_(n+1))^T, (c_a, c_(n+2), e_1, e_2, x_(n+2))^T, ..., (c_a, c_(n+m), e_1, e_2, x_(n+m))^T}.
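  • the splice itself is a concatenation along the feature dimension; a self-contained sketch (all shapes are illustrative assumptions):

```python
import torch

m = 4                                  # number of future times to predict
c_a = torch.randn(1, 8).expand(m, -1)  # periodic regularity feature, identical at each step
c_nj = torch.randn(m, 8)               # aperiodic regularity features c_(n+1) ... c_(n+m)
e_static = torch.randn(m, 12)          # tiled static embeddings (e_1, e_2), see sketch above
x_future = torch.randn(m, 23)          # future dynamic features x_(n+1) ... x_(n+m)

# Splice the four factor sequences: each row is the quaternary group
# (c_a, c_(n+j), e_1, e_2, x_(n+j)) that is fed to the decoder.
predicted_features = torch.cat([c_a, c_nj, e_static, x_future], dim=1)  # (m, 51)
```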
  • the predicted feature data sequence 103 is input into the second neural network model 120, and the prediction of the future data sequence 104 corresponding to the future time series T2 is completed through the transfer and computation of a decoder such as a multi-layer perceptron (MLP) network.
  • each predicted future value y_(n+j) of the object in the future data sequence 104 is a multi-dimensional vector having the same dimensions as the historical values y_i.
  • the method 200 may also optionally include, prior to using at least one of the first and second neural network models 110 and 120 as the encoder and decoder networks, respectively, a step S250 of training the neural network model with a training data set to determine the optimal parameters of the model.
  • the parameters of the neural network model can remain unchanged during use after training is completed, or they can be updated or adjusted based on a new data set after a period of use or at a predetermined interval; the model parameters can also be updated in real time by means of online supervision.
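  • a hedged sketch of one such training step (the batch keys, shapes and MSE loss are our own assumptions; the patent only specifies training by gradient descent to determine optimal parameters):

```python
import torch
import torch.nn as nn

def train_step(encoder: nn.Module, decoder: nn.Module,
               batch: dict, optimizer: torch.optim.Optimizer) -> float:
    """One gradient-descent update over a training batch (step S250 sketch)."""
    optimizer.zero_grad()
    regularity = encoder(batch["history"])           # regular data sequence 102
    features = torch.cat([regularity,
                          batch["future_dynamic"],   # x_(n+j) sequence
                          batch["static"]], dim=1)   # tiled (e_1, e_2)
    prediction = decoder(features)                   # future data sequence 104
    loss = nn.functional.mse_loss(prediction, batch["future_targets"])
    loss.backward()                                  # gradient descent, as in the text
    optimizer.step()
    return loss.item()

# One optimizer can cover both networks, e.g.:
# optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
```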
  • FIG. 3 shows an exemplary structure of an apparatus 300 for time series prediction according to an embodiment of the present application.
  • the apparatus 300 includes a historical data acquisition unit 310 , a regularity extraction unit 320 , a prediction feature generation unit 330 and a prediction unit 340 .
  • the historical data acquisition unit 310 is configured to acquire the historical data sequence 101 of the object corresponding to the historical time series T1.
  • the historical data in the historical data sequence 101 includes the time-associated historical dynamic features x_i corresponding to the historical times t_i in the historical time series T1, and the historical values y_i of the object.
  • the regularity extraction unit 320 includes, for example, the first neural network model 110 as an encoder network in the seq2seq neural network model to extract the regularity of historical data. This unit is used to extract the regular data sequence 102 of the object corresponding to the future time series T2 from the historical data sequence 101 provided by the historical data acquisition unit 310 by using the neural network model.
  • the regular data sequence 102 includes the periodic regularity feature c_a of the object corresponding to the future times t_(n+j) of the future time series T2, and the aperiodic regularity features c_(n+j) associated with the corresponding future times.
  • the encoder network can choose a sequence data network model such as the WaveNet network, and can further adopt a structure such as a dilated convolutional network to speed up information transfer and computation.
  • the prediction feature generation unit 330 is configured to combine the regularity data sequence 102 output by the regularity extraction unit 320, the future dynamic feature sequence composed of the future dynamic features x_(n+j) corresponding to the future times t_(n+j) in the future time series T2, and the future static features x_s to generate the predicted feature data sequence 103.
  • each future dynamic feature x_(n+j) is associated with its future time t_(n+j).
  • the prediction feature generation unit 330 may further group the static features x_s to orthogonalize each group of static features, thereby reducing the vector dimension of each data element of the predicted feature data sequence.
  • the prediction unit 340 comprises, for example, the second neural network model 120 as a decoder network in a seq2seq neural network model to predict future values of the object.
  • the unit 340 is used to predict the future data sequence 104 of the object corresponding to the future time series T2, using the second neural network model 120, from the predicted feature data sequence 103 provided by the prediction feature generation unit 330.
  • the second neural network model 120 may use a neural network such as a multi-layer perceptron (MLP) network.
  • the apparatus 300 also optionally includes a model training unit 350 for training the corresponding neural network models to determine the optimal model parameters before the neural network models are used in the above-mentioned extraction unit 320 and prediction unit 340; this unit can also supervise or update the parameters of the models.
  • the experiment is carried out in the scenario of product prediction in the catering industry, and the test task requires predicting the sales volume of each product (object) at each distribution center over the next 1-4 weeks.
  • the test dataset covers about 20 distribution centers, each including on average about 200 products; the historical product sales data ranges from 1 week to 128 weeks in length.
  • the test task considers 23 dynamic influencing factors (such as whether it is a holiday, the number of working days, the number of weeks until the Spring Festival, etc.) and 7 static influencing factors (such as product classification, temperature, location of the distribution center, etc.) in the prediction.
  • Table 1 shows the training time, prediction time and prediction error of the models using different time series forecasting methods.
  • the deep learning methods using the seq2seq neural network model require a large number of floating-point operations and, unlike the traditional statistical algorithm Prophet, additionally use a graphics processing unit (GPU) to accelerate the computation.
  • the prediction accuracy (error) of the scheme according to the embodiment of the present application using the WaveNet-MLP seq2seq structure (WaveNet network as the encoder and MLP network as the decoder) is better than that of the traditional statistical algorithms, and also better than that of the seq2seq neural network model structure in which both the encoder and decoder networks adopt the LSTM network model.
  • the solutions using neural network models are faster than the traditional statistical algorithms; and among the solutions using neural network models, the training time of the WaveNet-MLP seq2seq neural network model structure of the present application is significantly reduced.
  • the advantages of the time series prediction method and apparatus lie in the following aspects: using two neural network models such as the WaveNet network and the MLP network as the encoder and decoder networks, respectively, allows the historical data sequence and the future data sequence to be computed in parallel across the different historical times of the corresponding historical time series and the different future times of the future time series, improving the speed of model training and use; using a neural network model such as the WaveNet network as the encoder, especially with the dilated convolutional network structure, shortens the transmission path of information in the object's historical data sequence from the first historical time to the last historical time, avoiding gradient vanishing and gradient explosion during the training of the neural network, so that long-range time series prediction can be performed; introducing the influencing factors that do not change with time only at the input of the second neural network model serving as the decoder avoids duplicating them and their computation at each time point of the encoder network, thereby reducing the redundancy of data and computation; and embedded grouping of the static influencing factors reduces the overall dimension of the features and the computational load of the model.
  • modules or units of the apparatus for time series prediction are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present application, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units to be embodied. Components shown as modules or units may or may not be physical units, ie may be located in one place, or may be distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present application. Those of ordinary skill in the art can understand and implement it without creative effort.
  • a computer-readable storage medium on which a computer program is stored, the program including executable instructions which, when executed by, for example, a processor, can implement the steps of the method for time series prediction described in any one of the above embodiments.
  • various aspects of the present application can also be implemented in the form of a program product, which includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the method for time series prediction in this specification.
  • the program product for implementing the above method according to the embodiments of the present application may adopt a portable compact disc read-only memory (CD-ROM), include the program code, and run on a terminal device such as a personal computer.
  • the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code therein. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • an electronic device which may include a processor, and a memory for storing executable instructions of the processor.
  • the processor is configured to perform the steps of the method for time series prediction in any one of the foregoing embodiments by executing the executable instructions.
  • aspects of the present application may be implemented as a system, method or program product. Therefore, various aspects of the present application can be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or a combination of hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
  • the electronic device 400 according to this embodiment of the present application is described below with reference to FIG. 4 .
  • the electronic device 400 shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • electronic device 400 takes the form of a general-purpose computing device.
  • Components of the electronic device 400 may include, but are not limited to, at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410), a display unit 440, and the like.
  • the storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps of the various exemplary embodiments of the present application described in the method for time series prediction in this specification.
  • the processing unit 410 may perform the steps shown in FIG. 2 .
  • the storage unit 420 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 4201 and/or a cache storage unit 4202 , and may further include a read only storage unit (ROM) 4203 .
  • the storage unit 420 may also include a program/utility 4204 having a set (at least one) of program modules 4205, including but not limited to an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the bus 430 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
  • the electronic device 400 may also communicate with one or more external devices 500 (eg, keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with Any device (eg, router, modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interface 450 . Also, the electronic device 400 may communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460 . Network adapter 460 may communicate with other modules of electronic device 400 through bus 430 . It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
  • the exemplary embodiments described herein may be implemented by software, or by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to cause a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the method for time series prediction according to an embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Biophysics (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and apparatus for time sequence prediction and a computer-readable storage medium. The method comprises: acquiring a historical data sequence of an object corresponding to a historical time sequence (S210); using a first neural network model to, on the basis of the historical data sequence, extract a regular data sequence corresponding to a future time sequence (S220); generating a predicted feature data sequence on the basis of the regular data sequence, a future dynamic feature sequence corresponding to the future time sequence, and a future static feature (S230); and using a second neural network model to, on the basis of the predicted feature data sequence, predict a future data sequence of an object corresponding to the future time sequence (S240). The described method can meet the requirements of high-efficiency calculation, and accurately capture the nonlinear impact of trend factors, seasonal factors, external factors, and the like on a predicted object, while simultaneously carrying out short-distance and long-distance time prediction.

Description

Method and apparatus for time series forecasting
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the priority of Chinese Patent Application No. 202010959817.7, filed on September 14, 2020, the disclosure of which is hereby incorporated in its entirety as a part of this application.
TECHNICAL FIELD
The present application relates to time series forecasting, and in particular to a method, apparatus, and computer-readable storage medium for predicting future data of an object based on historical data of the object.
BACKGROUND
In industries such as retail and catering, it is necessary to estimate sales data for a future period based on a product's historical sales, for use in stocking, distributing, and updating the product. Accurately predicting future product sales can effectively reduce costs, uncover business opportunities in time, and allow business strategies to be adjusted quickly to improve competitiveness.
Forecasting sales expectations for future times based on product sales over a past period is known as product time series forecasting. The current mainstream technologies for time series forecasting fall into two categories: traditional statistics-based forecasting algorithms represented by Arima/Prophet, and deep learning-based forecasting algorithms represented by the LSTM neural network.
However, time series prediction algorithms based on traditional statistics are linear and struggle to capture nonlinear and long-term patterns in a time series. Prediction algorithms based on the LSTM neural network are prone to vanishing or exploding gradients when the time series grows large, which distorts the prediction results; they also run inefficiently, with redundant data and computation.
Therefore, there is a need for improved time series forecasting methods.
It should be noted that the information disclosed in the Background section above is only for enhancing the understanding of the background of the application, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
SUMMARY
The present application proposes a method, an apparatus, and a computer-readable storage medium for time series prediction, which address at least one defect in the prior-art solutions by extracting patterns from an object's historical data and combining them with future influencing factors to predict the object's future data.
According to one aspect of the present application, a method for time series prediction is proposed, comprising:
acquiring a historical data sequence of an object corresponding to a historical time series, the historical data in the historical data sequence including historical dynamic features and historical values of the object corresponding to historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
using a first neural network model to extract, based on the historical data sequence, a regularity data sequence of the object corresponding to a future time series;
generating a predicted feature data sequence based on the regularity data sequence, a future dynamic feature sequence of the object corresponding to the future time series, and future static features of the object, wherein the future dynamic feature sequence includes future dynamic features of the object corresponding to future times in the future time series, the future dynamic features being associated with the corresponding future times; and
using a second neural network model to predict, based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, the future data in the future data sequence including predicted future values of the object corresponding to the future times of the future time series.
According to another aspect of the present application, an apparatus for time series prediction is proposed, comprising:
a historical data acquisition unit configured to acquire a historical data sequence of the object corresponding to a historical time series, the historical data in the historical data sequence including historical dynamic features and historical values of the object corresponding to historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
a regularity extraction unit configured to use a first neural network model to extract, based on the historical data sequence, a regularity data sequence of the object corresponding to a future time series;
a prediction feature generation unit configured to generate a predicted feature data sequence based on the regularity data sequence, a future dynamic feature sequence of the object corresponding to the future time series, and future static features of the object, wherein the future dynamic feature sequence includes future dynamic features of the object corresponding to future times of the future time series, the future dynamic features being associated with the corresponding future times; and
a prediction unit configured to use a second neural network model to predict, based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, the future data in the future data sequence including predicted future values of the object corresponding to the future times of the future time series.
According to yet another aspect of the present application, a computer-readable storage medium is provided, on which a computer program is stored, the computer program comprising executable instructions that, when executed by at least one processor, implement the method described above.
According to still another aspect of the present application, an electronic device is provided, comprising a processor and a memory for storing executable instructions of the processor, wherein the processor is configured to execute the executable instructions to implement the method described above.
The time series prediction method and apparatus according to the embodiments of the present application can meet the requirements of efficient computation, accurately capture the nonlinear influence of trend factors, seasonal factors, external factors and the like on the predicted object, and make both short-range and long-range predictions in time.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Description of the Drawings
The above and other features and advantages of the present application will become more apparent from the detailed description of exemplary embodiments thereof with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a seq2seq neural network model architecture for time series prediction according to an embodiment of the present application;
FIG. 2 is an exemplary flowchart of a method for time series prediction according to an embodiment of the present application;
FIG. 3 is a schematic block diagram of an apparatus for time series prediction according to an embodiment of the present application; and
FIG. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments will now be described more fully with reference to the accompanying drawings. The exemplary embodiments may, however, be embodied in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this application will be thorough and complete and will fully convey the concept of the exemplary embodiments to those skilled in the art. In the drawings, the sizes of some elements may be exaggerated or distorted for clarity. The same reference numerals in the drawings denote the same or similar structures, and their detailed descriptions will therefore be omitted.
Furthermore, the described features, structures or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of the embodiments of the present application. Those skilled in the art will appreciate, however, that the technical solutions of the present application may be practiced without one or more of these specific details, or with other methods, elements and the like. In other instances, well-known structures, methods or operations are not shown or described in detail to avoid obscuring aspects of the present application.
Those skilled in the art will understand that, although the method and apparatus for predicting future data of an object based on its historical data are introduced below with a specific neural network model structure according to the embodiments, the solution of the present application is not limited to this exemplary neural network model; it can be extended to other neural network structures capable of realizing the time series prediction concept of the present application, and also to other deep-learning-based prediction model structures. In the exemplary embodiments herein, the time series prediction method is introduced with the products sold by the catering industry at a sales location as the object, but the method of the present application can be applied to any object and scenario that requires time series prediction. Hereinafter, unless otherwise specified, a neural network generally refers to an artificial neural network (ANN). The neural network may adopt a common convolutional neural network (CNN) and, where appropriate, a fully convolutional neural network. Other specific types and structures of neural networks that are not relevant to the time series prediction method of the present application are not described in detail herein to avoid confusion.
Among mainstream time series prediction algorithms, both the Autoregressive Integrated Moving Average (ARIMA) model prediction algorithm based on traditional statistics and the Prophet time series model prediction algorithm can be used to predict time-related regularities such as trend and seasonality. They first decompose the historical data corresponding to the historical time series into a linear superposition of trend factors, seasonal factors and external influencing factors, separately predict the influence of each of these factors on the data corresponding to future times, and finally superimpose the predictions under the three factors to obtain the final prediction result.
The main problems of traditional statistical time series prediction algorithms, however, are the following: linear algorithms struggle to capture the regularities present in a time series; different time series are predicted independently, so the relationships between them are ignored, which makes each individual prediction less accurate, and their simple linear superposition cannot accurately reflect the real trend of the underlying process; and when a time series is short and the corresponding amount of historical data is small, a linear algorithm can neither capture long-term regularities nor borrow regularities from other series.
Deep-learning-based prediction algorithms usually adopt an LSTM (Long Short-Term Memory) network structure. The historical observations of the variable corresponding to the time series, the historical influencing factors and the future influencing factors of the variable are used as the input of the neural network model, and the predicted future values of the variable are used as its output.
An LSTM network is a recurrent neural network specifically designed to solve the long-term dependency problem of general RNNs (recurrent neural networks); it is suited to processing and predicting important events separated by very long intervals and delays in a time series. LSTM networks outperform plain recurrent neural networks and hidden Markov models (HMMs). The important structures in an LSTM network are its gates: the forget gate decides which information is not carried forward in the block, the input gate decides which input is accepted into the block, and the output gate decides whether the information in the block's memory is passed on. LSTM networks are usually trained using gradient descent.
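By way of illustration, a minimal sketch in Python with the PyTorch library of running such an LSTM over a batch of sequences might read as follows; all tensor shapes are assumptions for the example, and the gates described above are internal to the library's LSTM implementation:

```python
import torch
import torch.nn as nn

# A single-layer LSTM over a toy batch of weekly series (shapes assumed).
# The forget/input/output gates described in the text live inside nn.LSTM.
lstm = nn.LSTM(input_size=24, hidden_size=32, batch_first=True)
x = torch.rand(8, 104, 24)      # (batch, time steps, features per step)
out, (h_n, c_n) = lstm(x)       # out: the hidden state at every time step
print(out.shape)                # torch.Size([8, 104, 32])
```

Note that the hidden state at each time step depends on all earlier steps, which is what makes the computation inherently sequential.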
Time series prediction algorithms based on LSTM network models have several defects. First, gradients may vanish or explode as information propagates through the LSTM network model, so prediction results may be distorted in long-range time series prediction. The gates in an LSTM network can alleviate this to some extent but cannot eliminate it. Second, in an LSTM network structure, information is passed node by node through each layer, from front to back and from bottom to top, which makes it difficult to process the many pieces of information in a time series in parallel, so the training process of the network model is slow and inefficient. Third, in order to feed the time-invariant part of the influencing factors contained in the time series into the LSTM network structure, this information has to be replicated at every node, causing redundancy in data and computation and further reducing the processing speed of the network model. Therefore, although the LSTM network model can overcome the poor long-term prediction performance of the ARIMA/Prophet algorithms, it still cannot satisfy many requirements of time series prediction.
The time series prediction process of the present application is described below with reference to the seq2seq neural network model architecture of FIG. 1 and the method flow for time series prediction of FIG. 2 according to embodiments of the present application.
The basic structure of the neural network model architecture 100 in FIG. 1 is a seq2seq (sequence-to-sequence) network model. A seq2seq neural network model can be regarded as a transformation model; its basic idea is that, of two neural network models connected in series, the former serves as an encoder network and the latter as a decoder network. The encoder network converts a data sequence into a vector or a sequence of vectors, and the decoder network generates another data sequence from that vector or vector sequence. One usage scenario of the seq2seq network model is speech recognition, in which the encoder network converts or segments an English sentence into semantic data or a semantic sequence (in English or Chinese), and the decoder network converts that semantic data or semantic sequence into the Chinese sentence corresponding to the English sentence. The seq2seq network model can be optimized by maximum likelihood estimation, maximizing the probability of the data sequence generated by the decoder to obtain the optimal transformation.
According to an embodiment of the present application, the seq2seq neural network model architecture 100 includes a first neural network model 110 serving as the encoder and a second neural network model 120 serving as the decoder.
The first neural network model 110 is used to extract information from the historical data, in particular regular data reflecting the regularities in the historical data. According to one embodiment, the first neural network model 110 is a WaveNet neural network. As a sequence generation model, a WaveNet network is designed to predict the value of the n-th data point of a data sequence from its first n-1 data points. WaveNet is particularly suited to high-throughput input of one-dimensional data sequences whose elements are multi-dimensional vectors, and such a one-dimensional network allows fast computation. The standard WaveNet network model is a convolutional neural network in which each convolutional layer convolves the output of the previous layer. The larger the convolution kernels of the network and the more layers it has, the stronger its perception in the time domain and the larger its receptive field. During generation with a WaveNet network, each newly generated node is appended after the last node of the input layer and generation is iterated. The activation function of the WaveNet network may, for example, use gate units. Recursive (residual) and skip connections are used between the hidden layers lying between the input layer and the output layer of the network; that is, each node of each convolutional layer in the hidden layers adds its original value to the output value of the activation function and passes the sum to the next convolutional layer. The number of channels can be reduced through a 1x1 convolution kernel. Finally, the outputs of the activation functions of all hidden layers are summed and emitted through the output layer.
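For illustration, the block structure just described (a dilated causal convolution with a gated activation, a residual connection to the next layer, a skip output and 1x1 convolutions) can be sketched in Python with PyTorch as follows; all channel counts are assumptions for the sketch, not parameters taken from the application:

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """One WaveNet-style block as described above: dilated causal
    convolution, gated activation, residual connection to the next layer,
    and a skip output accumulated toward the output layer."""

    def __init__(self, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        # Left padding of (kernel_size - 1) * dilation keeps the convolution causal.
        self.pad = (kernel_size - 1) * dilation
        self.filter_conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.gate_conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.residual_conv = nn.Conv1d(channels, channels, kernel_size=1)  # 1x1 convolution
        self.skip_conv = nn.Conv1d(channels, channels, kernel_size=1)      # 1x1 convolution

    def forward(self, x: torch.Tensor):
        # x: (batch, channels, time)
        h = nn.functional.pad(x, (self.pad, 0))
        gated = torch.tanh(self.filter_conv(h)) * torch.sigmoid(self.gate_conv(h))
        residual = self.residual_conv(gated) + x   # passed to the next convolutional layer
        skip = self.skip_conv(gated)               # summed over all blocks at the output layer
        return residual, skip
```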
As shown in FIG. 1, the first neural network model 110 has an input layer (i.e., a first convolutional layer) 112, hidden layers 113 and an output layer 114. The number of hidden layers 113 may be zero, one or more. Each of the input layer 112, the hidden layers 113 and the output layer 114 has a plurality of nodes 111. The number of nodes of the input layer 112 should correspond at least to the data length of the historical data sequence, to ensure that the neural network can receive the information at every historical time.
When the first neural network model 110 adopting an ordinary WaveNet network is a one-dimensional causal convolutional network, the first n-1 data points of the input data sequence are needed to predict the n-th data point, so the number of usable nodes decreases by one with each convolutional layer. If the historical data is long, many layers must be added to the first neural network model 110 to allow n passes, or very large filters are required; the gradients selected during gradient descent then become too small, making the network complicated to train and the fit poor.
According to the embodiments of the present application, the concept of a dilated convolutional neural network (Dilated CNN) can be introduced. A dilated convolutional neural network is a convolutional network with "holes". According to an embodiment of the present application, the first convolutional layer (i.e., the input layer) of the dilated convolutional neural network may be a one-dimensional causal convolutional network with a dilation factor of 1. Starting from the second convolutional layer, the dilation factor of each convolutional layer is the dilation factor of the previous layer multiplied by a dilation index, where the dilation index is a positive integer not less than 2 and not greater than the convolution kernel size. This dilated convolution configuration can be adopted in both the hidden layers and the output layer of the first neural network model 110. For example, when the dilation index is 2, the second convolutional layer only uses nodes n, n-2, n-4, ... for convolution, the third convolutional layer only uses nodes n, n-4, n-8, ..., and so on.
The dilated network structure can significantly speed up the propagation of information through the neural network, avoid vanishing or exploding gradients, and improve the processing speed and prediction accuracy of the first neural network model 110. For example, when the convolution kernel size is 2 and the dilation index is 2, the number of convolutional layers through which information passes from the node corresponding to the first historical time to the node corresponding to the last historical time is log₂N, where N is the data length of the historical data sequence.
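Continuing the example above (kernel size 2, dilation index 2), the dilation factors of the successive layers and the resulting log₂N layer count can be sketched as follows; the function name is hypothetical and a kernel size of 2 is assumed:

```python
import math

def dilation_schedule(history_length: int, dilation_index: int = 2) -> list:
    """Dilation factor of each convolutional layer, following the rule in
    the text: the first layer has dilation 1 and each later layer
    multiplies the previous dilation by the dilation index."""
    num_layers = math.ceil(math.log(history_length, dilation_index))
    return [dilation_index ** i for i in range(num_layers)]

print(dilation_schedule(128))   # [1, 2, 4, 8, 16, 32, 64] -> 7 = log2(128) layers
```

With kernel size 2, the receptive field of such a stack is 1 plus the sum of the dilation factors, i.e., 128 historical times in this example, consistent with the log₂N layer count stated above.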
According to an embodiment of the present application, the second neural network model 120 serving as the decoder may be a multilayer perceptron (MLP) network. An MLP network likewise comprises an input layer, hidden layers and an output layer, where each neuron node has an activation function (for example a sigmoid function), and the network is trained using a loss function. The MLP network predicts the future values of the object based on the historical regularities extracted by the encoder network and the future influencing factors (including dynamic and static factors).
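A minimal sketch of such a decoder in Python with PyTorch might look as follows; the layer widths and the choice of a single hidden layer are assumptions for the example:

```python
import torch
import torch.nn as nn

class MLPDecoder(nn.Module):
    """Maps the per-future-time feature vector to the predicted future
    value; applied independently at each of the m future times."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Sigmoid(),                    # sigmoid activation, as mentioned above
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, m, in_dim) -> predictions: (batch, m, out_dim)
        return self.net(features)
```

Because the linear layers act only on the last (feature) dimension, all m future times are processed in parallel rather than step by step.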
Although a WaveNet network is used above as the example encoder network of the seq2seq neural network model architecture and an MLP network as the example decoder network, the time series method of the present application may employ any other neural network structure capable of feature extraction and prediction on sequence data, such as, but not limited to, the various types of recurrent neural networks (RNNs) capable of realizing the time series prediction function of the present application. For example, when the seq2seq neural network model architecture is used, the encoder network may be an LSTM network and the decoder network an MLP network; although the LSTM network has shortcomings, combining it with an MLP network and adjusting the input and output data sequences of the network can still, to a certain extent, achieve better results than existing schemes. A WaveNet network may also be chosen as the encoder network with an LSTM network or another RNN as the decoder network, and so on.
In the flowchart of FIG. 2, the method 200 for time series prediction is used to predict, based on the historical data sequence 101 corresponding to a historical time series T1 = {t_1, t_2, t_3, ..., t_n} containing n historical times, the future data sequence 104 corresponding to a future time series T2 = {t_{n+1}, t_{n+2}, t_{n+3}, ..., t_{n+m}} containing m future times. The unit of the historical and/or future times can be chosen as hours, days, weeks, months, quarters, years, etc., as required. For example, for predicting the numbers of passengers boarding and alighting at a bus stop, hours or even minutes or quarter-hours may be used as the time unit or interval. For fast-food restaurants, time units such as days, weeks and months may be used. Given the customer flow of fast-food restaurants, measuring and predicting the sales volume of food products on a weekly basis reflects the historical regularities and future trends of the industry better than other units, and weeks are therefore used as the example below. According to the embodiments of the present application, a historical and/or future time may be a point in time (e.g., the time t_1, the end of the first quarter, 10 a.m., etc.) or a continuous time period (e.g., the t_2-th period, the second week, the third month, October of the current year, etc.). When the historical and future times are points in time, the time intervals between the individual historical and/or future times may be identical, so that this constant interval serves as the period for extracting and predicting the periodic information of the historical and future data. When the historical and future times are time periods, the lengths of the periods may likewise be identical, serving as the period for extracting and predicting the periodic information of the historical and future data.
The method 200 first acquires, in step S210, the historical data sequence 101 of the object corresponding to the historical time series T1.
The historical data of the historical data sequence 101 include the historical dynamic features x_i and the historical values y_i corresponding to the historical times t_i in the historical time series T1, where i = 1, 2, ..., n. The historical value y_i is the measured value of the object at the historical time t_i, for example the actual sales volume of a product. The historical values arise from factors internal to the object and may therefore also be called internal factors or internal feature data of the object. The historical dynamic feature x_i is the dynamic feature in the historical data that influences the historical value y_i of the object, for example one or more of whether the time is a holiday, the number of working days, the number of days or weeks until a holiday, etc. Historical dynamic features are associated with time and include, for example, periodic factors that influence the object cyclically with a certain period (also called periodic historical dynamic features) and aperiodic factors that influence the object non-periodically (also called aperiodic historical dynamic features). The period of the periodic factors may be determined by the length of the identical time intervals between the historical time points in the historical time series, or by the length of the historical times when these are time periods of identical length. The way aperiodic factors influence the object is tied to specific historical times; in other words, it is random or event-triggered. At each historical time t_i in the historical time series T1, the corresponding aperiodic factors may differ. The number n of historical times represents the number, or length, of the historical data.
When the object comprises several parts or sub-objects (for example, when the product is a set of several products), the historical value y_i can be a multi-dimensional variable or vector. Likewise, when the historical dynamic feature x_i influencing the historical value y_i of the object covers several factors, it is regarded as a combination of several historical sub-dynamic features, and x_i can also be a multi-dimensional variable or vector. The historical dynamic feature x_i and the historical value y_i can form a two-dimensional vector (x_i, y_i)^T (which may also be called a two-element data group; hereinafter uniformly referred to as a two-dimensional vector), where each sub-vector x_i and y_i is itself a multi-dimensional vector as described above. The historical data sequence 101 can therefore be represented as a one-dimensional sequence of two-dimensional vectors {(x_1, y_1)^T, (x_2, y_2)^T, ..., (x_n, y_n)^T}.
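For illustration, assuming the historical dynamic features and historical values are held in NumPy arrays (array names and sizes are hypothetical), this one-dimensional sequence of two-dimensional vectors can be assembled by concatenation along the feature axis:

```python
import numpy as np

n, dx, dy = 104, 23, 1           # assumed sizes: 104 weeks, 23 dynamic features, 1 value
x_hist = np.random.rand(n, dx)   # historical dynamic features x_1..x_n (placeholder data)
y_hist = np.random.rand(n, dy)   # historical values y_1..y_n (placeholder data)

# Each element of the historical data sequence is the pair (x_i, y_i);
# stacking the pairs gives an encoder input of shape (n, dx + dy).
hist_seq = np.concatenate([x_hist, y_hist], axis=1)
print(hist_seq.shape)            # (104, 24)
```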
According to the embodiments of the present application, historical static features unrelated to the historical times are not added to the historical data 101. By not processing historical static data, the first neural network model 110 serving as the encoder reduces redundancy in data and computation and increases the operating speed of the network model.
In step S220, the first neural network model 110 performs the regularity extraction function of extracting, based on the historical data sequence 101, the regular data sequence 102 of the object corresponding to the future time series T2. The historical data sequence 101 serves as the input of the first neural network model 110 and, after propagation and computation through the convolutional layers of the WaveNet network, the extracted regular data sequence 102 is output. The dilated convolutional neural network described above can accelerate the extraction of the regular information of the regular data sequence 102 and improve the extraction accuracy.
Since the historical dynamic features x_i in the input historical data sequence 101 include both periodic and aperiodic dynamic features, the extracted regular information represented in the regular data sequence 102 output by the first neural network model 110 includes periodic regular features c_a derived from the periodic historical dynamic features and aperiodic regular features c_{n+j} derived from the aperiodic historical dynamic features, where j = 1, 2, ..., m. The periodic regular feature c_a corresponds to the periodic historical dynamic features and influences the future values y_{n+j}, j = 1, 2, ..., m, of the object cyclically with a certain period; it includes, for example, the periodic regularities by which seasons, days of the week and/or months influence the historical values of the object. Since the time intervals between the future times of the future time series T2 (when the future times are points in time) and/or the lengths of the future times (when they are time periods) are set equal to those of the historical times in the historical time series T1, the periodic regular feature c_a is the same for every future time t_{n+j}. Corresponding to the aperiodic historical feature data, the aperiodic regular feature c_{n+j} influences the future value y_{n+j} of the object depending on the specific future time, so c_{n+j} may differ for each corresponding future time t_{n+j} in the future time series T2. Here m is the number of future times in the future time series T2 and represents the number, or length, of the future data to be predicted.
Since the periodic historical dynamic features included in the historical dynamic features x_i may be a combination of several sub-periodic factors, the periodic regular feature c_a likewise includes several sub-periodic regular features and can be represented as a multi-dimensional vector. According to the embodiments of the present application, the dimension of c_a may equal the number of sub-periodic historical dynamic features among the periodic historical dynamic features, or be smaller to reduce the computational load. Similarly, the aperiodic historical dynamic features may have several sub-aperiodic dynamic features, so the aperiodic regular feature c_{n+j} also includes several sub-aperiodic regular features and can be represented as a multi-dimensional vector. The dimension of c_{n+j} may likewise equal the number of sub-aperiodic historical dynamic features, or be smaller to reduce the computational load. The regular data sequence 102 can thus be represented as a one-dimensional sequence of two-dimensional vectors whose elements consist of the two multi-dimensional sub-vectors c_a and c_{n+j}: {(c_a, c_{n+1})^T, (c_a, c_{n+2})^T, ..., (c_a, c_{n+m})^T}.
When predicting the future values y_{n+j} in the future data sequence 104 of the object, other factors influencing the object at the future times may also need to be considered.
Similar to the historical dynamic features x_i, x_{n+j} in FIG. 1 is the future dynamic feature influencing the future value y_{n+j} of the object, corresponding to the future time t_{n+j} in the future time series T2. The future dynamic feature x_{n+j} may, for example, be one or more of a promotion at some future time, whether the time is a holiday, the number of working days, the number of days or weeks until a holiday, etc. Future dynamic features are likewise associated with time and may be multi-dimensional vectors comprising several sub-future-dynamic features. The future dynamic features x_{n+j} form a one-dimensional sequence of multi-dimensional vectors {x_{n+1}, x_{n+2}, ..., x_{n+m}}.
The other factors may also include future static features x_s that influence the future values y_{n+j} of the object but are unrelated to time. The future static features x_s may include attributes of the object (which usually relate only to the object itself and not to the future times) and other time-independent features. For example, when the object is a product, the future static features x_s may be the category of the product, the temperature of the product, the sales location of the product (for example represented by the location of its distribution center), etc.; these features are associated only with the object and do not vary with time. Depending on the number of time-independent influencing factors, the future static feature x_s can be a multi-dimensional vector combining several sub-features.
According to the embodiments of the present application, the future static features x_s can be processed further. The future static features x_s fall into different types, with different correlations between the types. An embedding operation can transform sparse discrete variables into continuous variables. The future static features are embedded by type; for example, they are divided into two groups of future static features, x_s1 and x_s2, according to location-related features and product-attribute-related features, so that different groups of future static features are uncorrelated, i.e., remain orthogonal. This avoids treating every individual static influencing factor as a variable or as one dimension of a vector, thereby reducing the overall dimension of the future static features x_s and the computational load of the model. In FIG. 1, the future static feature group x_s1 includes the multi-dimensional future static feature e_1, which can influence the object at every future time t_{n+j}, while the future static feature group x_s2 includes the multi-dimensional future static feature e_2, which can likewise influence the object at every future time t_{n+j}. Depending on the specific situation, the number of future static features or future static feature groups may be zero, one or more. The number of specific features included in each future static feature determines its dimension, which may be one or more.
The future static features x_s form a one-dimensional sequence of length m, {x_s, x_s, ..., x_s}, whose elements are zero-, one- or multi-dimensional vectors. Taking the embodiment in FIG. 1 as an example, this one-dimensional sequence can be represented as {(e_1, e_2)^T, (e_1, e_2)^T, ..., (e_1, e_2)^T}.
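A sketch of the grouped embedding of static features in Python with PyTorch follows; the group names, vocabulary sizes and embedding widths are assumptions for the example:

```python
import torch
import torch.nn as nn

class StaticFeatureEmbedding(nn.Module):
    """Embeds each group of static features separately, so that the
    location-related and product-related information stay in independent
    (orthogonal) low-dimensional sub-vectors e1 and e2."""

    def __init__(self, n_locations: int, n_categories: int, dim: int = 4):
        super().__init__()
        self.location_emb = nn.Embedding(n_locations, dim)   # group x_s1
        self.category_emb = nn.Embedding(n_categories, dim)  # group x_s2

    def forward(self, location_id: torch.Tensor, category_id: torch.Tensor, m: int):
        e1 = self.location_emb(location_id)   # (batch, dim)
        e2 = self.category_emb(category_id)   # (batch, dim)
        e = torch.cat([e1, e2], dim=-1)       # (batch, 2 * dim)
        # The static vector does not vary over time, so it is repeated once
        # per future time only at the decoder input, not inside the encoder.
        return e.unsqueeze(1).expand(-1, m, -1)
```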
The four parts of the influencing factors that can influence the future data y_{n+j} of the object are described above. In step S230 of the method 200, the predicted feature data sequence 103 can be generated based on the regular data sequence 102, the future dynamic feature sequence of the object corresponding to the future time series, and the future static features of the object. This generation can be accomplished by splicing the one-dimensional sequences of the four influencing factors to form the one-dimensional predicted feature data sequence 103 whose elements are multi-part (e.g., four-part) vectors (which may also be called four-element data groups). As shown in FIG. 1, this one-dimensional sequence 103 can be represented as {(c_a, c_{n+1}, e_1, e_2, x_{n+1})^T, (c_a, c_{n+2}, e_1, e_2, x_{n+2})^T, ..., (c_a, c_{n+m}, e_1, e_2, x_{n+m})^T}.
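Continuing the sketches above (all tensor shapes are assumptions), the splicing in step S230 amounts to a concatenation along the feature axis:

```python
import torch

batch, m = 8, 4                                   # assumed: 8 series, 4 future weeks
c_a = torch.rand(batch, 1, 6).expand(-1, m, -1)   # periodic regular features, shared by all future times
c_n = torch.rand(batch, m, 6)                     # aperiodic regular features, one per future time
e   = torch.rand(batch, m, 8)                     # embedded static features e1, e2 (repeated over time)
x_f = torch.rand(batch, m, 23)                    # future dynamic features x_{n+1}..x_{n+m}

# Element j of the predicted feature sequence is (c_a, c_{n+j}, e1, e2, x_{n+j}).
pred_features = torch.cat([c_a, c_n, e, x_f], dim=-1)
print(pred_features.shape)                        # torch.Size([8, 4, 43])
```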
In the following step S240, the predicted feature data sequence 103 is input into the second neural network model 120 and, through the propagation and computation of a decoder such as a multilayer perceptron (MLP) network, the prediction function of predicting the future data sequence 104 corresponding to the future time series T2, i.e., {y_{n+1}, y_{n+2}, ..., y_{n+m}}, is accomplished. Each predicted future value y_{n+j} of the object in the future data sequence 104 is a multi-dimensional vector with the same dimension as the historical values y_i.
The method 200 according to the embodiments of the present application optionally further includes a step S250 of training the neural network models on a training data set to determine the optimal model parameters before using at least one of the first and second neural network models 110 and 120 serving as the encoder and decoder networks, respectively. The parameters of a neural network model may remain unchanged during use after training is completed, may be updated or adjusted based on new data sets after a period of use or at a predetermined interval, or may be updated in real time in an online supervised manner.
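As a sketch of the training in step S250 (the loss function, optimizer, stand-in model and placeholder data are all assumptions; the application does not fix these choices):

```python
import torch
import torch.nn as nn

# A stand-in for the full encoder-decoder model sketched above (hypothetical).
model = nn.Sequential(nn.Linear(43, 32), nn.Sigmoid(), nn.Linear(32, 1))

# Placeholder training pairs: predicted feature vectors and observed future values.
features = torch.rand(256, 43)
targets = torch.rand(256, 1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(10):                       # the number of epochs is an assumption
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()                           # gradient descent, as noted in the text
    optimizer.step()
```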
FIG. 3 shows an exemplary structure of an apparatus 300 for time series prediction according to an embodiment of the present application. The apparatus 300 includes a historical data acquisition unit 310, a regularity extraction unit 320, a predicted feature generation unit 330 and a prediction unit 340.
The historical data acquisition unit 310 is used to acquire the historical data sequence 101 of the object corresponding to the historical time series T1. The historical data in the historical data sequence 101 include the time-associated historical dynamic features x_i corresponding to the historical times t_i in the historical time series T1, as well as the historical values y_i of the object.
The regularity extraction unit 320 includes the first neural network model 110, for example serving as the encoder network of the seq2seq neural network model, to extract the regularities of the historical data. This unit uses the neural network model to extract, from the historical data sequence 101 provided by the historical data acquisition unit 310, the regular data sequence 102 of the object corresponding to the future time series T2. The regular data sequence 102 includes the periodic regular features c_a of the object corresponding to the future times t_{n+j} of the future time series T2 and the aperiodic regular features c_{n+j} associated with the corresponding future times. In the seq2seq neural network model structure, the encoder network may adopt a sequence data network model such as a WaveNet network, and may further adopt a structure such as a dilated convolutional network to speed up information propagation and computation.
The predicted feature generation unit 330 is used to combine the regular data sequence 102 output by the regularity extraction unit 320, the future dynamic feature sequence composed of the future dynamic features x_{n+j} corresponding to the future times t_{n+j} in the future time series T2, and the future static features x_s, to generate the predicted feature data sequence 103. The future dynamic features x_{n+j} are associated with the future times t_{n+j}. In generating the predicted feature data sequence 103, the predicted feature generation unit 330 may further group the static features x_s so that the groups of static features are orthogonal to one another, thereby reducing the vector dimension of each data element of the predicted feature data sequence.
The prediction unit 340 includes the second neural network model 120, for example serving as the decoder network of the seq2seq neural network model, to predict the future values of the object. The unit 340 uses the second neural network model 120 to predict, from the predicted feature data sequence 103 provided by the predicted feature generation unit 330, the future data sequence 104 of the object corresponding to the future time series T2. The second neural network model 120 may use a neural network such as a multilayer perceptron (MLP) network.
The apparatus 300 optionally further includes a model training unit 350 for training the respective neural network models used in the extraction unit 320 and the prediction unit 340 before their use, to determine the optimal model parameters, and for supervising or updating the parameters of the models.
Details of the functions performed by the individual units that are the same as or similar to those in the method 200 for time series prediction described above are not repeated.
For the time series prediction method and apparatus according to the embodiments of the present application, the following experiment was carried out to compare performance with existing time series prediction schemes.
The experiment was conducted in a product prediction scenario in the catering industry; the test task required predicting the sales volume of each product (object) at each distribution center over the next 1-4 weeks. The test data set covered about 20 distribution centers, each including on average about 200 products. The historical product sales data ranged from 1 week at the shortest to 128 weeks at the longest. The test task involved considering 23 dynamic influencing factors (e.g., whether the time is a holiday, the number of working days, the number of weeks until the Spring Festival, etc.) and 7 static influencing factors (e.g., the product category, the temperature, the location of the distribution center, etc.) in the prediction.
Table 1 shows the training time, prediction time and prediction error of the models using the different time series prediction methods. The deep learning methods using the seq2seq neural network model require a large number of floating-point operations and used one additional graphics processing unit (GPU) to accelerate computation compared with the traditional statistical algorithm Prophet.
Table 1 (training time, prediction time and prediction error of the compared models; reproduced only as an image, PCTCN2021118272-appb-000001, in the original publication)
The results show that the prediction accuracy (error) of the scheme according to the embodiments of the present application using the WaveNet-MLP seq2seq architecture (WaveNet network as the encoder, MLP network as the decoder) is better than that of the traditional statistical algorithm, and also better than that of the scheme using the seq2seq neural network model structure with LSTM networks as both the encoder and decoder networks. In terms of prediction time, the schemes using neural network models are faster than the traditional statistical algorithm; and among the neural network schemes, the training time of the WaveNet-MLP seq2seq neural network model structure of the present application is significantly lower.
The advantages of the time series prediction method and apparatus according to the embodiments of the present application therefore lie in the following aspects: using two neural network models, such as a WaveNet network and an MLP network, as the encoder and decoder networks respectively allows the computations on the historical data sequence and the future data sequence at the different historical times of the historical time series and the different future times of the future time series to run in parallel, increasing the speed of model training and use; using a neural network model such as a WaveNet network as the encoder, in particular with a dilated convolutional structure, shortens the propagation path of the information in the object's historical data sequence from the first historical time to the last, avoiding vanishing and exploding gradients during training and thereby enabling long-range time series prediction; the time-invariant influencing factors are introduced only at the input of the second neural network model serving as the decoder, avoiding their replication and computation at every time point of the encoder network and thus reducing redundancy in data and computation; and the time-invariant influencing factors such as static features are embedded in groups, reducing the input data dimension while preserving the orthogonality between influencing factors.
It should be noted that, although several modules or units of the apparatus for time series prediction are mentioned in the detailed description above, this division is not mandatory. Indeed, according to the embodiments of the present application, the features and functions of two or more modules or units described above may be embodied in a single module or unit; conversely, the features and functions of one module or unit described above may be further divided among several modules or units. Components shown as modules or units may or may not be physical units; that is, they may be located in one place or distributed over several network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present application. Those of ordinary skill in the art can understand and implement this without creative effort.
In an exemplary embodiment of the present application, a computer-readable storage medium is also provided, on which a computer program is stored; the program includes executable instructions which, when executed for example by a processor, can implement the steps of the method for time series prediction described in any one of the embodiments above. In some possible implementations, various aspects of the present application may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the method for time series prediction in this specification.
The program product for implementing the above method according to the embodiments of the present application may adopt a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present application is not limited thereto; in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus or device.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by or in conjunction with an instruction execution system, apparatus or device. The program code contained on a readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
The program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example through the Internet using an Internet service provider).
In an exemplary embodiment of the present application, an electronic device is also provided, which may include a processor and a memory for storing executable instructions of the processor, wherein the processor is configured to perform the steps of the method for time series prediction in any one of the embodiments above by executing the executable instructions.
Those skilled in the art will understand that various aspects of the present application may be implemented as a system, a method or a program product. Accordingly, various aspects of the present application may take the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may collectively be referred to herein as a "circuit", "module" or "system".
下面参照图4来描述根据本申请的这种实施方式的电子设备400。图4显示的电子设备400仅仅是一个示例,不应对本申请的实施例的功能和使用范围带来任何限制。The electronic device 400 according to this embodiment of the present application is described below with reference to FIG. 4 . The electronic device 400 shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
如图4所示,电子设备400以通用计算设备的形式表现。电子设备400的组件可以包括但不限于:至少一个处理单元410、至少一个存储单元420、连接不同系统组件(包括存储单元420和处理单元410)的总线430、显示单元440等。As shown in FIG. 4, electronic device 400 takes the form of a general-purpose computing device. Components of the electronic device 400 may include, but are not limited to, at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410), a display unit 440, and the like.
其中,所述存储单元存储有程序代码,所述程序代码可以被所述处理单元410执行,使得所述处理单元410执行本说明书用于自动时间序列预 测的方法中描述的根据本申请各种示例性实施方式的步骤。例如,所述处理单元410可以执行如图2中所示的步骤。Wherein, the storage unit stores program codes, and the program codes can be executed by the processing unit 410, so that the processing unit 410 executes various examples according to the present application described in the method for automatic time series prediction in this specification steps of sexual implementation. For example, the processing unit 410 may perform the steps shown in FIG. 2 .
The storage unit 420 may include a readable medium in the form of a volatile storage unit, such as a random access memory (RAM) unit 4201 and/or a cache storage unit 4202, and may further include a read-only memory (ROM) unit 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205, such program modules 4205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 400 may also communicate with one or more external devices 500 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 450. Furthermore, the electronic device 400 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460. The network adapter 460 may communicate with other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a network device, etc.) to execute the method for time series prediction according to the embodiments of the present application.
Other embodiments of the present application will readily occur to those skilled in the art upon consideration of the specification and practice of what is disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the application being indicated by the appended claims.
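For orientation, the arrangement recited in the claims below can be illustrated with a minimal sketch: a first neural network model (per claims 3 to 6, a WaveNet-style stack of dilated one-dimensional convolutions acting as a seq2seq encoder) extracts a regularity data sequence from the historical data; that sequence is concatenated, per future time, with future dynamic features and an embedding of a static feature (claims 9 and 11); and a second neural network model (per claim 7, an MLP decoder) maps the concatenated features to predicted future values. The sketch assumes PyTorch; the layer sizes, forecast horizon, feature choices (holiday indicators, sales location), and the convention of taking the last portion of the encoder output as the regularity sequence for the future times are hypothetical choices of this illustration, not specifics of the application.

import torch
import torch.nn as nn

class WaveNetEncoder(nn.Module):
    # Stack of one-dimensional convolutions whose dilation coefficient starts at 1
    # and is multiplied by a fixed dilation exponent at each subsequent layer.
    def __init__(self, in_ch, hid_ch, n_layers=4, dilation_exp=2):
        super().__init__()
        convs, dilation = [], 1
        for i in range(n_layers):
            convs.append(nn.Conv1d(in_ch if i == 0 else hid_ch, hid_ch,
                                   kernel_size=2, dilation=dilation,
                                   padding=dilation))
            dilation *= dilation_exp  # claim-6 rule: previous dilation times the exponent
        self.convs = nn.ModuleList(convs)

    def forward(self, x):  # x: (batch, in_ch, T_hist)
        for conv in self.convs:
            T = x.size(-1)
            x = torch.relu(conv(x))[..., :T]  # trim right padding, keeping the convolution causal
        return x  # (batch, hid_ch, T_hist)

class Seq2SeqForecaster(nn.Module):
    def __init__(self, hist_ch, hid_ch, dyn_ch, n_locations, emb_dim, horizon):
        super().__init__()
        self.horizon = horizon
        self.encoder = WaveNetEncoder(hist_ch, hid_ch)        # first neural network model
        self.static_emb = nn.Embedding(n_locations, emb_dim)  # embedding of a categorical static feature
        self.decoder = nn.Sequential(                         # second neural network model (MLP)
            nn.Linear(hid_ch + dyn_ch + emb_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, hist, future_dyn, location_id):
        # hist: (batch, hist_ch, T_hist) historical values plus historical dynamic features
        # future_dyn: (batch, horizon, dyn_ch), e.g. per-day holiday indicators
        # location_id: (batch,) a static feature such as sales location
        reg = self.encoder(hist)[..., -self.horizon:]  # regularity sequence (assumes T_hist >= horizon)
        reg = reg.transpose(1, 2)                      # (batch, horizon, hid_ch)
        stat = self.static_emb(location_id)            # (batch, emb_dim)
        stat = stat.unsqueeze(1).expand(-1, self.horizon, -1)
        feats = torch.cat([reg, future_dyn, stat], dim=-1)  # per-time concatenation, as in claim 9
        return self.decoder(feats).squeeze(-1)         # (batch, horizon) predicted future values

model = Seq2SeqForecaster(hist_ch=3, hid_ch=16, dyn_ch=4,
                          n_locations=10, emb_dim=5, horizon=14)
preds = model(torch.randn(8, 3, 56),
              torch.randn(8, 14, 4),
              torch.randint(0, 10, (8,)))  # preds: (8, 14)

Because each layer multiplies the dilation coefficient by the exponent, the encoder's receptive field grows geometrically with depth, which is what allows a small number of layers to summarize a long sales history.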

Claims (22)

  1. A method for time series prediction, comprising:
    obtaining a historical data sequence of an object corresponding to a historical time series, the historical data in the historical data sequence comprising historical dynamic features and historical values of the object corresponding to historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
    extracting, using a first neural network model and based on the historical data sequence, a regularity data sequence of the object corresponding to a future time series;
    generating a predicted feature data sequence based on the regularity data sequence, a future dynamic feature sequence of the object corresponding to the future time series, and future static features of the object, wherein the future dynamic feature sequence comprises future dynamic features of the object corresponding to future times in the future time series, the future dynamic features being associated with the corresponding future times; and
    predicting, using a second neural network model and based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, the future data in the future data sequence comprising predicted future values of the object corresponding to the future times in the future time series.
  2. The method according to claim 1, wherein the regularity data in the regularity data sequence comprises periodic regularity features and aperiodic regularity features of the object corresponding to future times in the future time series, wherein the aperiodic regularity features are associated with the corresponding future times.
  3. The method according to claim 1, wherein the first neural network model constitutes the encoder in a seq2seq network model, and the second neural network model constitutes the decoder in the seq2seq network model.
  4. The method according to any one of claims 1 to 3, wherein the first neural network model is a WaveNet network.
  5. The method according to claim 4, wherein the WaveNet network is a dilated convolutional neural network.
  6. The method according to claim 5, wherein the WaveNet network comprises at least two convolutional layers, the first of the at least two convolutional layers is a one-dimensional convolutional layer with a dilation coefficient of 1, and each convolutional layer after the first among the at least two convolutional layers has a dilation coefficient equal to the dilation coefficient of the preceding convolutional layer multiplied by a dilation exponent.
  7. The method according to any one of claims 1 to 3, wherein the second neural network model is a multilayer perceptron (MLP) network.
  8. The method according to any one of claims 1 to 3, wherein the periodic regularity features of the object are the same for each future time in the future time series.
  9. The method according to any one of claims 1 to 3, wherein generating the predicted feature data sequence based on the regularity data sequence, the future dynamic feature sequence of the object corresponding to the future time series, and the future static features of the object comprises:
    for the corresponding future times in the future time series, concatenating the periodic regularity features and the aperiodic regularity features in the regularity data sequence, the future dynamic features in the future dynamic feature sequence, and the future static features into the predicted feature data sequence.
  10. The method according to any one of claims 1 to 3, wherein the future static features are the same for each future time in the future time series.
  11. The method according to any one of claims 1 to 3 and 9, further comprising:
    performing embedding grouping on the future static features.
  12. The method according to any one of claims 1 to 3, further comprising: before using at least one of the first neural network model and the second neural network model, training the neural network model to be used.
  13. The method according to any one of claims 1 to 3, wherein the object is a product, the historical values and the future values of the object are respectively historical sales volumes and future sales volumes of the product, and a unit of at least one of the historical times and the future times comprises one of the following: hour, day, month, year, week, quarter.
  14. The method according to claim 13, wherein at least one of the historical dynamic features and the future dynamic features of the object comprises at least one of the following: whether a time is a holiday, a number of working days, and a number of days or weeks until a holiday.
  15. The method according to claim 13, wherein the future static features comprise at least one of the following: a category of the product, a temperature of the product, and a sales location of the product.
  16. An apparatus for time series prediction, comprising:
    a historical data obtaining unit configured to obtain a historical data sequence of an object corresponding to a historical time series, the historical data in the historical data sequence comprising historical dynamic features and historical values of the object corresponding to historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
    a regularity extraction unit configured to extract, using a first neural network model and based on the historical data sequence, a regularity data sequence of the object corresponding to a future time series;
    a predicted feature generation unit configured to generate a predicted feature data sequence based on the regularity data sequence, a future dynamic feature sequence of the object corresponding to the future time series, and future static features of the object, wherein the future dynamic feature sequence comprises future dynamic features of the object corresponding to future times in the future time series, the future dynamic features being associated with the corresponding future times; and
    a prediction unit configured to predict, using a second neural network model and based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, the future data in the future data sequence comprising predicted future values of the object corresponding to the future times in the future time series.
  17. The apparatus according to claim 16, wherein the regularity data in the regularity data sequence comprises periodic regularity features and aperiodic regularity features of the object corresponding to future times in the future time series, wherein the aperiodic regularity features are associated with the corresponding future times.
  18. The apparatus according to claim 16, wherein the first neural network model is an encoder network in a seq2seq network model, and the second neural network model is a decoder network in the seq2seq network model.
  19. The apparatus according to any one of claims 16 to 18, wherein the first neural network model is a WaveNet network.
  20. The apparatus according to any one of claims 16 to 18, wherein the second neural network model is a multilayer perceptron (MLP) network.
  21. A computer-readable storage medium having a computer program stored thereon, the computer program comprising executable instructions which, when executed by at least one processor, implement the method according to any one of claims 1 to 15.
  22. An electronic device, comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to execute the executable instructions to implement the method according to any one of claims 1 to 15.
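As a worked illustration of the dilation rule in claims 5 and 6 above (the first layer has a dilation coefficient of 1, and each subsequent layer's coefficient is the preceding layer's coefficient multiplied by a dilation exponent), the short sketch below computes a hypothetical schedule and the resulting receptive field. The kernel size of 2 and dilation exponent of 2 are assumed values that the claims leave open.

def dilation_schedule(n_layers, exponent=2):
    # first layer has dilation 1; each later layer multiplies the previous value by the exponent
    dilations = [1]
    for _ in range(n_layers - 1):
        dilations.append(dilations[-1] * exponent)
    return dilations

def receptive_field(dilations, kernel_size=2):
    # each dilated layer adds (kernel_size - 1) * dilation time steps of visible history
    return 1 + sum((kernel_size - 1) * d for d in dilations)

print(dilation_schedule(5))                   # [1, 2, 4, 8, 16]
print(receptive_field(dilation_schedule(5)))  # 32 past time steps visible to the top layer

With an exponent of 2 the visible history roughly doubles per layer, so even a five-layer stack covers about a month of daily data; a larger exponent would cover the same span with fewer layers at the cost of sparser sampling of the history.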
PCT/CN2021/118272 2020-09-14 2021-09-14 Method and apparatus for time sequence prediction WO2022053064A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010959817.7 2020-09-14
CN202010959817.7A CN112053004A (en) 2020-09-14 2020-09-14 Method and apparatus for time series prediction

Publications (1)

Publication Number Publication Date
WO2022053064A1 WO2022053064A1 (en)

Family

ID=73610632

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/118272 WO2022053064A1 (en) 2020-09-14 2021-09-14 Method and apparatus for time sequence prediction

Country Status (2)

Country Link
CN (1) CN112053004A (en)
WO (1) WO2022053064A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053004A (en) * 2020-09-14 2020-12-08 胜斗士(上海)科技技术发展有限公司 Method and apparatus for time series prediction
CN112232604B (en) * 2020-12-09 2021-06-11 南京信息工程大学 Prediction method for extracting network traffic based on Prophet model
CN112906941B (en) * 2021-01-21 2022-12-06 哈尔滨工程大学 Prediction method and system for dynamic correlation air quality time series
CN112967518B (en) * 2021-02-01 2022-06-21 浙江工业大学 Seq2Seq prediction method for bus track under bus lane condition
CN112801202B (en) * 2021-02-10 2024-03-05 延锋汽车饰件系统有限公司 Vehicle window fogging prediction method, system, electronic equipment and storage medium
CN113313316A (en) * 2021-06-11 2021-08-27 北京明略昭辉科技有限公司 Method and device for outputting prediction data, storage medium and electronic equipment
CN113837858A (en) * 2021-08-19 2021-12-24 同盾科技有限公司 Method, system, electronic device and storage medium for predicting credit risk of user
CN113850418A (en) * 2021-09-02 2021-12-28 支付宝(杭州)信息技术有限公司 Method and device for detecting abnormal data in time sequence
CN113985408B (en) * 2021-09-13 2024-04-05 南京航空航天大学 Inverse synthetic aperture radar imaging method combining gate unit and transfer learning
CN113837487A (en) * 2021-10-13 2021-12-24 国网湖南省电力有限公司 Power system load prediction method based on combined model
CN114580798B (en) * 2022-05-09 2022-09-16 南京安元科技有限公司 Device point location prediction method and system based on transformer

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850891A (en) * 2015-05-29 2015-08-19 厦门大学 Intelligent optimal recursive neural network method of time series prediction
CN106971348A (en) * 2016-01-14 2017-07-21 阿里巴巴集团控股有限公司 A kind of data predication method and device based on time series
US20200074274A1 (en) * 2018-08-28 2020-03-05 Beijing Jingdong Shangke Information Technology Co., Ltd. System and method for multi-horizon time series forecasting with dynamic temporal context learning
CN110889560A (en) * 2019-12-06 2020-03-17 西北工业大学 Express delivery sequence prediction method with deep interpretability
CN111612215A (en) * 2020-04-18 2020-09-01 华为技术有限公司 Method for training time sequence prediction model, time sequence prediction method and device
CN112053004A (en) * 2020-09-14 2020-12-08 胜斗士(上海)科技技术发展有限公司 Method and apparatus for time series prediction

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114971057A (en) * 2022-06-09 2022-08-30 支付宝(杭州)信息技术有限公司 Model selection method and device
CN115343621A (en) * 2022-07-27 2022-11-15 山东科技大学 Power battery health state prediction method and device based on data driving
CN115343621B (en) * 2022-07-27 2024-01-26 山东科技大学 Method and equipment for predicting health state of power battery based on data driving
CN115794906A (en) * 2022-12-02 2023-03-14 中电金信软件有限公司 Method, device, equipment and storage medium for determining influence of emergency
CN116307153A (en) * 2023-03-07 2023-06-23 广东热矩智能科技有限公司 Meteorological prediction method and device for energy conservation of refrigeration and heating system and electronic equipment
CN116976956A (en) * 2023-09-22 2023-10-31 通用技术集团机床工程研究院有限公司 CRM system business opportunity deal prediction method, device, equipment and storage medium
CN117252311A (en) * 2023-11-16 2023-12-19 华南理工大学 Rail transit passenger flow prediction method based on improved LSTM network
CN117252311B (en) * 2023-11-16 2024-03-15 华南理工大学 Rail transit passenger flow prediction method based on improved LSTM network

Also Published As

Publication number Publication date
CN112053004A (en) 2020-12-08

Similar Documents

Publication Publication Date Title
WO2022053064A1 (en) Method and apparatus for time sequence prediction
US11928600B2 (en) Sequence-to-sequence prediction using a neural network model
US20180150783A1 (en) Method and system for predicting task completion of a time period based on task completion rates and data trend of prior time periods in view of attributes of tasks using machine learning models
US10540967B2 (en) Machine reading method for dialog state tracking
US11080707B2 (en) Methods and arrangements to detect fraudulent transactions
US20210142181A1 (en) Adversarial training of machine learning models
US20190130249A1 (en) Sequence-to-sequence prediction using a neural network model
US20190138887A1 (en) Systems, methods, and media for gated recurrent neural networks with reduced parameter gating signals and/or memory-cell units
US20210118430A1 (en) Utilizing a dynamic memory network for state tracking
CN110663049B (en) Neural Network Optimizer Search
WO2018175972A1 (en) Device placement optimization with reinforcement learning
US20210303970A1 (en) Processing data using multiple neural networks
US20210374544A1 (en) Leveraging lagging gradients in machine-learning model training
US11651212B2 (en) System and method for generating scores for predicting probabilities of task completion
US20220391706A1 (en) Training neural networks using learned optimizers
CN116091110A (en) Resource demand prediction model training method, prediction method and device
CN108475346B (en) Neural random access machine
EP4009239A1 (en) Method and apparatus with neural architecture search based on hardware performance
CN112243509A (en) System and method for generating data sets from heterogeneous sources for machine learning
US20230289634A1 (en) Non-linear causal modeling based on encoded knowledge
CN116993185A (en) Time sequence prediction method, device, equipment and storage medium
EP4231202A1 (en) Apparatus and method of data processing
US20200302303A1 (en) Optimization of neural network in equivalent class space
US20190065987A1 (en) Capturing knowledge coverage of machine learning models
CN115062769A (en) Knowledge distillation-based model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21866116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21866116

Country of ref document: EP

Kind code of ref document: A1