WO2022053064A1 - Method and apparatus for time series prediction - Google Patents

Method and apparatus for time series prediction

Info

Publication number
WO2022053064A1
WO2022053064A1 (PCT/CN2021/118272; CN2021118272W)
Authority
WO
WIPO (PCT)
Prior art keywords
future
historical
data sequence
time series
neural network
Prior art date
Application number
PCT/CN2021/118272
Other languages
English (en)
Chinese (zh)
Inventor
朱云依
Original Assignee
胜斗士(上海)科技技术发展有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 胜斗士(上海)科技技术发展有限公司
Publication of WO2022053064A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • the present application relates to time series forecasting, and in particular, to a method, apparatus, and computer-readable storage medium for predicting future data of an object based on historical data of the object.
  • Forecasting sales expectations for future times based on product sales over a past period of time is known as product time series forecasting.
  • the current mainstream technologies for time series forecasting include two categories: one is the traditional statistics-based forecasting algorithm represented by Arima/Prophet, and the other is the deep learning-based forecasting algorithm represented by the LSTM neural network.
  • the time series prediction algorithms based on traditional statistics are linear and therefore have difficulty capturing nonlinear and long-term patterns in the time series.
  • the time series prediction algorithm based on the LSTM neural network is prone to vanishing or exploding gradients when the scale of the time series becomes larger, which distorts the prediction results; its operation efficiency is also low, and its data and computation are redundant.
  • the present application proposes a method, an apparatus and a computer-readable storage medium for time series prediction, which are intended to address at least one defect of the prior art solutions by extracting regularities from the historical data of an object and combining them with future influencing factors to predict the object's future data.
  • a method for time series prediction, comprising:
  • acquiring a historical data sequence of an object corresponding to a historical time series, the historical data in the historical data sequence including historical dynamic features and historical values of the object corresponding to the historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
  • using a first neural network model to extract, based on the historical data sequence, a regular data sequence of the object corresponding to a future time series;
  • generating a predicted feature data sequence based on the regular data sequence, the future dynamic feature sequence of the object corresponding to the future time series, and the future static feature of the object, wherein the future dynamic feature sequence includes a future dynamic feature of the object corresponding to a future time in the future time series, the future dynamic feature being associated with the corresponding future time; and
  • using a second neural network model to predict, based on the predicted feature data sequence, a future data sequence of the object corresponding to the future time series, the future data in the future data sequence including predicted future values of the object corresponding to the future times of the future time series.
  • an apparatus for time series prediction comprising:
  • a historical data acquisition unit configured to acquire a historical data sequence of the object corresponding to a historical time series, the historical data in the historical data sequence including the historical dynamic features and historical values of the object corresponding to the historical times in the historical time series, wherein the historical dynamic features are associated with the corresponding historical times;
  • a regularity extraction unit configured to use a first neural network model to extract the regularity data sequence of the object corresponding to the future time sequence based on the historical data sequence
  • a predicted feature generation unit configured to generate a predicted feature data sequence based on the regular data sequence, the future dynamic feature sequence of the object corresponding to the future time series, and the future static feature of the object, wherein the future dynamic feature sequence includes the future dynamic features of the object corresponding to the future times of the future time series, the future dynamic features being associated with the corresponding future times;
  • a prediction unit configured to use a second neural network model to predict a future data sequence of the object corresponding to the future time series based on the predicted feature data sequence, where the future data in the future data sequence includes the predicted future values of the object corresponding to the future times of the future time series.
  • a computer-readable storage medium on which a computer program is stored, the computer program including executable instructions, when the executable instructions are executed by at least one processor, implement the above-mentioned method.
  • an electronic device comprising a processor and a memory for storing executable instructions of the processor, wherein the processor is configured to execute the executable instructions to implement the method described above.
  • the time series prediction method and apparatus can meet the requirements of efficient computation, accurately capture the nonlinear effects of trend factors, seasonal factors, external factors, etc. on the predicted object, and perform both short-range and long-range time prediction.
  • FIG. 1 is a schematic diagram of a seq2seq neural network model architecture for time series prediction according to an embodiment of the present application
  • FIG. 2 is an exemplary flowchart of a method for time series forecasting according to an embodiment of the present application
  • FIG. 3 is a schematic block diagram of an apparatus for time series prediction according to an embodiment of the present application.
  • FIG. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application.
  • although the method and apparatus for predicting the future data of an object based on the historical data of the object are introduced below with a specific neural network model structure according to an embodiment, the solution of the present application is not limited to this example; it can be extended to other neural network structures capable of realizing the time series forecasting concept of the present application, and also to other deep-learning-based forecasting model structures.
  • the time series prediction method is introduced with the sales products of the catering industry at the sales place as the object, but the method of the present application can be applied to any objects and scenarios that require time series prediction.
  • the neural network generally refers to an artificial neural network (ANN).
  • a common convolutional neural network can be used for the neural network, and a fully convolutional neural network can be further used as the case may be.
  • other specific types and structures of neural networks that are not relevant to the time series forecasting method of the present application are not described in detail herein, so as not to obscure the description.
  • both the traditional statistics-based Autoregressive Integrated Moving Average (ARIMA) model forecasting algorithm and the Prophet time series model forecasting algorithm can be used to predict time-related patterns such as trend and seasonality. They first decompose the historical data corresponding to the historical time series into a linear superposition of trend factors, seasonal factors and external influencing factors, predict the impact of each of these factors on the data corresponding to future times separately, and finally superimpose the prediction results of the three factors to obtain the final prediction result.
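  • for reference, the following is a minimal, hedged example of such a traditional statistical baseline (assuming the Python statsmodels library is available; the data and the ARIMA order are invented purely for illustration):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic weekly sales with a linear trend plus yearly seasonality.
rng = np.random.default_rng(0)
t = np.arange(104)
sales = 10 + 0.1 * t + 5 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 1, 104)

# Fit the linear ARIMA baseline and forecast the next 4 weeks.
model = ARIMA(sales, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=4)
print(forecast)
```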
  • linear algorithms have difficulty capturing the patterns that exist in time series; different time series are predicted independently, so the relationships between them are not considered, each individual forecast is not accurate enough, and their simple linear superposition cannot accurately reflect the real trend of the process; and when a time series is relatively short and the corresponding amount of historical data is small, a linear algorithm can neither capture long-term patterns nor learn from other sequences.
  • the historical observations, historical influencing factors and future influencing factors of the variables corresponding to the time series are used as the input of the neural network model structure, and the future predicted values of the variables are used as the output of the neural network model structure.
  • the LSTM (Long Short-Term Memory) network is a recurrent neural network specially designed to solve the long-term dependency problem of general RNNs (Recurrent Neural Networks), and is suitable for processing and predicting events with very long intervals and delays in time series.
  • LSTM networks generally outperform ordinary recurrent neural networks and hidden Markov models (HMMs).
  • an important structure in the LSTM network is the gate: the forget gate determines whether the existing memory in the block is retained, the input gate determines whether new input is passed into the block, and the output gate determines whether the information in the block's memory is passed out.
  • LSTM networks are usually trained using gradient descent.
  • although the LSTM network model can overcome the poor long-term forecasting performance of the Arima/Prophet algorithms, it still cannot meet many requirements of time series forecasting.
  • time series prediction process of the present application is described below with reference to the seq2seq neural network model architecture of FIG. 1 and the method flow for time series prediction of FIG. 2 according to an embodiment of the present application.
  • the basic structure of the neural network model architecture 100 in FIG. 1 is a seq2seq (sequence to sequence) network model.
  • the seq2seq neural network model can be regarded as a transformation model.
  • the basic idea is that the former neural network model of the two neural network models connected in series is used as the encoder network, and the latter neural network model is used as the decoder network.
  • the encoder network converts a sequence of data into a vector or sequence of vectors, and the decoder network generates another sequence of data from that vector or sequence of vectors.
  • one usage scenario of the seq2seq network model is machine translation, in which the encoder network converts English sentences into semantic data or semantic sequences, and the decoder network converts the semantic data or semantic sequences into the corresponding Chinese sentences.
  • the optimization of the seq2seq network model can use the maximum likelihood estimation method to maximize the probability of the data sequence generated by the decoder to obtain the optimal conversion effect.
  • the seq2seq neural network model architecture 100 includes a first neural network model 110 as an encoder and a second neural network model 120 as a decoder.
  • the first neural network model 110 is used for extracting information in the historical data, especially regular data reflecting the regularity in the historical data.
  • the first neural network model 110 is a WaveNet neural network.
  • the WaveNet network is designed to predict the value of the n-th element of a data sequence based on the first n-1 elements.
  • WaveNet is particularly suitable for high-throughput input of one-dimensional sequences of multi-dimensional vectors, enabling fast computation.
  • the standard WaveNet network model is a convolutional neural network in which each convolutional layer convolves the output of the previous layer; the larger the convolution kernels and the more layers the network has, the stronger its perception ability in the time domain and the larger its receptive field.
  • each time a new value is generated, it can be placed as the last node of the input layer, and subsequent values are then generated iteratively.
  • the activation function of the WaveNet network can use gate units, for example.
  • the hidden layers between the input layer and the output layer of the network adopt residual and skip connections; that is, each node of a convolutional layer in the hidden layer adds its original input value to the output value of the activation function and passes the sum to the next convolutional layer.
  • the operation of reducing the number of channels can be achieved through a 1x1 convolution kernel; the activation function outputs of each hidden layer are then summed via the skip connections and finally passed through the output layer.
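  • as a concrete illustration, below is a minimal sketch of one such gated convolutional layer with residual and skip connections (assuming PyTorch; all channel sizes and names are placeholders, not the application's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualLayer(nn.Module):
    """One WaveNet-style layer: gated activation, residual and skip connections."""

    def __init__(self, channels: int, skip_channels: int, dilation: int):
        super().__init__()
        self.dilation = dilation
        # Two dilated causal convolutions feed the gate: tanh(filter) * sigmoid(gate).
        self.filter_conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        self.gate_conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        # 1x1 convolutions adjust the channel counts of the residual and skip paths.
        self.residual_1x1 = nn.Conv1d(channels, channels, kernel_size=1)
        self.skip_1x1 = nn.Conv1d(channels, skip_channels, kernel_size=1)

    def forward(self, x: torch.Tensor):
        # Left-pad so the convolution stays causal (no information from the future).
        padded = F.pad(x, (self.dilation, 0))
        gated = torch.tanh(self.filter_conv(padded)) * torch.sigmoid(self.gate_conv(padded))
        skip = self.skip_1x1(gated)              # summed across layers, then output
        residual = self.residual_1x1(gated) + x  # add the original value, pass onward
        return residual, skip
```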
  • the first neural network model 110 has an input layer (ie, a first convolutional layer) 112 , a hidden layer 113 and an output layer 114 .
  • the number of hidden layers 113 may be zero, one or more.
  • the input layer 112 , the hidden layer 113 and the output layer 114 each have a plurality of nodes 111 .
  • the number of nodes in the input layer 112 should at least correspond to the data length in the historical data sequence to ensure that the neural network can receive information at each historical time.
  • the first neural network model 110 using the ordinary WaveNet network is a one-dimensional causal convolutional network
  • the number of nodes used decreases by 1 with each convolutional layer. If the historical data is long, either many layers must be added to the first neural network model 110 so that information from all n inputs is passed through, or a large filter is required; the gradients selected during gradient descent then become too small, training of the network becomes complicated, and the fit is poor.
  • a dilated convolutional neural network is a convolutional network with "holes".
  • the first convolutional layer (ie, the input layer) of the dilated convolutional neural network may be a one-dimensional causal convolutional network with an expansion coefficient of 1.
  • the dilation coefficient of each convolutional layer is the dilation coefficient of the previous convolutional layer multiplied by the dilation index (Dilation Index), where the dilation index is a value not less than 2 and not greater than the number of nodes in the convolutional layer.
  • This dilated convolutional neural network configuration can be employed in both the hidden layer and the output layer of the first neural network model 110 .
  • for example, when the dilation index is 2, the second convolutional layer only uses nodes n, n-2, n-4, ... for convolution, the third convolutional layer only uses nodes n, n-4, n-8, ..., and so on.
  • the expanded neural network structure can significantly speed up the information transfer process in the neural network, avoid gradient disappearance or gradient explosion, and improve the processing speed and prediction accuracy of the first neural network model 110 .
  • for example, when the convolution kernel size is 2 and the dilation index is 2, the number of convolutional layers through which information passes from the node corresponding to the first historical time to the node corresponding to the last historical time is log2(N), where N is the length of the data in the historical data sequence.
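  • a sketch of how such dilated layers might be stacked (again assuming PyTorch and reusing the GatedResidualLayer sketched above; sizes are illustrative): with kernel size 2 and a dilation index of 2, the dilation schedule 1, 2, 4, ... doubles the receptive field per layer, so roughly log2(N) layers cover a history of length N.

```python
import math
import torch
import torch.nn as nn

class WaveNetEncoder(nn.Module):
    """Stack of GatedResidualLayer blocks (from the sketch above), dilations 1, 2, 4, ..."""

    def __init__(self, in_channels: int, channels: int, skip_channels: int, history_len: int):
        super().__init__()
        self.input_1x1 = nn.Conv1d(in_channels, channels, kernel_size=1)
        # With kernel size 2 and dilation index 2, ceil(log2(N)) layers give a
        # receptive field covering all N historical times.
        num_layers = max(1, math.ceil(math.log2(history_len)))
        self.layers = nn.ModuleList(
            GatedResidualLayer(channels, skip_channels, dilation=2 ** i)
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, history_len)
        h = self.input_1x1(x)
        skips = 0
        for layer in self.layers:
            h, skip = layer(h)
            skips = skips + skip      # sum the skip outputs of every hidden layer
        return skips                  # (batch, skip_channels, history_len)
```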
  • the second neural network model 120 serving as the decoder may be a multi-layer perceptron (MLP) network.
  • the MLP network also includes an input layer, a hidden layer and an output layer, where each neuron node has an activation function (such as a sigmoid function) and is trained using a loss function.
  • the MLP network predicts the future values of the object based on the historical regularities extracted by the encoder network and the future influencing factors (including dynamic and static factors).
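  • a minimal sketch of such a decoder network follows (PyTorch; layer dimensions are placeholders, as the application does not specify them):

```python
import torch
import torch.nn as nn

class MLPDecoder(nn.Module):
    """Maps each per-future-time feature vector to a predicted future value."""

    def __init__(self, feature_dim: int, hidden_dim: int, target_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.Sigmoid(),                       # an activation such as sigmoid, per the text
            nn.Linear(hidden_dim, target_dim),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, m, feature_dim) -> predictions: (batch, m, target_dim)
        return self.net(features)
```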
  • the time series prediction method of the present application can use any other neural network capable of feature extraction and prediction on sequence data, such as, but not limited to, various types of recurrent neural networks (RNNs) that can implement the time series prediction function of the present application.
  • for example, the encoder network can use an LSTM network and the decoder network an MLP network. Although the LSTM network has shortcomings, combining it with the MLP network and adjusting the input and output data sequences of the network to a certain extent can still obtain better results than the existing schemes. The WaveNet network can also be chosen as the encoder network with an LSTM network or other RNN network as the decoder network, and so on.
  • the unit of historical time and/or future time can be selected from hour, day, month, year, week, quarter, etc. as required.
  • the historical time and/or the future time may be a time point (for example, the t1-th moment, the end of the first quarter, 10 a.m., etc.), or may be a continuous time period (for example, the t2-th period, the 2nd week, the 3rd month, October of the current year, etc.).
  • the time intervals between the respective historical times and/or future times may be the same, so that periodic information of the historical data and future data, with that constant time interval as the period, can be extracted and predicted.
  • when the historical times and/or future times are time periods, the lengths of the time periods can also be the same, so as to extract and predict periodic information of the above-mentioned historical data and future data with that length as the period.
  • the method 200 first acquires the historical data sequence 101 of the object corresponding to the historical time series T1 in step S210.
  • the historical value yi is the measured value of the object measured at the historical time ti , such as the actual sales value of the product.
  • the historical value is caused by the internal factors of the object, so it can also be called the internal factors of the object or the internal characteristic data.
  • the historical dynamic feature xi is the dynamic feature of the historical value yi in the historical data that affects the object, for example including one or more of whether it is a holiday, the number of working days, the number of days or weeks away from a holiday, and so on.
  • historical dynamic features are associated with time, and include, for example, periodic factors that cyclically affect the object with a certain period (also known as periodic historical dynamic features) and aperiodic factors that affect the object aperiodically (also known as aperiodic historical dynamic features).
  • the period of a periodic factor may be determined by the length of the identical time interval between the historical time points in the historical time series, or by the identical length of the historical time periods.
  • the way in which aperiodic factors affect objects is related to a specific historical time, or it can be said to be random or triggered based on events.
  • for different historical times, the corresponding aperiodic factors may be different.
  • the number n of historical times represents the number or length of historical data.
  • the historical value yi may be a multidimensional variable or vector.
  • since the historical dynamic feature x_i that affects the historical value y_i of the object includes many factors, the historical dynamic feature can be considered a combination of multiple historical sub-dynamic features, and x_i can also be a multidimensional variable or vector.
  • the historical dynamic feature x_i and the historical value y_i can form a two-dimensional vector (x_i, y_i)^T (also called a binary data group; hereinafter referred to uniformly as a two-dimensional vector), where the sub-vectors x_i and y_i may each be multidimensional vectors as described above.
  • the historical data sequence 101 can then be represented as a one-dimensional sequence of two-dimensional vectors {(x_1, y_1)^T, (x_2, y_2)^T, ..., (x_n, y_n)^T}.
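  • the following toy sketch (NumPy; all dimensions are invented for illustration) shows how such a historical data sequence might be assembled:

```python
import numpy as np

# n historical times; each has a d_x-dimensional dynamic feature x_i
# (e.g. holiday flag, working days) and a d_y-dimensional value y_i (e.g. sales).
n, d_x, d_y = 8, 3, 1
x_hist = np.random.rand(n, d_x)      # historical dynamic features x_1 ... x_n
y_hist = np.random.rand(n, d_y)      # historical values y_1 ... y_n

# Each element of the sequence is the stacked vector (x_i, y_i)^T.
history = np.concatenate([x_hist, y_hist], axis=1)   # shape (n, d_x + d_y)

# Transposed to (channels, time), this is the form a 1-D convolutional
# encoder such as WaveNet would consume.
encoder_input = history.T                            # shape (d_x + d_y, n)
```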
  • the first neural network model 110 serving as an encoder does not process historical static data, which can reduce the redundancy of data and calculation, and improve the operation speed of the network model.
  • the first neural network model 110 is used to complete the regularity extraction function of extracting the regularity data sequence 102 of the object corresponding to the future time series T2 based on the historical data sequence 101.
  • the historical data sequence 101 is used as the input of the first neural network model 110, and the extracted regular data sequence 102 is output through the transmission and calculation of each convolutional layer of the WaveNet network.
  • the dilated convolutional neural network described above can speed up the regular information extraction process of the regular data sequence 102 and improve the information extraction accuracy.
  • for example, the historical value of the object may be affected periodically, such as by month.
  • the time interval between the future times t_{n+j} of the future time series T2 (when the future times are time points) and/or the length of the future times (when the future times are time periods) is set to be the same as that in the historical time series T1, so the periodic regularity feature c_a is the same for each future time t_{n+j}.
  • the aperiodic regularity feature c_{n+j} reflects how the future value y_{n+j} of the object is affected by a specific future time, so the aperiodic regularity feature c_{n+j} may be different for each corresponding future time t_{n+j} of the future time series T2.
  • m is the number of future times in the future time series T2, indicating the number or length of future data to be predicted.
  • the periodic regularity feature c_a also includes multiple sub-periodic regularity features and can be expressed as a multi-dimensional vector.
  • the dimension of the periodic regularity feature c_a may be the same as the number of sub-periodic historical dynamic features in the periodic historical dynamic feature, or smaller than the latter to reduce the amount of computation.
  • an aperiodic historical dynamic feature may also have multiple sub-aperiodic dynamic features, so the aperiodic regularity feature c_{n+j} also includes multiple sub-aperiodic regularity features and can be represented as a multi-dimensional vector.
  • the dimension of the aperiodic regularity feature c_{n+j} may likewise be the same as the number of sub-aperiodic historical dynamic features in the aperiodic historical dynamic feature, or smaller than the latter to reduce the amount of computation.
  • the regular data sequence 102 can be represented as a one-dimensional sequence of two-dimensional vectors whose elements are composed of two multi-dimensional sub-vectors, the periodic regularity feature c_a and the aperiodic regularity feature c_{n+j}: {(c_a, c_{n+1})^T, (c_a, c_{n+2})^T, ..., (c_a, c_{n+m})^T}.
  • x_{n+j} in FIG. 1 is the future dynamic feature affecting the future value y_{n+j} of the object corresponding to the future time t_{n+j} in the future time series T2.
  • the future dynamic feature x_{n+j} may include, for example, one or more of promotional activities at a certain time in the future, whether it is a holiday, the number of working days, the number of days or weeks away from a holiday, and so on.
  • the future dynamic features are also associated with time and may be multi-dimensional vectors that include multiple sub-future dynamic features.
  • the future dynamic features x_{n+j} form a one-dimensional sequence of multi-dimensional vectors {x_{n+1}, x_{n+2}, ..., x_{n+m}}.
  • the future static features xs may include properties of the object (which are generally only relevant to the object itself and not to future time) and other features that are not time-dependent.
  • for example, the future static features x_s can be the category of the product, the temperature of the product, the sales location of the product (for example, represented by the location of the distribution center), etc.; these features are only associated with the object itself and do not change over time.
  • the future static feature x s can be a multidimensional vector composed of multiple sub-features.
  • the future static feature x s can be further processed.
  • the future static features x_s can be divided into different types, and the correlation between the types differs. Embedding operations can transform sparse discrete variables into continuous variables. The future static features are embedded according to their types, for example divided into two groups x_s1 and x_s2 according to location-related features and product-attribute-related features, so that different groups of future static features are uncorrelated, that is, orthogonality is maintained. This avoids treating each specific static influencing factor as a separate variable or vector dimension, thereby reducing the overall dimension of the future static features x_s and the computational load of the model (see the sketch after the sequence representation below).
  • the future static feature group x_s1 includes a multi-dimensional future static feature e_1, and the future static feature group x_s2 includes a multi-dimensional future static feature e_2; both can affect the object at every future time t_{n+j}.
  • the number of future static features or future static feature groups may be 0, 1 or more.
  • the number of specific features included in each future static feature determines its dimension, which can be one or more.
  • the future static features x_s, each a 0-, 1- or multi-dimensional vector, form a one-dimensional sequence of length m, {x_s, x_s, ..., x_s}, whose elements repeat at every future time.
  • with the two groups above, this one-dimensional sequence can be expressed as {(e_1, e_2)^T, (e_1, e_2)^T, ..., (e_1, e_2)^T}.
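  • a hedged sketch of the grouped embedding and tiling of static features follows (PyTorch; the vocabulary sizes, embedding dimensions and example indices are invented):

```python
import torch
import torch.nn as nn

# Two groups: location-related and product-attribute-related static features.
num_locations, num_categories = 20, 200
loc_embed = nn.Embedding(num_locations, 4)      # x_s1 -> e_1 (4-dimensional)
cat_embed = nn.Embedding(num_categories, 8)     # x_s2 -> e_2 (8-dimensional)

location_id = torch.tensor([3])                 # sparse discrete inputs
category_id = torch.tensor([57])
e1 = loc_embed(location_id)                     # continuous vector, shape (1, 4)
e2 = cat_embed(category_id)                     # continuous vector, shape (1, 8)

# Repeat the static vector at every future time: {(e_1, e_2)^T, ..., (e_1, e_2)^T}.
m = 4                                           # number of future times
static_seq = torch.cat([e1, e2], dim=-1).expand(m, -1)   # shape (m, 12)
```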
  • the predicted feature data sequence 103 may be generated based on the regular data sequence 102, the future dynamic feature sequence of the object corresponding to the future time series, and the future static feature of the object.
  • the generating process can be completed by splicing the one-dimensional sequences of the four influencing factors to form a one-dimensional predicted feature data sequence 103 whose elements are multi-dimensional, multi-part vectors (which may be called quaternary data groups), as shown in FIG. 1.
  • the one-dimensional sequence 103 can be represented as {(c_a, c_{n+1}, e_1, e_2, x_{n+1})^T, (c_a, c_{n+2}, e_1, e_2, x_{n+2})^T, ..., (c_a, c_{n+m}, e_1, e_2, x_{n+m})^T}.
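  • continuing the sketches above (reusing MLPDecoder, static_seq and m from the earlier blocks; all shapes are illustrative only), the splicing step might look like this:

```python
import torch

# Per-future-time parts; c_a repeats at every step, the others vary with t_{n+j}.
c_a = torch.rand(1, 4).expand(m, -1)     # periodic regularity feature
c_np = torch.rand(m, 4)                  # aperiodic regularity features c_{n+j}
x_future = torch.rand(m, 3)              # future dynamic features x_{n+j}

# static_seq (m, 12) comes from the embedding sketch above.
features_103 = torch.cat([c_a, c_np, static_seq, x_future], dim=-1)   # shape (m, 23)

# Feed the spliced sequence to the MLPDecoder sketched earlier.
decoder = MLPDecoder(feature_dim=23, hidden_dim=32, target_dim=1)
future_104 = decoder(features_103)       # shape (m, 1): predicted future values
```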
  • the predicted feature data sequence 103 is input into the second neural network model 120, and the prediction of the future data sequence 104 corresponding to the future time series T2 is completed through the transfer and operations of a decoder such as a multi-layer perceptron (MLP) network.
  • the future value y_{n+j} of the object predicted for each future time in the future data sequence 104 is a multidimensional vector having the same dimensions as the historical value y_i.
  • the method 200 may also optionally include a step S250 of training the neural network models using a training data set to determine the optimal parameters of the models, prior to using at least one of the first and second neural network models 110 and 120 as the encoder and decoder networks, respectively.
  • the parameters of the neural network models can remain unchanged during use after training is completed, can be updated or adjusted based on a new data set after a period of use or at predetermined intervals, or can be updated in real time by means of online supervision.
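  • a minimal sketch of training step S250 (PyTorch; batch size, learning rate and data are placeholders, and the path from encoder output to the regular data sequence is elided, so only the decoder receives gradients here):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Reuse the toy sizes from the earlier sketches.
encoder = WaveNetEncoder(in_channels=d_x + d_y, channels=16, skip_channels=8, history_len=n)
decoder = MLPDecoder(feature_dim=23, hidden_dim=32, target_dim=1)

# In the full model the encoder would be trained jointly; here only the
# decoder is optimized because the regularity-extraction step is elided.
optimizer = optim.Adam(decoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(1000):
    features = torch.rand(16, m, 23)     # stand-in for predicted feature sequence 103
    targets = torch.rand(16, m, 1)       # stand-in for observed future values
    optimizer.zero_grad()
    loss = loss_fn(decoder(features), targets)
    loss.backward()
    optimizer.step()
# After training, parameters may be frozen, refreshed periodically on new
# data, or updated in real time under online supervision.
```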
  • FIG. 3 shows an exemplary structure of an apparatus 300 for time series prediction according to an embodiment of the present application.
  • the apparatus 300 includes a historical data acquisition unit 310 , a regularity extraction unit 320 , a prediction feature generation unit 330 and a prediction unit 340 .
  • the historical data acquisition unit 310 is configured to acquire the historical data sequence 101 of the object corresponding to the historical time series T1.
  • the historical data in the historical data sequence 101 includes the time-associated historical dynamic features x_i corresponding to the historical times t_i in the historical time series T1, and the historical values y_i of the object.
  • the regularity extraction unit 320 includes, for example, the first neural network model 110 as an encoder network in the seq2seq neural network model to extract the regularity of historical data. This unit is used to extract the regular data sequence 102 of the object corresponding to the future time series T2 from the historical data sequence 101 provided by the historical data acquisition unit 310 by using the neural network model.
  • the regular data sequence 102 includes the periodic regularity feature c_a of the object corresponding to the future times t_{n+j} of the future time series T2 and the aperiodic regularity features c_{n+j} associated with the corresponding future times.
  • the encoder network can choose a sequence data network model such as the WaveNet network, and can further adopt a structure such as a dilated convolutional network to speed up information transfer and computation.
  • the prediction feature generation unit 330 is configured to combine the regularity data sequence 102 output by the regularity extraction unit 320, the future dynamic feature sequence composed of the future dynamic features x_{n+j} corresponding to the future times t_{n+j} in the future time series T2, and the future static features x_s, to generate the predicted feature data sequence 103.
  • a future dynamic feature x_{n+j} is associated with the future time t_{n+j}.
  • the predicted feature generation unit 330 may further group the static features x_s to orthogonalize each group of static features, thereby reducing the vector dimension of each data element of the predicted feature data sequence.
  • the prediction unit 340 comprises, for example, the second neural network model 120 as a decoder network in a seq2seq neural network model to predict future values of the object.
  • the unit 340 is used to predict, using the second neural network model 120, the future data sequence 104 of the object corresponding to the future time series T2 from the predicted feature data sequence 103 provided by the prediction feature generation unit 330.
  • the second neural network model 120 may use a network such as a multi-layer perceptron (MLP) network.
  • the apparatus 300 also optionally includes a model training unit 350 for training the corresponding neural network models to determine the optimal model parameters before the neural network models are used in the above-mentioned extraction unit 320 and prediction unit 340, and for supervising or updating the parameters of the models.
  • the experiment is carried out in the scenario of product prediction in the catering industry, and the test task requires to predict the sales volume of each product (object) in each distribution center in the next 1-4 weeks.
  • the test dataset covers about 20 distribution centers, each including on average about 200 products; in the historical product sales data, the longest history is 128 weeks and the shortest is 1 week.
  • the test task involves considering 23 dynamic influencing factors (such as whether it is a holiday, the number of working days, the number of weeks until the Spring Festival, etc.) and 7 static influencing factors (such as product classification, temperature, location of the distribution center, etc.) in the prediction.
  • Table 1 shows the training time, prediction time and prediction error of the models using different time series forecasting methods.
  • the deep learning methods using the seq2seq neural network model require a large number of floating-point operations and, unlike the traditional statistical algorithm Prophet, additionally use a graphics processing unit (GPU) to accelerate the computation.
  • the prediction accuracy (error) of the scheme using the WaveNet-MLP seq2seq structure (WaveNet network as the encoder and MLP network as the decoder) according to the embodiment of the present application is better than that of the traditional statistical algorithm, and also better than that of the seq2seq neural network model structure in which both the encoder and decoder networks adopt the LSTM network model.
  • in terms of prediction time, the solutions using neural network models are faster than the traditional statistical algorithm; and among the solutions using neural network models, the training time of the WaveNet-MLP seq2seq neural network model structure of this application is significantly reduced.
  • the advantages of the time series prediction method and apparatus lie in the following aspects: using two neural network models, such as the WaveNet network and the MLP network, as the encoder and decoder networks respectively allows the historical data sequence and the future data sequence to be computed in parallel across the different historical times of the historical time series and the different future times of the future time series, thereby improving the speed of model training and use; using a neural network model such as the WaveNet network as the encoder, especially with the dilated convolutional network structure, shortens the transmission path of the information in the historical data sequence of the object from the first historical time to the last historical time, avoiding vanishing and exploding gradients during training of the neural network and thereby enabling long-range time series prediction; influencing factors that do not change with time are introduced only at the input of the second neural network model serving as the decoder, avoiding duplication and computation at each time point of the encoder network and thereby reducing the redundancy of data and computation; and embedding the static influencing factors in groups keeps the groups orthogonal, reducing the vector dimension of the predicted feature data and the computational load of the model.
  • although modules or units of the apparatus for time series prediction are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present application, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units. Components shown as modules or units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present application, and those of ordinary skill in the art can understand and implement this without creative effort.
  • a computer-readable storage medium on which a computer program is stored, the program including executable instructions which, when executed by, for example, a processor, can implement the steps of the method for time series prediction described in any one of the above embodiments.
  • various aspects of the present application can also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the method for time series prediction in this specification.
  • the program product for implementing the above method according to the embodiments of the present application may adopt a portable compact disc read only memory (CD-ROM) and include program codes, and may be executed on a terminal device such as a personal computer.
  • CD-ROM compact disc read only memory
  • the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • LAN local area network
  • WAN wide area network
  • an electronic device which may include a processor, and a memory for storing executable instructions of the processor.
  • the processor is configured to perform the steps of the method for time series prediction in any one of the foregoing embodiments by executing the executable instructions.
  • aspects of the present application may be implemented as a system, method or program product. Therefore, various aspects of the present application can be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or a combination of hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
  • the electronic device 400 according to this embodiment of the present application is described below with reference to FIG. 4 .
  • the electronic device 400 shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • electronic device 400 takes the form of a general-purpose computing device.
  • Components of the electronic device 400 may include, but are not limited to, at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410), a display unit 440, and the like.
  • the storage unit stores program code executable by the processing unit 410, so that the processing unit 410 performs the steps according to the various exemplary embodiments of the present application described in the method for time series prediction in this specification.
  • the processing unit 410 may perform the steps shown in FIG. 2 .
  • the storage unit 420 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 4201 and/or a cache storage unit 4202 , and may further include a read only storage unit (ROM) 4203 .
  • RAM random access storage unit
  • ROM read only storage unit
  • the storage unit 420 may also include a program/utility 4204 having a set (at least one) of program modules 4205, including but not limited to an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the bus 430 may be representative of one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 400 may also communicate with one or more external devices 500 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may occur through the input/output (I/O) interface 450. Also, the electronic device 400 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460. The network adapter 460 may communicate with other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
  • the exemplary embodiments described herein may be implemented by software, or by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of software products, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and include several instructions to cause a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the method for time series prediction according to an embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a time series prediction method and apparatus and a computer-readable storage medium. The method comprises: acquiring a historical data sequence of an object corresponding to a historical time series (S210); using a first neural network model to extract, on the basis of the historical data sequence, a regular data sequence corresponding to a future time series (S220); generating a predicted feature data sequence on the basis of the regular data sequence, a future dynamic feature sequence corresponding to the future time series, and a future static feature (S230); and using a second neural network model to predict, on the basis of the predicted feature data sequence, a future data sequence of the object corresponding to the future time series (S240). The described method can meet the requirements of highly efficient computation, accurately capture the nonlinear impact of trend factors, seasonal factors, external factors and the like on a predicted object, and perform both short-range and long-range time prediction.
PCT/CN2021/118272 2020-09-14 2021-09-14 Method and apparatus for time series prediction WO2022053064A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010959817.7 2020-09-14
CN202010959817.7A CN112053004A (zh) 2020-09-14 2020-12-08 Method and apparatus for time series prediction

Publications (1)

Publication Number Publication Date
WO2022053064A1 (fr)

Family

ID=73610632

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/118272 WO2022053064A1 (fr) 2020-09-14 2021-09-14 Method and apparatus for time series prediction

Country Status (2)

Country Link
CN (1) CN112053004A (fr)
WO (1) WO2022053064A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114971057A (zh) * 2022-06-09 2022-08-30 支付宝(杭州)信息技术有限公司 Model selection method and apparatus
CN115343621A (zh) * 2022-07-27 2022-11-15 山东科技大学 Data-driven power battery state-of-health prediction method and device
CN115794906A (zh) * 2022-12-02 2023-03-14 中电金信软件有限公司 Method, apparatus, device and storage medium for determining the impact of an emergency event
CN116307153A (zh) * 2023-03-07 2023-06-23 广东热矩智能科技有限公司 Weather forecasting method and apparatus for energy saving in cooling and heating systems, and electronic device
CN116976956A (zh) * 2023-09-22 2023-10-31 通用技术集团机床工程研究院有限公司 CRM system business-opportunity deal prediction method, apparatus, device and storage medium
CN117252311A (zh) * 2023-11-16 2023-12-19 华南理工大学 Rail transit passenger flow prediction method based on an improved LSTM network

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053004A (zh) * 2020-09-14 2020-12-08 胜斗士(上海)科技技术发展有限公司 Method and apparatus for time series prediction
CN112232604B (zh) * 2020-12-09 2021-06-11 南京信息工程大学 Prediction method for extracting network traffic based on the Prophet model
CN112906941B (zh) * 2021-01-21 2022-12-06 哈尔滨工程大学 Prediction method and system for dynamically correlated air quality time series
CN112967518B (zh) * 2021-02-01 2022-06-21 浙江工业大学 Seq2Seq prediction method for bus trajectories under dedicated bus lane conditions
CN112801202B (zh) * 2021-02-10 2024-03-05 延锋汽车饰件系统有限公司 Vehicle window fogging prediction method and system, electronic device and storage medium
CN113313316A (zh) * 2021-06-11 2021-08-27 北京明略昭辉科技有限公司 Prediction data output method and apparatus, storage medium, and electronic device
CN113837858A (zh) * 2021-08-19 2021-12-24 同盾科技有限公司 Method, system, electronic device and storage medium for user credit risk prediction
CN113850418A (zh) * 2021-09-02 2021-12-28 支付宝(杭州)信息技术有限公司 Method and apparatus for detecting anomalous data in a time series
CN113985408B (zh) * 2021-09-13 2024-04-05 南京航空航天大学 Inverse synthetic aperture radar imaging method combining gate units and transfer learning
CN113837487A (zh) * 2021-10-13 2021-12-24 国网湖南省电力有限公司 Power system load forecasting method based on a combined model
CN114580798B (zh) * 2022-05-09 2022-09-16 南京安元科技有限公司 Transformer-based equipment point prediction method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850891A (zh) * 2015-05-29 2015-08-19 厦门大学 Intelligently optimized recurrent neural network method for time series prediction
CN106971348A (zh) * 2016-01-14 2017-07-21 阿里巴巴集团控股有限公司 Time-series-based data prediction method and apparatus
US20200074274A1 (en) * 2018-08-28 2020-03-05 Beijing Jingdong Shangke Information Technology Co., Ltd. System and method for multi-horizon time series forecasting with dynamic temporal context learning
CN110889560A (zh) * 2019-12-06 2020-03-17 西北工业大学 Express delivery sequence prediction method with deep interpretability
CN111612215A (zh) * 2020-04-18 2020-09-01 华为技术有限公司 Method for training a time series prediction model, and time series prediction method and apparatus
CN112053004A (zh) * 2020-09-14 2020-12-08 胜斗士(上海)科技技术发展有限公司 Method and apparatus for time series prediction

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850891A (zh) * 2015-05-29 2015-08-19 厦门大学 Intelligently optimized recurrent neural network method for time series prediction
CN106971348A (zh) * 2016-01-14 2017-07-21 阿里巴巴集团控股有限公司 Time-series-based data prediction method and apparatus
US20200074274A1 (en) * 2018-08-28 2020-03-05 Beijing Jingdong Shangke Information Technology Co., Ltd. System and method for multi-horizon time series forecasting with dynamic temporal context learning
CN110889560A (zh) * 2019-12-06 2020-03-17 西北工业大学 Express delivery sequence prediction method with deep interpretability
CN111612215A (zh) * 2020-04-18 2020-09-01 华为技术有限公司 Method for training a time series prediction model, and time series prediction method and apparatus
CN112053004A (zh) * 2020-09-14 2020-12-08 胜斗士(上海)科技技术发展有限公司 Method and apparatus for time series prediction

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114971057A (zh) * 2022-06-09 2022-08-30 支付宝(杭州)信息技术有限公司 Model selection method and apparatus
CN115343621A (zh) * 2022-07-27 2022-11-15 山东科技大学 Data-driven power battery state-of-health prediction method and device
CN115343621B (zh) * 2022-07-27 2024-01-26 山东科技大学 Data-driven power battery state-of-health prediction method and device
CN115794906A (zh) * 2022-12-02 2023-03-14 中电金信软件有限公司 Method, apparatus, device and storage medium for determining the impact of an emergency event
CN116307153A (zh) * 2023-03-07 2023-06-23 广东热矩智能科技有限公司 Weather forecasting method and apparatus for energy saving in cooling and heating systems, and electronic device
CN116976956A (zh) * 2023-09-22 2023-10-31 通用技术集团机床工程研究院有限公司 CRM system business-opportunity deal prediction method, apparatus, device and storage medium
CN117252311A (zh) * 2023-11-16 2023-12-19 华南理工大学 Rail transit passenger flow prediction method based on an improved LSTM network
CN117252311B (zh) * 2023-11-16 2024-03-15 华南理工大学 Rail transit passenger flow prediction method based on an improved LSTM network

Also Published As

Publication number Publication date
CN112053004A (zh) 2020-12-08

Similar Documents

Publication Publication Date Title
WO2022053064A1 (fr) Method and apparatus for time series prediction
US11928600B2 (en) Sequence-to-sequence prediction using a neural network model
US10846643B2 (en) Method and system for predicting task completion of a time period based on task completion rates and data trend of prior time periods in view of attributes of tasks using machine learning models
US10540967B2 (en) Machine reading method for dialog state tracking
US11080707B2 (en) Methods and arrangements to detect fraudulent transactions
US20210142181A1 (en) Adversarial training of machine learning models
US20190130249A1 (en) Sequence-to-sequence prediction using a neural network model
US20190138887A1 (en) Systems, methods, and media for gated recurrent neural networks with reduced parameter gating signals and/or memory-cell units
US11657802B2 (en) Utilizing a dynamic memory network for state tracking
CN110663049B (zh) 神经网络优化器搜索
WO2018175972A1 (fr) Optimisation de placement de dispositif avec apprentissage de renforcement
US20210303970A1 (en) Processing data using multiple neural networks
US20210374544A1 (en) Leveraging lagging gradients in machine-learning model training
US11651212B2 (en) System and method for generating scores for predicting probabilities of task completion
US20220391706A1 (en) Training neural networks using learned optimizers
CN116091110A (zh) 资源需求量预测模型训练方法、预测方法及装置
CN108475346B (zh) 神经随机访问机器
CN112243509A (zh) 从异构源生成数据集用于机器学习的系统和方法
US20230289634A1 (en) Non-linear causal modeling based on encoded knowledge
CN116993185A (zh) 时间序列预测方法、装置、设备及存储介质
EP4231202A1 (fr) Appareil et procédé de traitement de données
US20200302303A1 (en) Optimization of neural network in equivalent class space
US20190065987A1 (en) Capturing knowledge coverage of machine learning models
KR102653418B1 (ko) Method and apparatus for time series data prediction combining rank pattern matching and LSTM
US11934384B1 (en) Systems and methods for providing a nearest neighbors classification pipeline with automated dimensionality reduction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21866116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21866116

Country of ref document: EP

Kind code of ref document: A1