CN112508170A - Multi-correlation time sequence prediction system and method based on generation countermeasure network - Google Patents

Multi-correlation time sequence prediction system and method based on generation countermeasure network

Info

Publication number
CN112508170A
CN112508170A
Authority
CN
China
Prior art keywords
time series
time sequence
network
matrix
interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011299519.6A
Other languages
Chinese (zh)
Inventor
吴伟杰
黄芳
吴琪
欧阳洋
禹克强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202011299519.6A priority Critical patent/CN112508170A/en
Publication of CN112508170A publication Critical patent/CN112508170A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a multi-correlation time series prediction system and method based on a generative adversarial network, providing an end-to-end solution to the prediction problem of multiple correlated time series that widely exists in real life. In such problems, the complex interaction relationships among the time series are hidden in the data and cannot be directly extracted by conventional methods. The invention generates the interaction relationship with one generator, obtains the predicted values with a second generator, and optimizes the generated interaction relationship through the discriminator. Extracting the interaction directly from the data in this way avoids reliance on additional prior knowledge.

Description

Multi-correlation time sequence prediction system and method based on generation countermeasure network
Technical Field
The invention relates to the technical field of time series prediction, in particular to a multi-correlation time series prediction system and method based on a generation countermeasure network.
Background
Time series data is present in all areas of society, the economy and everyday life, and analyzing and studying time series has great value and significance. Historical records of quantities that change over time, such as precipitation, electricity consumption, sales of goods and stock price fluctuations, are typical time series data. Predicting the future trend of a time series by discovering its patterns of change is of great significance for rationally utilizing water resources, effectively organizing production, reducing inventory, improving returns, and so on. Research on time series prediction has long been a focus of many scholars at home and abroad. The ARIMA model (Autoregressive Integrated Moving Average model) is widely used for time series prediction, but it is based on the linear regression theory of statistics; as a linear model, it has obvious shortcomings when fitting data with complex patterns. In recent years, with the development of artificial intelligence, many machine learning and deep learning methods, such as support vector machines and neural networks, have also been applied to time series prediction. These models have strong fitting ability and are well suited to predictive analysis of complex time series data. To date, most such studies have focused on single-time-series prediction problems and have achieved good results and applications. In the real world, however, the problems encountered tend to be more complex, and the object of study often contains multiple time series.
Compared with single-time-series prediction, in the prediction of multiple time series one must consider not only the temporal relationships contained within each time series but also the mutual influence among the series, which is called the Interaction Relationship. At present, most time series prediction methods have difficulty capturing the influence of the complex interaction relationships among different time series on the prediction result, so a satisfactory prediction effect is often not achieved. For time series prediction with multiple correlated sequences, effectively capturing the interaction relationships hidden among the time series is therefore the biggest challenge in solving the problem.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a multi-correlation time series prediction system and method based on a generative adversarial network, which can simultaneously capture the complex interaction relationships among multiple time series and the temporal relationships within each series, and which has unique advantages in prediction tasks over multiple correlated time series.
In a first aspect of the present invention, a multi-correlation time series prediction system based on generation of a countermeasure network is provided, including:
an interaction matrix generator for mapping an original random vector into an interaction matrix;
a predicted value generator for obtaining an intermediate feature representation from the time series interaction graph using a graph convolution network, and processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series; the time series interaction graph is generated from the interaction matrix and the time series feature matrix;
a time series discriminator trained on false time series samples and real time series samples, the trained discriminator feeding gradient information back to the interaction matrix generator and the predicted value generator; a false time series sample is generated by appending the predicted value to the original time series feature vector, and a real time series sample is generated by appending the real value to the original time series feature vector.
According to the embodiment of the invention, at least the following technical effects are achieved:
compared with other existing time series prediction models, the system can simultaneously capture the complex interaction relationships among multiple time series and the temporal relationships within each series, giving it unique advantages in prediction tasks over multiple correlated time series.
In multiple correlated time series prediction problems, the complex interaction relationships among the time series are hidden in the data and cannot be directly extracted by conventional methods. The system generates the interaction relationship with one generator, obtains the predicted values with a second generator, and optimizes the generated interaction relationship through the discriminator. Extracting the interaction directly from the data in this way avoids reliance on additional prior knowledge.
According to some embodiments of the invention, the interaction matrix generator comprises a transposed convolutional network.
According to some embodiments of the invention, the recurrent neural network is a long-short term memory network.
According to some embodiments of the invention, the network depth of the graph convolution network is set to 3 layers or 4 layers.
In a second aspect of the present invention, a multi-correlation time series prediction method based on a generative adversarial network is provided, applied to a multi-correlation time series prediction system based on a generative adversarial network, the system comprising an interaction matrix generator, a predicted value generator and a time series discriminator, which are pairwise connected to one another;
the method comprises the following steps:
mapping an original random vector into an interaction matrix through the interaction matrix generator;
constructing a time series interaction graph from the interaction matrix and the time series feature matrix;
obtaining an intermediate feature representation from the time series interaction graph using a graph convolution network in the predicted value generator, and processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series;
training the time series discriminator on false time series samples and real time series samples, and feeding gradient information back to the interaction matrix generator and the predicted value generator through the trained discriminator; a false time series sample is generated by appending the predicted value to the original time series feature vector, and a real time series sample is generated by appending the real value to the original time series feature vector.
According to the embodiment of the invention, at least the following technical effects are achieved:
compared with other existing time series prediction models, the method can simultaneously capture the complex interaction relationships among multiple time series and the temporal relationships within each series, giving it unique advantages in prediction tasks over multiple correlated time series.
In multiple correlated time series prediction problems, the complex interaction relationships among the time series are hidden in the data and cannot be directly extracted by conventional methods. The method generates the interaction relationship with one generator, obtains the predicted values with a second generator, and optimizes the generated interaction relationship through the discriminator. Extracting the interaction directly from the data in this way avoids reliance on additional prior knowledge.
According to some embodiments of the invention, the mapping of the original random vector into the interaction matrix comprises the steps of:
and mapping the original random vector into a three-dimensional characteristic representation through a full-connection layer, processing the three-dimensional characteristic representation by using a transposed convolution layer to obtain an output matrix, and carrying out symmetrical processing on the output matrix to obtain the interactive matrix.
According to some embodiments of the invention, the recurrent neural network is a long-short term memory network.
According to some embodiments of the invention, the network depth of the graph convolution network is set to 3 layers or 4 layers.
In a third aspect of the present invention, there is provided a multi-correlation time series prediction device based on a generative adversarial network, comprising: at least one control processor and a memory communicatively connected with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the multi-correlation time series prediction method according to the second aspect of the invention.
In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the multi-correlation time series prediction method according to the second aspect of the invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a time series interaction diagram including five time series provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multi-correlation time series prediction system based on generation of a countermeasure network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a workflow of an interaction matrix generator according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a workflow of a predictive value generator according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a time series discriminator according to an embodiment of the present invention;
FIG. 6 is a graph of experimental data of the impact of an interaction matrix generator on performance according to an embodiment of the present invention;
FIG. 7 is a graph of experimental data of the impact of GCN network depth on performance provided by an embodiment of the present invention;
fig. 8 is a schematic diagram of a multi-correlation time series prediction apparatus based on generation of a countermeasure network according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
For ease of understanding, the multiple correlated time series prediction problem is first introduced:
suppose that the subject includes n time series T1,T2,…,TnSince the time required to be predicted is T +1, the data characteristic that can be obtained is T1 [t-w+1,t],T2 [t-w+1,t],…,Tn [t-w+1,t]Representing the historical sequence values of each sequence within a sliding window of length w. The goal of multiple correlation time series prediction is to train a model f to map the data features to the predicted value of each time series at time t + 1:
[T1 t+1,T2 t+1,…,Tn t+1]=f(T1 [t-w+1,t],T2 [t-w+1,t],…,Tn [t-w+1,t]) (1)
Time series feature vector T_i^[t-w+1,t]: the feature vector formed by the historical values of a time series covered by a sliding window of length w at time t. The feature vector of the i-th time series has the form:

T_i^[t-w+1,t] = [T_i^{t-w+1}, T_i^{t-w+2}, …, T_i^{t}]    (2)
Time series feature matrix X_{n×w}: formed from the n time series feature vectors, with each row of the matrix corresponding to one time series feature vector. The number of columns of the matrix equals the sliding window length w, and the number of rows equals the number of time series n, as shown below:

X_{n×w} = [ T_1^{t-w+1}  T_1^{t-w+2}  …  T_1^{t}
            T_2^{t-w+1}  T_2^{t-w+2}  …  T_2^{t}
            ⋮
            T_n^{t-w+1}  T_n^{t-w+2}  …  T_n^{t} ]    (3)
Time series interaction graph G: the interaction relationships of the multiple time series are represented as a weighted undirected graph G = (V, E). V is the node set, with each node corresponding to a time series feature vector, so the node set V can be described by the time series feature matrix X_{n×w}. The weighted edge set E represents the weighted adjacency relations among the nodes of the interaction graph and describes the interaction relationships among the different time series. The adjacency matrix of the interaction graph, denoted A_{n×n}, is called the interaction matrix; each matrix element corresponds to an edge in E. The value range of A_{n×n} is defined as in formula (4). Fig. 1 shows a time series interaction graph comprising five time series.

A_{ij} ∈ [0, 1],  A_{ij} = A_{ji},  1 ≤ i, j ≤ n    (4)
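To make the definitions above concrete, the following is a minimal NumPy sketch of building the feature matrix X_{n×w} of equation (3) from raw series and pairing it with a symmetric interaction matrix A_{n×n}. The function name, the toy data and the example weights are illustrative choices, not taken from the patent.

```python
import numpy as np

def feature_matrix(series_list, t, w):
    """Build the time series feature matrix X (n x w) of equation (3):
    row i holds the values of series i inside the window [t-w+1, t]."""
    # 0-based indexing: the window covers positions t-w+1 .. t inclusive
    return np.stack([s[t - w + 1 : t + 1] for s in series_list])

# Toy data: n = 3 series of length 10, values chosen arbitrarily.
n, w, t = 3, 4, 7
series = [np.arange(10) * (i + 1) for i in range(n)]
X = feature_matrix(series, t, w)
print(X.shape)  # (3, 4)

# A symmetric interaction matrix A (n x n) plays the role of the weighted
# adjacency matrix of the interaction graph G = (V, E) in equation (4).
A = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.5],
              [0.1, 0.5, 0.0]])
assert np.allclose(A, A.T)  # undirected graph => symmetric adjacency
```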
A first embodiment;
A multi-correlation time series prediction system based on a generative adversarial network is provided; its overall architecture is shown in FIG. 2. The system comprises an interaction matrix generator, a predicted value generator and a time series discriminator, denoted G_M, G_P and D respectively, and is a deep learning model based on a generative adversarial network.
The interaction matrix generator G_M maps the original random vector into an interaction matrix.
As an alternative embodiment, the generator G_M comprises a transposed convolution network. As shown in FIG. 3, a fully connected layer maps the original random vector into a three-dimensional feature representation, a transposed convolution layer processes this representation to obtain an output matrix, and the output matrix is symmetrized to obtain the interaction matrix. The interaction matrix represents the interaction relationships between the different time series.
The role of the generator G_M is to produce a two-dimensional matrix. This embodiment implements G_M with a transposed convolution network. The roles of transposed convolution and ordinary convolution are opposite: transposed convolution converts a coarse-grained representation into a fine-grained one, equivalent to an upsampling method. A transposed convolution network has the properties of local connectivity and convolution-kernel parameter sharing, so compared with a fully connected neural network it greatly reduces the number of parameters and is more efficient on large-scale data. As shown in FIG. 3, a high-dimensional random noise vector sampled from a Gaussian distribution serves as the input of the generator G_M. The noise vector is first mapped through a fully connected layer into a three-dimensional feature representation whose three dimensions are length, width and number of channels. The transposed convolution layers then process this feature representation; with each transposed convolution layer, the number of channels decreases while the length and width increase, and the network finally outputs a tensor of dimension n × n × 1, where n is the number of time series to be processed. The interaction matrix is obtained from the network output by a matrix symmetrization operation, shown in equation (5), where O is the output matrix and A is the resulting symmetric matrix:

A = (O + O^T) / 2    (5)
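As a hedged illustration of this pipeline, the sketch below replaces the fully connected + transposed-convolution stack of FIG. 3 with a single dense layer (the shape bookkeeping of a real transposed-convolution stack is framework-specific), keeping the essential contract: a Gaussian noise vector z goes in, and a symmetric n × n interaction matrix comes out via the symmetrization of equation (5). All sizes and initializations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5       # number of time series
z_dim = 16  # dimension of the input noise vector (illustrative)

def generate_interaction_matrix(z, W, b):
    """Map a noise vector z to a symmetric n x n interaction matrix.
    A single dense layer stands in for the fully connected layer plus
    transposed convolution stack; the last step is equation (5)."""
    O = np.tanh(W @ z + b).reshape(n, n)  # raw n x n output matrix O
    return (O + O.T) / 2.0                # A = (O + O^T) / 2, symmetric

W = rng.standard_normal((n * n, z_dim)) * 0.1
b = np.zeros(n * n)
z = rng.standard_normal(z_dim)            # noise sampled from a Gaussian
A = generate_interaction_matrix(z, W, b)
assert np.allclose(A, A.T)                # interaction matrix is symmetric
```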
The predicted value generator G_P obtains an intermediate feature representation from the time series interaction graph using a graph convolution network, and processes the intermediate feature representation with a recurrent neural network to obtain a predicted value for each time series. The time series interaction graph is generated from the interaction matrix and the time series feature matrix.
In multiple correlated time series prediction problems, two kinds of dependency must be handled: 1) the interaction relationships existing between the time series; and 2) the temporal relationships existing within each time series. The interaction relationship is obtained through the generator G_M; the purpose of the generator G_P is to process both kinds of dependency together. The workflow of G_P is shown in FIG. 4. First, the interaction matrix generated by G_M is used as the adjacency matrix of the time series interaction graph, the time series feature matrix is used as the node feature matrix, and the interaction graph is constructed from these two matrices. The time series feature vector on each node contains the temporal relationships within that series, while the weighted edges between nodes contain the interaction relationships among the series. Then, processing the interaction graph with a Graph Convolution Network (GCN) yields an intermediate feature representation. From the perspective of graph embedding, the GCN embeds the topology information of the interaction graph, i.e. the information carried by the interaction edges, into the output feature representation. The features obtained by the GCN therefore contain two kinds of information: 1) the information in the time series feature matrix, containing the temporal relationships within each series; and 2) the information in the interaction matrix, containing the interaction relationships among the series. Finally, a recurrent neural network processes the intermediate feature representation to generate the final predicted values.
As an alternative implementation, the process of obtaining the intermediate feature representation via the GCN is as follows.

The generator G_P models the interaction between the time series with a GCN. The graph convolution layer used in G_P is given by equation (6):

H = σ( D̂^{-1/2} Â D̂^{-1/2} X W )    (6)

where Â = A + I, A is the adjacency matrix obtained by symmetrizing the interaction matrix, and I is the n-dimensional identity matrix. Converting A into Â avoids losing the original information of each node during the graph convolution operation; it is equivalent to adding a self-loop edge to every node. The matrix D̂ is the degree matrix corresponding to Â: its main-diagonal elements are D̂_{ii} = Σ_j Â_{ij}, and the elements in all other positions are 0. Multiplying Â on the left and right by D̂^{-1/2} is the normalization step of the graph convolution, which prevents the scales of the node features from becoming inconsistent after convolution. X is the time series feature matrix on the graph, each row of which is a time series feature vector; W is a trainable parameter matrix of the GCN; and H is the representation matrix obtained after the graph convolution operation.

Treat D̂^{-1/2} Â D̂^{-1/2} as a single matrix P, as in equation (7), and write the matrices X and H in row-vector form, as in equations (8) and (9):

P = D̂^{-1/2} Â D̂^{-1/2} = (p_{ij})_{n×n}    (7)

X = [x_1, x_2, …, x_n]^T    (8)

H = [h_1, h_2, …, h_n]^T    (9)

Because right-multiplication by the parameter matrix W does not change which rows are combined with which, its influence is omitted from the analysis. Each node feature vector output after one graph convolution layer is then given by equation (10): the output feature vector is in fact the weighted sum of the input feature vectors of the node itself and all of its adjacent nodes, the weight coefficients being the normalized weights of the edges between the nodes.

h_i = Σ_{j=1}^{n} p_{ij} x_j    (10)
The above analysis shows that processing the time series interaction graph with the GCN models the interaction relationships among the time series: its core is to perform a weighted fusion of the temporal feature vectors of each node and its first-order neighbors, according to the correlation strength represented by the edges between the nodes, so as to generate a new feature vector representation. The receptive field of each node in the graph convolution is its set of first-order neighbor nodes, and the larger the correlation, the larger the weight coefficient and the greater the influence on the node's newly generated feature vector.
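The propagation rule analyzed above can be sketched in a few lines of NumPy. The ReLU activation, the all-ones parameter matrix W and the toy adjacency matrix are illustrative choices, not taken from the patent.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution layer, equation (6):
    H = sigma( D^{-1/2} (A + I) D^{-1/2} X W ), with ReLU as sigma here."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                   # add self-loops: A^ = A + I
    d = A_hat.sum(axis=1)                   # degrees of A^
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    P = D_inv_sqrt @ A_hat @ D_inv_sqrt     # normalized propagation matrix
    return np.maximum(P @ X @ W, 0.0)       # ReLU activation

# Toy example: n = 3 series, window length w = 4, hidden size 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
X = np.arange(12, dtype=float).reshape(3, 4)
W = np.ones((4, 2))
H = gcn_layer(A, X, W)
print(H.shape)  # (3, 2)
```

Each row of H is a weighted sum of the input rows of the node and its first-order neighbors, exactly the behavior described by equation (10).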
As an alternative embodiment, the recurrent neural network used by the generator G_P is a long short-term memory network (LSTM). The GCN processes the time series interaction graph to obtain an intermediate feature representation that contains both the temporal relationships within each time series and the complex interaction relationships among the series. Having extracted the interaction relationships, the generator G_P next needs to generate future predicted values from the intermediate feature representation, i.e. to extract the temporal relationships. Recurrent neural networks are well suited to processing sequence data, and the LSTM is a typical representative of them; this embodiment uses an LSTM to extract the temporal relationships and generate a future predicted value for each time series. The LSTM uses several "gate" structures to let information selectively affect the state of the recurrent network at each time step. A "gate" is a neural network with a sigmoid activation function combined with an element-wise multiplication, which will not be described in detail here. The gates of the LSTM are defined as follows:
z = tanh(W_z [h_{t-1}, x_t] + b_z)    (11)
i = sigmoid(W_i [h_{t-1}, x_t] + b_i)    (12)
f = sigmoid(W_f [h_{t-1}, x_t] + b_f)    (13)
o = sigmoid(W_o [h_{t-1}, x_t] + b_o)    (14)
c_t = f ⊙ c_{t-1} + i ⊙ z    (15)
h_t = o ⊙ tanh(c_t)    (16)
where i, f and o denote the input gate, forget gate and output gate respectively; c_t is the memory cell at time t, which can be regarded as a representation vector of the sequence information output so far; h_t is the output value at time t; and W and b denote the weight and bias parameters of each gate in the LSTM.
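Equations (11) to (16) can be sketched as a single NumPy step; the weights are randomly initialized for illustration, and a real implementation would use a deep learning framework's LSTM.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step implementing equations (11)-(16).
    [h_{t-1}, x_t] is the concatenation fed to every gate."""
    hx = np.concatenate([h_prev, x_t])
    z = np.tanh(params["Wz"] @ hx + params["bz"])  # candidate, eq (11)
    i = sigmoid(params["Wi"] @ hx + params["bi"])  # input gate, eq (12)
    f = sigmoid(params["Wf"] @ hx + params["bf"])  # forget gate, eq (13)
    o = sigmoid(params["Wo"] @ hx + params["bo"])  # output gate, eq (14)
    c_t = f * c_prev + i * z                       # memory cell, eq (15)
    h_t = o * np.tanh(c_t)                         # output, eq (16)
    return h_t, c_t

rng = np.random.default_rng(1)
d_in, d_hid = 3, 4
params = {k: rng.standard_normal((d_hid, d_hid + d_in)) * 0.1
          for k in ("Wz", "Wi", "Wf", "Wo")}
params.update({k: np.zeros(d_hid) for k in ("bz", "bi", "bf", "bo")})

h = c = np.zeros(d_hid)
for x_t in rng.standard_normal((5, d_in)):  # run over a length-5 toy sequence
    h, c = lstm_step(x_t, h, c, params)
print(h.shape)  # (4,)
```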
In the generator G_P, the GCN performs the graph convolution operation according to the interaction relationships among the time series to obtain an intermediate feature representation that fuses the temporal relationships within each series with the complex interaction relationships among the series. This feature representation is fed into the LSTM, which processes the sequences and generates a predicted value for each time series.
The time series discriminator D is trained on false time series samples and real time series samples, and the trained discriminator feeds gradient information back to the interaction matrix generator and the predicted value generator. A false time series sample is generated by appending the predicted value to the original time series feature vector; a real time series sample is generated by appending the real value to the original time series feature vector.
In GAN, the arbiter is the party playing the game with the generator, and needs to correctly distinguish the data generated by the generator from the real data. In the present system, a generator GPThe predicted value of each time sequence is generated, and if the predicted value and the true value are directly simulated in the GAN, the meaning of distinguishing the predicted value and the true value by the discriminator is not large. The system is implemented by a generator GPAnd adding the generated predicted value behind the original time sequence feature vector to construct a false time sequence sample, and adding the real value behind the original time sequence feature vector to construct a real time sequence sample. Equations (17) and (18) represent the specific forms of the dummy time-series sample and the real time-series sample, respectively.
T̃i^[t-w+1,t+1] = [Ti^(t-w+1), Ti^(t-w+2), …, Ti^t, ŷi^(t+1)]   (17)

Ti^[t-w+1,t+1] = [Ti^(t-w+1), Ti^(t-w+2), …, Ti^t, Ti^(t+1)]   (18)
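The construction of the two sample types by concatenation can be sketched directly (the tensor shapes here are illustrative assumptions):

```python
import torch

# X: (N, w) window of observed values for N series; y_true, y_pred: (N,)
X = torch.randn(4, 6)
y_true = torch.randn(4)
y_pred = torch.randn(4)

# real sample: window plus the true next value, as in eq. (18)
real_samples = torch.cat([X, y_true.unsqueeze(1)], dim=1)
# false sample: window plus the generated prediction, as in eq. (17)
fake_samples = torch.cat([X, y_pred.unsqueeze(1)], dim=1)
```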
The discriminator D must correctly distinguish the true and false time-series samples constructed as described above; its specific structure is shown in Fig. 5, and the whole discriminator has two input branches. The first is an embedding layer (Embedding), which accepts a one-hot encoded vector and outputs a low-dimensional dense vector; the one-hot vector is a very high-dimensional sparse vector identifying which time series in the data set the current sample comes from. The second is a bidirectional long short-term memory (Bidirectional LSTM) network, which accepts a time-series sample as input. A bidirectional LSTM combines two unidirectional LSTMs: at each time t, the input is fed simultaneously to the two LSTMs running in opposite directions. The two networks compute independently, each producing its own hidden state and output at that moment; apart from their directions, the two unidirectional LSTMs are completely symmetric. The output of the forward LSTM (FLSTM) at the last time step encodes the forward temporal information of the sample, the output of the backward LSTM (BLSTM) at the first time step encodes the reverse temporal information, and the output of the bidirectional network is simply the concatenation of the FLSTM and BLSTM output vectors. Finally, the vector produced by the embedding layer and the vector produced by the bidirectional LSTM are concatenated and fed into a fully-connected network, which outputs the probability that the input time-series sample is real.
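A sketch of this two-branch discriminator in PyTorch, under assumed sizes (note how the forward branch is read at the last time step and the backward branch at the first, as described above):

```python
import torch
import torch.nn as nn

class SeriesDiscriminator(nn.Module):
    """Sketch of D: series-identity embedding + bidirectional LSTM over the
    time-series sample, concatenated and scored by a fully-connected layer.
    Sizes are illustrative assumptions, not the patented configuration."""
    def __init__(self, num_series, emb_dim=8, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(num_series, emb_dim)   # one-hot id -> dense vector
        self.bilstm = nn.LSTM(1, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(emb_dim + 2 * hidden, 1)

    def forward(self, sample, series_id):
        # sample: (B, L) time-series sample; series_id: (B,) index of the series
        out, _ = self.bilstm(sample.unsqueeze(-1))     # (B, L, 2*hidden)
        fwd = out[:, -1, :out.size(-1) // 2]           # forward LSTM, last step
        bwd = out[:, 0, out.size(-1) // 2:]            # backward LSTM, first step
        z = torch.cat([self.emb(series_id), fwd, bwd], dim=1)
        return torch.sigmoid(self.fc(z)).squeeze(-1)   # probability sample is real

D = SeriesDiscriminator(num_series=10)
p = D(torch.randn(3, 7), torch.tensor([0, 4, 9]))
```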
From the overall system perspective, the generator G_M fits the data distribution of the interaction matrix, G_M(z; θ_M); the generator G_P fits the data distribution of each time series' predicted value given the time-series feature matrix and the interaction matrix, G_P(G_M(z; θ_M), X; θ_P); the final purpose of both generators is to fool the discriminator D. For each constructed time-series sample, the discriminator D outputs the probability D(y, X; θ_D) that the sample is real. G_M, G_P and D act as the two competing parties of a minimax game, whose optimization process can be expressed formally as shown in Equation (19). Finally, the complete generative adversarial algorithm used to train the system is summarized as the pseudocode in Table 1:
min_{G_M, G_P} max_D V(D, G_M, G_P) = E_{y∼p_data}[log D(y, X; θ_D)] + E_{z∼p_z}[log(1 − D(G_P(G_M(z; θ_M), X; θ_P), X; θ_D))]   (19)
[Pseudocode of the complete generative adversarial training algorithm]
TABLE 1
The overall workflow of the system can be divided into a generation process, a discrimination process and an adversarial process. In the generation process, the generator G_M first converts an original random vector into an interaction matrix, which is then combined with the time-series feature matrix to construct the time-series interaction graph. The generator G_P then extracts the interaction relations among the time-series nodes and the temporal relations within each individual time series from the interaction graph, generating a future predicted value for each time series. In the discrimination process, real and false samples are first constructed: the real time-series samples are obtained by concatenating (Concatenate) the real data of the time-series feature matrix with the real labels; the false time-series samples are obtained by concatenating the real data of the time-series feature matrix with the fake targets generated by G_P. The discriminator D is then trained using the real and false samples as a training set, and its training is complete when it can correctly distinguish them. In the final adversarial process, the trained discriminator is fixed as the evaluation function of the generators, and the parameters of G_M and G_P are adjusted so that the probability that the discriminator evaluates the false samples generated by the two generators as real is maximized.
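The three processes above can be sketched as a toy adversarial loop (the modules here are trivial stand-ins with assumed shapes, only to show the order of the generation, discrimination and adversarial updates, not the system's real networks):

```python
import torch
import torch.nn as nn

# Tiny stand-ins for G_M, G_P and D (assumed shapes, illustration only)
N, w, z_dim = 5, 6, 16
G_M = nn.Linear(z_dim, N * N)                            # noise -> interaction matrix
G_P = nn.Linear(w + N, 1)                                # features + row of A -> prediction
D = nn.Sequential(nn.Linear(w + 1, 1), nn.Sigmoid())     # sample -> P(real)
bce = nn.BCELoss()
opt_g = torch.optim.Adam(list(G_M.parameters()) + list(G_P.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

X, y_true = torch.randn(N, w), torch.randn(N, 1)
for step in range(10):
    # --- generation: random vector -> interaction matrix -> predictions ---
    A = G_M(torch.randn(z_dim)).view(N, N)
    A = 0.5 * (A + A.T)                                  # symmetrise
    y_pred = G_P(torch.cat([X, A], dim=1))               # (N, 1)
    fake = torch.cat([X, y_pred], dim=1)                 # eq.-(17)-style samples
    real = torch.cat([X, y_true], dim=1)                 # eq.-(18)-style samples

    # --- discrimination: train D to separate real from fake ---
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(N, 1)) + bce(D(fake.detach()), torch.zeros(N, 1))
    loss_d.backward()
    opt_d.step()

    # --- adversarial: update both generators to fool the fixed D ---
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(N, 1))
    loss_g.backward()
    opt_g.step()
```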
Compared with other existing time-series prediction models, the system can simultaneously capture the complex interaction relations among multiple time series and the temporal relations within each time series, giving it a distinct advantage in multi-correlation time-series prediction tasks.
In multi-correlation time-series prediction problems, the complex interaction relations among time series are hidden in the data and cannot be extracted directly by conventional methods. The system generates the interaction relations with one generator, obtains the predicted values with another, and optimizes the generated interaction relations through the discriminator. Extracting the interactions directly from the data in this way avoids reliance on additional prior knowledge. In addition, the system implements the interaction matrix generator with a transposed convolution network, which improves scalability.
A second embodiment;
providing simulation results for the above system, which comprises a generator G_M, a generator G_P and a discriminator D; G_M is composed of a transposed convolution network, and the predicted value generator G_P consists of a graph convolution network and an LSTM network.
To verify the effectiveness of the system, its prediction performance is compared with that of other reference methods on different data sets and evaluated under different metrics, thereby verifying the system's effectiveness and applicability. Experiments then investigate the influence of the system architecture on prediction performance, specifically the structure of the generator G_M and the depth of the GCN network.
Firstly, a data set;
(1) Store Item Demand Dataset, which provides daily sales records for 50 different items in 10 different stores; each sales record runs from January 1, 2013 to December 31, 2017, i.e. the data set contains 500 time series, each with a time length of 1826 days.
(2) Web Traffic Dataset, which records the traffic of Wikipedia pages over time. The entire data set contains approximately 145,000 time series, where each time series represents the daily visits to one Wikipedia page; the records run from July 1, 2015 to September 10, 2017, giving each time series a length of 804 days. The data set contains missing values, and the data used in the experiments were 500 time series selected to contain no missing values.
(3) NOAA China Dataset, which is weather data recorded by weather stations at different locations in China, provided by the National Oceanic and Atmospheric Administration (NOAA). This example extracts the daily temperatures of 400 different meteorological stations from 2015 to 2018 as experimental data.
Secondly, setting an experiment;
(1) setting system parameters;
the system is implemented using the PyTorch deep learning framework. In the generator G_M, the dimension of the random noise vector is set to 128 and its distribution follows a Gaussian distribution. In the generator G_P, the number of GCN layers is set to 3, the number of LSTM hidden layers is set to 3 and their dimension to 64; since the LSTM must ultimately produce a scalar value, its 64-dimensional output is converted to 1 dimension through a fully-connected layer. In the discriminator D, the dimension of the embedding vector is set to 8, and the bidirectional LSTM has 3 hidden layers of dimension 64. During training, the learning rate was set to 0.001, the batch size to 32, Adam was used as the optimization algorithm, and Dropout was used to avoid overfitting the model, with the Dropout rate set to 0.2.
(2) A reference method;
Autoregressive Integrated Moving Average (ARIMA): this method first smooths the time series through differencing and then predicts its future values by combining an AR model and an MA model; it is a very widely used time-series prediction method;
Vector Autoregression (VAR): this model is often used to solve multidimensional time-series prediction problems and can take into account the correlations between variables of different dimensions;
Support Vector Regression (SVR): a very well-known machine learning model with solid mathematical foundations;
LightGBM (LGB): a gradient boosting tree model proposed and implemented by Microsoft, which can solve both classification and regression problems and has shown strong predictive performance in numerous data mining competitions;
Long Short-Term Memory (LSTM): a recurrent neural network model well suited to processing sequence data;
Gated Recurrent Unit (GRU): also a recurrent neural network model, which modifies and simplifies the gate mechanism of the LSTM, improving training efficiency.
(3) a simulation result;
[Table 2 data: prediction accuracy (MAE and RMSE) of the system and the six baseline methods on the three data sets]
TABLE 2
Table 2 compares the prediction accuracy of the system and the other six methods on the Store Item, Web Traffic and NOAA China data sets. As the table shows, the system achieves the best prediction on all three data sets under both the MAE and RMSE metrics. Among the comparison methods, ARIMA is a single time-series prediction method; in a multi-correlation time-series prediction problem it ignores the interaction relations among the series, and its prediction in the experiments is the worst. VAR converts the multi-correlation prediction problem into a multidimensional time-series prediction problem; it can capture the correlations between series to some extent, but as a linear model its ability to fit data with complex patterns is limited, so its prediction is only better than ARIMA's. SVR and LGB are both excellent machine learning models with very close prediction results, LGB being slightly superior to SVR overall. LSTM and GRU are both deep learning models with very similar structures; LSTM predicts slightly better than GRU, but in terms of model training, GRU's efficiency is markedly better than LSTM's. LGB performs better than LSTM on the Store Item and NOAA China data sets, while LSTM performs better on the Web Traffic data set. Taking the whole experiment together, the system's prediction on the three data sets comprehensively surpasses the other six methods, demonstrating its clear advantage in multi-correlation time-series prediction problems.
Third, the influence of the interaction matrix generator G_M on performance;
The interaction matrix generator G_M is implemented both with a fully-connected neural network and with a transposed convolutional neural network, and the two schemes are compared to test their influence on the system's prediction performance; the results are shown in Fig. 6. The two subgraphs in each row show how the prediction performance of the system implemented with the two networks (FCN denotes the fully-connected network, TConv the transposed convolutional network) changes on one data set. The horizontal axis represents the number of time series in the data set used by the model; the vertical axis represents the prediction accuracy, with MAE as the metric in the left column of subgraphs and RMSE in the right column. The experiments show that with a small number of time series, for example between 50 and 100, the prediction performance of the models implemented with the two networks does not differ much. However, as the number of time series increases, the model using the transposed convolution network predicts better than the one using the fully-connected network, and the gap widens as the number of series grows. This advantage possibly arises because, like a convolutional network, the transposed convolutional network has local connectivity and parameter sharing and is therefore more efficient at processing two-dimensional grid data, so implementing the interaction matrix generator G_M with it yields better prediction performance.
Fourth, the influence of the GCN network depth on performance;
This experiment studies the influence of the number of GCN layers on the system's prediction performance; the specific results are shown in Fig. 7. Five-fold cross-validation on each of the three data sets measures the system's prediction performance on the training and test sets for different numbers of GCN layers; the evaluation metric in the first row of subgraphs is MAE and in the second row RMSE. Under these two metrics, the system has the best fitting ability (minimal training-set error) and the best generalization ability (minimal test-set error) when the GCN has 3 to 4 layers. With fewer than 3 layers, the system underfits the data, and both the training error and the generalization error decrease as layers are added. With more than 6 layers, the system begins to overfit, and the generalization error increases noticeably as further layers are added.
A third embodiment;
a multi-correlation time-series prediction method based on a generative adversarial network is provided, comprising the following steps:
S100, mapping an original random vector into an interaction matrix through an interaction matrix generator;

S200, constructing a time-series interaction graph according to the interaction matrix and the time-series feature matrix;

S300, obtaining an intermediate feature representation from the time-series interaction graph using a graph convolution network in a predicted value generator, and processing the intermediate feature representation with a recurrent neural network to obtain a predicted value for each time series;

S400, training a time-series discriminator with false and real time-series samples, and feeding gradient information back to the interaction matrix generator and the predicted value generator through the trained discriminator; the false time-series sample is generated by appending the predicted value to the original time-series feature vector, and the real time-series sample by appending the real value to the original time-series feature vector.
As an alternative embodiment, mapping the original random vector into an interaction matrix includes the following steps:
mapping the original random vector into a three-dimensional feature representation through a fully-connected layer, processing the three-dimensional feature representation with transposed convolution layers to obtain an output matrix, and symmetrizing the output matrix to obtain the interaction matrix.
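These steps can be sketched in PyTorch as follows (the channel counts, kernel sizes and 4×4 seed size are illustrative assumptions; with two stride-2 transposed convolutions this sketch produces a 16×16 symmetric matrix):

```python
import torch
import torch.nn as nn

class InteractionMatrixGenerator(nn.Module):
    """Sketch of G_M: fully-connected layer -> 3-D feature map ->
    transposed convolutions -> symmetrised interaction matrix.
    Sizes are assumptions, not the patented configuration."""
    def __init__(self, z_dim=128, base=4):
        super().__init__()
        self.base = base
        self.fc = nn.Linear(z_dim, 32 * base * base)   # random vector -> 3-D features
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),  # 4x4 -> 8x8
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=4, stride=2, padding=1),   # 8x8 -> 16x16
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 32, self.base, self.base)  # (B, 32, 4, 4)
        M = self.deconv(h).squeeze(1)                      # (B, 16, 16) output matrix
        return 0.5 * (M + M.transpose(1, 2))               # symmetrise

g = InteractionMatrixGenerator()
A = g(torch.randn(2, 128))
```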
As an alternative embodiment, the recurrent neural network is a long-short term memory network.
As an alternative embodiment, the network depth of the graph convolution network is set to 3 layers or 4 layers.
It should be noted that, since the embodiment of the method and the embodiment of the system are based on the same inventive concept, corresponding contents in the embodiment of the system are also applicable to the embodiment of the method, and are not described herein again.
A fourth embodiment;
referring to fig. 8, a multi-correlation time series prediction device based on generation of a countermeasure network is provided, which may be any type of smart terminal, such as a mobile phone, a tablet computer, a personal computer, and the like. Specifically, the apparatus includes: one or more control processors and memory, here exemplified by a control processor. The control processor and the memory may be connected by a bus or other means, here exemplified by a connection via a bus.
The memory, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the multi-correlation time series prediction device for generating the countermeasure network in embodiments of the present invention. The control processor implements the multi-correlation time series prediction method based on generation of the countermeasure network of the above-described method embodiments by executing non-transitory software programs, instructions, and modules stored in the memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of a multi-correlation time series prediction system based on generation of a countermeasure network, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes a memory remotely located from the control processor, and the remote memories may be connected to the device for predicting the multi-correlation time series based on the generated countermeasure network via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the multi-correlation time series prediction method based on generation of a countermeasure network in the above-described method embodiments.
Embodiments of the present invention further provide a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions are executed by one or more control processors to perform the multi-correlation time series prediction method based on generation of the countermeasure network in the above method embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a general hardware platform. Those skilled in the art will appreciate that all or part of the processes in the methods for implementing the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes in the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A multi-correlation time series prediction system based on generation of a countermeasure network, comprising:
the interactive matrix generator is used for mapping the original random vector into an interactive matrix;
the predicted value generator is used for obtaining intermediate characteristic representation from the time series interaction diagram by using a diagram convolution network and processing the intermediate characteristic representation by using a recurrent neural network to obtain a predicted value of each time series; the time series interaction graph is generated by the interaction matrix and the time series characteristic matrix;
the time sequence discriminator is used for training based on a false time sequence sample and a real time sequence sample, and the trained time sequence discriminator is used for feeding gradient information back to the interaction matrix generator and the predicted value generator; the false time sequence sample is generated by adding the predicted value to the original time sequence feature vector, and the real time sequence sample is generated by adding a real value to the original time sequence feature vector.
2. The multi-correlation time series prediction system based on generation of a countermeasure network of claim 1, wherein the interaction matrix generator comprises a transposed convolutional network.
3. The multi-correlation time series prediction system based on generation of a countermeasure network of claim 1, wherein the recurrent neural network is a long-short term memory network.
4. The multi-correlation time series prediction system based on generation of a countermeasure network of claim 1, wherein a network depth of the graph convolution network is set to 3 layers or 4 layers.
5. A multi-correlation time series prediction method based on generation of a countermeasure network, characterized in that the method is applied to a multi-correlation time series prediction system based on generation of a countermeasure network, wherein the system comprises an interaction matrix generator, a predicted value generator and a time series discriminator which are connected with one another in pairs;
the method comprises the following steps:
mapping the original random vector into an interaction matrix through the interaction matrix generator;
constructing a time sequence interaction graph according to the interaction matrix and the time sequence characteristic matrix;
obtaining an intermediate feature representation from the time series interaction diagram by using a diagram convolution network through the predicted value generator, and processing the intermediate feature representation by using a recurrent neural network to obtain a predicted value of each time series;
training the time sequence discriminator through a false time sequence sample and a real time sequence sample, and feeding gradient information back to the interaction matrix generator and the predicted value generator through the trained time sequence discriminator; and adding the predicted value to the original time sequence feature vector to generate the false time sequence sample, and adding the real value to the original time sequence feature vector to generate the real time sequence sample.
6. The method of predicting a plurality of related time series based on generation of a countermeasure network as claimed in claim 5, wherein said mapping of the original random vectors into an interaction matrix comprises the steps of:
and mapping the original random vector into a three-dimensional characteristic representation through a full-connection layer, processing the three-dimensional characteristic representation by using a transposed convolution layer to obtain an output matrix, and carrying out symmetrical processing on the output matrix to obtain the interactive matrix.
7. The method of claim 5, wherein the recurrent neural network is a long-short term memory network.
8. The multiple correlation time series prediction method based on generation of a countermeasure network according to claim 5, wherein the network depth of the graph convolution network is set to 3 layers or 4 layers.
9. A prediction device based on a plurality of correlation time series for generating a countermeasure network, comprising: at least one control processor and a memory for communicative connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a plurality of related time series prediction methods based on generation of a countermeasure network according to any of claims 5 to 8.
10. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of any one of claims 5 to 8 for predicting a plurality of correlation time series based on generation of a countermeasure network.
CN202011299519.6A 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation countermeasure network Pending CN112508170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011299519.6A CN112508170A (en) 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation countermeasure network

Publications (1)

Publication Number Publication Date
CN112508170A true CN112508170A (en) 2021-03-16


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673742A (en) * 2021-07-02 2021-11-19 华南理工大学 Distribution transformer area load prediction method, system, device and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101856170B1 (en) * 2017-09-20 2018-05-10 주식회사 모비젠 Apparatus for predicting error generation time of system based on time-series data and method thereof
CN110138595A (en) * 2019-04-12 2019-08-16 中国科学院深圳先进技术研究院 Time link prediction technique, device, equipment and the medium of dynamic weighting network
US20190384790A1 (en) * 2016-02-05 2019-12-19 Sas Institute Inc. Staged training of neural networks for improved time series prediction performance
CN111126758A (en) * 2019-11-15 2020-05-08 中南大学 Academic team influence propagation prediction method, device and storage medium
CN111475546A (en) * 2020-04-09 2020-07-31 大连海事大学 Financial time sequence prediction method for generating confrontation network based on double-stage attention mechanism
WO2020167156A1 (en) * 2019-02-12 2020-08-20 Публичное Акционерное Общество "Сбербанк России" Method for debugging a trained recurrent neural network
CN111612206A (en) * 2020-03-30 2020-09-01 清华大学 Street pedestrian flow prediction method and system based on space-time graph convolutional neural network
US20200342968A1 (en) * 2019-04-24 2020-10-29 GE Precision Healthcare LLC Visualization of medical device event processing


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
G. ZHU等: ""A Novel LSTM-GAN Algorithm for Time Series Anomaly Detection"", 《2019 PROGNOSTICS AND SYSTEM HEALTH MANAGEMENT CONFERENCE (PHM-QINGDAO)》, 27 December 2019 (2019-12-27), pages 1 - 6 *
P. ZHANG等: "VSA-CGAN: An Intelligent Generation Model for Deep Learning Sample Database Construction"", 《IEEE ACCESS》, vol. 8, 27 July 2020 (2020-07-27), pages 137986 - 138003, XP011802799, DOI: 10.1109/ACCESS.2020.3012185 *
Y. ZHANG等: ""Academic Team Influence Spreading Prediction Based on Community Representation Learning"", 《2020 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC)》, 23 September 2020 (2020-09-23), pages 199 - 204 *
梁宇轩: ""基于多层注意力神经网络的地理传感器时间序列预测"", 《中国优秀硕士学位论文全文数据库 信息科技辑》, no. 2, 15 February 2020 (2020-02-15), pages 140 - 167 *


CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination