CN112348269A - Time series prediction modeling method of fusion graph structure - Google Patents
- Publication number
- CN112348269A CN112348269A CN202011256308.4A CN202011256308A CN112348269A CN 112348269 A CN112348269 A CN 112348269A CN 202011256308 A CN202011256308 A CN 202011256308A CN 112348269 A CN112348269 A CN 112348269A
- Authority
- CN
- China
- Prior art keywords
- sequence
- time sequence
- data
- time
- individual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Human Resources & Organizations (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Evolutionary Biology (AREA)
- Game Theory and Decision Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Entrepreneurship & Innovation (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a time series prediction modeling method that fuses graph structure, comprising four stages: data processing, feature extraction, feature fusion, and model prediction. The method consumes two main types of data. The first is the sequence information of the time series itself: the recorded sequence of each individual's target value at each node as it changes over time. The second is event information that influences the sequence: the events affecting an individual at the corresponding statistical time node, recorded as text along the time series. The invention realizes a modeling method for domain relations, expressing multiple kinds of association relations as a multi-dimensional matrix, producing vector representations of the numeric and text information, and then using a graph convolutional neural network to fuse features within the complex network formed by many individuals.
Description
Technical Field
The invention relates to a time series prediction modeling method that fuses graph structure, and belongs to the technical field of time series prediction.
Background
Time series prediction is a technical research method that forecasts a future series from a past one; specifically, when the prediction target is a series of values, a time series is formed by the change of those values over time. Time series methods fall into two major categories: statistical, autoregression-based methods and neural-network-based methods. Statistical methods have limited capacity to represent time series features and struggle to model nonlinear relations among time series variables, so machine learning methods generally outperform them. Combining deep learning models to handle the extreme uncertainty of time series in specific domains yields stronger model adaptability, feature fusion capability, and robustness to interference, and is the main trend in the future development of time series prediction.
The autoregressive integrated moving average model (ARIMA) is built on the premise that a past time series influences the future one, so mastering the regularities of the historical series allows prediction of the future. ARIMA is a composite of the autoregressive model (AR), the moving average model (MA), and the autoregressive moving average model (ARMA). A time series prediction method based on ARIMA requires a stationarity test; if the series is non-stationary, it must first be stabilized by taking logarithms, differencing, and similar transformations.
The development of deep-learning-based time series prediction mainly benefits from the strong sequence modeling capability of the recurrent neural network (RNN), which has been demonstrated in time series modeling across many fields. The LSTM recurrent neural network achieves memory over long sequence structures by adding input, forget, and output gates, and is the most widely applied basic neural network model in the time series prediction field.
A time series reflects the operating mechanism of the objective world, and real-world time series are often influenced by multiple factors, particularly special events and the association relations among sequences. The following reviews the development of prediction models that incorporate information beyond the timing signal itself.
With the development of natural language processing technology, using text events to assist time series prediction has become an effective method: the chain reaction caused by special events can be quantitatively represented by collecting news about those events and processing it with NLP techniques. Vectorized representation of text events mainly builds on word embedding; for example, the Word2Vec method can train large-scale word vectors, and the fastText network can also be used for semantic learning of text.
Individuals in a time series have causal links among them, yet before graph neural network techniques emerged, deep learning tools for modeling structural features were not mature. Individuals exert significant promoting or diminishing effects on one another, and conventional deep learning models have no structures or features that combine with such graph relation data. The core development here is the powerful graph data modeling capability of the graph convolutional network (GCN) in node classification and link prediction, followed by a series of related graph neural networks.
One prior art related to the present invention is the time series model based on the recurrent neural network LSTM, the long short-term memory network, whose main purpose is to solve the gradient vanishing and gradient explosion problems in long-sequence RNN training; both problems hinder the deep learning idea of iteratively eliminating training error through gradient descent optimization. Like the RNN, the LSTM has inputs x_t and outputs y_t, and in addition carries two transmission states, c_t and h_t. The LSTM is mainly characterized by three gate structures (an input gate, a forget gate, and an output gate) that control state transmission, memorize important long-term information after weight learning, and forget unimportant information. The second prior art related to the invention is the vector representation of time-stamped text; typical methods include TF-IDF text vectorization and word embedding, with word2vec as a representative, while fastText improves on word2vec and accelerates training. The word embedding vectorization technique is the focus here; the word2vec and fastText models are very similar.
In the prior art, information is used incompletely and the factors driving time series change are not considered. Conventional techniques rely excessively on change regularities discovered from the historical series, but that influence weakens gradually; a model can remain effective and accurate only when the time series is combined with its specific environmental conditions.
the prior art is mainly divided into time series prediction according to data factors and time series prediction considering event influence factors. Correlation models predicted from time series of data factors can have a good fit to trends in the short term, but cannot predict the points of mutation that arise from particular events. The time sequence prediction considering the influence of event factors mainly analyzes the individual news texts and the popular cross-talk patterns and ignores the comprehensive effect of the event texts on the time sequence. In addition, domain knowledge in time series can also have complex effects on the sequence of events, for example, individual a has an effect on individual B, and individual B has an effect on individual C, and neither of the previous two broad categories of methods consider the effects exerted by this relationship.
Addressing the prior art's lack of domain knowledge and its inability to capture changes in association relations, the invention aggregates inter-individual association features within a complex network through graph neural network technology, so that different kinds of association relations can be exploited while fusing domain relation knowledge.
Disclosure of Invention
The invention aims to provide a time series prediction modeling method that fuses graph structure: it uses a graph matrix to represent complex relations and a graph neural network to aggregate feature-represented composites of sequence values and event texts, thereby describing the graph-structured data that appears in time series, and obtains a model that combines sequence data and event text to aggregate information within a complex network.
To achieve this purpose, the invention provides the following technical scheme. A time series prediction modeling method fusing graph structure comprises the following steps: 1) process and vectorize the time-series-related data, which divides into sequence numeric data, news event text, and a graph structure matrix; 2) organize each individual's sequence data and news text into independent features for extraction, then use graph neural network technology to aggregate and update each individual's feature representation through the association relations in the graph structure data; the resulting representation is an individual feature computed through the complex network, containing the numeric information of the time series and the text event features of the corresponding individual; perform error-based learning between these composite group features and the time series targets, continuously reducing the error to obtain the final time series prediction model that fuses sequence values, text, and graph structure.
Compared with the prior art, the time series prediction modeling method of the fusion graph structure provides the following beneficial effects: by using data to represent association relations, the diversity of information available for time series prediction is increased, improving the prediction effect.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 depicts the aggregation of individual sequence features of the invention.
FIG. 3 is a diagram illustrating the prediction method for a group time series according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Referring to FIGS. 1-2, an embodiment of the present invention is shown. The flow chart of the technical method is shown in FIG. 1; the method mainly comprises a data processing stage, feature extraction, feature fusion, and model prediction. The method uses two major types of data: the first is the sequence information of the time series itself, the recorded sequence of each individual's target value at each node as it changes over time; the second is event information influencing the sequence, the events affecting an individual at the corresponding statistical time node, recorded as text along the time series.
The purpose of conventional time series prediction is, given an input Z = {z_t1, z_t2, ..., z_tp}, to predict a sequence of q future time steps Y = {z_tp+1, z_tp+2, ..., z_tp+q}. Each step's value z_t is an N-dimensional subsequence representing the sequence data of N individuals at time step t, so an equivalent task is to process the sequences of multiple individuals at each time step. Here the prediction target is unchanged, the next q steps are still predicted, but the input now includes both numeric data and event data E = {e_t1, e_t2, ..., e_tp}, where each step's event data e_t is an N-dimensional subsequence giving the event text descriptions of the N individuals, e.g.
e_t = [event_1 = {ΣE}, event_2 = {ΣE}, ..., event_N = {ΣE}]
where each event_i represents the set of events influencing individual i; since the number of events influencing individual i is uncertain, those events are concatenated directly as text to form the overall set of event influences on individual i.
From the above two descriptions, the aggregate describing each individual's sequence features can be obtained, and X = [Z, E] is used to represent the aggregated individual information.
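As a minimal sketch of this aggregation step (function and variable names are illustrative, not from the patent), each individual's variable-length event set can be joined into one text and paired with the individual's numeric value at the same time step:

```python
# Sketch of building X = [Z, E] at one time step: merge each individual's
# event texts, then pair the merged text with the individual's value.

def aggregate_events(events_per_individual):
    """Concatenate each individual's variable-length event set into one text."""
    return [" ".join(events) for events in events_per_individual]

def build_step_features(values, events_per_individual):
    """Pair each individual's numeric value with its merged event text."""
    texts = aggregate_events(events_per_individual)
    return list(zip(values, texts))
```

Because the number of events per individual is uncertain, plain text concatenation, as the description states, keeps the representation uniform across individuals.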
and then, respectively processing the spliced two types of data to meet the requirements of input and calculation of a subsequent model. Both continuous and discrete features of the time series data may be included in the pure sequence data. For continuous features, there may be a plurality of features, and a normalization processing method is adopted to map data onto a [0,1] interval uniformly, and a specific formula is as follows:
the numerical characteristic can be converted to the interval of 0-1, and the problem of model defects caused by non-uniform characteristic scales is solved. For the class type feature, ont-hot code (also called one-bit effective code) is used for digitizing, one-hot code is used for mapping the class classification value to an integer value, for example, 2 class identifiers are available, and the two classes can be represented by [0,1] and [1,0] respectively, so as to facilitate input into the neural network. This results in the processing of continuous type feature data.
For the event data in each individual's aggregated data, word vectors are trained with fastText and then composed into sentence vectors. A processed sentence becomes a sequence of specific vectors; for example, ["today", "snow", "down", "number", "stop"] becomes [x_1, x_2, x_3, x_4, x_5], where each x vector has a fixed dimension and the vector of each word is obtained by querying the lookup table formed by the fastText weight matrix.
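One common way to compose word vectors into a sentence vector is simple averaging over the lookup table; this sketch assumes that composition rule, which the patent does not specify:

```python
def sentence_vector(tokens, lookup, dim):
    """Average the word vectors found in the lookup table; zeros if none match."""
    vecs = [lookup[t] for t in tokens if t in lookup]
    if not vecs:
        return [0.0] * dim
    # zip(*vecs) iterates over vector dimensions (columns)
    return [sum(col) / len(vecs) for col in zip(*vecs)]
```

Here `lookup` stands in for the table formed by the fastText weight matrix; in practice it would be the trained embedding matrix indexed by the vocabulary.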
The key data processing method of the invention is the modeling of association relations among N enterprises. The method uses an N × N matrix to represent one association relation, with k denoting the number of screened association relation types; the matrix A_i formed by one relation r_i is such an N × N adjacency matrix.
and the obtained relation matrixes with k relations can be obtained in total, and are spliced together to form a relation matrix with the dimension of k × N, and the relation matrix is transmitted to a graph neural network for learning.
For the vectors obtained in the previous step, N vectors compounding the numeric features and the event text features represent the fusion features of the N individuals; feature sequence learning through an LSTM network yields the N new feature vectors X = (x_1, x_2, ..., x_N). The inputs to the graph convolutional network are X and A, where A is the relation matrix obtained above, and the feature vectors are updated over the graph relations by the graph convolution propagation rule.
new feature vectors aggregated by neural network of graphAs the final feature vector, then the dimension reduction of the feature is carried out next to a full connection layer, so that the sequence [ y ] is finally obtained1,y2,...yN]As a prediction of the population time series.
The model is trained by continuously reducing the error against the labeled trend values of the time series via gradient descent, with backpropagation used to find the optimal solution. Experimental verification shows that this new model, fusing data and text in a complex spatial domain, achieves good performance.
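The training principle, gradient descent with backpropagation driving the error down, can be illustrated on a toy one-dimensional model (not the patent's network; purely a sketch of the optimization loop):

```python
def train_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # gradients of mean((w*x + b - y)^2) with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w  # step against the gradient
        b -= lr * grad_b
    return w, b
```

In the full model the same loop runs over the network's weights, with the gradients supplied by backpropagation instead of these closed-form expressions.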
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (1)
1. A time series prediction modeling method of a fusion graph structure, characterized by: 1) processing and vectorizing the time-series-related data, which divides into sequence numeric data, news event text, and a graph structure matrix; 2) organizing each individual's sequence data and news text into independent features for extraction, then using graph neural network technology to aggregate and update each individual's feature representation through the association relations in the graph structure data, the resulting representation being an individual feature computed through the complex network, containing the numeric information of the time series and the text event features of the corresponding individual; performing error-based learning between these composite group features and the time series targets, continuously reducing the error to obtain the final time series prediction model fusing sequence values, text, and graph structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011256308.4A CN112348269A (en) | 2020-11-11 | 2020-11-11 | Time series prediction modeling method of fusion graph structure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011256308.4A CN112348269A (en) | 2020-11-11 | 2020-11-11 | Time series prediction modeling method of fusion graph structure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112348269A true CN112348269A (en) | 2021-02-09 |
Family
ID=74363499
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011256308.4A Pending CN112348269A (en) | 2020-11-11 | 2020-11-11 | Time series prediction modeling method of fusion graph structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112348269A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113190734A (en) * | 2021-04-27 | 2021-07-30 | 中国科学院计算技术研究所 | Single-platform-based network event popularity prediction method and system |
CN113515568A (en) * | 2021-07-13 | 2021-10-19 | 北京百度网讯科技有限公司 | Graph relation network construction method, graph neural network model training method and device |
US20220269936A1 (en) * | 2021-02-24 | 2022-08-25 | International Business Machines Corporation | Knowledge graphs in machine learning decision optimization |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07234895A (en) * | 1994-02-22 | 1995-09-05 | Nippon Telegr & Teleph Corp <Ntt> | Time series forcasting method |
CN110222149A (en) * | 2019-05-17 | 2019-09-10 | 华中科技大学 | A kind of Time Series Forecasting Methods based on news public sentiment |
EP3564889A1 (en) * | 2018-05-04 | 2019-11-06 | The Boston Consulting Group, Inc. | Systems and methods for learning and predicting events |
CN111367961A (en) * | 2020-02-27 | 2020-07-03 | 西安交通大学 | Time sequence data event prediction method and system based on graph convolution neural network and application thereof |
-
2020
- 2020-11-11 CN CN202011256308.4A patent/CN112348269A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07234895A (en) * | 1994-02-22 | 1995-09-05 | Nippon Telegr & Teleph Corp <Ntt> | Time series forcasting method |
EP3564889A1 (en) * | 2018-05-04 | 2019-11-06 | The Boston Consulting Group, Inc. | Systems and methods for learning and predicting events |
CN110222149A (en) * | 2019-05-17 | 2019-09-10 | 华中科技大学 | A kind of Time Series Forecasting Methods based on news public sentiment |
CN111367961A (en) * | 2020-02-27 | 2020-07-03 | 西安交通大学 | Time sequence data event prediction method and system based on graph convolution neural network and application thereof |
Non-Patent Citations (1)
Title |
---|
LI Haotian et al.: "Fusion prediction method of single time-series feature graph convolutional network", Computer and Modernization * 
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220269936A1 (en) * | 2021-02-24 | 2022-08-25 | International Business Machines Corporation | Knowledge graphs in machine learning decision optimization |
CN113190734A (en) * | 2021-04-27 | 2021-07-30 | 中国科学院计算技术研究所 | Single-platform-based network event popularity prediction method and system |
CN113515568A (en) * | 2021-07-13 | 2021-10-19 | 北京百度网讯科技有限公司 | Graph relation network construction method, graph neural network model training method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112348269A (en) | Time series prediction modeling method of fusion graph structure | |
CN112560432B (en) | Text emotion analysis method based on graph attention network | |
CN113905391B (en) | Integrated learning network traffic prediction method, system, equipment, terminal and medium | |
CN110956309A (en) | Flow activity prediction method based on CRF and LSTM | |
JPH07121495A (en) | Construction method of expert system by using one or more neural networks | |
CN116402352A (en) | Enterprise risk prediction method and device, electronic equipment and medium | |
CN117236677A (en) | RPA process mining method and device based on event extraction | |
CN114841072A (en) | Differential fusion Transformer-based time sequence prediction method | |
CN113779988A (en) | Method for extracting process knowledge events in communication field | |
CN113935489A (en) | Variational quantum model TFQ-VQA based on quantum neural network and two-stage optimization method thereof | |
Wang et al. | A transformer-based multi-entity load forecasting method for integrated energy systems | |
CN117314140A (en) | RPA process mining method and device based on event relation extraction | |
CN116502774A (en) | Time sequence prediction method based on time sequence decomposition and Legend projection | |
Watts et al. | Local score dependent model explanation for time dependent covariates | |
CN114841063A (en) | Aero-engine residual life prediction method based on deep learning | |
CN113537710A (en) | Artificial intelligence-based activity time sequence online prediction method under data driving | |
CN112348275A (en) | Regional ecological environment change prediction method based on online incremental learning | |
CN117093727B (en) | Time sequence knowledge graph completion method based on time relation perception | |
CN111158640B (en) | One-to-many demand analysis and identification method based on deep learning | |
CN117010459B (en) | Method for automatically generating neural network based on modularization and serialization | |
CN117667606B (en) | High-performance computing cluster energy consumption prediction method and system based on user behaviors | |
CN117371594A (en) | Time sequence prediction method based on neural network | |
Hu et al. | Fast Incremental Data Recognition Method Based on TCN Network | |
Krishna et al. | Strategic Integration Of Business Intelligence And Risk Management For Financial Institutions Using A Deep Random Forest Approach | |
CN118278489A (en) | Multistage model hierarchical calling and supervised learning optimization method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |