CN117010567A - Dynamic chemical process prediction method based on static space-time diagram - Google Patents

Publication number: CN117010567A
Authority: CN (China)
Prior art keywords: time, space, feature, time sequence, matrix
Legal status: Granted; Active
Application number: CN202310983210.6A
Other languages: Chinese (zh)
Other versions: CN117010567B (en)
Inventors: 杨鑫, 魏小鹏, 朱理, 段辰明, 朱建民, 尹子涛
Current Assignee: Dalian University of Technology
Original Assignee: Dalian University of Technology
Events:
  • Application filed by Dalian University of Technology
  • Priority to CN202310983210.6A
  • Publication of CN117010567A
  • Application granted
  • Publication of CN117010567B

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
          • G06Q10/00: Administration; Management
            • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
          • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
            • G06Q50/10: Services
              • G06Q50/26: Government or public services
                • G06Q50/265: Personal security, identity or safety
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00: Computing arrangements based on biological models
            • G06N3/02: Neural networks
              • G06N3/04: Architecture, e.g. interconnection topology
                • G06N3/042: Knowledge-based neural networks; Logical representations of neural networks
                • G06N3/044: Recurrent networks, e.g. Hopfield networks
                  • G06N3/0442: Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
                • G06N3/045: Combinations of networks
                • G06N3/0464: Convolutional networks [CNN, ConvNet]
              • G06N3/08: Learning methods
          • G06N5/00: Computing arrangements using knowledge-based models
            • G06N5/01: Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
          • G06N7/00: Computing arrangements based on specific mathematical models
            • G06N7/02: Computing arrangements based on specific mathematical models using fuzzy logic
              • G06N7/06: Simulation on general purpose computers
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
          • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
            • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Educational Administration (AREA)
  • Computer Security & Cryptography (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Automation & Control Theory (AREA)
  • Fuzzy Systems (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)

Abstract

The invention belongs to the field of time-series prediction in data mining and provides a dynamic chemical process prediction method based on a static space-time diagram. A feature matrix X and an adjacency matrix A, formed from multi-dimensional time-series data collected by sensors over a period of time, are taken as input data; a mapping function is learned and used to compute the state of the production process over a future period T. The method makes full use of the temporal dependence and the spatial dependence among the multi-dimensional time-series variables, and fuses the two, to predict the chemical process. Temporal features are extracted with a GRU network and spatial features with a GCN network. For the graph structure of the GCN network, the association among the multi-dimensional time-series variables is quantified by mutual information, from which a static graph is designed and constructed.

Description

Dynamic chemical process prediction method based on static space-time diagram
Technical Field
The invention relates to the field of time sequence prediction in data mining, in particular to a chemical process dynamic prediction method based on a static space-time diagram.
Background
Time-series prediction builds an analysis model that mines the trends in historical data in order to predict future data. It is very widely applied, for example to rainfall forecasting, traffic flow, finance, the sales operations of commercial companies, drug response, and the operating load of various systems.
With the spread of industrial automation, the monitoring sensors on a production line are connected to an automatic control system through a communication bus, so large amounts of historical data accumulate in databases and drive the chemical industry into the era of big-data mining. Data-driven time-series prediction methods are therefore increasingly applied in the chemical industry. Their advantage is that a prediction model can be built from the data alone, without modeling the complex reaction mechanisms of the chemical production process.
Spatio-temporal prediction is essentially a time-series prediction task, except that it also models the multi-dimensional time-series data spatially. Many data-driven prediction models and methods exist, and they can perform very well in specific scenarios, but most prediction models for chemical production safety still rely on traditional time-series methods. Such methods, for example RNN, LSTM and other sequence networks, make full use of temporal features and capture temporal information well, but neglect the spatial correlation of the multi-dimensional series. On a chemical production line, spatial correlation refers to the degree to which the process parameters reported by the various sensors correlate with and influence one another during operation.
The invention addresses the space-time modeling of production data in the chemical safety production process. It introduces a graph convolutional neural network to model the complex association between different sensors in the production process (the spatial correlation), fuses temporal and spatial information, and thereby realizes dynamic prediction of the chemical production process. Constructing a space-time prediction model for chemical production, however, raises two main problems:
(1) The physical layout of the sensors on the production line is difficult to describe and does not truly reflect the interrelationships caused by the internal mechanisms. Defining the initial graph structure for the graph convolutional network is therefore a difficult task.
(2) The space-time prediction model extracts information from the multi-dimensional time series from two angles: the temporal angle, where the dominant property of time-series information is temporal dependence, and the spatial angle, which concerns the spatial dependence among the variables. How to combine the two kinds of features is therefore a problem to be solved.
The invention therefore proposes a space-time model that fuses a GRU with a GCN, and verifies and evaluates it with different convolution kernels on two chemical data sets, demonstrating the effectiveness of the method. A new graph construction method is also designed, offering an approach to the problem of correlation among process parameters in chemical scenarios.
Disclosure of Invention
The invention addresses a space-time prediction task for chemical safety production. It targets the remotely transmitted data of a chemical DCS system: historical production data are used to construct a prior static graph, which guides the construction of the space-time prediction model and enables real-time prediction of the remotely transmitted data. The method predicts production data in real chemical process scenarios in real time, and its output is the data sequence expected in the future.
The technical scheme of the invention is as follows: in a dynamic chemical process prediction method based on a static space-time diagram, the feature matrix X and the adjacency matrix A, formed from multi-dimensional time-series data collected by sensors over a period of time, are taken as input data, and a mapping function $f$ is learned to compute the state of the production process over a future period T:

$$[X_{t+1}, \dots, X_{t+p}] = f\big(\mathcal{G};\ (X_{t-h+1}, \dots, X_{t})\big)$$

where h is the length of the historical series and p is the length of the predicted future series; $X \in \mathbb{R}^{T \times N}$, where T is the length of the historical time-series data and N is the number of sensors; $\mathcal{G}$ denotes the graph structure, which the invention describes by the adjacency matrix A.
The method specifically comprises the following steps:
Step 1: constructing a static space-time diagram;
The sensors in the chemical production process are of four types: temperature, liquid level, pressure and flow. Because the sensor data have different dimensions (units), their value ranges differ. The feature matrix X is therefore scaled so that all features are of the same order of magnitude; this keeps the model from being biased toward any one feature, gives better results, and speeds up convergence to the optimal solution. In the invention, feature scaling is performed by standardization.
For the feature matrix $X_{T \times N}$, the mutual information between each sensor node and every sensor node is computed with the mutual-information method for continuous variables and used as the association weight in the adjacency matrix A:

$$a_{ij} = I(X_i; X_j), \quad 1 \le i, j \le N$$

where $X_i$ and $X_j$ are the time-series data of the i-th and j-th features of the feature matrix X, and $a_{ij}$ is the element in row i, column j of the adjacency matrix, i.e. $a_{ij} \in A$. Since $I(X;Y) = I(Y;X)$, $a_{ij} = a_{ji}$. The elements of the adjacency matrix are weighted values; the larger the weight, the more strongly the two associated sensor nodes are correlated.
An adjacency matrix A is constructed from all time-series samples and serves as the spatial structure describing the static graph; the feature matrix X and the adjacency matrix A together form the static space-time diagram;
Step 2: constructing a space-time prediction network based on the static space-time diagram;
The space-time prediction network based on the static space-time diagram has two main parts: a temporal feature extraction module and a spatial feature extraction module. The overall input of the network is the static space-time diagram data formed by the multi-dimensional time-series data $X \in \mathbb{R}^{T \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$, where T is the length of the historical time-series data and N is the number of sensors. The spatial feature extraction module consists mainly of a graph convolutional network, whose input is the multi-dimensional time-series data $X \in \mathbb{R}^{T \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$ and whose output is a spatial feature map. The temporal feature extraction module consists mainly of GRU network units; the spatial feature map obtained by the spatial feature extraction module is fed, time step by time step, into GRU network units that share parameters, and temporal features are extracted;
Before entering each gate of the GRU network unit in the temporal feature extraction module, $X_i$ and the hidden state $h_{i-1}$ of the previous time step are both sent to the spatial feature extraction module and convolved with the adjacency matrix A by graph convolution, which aggregates the information of neighboring nodes and fuses in the spatial features to form a spatial feature map. The spatial feature maps of the current and previous time steps are fed into the gates of the GRU network unit, which outputs the cell state and hidden state of the current time step; the hidden state serves as the input hidden state of the GRU cell at the next time step. After T loop iterations, the output features of the last GRU cell are reduced in dimension by a linear layer to obtain the prediction result $\hat{Y}$.
The spatial feature extraction module is built on a graph convolutional network (GCN). The graph convolutional network uses the K-hop class adjacency matrix algorithm, computed as:

$$G(A, X_i) = \big(W \odot C_i\big((A+I)^K\big)\big)\, X_i$$

where $(A+I)^K$ constructs the class adjacency matrix of nodes reachable within K hops, $C_i(\cdot)$ is a normalization operator, $\odot$ is the Hadamard product, and $W$ is a weight parameter of the same size as the adjacency matrix A.
The temporal feature extraction module uses a GRU network to model the temporal dependence of the time-series information. The spatial feature map obtained from the spatial feature extraction module is fed, time step by time step, into GRU network units that share parameters, and temporal features are extracted. The computation is:

$$u_i = \sigma\big(W_u\,[G(A, X_i),\ h_{i-1}] + b_u\big)$$
$$r_i = \sigma\big(W_r\,[G(A, X_i),\ h_{i-1}] + b_r\big)$$
$$c_i = \tanh\big(W_c\,[G(A, X_i),\ (r_i \odot h_{i-1})] + b_c\big)$$
$$h_i = u_i \odot h_{i-1} + (1 - u_i) \odot c_i$$

where $W_u$, $W_r$, $W_c$ are weight parameters, $F_i$ is the number of input hidden units, $F_o$ is the number of output hidden units, and $F_o = 2F_i$; $b_u$, $b_r$, $b_c$ are bias terms; $G(A, X_i)$ is the spatial feature map output by the spatial feature extraction module; $\sigma$ and $\tanh$ are activation functions; $\odot$ is the Hadamard product; $h_{i-1}$ is the hidden state of the previous time step; $u_i$ is the update gate, which controls how much of the cell state and of the previous hidden state is passed to the output; $r_i$ is the reset gate, which controls the updating of the stored hidden-state information from the previous time step; $c_i$ is the state information retained in the cell unit; $h_i$ is the hidden state output by the cell unit;
When the input feature matrix $X_{T \times N}$ has time-series length T, the GRU unit is iterated T times, and the hidden state $h_T$ output at the final time step is taken as the extracted temporal feature. Finally, a fully connected linear layer reduces the hidden-unit dimension to give the final output, the predicted value $\hat{Y}$, where P is the length of the predicted series:

$$\hat{Y} = W_P\, h_T + b_p$$

where $h_T$, the output of the temporal feature extraction module, is the hidden feature produced at the last time step of the GRU unit; $W_P$ is a weight parameter and $b_p$ is a bias term.
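The gate equations and the linear output layer above can be illustrated with a small NumPy forward pass. This is a minimal sketch under stated assumptions, not the patented implementation: `graph_conv` is a hypothetical stand-in for $G(A, X_i)$ (plain row-normalized neighbor aggregation), only $X_i$ is passed through the graph convolution, each node carries a hidden vector of size `F`, and all weights are random rather than trained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def graph_conv(A, x):
    """Hypothetical stand-in for G(A, X_i): row-normalized neighbor aggregation."""
    return (A / A.sum(axis=1, keepdims=True)) @ x

def gcn_gru_forward(A, X, params, F):
    """Run the gated recurrence over all T time steps and return h_T of shape (N, F)."""
    T, N = X.shape
    h = np.zeros((N, F))                              # initial hidden state
    for i in range(T):
        g = graph_conv(A, X[i])[:, None]              # spatial feature, shape (N, 1)
        gh = np.concatenate([g, h], axis=1)           # [G(A, X_i), h_{i-1}]
        u = sigmoid(gh @ params["W_u"] + params["b_u"])   # update gate
        r = sigmoid(gh @ params["W_r"] + params["b_r"])   # reset gate
        grh = np.concatenate([g, r * h], axis=1)      # [G(A, X_i), r_i * h_{i-1}]
        c = np.tanh(grh @ params["W_c"] + params["b_c"])  # candidate state
        h = u * h + (1.0 - u) * c                     # new hidden state
    return h

def predict(h_T, W_P, b_P):
    """Linear read-out: one P-step forecast per sensor, returned as (P, N)."""
    return (h_T @ W_P + b_P).T

rng = np.random.default_rng(0)
N, T, F, P = 4, 6, 8, 3                               # toy sizes (assumed)
A = rng.uniform(0.1, 1.0, size=(N, N))
A = (A + A.T) / 2                                     # symmetric weighted adjacency
X = rng.normal(size=(T, N))
params = {k: rng.normal(scale=0.1, size=(1 + F, F)) for k in ("W_u", "W_r", "W_c")}
params.update({k: np.zeros(F) for k in ("b_u", "b_r", "b_c")})

h_T = gcn_gru_forward(A, X, params, F)
Y_hat = predict(h_T, rng.normal(scale=0.1, size=(F, P)), np.zeros(P))
print(h_T.shape, Y_hat.shape)
```

Because each new hidden state is a gated blend of the previous state and a tanh output, every entry of `h_T` stays strictly inside (-1, 1) when the initial state is zero.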
Feature scaling uses standardization, which scales the feature values in the feature matrix to a distribution with mean 0 and variance 1:

$$X_i' = \frac{X_i - \bar{X}_i}{\sigma(X_i)}$$

where $\bar{X}_i$ is the mean of the i-th feature of the multi-dimensional time-series data and $\sigma(X_i)$ is its standard deviation.
The beneficial effects of the invention are as follows. The invention provides a dynamic chemical process prediction method based on a static space-time diagram that makes full use of the temporal dependence and the spatial dependence among the multi-dimensional time-series variables, and fuses the two, to predict the chemical process. Temporal features are extracted with a GRU network and spatial features with a GCN network. For the graph structure of the GCN network, the association among the multi-dimensional time-series variables is quantified by mutual information, from which a static graph is designed and constructed.
The invention is aimed at the chemical safety production process. Most existing safety early-warning methods based on artificial intelligence focus only on the data of a single sensor; such purely data-driven early-warning research cannot fully mine the complex chemical principles and working conditions, and lacks the learning and inheritance of knowledge that is passed on orally through experience. Industrial big data can include not only sensor samples but also multi-modal data such as process conditions, rule knowledge and early-warning measures. The invention makes full use of the production process data and provides an effective method for safety early warning in chemical production.
Drawings
FIG. 1 is a block diagram of the dynamic chemical process prediction method based on a static space-time diagram. The symbols in the figure denote, in order: concatenation, addition, the sigmoid function, a linear layer, element-wise multiplication, the tanh activation function, a fully connected layer, and a graph convolution layer;
FIG. 2 shows the predictions of the invention on the DMC data set using the space-time network with the K-hop class adjacency matrix convolution kernel: FIG. 2(a) is the prediction for the 12th minute ahead; FIG. 2(b) is the prediction for the 24th minute ahead; FIG. 2(c) is the prediction for the 36th minute ahead; FIG. 2(d) is the prediction for the 48th minute ahead.
Detailed Description
The following describes the embodiments of the present invention further with reference to the drawings and technical schemes.
Step 1: defining and building a dataset
The data used by the invention are remotely transmitted time-series data from the DCS system of a chemical production line.
First, the multi-dimensional time-series data are defined. They form the feature matrix X that is input to the network model, $X \in \mathbb{R}^{T \times N}$, where T is the length of the historical time-series data and N is the number of sensors. $X_i \in \mathbb{R}^{N}$ denotes the data of the N sensors at time i. The multi-dimensional time-series data serve as the node attributes of the graph constructed later, so the attributes of the graph nodes comprise four types of time-series data: temperature, liquid level, flow and pressure.
Next, the graph structure is defined. The graph is denoted $\mathcal{G} = (V, E)$, where V is the set of graph nodes and E is the set of edges between nodes. The graph is essentially a topological network that represents the association between sensors. Each graph node is a sensor, so V is the set of sensor nodes, $V = \{v_1, v_2, \dots, v_N\}$, and E is the set of edges. To describe the nodes and associated edges of the graph structure, an adjacency matrix is used. The adjacency matrix $A \in \mathbb{R}^{N \times N}$ represents the correlation between sensor nodes during production. Its elements can be either 0/1 values or weighted values; with weighted values, the larger the weight, the more strongly the two associated sensor nodes are correlated.
The space-time prediction problem in the chemical production process can therefore be stated as follows: taking as input the space-time diagram data formed by the feature matrix X of the multi-dimensional time-series data over a historical period and the adjacency matrix A, a mapping function $f$ is learned to compute the state of the production process over a future period T:

$$[X_{t+1}, \dots, X_{t+p}] = f\big(\mathcal{G};\ (X_{t-h+1}, \dots, X_{t})\big)$$

where h is the length of the historical series and p is the length of the predicted future series.
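To make the roles of h and p concrete, the following sketch cuts a sensor series into training pairs of h historical steps and the p steps that follow. The function name `make_windows` and the toy data are illustrative assumptions; the source does not describe how samples are sliced.

```python
import numpy as np

def make_windows(X, h, p):
    """Split a (T, N) series into (history, future) pairs.

    Each sample pairs h consecutive rows with the p rows that follow them.
    """
    T = X.shape[0]
    n_samples = T - h - p + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested h and p")
    hist = np.stack([X[s:s + h] for s in range(n_samples)])
    fut = np.stack([X[s + h:s + h + p] for s in range(n_samples)])
    return hist, fut

X = np.arange(20, dtype=float).reshape(10, 2)   # toy series: T = 10 steps, N = 2 sensors
hist, fut = make_windows(X, h=4, p=2)
print(hist.shape, fut.shape)                    # (5, 4, 2) (5, 2, 2)
```

The same adjacency matrix A would accompany every window, since the graph is static.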
Because the sensors have different data dimensions (units) and value ranges, the multi-dimensional time-series data must be feature-scaled so that all features are of the same order of magnitude; this keeps the model from being biased toward any one feature, gives better results, and speeds up convergence to the optimal solution. The invention scales each feature to mean 0 and variance 1:

$$X_i' = \frac{X_i - \bar{X}_i}{\sigma(X_i)}$$

where $\bar{X}_i$ is the mean of the i-th feature of the multi-dimensional time-series data and $\sigma(X_i)$ is its standard deviation.
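A short sketch of this per-sensor standardization follows. The helper name `standardize`, the small `eps` guard against constant columns, and the synthetic temperature and level values are assumptions made for illustration.

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Scale each column (sensor) of a (T, N) matrix to mean 0 and variance 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps), mean, std

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(350.0, 20.0, 500),   # e.g. a temperature-like channel (synthetic)
    rng.normal(1.2, 0.1, 500),      # e.g. a level-like channel (synthetic)
])
Z, mean, std = standardize(X)
print(Z.mean(axis=0).round(6), Z.std(axis=0).round(6))
```

In practice the mean and standard deviation would be estimated on the training period only and reused for later data, so that forecasts can be mapped back to physical units with `Z * std + mean`.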
Step 2: constructing a static space-time diagram;
The mutual information (MI) method. Mutual information measures the interdependence between two random variables. From a probabilistic point of view, it expresses how similar the joint distribution p(X, Y) of two random variables X, Y is to the product p(X)p(Y) of their marginal distributions. For continuous random variables, mutual information is defined as:

$$I(X;Y) = \int\!\!\int p(x, y)\, \log \frac{p(x, y)}{p(x)\,p(y)}\; dx\, dy$$

where p(x, y) is the joint probability density function of X and Y, and p(x) and p(y) are the marginal density functions of X and Y. Mutual information can equivalently be expressed as:

$$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y) = H(X,Y) - H(X|Y) - H(Y|X)$$

where H(X) and H(Y) are the marginal entropies, H(X|Y) and H(Y|X) are the conditional entropies, and H(X, Y) is the joint entropy of X and Y. Entropy measures the uncertainty of a system; for a continuous variable it is:

$$H(X) = -\int p(x)\, \log p(x)\; dx$$

Mutual information has the advantage that it can be computed between continuous variables as well as between discrete ones, and that it captures nonlinear as well as linear relationships.
Mutual information is therefore well suited to analyzing and quantifying the association between sensors in chemical scenarios. For the feature matrix $X_{T \times N}$ formed from the multi-dimensional time-series data, the mutual information between each sensor node and every sensor node (including itself) is computed with the mutual-information method for continuous variables and used as the association weight in the adjacency matrix:

$$a_{ij} = I(X_i; X_j), \quad 1 \le i, j \le N$$

where $X_i$ and $X_j$ are the time-series data of the i-th and j-th features of the feature matrix X, and $a_{ij}$ is the element in row i, column j of the adjacency matrix, i.e. $a_{ij} \in A$. Since $I(X;Y) = I(Y;X)$, $a_{ij} = a_{ji}$.
The adjacency matrix is constructed globally from all the time-series data, so the general association pattern of the sensor nodes over the whole production period is summarized statistically, and a static space-time diagram is generated. Whatever time-series data are input, the same adjacency matrix is used to model the spatial association of the multi-dimensional series.
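As an illustration of how such a mutual-information adjacency matrix might be computed, the sketch below uses a simple histogram (binned plug-in) estimate. The estimator, the bin count and the function names are assumptions; the source only says that a mutual-information method for continuous variables is used and does not name an estimator.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of two series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities per bin
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = pxy > 0                            # skip empty bins (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def mi_adjacency(X, bins=16):
    """Symmetric (N, N) matrix with a_ij = I(X_i; X_j) for a (T, N) feature matrix."""
    N = X.shape[1]
    A = np.zeros((N, N))
    for i in range(N):
        for j in range(i, N):
            A[i, j] = A[j, i] = mutual_information(X[:, i], X[:, j], bins)
    return A

rng = np.random.default_rng(0)
x0 = rng.normal(size=2000)
x1 = x0 ** 2 + 0.1 * rng.normal(size=2000)    # nonlinear function of x0
x2 = rng.normal(size=2000)                    # independent of x0
A = mi_adjacency(np.column_stack([x0, x1, x2]))
print(A.round(3))
```

In this toy example the nonlinearly related pair (x0, x1) receives a much larger weight than the independent pair (x0, x2), which is the property that motivates mutual information over linear correlation here.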
Step 3: constructing a space-time prediction network based on a static space-time diagram;
The space-time prediction network based on the static space-time diagram has two main parts: a temporal feature extraction module and a spatial feature extraction module. The temporal module consists of a GRU network and the spatial module of a graph convolutional network. The overall input of the network is the static space-time diagram data formed by the multi-dimensional time-series data $X \in \mathbb{R}^{H \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$, where H is the historical length of the input data and N is the number of nodes. The algorithm proceeds as follows. The time-series data $X_i$ ($1 \le i \le H$) of each time step are fed to the GRU network unit in temporal order. Before entering each gate of the GRU unit, $X_i$ and the hidden state $h_{i-1}$ of the previous time step are both sent to the graph convolutional network and convolved with the adjacency matrix A, which aggregates the information of neighboring nodes and fuses in the spatial features. The features of the current and previous time steps, now fused with spatial features, are then fed into the GRU gates, which output the cell state and hidden state of the current time step; the hidden state serves as the input hidden state of the cell at the next time step. After H loop iterations, the output state features are reduced along the feature dimension by a linear layer to obtain the prediction result $\hat{Y}$.
The spatial feature extraction module uses a graph convolutional network to model the association features, which take the form of a graph topology. The core of a graph convolutional network is the design of its convolution kernel. The invention introduces three convolution kernel algorithms to build different graph convolutional networks, and compares and analyzes their effects experimentally.
First, the fast approximate convolution algorithm. Its formula is:

$$H^{l+1} = \sigma\big(\tilde{D}^{-\frac{1}{2}}\, \tilde{A}\, \tilde{D}^{-\frac{1}{2}}\, H^{l}\, W^{l}\big), \qquad H^{0} = X_i$$

where $\tilde{A} = A + I_N$ and $I_N$ is the identity matrix; $H^{l}$ is the output feature of the l-th graph convolution layer and $W^{l}$ is that layer's weight; $\tilde{D}$ is the degree matrix of $\tilde{A}$; $X_i$ is the time-series data of X at time i; and the output feature of the last layer is $G(A, X_i)$.
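A single layer of this first-order propagation rule can be sketched as follows. The ReLU activation, the toy adjacency weights and the random layer weight `W0` are assumptions for illustration; the source does not state which activation is used.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, where D~ is the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W):
    """One propagation step H^{l+1} = ReLU(A_hat H^l W^l) (ReLU is an assumed activation)."""
    return np.maximum(A_hat @ H @ W, 0.0)

A = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.5],
              [0.1, 0.5, 0.0]])          # toy weighted adjacency for 3 sensors
x_i = np.array([1.0, 2.0, 3.0])          # sensor readings at one time step
H0 = x_i[:, None]                        # H^0 = X_i as an (N, 1) feature matrix

rng = np.random.default_rng(0)
W0 = rng.normal(size=(1, 4))             # layer weight: 1 input feature, 4 output features
A_hat = normalized_adjacency(A)
G = gcn_layer(A_hat, H0, W0)
print(G.shape)                           # (3, 4)
```

Adding the identity before normalizing keeps each node's own reading in the aggregate, and the symmetric scaling keeps the propagated values on a comparable scale across nodes with different degrees.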
Second, the Chebyshev polynomial convolution kernel algorithm. Its formula is:

$$G(A, X_i) = \sum_{k=0}^{K} \beta_k\, T_k(\tilde{L})\, X_i$$

where $\tilde{L} = \frac{2}{\lambda_{max}} L - I$, $\lambda_{max}$ is the largest eigenvalue of the Laplacian matrix L, I is the identity matrix, $T_k$ is the k-th Chebyshev polynomial, and $\beta_k$ is the corresponding network parameter.
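The Chebyshev kernel can be evaluated without explicit matrix powers through the recurrence $T_0(x) = 1$, $T_1(x) = x$, $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$. The sketch below assumes the symmetric normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$ (the source only says "Laplacian matrix") and uses fixed example coefficients in place of learned $\beta_k$.

```python
import numpy as np

def scaled_laplacian(A):
    """Return 2 L / lambda_max - I for the symmetric normalized Laplacian (assumed form)."""
    n = A.shape[0]
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(n) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(L).max()
    return 2.0 * L / lam_max - np.eye(n)

def cheb_conv(L_tilde, x, beta):
    """Compute sum_k beta[k] * T_k(L_tilde) x with the Chebyshev recurrence."""
    T_prev = x                                   # T_0(L~) x = x
    out = beta[0] * T_prev
    if len(beta) == 1:
        return out
    T_curr = L_tilde @ x                         # T_1(L~) x = L~ x
    out = out + beta[1] * T_curr
    for k in range(2, len(beta)):
        T_next = 2.0 * (L_tilde @ T_curr) - T_prev
        out = out + beta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out

A = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.5],
              [0.1, 0.5, 0.0]])                  # toy weighted adjacency
x = np.array([1.0, 2.0, 3.0])
L_tilde = scaled_laplacian(A)
y = cheb_conv(L_tilde, x, beta=[0.5, 0.3, 0.2])  # example coefficients, not learned
print(y)
```

Rescaling by $\lambda_{max}$ places the spectrum of $\tilde{L}$ in [-1, 1], the interval on which Chebyshev polynomials are well behaved.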
Third, the K-hop class adjacency matrix convolution kernel algorithm (K-hop Neighborhood Matrix). Its formula is:

$$G(A, X_i) = \big(W \odot C_i\big((A+I)^K\big)\big)\, X_i$$

where $(A+I)^K$ constructs the class adjacency matrix of nodes reachable within K hops, $C_i(\cdot)$ is a normalization operator, $\odot$ is the Hadamard product, and $W$ is a weight parameter of the same size as the adjacency matrix A.
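The effect of the K-hop construction is easiest to see on a small path graph. In this sketch the operator $C_i(\cdot)$ is assumed to clip each entry to at most 1 (the source only calls it a normalization operator), the adjacency is binary for readability, and the weight matrix is all ones rather than learned.

```python
import numpy as np

def k_hop_matrix(A, K):
    """Entries of (A + I)^K clipped to at most 1 (assumed form of the operator C_i)."""
    reach = np.linalg.matrix_power(A + np.eye(A.shape[0]), K)
    return np.minimum(reach, 1.0)

def k_hop_conv(A, x, W, K):
    """G(A, x) = (W * C_i((A + I)^K)) x, where * is the Hadamard product."""
    return (W * k_hop_matrix(A, K)) @ x

# Path graph 0 - 1 - 2: node 2 is two hops away from node 0.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
W = np.ones((3, 3))                  # stand-in for the learned weight matrix
x = np.array([1.0, 0.0, 0.0])        # signal present at node 0 only

out1 = k_hop_conv(A, x, W, K=1)
out2 = k_hop_conv(A, x, W, K=2)
print(out1, out2)                    # [1. 1. 0.] [1. 1. 1.]
```

With K = 1 the signal at node 0 reaches only its direct neighbor; with K = 2 it also reaches node 2, so K sets the size of the neighborhood that a single convolution can aggregate.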
And selecting one disclosed chemical simulation data set TE and a data set DMC data set in a chemical real scene for experimental comparison and verification. Through experimental result analysis, the space-time network of three different convolution kernels has good effect in the aspect of prediction results compared with other comparison methods. The space-time network adopting the K-hop type adjacency matrix convolution kernel has more excellent effect than other two convolution kernel methods.
The input data of the graph convolutional network are the multidimensional time series data $X \in \mathbb{R}^{T \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$.
The temporal feature extraction module uses a GRU network to model the temporal dependencies in the time series information. The GRU network is a classical variant of the RNN. It has a gating structure similar to that of the LSTM but with one gate fewer, so it has fewer parameters than the LSTM and trains faster, while GRU and LSTM perform comparably on many tasks. The invention therefore uses a GRU network to extract temporal features. The spatial feature maps obtained from the spatial feature extraction module are input, time step by time step along the time dimension, into parameter-sharing GRU network units for temporal feature extraction.
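The gating equations stated in claim 3 can be sketched as a single parameter-sharing GRU step applied to each spatial feature map in turn, followed by a linear output layer. The tensor shapes, the random initialization, and the per-node layout (one hidden vector per sensor) are assumptions for illustration; the dimensions 64 and 32 follow the hidden sizes reported in the experiments.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(g_i, h_prev, params):
    """One GRU step on a spatial feature map g_i (N x F_in) and state h_prev (N x F_h)."""
    W_u, b_u, W_r, b_r, W_c, b_c = params
    gh = np.concatenate([g_i, h_prev], axis=1)
    u = sigmoid(gh @ W_u + b_u)                       # update gate u_i
    r = sigmoid(gh @ W_r + b_r)                       # reset gate r_i
    c = np.tanh(np.concatenate([g_i, r * h_prev], axis=1) @ W_c + b_c)  # candidate c_i
    return u * h_prev + (1.0 - u) * c                 # h_i

def run_gru(spatial_maps, params, hidden_dim):
    """Feed T spatial feature maps through one shared-parameter GRU cell."""
    h = np.zeros((spatial_maps.shape[1], hidden_dim))
    for g_i in spatial_maps:                          # loop over the T time steps
        h = gru_step(g_i, h, params)
    return h                                          # h_T

# Hypothetical sizes: T = 15 steps, N = 5 sensors, P = 4 predicted steps.
rng = np.random.default_rng(0)
T, N, F_in, F_h, P = 15, 5, 64, 32, 4
params = []
for _ in range(3):                                    # (W_u, b_u), (W_r, b_r), (W_c, b_c)
    params += [0.1 * rng.standard_normal((F_in + F_h, F_h)), np.zeros(F_h)]
spatial_maps = rng.standard_normal((T, N, F_in))
h_T = run_gru(spatial_maps, params, F_h)
W_P, b_p = 0.1 * rng.standard_normal((F_h, P)), np.zeros(P)
y_hat = h_T @ W_P + b_p                               # linear layer on the last hidden state
print(h_T.shape, y_hat.shape)  # (5, 32) (5, 4)
```

Because the same `params` are reused at every time step, the number of trainable weights does not grow with the length of the input window.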
When training the space-time prediction network based on the static space-time diagram, the goal is to minimize the error between the network prediction $\hat{Y}$ and the true value Y. The mean square error (Mean Square Error, MSE) loss is therefore selected. MSE, also known as quadratic loss or L2 loss, is commonly used for regression prediction tasks. The specific calculation formula is as follows:
$L_{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2$
wherein n is the number of predicted values. Meanwhile, to avoid overfitting during network training, a regularization penalty term is introduced. The loss function thus consists of the MSE and the regularization penalty term, specifically:
$L = L_{MSE} + \lambda L_{reg}$
wherein $\lambda$ is the hyperparameter of the regularization penalty term and controls the strength of the penalty, and $L_{reg}$ is the sum of the squares of the network parameters. $\lambda$ is set to 0.0004.
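As a small worked example, the combined loss can be computed as below. The function name and the toy numbers are hypothetical; averaging the squared error over all predicted values is an assumption about how the MSE is normalized.

```python
import numpy as np

def total_loss(y_pred, y_true, params, lam=0.0004):
    """MSE plus lam times the sum of squared network parameters."""
    mse = np.mean((y_true - y_pred) ** 2)
    l_reg = sum(np.sum(p ** 2) for p in params)
    return mse + lam * l_reg

# Toy numbers: squared errors 0, 0, 4 give MSE = 4/3; parameters 1 and 2 give L_reg = 5.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
params = [np.array([1.0, 2.0])]
print(round(total_loss(y_pred, y_true, params), 6))  # 1.335333
```

With the penalty weight set to 0.0004, the regularization term contributes only 0.002 here, so it nudges the weights toward small values without dominating the prediction error.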
For the optimizer, Adam, the mainstream choice for this kind of task, is selected. Adam is suitable for many problems, including those with sparse or noisy gradients, and it combines the respective advantages of the AdaGrad and RMSProp algorithms. Adam starts from the same learning rate for every parameter and adapts it independently for each parameter as training proceeds.
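The per-parameter adaptation can be seen in a single Adam update written out by hand. This sketch uses the learning rate 0.001 from the experiments; the decay rates 0.9 and 0.999 and the epsilon 1e-8 are the commonly used defaults and are assumptions here, as is the function name.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2      # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                 # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Two parameters whose gradients differ by three orders of magnitude.
theta = np.zeros(2)
grad = np.array([10.0, -0.01])
theta, m, v = adam_step(theta, grad, np.zeros(2), np.zeros(2), t=1)
print(np.round(theta, 6))
```

On the first step both parameters move by about 0.001 in opposite directions, even though their gradients differ by a factor of 1000. This is the sense in which the step size is adjusted independently for each parameter.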
The learning rate is set to 0.001, the weight decay factor to 0.00015, and the batch size to 32. The hidden layer dimension of the temporal feature extraction module is 32, the hidden layer output dimension of the spatial feature extraction module is 64, and training runs for 200 epochs.
The experimental environment is as follows: Ubuntu 16.04; CPU: Intel Xeon E5-2650 v4 @ 2.20 GHz; GPU: NVIDIA GeForce TITAN V with 12 GB of video memory.
In the tests, the TE dataset has a sampling period of 3 min. The input historical data span 15 time steps, i.e. 15 consecutive samples covering 45 min. The predicted sequence spans 4 time steps at 9-min intervals, i.e. predictions for 9 min, 18 min, 27 min and 36 min into the future.
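The windowing arithmetic for the TE setting can be checked with a short sketch: 15 consecutive samples as input, and 4 targets taken every third sample (3 min x 3 = 9 min). The function name and the synthetic timestamp series are hypothetical.

```python
import numpy as np

def make_windows(series, n_in=15, n_out=4, stride=3):
    """Split a (T_total, N) array into input windows and strided future targets."""
    horizon = n_out * stride
    inputs, targets = [], []
    for start in range(series.shape[0] - n_in - horizon + 1):
        end = start + n_in                         # first index after the input window
        inputs.append(series[start:end])
        # Targets lie stride, 2*stride, ... samples after the last input sample.
        targets.append(series[end - 1 + stride:end + horizon:stride])
    return np.stack(inputs), np.stack(targets)

# Synthetic series whose values are timestamps in minutes (3-min sampling period).
minutes = (np.arange(100) * 3).reshape(-1, 1)
X_win, Y_win = make_windows(minutes)
print(X_win.shape, Y_win.shape)                    # (74, 15, 1) (74, 4, 1)
print((Y_win[0, :, 0] - X_win[0, -1, 0]).tolist()) # [9, 18, 27, 36]
```

The second line of output confirms that the four targets sit 9, 18, 27 and 36 min after the last input sample. Calling the same function on 4-min data reproduces the 12, 24, 36 and 48 min horizons of the DMC setting.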
The DMC dataset has a sampling period of 4 min. The input historical data span 15 time steps, i.e. 15 consecutive samples covering 60 min. The predicted sequence spans 4 time steps at 12-min intervals, i.e. predictions for 12 min, 24 min, 36 min and 48 min into the future, as shown in Figs. 2(a)-2(d). During data preprocessing, the static diagram is generated by computing mutual information over all training samples, and the adjacency matrix is sparsified by applying a threshold, i.e. connections between weakly correlated nodes are removed. The threshold is chosen according to the magnitude of the mutual information computed for the specific time series data; for the DMC dataset it is set to 0.4.
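The preprocessing step can be sketched as follows: scale the features, estimate pairwise mutual information, and zero out entries below the threshold. The text does not specify which estimator for continuous variables is used, so a simple histogram estimate stands in for it here; the function names, the bin count and the synthetic data are likewise assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(x; y) in nats for two 1-D continuous series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)          # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)          # marginal of y, shape (1, bins)
    nz = p_xy > 0                                  # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])))

def static_adjacency(X, threshold, bins=16):
    """Symmetric adjacency matrix of pairwise MI with weak entries set to zero."""
    N = X.shape[1]
    A = np.zeros((N, N))
    for i in range(N):
        for j in range(i, N):
            A[i, j] = A[j, i] = mutual_information(X[:, i], X[:, j], bins)
    A[A < threshold] = 0.0
    return A

# Synthetic sensors: column 1 depends on column 0, column 2 is independent noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
X = np.column_stack([x, 2.0 * x + 1.0, rng.standard_normal(2000)])
X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # min-max feature scaling
A = static_adjacency(X, threshold=0.4)
print(A[0, 1] > 0.0, A[0, 2] == 0.0)  # True True
```

The dependent pair keeps its edge while the independent pair is pruned. Because the histogram estimate depends on the bin count and sample size, a suitable threshold has to be chosen per dataset, which is consistent with the text setting it from the observed mutual information values.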

Claims (3)

1. A dynamic chemical process prediction method based on a static space-time diagram, characterized by comprising the following steps:
step 1: constructing a static space-time diagram;
performing feature scaling on the feature matrix X formed by the multidimensional time series data acquired by sensors over a historical period, so that all features are of the same order of magnitude;
for the feature matrix $X \in \mathbb{R}^{T \times N}$, computing the mutual information value between each sensor node and every sensor node using the mutual information method for continuous variables, and using it as the association weight in the adjacency matrix A, according to the following formula:
$a_{ij} = I(X_i; X_j), \quad 0 \le i, j \le N$
wherein $X_i$ is the time series data of the i-th feature of the feature matrix X, $X_j$ is the time series data of the j-th feature of the feature matrix X, and $a_{ij}$ is the element in the i-th row and j-th column of the adjacency matrix, i.e. $a_{ij} \in A$; since $I(X; Y) = I(Y; X)$, $a_{ij} = a_{ji}$;
constructing the adjacency matrix A from all time series data samples as the spatial structure describing the static diagram; the feature matrix X and the adjacency matrix A together form the static space-time diagram;
step 2: constructing a space-time prediction network based on a static space-time diagram;
the space-time prediction network based on the static space-time diagram is mainly divided into two parts: a temporal feature extraction module and a spatial feature extraction module; the overall input of the network is the static space-time diagram data formed by the multidimensional time series data $X \in \mathbb{R}^{T \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$, wherein T is the length of the historical time series and N is the number of sensors; the spatial feature extraction module mainly consists of a graph convolutional network, whose input is the multidimensional time series data $X \in \mathbb{R}^{T \times N}$ and the adjacency matrix $A \in \mathbb{R}^{N \times N}$ and whose output is a spatial feature map; the temporal feature extraction module mainly consists of GRU network units, and the spatial feature map obtained by the spatial feature extraction module is input, time step by time step along the time dimension, into the parameter-sharing GRU network units for temporal feature extraction;
before entering each gate of the GRU network unit in the temporal feature extraction module, $X_i$ and the hidden state $H_{i-1}$ of the previous time step are both sent to the spatial feature extraction module, where a graph convolution with the adjacency matrix A aggregates the information of neighboring nodes and fuses the spatial features into a spatial feature map; the spatial feature map of the current time step and that of the previous time step are input into the gates of the GRU network unit, which then outputs the cell state and the hidden state of the GRU network unit at the current time step, the hidden state serving as the input hidden state of the GRU network unit at the next time step; after H iterations, the output features of the last GRU network unit are reduced in dimension by a linear layer to obtain the prediction result $\hat{Y}$.
2. The dynamic chemical process prediction method based on a static space-time diagram according to claim 1, wherein the spatial feature extraction module is built on a graph convolutional network (GCN) that uses the K-hop neighborhood matrix algorithm, with the following calculation formula:
$G(A, X_i) = \left(\tilde{W} \odot C_i\left((A + I)^{K}\right)\right) X_i$
wherein $(A + I)^{K}$ constructs a neighborhood matrix of K-order reachability, $C_i(\cdot)$ is a normalization operator, $\odot$ is the Hadamard product operation, and $\tilde{W}$ is a weight parameter of the same size as the adjacency matrix A.
3. The dynamic chemical process prediction method based on a static space-time diagram according to claim 1 or 2, wherein the temporal feature extraction module uses a GRU network to model the temporal dependencies in the time series information; the spatial feature map H obtained from the spatial feature extraction module is input, time step by time step along the time dimension, into the parameter-sharing GRU network units for temporal feature extraction; the specific calculation formulas are as follows:
$u_i = \sigma(W_u [G(A, X_i), h_{i-1}] + b_u)$
$r_i = \sigma(W_r [G(A, X_i), h_{i-1}] + b_r)$
$c_i = \tanh(W_c [G(A, X_i), (r_i \odot h_{i-1})] + b_c)$
$h_i = u_i \odot h_{i-1} + (1 - u_i) \odot c_i$
wherein $W_u$, $W_r$ and $W_c$ are weight parameters, $F_i$ is the number of input hidden units, $F_o$ is the number of output hidden units, and $F_o = 2F_i$; $b_u$, $b_r$ and $b_c$ are bias terms; $G(A, X_i)$ is the spatial feature map output by the spatial feature extraction module; $\sigma$ and $\tanh$ are activation functions; $\odot$ is the Hadamard product operation; $h_{i-1}$ is the hidden state of the previous time step; $u_i$ is the update gate, which controls the output information of the cell state and the hidden state of the previous time step; $r_i$ is the reset gate, which controls the update of the hidden state information of the previous time step; $c_i$ is the state information retained in the cell; $h_i$ is the hidden state output by the cell;
when the time series length of the input feature matrix $X \in \mathbb{R}^{T \times N}$ is T, the GRU unit is iterated T times, and the hidden state $h_T$ output by the unit at the last time step is taken as the final extracted temporal feature; finally, a fully connected linear layer reduces the hidden-unit dimension to give the final output, which serves as the predicted value $\hat{Y}$, P being the time series length of the prediction result; the specific formula is as follows:
$\hat{Y} = W_P h_T + b_p$
wherein $h_T$, the output of the temporal feature extraction module, is the hidden feature produced by the GRU unit at the last time step; $W_P$ is a weight parameter and $b_p$ is a bias term.
CN202310983210.6A 2023-08-07 2023-08-07 Dynamic chemical process prediction method based on static space-time diagram Active CN117010567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310983210.6A CN117010567B (en) 2023-08-07 2023-08-07 Dynamic chemical process prediction method based on static space-time diagram

Publications (2)

Publication Number Publication Date
CN117010567A true CN117010567A (en) 2023-11-07
CN117010567B CN117010567B (en) 2024-06-28

Family

ID=88570621

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310983210.6A Active CN117010567B (en) 2023-08-07 2023-08-07 Dynamic chemical process prediction method based on static space-time diagram

Country Status (1)

Country Link
CN (1) CN117010567B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184373A (en) * 2015-09-08 2015-12-23 深圳大学 Bayesian network structure learning method and system and reliability model construction method
CN112350899A (en) * 2021-01-07 2021-02-09 南京信息工程大学 Network flow prediction method based on graph convolution network fusion multi-feature input
CN113780420A (en) * 2021-09-10 2021-12-10 湖南大学 Method for predicting concentration of dissolved gas in transformer oil based on GRU-GCN
CN114418174A (en) * 2021-12-13 2022-04-29 国网陕西省电力公司电力科学研究院 Electric vehicle charging load prediction method
CN114548572A (en) * 2022-02-25 2022-05-27 中国农业银行股份有限公司 Method, device, equipment and medium for predicting urban road network traffic state

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
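Zhu Li et al.: "Short-term spatio-temporal load forecasting method based on GCN-GRU", vol. 44, no. 4, pages 211 - 215 *
Li Haotian; Sheng Yiqiang: "Fusion prediction method of a single time-series feature graph convolutional network", Computer and Modernization, no. 09, 15 September 2020 (2020-09-15), pages 36 - 40 *
Chen Jun et al.: "GCN-GRU: a fault detection model for wireless sensor networks", vol. 49, no. 5, pages 60 - 67 *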

Also Published As

Publication number Publication date
CN117010567B (en) 2024-06-28

Similar Documents

Publication Publication Date Title
CN112801404B (en) Traffic prediction method based on self-adaptive space self-attention force diagram convolution
Zhang et al. At-lstm: An attention-based lstm model for financial time series prediction
CN114626512B (en) High-temperature disaster forecasting method based on directed graph neural network
CN111899510B (en) Intelligent traffic system flow short-term prediction method and system based on divergent convolution and GAT
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
CN111612243B (en) Traffic speed prediction method, system and storage medium
CN113053115B (en) Traffic prediction method based on multi-scale graph convolution network model
CN112988723A (en) Traffic data restoration method based on space self-attention-diagram convolution cyclic neural network
CN112071065A (en) Traffic flow prediction method based on global diffusion convolution residual error network
Shu et al. VAE-TALSTM: a temporal attention and variational autoencoder-based long short-term memory framework for dam displacement prediction
CN115759461A (en) Internet of things-oriented multivariate time sequence prediction method and system
Hwang et al. Climate modeling with neural diffusion equations
Wang et al. Traffic prediction based on auto spatiotemporal multi-graph adversarial neural network
CN115376317B (en) Traffic flow prediction method based on dynamic graph convolution and time sequence convolution network
CN115018193A (en) Time series wind energy data prediction method based on LSTM-GA model
CN114065996A (en) Traffic flow prediction method based on variational self-coding learning
CN116504060A (en) Diffusion diagram attention network traffic flow prediction method based on Transformer
CN114860715A (en) Lanczos space-time network method for predicting flow in real time
CN116844041A (en) Cultivated land extraction method based on bidirectional convolution time self-attention mechanism
CN115034325A (en) Industrial time sequence prediction method of multidimensional time-attention-space
CN114694379B (en) Traffic flow prediction method and system based on self-adaptive dynamic graph convolution
CN114582131B (en) Monitoring method and system based on ramp intelligent flow control algorithm
Zhou et al. Multi-expert attention network for long-term dam displacement prediction
Lin et al. Hybrid water quality prediction with graph attention and spatio-temporal fusion
CN113255739A (en) Fish feed detection and formula system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant