CN112085124B - Complex network node classification method based on graph attention network - Google Patents

Complex network node classification method based on graph attention network

Info

Publication number
CN112085124B
Authority
CN
China
Prior art keywords
network
node
nodes
community
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011035811.7A
Other languages
Chinese (zh)
Other versions
CN112085124A (en
Inventor
高智勇
黄婧
高建民
谢军太
李智勇
秦锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202011035811.7A
Publication of CN112085124A
Application granted
Publication of CN112085124B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a complex network node classification method based on a graph neural network, aimed at the difficult problem of community division in the coupling network of a complex electromechanical system. First, detrended coupling correlation analysis is applied to calculate the correlation among all monitoring variable nodes; the correlation coefficients are screened a first time by introducing Gaussian noise and a second time by introducing a scale index. Second, the monitoring variables are taken as network nodes and the correlation coefficients are converted into edge weights to construct an undirected weighted complex network. Third, starting from a static community detection algorithm based on global modularity optimization of the modularity gain, each node in the network is taken as its own partition, the modularity gain of a neighbor node with respect to the current community is calculated from the modularity function, and the community to which each node belongs is decided by that gain, giving the initial partition of the network nodes. The communities of this first partition are then treated as nodes and a new round of iteration is run on the new network; when the modularity reaches its maximum, the optimal community partition of the network is obtained and used as the initial training label of the graph attention neural network. Finally, the graph attention neural network is trained on real-time monitoring data, so that node classification of the complex network is realized and a reliable basis is provided for accurately describing the coupling network of the complex electromechanical system.

Description

Complex network node classification method based on graph attention network
Technical Field
The invention relates to the technical field of complex electromechanical system coupling network community division, in particular to a complex network node classification method based on a graph attention neural network.
Background
Complex electromechanical systems in the process industry have numerous monitoring variables, and the material flow, energy flow and control in production equipment and pipelines are closely coupled with the information flow in the communication network. Taking a compressor unit, a typical piece of production equipment in a chemical enterprise, as an example, the system contains thousands of monitoring points covering pressure, temperature, flow, liquid level, vibration, rotating speed, switches, alarm signals and the like. The mutual coupling of these monitoring variables in effect forms a network representing the dynamic changes of the complex electromechanical system. By studying the structural characteristics of a network, complex network theory learns and explains the specific evolution patterns of the system behind it, and it has become a powerful tool for describing complex systems. Scholars have therefore studied how to establish such networks. For example, Wufutao et al studied the influence of community structure and weighting factors on the cascade survivability of a network, introduced a node and edge extinction mechanism, and established a weighted network model with a community structure; Von Longfei et al, noting that a process industrial production system has many monitoring points that are all correlated, proposed a multivariable coupling network modeling and state evaluation method for complex electromechanical systems based on detrended cross analysis and network structure entropy; Sun Xin et al used mutual information and Pearson correlation coefficients to measure the similarity of sea temperature time series between different sea areas, and constructed nonlinear and linear complex network models of the global marine climate.
Meanwhile, owing to the limits of computing power and existing methods, research on state prediction, safety situation analysis, fault tracing, production optimization and the like for a complex electromechanical system cannot describe all equipment and monitoring variables. Modeling with all variables is highly complex, so some key characteristics of the system are easily masked, the accurate description of the process industrial system suffers, and the research results are poor. Establishing a community structure model of the system comprising several communities significantly reduces the difficulty of analysing the system as a whole. Cupran et al calculated node similarity with an improved Jaccard algorithm to obtain initial communities and then completed the final community division from them with the LPA algorithm; qiushouming et al proposed SM-CD, a community division algorithm based on agglomeration by multi-attribute node similarity, to address the precision of community division; Chen Dongming et al proposed community closeness, a new evaluation standard for a single community structure, to resolve the conflict between partitioning quality and complexity. However, most traditional approaches to complex network community division analyse inherent statistical indexes of the network (such as degree, betweenness, average path length and clustering coefficient) and divide communities by the number or probability of edges between nodes inside and outside a subgraph; they do not consider the big-data environment of a process industrial production system.
Industrial control core systems are widely adopted in process industrial production enterprises to improve the automation level of production devices. These monitoring systems continuously acquire data representing the operating state of the system, and the capacity to acquire field data keeps growing, so a complex network modeling and community division method based on massive data is very important.
Disclosure of Invention
To solve the above problems in the prior art, the invention aims to provide a complex network node classification method based on a graph attention network. Addressing the difficulty of modeling and community division for complex electromechanical systems, the method calculates correlation coefficients by detrended coupling fluctuation analysis, establishes a data-driven complex network model, and classifies the complex network nodes with a graph attention network to complete the community division.
In order to achieve the purpose, the invention adopts the technical scheme that the method for classifying the complex network nodes based on the graph attention network comprises the following steps:
step 1), defining monitoring variables corresponding to monitoring point positions, selecting a variable set of a monitoring target of a complex electromechanical system to be analyzed, and acquiring a multi-dimensional historical monitoring time sequence of a complex electromechanical system sample from the variable set through a DCS (distributed control system);
step 2), establishing a weighted network model representing the interaction dynamics of the bottom layer of the system by taking the monitoring variables in the multidimensional historical monitoring time sequence obtained in the step 1) as nodes, the coupling relationships as edges and the coupling coefficients as edge weights;
step 3), converting the weighted network model established in the step 2) to obtain an initial training set feature vector;
step 4), carrying out community division on the weighted network model established in the step 2) based on the modularity to obtain an initial one-hot training set label;
step 5), taking the initial training set feature vector obtained in the step 3) and the training set label obtained in the step 4) as input, and training a GAT (graph attention network) neural network to obtain a node classification result;
step 6), selecting real-time monitoring data to obtain a real-time training set feature vector, inputting the real-time training set feature vector into the GAT (graph attention network) neural network that has completed training on the historical monitoring data in the step 5) for training, and finally completing the classification of the nodes.
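Step 4) turns a community division into one-hot training labels. A minimal sketch of that conversion is shown here; the function name `one_hot_labels` and the choice to order the label columns by sorted community id are assumptions for illustration, not details given in the patent.

```python
def one_hot_labels(communities):
    """Map each node's community id to a one-hot row.

    communities: list where entry i is the community id of node i.
    Column k of the result corresponds to the k-th smallest community id
    (an illustrative convention, not taken from the patent).
    """
    classes = sorted(set(communities))
    index = {c: k for k, c in enumerate(classes)}
    return [
        [1 if index[c] == k else 0 for k in range(len(classes))]
        for c in communities
    ]


# Four nodes assigned to three communities.
labels = one_hot_labels([2, 0, 2, 1])
```

Each row contains exactly one 1, so the label matrix can be fed directly to a classifier with one output per community.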
Further, the sampling frequency of the multi-dimensional historical monitoring time sequence obtained in the step 1) is set according to the sampling cost and the monitoring precision, the length of a sample is set, and a monitoring data set is obtained from historical data of the system operation process.
Further, the establishment of the undirected weighted network model in the step 2) comprises the following steps:
(1) calculating the coupling relation among all monitoring time sequences based on DCCA detrended correlation analysis, and introducing a Gaussian noise sequence as a reference for comparing the coupling correlation coefficients;
(2) taking the scale index α as the lower threshold limit of the correlation coefficient, setting the coupling correlation coefficients which are greater than or equal to the lower threshold limit to 0 to obtain an updated correlation coefficient table and, from it, a correlation matrix; to obtain the scale index α, a scatter plot is made in log-log coordinates (s, F_DCCA(s)) with log(s) as the explanatory variable and log(F_DCCA(s)) as the explained variable, the data are fitted by least squares, and the slope of the straight-line part is defined as the scale index α;
(3) constructing a weighted network representing the bottom-layer interaction dynamics of the system based on the correlation matrix.
Further, the DCCA detrended correlation analysis minimizes the influence of external trends on the cross-correlation by calculating a detrended covariance function based on random walk theory. In the fluctuation analysis of a sequence, the total displacement of the estimated random walk process from time 1 to time i is defined as:
Figure BDA0002705088820000041
wherein
Figure BDA0002705088820000042
The detrending covariance function is defined as:
Figure BDA0002705088820000043
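The equation images are not reproduced in this text, so the following is only a sketch of the widely used DCCA formulation (integrated profiles, box-wise detrending, detrended covariance normalised by the two detrended variances), which may differ in detail from the patent's equations. The function name `dcca_coefficient`, the use of non-overlapping boxes of length `s`, and linear detrending are all assumptions.

```python
import numpy as np


def dcca_coefficient(x, y, s):
    """Detrended correlation coefficient of two equal-length series at box size s."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integrated profiles: displacement of the random walk from time 1 to time i.
    px = np.cumsum(x - x.mean())
    py = np.cumsum(y - y.mean())
    t = np.arange(s)
    f_xy = f_xx = f_yy = 0.0
    for b in range(len(px) // s):  # non-overlapping boxes (assumption)
        sx = px[b * s:(b + 1) * s]
        sy = py[b * s:(b + 1) * s]
        # Residuals after removing the local linear trend inside the box.
        rx = sx - np.polyval(np.polyfit(t, sx, 1), t)
        ry = sy - np.polyval(np.polyfit(t, sy, 1), t)
        f_xy += np.mean(rx * ry)  # detrended covariance
        f_xx += np.mean(rx * rx)  # detrended variance of x
        f_yy += np.mean(ry * ry)  # detrended variance of y
    return f_xy / np.sqrt(f_xx * f_yy)


rng = np.random.default_rng(0)
series_a = rng.normal(size=400)
series_b = rng.normal(size=400)
rho = dcca_coefficient(series_a, series_b, s=20)
```

The coefficient lies in [-1, 1]; evaluating it for every pair of monitoring variables yields the correlation coefficient table that the screening steps operate on.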
Further, the step 3) of converting the weighted network model to obtain an initial training set feature vector includes the following two steps:
(1) converting the weighted network model into an adjacency matrix representing the connection relation between the nodes, where the number of non-zero elements in the ith row or the ith column of the adjacency matrix is exactly the degree of the ith vertex;
(2) converting the adjacency matrix representing the connection relation between the nodes into a sparse matrix.
Further, the degree in the weighted network model is defined as follows: let the network G = (V, E) be an undirected network formed by |V| = N nodes and |E| = M edges; the degree, i.e. the number of neighboring nodes of a node, is expressed as:
Figure BDA0002705088820000044
where
Figure BDA0002705088820000045
the degree index embodies the ability of a node to establish direct contact with the surrounding nodes.
Further, for the weighted network model in step 4), the modularity is defined as:
Figure BDA0002705088820000051
in the formula: w_ij is the weight of the edge connecting node i and node j; W is the sum of the network edge weights; s_i is the sum of the weights of the edges connected to node i; C_i is the community in which node i is located; when C_i and C_j are the same, δ(C_i, C_j) = 1, otherwise 0; the value range of the modularity is between 0 and 1.
Further, in the step 5), the graph attention neural network is a new convolution-type neural network that operates on graph-structured data by using masked self-attention layers, and aggregates neighbor nodes through an attention mechanism, so as to realize adaptive assignment of different weights to different neighbors.
Further, the attention mechanism is a multi-head attention mechanism: K independent attention mechanisms are invoked and their output results are concatenated. The multi-head attention mechanism is defined as:
Figure BDA0002705088820000052
where || represents the concatenation operation,
Figure BDA0002705088820000053
is the weight coefficient calculated by the k-th attention mechanism, and W^(k) is the corresponding learnable parameter;
further, in the step 6), the input of the graph attention neural network in the test process is a feature vector corresponding to the real-time monitoring data, and is a label-free training process.
Compared with the prior art, the invention has at least the following beneficial technical effects:
The invention discloses a complex network node classification method based on a graph neural network. Aimed at the difficult problem of community division in the coupling network of a complex electromechanical system, it classifies the complex network nodes by applying a graph neural network with a multi-head attention mechanism, thereby completing the community division. First, detrended coupling correlation analysis is applied to calculate the correlation among the monitoring variable nodes; the correlation coefficients are screened a first time by introducing Gaussian noise and a second time by introducing a scale index. Second, the monitoring variables are taken as network nodes and the correlation coefficients are converted into edge weights to construct an undirected weighted complex network. Third, starting from a static community detection algorithm based on global modularity optimization of the modularity gain, each node in the network is taken as its own partition, the modularity gain of a neighbor node with respect to the current community is calculated from the modularity function, and the community to which each node belongs is decided by that gain, giving the initial partition of the network nodes. The communities of this first partition are then treated as nodes and a new round of iteration is run on the new network; when the modularity reaches its maximum, the optimal community partition of the network is obtained and used as the initial training label of the graph attention neural network. Finally, the graph attention neural network is trained on real-time monitoring data, so that node classification of the complex network is realized and a reliable basis is provided for accurately describing the coupling network of the complex electromechanical system.
Furthermore, the method can reliably reflect the community relation evolution process of the coupling network.
Further, the multi-head attention mechanism adopted by the invention can balance the connection among nodes and the weight among the nodes, is a comprehensive measurement of the node neighbor connection and the weight, does not need expensive matrix operation, and can be parallelized among all the nodes in the graph.
The invention adopts the graph attention neural network, a typical variant of the graph neural network that introduces an attention mechanism. The core of the attention mechanism is to assign weights to the given information; information with a high weight is what the system should focus on. Open-source code for graph attention neural networks is available on GitHub and only needs to be adjusted and modified to suit the task at hand.
Drawings
FIG. 1 is a partial historical monitoring data time series diagram.
Fig. 2 is a flowchart of a complex network node classification method based on a graph attention network.
Fig. 3 is a diagram of coupling correlation coefficients between the node 3 and each node.
Fig. 4 is a diagram of coupling correlation coefficients between the node 19 and each node.
Fig. 5 is an undirected weighted network based on coupling correlation coefficients, in which the edge weights are listed separately.
Fig. 6 is a diagram of the complex network community structure obtained after the initial modularity-based division.
Fig. 7 is a graph of the loss function change and the accuracy change during the training process of the graph attention neural network.
Fig. 8 is a graph of the accuracy change during the test process of the graph attention neural network.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
The invention discloses a complex network node classification method based on a graph neural network. Aimed at the difficult problem of community division in the coupling network of a complex electromechanical system, it classifies the complex network nodes by applying a Graph Attention Network with a multi-head attention mechanism, thereby completing the community division. First, Detrended Coupling Correlation Analysis (DCCA) is applied to calculate the correlation among the monitoring variable nodes; the correlation coefficients are screened a first time by introducing Gaussian noise and a second time by introducing a scale index (α). Second, the monitoring variables are taken as network nodes and the correlation coefficients are converted into edge weights to construct an undirected weighted complex network. Third, starting from a static community detection algorithm (BGLL) based on global modularity optimization of the modularity gain, each node in the network is taken as its own partition, the gain of a neighbor node with respect to the modularity of the current community is calculated from the modularity function (Jaccard similarity algorithm), and the community to which each node belongs is decided by that gain, giving the initial partition of the network nodes. The communities of this initial partition are then treated as nodes and a new iteration is run on the new network; when the modularity reaches its maximum, the optimal community partition of the network is obtained and used as the initial training label of the graph attention neural network. Finally, the graph attention neural network is trained on real-time monitoring data, so that node classification of the complex network is realized. The method realizes dynamic classification of the complex network nodes and provides a new idea for community division.
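The gain-driven node moves described above can be illustrated with a small sketch of the first (local moving) phase of a BGLL-style pass. This is a simplified illustration under stated assumptions, not the patent's implementation: the names `modularity` and `local_moving` are hypothetical, the gain is obtained by recomputing the standard weighted modularity for each trial move (clear but slow), and the second phase that merges communities into super-nodes is omitted.

```python
import numpy as np


def modularity(w, labels):
    """Standard weighted modularity of a partition; w is a symmetric weight matrix."""
    two_w = w.sum()                      # every undirected edge is counted twice
    s = w.sum(axis=1)                    # node strengths
    same = labels[:, None] == labels[None, :]
    return ((w - np.outer(s, s) / two_w) * same).sum() / two_w


def local_moving(w, max_sweeps=10, tol=1e-12):
    """Start with one community per node; move nodes while modularity increases."""
    w = np.asarray(w, dtype=float)
    labels = np.arange(len(w))
    for _ in range(max_sweeps):
        moved = False
        for i in range(len(w)):
            best_label, best_q = labels[i], modularity(w, labels)
            for j in np.nonzero(w[i])[0]:        # only communities of neighbours
                trial = labels.copy()
                trial[i] = labels[j]
                q = modularity(w, trial)
                if q > best_q + tol:             # keep only strictly positive gains
                    best_label, best_q = labels[j], q
            if best_label != labels[i]:
                labels[i] = best_label
                moved = True
        if not moved:
            break
    return labels


# Two triangles joined by one weak edge (weight 0.1).
w = np.zeros((6, 6))
for u, v, wt in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                 (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    w[u, v] = w[v, u] = wt
communities = local_moving(w)
```

On this toy network the two triangles end up as separate communities. A production implementation would use the incremental gain formula instead of recomputing the modularity for every trial move.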
As shown in fig. 2, a method for classifying complex network nodes based on a graph attention network includes the following steps:
step 1), defining monitoring variables corresponding to monitoring point positions, selecting a variable set of a monitoring target of a complex electromechanical system to be analyzed, and acquiring a multi-dimensional historical monitoring time sequence of a complex electromechanical system sample from the variable set through a DCS (distributed control system);
step 2), establishing a weighting network model capable of representing the interaction dynamics of the bottom layer of the system by taking the monitoring variables in the multidimensional monitoring sequence as nodes, the coupling relationship as edges and the magnitude of the coupling coefficient as the weight of the edges;
step 3), converting the weighting network model, and finally obtaining an initial training set feature vector;
step 4), carrying out community division on the weighting network based on modularity, and finally obtaining an initial One-hot type training set label;
step 5), taking the initial training set feature vector and the training set label as input, and training based on the GAT graph attention neural network to obtain a node classification result;
step 6), selecting real-time monitoring data to obtain a real-time training set feature vector, and training with the feature vector as the input of the GAT (graph attention network) neural network to finally complete the classification of the nodes;
the sampling frequency of the multi-dimensional monitoring sequence needs to be set according to the sampling cost and the monitoring precision, the length of a sample is set, and a monitoring data set is obtained from historical data of the system operation process.
For the construction of a weighting network model capable of representing the interaction dynamics of the system bottom layer, a complex network model needs to be respectively constructed by using historical monitoring data and real-time monitoring data of system monitoring variables.
Establishing the weighted network model that characterizes the bottom-layer interaction dynamics of the system mainly comprises the following three steps:
(1) calculating the coupling relation among all monitoring time sequences based on DCCA detrending correlation analysis, and introducing a Gaussian noise sequence as the comparison of coupling correlation coefficients;
(2) setting the coupling correlation coefficient which is greater than or equal to the lower threshold limit as 0 based on the scale index alpha as the lower threshold limit of the correlation coefficient to obtain an updated correlation coefficient table and further obtain a correlation relationship matrix;
(3) constructing a weighting network representing the bottom layer interaction dynamics of the system based on the correlation matrix;
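Step (3) can be sketched as follows, assuming the correlation matrix has already been screened so that discarded coefficients are 0. The function name `build_weighted_edges`, the variable names in `names`, and the sample matrix are hypothetical; the magnitude of each remaining coefficient is used as the edge weight.

```python
def build_weighted_edges(corr, names):
    """Return {(u, v): weight} for every non-zero off-diagonal coefficient.

    corr:  symmetric matrix (list of lists) of screened correlation coefficients.
    names: node names, one per monitoring variable (illustrative labels).
    """
    edges = {}
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):      # upper triangle only: the network is undirected
            weight = abs(corr[i][j])   # edge weight = magnitude of the coefficient
            if weight > 0.0:
                edges[(names[i], names[j])] = weight
    return edges


screened = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, -0.3],
    [0.0, -0.3, 1.0],
]
edges = build_weighted_edges(screened, ["P1", "T1", "F1"])
```

The diagonal is skipped so that no self-loops are created, and each undirected edge is stored once.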
Further, the DCCA detrended correlation analysis minimizes the influence of external trends on the cross-correlation by calculating a detrended covariance function based on random walk theory. In the fluctuation analysis of a sequence, the total displacement of the estimated random walk process from time 1 to time i is defined as:
Figure BDA0002705088820000091
wherein
Figure BDA0002705088820000092
The detrending covariance function is defined as:
Figure BDA0002705088820000093
Further, to obtain the scale index α, a scatter plot is made in log-log coordinates (s, F_DCCA(s)) with log(s) as the explanatory variable and log(F_DCCA(s)) as the explained variable; the data are fitted by least squares, and the slope of the straight-line part is defined as the scale index α;
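The least-squares fit that defines α can be sketched with NumPy. The function name `scaling_exponent` and the synthetic power-law data are assumptions for illustration; in practice only the box sizes in the straight-line part of the plot would be passed in.

```python
import numpy as np


def scaling_exponent(scales, fluctuations):
    """Slope of log(F_DCCA(s)) against log(s), fitted by least squares."""
    log_s = np.log(np.asarray(scales, dtype=float))
    log_f = np.log(np.asarray(fluctuations, dtype=float))
    slope, _intercept = np.polyfit(log_s, log_f, 1)  # degree-1 polynomial fit
    return slope


# Synthetic check: F(s) = 2 * s**0.7 is an exact power law with exponent 0.7.
scales = np.array([4, 8, 16, 32, 64])
alpha = scaling_exponent(scales, 2.0 * scales ** 0.7)
```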
converting the weighting network model to obtain an initial training set feature vector, comprising the following two steps:
(1) converting the weighted network model into an adjacency matrix representing the connection relation between the nodes, where the number of non-zero elements in the ith row (or ith column) of the adjacency matrix is exactly the degree of the ith vertex;
further, the degree in the weighted network model is defined as follows: let the network G = (V, E) be an undirected network formed by |V| = N nodes and |E| = M edges; the degree, i.e. the number of neighboring nodes of a node, is expressed as:
Figure BDA0002705088820000094
where
Figure BDA0002705088820000095
the degree index reflects the ability of a node to establish direct contact with the surrounding nodes;
(2) converting the adjacency matrix representing the connection relation between the nodes into a sparse matrix;
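The two conversion steps can be illustrated with NumPy alone. The sparse form used here is the coordinate (row, column, value) layout, which is an assumption since the text does not name a sparse format; `degrees` and `to_coo` are hypothetical helper names.

```python
import numpy as np


def degrees(adj):
    """Degree of node i = number of non-zero entries in row i of the adjacency matrix."""
    return np.count_nonzero(np.asarray(adj), axis=1)


def to_coo(adj):
    """Coordinate-format sparse view: arrays of row indices, column indices and values."""
    a = np.asarray(adj, dtype=float)
    rows, cols = np.nonzero(a)         # positions of the stored (non-zero) weights
    return rows, cols, a[rows, cols]


adjacency = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.3],
    [0.0, 0.3, 0.0],
]
deg = degrees(adjacency)
rows, cols, vals = to_coo(adjacency)
```

Only the non-zero weights are kept, which matters when the network has many nodes but few edges per node.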
the modularity is an important measure index reflecting the community structure characteristics of a complex network, the value range of the modularity is between 0 and 1, the larger the modularity is, the more obvious the current community structure is, the internal connection of the community is close, and the external connection is sparse; for a weighted network, the modularity is defined as:
Figure BDA0002705088820000101
in the formula: w_ij is the weight of the edge connecting node i and node j; W is the sum of the network edge weights; s_i is the sum of the weights of the edges connected to node i; C_i is the community in which node i is located; when C_i and C_j are the same, δ(C_i, C_j) = 1, otherwise 0.
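Since the equation image is not reproduced here, the sketch below assumes the standard weighted modularity, Q = (1/2W) Σ_ij [w_ij − s_i s_j / (2W)] δ(C_i, C_j), which uses the same symbols as the definitions above. The function name `weighted_modularity` and the toy network are illustrative.

```python
import numpy as np


def weighted_modularity(w, communities):
    """Weighted modularity of a partition.

    w:           symmetric matrix of edge weights w_ij (zero where there is no edge).
    communities: community id C_i of every node.
    """
    w = np.asarray(w, dtype=float)
    c = np.asarray(communities)
    two_w = w.sum()                      # the symmetric matrix sums to 2W
    s = w.sum(axis=1)                    # s_i: total weight of edges at node i
    delta = c[:, None] == c[None, :]     # delta(C_i, C_j)
    return float(((w - np.outer(s, s) / two_w) * delta).sum() / two_w)


# Two disconnected edges 0-1 and 2-3, each of weight 1.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
q_split = weighted_modularity(w, [0, 0, 1, 1])   # each edge is its own community
q_merged = weighted_modularity(w, [0, 0, 0, 0])  # everything in one community
```

The natural two-community split scores higher than lumping all nodes together, which is the behaviour the community division relies on.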
Furthermore, the graph attention neural network is a new convolution-type neural network that operates on graph-structured data by using masked self-attention layers; neighbor nodes are aggregated through an attention mechanism, which realizes adaptive assignment of different weights to different neighbors. Let the feature vector of an arbitrary node v_i at the l-th layer be h_i,
Figure BDA0002705088820000102
where d^(l) represents the feature length of the node. After an aggregation operation with the attention mechanism at its core, a new feature vector h_i′ is output for each node,
Figure BDA0002705088820000103
where d^(l+1) represents the length of the output feature vector.
Further, the correlation degrees calculated by all the neighbors are subjected to unified normalization processing, wherein the specific form is softmax normalization:
Figure BDA0002705088820000104
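The neighbour-wise softmax normalization can be sketched as below. The function name `neighbour_softmax` is hypothetical, the raw correlation scores are taken as given, and the sketch assumes every node has at least one neighbour in `adj` (for example through a self-loop), since an empty row cannot be normalized.

```python
import numpy as np


def neighbour_softmax(scores, adj):
    """Softmax of scores[i, j] over the neighbours j of each node i."""
    scores = np.asarray(scores, dtype=float)
    mask = np.asarray(adj) != 0
    masked = np.where(mask, scores, -np.inf)               # non-neighbours get zero weight
    masked = masked - masked.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)


scores = [[0.0, 0.0, 0.0],
          [1.0, 2.0, 3.0],
          [0.0, 0.0, 0.0]]
adj = [[1, 1, 0],
       [1, 1, 1],
       [0, 1, 1]]
attention = neighbour_softmax(scores, adj)
```

Each row of the result sums to 1, and entries for node pairs that are not connected are exactly 0.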
Furthermore, the attention mechanism is a multi-head attention mechanism: K independent attention mechanisms are invoked and their output results are concatenated. The multi-head attention mechanism is defined as:
Figure BDA0002705088820000105
where || represents the concatenation operation,
Figure BDA0002705088820000106
is the weight coefficient calculated by the k-th attention mechanism, and W^(k) is the corresponding learnable parameter.
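A forward pass of such a multi-head layer can be sketched in NumPy. This follows the commonly published graph attention formulation rather than the patent's equation images, so the scoring function (LeakyReLU of a learned vector applied to the concatenated projected features), the tanh output nonlinearity, and the names `gat_head` and `multi_head_gat` are assumptions. The adjacency matrix is expected to include self-loops.

```python
import numpy as np


def gat_head(h, adj, W, a, slope=0.2):
    """One attention head: project, score neighbour pairs, normalize, aggregate."""
    z = h @ W                                            # projected features, shape (N, d_out)
    d = z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into a source and a target part.
    e = (z @ a[:d])[:, None] + (z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(adj != 0, e, -np.inf)                   # masked attention: neighbours only
    e -= e.max(axis=1, keepdims=True)
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)            # softmax over neighbours
    return np.tanh(alpha @ z)                            # tanh is an illustrative choice


def multi_head_gat(h, adj, heads):
    """Run K independent heads and concatenate their outputs along the feature axis."""
    return np.concatenate([gat_head(h, adj, W, a) for W, a in heads], axis=1)


rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))                       # 5 nodes, 3 input features
heads = [(rng.normal(size=(3, 2)), rng.normal(size=4)) for _ in range(3)]
ring = np.eye(5) + np.roll(np.eye(5), 1, axis=1) + np.roll(np.eye(5), -1, axis=1)
out = multi_head_gat(features, ring, heads)
```

With K = 3 heads of output length 2, every node receives a concatenated feature vector of length 6. Training the parameters `W` and `a` by gradient descent is outside the scope of this sketch.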
Further, the input of the graph attention neural network in the step 6) is the feature vector corresponding to the real-time monitoring data, and the process is a label-free training process.
The invention is a complex network node classification method based on a graph attention network, aimed at the difficult problem of community division in the coupling network of a complex electromechanical system. First, detrended coupling correlation analysis is applied to calculate the correlation among all monitoring variable nodes; the correlation coefficients are screened a first time by introducing Gaussian noise and a second time by introducing a scale index. Second, the monitoring variables are taken as network nodes and the correlation coefficients are converted into edge weights to construct an undirected weighted complex network. Third, starting from a static community detection algorithm based on global modularity optimization of the modularity gain, each node in the network is taken as its own partition, the gain of a neighbor node with respect to the modularity of the current community is calculated from the modularity function, and the community to which each node belongs is decided by that gain, giving the initial partition of the network nodes. The communities of this first partition are then treated as nodes and a new iteration is run on the new network; when the modularity reaches its maximum, the optimal community partition of the network is obtained and used as the initial training label of the graph attention neural network. Finally, the graph attention neural network is trained on real-time monitoring data, so that node classification of the complex network is realized. The method realizes dynamic classification of the complex network nodes and provides a new idea for community division.
Furthermore, verification with actual in-service monitoring data from the compressor unit of a chemical enterprise shows that the method can accurately classify complex network nodes and significantly reduce the difficulty of analyzing the system as a whole. This improves the accuracy of local coupling analysis of the system and provides a reliable basis for production optimization, fault tracing and the like in the complex network.
Example:
In this example, the 37 monitoring variables in Table 1 that belong to the community with a clear relationship to turbine faults were selected as the basis for modeling the system network (a 38th variable is the introduced Gaussian noise). The data set is derived from the monitoring data of an enterprise DCS system with a sampling frequency of 1/60 Hz. Samples collected over 5 consecutive months with the compressor unit steam turbine in its normal service state were selected to verify the method:
Step one: Undirected weighted network model establishment based on coupled detrending correlation analysis
The coupling relationships among the historical monitoring time series are calculated based on DCCA detrended cross-correlation analysis, and a Gaussian noise sequence is introduced as a baseline for comparing the coupling correlation coefficients. A scale index alpha is introduced as the lower threshold of the correlation coefficient, and coupling correlation coefficients smaller than the lower threshold are set to 0, which gives an updated correlation coefficient table and, from it, a correlation matrix. A weighted network model that characterizes the underlying interaction dynamics of the system is then established, with the monitoring variables in the multidimensional monitoring sequences as nodes, the coupling relationships as edges and the coupling coefficients as the edge weights.
Step two: system coupling network community division based on Jaccard similarity algorithm
In the first stage, node community division is carried out according to the importance of the nodes by applying a modularity optimization algorithm. Each community from the first-stage division is then regarded as a node, and the community attribution of the nodes continues to be judged until the modularity of the network community division reaches its maximum. Finally, the overlapping structure of the communities is judged from the division result based on the Jaccard coefficient.
Step three: training set feature and label extraction for weighting networks
The weighted network model is converted into an adjacency matrix and then into a sparse matrix, which finally gives the initial training set feature vectors; the initial One-hot training set labels are obtained from the modularity-based community division result.
Step four: training and testing based on graph attention network node classification method
Historical monitoring data is selected and, based on the three steps above, the labeled training set data is used as input to train the graph attention neural network. Real-time monitoring data is then selected, the unlabeled test set data is used as input to the GAT graph attention neural network for training, and node classification is finally completed.
1. Selection of system characteristic variables and description thereof
The variables used in this example and their descriptions are shown in Table 1. The variables include both process variables and equipment monitoring variables, because there is always a certain correlation between the two: the process variables reflect the service state of the equipment to a certain extent, and the equipment monitoring variables reflect the adjustment and fluctuation of the process to a certain extent, as shown in Fig. 1. Together, such variables support a more comprehensive correlation analysis of the whole system.
TABLE 1 compressor set turbine monitoring variables
Figure BDA0002705088820000131
2. Undirected weighted network model establishment based on coupled detrending correlation analysis
The coupling relationships among the historical monitoring time series, calculated based on DCCA detrended cross-correlation analysis, are shown in Table 2; a Gaussian noise sequence is introduced as the 38th variable to serve as a baseline for the coupling correlation coefficients. As shown in Table 2, the absolute values of the coupling correlation coefficients between the monitored variables exceed those of the coefficients with the noise variable, which indicates that a certain coupling correlation exists between the monitored variables.
TABLE 2 correlation coefficient Table
Figure BDA0002705088820000132
Figure BDA0002705088820000141
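The coupling correlation coefficient used for Table 2 can be illustrated with a minimal sketch. It assumes the commonly used form of the DCCA coefficient (integrated profiles, overlapping boxes, linear detrending inside each box); the function name `dcca_coefficient`, the box size `s` and the use of NumPy are illustrative choices and are not taken from the patent.

```python
import numpy as np

def dcca_coefficient(x, y, s):
    """Detrended cross-correlation coefficient of two series at box size s."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integrated profiles: cumulative displacement of each mean-removed series.
    rx = np.cumsum(x - x.mean())
    ry = np.cumsum(y - y.mean())
    t = np.arange(s + 1)
    f_xy = f_xx = f_yy = 0.0
    # Overlapping boxes of s + 1 points, each detrended by a least-squares line.
    for start in range(len(rx) - s):
        bx = rx[start:start + s + 1]
        by = ry[start:start + s + 1]
        ex = bx - np.polyval(np.polyfit(t, bx, 1), t)
        ey = by - np.polyval(np.polyfit(t, by, 1), t)
        f_xy += np.mean(ex * ey)   # detrended covariance of the box
        f_xx += np.mean(ex * ex)   # detrended variance of x
        f_yy += np.mean(ey * ey)   # detrended variance of y
    # Common normalisation factors cancel in the ratio.
    return f_xy / np.sqrt(f_xx * f_yy)
```

Applying the same function to a monitored series and a Gaussian noise series gives the kind of baseline against which the coefficients between monitored variables are compared.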
As shown in Figs. 3 and 4, taking node 3 and node 19 as examples, the coupling correlation between nodes differs greatly. A scale index alpha is introduced as the lower threshold of the correlation coefficient: coupling correlation coefficients smaller than the lower threshold are set to 0 and those greater than or equal to it are set to 1, which gives the updated correlation coefficient table and, from it, the following correlation matrix:
Figure BDA0002705088820000142
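The claims describe the scale index alpha as the slope of the straight-line part of a least-squares fit of log(F_DCCA(s)) against log(s). A minimal sketch of that fit follows; it assumes the fluctuation values F_DCCA(s) have already been computed and are positive, and that the caller has already restricted the scales to the straight-line part. The name `scaling_exponent` is hypothetical.

```python
import numpy as np

def scaling_exponent(scales, fluctuations):
    """Slope of log F_DCCA(s) against log s, fitted by least squares."""
    log_s = np.log(np.asarray(scales, dtype=float))
    log_f = np.log(np.asarray(fluctuations, dtype=float))
    # A degree-1 polynomial fit returns (slope, intercept).
    slope, _intercept = np.polyfit(log_s, log_f, 1)
    return float(slope)
```

For a pure power law F(s) = c * s^alpha the fit recovers alpha regardless of the constant c, which is why the slope rather than the intercept is used as the index.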
A weighted network model that characterizes the underlying interaction dynamics of the system is established by taking the monitoring variables in the multidimensional monitoring sequence as nodes, the coupling relationships as edges and the coupling coefficients as the edge weights. As shown in Fig. 4, the size of a node circle represents its degree, and the edge weight is the coupling correlation coefficient between the nodes. The established weighted network model is shown in Fig. 5, part of the network edge weights are listed in Table 3 below, and the network statistical characteristic values are listed in Table 4 below.
TABLE 3 network part edge weights
Figure BDA0002705088820000143
Figure BDA0002705088820000151
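The thresholding and network construction described in this section can be sketched as follows. This is an illustration under stated assumptions: the coefficient matrix is symmetric, the comparison uses absolute values (as in the comparison with the noise variable), and self-loops are dropped. The function name `build_weighted_network` and its return layout are hypothetical.

```python
import numpy as np

def build_weighted_network(rho, threshold):
    """Return (relation, weights, degree) from a symmetric coefficient matrix."""
    strength = np.abs(np.asarray(rho, dtype=float))  # new array, input untouched
    np.fill_diagonal(strength, 0.0)                  # assumption: no self-loops
    # Correlation relation matrix: 1 at or above the threshold, 0 below it.
    relation = (strength >= threshold).astype(int)
    # Edge weight is the coupling coefficient of every retained edge.
    weights = np.where(relation == 1, strength, 0.0)
    # Degree of node i: number of non-zero entries in row i.
    degree = relation.sum(axis=1)
    return relation, weights, degree
```

The `degree` vector corresponds to the node-circle sizes in the network figure, and `weights` to the edge weights listed in Table 3.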
3. System coupling network community division based on Jaccard similarity algorithm
In the first stage, node community division is carried out according to the importance of the nodes by applying a modularity optimization algorithm. Each community from the first-stage division is then regarded as a node, and the community attribution of the nodes continues to be judged until the modularity of the network community division reaches its maximum. Finally, the overlapping structure of the communities is judged from the division result based on the Jaccard coefficient.
TABLE 4 network statistics
Figure BDA0002705088820000152
Figure BDA0002705088820000161
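The text does not state which sets are compared when the Jaccard coefficient is used to judge overlapping structure. The sketch below shows the coefficient itself and one possible use, under the assumption that a node's neighbor set is compared with the member set of each community; `overlap_candidates` and `min_similarity` are hypothetical names introduced only for illustration.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two node sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_candidates(neighbours, communities, min_similarity):
    """Nodes whose neighbour set resembles at least two communities."""
    candidates = []
    for node, adjacent in neighbours.items():
        hits = [c for c in communities if jaccard(adjacent, c) >= min_similarity]
        if len(hits) >= 2:
            candidates.append(node)
    return candidates
```

A node that bridges two communities has neighbors in both, so its neighbor set scores above the threshold for both and it is returned as a candidate for the overlapping structure.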
The resulting modularity-based division is shown in Fig. 6. After community division the network structure appears more regular, and the edge weights represent how tightly the interacting entities in the network are connected. The divided network therefore reflects its own structure and effectiveness, and provides a reliable basis for subsequent production optimization and fault tracing of the network.
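The quantity that the community division maximizes can be sketched as below. The sketch assumes the standard weighted modularity, Q = (1 / 2W) * sum over i, j of (w_ij - s_i * s_j / 2W) * delta(C_i, C_j), which matches the symbols defined in the claims; it only evaluates a given partition and does not implement the iterative gain-based optimization. The function name is illustrative.

```python
import numpy as np

def weighted_modularity(weights, labels):
    """Weighted modularity Q of a partition of an undirected network."""
    w = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    total = w.sum() / 2.0                       # W: every edge counted once
    strength = w.sum(axis=1)                    # s_i: summed weight at node i
    same = labels[:, None] == labels[None, :]   # delta(C_i, C_j)
    expected = np.outer(strength, strength) / (2.0 * total)
    return float(((w - expected) * same).sum() / (2.0 * total))
```

Evaluating this function before and after moving a node into a neighboring community gives the modularity gain on which the node's community attribution is decided.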
4. Training set feature and label extraction for weighting networks
The weighted network model is converted into an adjacency matrix and then into a sparse matrix, with the following result:
Figure BDA0002705088820000171
The initial training set feature vectors are then obtained. At the same time, the initial One-hot training set labels are obtained from the modularity-based community division result of step 3; part of the training set labels are shown in Table 5 below:
Figure BDA0002705088820000172
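The two conversions in this section can be sketched as follows. The sparse layout is not specified in the text, so a coordinate list (row, column, value) is assumed here; `to_coo` and `one_hot` are hypothetical helper names.

```python
import numpy as np

def to_coo(adjacency):
    """Coordinate-list (rows, cols, values) form of the non-zero entries."""
    a = np.asarray(adjacency, dtype=float)
    rows, cols = np.nonzero(a)
    return rows, cols, a[rows, cols]

def one_hot(labels):
    """One-hot label matrix with one column per distinct community."""
    classes, index = np.unique(np.asarray(labels), return_inverse=True)
    # Row i of the identity matrix is the one-hot code of class i.
    return classes, np.eye(len(classes), dtype=int)[index]
```

Each row of the one-hot matrix contains a single 1 in the column of the community assigned to that node, which is the label form used to train the classifier.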
5. Training and testing based on graph attention network node classification method
The node classification obtained from the community division in step 4 is only the result of a single static modeling pass and cannot represent the dynamic evolution of the network. The complex network therefore needs to be classified dynamically from real-time monitoring data using a graph attention neural network, so that the network evolution process is captured. First, historical monitoring data is selected and, based on the three steps above, the labeled training set data is used as input to train the graph attention neural network. The network operates on the graph structure data with masked self-attention layers: neighbor nodes are aggregated through an attention mechanism, which adaptively assigns different weights to different neighbors. After this attention-based aggregation, K groups of independent attention mechanisms are called and their outputs are spliced together to give a new feature vector for each node. As shown in Fig. 7, the loss function decreases and the accuracy increases steadily during training; from about 500 iterations onward both are basically stable and fluctuate only within a small range, so the neural network can be considered to have completed training for the node classification task. Real-time monitoring data is then selected, and the unlabeled test set data is used as input to the GAT graph attention neural network for training. As shown in Fig. 8, the accuracy on the test set reaches 93.8%, and from about 700 iterations onward it remains stably above 90%, which shows a good result on the node classification task.
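The masked, multi-head aggregation described above can be illustrated with a forward-pass sketch in NumPy. It assumes the widely used graph attention formulation (LeakyReLU scoring, softmax over each node's neighbors plus itself, ELU activation, concatenation of K heads); the function name, argument layout and activation choices are assumptions rather than details taken from the patent, and no training loop is shown.

```python
import numpy as np

def gat_layer(h, adj, w_heads, a_heads, slope=0.2):
    """Forward pass of one multi-head graph attention layer (heads concatenated)."""
    h = np.asarray(h, dtype=float)
    n = h.shape[0]
    # Mask: attention is only computed over neighbours and the node itself.
    mask = (np.asarray(adj) != 0) | np.eye(n, dtype=bool)
    outputs = []
    for w, a in zip(w_heads, a_heads):
        z = h @ w                                   # (n, f_out) transformed features
        f_out = z.shape[1]
        # Score e_ij = LeakyReLU(a^T [z_i || z_j]), split into source and target parts.
        e = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
        e = np.where(e > 0, e, slope * e)
        e = np.where(mask, e, -np.inf)              # non-neighbours get zero weight
        e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
        alpha = np.exp(e)
        alpha /= alpha.sum(axis=1, keepdims=True)   # weight coefficients per neighbour
        agg = alpha @ z                             # weighted neighbour aggregation
        outputs.append(np.where(agg > 0, agg, np.exp(np.minimum(agg, 0)) - 1))  # ELU
    return np.concatenate(outputs, axis=1)          # splice the K head outputs
```

With K heads of output width F', each node receives a new feature vector of width K * F'; a final layer would typically map this to one score per community.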
In summary, the graph neural network, currently the most popular deep learning framework, is applied to community division of complex networks. A method capable of measuring nonlinear relationships, DCCA detrended cross-correlation analysis, is selected to model the complex network, and an undirected weighted network model is established. This model is essentially graph-structured data, and training and learning on it with the graph neural network yields the community division result of the complex network. The invention can also characterize data other than structured data and mine its intrinsic regularities; the key lies in the application of the graph neural network. Most previous research on community division has been based on network statistical characteristics and has not been well combined with machine learning or deep learning.
Aimed at the difficult problem of community division in the coupling network of a complex electromechanical system, the complex network node classification method based on the graph neural network classifies complex network nodes by applying a graph neural network with a multi-head attention mechanism. This completes the community division, realizes node classification of the complex network, and provides a reliable basis for the accurate description of the coupling network of the complex electromechanical system.

Claims (8)

1. A method for classifying complex network nodes based on a graph attention network is characterized by comprising the following steps:
step 1), defining monitoring variables corresponding to monitoring point positions, selecting a variable set of a monitoring target of a complex electromechanical system to be analyzed, and acquiring a multi-dimensional historical monitoring time sequence of a complex electromechanical system sample from the variable set through a DCS (distributed control system);
step 2), establishing a weighted network model representing the underlying interaction dynamics of the system by taking the monitoring variables in the multidimensional historical monitoring time sequence obtained in the step 1) as nodes, the coupling relationships as edges and the coupling coefficients as edge weights; the establishment of the undirected weighted network model in the step 2) comprises the following steps:
(1) calculating the coupling relation among all monitoring time sequences based on DCCA detrending correlation analysis, and introducing a Gaussian noise sequence as the comparison of coupling correlation coefficients;
(2) taking the scale index alpha as the lower threshold of the correlation coefficient, setting the coupling correlation coefficients smaller than the lower threshold to 0 to obtain an updated correlation coefficient table and further obtain a correlation relationship matrix; the scale index alpha is obtained in the log-log coordinates $(s, F_{DCCA}(s))$: a scatter diagram is made with $\log(s)$ as the explanatory variable and $\log(F_{DCCA}(s))$ as the explained variable, the data are fitted by least squares, and the slope of the straight-line part is defined as the scale index alpha;
constructing a weighting network representing the bottom interactive dynamics of the system based on the correlation matrix;
step 3), converting the weighting network model established in the step 2) to obtain an initial training set characteristic vector; in the step 3), the weighting network model is converted to obtain an initial training set feature vector, and the method comprises the following two steps:
(1) converting the weighted network model into an adjacency matrix representing the connection relations between the nodes, wherein the number of non-zero elements in the i-th row or the i-th column of the adjacency matrix is exactly the degree of the i-th vertex;
(2) converting the adjacency matrix representing the connection relations between the nodes into a sparse matrix;
step 4), carrying out community division on the weighting network model established in the step 2 based on the modularity to obtain an initial One-hot type training set label;
step 5), taking the initial training set feature vector obtained in the step 3) and the training set label obtained in the step 4) as input, and training based on a GAT graph attention neural network to obtain a node classification result;
step 6), selecting real-time monitoring data to obtain a real-time training set feature vector, inputting the real-time training set feature vector into the GAT graph attention neural network that has completed the historical monitoring data training in the step 5) for training, and finally completing the classification of the nodes.
2. The method for classifying complex network nodes based on the graph attention network as claimed in claim 1, wherein the sampling frequency of the multidimensional historical monitoring time series obtained in the step 1) is set according to the sampling cost and the monitoring precision, the length of the sample is set, and the monitoring data set is obtained from the historical data of the system operation process.
3. The method as claimed in claim 2, wherein the DCCA detrending correlation analysis computing method minimizes the influence of external trends on cross-correlation by computing a detrending covariance function based on the random walk theory, and the total displacement of the estimated random walk process from time 1 to time i in the sequence fluctuation analysis is defined as:
Figure FDA0003645875400000021
wherein
Figure FDA0003645875400000022
The detrending covariance function is defined as:
Figure FDA0003645875400000023
4. The method according to claim 1, wherein the degree in the weighted network model is defined as follows: assuming that the network G = (V, E) is an undirected network composed of |V| = N nodes and |E| = M edges, the degree of a node is the number of its neighboring nodes, expressed as:
Figure FDA0003645875400000031
wherein
Figure FDA0003645875400000032
the degree index embodies the ability to establish direct contact between the node and surrounding nodes.
5. The method for classifying complex network nodes based on the graph attention network as claimed in claim 1, wherein the modularity for the weighted network model in step 4) is defined as:
Figure FDA0003645875400000033
in the formula: $w_{ij}$ is the weight of the edge connecting node i and node j; $W$ is the sum of the network edge weights; $s_i$ is the sum of the weights of the edges connected to node i; $C_i$ is the community in which node i is located; when $C_i$ and $C_j$ are the same, $\delta(C_i, C_j) = 1$, otherwise 0; the value range of the modularity is between 0 and 1.
6. The method as claimed in claim 1, wherein in step 5), the graph attention neural network is a new convolutional neural network that operates on graph structure data by using a masked self-attention layer, and performs an aggregation operation on neighbor nodes by using an attention mechanism to realize adaptive assignment of different neighbor weights.
7. The method for classifying complex network nodes based on the graph attention network as claimed in claim 6, wherein the attention mechanism is a multi-head attention mechanism, that is, K sets of independent attention mechanisms are called, and then the output results are spliced together, and the multi-head attention mechanism is defined as:
Figure FDA0003645875400000041
where | | represents a stitching operation,
Figure FDA0003645875400000042
is the weight coefficient calculated by the k-th group of attention mechanisms, and $W^{(k)}$ is the corresponding learnable parameter.
8. The method as claimed in claim 7, wherein in the step 6), the input of the graph attention neural network in the testing process is a feature vector corresponding to the real-time monitoring data, and is a label-free training process.
CN202011035811.7A 2020-09-27 2020-09-27 Complex network node classification method based on graph attention network Active CN112085124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011035811.7A CN112085124B (en) 2020-09-27 2020-09-27 Complex network node classification method based on graph attention network

Publications (2)

Publication Number Publication Date
CN112085124A CN112085124A (en) 2020-12-15
CN112085124B true CN112085124B (en) 2022-08-09

Family

ID=73739107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011035811.7A Active CN112085124B (en) 2020-09-27 2020-09-27 Complex network node classification method based on graph attention network

Country Status (1)

Country Link
CN (1) CN112085124B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610950B (en) * 2020-12-04 2023-11-07 中山大学 Graph network node representation method
CN112328505B (en) * 2021-01-04 2021-04-02 中国人民解放军国防科技大学 Method and system for improving coverage rate of fuzz test
CN112860810B (en) * 2021-02-05 2023-07-14 中国互联网络信息中心 Domain name multiple graph embedded representation method, device, electronic equipment and medium
CN112836054B (en) * 2021-03-08 2022-07-26 重庆大学 Service classification method based on symbiotic attention representation learning
CN113108916B (en) * 2021-03-17 2022-04-12 国网江西省电力有限公司电力科学研究院 Multi-point temperature fusion monitoring method based on complex network and Page Rank random walk
CN113254527B (en) * 2021-04-22 2022-04-08 杭州欧若数网科技有限公司 Optimization method of distributed storage map data, electronic device and storage medium
CN112990202B (en) * 2021-05-08 2021-08-06 中国人民解放军国防科技大学 Scene graph generation method and system based on sparse representation
CN113326880A (en) * 2021-05-31 2021-08-31 南京信息工程大学 Unsupervised image classification method based on community division
CN113449403B (en) * 2021-06-28 2023-11-10 江苏省城市规划设计研究院有限公司 Complex network node evaluation method based on hierarchical network division
CN113611366B (en) * 2021-07-26 2022-04-29 哈尔滨工业大学(深圳) Gene module mining method and device based on graph neural network and computer equipment
CN113591259B (en) * 2021-08-11 2022-05-03 华北电力大学 Heat supply pipeline dynamic equivalent modeling method
CN113807012A (en) * 2021-09-14 2021-12-17 杭州莱宸科技有限公司 Water supply network division method based on connection strengthening
CN116244284B (en) * 2022-12-30 2023-11-14 成都中轨轨道设备有限公司 Big data processing method based on three-dimensional content
CN115941501B (en) * 2023-03-08 2023-07-07 华东交通大学 Main machine equipment control method based on graphic neural network
CN117076994B (en) * 2023-10-18 2024-01-26 清华大学深圳国际研究生院 Multi-channel physiological time sequence classification method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10599686B1 (en) * 2018-09-27 2020-03-24 Babylon Partners Limited Method and system for extracting information from graphs
CN111062421A (en) * 2019-11-28 2020-04-24 国网河南省电力公司 Network node multidimensional data community division algorithm based on correlation analysis
CN111696345A (en) * 2020-05-08 2020-09-22 东南大学 Intelligent coupled large-scale data flow width learning rapid prediction algorithm based on network community detection and GCN

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109921921B (en) * 2019-01-26 2021-06-04 复旦大学 Method and device for detecting aging-stable community in time-varying network
CN111191718B (en) * 2019-12-30 2023-04-07 西安电子科技大学 Small sample SAR target identification method based on graph attention network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Optimization of text feature subsets based on GATS algorithm; Pei-pei Jiang et al.; 2009 IEEE International Symposium on IT in Medicine & Education; 2009-09-15; pp. 924-927 *
Deep learning image object detection combined with an attention mechanism; Sun Ping et al.; Computer Engineering and Applications; 2019-12-31; Vol. 55, No. 17; pp. 180-184 *

Also Published As

Publication number Publication date
CN112085124A (en) 2020-12-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant