CN113065974B - Link prediction method based on dynamic network representation learning - Google Patents
- Publication number
- CN113065974B (CN202110280461.9A)
- Authority
- CN
- China
- Prior art keywords
- network
- matrix
- dynamic network
- node
- representation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a link prediction method based on dynamic network representation learning, comprising the following steps: acquiring the adjacency matrix of a dynamic network; constructing a similarity matrix for each snapshot network by calculating similarity values between the nodes of the dynamic network; applying a graph convolutional neural network to each single snapshot network to perform feature aggregation, with the adjacency matrix and the similarity matrix guiding the aggregation process, to determine low-dimensional feature representations of the nodes; and inputting the low-dimensional feature representations of the nodes into a logistic regression classifier to obtain the link prediction result of the dynamic network. The similarity-based aggregation strategy ensures the quality of the low-dimensional node representations on the network at the current time. A mutual information maximization strategy, applied within the graph convolutional neural network, yields low-dimensional node vectors that encode the global structural information of the network, and feeding these vectors into the logistic regression classifier produces the link prediction result.
Description
Technical Field
The invention relates to the technical field of artificial intelligence and complex networks, in particular to a link prediction method based on dynamic network representation learning.
Background
Many complex relationships in the real world can be described by networks: entities are represented by the nodes of an abstract network, and the relationships between entities are described by edges. Modeling the real world with complex networks is an effective approach because it presents real-world data in a form that is easy to understand and apply, and research on complex networks is therefore receiving widespread attention. In particular, link prediction in complex networks is important for analyses such as how information is communicated and propagated between users in social networks.
Predicting the propagation of social information is one of the research hotspots in complex networks; its purpose is to predict which edges (relationships between users) may exist between nodes (users) in the network. With the progress of information technology, the scale of social networks has grown dramatically. Huge social and instant-messaging networks (such as WeChat and email) have greatly promoted the propagation and exchange of information, but they have also made it harder for the relevant departments to monitor user behavior and manage information propagation. Network information now spreads explosively and in fragmented form; without effective intervention and regulation, online public opinion can spread widely and affect social stability. Studying information diffusion and structural change in large-scale social networks therefore provides scientific data for keeping the network developing stably and has significant social application value.
Furthermore, social networks evolve over time: the interactions between users keep changing. New users join the network by registering social accounts, and users leave it by cancelling their accounts; two users who are not currently connected may establish a connection in the future; and two users who were in frequent contact may drift apart until they no longer communicate. These dynamic changes pose new requirements and challenges, for example for controlling the propagation of public-opinion information. To address these problems, an analysis based on network representation learning can be adopted, in which a dynamic network link prediction method predicts the communication that may occur between users of a social communication network. This helps not only to analyze how network information propagates but also to predict the likelihood and trend of its propagation, providing a scientific reference for public-opinion control departments when they formulate suppression schemes.
Several schemes have been proposed that predict the links likely to exist in a network by learning low-dimensional vector representations of the network. They fall into three general categories.
1. Methods based on non-negative matrix factorization. These decompose the adjacency matrix, or another information matrix of the network, into a basis matrix and a coefficient matrix. Matrix factorization can project the various attribute information of a network into a low-dimensional representation space, but it involves large-scale matrix operations, so the time overhead becomes large when the input network is large, which makes the approach hard to apply to large-scale networks.
2. Methods based on random walks. These borrow from natural language processing: a node sequence obtained by a random walk on the network is treated as a sentence and each node as a word, and word2vec is used to generate the low-dimensional node representations. Compared with non-negative matrix factorization, random-walk methods reduce the time overhead and computational cost, but they capture only the topological structure of the network and ignore its attribute information, which limits their use on attributed networks.
3. Methods based on deep learning. One implementation learns low-dimensional node representations with a graph convolutional neural network. The graph convolution operation aggregates the attribute information of neighbor nodes into the target node, the node features are updated accordingly, and the final features are output as the low-dimensional representation. Because it relies on a local strategy, this approach applies well to large-scale networks and captures the attribute feature information of the network.
However, most current network representation learning methods based on graph convolutional neural networks aggregate the features of neighborhood nodes by simple averaging, ignoring the fact that different neighborhood nodes matter differently to the target node. Furthermore, most network representation learning methods focus on static networks. Some methods do take network dynamics into account, but they cannot capture the global features of the network when its edges change, for example when edges are added, removed, or have their attributes changed.
Disclosure of Invention
The embodiment of the invention provides a link prediction method based on dynamic network representation learning, which addresses the problems described in the Background section.
The link prediction method based on dynamic network representation learning provided by the embodiment of the invention comprises the following steps:
acquiring an adjacency matrix of a dynamic network;
constructing a similarity matrix of the snapshot network by calculating similarity values between the nodes of the dynamic network;
applying a graph convolutional neural network to a single snapshot network to perform feature aggregation, and using the adjacency matrix and the similarity matrix to guide the feature aggregation process, so as to determine low-dimensional feature representations of the nodes;
and inputting the low-dimensional feature representations of the nodes into a logistic regression classifier to obtain the link prediction result of the dynamic network.
Further, constructing the similarity matrix of the snapshot network includes:
Here, $v_i$ denotes node $i$ and $v_j$ denotes node $j$; $S_{Dice\_new}(v_i, v_j)$ is the element in row $i$, column $j$ of the similarity matrix $S_{Dice\_new}$, i.e. the similarity value between nodes $v_i$ and $v_j$; $N(v_i)$ is the set of neighbor nodes of $v_i$, and $N(v_i) \cup \{v_i\}$ means that $v_i$ itself is added to its own neighbor set; $N(v_j)$ is the set of neighbor nodes of $v_j$, and $N(v_j) \cup \{v_j\}$ means that $v_j$ itself is added to its own neighbor set; $|N(v_i) \cup \{v_i\}|$ and $|N(v_j) \cup \{v_j\}|$ are the numbers of elements in the sets $N(v_i) \cup \{v_i\}$ and $N(v_j) \cup \{v_j\}$, respectively.
Further, determining the low-dimensional feature representation of the nodes includes:
Here, $H_t$ is the low-dimensional representation of the positive samples on a single snapshot network $t$, produced by the encoder; $S_{Dice\_new}$ is the similarity matrix of the snapshot network; the formula uses the elements in row $i$, column $i$ (the diagonal elements) of the corresponding matrices; ReLU is the rectified linear unit function; $A_t$ is the adjacency matrix of snapshot network $t$, and $I_N$ is the identity matrix; the formula also contains a regulating parameter; $X_t$ is the feature matrix of snapshot network $t$; and $W_t$ is the weight matrix of the graph convolutional neural network at time step $t$.
Further, the link prediction method based on dynamic network representation learning provided by the embodiment of the invention also comprises: updating the weight matrix of the graph convolutional neural network at time step $t$ with a long short-term memory network (LSTM), as follows:
$F_t = \sigma(M_F W_{t-1} + U_F W_{t-1} + Q_F)$
$I_t = \sigma(M_I W_{t-1} + U_I W_{t-1} + Q_I)$
$O_t = \sigma(M_O W_{t-1} + U_O W_{t-1} + Q_O)$
$\tilde{C}_t = \tanh(M_C W_{t-1} + U_C W_{t-1} + Q_C)$
$C_t = F_t \circ C_{t-1} + I_t \circ \tilde{C}_t$
$W_t = O_t \circ \tanh(C_t)$
Here, $M_\xi$ and $U_\xi$ are the weight matrices of the recurrent neural network and $Q_\xi$ is a bias vector, with $\xi \in \{F, I, O, C\}$; $W_{t-1}$ is the weight matrix of the graph convolutional neural network at the previous time step.
Further, the link prediction method based on dynamic network representation learning provided by the embodiment of the invention also comprises: introducing a discriminator $D$ and maximizing the mutual information between the local representation vectors of the nodes and $g_t$, so that the low-dimensional representation matrix $H_t$ captures the global structural features of the network, as follows:
Here, the local representation of node $v_i$ is the $i$-th row vector of $H_t$; $g_t$ is the global low-dimensional representation on the single snapshot network $t$; the negative-sample representation of node $v_i$ is the $i$-th row vector of the negative-sample low-dimensional representation on the single snapshot network $t$; and the discriminator $D$ assigns a score to each pair formed by a node representation and $g_t$.
Further, obtaining the negative-sample low-dimensional representation on the single snapshot network $t$ comprises:
randomly scrambling the feature matrix $X_t$ of snapshot network $t$ to form a scrambled feature matrix;
replacing $X_t$ with the scrambled feature matrix in the encoder to obtain the negative-sample low-dimensional representation.
Further, the global low-dimensional representation $g_t$ on the single snapshot network $t$ is:
Here, $g_t$ is produced by a readout function, and $\sigma$ is the sigmoid function.
Further, the discriminator $D$ is a bilinear scoring function:
Here, $B_t$ is a trainable scoring matrix.
Further, the link prediction result of the dynamic network is:
$E = \{E_1, E_2, \ldots, E_t\}$
Here, $E_t = \{e^t_{i,j}\}$, where $e^t_{i,j}$ indicates that an edge exists between node $v_i$ and node $v_j$ at time $t$, i.e. the entry in row $i$, column $j$ of the adjacency matrix $A_t$ is 1.
Compared with the prior art, the link prediction method based on dynamic network representation learning provided by the embodiment of the invention has the following beneficial effects:
The invention provides a network representation learning method that captures the dynamic characteristics of a network while considering both its topological and its attribute features, which is necessary for network-based analysis tasks such as link prediction. Specifically, a new Dice similarity matrix measures the importance between network nodes, so that the aggregation of node features is guided by how important each neighborhood node is to the target node. This similarity-based aggregation strategy ensures the quality of the low-dimensional node representations on the network at the current time. A mutual information maximization strategy, applied within the graph convolutional neural network, yields low-dimensional node vectors that encode the global structural information of the network, and feeding these vectors into a logistic regression classifier produces the link prediction result. The time-series modeling ability of a long short-term memory network (LSTM) is used to mine latent features in the dynamic network and capture its temporal feature information; concretely, the weights of the graph convolutional neural network are modeled with the LSTM. The LSTM memorizes the temporal features of the network and embeds them into the low-dimensional node representations, which helps improve the accuracy of downstream tasks such as link prediction. Updating the weight parameters of the graph convolutional network with the LSTM also reduces the number of model parameters, keeps the method efficient when there are many time steps, and improves the efficiency of representation learning on large-scale networks. The invention is suitable for undirected, attributed dynamic networks. In link prediction tests on real-world networks, the proposed scheme outperformed the other methods compared and achieved higher accuracy.
Drawings
FIG. 1 is a flowchart of a link prediction method based on dynamic network representation learning according to an embodiment of the present invention;
FIG. 2 is a detailed illustration of a link prediction method based on dynamic network representation learning according to an embodiment of the present invention;
FIG. 3 is a diagram of how the new Dice similarity matrix guides feature aggregation, according to an embodiment of the present invention;
FIG. 4 is an illustration of the LSTM updating the weight matrix of the graph convolutional network, according to an embodiment of the present invention;
FIG. 5 is a visualization of the low-dimensional network representation on a synthetic network, according to an embodiment of the present invention;
FIG. 6 is a diagram of the link prediction results on social networks, according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a link prediction method based on dynamic network representation learning, which specifically comprises the following steps:
S1: Input the adjacency matrices of the dynamic network $G = \{G_1, G_2, \ldots, G_{Ti}\}$. Table 1 shows the scale of the real network datasets tested:
Table 1. Scale of the real network datasets
Data set | Node count | Edge count | Time steps |
Email | 2029 | 39264 | 29 |
Facebook | 60730 | 607487 | 20 |
Askubuntu | 159316 | 964437 | 22 |
Here, Email is a mail forwarding network in which nodes represent individual users and an edge indicates that a message forwarding relationship exists between two users. Facebook is a post forwarding network on Facebook in which nodes represent specific users and an edge indicates that two users have forwarded each other's posts. Askubuntu is a question-and-answer network in which nodes represent users and an edge between two users indicates that they have commented on each other's posts.
S2: Determine whether the current time step $t$ is smaller than the total number of time steps $Ti$; if so, execute S3, otherwise execute S12.
S3: Construct the similarity matrix of the snapshot network at time $t$, computed as follows:
Here, $v_i$ denotes node $i$ and $v_j$ denotes node $j$; $S_{Dice\_new}(v_i, v_j)$ is the element in row $i$, column $j$ of the similarity matrix $S_{Dice\_new}$, i.e. the similarity value between nodes $v_i$ and $v_j$; $N(v_i)$ is the set of neighbor nodes of $v_i$, and $N(v_i) \cup \{v_i\}$ means that $v_i$ itself is added to its own neighbor set; $N(v_j)$ is the set of neighbor nodes of $v_j$, and $N(v_j) \cup \{v_j\}$ means that $v_j$ itself is added to its own neighbor set; $|N(v_i) \cup \{v_i\}|$ and $|N(v_j) \cup \{v_j\}|$ are the numbers of elements in the sets $N(v_i) \cup \{v_i\}$ and $N(v_j) \cup \{v_j\}$, respectively.
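The symbol definitions in S3 are consistent with a Dice coefficient computed over closed neighborhoods (each node's neighbor set plus the node itself). Under that reading, which is a reconstruction from the definitions rather than the patent's verbatim formula, the similarity would be

$$S_{Dice\_new}(v_i, v_j) = \frac{2\,\left|\left(N(v_i) \cup \{v_i\}\right) \cap \left(N(v_j) \cup \{v_j\}\right)\right|}{\left|N(v_i) \cup \{v_i\}\right| + \left|N(v_j) \cup \{v_j\}\right|}$$

A minimal Python sketch of that assumed form follows. The function name `dice_new_similarity` and the 4-node path graph are hypothetical and serve only as an illustration.

```python
def dice_new_similarity(adj):
    """Closed-neighborhood Dice similarity for an undirected graph.

    adj is a 0/1 adjacency matrix given as a list of lists.
    Assumed form: 2 * |N[i] & N[j]| / (|N[i]| + |N[j]|),
    where N[i] is the neighbor set of node i plus node i itself.
    """
    n = len(adj)
    # Closed neighborhood: neighbors of i together with i itself.
    closed = [{j for j in range(n) if adj[i][j]} | {i} for i in range(n)]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            common = len(closed[i] & closed[j])
            sim[i][j] = 2.0 * common / (len(closed[i]) + len(closed[j]))
    return sim


# Hypothetical example: path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = dice_new_similarity(A)
```

Because the closed neighborhoods are symmetric in an undirected graph, the resulting matrix is symmetric, which matches the remark made about FIG. 3.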
S4: if the current time step t=1 (i.e. the first snapshot network), S5 is executed, otherwise S6 is executed.
S5: Randomly initialize the weight matrix of the graph convolutional neural network for the current time step.
S6: Update the weight $W_t$ at time step $t$ using the LSTM, computed in the following six substeps:
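$F_t = \sigma(M_F W_{t-1} + U_F W_{t-1} + Q_F)$
$I_t = \sigma(M_I W_{t-1} + U_I W_{t-1} + Q_I)$
$O_t = \sigma(M_O W_{t-1} + U_O W_{t-1} + Q_O)$
$\tilde{C}_t = \tanh(M_C W_{t-1} + U_C W_{t-1} + Q_C)$
$C_t = F_t \circ C_{t-1} + I_t \circ \tilde{C}_t$
$W_t = O_t \circ \tanh(C_t)$
Here, $M_\xi$ and $U_\xi$ are the weight matrices of the recurrent neural network and $Q_\xi$ is a bias vector, with $\xi \in \{F, I, O, C\}$; $W_{t-1}$ is the weight matrix of the graph convolutional neural network at the previous time step.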
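The weight update of S6 can be sketched in plain Python as below. This is an illustration under assumptions, not the patent's implementation: it uses the standard LSTM cell equations (forget, input, and output gates, a candidate cell state, and a cell state $C_t$), treats each bias $Q_\xi$ as having the same shape as $W_{t-1}$, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def elementwise(f, *mats):
    """Apply f entry by entry across same-shaped matrices."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*mats)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(act, M, U, Q, W_prev):
    """act(M W_{t-1} + U W_{t-1} + Q): the shape shared by the gate equations."""
    return elementwise(lambda m, u, q: act(m + u + q),
                       matmul(M, W_prev), matmul(U, W_prev), Q)


def lstm_weight_update(W_prev, C_prev, params):
    """One step W_{t-1} -> W_t. params maps 'F', 'I', 'O', 'C' to (M, U, Q)."""
    F = gate(sigmoid, *params["F"], W_prev)          # forget gate
    I = gate(sigmoid, *params["I"], W_prev)          # input gate
    O = gate(sigmoid, *params["O"], W_prev)          # output gate
    C_cand = gate(math.tanh, *params["C"], W_prev)   # candidate cell state
    # Cell state: keep part of the old state, add part of the candidate.
    C = elementwise(lambda f, c, i, cc: f * c + i * cc, F, C_prev, I, C_cand)
    # New GCN weight matrix.
    W = elementwise(lambda o, c: o * math.tanh(c), O, C)
    return W, C


# Toy check with all parameters zero: every gate is 0.5, the candidate is 0.
zero = ([[0.0]], [[0.0]], [[0.0]])
params = {key: zero for key in "FIOC"}
W_t, C_t = lstm_weight_update([[0.3]], [[1.0]], params)
```

In the toy check, the cell state halves and the new weight is `0.5 * tanh(0.5)`, which is what the six equations give when every gate equals 0.5.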
S7: Compute the low-dimensional representation (positive samples) $H_t$ on the single snapshot network $t$ using the designed node representation encoder, as follows:
Here, $S_{Dice\_new}$ is the similarity matrix built for the snapshot network in step S3; the formula uses the elements in row $i$, column $i$ (the diagonal elements) of the corresponding matrices; ReLU is the rectified linear unit function; $A_t$ is the adjacency matrix of snapshot network $t$, and $I_N$ is the identity matrix; the formula also contains a regulating parameter; and $X_t$ is the feature matrix of snapshot network $t$.
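One way an encoder with these ingredients could look is sketched below. The exact formula of S7 is not reproduced in the text, so every choice here is an assumption: the aggregation matrix is taken as $A_t + I_N + \beta S_{Dice\_new}$ with a hypothetical regulating parameter `beta`, it is normalized symmetrically by its row sums as in a standard graph convolution layer, and the result is passed through ReLU.

```python
import math


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def encode(A, S, X, W, beta=1.0):
    """Assumed encoder: ReLU(D^-1/2 (A + I + beta*S) D^-1/2 X W).

    A: adjacency matrix, S: similarity matrix, X: node features,
    W: GCN weight matrix, beta: hypothetical regulating parameter.
    """
    n = len(A)
    # Aggregation weights: adjacency, self-loops, and scaled similarity.
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) + beta * S[i][j]
              for j in range(n)] for i in range(n)]
    # Row sums play the role of the diagonal degree matrix (always >= 1 here).
    deg = [sum(row) for row in A_hat]
    norm = [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    Z = matmul(matmul(norm, X), W)
    return [[max(0.0, z) for z in row] for row in Z]  # ReLU


# Hypothetical 2-node example.
A = [[0, 1], [1, 0]]
S = [[1.0, 1.0], [1.0, 1.0]]
X = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, -1.0], [1.0, -1.0]]
H = encode(A, S, X, W, beta=1.0)
```

Here every entry of the normalized aggregation matrix is 0.5, so each node receives the average of both feature rows before the weight matrix and ReLU are applied.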
S8: Compute the low-dimensional representation (negative samples) on the single snapshot network $t$. First, randomly scramble the original feature matrix $X_t$ to form a scrambled feature matrix; then substitute the scrambled matrix for $X_t$ and perform step S7 to obtain the negative-sample low-dimensional representation matrix.
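The scrambling in S8 can be illustrated with a short sketch. It assumes that scrambling means permuting the rows of $X_t$, so that each node receives another node's feature vector while the graph structure stays unchanged; the text does not state the granularity of the shuffle, and the function name `corrupt_features` is hypothetical.

```python
import random


def corrupt_features(X, seed=None):
    """Return a copy of X with its rows in random order (assumed row shuffle).

    The input matrix is left untouched; seed makes the shuffle reproducible.
    """
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [list(X[i]) for i in order]


X_t = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
X_neg = corrupt_features(X_t, seed=0)
```

Feeding `X_neg` through the same encoder as `X_t` would then give the negative-sample representation described in S8.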
S9: Compute the global low-dimensional representation $g_t$ on the single snapshot network $t$ by means of a readout function, specifically:
Here, the readout operates on $H_t$, and $\sigma$ is the sigmoid function.
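A common readout of this kind averages the node representations and applies the sigmoid. The sketch below assumes that form; the text names only a readout function and $\sigma$, so the averaging step and the function name `readout` are assumptions.

```python
import math


def readout(H):
    """Assumed readout: sigmoid of the column-wise mean of the rows of H."""
    n = len(H)
    dim = len(H[0])
    means = [sum(row[k] for row in H) / n for k in range(dim)]
    return [1.0 / (1.0 + math.exp(-m)) for m in means]


H_t = [[0.0, 2.0], [0.0, -2.0]]
g_t = readout(H_t)
```

For this toy input both column means are 0, so every component of `g_t` is 0.5.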
S10: Compute the cross-entropy loss, i.e. maximize the mutual information between the local node representations and $g_t$. The computation is as follows:
Here, the local representation of node $v_i$ is the $i$-th row vector of $H_t$, i.e. the low-dimensional vector representation of $v_i$; $g_t$ is the global graph representation of the network; and the discriminator $D$ assigns a score to each pair formed by a node representation and $g_t$. Specifically, the discriminator $D$ is a simple bilinear scoring function computed as follows:
Here, $B_t$ is a trainable scoring matrix.
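The bilinear discriminator and the loss of S10 can be sketched as follows. Both forms are assumptions, since the formulas are not reproduced in the text: the score is taken as the sigmoid of $h^{\top} B_t\, g_t$, and the loss as the binary cross-entropy that labels positive-sample rows 1 and negative-sample rows 0. The function names are hypothetical.

```python
import math


def bilinear_score(h, B, g):
    """Assumed discriminator: sigmoid(h^T B g)."""
    Bg = [sum(B[r][c] * g[c] for c in range(len(g))) for r in range(len(B))]
    z = sum(h[r] * Bg[r] for r in range(len(h)))
    return 1.0 / (1.0 + math.exp(-z))


def contrastive_loss(H_pos, H_neg, B, g, eps=1e-12):
    """Binary cross-entropy over positive and negative node representations.

    Minimizing this loss pushes scores of positive rows toward 1 and
    scores of negative rows toward 0.
    """
    total = 0.0
    for h in H_pos:
        total += math.log(bilinear_score(h, B, g) + eps)
    for h in H_neg:
        total += math.log(1.0 - bilinear_score(h, B, g) + eps)
    return -total / (len(H_pos) + len(H_neg))


g = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
loss_uninformed = contrastive_loss([[5.0, 0.0]], [[-5.0, 0.0]], zeros, g)
loss_separated = contrastive_loss([[5.0, 0.0]], [[-5.0, 0.0]], identity, g)
```

With a zero scoring matrix every score is 0.5 and the loss equals $\ln 2$; with a scoring matrix that separates the two rows the loss drops close to 0.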
S11: Input the resulting low-dimensional representation $H_t$ into the logistic regression classifier to obtain the link information $E_t = \{e^t_{i,j}\}$ of the next snapshot network, where $e^t_{i,j}$ indicates that an edge exists between node $v_i$ and node $v_j$ at time $t$, i.e. the entry in row $i$, column $j$ of the adjacency matrix $A_t$ is 1.
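Step S11 can be illustrated with a small logistic regression over edge features. The sketch assumes that an edge feature is the average of the two endpoint representations (the "Average" operator mentioned with FIG. 6) and trains the classifier with plain stochastic gradient descent; the embeddings, labels, hyperparameters, and function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def edge_feature(H, i, j):
    """Average of the two endpoint representations (assumed edge operator)."""
    return [(a + b) / 2.0 for a, b in zip(H[i], H[j])]


def train_logistic_regression(features, labels, lr=0.5, epochs=200):
    """Fit weights w and bias b by stochastic gradient descent on log loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def link_probability(H, i, j, w, b):
    """Predicted probability that an edge exists between nodes i and j."""
    x = edge_feature(H, i, j)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical embeddings: nodes 0 and 1 are linked, nodes 2 and 3 are not.
H = [[1.0, 1.0], [1.0, 0.8], [0.0, 0.1], [0.1, 0.0]]
pairs = [(0, 1), (2, 3)]
labels = [1, 0]
w, b = train_logistic_regression(
    [edge_feature(H, i, j) for i, j in pairs], labels)
```

A predicted probability above 0.5 for a node pair would correspond to setting $e^t_{i,j}$, i.e. predicting an edge.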
S12: Output the link prediction result $E = \{E_1, E_2, \ldots, E_t\}$ of the dynamic network.
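The control flow of steps S2 to S12 can be summarized in a short sketch. The four callables are hypothetical placeholders for the operations described above (weight initialization, LSTM weight update, encoding with the contrastive objective, and classification); only the loop structure is taken from the steps.

```python
def run_pipeline(snapshots, init_weights, update_weights, encode, classify):
    """Iterate over the snapshot networks and collect one prediction per step."""
    predictions = []
    W = None
    for t, snapshot in enumerate(snapshots, start=1):   # S2: loop over time steps
        if t == 1:
            W = init_weights(snapshot)                   # S4, S5: first snapshot
        else:
            W = update_weights(W)                        # S6: W_t from W_{t-1}
        H = encode(snapshot, W)                          # S3, S7 to S10
        predictions.append(classify(H))                  # S11
    return predictions                                   # S12


# Stub callables, used only to show the order in which the steps run.
result = run_pipeline(
    ["G1", "G2", "G3"],
    init_weights=lambda snapshot: 0,
    update_weights=lambda W: W + 1,
    encode=lambda snapshot, W: (snapshot, W),
    classify=lambda H: H,
)
```

With these stubs, `result` pairs each snapshot with the index of the weight state it was encoded with, showing that the weights are initialized once and then carried forward.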
Steps S1 to S12 are described and analyzed as follows:
FIG. 2 shows a detailed illustration of the invention. The implementation can be divided into two modules: (I) a convolutional representation learning module that operates on a single time step, and (II) a module that captures the temporal characteristics of the network. The representation learning module on a single time step consists of four parts: A) a reorganization strategy, which randomly scrambles the attribute matrix of the network to generate a new attribute matrix and thereby reorganizes the network; B) an aggregation strategy, which constructs a new Dice similarity matrix for the network and uses it to guide the feature aggregation process; C) a GCN layer, which implements the graph convolution operation and generates the low-dimensional node representation matrix $H_t$ and the weight matrix $W_t$; D) mutual information maximization, which first obtains a global vector representation $g_t$ of the network through a readout function and then, by introducing a discriminator $D$, maximizes the mutual information between the local representation vectors of the nodes and $g_t$, so that the low-dimensional representation matrix $H_t$ captures the global structural features of the network. The module that captures temporal characteristics uses a long short-term memory network (LSTM) to update the weight parameters of the convolutional network, so that information about the snapshot networks of earlier time steps is memorized and passed on to the convolution of the next snapshot network, thereby capturing the temporal feature information of the network.
FIG. 3 illustrates how the new Dice similarity matrix is constructed to guide feature aggregation, using a graph with 6 nodes as an example. The similarity value between nodes $v_1$ and $v_2$ is the element in the first row and second column of the matrix, and the similarity value between nodes $v_2$ and $v_5$ is the element in the second row and fifth column. Because the network is undirected, the resulting $S_{Dice\_new}$ is a symmetric matrix. The adjacency matrix $A$ of the network and $S_{Dice\_new}$ are then added, and each sum is used as the aggregation weight between the corresponding pair of nodes. For example, the adjacency-matrix entry for $v_1$ and $v_2$ is 1 and $S_{Dice\_new}(v_1, v_2) = 0.7$, so the aggregation weight between the two nodes is $1 + 0.7 = 1.7$.
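The addition described for FIG. 3 amounts to an entry-wise sum of the two matrices. The sketch below reproduces the $1 + 0.7 = 1.7$ example on a hypothetical 2-node excerpt; the diagonal similarity values and the function name `aggregation_weights` are assumptions made only to complete the matrices.

```python
def aggregation_weights(A, S):
    """Entry-wise sum of the adjacency matrix and the similarity matrix."""
    return [[a + s for a, s in zip(row_a, row_s)]
            for row_a, row_s in zip(A, S)]


# Excerpt for nodes v1 and v2 only; the off-diagonal 0.7 is from the example.
A = [[0, 1], [1, 0]]
S = [[1.0, 0.7], [0.7, 1.0]]
weights = aggregation_weights(A, S)
```

The off-diagonal entries of `weights` equal 1.7, the aggregation weight between $v_1$ and $v_2$ given in the example.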
FIG. 4 illustrates how the LSTM updates the weight matrix of the graph convolutional network. Specifically, the weight $W_{t-1}$ of the graph convolutional network (GCN) at the previous time step is input to the LSTM, which outputs the weight $W_t$ of the GCN for the next time step, i.e. $W_t = \mathrm{LSTM}(W_{t-1})$.
FIG. 5 shows a visualization of the low-dimensional network representation of the invention on a synthetic network. The synthetic network is generated with the SYN-Event benchmark generator using the parameter $\mu = 0.15$, where $\mu$ controls how clearly defined the community structure of the generated network is: the larger $\mu$, the less clear the community structure. As the figure shows, the method of the invention (DGCN) projects similar nodes into neighboring regions of the two-dimensional space, whereas the other methods do not place similar nodes close together and the boundaries between groups are unclear.
FIG. 6 shows the link prediction results of the invention on three real social networks. The abscissa is the proportion of edges used to train the logistic regression classifier. "Average" in the ordinate label denotes the operation applied to the two endpoint representations of a link (an edge between nodes) when the low-dimensional representations are input to the logistic regression classifier. The link prediction results are measured by the area under the ROC curve (AUC); the higher the AUC value, the more accurate the predicted links. FIG. 6 shows that the proposed method (DGCN) outperforms the other methods on all networks.
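The AUC used as the metric here equals the probability that a randomly chosen existing edge receives a higher score than a randomly chosen non-edge, with ties counted as one half. A minimal sketch of that definition follows; the labels and scores are made-up values for illustration, not results from the patent.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: two true edges and two non-edges with predicted scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Of the four positive-negative pairs in this example, three are ranked correctly, so the AUC is 0.75.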
In summary, the network representation learning algorithm related to the invention is an unsupervised learning method based on a graph neural network, and the feature representation of the target node can be updated by aggregating the features among neighboring nodes in the network, so as to capture the structural features and attribute features of the network. But has the disadvantage that the importance of different neighborhood nodes to the target node is difficult to distinguish in the feature aggregation process. Therefore, the invention provides a new Dice similarity matrix to measure the importance among the nodes, and guides the aggregation process of the node characteristics through the importance, so that the nodes can carry out preference aggregation based on the importance degree of the neighborhood nodes to the nodes in the generation process of the characteristic representation. In addition, in order to capture the dynamic characteristics of the network, the network representation learning method provided by the invention utilizes LSTM to memorize and update the weight information of the convolutional neural network. Finally, the method is applied to the message propagation prediction of the social network, and message propagation events possibly existing among social users are successfully predicted, so that scientific basis is provided for the network public opinion manager to formulate a public opinion propagation inhibition scheme.
The foregoing discloses only a few specific embodiments of the invention, but the embodiments of the invention are not limited thereto. Those skilled in the art may make various changes and modifications to the embodiments without departing from the spirit and scope of the invention, and any changes conceivable to those skilled in the art shall fall within the scope of protection of the invention.
Claims (6)
1. A method for link prediction based on dynamic network representation learning, comprising:
acquiring an adjacency matrix of a dynamic network; wherein the dynamic network comprises: an Email forwarding network, in which nodes represent individual users and a connecting edge represents a message forwarding relationship between two users; the dynamic network further comprises: a Facebook post forwarding network, in which a node represents a specific user and a connecting edge represents a post forwarding relationship between two users; the dynamic network further comprises: the AskUbuntu question-and-answer network, in which nodes represent users and a connecting edge between two users represents a mutual comment relationship;
constructing a similarity matrix of the dynamic network by calculating similarity values among the nodes of the dynamic network; the similarity matrix is as follows:
wherein v_i denotes node i and v_j denotes node j; S_Dice_new(v_i, v_j) is the element in row i and column j of the similarity matrix S_Dice_new, i.e., the similarity value of node v_i and node v_j; N(v_i) denotes the set of neighbor nodes of node v_i, and N(v_i) ∪ {v_i} indicates that node v_i itself is also added to its own set of neighbor nodes; N(v_j) denotes the set of neighbor nodes of node v_j, and N(v_j) ∪ {v_j} indicates that node v_j itself is also added to its own set of neighbor nodes; |N(v_i) ∪ {v_i}| denotes the number of elements in the set N(v_i) ∪ {v_i}; and |N(v_j) ∪ {v_j}| denotes the number of elements in the set N(v_j) ∪ {v_j};
applying the graph convolution neural network to a single dynamic network to perform feature aggregation, and utilizing an adjacency matrix and a similarity matrix to guide a feature aggregation process to determine low-dimensional feature representation of the node; the low-dimensional characteristics of the nodes are expressed as:
wherein H_t is the low-dimensional representation of the positive samples on a single dynamic network t; p^(t) is an encoder; S_Dice_new is the similarity matrix of the dynamic network; […] and […] denote the elements in row i and column i of the matrix […]; ReLU is the ReLU function; A_t is the adjacency matrix of the dynamic network t; I_N is the identity matrix; […] is a regulating parameter and […]; X_t is the feature matrix of the dynamic network t; and W_t is the weight matrix of the convolutional neural network at time step t;
inputting the low-dimensional characteristic representation of the node into a logistic regression classifier to obtain a link prediction result of the dynamic network; the link prediction result E of the dynamic network is as follows:
E = {E_1, E_2, ..., E_t}
wherein E_t = {e^t_{i,j}}, and e^t_{i,j} indicates that an edge exists between node v_i and node v_j at time t, i.e., the element in row i and column j of the adjacency matrix A_t is 1.
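The definition of E_t can be illustrated with a short Python sketch that reads the edge set of each snapshot from its adjacency matrix. The function names and the two toy snapshots are assumptions for illustration, and indices are 0-based as is usual in Python.

```python
from typing import List, Sequence, Set, Tuple

Matrix = Sequence[Sequence[int]]


def edge_set(adjacency: Matrix) -> Set[Tuple[int, int]]:
    """Return {(i, j)} for every entry of the adjacency matrix that equals 1."""
    return {
        (i, j)
        for i, row in enumerate(adjacency)
        for j, value in enumerate(row)
        if value == 1
    }


def edge_sets(snapshots: Sequence[Matrix]) -> List[Set[Tuple[int, int]]]:
    """Return E = [E_1, ..., E_t], one edge set per snapshot."""
    return [edge_set(a) for a in snapshots]


# Two toy snapshots of a 3-node undirected network.
snapshots = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
]
E = edge_sets(snapshots)
```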
2. The dynamic network representation learning-based link prediction method of claim 1, further comprising: updating the weight matrix W_t of the convolutional neural network at time step t by means of a long short-term memory (LSTM) network, specifically:
F_t = σ(M_F W_{t-1} + U_F W_{t-1} + Q_F)
I_t = σ(M_I W_{t-1} + U_I W_{t-1} + Q_I)
O_t = σ(M_O W_{t-1} + U_O W_{t-1} + Q_O)
W_t = O_t tanh(C_t)
wherein M_ξ and U_ξ are the weight matrices of the recurrent neural network, Q_ξ is a bias vector, and ξ ∈ {F, I, O, C}; W_{t-1} is the weight matrix of the convolutional neural network at the previous time step.
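The weight update of claim 2 can be sketched in plain Python as follows. The three gate equations follow the text above. The candidate state and the cell-state update do not appear in the text, so the sketch assumes the standard LSTM forms C̃_t = tanh(M_C W_{t-1} + U_C W_{t-1} + Q_C) and C_t = F_t ∘ C_{t-1} + I_t ∘ C̃_t, and it treats the product in W_t = O_t tanh(C_t) as element-wise. For simplicity each bias Q_ξ is stored as a matrix of the same shape as W_{t-1}. All function names and toy values are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def add(*ms: Matrix) -> Matrix:
    """Element-wise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]


def hadamard(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise product."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def apply(fn: Callable[[float], float], m: Matrix) -> Matrix:
    """Apply a scalar function to every entry."""
    return [[fn(x) for x in row] for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


Params = Dict[str, Tuple[Matrix, Matrix, Matrix]]  # name -> (M, U, Q)


def gate(params: Params, name: str, w_prev: Matrix, act: Callable[[float], float]) -> Matrix:
    """act(M W_{t-1} + U W_{t-1} + Q) for the gate called `name`."""
    m, u, q = params[name]
    return apply(act, add(matmul(m, w_prev), matmul(u, w_prev), q))


def lstm_weight_step(params: Params, w_prev: Matrix, c_prev: Matrix) -> Tuple[Matrix, Matrix]:
    f = gate(params, "F", w_prev, sigmoid)
    i = gate(params, "I", w_prev, sigmoid)
    o = gate(params, "O", w_prev, sigmoid)
    c_tilde = gate(params, "C", w_prev, math.tanh)       # assumed standard candidate state
    c = add(hadamard(f, c_prev), hadamard(i, c_tilde))   # assumed standard cell update
    w = hadamard(o, apply(math.tanh, c))                 # W_t = O_t tanh(C_t), element-wise
    return w, c


# Toy run with all-zero parameters, so every sigmoid gate equals 0.5.
zeros_2x2 = [[0.0, 0.0], [0.0, 0.0]]
zeros_2x3 = [[0.0] * 3 for _ in range(2)]
params = {name: (zeros_2x2, zeros_2x2, zeros_2x3) for name in "FIOC"}
w_prev = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
c_prev = [[2.0] * 3 for _ in range(2)]
w_t, c_t = lstm_weight_step(params, w_prev, c_prev)
```

Feeding W_{t-1} into both the input and the hidden-state slot of the LSTM is what lets the weights evolve across snapshots without depending on the node features of any particular snapshot.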
3. The dynamic network representation learning-based link prediction method of claim 2, further comprising: introducing a discriminator […] to maximize the mutual information between the set of local representation vectors […] of the nodes and g_t, so that the low-dimensional representation matrix H_t can capture the global structural features of the network, specifically:
wherein […] is the cross-entropy loss; […] is the i-th row vector of H_t; g_t is the global low-dimensional representation on a single dynamic network t; […] is the i-th row vector of […], the negative-sample low-dimensional representation on a single dynamic network t; and […] denotes the score that the discriminator […] assigns to […] and g_t.
4. The dynamic network representation learning-based link prediction method of claim 3, wherein obtaining the negative-sample low-dimensional representation […] on the single dynamic network t comprises:
randomly shuffling the feature matrix X_t of the dynamic network t to form a matrix […];
replacing X_t with […] to obtain […].
5. The dynamic network representation learning-based link prediction method of claim 3, wherein the global low-dimensional representation g_t on the single dynamic network t is:
wherein […] is a readout function and σ is the sigmoid function.
6. The dynamic network representation learning-based link prediction method of claim 3, wherein the discriminator […] consists of a bilinear scoring function:
wherein B_t is a trainable scoring matrix.
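The mutual-information objective of claims 3 to 6 can be illustrated with a minimal Python sketch in the style of contrastive graph representation learning. Several details are assumptions, because the corresponding formulas are not reproduced in the text: the shuffle is taken to be row-wise, the readout is taken to be the sigmoid of the column-wise mean of H_t, the bilinear score is taken to be σ(h_i^T B_t g_t), and the loss is a binary cross-entropy averaged over positive and negative samples. The identity encoder is a hypothetical stand-in for the graph convolutional encoder, and all names are illustrative.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def shuffle_rows(x: Sequence[Sequence[float]], rng: random.Random) -> Matrix:
    """Negative sampling: return a copy of X_t with its rows in random order (assumed row-wise)."""
    rows = [list(r) for r in x]
    rng.shuffle(rows)
    return rows


def readout(h: Matrix) -> Vector:
    """Global summary g_t: sigmoid of the column-wise mean of H_t (assumed readout)."""
    n = len(h)
    return [sigmoid(sum(col) / n) for col in zip(*h)]


def bilinear_score(h_i: Vector, b: Matrix, g: Vector) -> float:
    """Discriminator score sigmoid(h_i^T B g) with scoring matrix B."""
    bg = [sum(b_row[k] * g[k] for k in range(len(g))) for b_row in b]
    return sigmoid(sum(x * y for x, y in zip(h_i, bg)))


def contrastive_loss(h_pos: Matrix, h_neg: Matrix, b: Matrix) -> float:
    """Binary cross-entropy: positive rows should score near 1, negative rows near 0."""
    g = readout(h_pos)
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for h_i in h_pos:
        total += math.log(bilinear_score(h_i, b, g) + eps)
    for h_i in h_neg:
        total += math.log(1.0 - bilinear_score(h_i, b, g) + eps)
    return -total / (len(h_pos) + len(h_neg))


def identity_encoder(x: Matrix) -> Matrix:
    """Hypothetical stand-in for the encoder; a real encoder would also use the graph."""
    return [list(row) for row in x]


rng = random.Random(0)
x_t = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_t = identity_encoder(x_t)                        # positive samples
h_neg = identity_encoder(shuffle_rows(x_t, rng))   # negative samples from shuffled features
b_t = [[0.0, 0.0], [0.0, 0.0]]                     # untrained scoring matrix
loss = contrastive_loss(h_t, h_neg, b_t)
```

With an all-zero scoring matrix every score is 0.5, so the loss starts at ln 2; training B_t and the encoder would push positive scores up and negative scores down.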
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110280461.9A CN113065974B (en) | 2021-03-16 | 2021-03-16 | Link prediction method based on dynamic network representation learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110280461.9A CN113065974B (en) | 2021-03-16 | 2021-03-16 | Link prediction method based on dynamic network representation learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113065974A (en) | 2021-07-02 |
CN113065974B (en) | 2023-08-18 |
Family
ID=76561106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110280461.9A Active CN113065974B (en) | 2021-03-16 | 2021-03-16 | Link prediction method based on dynamic network representation learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113065974B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113591997B (en) * | 2021-08-03 | 2024-01-02 | 湖州绿色智能制造产业技术研究院 | Assembly feature graph connection relation classification method based on graph learning convolutional neural network |
CN113783725B (en) * | 2021-08-31 | 2023-05-09 | 南昌航空大学 | Opportunistic network link prediction method based on high-pass filter and improved RNN |
CN113962358B (en) * | 2021-09-29 | 2023-12-22 | 西安交通大学 | Information diffusion prediction method based on time sequence hypergraph attention neural network |
CN114826949B (en) * | 2022-05-04 | 2023-06-09 | 北京邮电大学 | Communication network condition prediction method |
CN114970692A (en) * | 2022-05-11 | 2022-08-30 | 青海师范大学 | Novel gravitational field-based link prediction method |
CN115208680B (en) | 2022-07-21 | 2023-04-07 | 中国科学院大学 | Dynamic network risk prediction method based on graph neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109886401A (en) * | 2019-01-10 | 2019-06-14 | 南京邮电大学 | A kind of complex network representative learning method |
CN110263280A (en) * | 2019-06-11 | 2019-09-20 | 浙江工业大学 | A kind of dynamic link predetermined depth model and application based on multiple view |
CN111461907A (en) * | 2020-03-13 | 2020-07-28 | 南京邮电大学 | Dynamic network representation learning method oriented to social network platform |
CN111931023A (en) * | 2020-07-01 | 2020-11-13 | 西北工业大学 | Community structure identification method and device based on network embedding |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11544535B2 (en) * | 2019-03-08 | 2023-01-03 | Adobe Inc. | Graph convolutional networks with motif-based attention |
Non-Patent Citations (1)
Title |
---|
Rong Zeng; Yu-Xin Ding; Xiao-Ling Xia. Link prediction based on dynamic weighted social attribute network. 2016 International Conference on Machine Learning and Cybernetics (ICMLC). 2017, Abstract. * |
Also Published As
Publication number | Publication date |
---|---|
CN113065974A (en) | 2021-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113065974B (en) | Link prediction method based on dynamic network representation learning | |
Zhang et al. | Deep transfer learning for intelligent cellular traffic prediction based on cross-domain big data | |
Zhu et al. | Grouped network vector autoregression | |
CN108563755A (en) | A kind of personalized recommendation system and method based on bidirectional circulating neural network | |
CN112100514B (en) | Friend recommendation method based on global attention mechanism representation learning | |
CN111339818A (en) | Face multi-attribute recognition system | |
US20180336482A1 (en) | Social prediction | |
Wang et al. | Application research of ensemble learning frameworks | |
Meng et al. | POI recommendation for occasional groups Based on hybrid graph neural networks | |
CN117272195A (en) | Block chain abnormal node detection method and system based on graph convolution attention network | |
CN114265954B (en) | Graph representation learning method based on position and structure information | |
CN117033997A (en) | Data segmentation method, device, electronic equipment and medium | |
CN115526293A (en) | Knowledge graph reasoning method considering semantic and structural information | |
Zheng et al. | Federated Learning on Non-iid Data via Local and Global Distillation | |
Hu et al. | Learning Multi-expert Distribution Calibration for Long-tailed Video Classification | |
CN115310589A (en) | Group identification method and system based on depth map self-supervision learning | |
Wang et al. | Hierarchical graph convolutional network for data evaluation of dynamic graphs | |
CN114254738A (en) | Double-layer evolvable dynamic graph convolution neural network model construction method and application | |
CN113158088A (en) | Position recommendation method based on graph neural network | |
Hu et al. | Crowd R-CNN: An object detection model utilizing crowdsourced labels | |
CN116192650B (en) | Link prediction method based on sub-graph features | |
Sun | Optimization of physical education course resource allocation model based on deep belief network | |
Yuxiang et al. | The application of GMKL algorithm to fault diagnosis of local area network | |
CN112307227B (en) | Data classification method | |
Wang et al. | AI-Based Secure Construction of University Information Services Platform |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20240319. Address after: Room 1203-D, 12th Floor, Tiandiyuan Jiezuo Plaza, No. 4 Fenghui South Road, High tech Zone, Xi'an City, Shaanxi Province, 710072. Patentee after: Xi'an Sanhang Shijie Technology Co.,Ltd. Country or region after: China. Address before: 710072 No. 127 Youyi West Road, Shaanxi, Xi'an. Patentee before: Northwestern Polytechnical University. Country or region before: China. |