CN113254864A - Dynamic subgraph generation method and dispute detection method based on node characteristics and reply path - Google Patents

Info

Publication number
CN113254864A
Authority
CN
China
Prior art keywords
matrix
path
node
nodes
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110478862.5A
Other languages
Chinese (zh)
Other versions
CN113254864B (en)
Inventor
曹娟
钟雷
王政嘉
盛强
谢添
徐朝喜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhongke Ruijian Technology Co ltd
Institute Of Digital Economy Industry Institute Of Computing Technology Chinese Academy Of Sciences
Original Assignee
Hangzhou Zhongke Ruijian Technology Co ltd
Institute Of Digital Economy Industry Institute Of Computing Technology Chinese Academy Of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhongke Ruijian Technology Co ltd, Institute Of Digital Economy Industry Institute Of Computing Technology Chinese Academy Of Sciences filed Critical Hangzhou Zhongke Ruijian Technology Co ltd
Priority to CN202110478862.5A priority Critical patent/CN113254864B/en
Publication of CN113254864A publication Critical patent/CN113254864A/en
Application granted granted Critical
Publication of CN113254864B publication Critical patent/CN113254864B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a dynamic subgraph generation method and a dispute detection method based on node characteristics and reply paths. The dynamic subgraph generation method comprises: S1, constructing a path matrix P and a path length matrix S based on a "post-comment" graph G, wherein the path matrix P records all paths from each node in the graph G to terminal nodes, the terminal nodes comprise post nodes in the graph G and comment nodes without replies, and the path length matrix S records the length of each path in the path matrix; S2, calculating a path Laplacian matrix L based on the path matrix P and the path length matrix S; S3, calculating a path-aware expression of the current node based on the path Laplacian matrix L and the content features of the nodes in the graph G; and S4, based on the similarity between the current node and all nodes on its corresponding paths, retaining the most important nodes on each path, wherein the retained nodes on all paths form the subgraph corresponding to the current node, and the nodes in the subgraph constitute the local discussion related to the current node.

Description

Dynamic subgraph generation method and dispute detection method based on node characteristics and reply path
Technical Field
The invention relates to a dynamic subgraph generation method and a dispute detection method based on node characteristics and reply paths. The methods are applicable to dispute detection on social media platforms.
Background
Social media platforms have become an important venue for people to express opinions. People share and comment on social media, and some posts trigger intense discussions that reveal disputes among the participants; these disputes reflect public sentiment and public focus. A controversial post has controversial content, and the ideas or opinions it expresses can cause controversy in the replies.
The task of post-level dispute detection is to automatically determine whether a post is disputed. The task helps in evaluating the polarization of viewpoints and the influence of events, and also provides a reference for news topic selection. Dispute detection plays an important role in mining public sentiment on social media and has become a research hotspot in recent years.
The existing dispute detection method first constructs a graph structure from the post-comment tree, in which nodes represent posts or comments and edges represent the reply relationships between nodes. A graph convolutional neural network is then used to learn the expressions of the nodes in the graph, and the average information of the post and comments is used for dispute detection. This method cannot update a node's expression with the local discussion related to that node, cannot model dispute patterns, and cannot focus on information related to the post.
In practice, the controversy of a post is often reflected in its local discussions (LD). A local discussion is the discussion content related to the current node and is embodied as a subgraph of the "post-comment tree" graph. Some local discussions are disputed and some are not; the disputed ones are called local disputes (local argumentations). Likewise, some discussions are related to the post and some are off-topic; the local disputes related to the post are called key local disputes (KLA). Posts with key local disputes are likely to be controversial posts, so finding key local disputes helps in detecting whether a post is disputed.
FIG. 1 shows a disputed post from a microblog platform discussing "vacation" and "accompanying vacation". Panel (a) shows the content of the post and its comments, with the stance of each comment labeled as one of four categories: "support", "objection", "neutral", and "irrelevant". Panel (b) shows the post-comment reply graph. The local discussions present in the post are marked with dashed circles (LD1 to LD3 in the figure). Among these local discussions, LD2 and LD3 contain disputes and are related to the post, so they are key local disputes (KLA1 and KLA2 in the figure). Based on FIG. 1, dispute detection with key local disputes can be performed in two steps: (1) find the local discussions present in the post; (2) among them, find the local discussions that are related to the post and most likely to be disputed, and use them for dispute detection.
Current dispute detection work mainly targets web pages and social media. Web-based research mostly focuses on Wikipedia and classifies with specific features such as the number of modifications, edit history, and dispute tags. More work has focused on dispute detection in social media: some methods detect with linguistic features such as topics, emotions, and other indicators, some of which are emphasized or Twitter specific; other methods detect with structural features of the post-comment graph, such as propagation or local features, nodularity features, and the like.
Detecting disputes by learning node expressions in the post-comment graph with a graph convolutional neural network has two main defects: (1) the graph convolutional neural network only attends to the first-order neighbor information of a node and cannot directly learn high-order information, so the local discussion of a node cannot be used to learn node information; (2) using the average information of the post and comments for dispute detection cannot model the dispute patterns in the data and cannot focus on discussion information related to the post.
Graph neural networks have been successful in many areas; widely used graph neural networks include GCN, GraphSAGE, GAT, and GIN. However, these networks can only aggregate the first-order neighbor information of a node to update its expression. High-order neighbor information can be learned indirectly by stacking multiple layers, but experiments show that model performance then suffers considerably from the over-smoothing problem.
Some existing graph-neural-network work can learn high-order node information directly. One approach generates a shortest path for each node with an attention mechanism and updates each node with the path information, but it only attends to the relationships between node pairs and cannot attend to the overall information of a local subgraph. Another computes the shortest path length between nodes as the intimacy between them and selects the top-K closest nodes as the subgraph of a node, but it uses only the structural information of the graph and ignores the important node features. A third lists all paths related to a node that are below a certain threshold and generates the node's subgraph by enumerating part of those paths.
Disclosure of Invention
The technical problem to be solved by the invention is: in view of the existing problems, to provide a dynamic subgraph generation method and a dispute detection method based on node characteristics and reply paths.
The technical scheme adopted by the invention is a dynamic subgraph generation method based on node features and reply paths, comprising:
S1, constructing a path matrix P and a path length matrix S based on the "post-comment" graph G, wherein the path matrix P records all paths from each node in the graph G to terminal nodes, and the terminal nodes comprise post nodes in the graph G and comment nodes without replies; the path length matrix S records the length of each path in the path matrix;
S2, calculating a path Laplacian matrix L based on the path matrix P and the path length matrix S;
S3, calculating a path-aware expression of the current node based on the path Laplacian matrix L and the content features of the nodes in the graph G;
S4, based on the similarity between the current node and all nodes on its corresponding paths, retaining the most important nodes on each path, wherein the retained nodes on all paths form the subgraph corresponding to the current node, and the nodes in the subgraph constitute the local discussion related to the current node.
Step S1 includes:
S11, constructing a "post-comment" graph $G = (V, E)$ according to the reply relationships, wherein V is the set of nodes, comprising post nodes and comment nodes, and E represents the reply relationships between nodes, including the edges between posts and comments and the edges between comments;
S12, constructing a path matrix $P \in \mathbb{R}^{m \times m}$ based on the graph G, wherein P records the m paths in the graph G, namely all paths from each node in the graph G to the terminal nodes;
S13, constructing a path length matrix $S \in \mathbb{R}^{m \times m}$ based on the graph G, wherein the diagonal element in the i-th row of S represents the length of the i-th path in the path matrix P.
Step S2 includes: defining the path Laplacian matrix as the difference between the path length matrix S and the path matrix P: $L = S - P$.
Step S3 includes:
S31, calculating the normalized form of the path Laplacian matrix L:
$L' = I - S^{-1}P$
where I is the $m \times m$ identity matrix;
S32, calculating the path-aware expression of the central node i based on the matrix $L'$, wherein the expression matrix $Q \in \mathbb{R}^{m \times d}$ of the central nodes is calculated as:
$Q = L'H$
where the matrix $H \in \mathbb{R}^{m \times d}$ records the d-dimensional expression vector of each central node in the path matrix.
Step S4 includes:
calculating a correlation matrix between the nodes based on the matrices Q and H:
$R = Q W_s H^{\top}$
where $W_s \in \mathbb{R}^{d \times d}$ is a learnable matrix, and each row of R represents the correlation between a central node and all other nodes;
selecting the nodes on the paths corresponding to the central node by means of the path matrix P, and calculating row-wise, with a Softmax function, the normalized correlation values between the central node and the nodes on the corresponding paths:
$R' = \mathrm{Softmax}(P \odot R)$
where $\odot$ denotes the element-wise product of matrices;
for a path centered on node i, accumulating the correlation values along the path starting from node i, and truncating the remaining nodes once the accumulated correlation value exceeds a threshold $\theta$;
the set of all truncated paths forms the subgraph corresponding to the central node i, denoted $SG_i$, which records the local discussion information corresponding to the central node i.
The expression of each node is updated with the node information in its subgraph, based on the classical GNN model, wherein the expression of node i in the l-th layer is
Figure BDA0003047928880000051
The update rule is:
Figure BDA0003047928880000052
where g is an aggregation function, with different aggregation functions used in different GNN models; σ is a nonlinear activation function; and $b^{(l)}$ is a bias vector.
In GCN, the aggregation function is
Figure BDA0003047928880000053
where $W^{(l)}$ is a learnable parameter matrix.
A dispute detection method, comprising:
A. using the differences between node expressions to model the dispute patterns in the subgraphs generated by the dynamic subgraph generation method described above;
B. re-weighting the subgraphs with a post-guided attention mechanism to capture the disputes related to the post.
Step A includes:
for a node i that replies to a node x, calculating the difference between the node expressions
Figure BDA0003047928880000054
learning these differences with a fully connected layer;
summing all the differences in the subgraph to obtain the expression vector of the subgraph, calculated as:
Figure BDA0003047928880000055
where
Figure BDA0003047928880000056
is a learnable parameter matrix and
Figure BDA0003047928880000057
is a matrix of bias terms.
Each subgraph is re-weighted with the post-guided attention mechanism, calculated as follows:
Figure BDA0003047928880000061
Figure BDA0003047928880000062
where $h_p$ denotes the expression of the post node; SG denotes the set of all subgraphs in the "post-comment" graph;
Figure BDA0003047928880000063
is the attention weight, representing the relevance of subgraph $SG_i$ to the post.
A fully connected layer learns from the weighted sum, and the model finally judges whether the post is controversial. The loss function is the cross entropy:
Figure BDA0003047928880000064
where
Figure BDA0003047928880000066
is the true label of the i-th post;
Figure BDA0003047928880000065
is the dispute probability of the i-th post predicted by the model; and N is the batch size during training.
The invention has the following beneficial effects: it provides a method that mines key local disputes for dispute detection based on dynamic subgraph generation, consisting of two parts, dynamic subgraph generation and key local dispute mining.
Dynamic subgraph generation dynamically generates, from the content of the nodes and the characteristics of the reply structure, the subgraph corresponding to each node, namely its related local discussion. Each node can update its expression with the related local discussion information, and the method can be integrated into different graph neural networks as a plug-in to improve their detection performance.
Key local dispute mining models the dispute patterns in discussions and mines the disputes related to the post content for dispute detection.
The method of the invention can handle irrelevant information in the data and provides a degree of model interpretability (the local dispute to which the model attends most is probably the reason the post is disputed).
Drawings
FIG. 1 shows a disputed post discussing "vacation" and "accompanying vacation" on a microblog platform.
FIG. 2 is a model structure diagram of the embodiment.
Detailed Description
This embodiment is a method that mines key local disputes for dispute detection based on dynamic subgraph generation. It comprises a dynamic subgraph generation method based on node characteristics and reply paths, and a local dispute mining method.
A subgraph centered on a node (e.g., node i) should include the local discussions associated with that node, which lie on the reply paths of the central node i. In this embodiment, the dynamic subgraph generation method first calculates the correlation between each node on a reply path and the central node i, and then truncates the path to remove irrelevant nodes. The set of all truncated paths constitutes the subgraph corresponding to node i, i.e., its related local discussion.
The common GCN model uses an adjacency matrix, a degree matrix, and a Laplacian matrix to model the interaction between a node and its first-order neighbors. By analogy, this embodiment uses a path matrix, a path length matrix, and a path Laplacian matrix to model the interaction between a node and its higher-order neighbors.
The dynamic subgraph generation method in the embodiment comprises the following steps:
S1, constructing a path matrix P and a path length matrix S based on the "post-comment" graph G, wherein the path matrix P records all paths from each node in the graph G to terminal nodes, and the terminal nodes comprise post nodes in the graph G and comment nodes without replies; the path length matrix S records the length of each path in the path matrix.
S11, for each post, first constructing a "post-comment" graph $G = (V, E)$ according to the reply relationships, wherein V is the set of nodes, comprising post nodes and comment nodes, and the initial expression of each node is obtained from the node's text with a BERT model; E represents the reply relationships between nodes, including the edges between posts and comments and the edges between comments.
S12, constructing a path matrix $P \in \mathbb{R}^{m \times m}$ based on the graph G. P records the m paths in the graph G, one path per row. For a node i, the relevant paths are the bottom-up path (from node i to the post node) and all top-down paths (from node i to comment nodes without replies). For example, the three paths from the post node P are recorded in the first 3 rows of the matrix, where 1 indicates that the corresponding node is on the path and 0 indicates that it is not.
For each node, the bottom-up path is recorded first, followed by all top-down paths; the paths of all nodes are recorded in breadth-first traversal order.
Note that for different paths from the same node, the nodes in the overlapping portions of those paths do not occupy the same column. For example, for the paths $P \to C_1 \to C_{1\text{-}1}$ and $P \to C_1 \to C_{1\text{-}2}$ from the post node P, the $C_1$ nodes on the two paths are in different columns (as shown by the circles on the path matrix in FIG. 2). To make the path matrix P a square matrix, each node occupies the same number of columns and rows.
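The row ordering described in S12 can be illustrated with a minimal sketch. It assumes the reply tree is given as a hypothetical `parent` mapping (each comment maps to the node it replies to, and the post maps to `None`); the function name `enumerate_paths` and the node labels are illustrative only. The sketch lists the paths that would become the rows of P and does not reproduce the column layout of the matrix.

```python
from collections import deque


def enumerate_paths(parent):
    """Return {node: [path, ...]}, visiting nodes in breadth-first order.

    For each node, the bottom-up path (node -> post) comes first,
    followed by all top-down paths (node -> comment without replies).
    """
    children = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    def upward(node):
        # Follow reply links until the post node is reached.
        path = [node]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    def downward(node):
        # Depth-first enumeration of every path ending at a leaf comment.
        paths, stack = [], [[node]]
        while stack:
            path = stack.pop()
            kids = children[path[-1]]
            if not kids and len(path) > 1:
                paths.append(path)
            for kid in reversed(kids):
                stack.append(path + [kid])
        return paths

    result = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        ups = [upward(node)] if parent[node] is not None else []
        result[node] = ups + downward(node)
        queue.extend(children[node])
    return result


# Toy reply tree: post P with comments C1, C2; C1 has replies C1-1, C1-2.
parent = {"P": None, "C1": "P", "C2": "P", "C1-1": "C1", "C1-2": "C1"}
paths = enumerate_paths(parent)
m = sum(len(p) for p in paths.values())  # number of rows of P
```

In this toy tree the post node contributes three top-down paths, which matches the three rows mentioned in the example above; the total number of paths gives the dimension m.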
S13, constructing a path length matrix $S \in \mathbb{R}^{m \times m}$ based on the graph G. S is a diagonal matrix whose diagonal element in the i-th row represents the length of the i-th path in the matrix P (excluding the central node).
S2, calculating the path Laplacian matrix L based on the path matrix P and the path length matrix S. The path Laplacian matrix is defined as the difference between the path length matrix and the path matrix: $L = S - P$.
S3, calculating the path-aware expression of the current node based on the path Laplacian matrix L and the content features of the nodes in the graph G.
S31, calculating the normalized form of the path Laplacian matrix L:
$L' = I - S^{-1}P$
where I is the $m \times m$ identity matrix.
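As a worked illustration of $L = S - P$ and $L' = I - S^{-1}P$, the following sketch computes both matrices for a small hand-made example. The 3 x 3 matrix `P` and the `lengths` list are toy values chosen for readability and do not correspond to a real post-comment graph; because S is diagonal, multiplying by $S^{-1}$ reduces to dividing each row of P by the length of that path.

```python
def path_laplacians(P, lengths):
    """Return (S, L, L_norm) for an m x m path matrix P and m path lengths."""
    m = len(P)
    # S is diagonal: S[i][i] is the length of the i-th path.
    S = [[lengths[i] if i == j else 0 for j in range(m)] for i in range(m)]
    # Path Laplacian L = S - P.
    L = [[S[i][j] - P[i][j] for j in range(m)] for i in range(m)]
    # Normalized form L' = I - S^{-1} P (row i of P divided by lengths[i]).
    L_norm = [
        [(1.0 if i == j else 0.0) - P[i][j] / lengths[i] for j in range(m)]
        for i in range(m)
    ]
    return S, L, L_norm


# Toy values (assumed for illustration only).
P = [[1, 1, 1], [1, 1, 0], [0, 1, 1]]
lengths = [2, 1, 1]
S, L, L_norm = path_laplacians(P, lengths)
```

Every path length must be non-zero for the division to be defined, which holds when each recorded path contains at least one node besides the central node.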
S32, calculating the path-aware expression of the central node i based on the matrix $L'$. Let the matrix $H \in \mathbb{R}^{m \times d}$ record the d-dimensional expression vector of each central node in the path matrix (note that some rows of H correspond to the same central node, so their vectors are identical). The expression matrix $Q \in \mathbb{R}^{m \times d}$ of the central nodes is then calculated as:
$Q = L'H$
where each row of Q is the path-aware expression of a central node.
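The product $Q = L'H$ is an ordinary matrix multiplication, which the short sketch below spells out with nested lists. The values of `L_norm` and `H` are toy numbers assumed for illustration (m = 3 paths, d = 2 feature dimensions); in the embodiment H would hold BERT-derived node expressions.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Toy normalized path Laplacian (3 x 3) and node features (3 x 2).
L_norm = [[0.5, -0.5, -0.5], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]]
H = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]

# Each row of Q mixes the feature rows according to one row of L'.
Q = matmul(L_norm, H)
```

The result has the same shape as H, one d-dimensional row per path.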
S4, based on the similarity between the current node and all nodes on its corresponding paths, retaining the most important nodes on each path, wherein the retained nodes on all paths form the subgraph corresponding to the current node, and the nodes in the subgraph constitute the local discussion related to the current node.
S41, calculating a correlation matrix between the nodes based on the matrix Q and the matrix H:
$R = Q W_s H^{\top}$
where $W_s \in \mathbb{R}^{d \times d}$ is a learnable matrix; each row of R represents the correlation between a central node and all other nodes, including nodes that are not on the paths corresponding to that central node.
S42, selecting the nodes on the paths corresponding to the central node by means of the path matrix P, and calculating row-wise, with a Softmax function, the normalized correlation values between the central node and the nodes on the corresponding paths:
$R' = \mathrm{Softmax}(P \odot R)$
where $\odot$ denotes the element-wise product of matrices.
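Steps S41 and S42 can be sketched as follows. The code applies the two formulas literally: it forms $R = Q W_s H^{\top}$, multiplies element-wise by P, and applies a row-wise softmax. The function names and the toy inputs are assumptions for illustration, and $W_s$ is fixed to the identity here although it is a learned parameter in the method.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def row_softmax(M):
    """Softmax over each row, shifted by the row maximum for stability."""
    out = []
    for row in M:
        top = max(row)
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def path_relevance(Q, W_s, H, P):
    """Compute R' = Softmax(P ⊙ R) with R = Q W_s H^T."""
    R = matmul(matmul(Q, W_s), transpose(H))
    masked = [[p * r for p, r in zip(p_row, r_row)] for p_row, r_row in zip(P, R)]
    return row_softmax(masked)


# Toy inputs: m = 3, d = 2; W_s is the identity for illustration only.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_s = [[1.0, 0.0], [0.0, 1.0]]
P = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
R_prime = path_relevance(Q, W_s, H, P)
```

Under this literal reading, entries where P is 0 enter the softmax as 0 and therefore still receive a non-zero weight. An implementation might instead replace them with negative infinity so that off-path nodes receive exactly zero weight; the text does not specify which variant is intended.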
S43, for a path centered on node i, accumulating the correlation values along the path starting from node i, and truncating the remaining nodes once the accumulated correlation value exceeds the threshold $\theta$. The set of all truncated paths forms the subgraph corresponding to the central node, denoted $SG_i$, which records the local discussion information corresponding to node i.
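The truncation in S43 can be sketched as a running sum along each path. This sketch makes two assumptions that the text leaves open: the node at which the accumulated value first exceeds $\theta$ is itself kept, and the central node contributes no correlation value of its own. The names `truncate_path` and `build_subgraph`, the `relevance` mapping, and the numbers are hypothetical.

```python
def truncate_path(path, relevance, theta):
    """Keep nodes along `path` until the accumulated relevance exceeds theta.

    `path[0]` is the central node; `relevance` maps the other nodes on the
    path to their normalized correlation with the central node.
    """
    kept = [path[0]]
    total = 0.0
    for node in path[1:]:
        kept.append(node)
        total += relevance[node]
        if total > theta:
            break  # everything after this node is truncated
    return kept


def build_subgraph(paths, relevance, theta):
    """Union of the nodes on all truncated paths, in first-seen order."""
    nodes = []
    for path in paths:
        for node in truncate_path(path, relevance, theta):
            if node not in nodes:
                nodes.append(node)
    return nodes


# Toy example: two paths centered on node "A".
paths = [["A", "B", "C", "D"], ["A", "E", "F"]]
relevance = {"B": 0.4, "C": 0.3, "D": 0.1, "E": 0.1, "F": 0.1}
subgraph = build_subgraph(paths, relevance, theta=0.6)
```

In the toy example the first path is cut after node C, where the running sum first exceeds 0.6, while the second path never reaches the threshold and is kept whole.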
In this embodiment, after the subgraph corresponding to each node is obtained, the expression of the central node is updated with the node information in the subgraph. The embodiment is still based on the classical GNN model but uses the nodes in the subgraph instead of the first-order neighbors. Let the expression of node i at layer l be
Figure BDA0003047928880000091
The update rule is:
Figure BDA0003047928880000092
where g is an aggregation function, with different aggregation functions used in different GNN models. In GCN, the aggregation function is
Figure BDA0003047928880000093
where $W^{(l)}$ is a learnable parameter matrix, σ is a nonlinear activation function, and $b^{(l)}$ is a bias vector.
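The idea of aggregating over subgraph nodes rather than first-order neighbors can be sketched as below. Since the exact update formula is not reproduced in this text, the sketch makes explicit assumptions: the aggregation g is a plain mean over the subgraph nodes, the activation σ is ReLU, and the function name `update_node` and all values are hypothetical.

```python
def update_node(subgraph_nodes, h, W, b):
    """One layer update for a central node from the nodes in its subgraph.

    Assumed form: relu(W @ mean_{j in SG_i} h_j + b). The mean aggregator
    and ReLU are illustrative choices, not taken from the source formulas.
    """
    vectors = [h[j] for j in subgraph_nodes]
    dim_in = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim_in)]
    pre = [sum(w * x for w, x in zip(W[r], mean)) + b[r] for r in range(len(b))]
    return [max(0.0, z) for z in pre]


# Toy expressions; node "C" lies outside the subgraph and is ignored.
h = {"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [100.0, 100.0]}
W = [[1.0, 0.0], [0.0, 1.0]]  # stands in for the learnable matrix W^(l)
b = [0.0, -5.0]               # stands in for the bias vector b^(l)
h_new = update_node(["A", "B"], h, W, b)
```

Swapping the aggregator for another one (for example a normalized sum as in GCN) changes only the line that computes `mean`.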
The local dispute mining method in the embodiment comprises the following steps:
In a disputed local discussion there are usually many nodes holding opposite views, and the differences between the expressions of these nodes can be pronounced. This embodiment therefore uses the differences between node expressions to model the dispute patterns in the subgraphs generated by the dynamic subgraph generation method of this embodiment.
For a node i that replies to a node x, the difference between the node expressions is calculated:
Figure BDA0003047928880000101
A fully connected layer learns these differences, and finally all the differences in the subgraph are summed to obtain the expression vector of the subgraph, calculated as:
Figure BDA0003047928880000102
where
Figure BDA0003047928880000103
is a learnable parameter matrix and
Figure BDA0003047928880000104
is a matrix of bias terms.
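The difference-based subgraph vector can be sketched as follows. The exact formulas are not reproduced in this text, so the sketch assumes that the difference is a plain vector subtraction $h_i - h_x$, that the fully connected layer is a linear map without activation, and that the subgraph is given as a list of hypothetical `(i, x)` reply pairs; the names `W_d` and `b_d` are placeholders for the learnable parameters.

```python
def subgraph_vector(reply_edges, h, W_d, b_d):
    """Sum of linearly transformed expression differences over reply edges.

    Assumed form: sum over (i, x) of W_d @ (h_i - h_x) + b_d, where node i
    replies to node x. Subtraction and the absence of an activation are
    illustrative assumptions.
    """
    total = [0.0] * len(b_d)
    for i, x in reply_edges:
        diff = [a - c for a, c in zip(h[i], h[x])]
        out = [sum(w * d for w, d in zip(W_d[r], diff)) + b_d[r]
               for r in range(len(b_d))]
        total = [t + o for t, o in zip(total, out)]
    return total


# Toy subgraph: B and C both reply to A.
h = {"A": [1.0, 1.0], "B": [2.0, 3.0], "C": [0.0, 1.0]}
W_d = [[1.0, 0.0], [0.0, 1.0]]
b_d = [0.5, 0.0]
vec = subgraph_vector([("B", "A"), ("C", "A")], h, W_d, b_d)
```

Replies whose expressions are close to the node they answer contribute small differences, so a subgraph with many opposing replies yields a larger-magnitude vector under these assumptions.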
To make the model focus on information related to the post during dispute detection, this embodiment re-weights each subgraph with an attention mechanism guided by the post information, so as to capture the disputes related to the post. The calculation is as follows:
Figure BDA0003047928880000105
Figure BDA0003047928880000106
where $h_p$ denotes the expression of the post node, and SG denotes the set of all subgraphs in the "post-comment" graph;
Figure BDA0003047928880000107
is the attention weight, representing the relevance of subgraph $SG_i$ to the post.
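A post-guided re-weighting of subgraph vectors can be sketched as below. The attention formulas themselves are not reproduced in this text, so the scoring function is an assumption: the sketch uses a dot product between the post expression $h_p$ and each subgraph vector, followed by a softmax over all subgraphs. The function name `post_guided_attention` and the inputs are hypothetical.

```python
import math


def post_guided_attention(h_p, subgraph_vectors):
    """Return (weights, pooled) for subgraph vectors scored against h_p.

    Assumed scoring: dot product with the post expression, normalized by a
    softmax over subgraphs. The source may use a different scoring function.
    """
    scores = [sum(a * b for a, b in zip(h_p, e)) for e in subgraph_vectors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    pooled = [
        sum(w * e[k] for w, e in zip(weights, subgraph_vectors))
        for k in range(len(subgraph_vectors[0]))
    ]
    return weights, pooled


# Toy example: the first subgraph is more aligned with the post.
h_p = [1.0, 0.0]
vectors = [[math.log(3.0), 1.0], [0.0, 5.0]]
weights, pooled = post_guided_attention(h_p, vectors)
```

The pooled vector is the weighted sum that would be passed to the final classification layer.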
A fully connected layer learns from the weighted sum, and the model finally judges whether the post is controversial. The loss function is the cross entropy:
Figure BDA0003047928880000108
where
Figure BDA0003047928880000109
is the true label of the i-th post;
Figure BDA00030479288800001010
is the dispute probability of the i-th post predicted by the model; and N is the batch size during training.
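For a two-class task (disputed or not), the cross entropy over a batch can be sketched as below. Averaging over the N posts in the batch and clipping probabilities with a small `eps` are assumptions of this sketch; the source formula may differ in normalization.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross entropy over a batch.

    labels: true labels (1 = disputed, 0 = not disputed).
    probs:  predicted dispute probabilities in [0, 1].
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# Toy batch of N = 2 posts.
loss_uncertain = binary_cross_entropy([1, 0], [0.5, 0.5])
loss_confident = binary_cross_entropy([1, 0], [0.9, 0.1])
```

A prediction of 0.5 for every post gives a loss of ln 2, and the loss decreases as the predicted probabilities move toward the true labels.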
The present embodiment also provides a storage medium on which a computer program capable of being executed by a processor is stored, and the computer program when executed implements the steps of the dynamic subgraph generation method or the dispute detection method in the present embodiment.
The present embodiment also provides a computer device having a memory and a processor, where the memory stores a computer program capable of being executed by the processor, and the computer program when executed implements the steps of the dynamic subgraph generation method or the dispute detection method in the present embodiment.

Claims (10)

1. A dynamic subgraph generation method based on node features and reply paths, characterized by comprising:
S1, constructing a path matrix P and a path length matrix S based on the "post-comment" graph G, wherein the path matrix P records all paths from each node in the graph G to terminal nodes, and the terminal nodes comprise post nodes in the graph G and comment nodes without replies; the path length matrix S records the length of each path in the path matrix;
S2, calculating a path Laplacian matrix L based on the path matrix P and the path length matrix S;
S3, calculating a path-aware expression of the current node based on the path Laplacian matrix L and the content features of the nodes in the graph G;
S4, based on the similarity between the current node and all nodes on its corresponding paths, retaining the most important nodes on each path, wherein the retained nodes on all paths form the subgraph corresponding to the current node, and the nodes in the subgraph constitute the local discussion related to the current node.
2. The method for dynamic subgraph generation based on node features and reply paths according to claim 1, wherein the step S1 comprises:
S11, constructing a "post-comment" graph $G = (V, E)$ according to the reply relationships, wherein V is the set of nodes, comprising post nodes and comment nodes, and E represents the reply relationships between nodes, including the edges between posts and comments and the edges between comments;
S12, constructing a path matrix $P \in \mathbb{R}^{m \times m}$ based on the graph G, wherein P records the m paths in the graph G, namely all paths from each node in the graph G to the terminal nodes;
S13, constructing a path length matrix $S \in \mathbb{R}^{m \times m}$ based on the graph G, wherein the diagonal element in the i-th row of S represents the length of the i-th path in the path matrix P.
3. The method for dynamic subgraph generation based on node features and reply paths according to claim 1, wherein the step S2 comprises: defining the path Laplacian matrix as the difference between the path length matrix S and the path matrix P: $L = S - P$.
4. The method for dynamic subgraph generation based on node features and reply paths according to claim 1, wherein the step S3 comprises:
S31, calculating the normalized form of the path Laplacian matrix L:
$L' = I - S^{-1}P$
where I is the $m \times m$ identity matrix;
S32, calculating the path-aware expression of the central node i based on the matrix $L'$, wherein the expression matrix $Q \in \mathbb{R}^{m \times d}$ of the central nodes is calculated as:
$Q = L'H$
where the matrix $H \in \mathbb{R}^{m \times d}$ records the d-dimensional expression vector of each central node in the path matrix.
5. The method for dynamic subgraph generation based on node features and reply paths according to claim 4, wherein the step S4 comprises:
S41, calculating a correlation matrix between the nodes based on the matrix Q and the matrix H:
$R = Q W_s H^{T}$
wherein $W_s \in \mathbb{R}^{d \times d}$ is a learnable matrix, and each row of the matrix R represents the degree of correlation between a central node and all other nodes;
S42, using the path matrix P to select the nodes on the path corresponding to each central node, and applying a row-wise Softmax function to calculate normalized correlation values between the central node and the nodes on its path:
$R' = \mathrm{Softmax}(P \odot R)$
wherein $\odot$ denotes the element-wise product of matrices;
S43, for the path centered on node i, accumulating the correlation values along the path starting from node i, and cutting off the remaining nodes once the accumulated correlation value exceeds the threshold θ;
the collection of all truncated paths forms the subgraph corresponding to the central node i, denoted $SG_i$, which records the local discussion information corresponding to the central node i.
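Steps S41 to S43 can be sketched as below, under several stated assumptions. Taken literally, a softmax over $P \odot R$ would still give weight to off-path entries (their masked score of 0 maps to exp(0)), so this sketch uses a masked softmax that sets off-path entries to zero weight; that reading is an assumption. It also assumes a single path per node (from the node up to the post), keeps the node at which the threshold is crossed, and uses an identity matrix in place of the learned $W_s$. The names `masked_softmax`, `truncate_path`, and `parent` are hypothetical.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Row-wise softmax restricted to entries where mask > 0."""
    masked = np.where(mask > 0, scores, -np.inf)
    masked = masked - masked.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(masked)                             # exp(-inf) -> 0
    return weights / weights.sum(axis=1, keepdims=True)

def truncate_path(i, parent, R_norm, theta):
    """Walk from node i toward the post, accumulating correlation values.

    Stops once the running total exceeds theta; the node that crosses the
    threshold is kept and the remaining nodes on the path are cut off.
    """
    kept, total, j = [], 0.0, i
    while j != -1:
        kept.append(j)
        total += R_norm[i, j]
        if total > theta:
            break
        j = parent[j]
    return kept

# Toy reply chain 3 -> 2 -> 1 -> 0 with hypothetical node expressions.
parent = [-1, 0, 1, 2]
P = np.tril(np.ones((4, 4)))
S = np.diag(P.sum(axis=1))
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])
Q = (np.eye(4) - np.linalg.inv(S) @ P) @ H
W_s = np.eye(2)                      # stands in for the learnable matrix

R = Q @ W_s @ H.T                    # correlation matrix
R_norm = masked_softmax(R, P)        # normalized correlation on each path
subgraph_3 = truncate_path(3, parent, R_norm, theta=0.6)
```

A larger θ keeps more of the path: with θ above 1 the whole path survives, since the normalized values on a path sum to 1.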
6. The method of dynamic subgraph generation based on node features and reply paths according to claim 1, characterized in that the expression of each node is updated with the node information in its subgraph on the basis of a classical GNN model, wherein the expression of node i in the l-th layer is
Figure FDA0003047928870000021
The update rule is as follows:
Figure FDA0003047928870000022
wherein g is an aggregation function, different aggregation functions being used in different GNN models; σ is a non-linear activation function; and $b^{(l)}$ is a bias vector.
7. The dynamic subgraph generation method based on node features and reply paths according to claim 6, characterized in that:
in the GCN,
Figure FDA0003047928870000031
wherein $W^{(l)}$ is a learnable parameter matrix.
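The update formulas of claims 6 and 7 are only available as figures here, so the following is a generic illustration rather than the claimed rule. It assumes a GCN-style layer in which the aggregation function g is the mean over the nodes of the subgraph, followed by a linear map $W^{(l)}$, a bias $b^{(l)}$, and a ReLU as the non-linear activation σ. The function name `gnn_layer` and the toy inputs are hypothetical.

```python
import numpy as np

def gnn_layer(H, subgraphs, W, b):
    """One assumed GCN-style update restricted to each node's subgraph.

    H:         (m, d_in) node expressions at layer l
    subgraphs: list where subgraphs[i] holds the node indices of SG_i
    W, b:      (d_in, d_out) weight matrix and (d_out,) bias vector
    """
    out = np.zeros((H.shape[0], W.shape[1]))
    for i, nodes in enumerate(subgraphs):
        aggregated = H[nodes].mean(axis=0)          # assumed aggregation g
        out[i] = np.maximum(aggregated @ W + b, 0)  # ReLU as sigma
    return out

# Toy inputs: three nodes, subgraphs given as lists of node indices.
H0 = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
subgraphs = [[0], [1, 0], [2, 1]]
H1 = gnn_layer(H0, subgraphs, W=np.eye(2), b=np.zeros(2))
```

Other GNN variants would swap the mean for a different aggregation, for example a sum or an attention-weighted sum.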
8. A method of dispute detection, comprising:
A. adopting differences among node expressions to model a dispute mode in a subgraph generated by the dynamic subgraph generation method of any one of claims 1-7;
B. re-weighting the subgraphs using a post-directed attention mechanism to capture post-related disputes.
9. The dispute detection method according to claim 8, wherein the step a comprises:
for a node i that replies to node x, calculating the difference between the node expressions:
Figure FDA0003047928870000032
using a fully connected layer to learn these differences;
summing all differences in the subgraph to obtain an expression vector of the subgraph, wherein a calculation formula is as follows:
Figure FDA0003047928870000033
wherein
Figure FDA0003047928870000034
is a learnable parameter matrix, and
Figure FDA0003047928870000035
is a matrix of bias terms.
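The formulas of this claim are only available as figures, so the sketch below is an assumption-laden illustration of the described steps. It assumes that the difference is the plain vector difference between the replying node and the node it replies to, that the fully connected layer is a linear map followed by ReLU, and that the subgraph vector is the sum of the transformed differences. The names `subgraph_vector`, `W_d`, and `b_d` are hypothetical.

```python
import numpy as np

def subgraph_vector(H, reply_edges, W_d, b_d):
    """Sum of transformed expression differences over one subgraph.

    H:           (m, d) node expressions
    reply_edges: list of (i, x) pairs meaning node i replies to node x
    W_d, b_d:    assumed fully connected layer parameters
    """
    total = np.zeros(W_d.shape[1])
    for i, x in reply_edges:
        diff = H[i] - H[x]                         # assumed difference form
        total += np.maximum(diff @ W_d + b_d, 0)   # FC layer with ReLU
    return total

# Toy subgraph: node 1 replies to node 0, node 2 replies to node 1.
H = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
sg_vec = subgraph_vector(H, [(1, 0), (2, 1)], W_d=np.eye(2), b_d=np.zeros(2))
```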
10. The dispute detection method according to claim 8, wherein the post-directed attention mechanism is used to re-weight each sub-graph, and the calculation is as follows:
Figure FDA0003047928870000036
Figure FDA0003047928870000037
wherein h ispAn expression representing a post node; SG represents all subgraph sets in the "post-comment" graph;
Figure FDA0003047928870000038
is the attention weight, representing the relevance of the subgraph $SG_i$ to the post;
the result of the weighted summation is learned by a fully connected layer to finally judge whether the post is controversial, and the loss function is the cross entropy:
Figure FDA0003047928870000039
wherein
Figure FDA0003047928870000041
denotes the ground-truth label of the i-th post;
Figure FDA0003047928870000042
is the probability, predicted by the model, that the i-th post is controversial; and N is the batch size during training.
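The re-weighting and the loss can be sketched as follows. The attention and loss formulas are only available as figures, so this is a sketch under assumptions: the attention score is taken to be a dot product between the post expression and each subgraph vector, normalized with a softmax, and the loss is taken to be the standard binary cross entropy averaged over the batch. The names `attention_pool`, `predict`, `binary_cross_entropy`, `w`, and `b` are hypothetical.

```python
import numpy as np

def attention_pool(h_p, SG_vecs):
    """Weight subgraph vectors by their (assumed dot-product) relevance to the post."""
    scores = SG_vecs @ h_p
    scores = scores - scores.max()                 # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # attention weights
    return alpha, alpha @ SG_vecs                  # weights and weighted sum

def predict(h_p, SG_vecs, w, b):
    """Fully connected layer plus sigmoid on the pooled vector."""
    _, pooled = attention_pool(h_p, SG_vecs)
    return 1.0 / (1.0 + np.exp(-(pooled @ w + b)))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Standard binary cross entropy averaged over a batch of N posts."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

# Toy example: one post with two identical subgraph vectors.
h_p = np.array([1.0, 0.0])
SG_vecs = np.array([[1.0, 0.0], [1.0, 0.0]])
alpha, pooled = attention_pool(h_p, SG_vecs)
prob = predict(h_p, SG_vecs, w=np.zeros(2), b=0.0)
loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.5, 0.5]))
```

With identical subgraph vectors the attention weights are uniform, and with zero weights the predicted probability is 0.5, which gives a loss of log 2 for any label.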
CN202110478862.5A 2021-04-29 2021-04-29 Dynamic subgraph generation method and dispute detection method based on node characteristics and reply paths Active CN113254864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110478862.5A CN113254864B (en) 2021-04-29 2021-04-29 Dynamic subgraph generation method and dispute detection method based on node characteristics and reply paths

Publications (2)

Publication Number Publication Date
CN113254864A true CN113254864A (en) 2021-08-13
CN113254864B CN113254864B (en) 2024-05-28

Family

ID=77223275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110478862.5A Active CN113254864B (en) 2021-04-29 2021-04-29 Dynamic subgraph generation method and dispute detection method based on node characteristics and reply paths

Country Status (1)

Country Link
CN (1) CN113254864B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110462612A (en) * 2017-02-17 2019-11-15 凯恩迪股份有限公司 The method and apparatus for carrying out machine learning using the network at network node with agent and ranking then being carried out to network node
CN111553916A (en) * 2020-05-09 2020-08-18 杭州中科睿鉴科技有限公司 Image tampering area detection method based on multiple characteristics and convolutional neural network
CN111639252A (en) * 2020-05-18 2020-09-08 华中科技大学 False news identification method based on news-comment relevance analysis
CN112148875A (en) * 2020-08-03 2020-12-29 杭州中科睿鉴科技有限公司 Dispute detection method based on graph convolution neural network integration content and structure information
CN112508085A (en) * 2020-12-05 2021-03-16 西安电子科技大学 Social network link prediction method based on perceptual neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LEI ZHONG et al.: "Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection", arXiv:2005.07886v1 [cs.CL], 16 May 2020 (2020-05-16) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113297497A (en) * 2021-06-22 2021-08-24 中国科学院计算技术研究所 Method for mining key local disputes to detect disputeness based on dynamic subgraph generation method
CN116541794A (en) * 2023-07-06 2023-08-04 中国科学技术大学 Sensor data anomaly detection method based on self-adaptive graph annotation network
CN116541794B (en) * 2023-07-06 2023-10-20 中国科学技术大学 Sensor data anomaly detection method based on self-adaptive graph annotation network

Also Published As

Publication number Publication date
CN113254864B (en) 2024-05-28

Similar Documents

Publication Publication Date Title
CN111597314B (en) Reasoning question-answering method, device and equipment
Meng et al. Rating the crisis of online public opinion using a multi-level index system
CN112165462A (en) Attack prediction method and device based on portrait, electronic equipment and storage medium
CN113254864A (en) Dynamic subgraph generation method and dispute detection method based on node characteristics and reply path
Tu et al. Constructing conceptual trajectory maps to trace the development of research fields
US20090138443A1 (en) Method and system for searching for a knowledge owner in a network community
CN115329210A (en) False news detection method based on interactive graph layered pooling
CN114219089B (en) Construction method and equipment of new-generation information technology industry knowledge graph
CN115238773A (en) Malicious account detection method and device for heterogeneous primitive path automatic evaluation
CN105354343B (en) User characteristics method for digging based on remote dialogue
CN110543601B (en) Method and system for recommending context-aware interest points based on intelligent set
CN111177526B (en) Network opinion leader identification method and device
Phuvipadawat et al. Detecting a multi-level content similarity from microblogs based on community structures and named entities
CN113159976B (en) Identification method for important users of microblog network
CN113781110B (en) User behavior prediction method and system based on multi-factor weighted BI-LSTM learning
Cui et al. Identification of Micro-blog Opinion Leaders based on User Features and Outbreak Nodes.
Irani et al. ArguSense: Argument-Centric Analysis of Online Discourse
CN112016004A (en) Multi-granularity information fusion-based job crime screening system and method
CN117408298B (en) Information propagation prediction method based on prototype perception dual-channel graph neural network
Wang et al. Detecting inactive cyberwarriors from online forums
CN118210916B (en) Scientific literature recommendation method based on hypergraph attention and enhanced contrast learning
CN116860952B (en) RPA intelligent response processing method and system based on artificial intelligence
CN113297497A (en) Method for mining key local disputes to detect disputeness based on dynamic subgraph generation method
CN117668259A (en) Knowledge-graph-based inside and outside data linkage analysis method and device
Zhang et al. Multi-scale attention graph convolutional network for heterogeneous graph representation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310015 floor 12, building D, No. 108 Xiangyuan Road, Gongshu District, Hangzhou City, Zhejiang Province

Applicant after: Zhongke Computing Technology Innovation Research Institute

Applicant after: Hangzhou Zhongke Ruijian Technology Co.,Ltd.

Address before: 12 / F, building 4, 108 Xiangyuan Road, Gongshu District, Hangzhou City, Zhejiang Province 310015

Applicant before: Institute of digital economy industry, Institute of computing technology, Chinese Academy of Sciences

Applicant before: Hangzhou Zhongke Ruijian Technology Co.,Ltd.

CB03 Change of inventor or designer information

Inventor after: Cao Juan

Inventor after: Zhong Lei

Inventor after: Wang Zhengjia

Inventor after: Sheng Qiang

Inventor after: Xie Tian

Inventor before: Cao Juan

Inventor before: Zhong Lei

Inventor before: Wang Zhengjia

Inventor before: Sheng Qiang

Inventor before: Xie Tian

Inventor before: Xu Chaoxi

GR01 Patent grant