CN115168678A - Time-sequence-aware heterogeneous graph neural rumor detection model - Google Patents

Time-sequence-aware heterogeneous graph neural rumor detection model Download PDF

Info

Publication number
CN115168678A
Authority
CN
China
Prior art keywords
event
time sequence
response
representation
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210721077.2A
Other languages
Chinese (zh)
Inventor
宋玉蓉
陈林威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202210721077.2A priority Critical patent/CN115168678A/en
Publication of CN115168678A publication Critical patent/CN115168678A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a time-sequence-aware heterogeneous graph neural rumor detection model. In recent years, the development of online social media has greatly accelerated the breeding and propagation of rumors, and their harmfulness has drawn wide attention from researchers to automatic rumor detection technology. The invention simultaneously considers the global structural relations between events and the temporal relations of message propagation within an event, jointly and explicitly models the two relations with a heterogeneous graph as the carrier, and provides a novel time-sequence-aware heterogeneous graph neural rumor detection model. The model captures the temporal relations between the forwarding (or comment) posts within an event using a timing-aware self-attention mechanism and fuses the timing-aware forwarding (or comment) posts with the source post to obtain the local temporal representation of the event; it captures the global structural relations between events using an element-level attention mechanism and learns the global structural representation of the events; finally, the two are fused for detecting rumors.

Description

Time-sequence-aware heterogeneous graph neural rumor detection model
Technical Field
The invention provides a time-sequence-aware heterogeneous graph neural rumor detection model, belongs to the technical field of rumor detection, and particularly relates to graph neural network technology.
Background
In recent years, the development of online social media has greatly accelerated the breeding and spread of rumors, and their harmfulness has drawn much attention from researchers to automatic rumor detection technology. Most early rumor detection methods use feature engineering to mine effective features from text content, user profile information, propagation structure and the like. Such methods depend on heavy feature engineering, which is time-consuming and requires substantial human resources, and the artificially constructed features are highly subjective and lack high-order feature representations. With the development of deep learning, deep neural networks have achieved good results in many natural language processing tasks, such as sentiment analysis, machine translation, and text classification. Accordingly, researchers have begun to model text content, propagation structure, etc. with deep learning models and have proposed many effective rumor detection methods. Recently, graph-model-based methods have used graph neural networks to model the structural characteristics of message propagation, converting the rumor detection task into a graph classification task, and have achieved good results. However, these methods consider only the local propagation structure of the posts inside an event and ignore the global structural relations of events on social media. Yuan et al. (Yuan Chunyuan, Ma Qianwen, Zhou Wei, et al. Jointly embedding the local and global relations of heterogeneous graph for rumor detection [C]//2019 IEEE International Conference on Data Mining (ICDM). IEEE, 2019: 796-805.) observe that each event is not an independent individual, that relations may arise between events through the participation of the same users, and that considering only the characteristics of each event itself while ignoring the relations between events necessarily limits the detection performance of a model. They therefore studied the associations between events from the perspective of heterogeneous networks and proposed a heterogeneous graph combining global and local relations to capture the local semantic relations and global structural information of message propagation. Although that model achieves a good effect, it ignores the timing information of message propagation inside an event.
Disclosure of Invention
The present invention is directed to overcoming the above problems by providing a time-sequence-aware heterogeneous graph neural rumor detection model, which comprises a heterogeneous graph construction module, a local timing information encoding module, a global structure information encoding module, and a rumor classification module. The method comprises the following steps:
S01, constructing a heterogeneous graph; this module comprises two parts: constructing a heterogeneous graph based on the interactions between the forwarding (comment) posts and the source post within an event and the interactions between users and events; and initializing the representation of each type of node in the heterogeneous graph;
S02, extracting the local temporal features of the event; this module comprises two parts: mining the local timing information within an event with a timing-aware self-attention mechanism to obtain response post representations carrying timing information, and then fusing the timing-aware response posts into the source post representation to obtain the local temporal representation $\tilde{m}_i$ of the event;
S03, extracting global structural features of the event, wherein the module comprises two parts: computing participation event c i A specific user u in j Attention vector y in different aspects j (ii) a By attention vector gamma j And participate in event c i All users in the system are aggregated in an element product mode to capture the global structure relation between events;
S04, after obtaining the local temporal representation and the global structural representation of each event node, concatenating the two features as the final representation of the event node, and computing the prediction for the event, i.e., the probability $\hat{y}_i$ of the event being each label, through a fully connected layer and a softmax function; finally, defining a loss function and continuously updating the model parameters to obtain their optimal values.
The step S01 specifically includes:
S11, first abstracting events and the related users into two different types of nodes in a network, and establishing connecting edges between user nodes and event nodes according to the users' participation in the events (a user has forwarded or commented on posts in the event). Each event contains a source post and a series of response posts. The response posts are arranged into a time sequence according to their time delay after the source post is published, so that each source post corresponds to one response sequence. Finally, an event-user heterogeneous graph with timing information is constructed.
S12, initializing the representation of each node in the heterogeneous graph, specifically comprising the following steps:
S12-1, initializing the event nodes. The substance of an event is the text content of its source post and response posts, which are initialized as word vectors and encoded with a CNN. Specifically, the number of words per post is fixed at L: posts with fewer than L words are padded with 0, and posts with more than L words are truncated. Vector representations of the words are then obtained by training the Word2Vec algorithm on a domain-specific corpus; words that do not appear in the pre-trained word-vector vocabulary are initialized from a uniform distribution, and the word vectors are kept fine-tunable during training. Denoting the initial vector of the j-th word in a post as $x_j \in \mathbb{R}^{d_w}$, a post of L words can be represented as

$x_{1:L} = [x_1; x_2; \dots; x_L]$,

where ";" is the concatenation operation.
Further, the sentence sequence is encoded with a CNN. Given the sentence sequence $x_{1:L}$ composed of word vectors, a one-dimensional convolution is performed on every possible window through the convolutional layer of the CNN:

$e_i = \sigma(W * x_{i:i+h-1})$,

yielding the feature map $e = [e_1, e_2, \dots, e_{L-h+1}]$, where $W \in \mathbb{R}^{h \times d_w}$ is a convolution kernel of size h. A max-pooling operation $\hat{e} = \max(e)$ then selects the maximum value of each feature map, and the initialization vector representation of each post is obtained by concatenation. For the i-th event, the source post is represented as $s_i \in \mathbb{R}^{d}$ and each response post as $m_j \in \mathbb{R}^{d}$; the matrix formed by the response posts in the event is denoted $M_i = [m_1; m_2; \dots; m_n] \in \mathbb{R}^{n \times d}$.
And S12-2, performing initialization representation on the user node. And encoding attribute information (including gender, age, fan number, attention number and the like) of the user to obtain an initialization vector representation of the user node. And initializing the user information which cannot be obtained through normal distribution.
The step S02 specifically includes:
S21, mining the local timing information within an event with a timing-aware self-attention mechanism, capturing the differences between the response posts generated by rumor and non-rumor events at different time stages and the latent timing dependencies between response posts.
S21-1, to encode the time-delay information of each response post, a position embedding is generated for each response post using the Position Encoding (PE) formula of the Transformer model:

$PE_{(pos,\,2k)} = \sin\big(pos / 10000^{2k/d}\big)$,
$PE_{(pos,\,2k+1)} = \cos\big(pos / 10000^{2k/d}\big)$,

where pos denotes the position of the response post in the sequence, d denotes the dimension of the PE, and 2k and 2k+1 index the even and odd dimensions respectively (i.e., 2k ≤ d, 2k+1 ≤ d).
S21-2, the embedding of each response post is combined with the corresponding position embedding to capture the timing information between response posts:

$m_j' = m_j + PE_j$.
S21-3, a multi-head attention mechanism is used to focus attention on the important response posts. The self-attention mechanism explicitly assigns larger weights to information with greater influence on a node and weights that information into the node's own representation, greatly enriching node representations, while multiple heads account for the influence of external information from as many aspects as possible:

$\text{head}_t = \text{Attention}(M_i' W_t^Q,\, M_i' W_t^K,\, M_i' W_t^V)$,
$\hat{M}_i = [\text{head}_1; \text{head}_2; \dots; \text{head}_T]\, W^O$,

where $M_i'$ is the matrix of position-aware response post embeddings, $\text{Attention}(Q,K,V) = \text{softmax}(QK^\top/\sqrt{d_k})\,V$, and $\hat{M}_i$ contains the timing-aware response post representations.
S22, the timing-aware response posts are fused into the representation of the source post to obtain the local temporal representation $\tilde{m}_i$ of the event. The invention regards the series of response posts as first-order neighbor nodes of the corresponding source post and fuses them with the aggregation function of the graph attention network, computed as:

$\alpha_{ii} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_i]))$,
$\alpha_{ij} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_j]))$,
$\tilde{m}_i = \sigma\big(\alpha_{ii} W m_i + \textstyle\sum_{j \in N(m_i)} \alpha_{ij} W m_j\big)$,

where $\sigma$ is the sigmoid activation function, $\alpha_{ii}$ and $\alpha_{ij}$ denote the attention scores between node i and itself and between node i and node j respectively, $N(m_i)$ is the set of response posts corresponding to the current source post, and $W$ is the weight parameter of this layer's node feature transformation.
The step S03 specifically includes:
S31, a global structural relation between events is established based on common users, and the invention considers how to learn the global structural features $\hat{c}_i$ of the event nodes. Inspired by the element-level attention mechanisms widely used in recommendation system tasks, the invention proposes an attention mechanism oriented to the elements of user embeddings. It assumes that each dimension of a user embedding reflects a different aspect of the user's information, and that these different attributes of the user have different effects on message propagation. The specific process is as follows:

For a specific user $u_j \in U_i$ participating in event $c_i$, the attention vector $\gamma_j$ of user $u_j$ over different aspects is computed:
$\gamma_j = \tanh(W_c \cdot u_j + b)$,

where $W_c \in \mathbb{R}^{d \times d}$ is the feature transformation matrix and $\gamma_j \in \mathbb{R}^{d}$ is the attention vector over the different aspects. A larger $\gamma_j^{(k)}$ indicates that the k-th aspect of the user embedding $u_j$ has a greater impact on message propagation.
S32, by means of the attention vector $\gamma_j$, all users participating in event $c_i$ are aggregated via element products to capture the global structural relations between events:

$\hat{c}_i = \sum_{u_j \in U_i} \gamma_j \odot u_j$.
the step S04 specifically includes:
S41, the local temporal representation and the global structural representation of the event are concatenated as the final representation of the event node, and the prediction for the event, i.e., the probability of the event being each label, is computed through a fully connected layer and a softmax function:

$\hat{y}_i = \text{softmax}(\text{Fc}([\tilde{m}_i ; \hat{c}_i]))$,

where Fc(·) is a fully connected layer whose output dimension matches the number of classification categories.
S42, finally, the loss function of the model is defined as the cross entropy between the prediction and the true label:

$\mathcal{L}(\theta) = -\sum_i \sum_{k=1}^{r} y_i^{(k)} \log \hat{y}_i^{(k)}$,

where r is the number of classes, θ is the set of parameters of the entire model, and $y_i \in \{0,1,2,3\}$ (Twitter) or $y_i \in \{0,1\}$ (Weibo) is the true label value.
Compared with the prior art, the time-sequence-aware heterogeneous graph neural rumor detection model implemented by the invention has the following beneficial effects:
the invention fully considers the local time sequence relation between the forwarding (or comment) posts in the event and the source posts and the global structure relation between the event and the event, and jointly and explicitly models the local time sequence information and the global structure information to complete the rumor detection task.
Based on the interactions between forwarding (or comment) posts and the source post, the method models the local temporal relations between response posts through position encoding, focuses on important response posts with a multi-head attention mechanism, and then fuses the source post and the response posts to obtain the local temporal representation of each event node.
Based on the interactions between users and events, the method uses an element-level attention mechanism to learn the global structural representation of each event node so as to capture complex and diverse propagation structure characteristics.
Experimental results show that the proposed model achieves better results than existing models on both the rumor classification and the early rumor detection task.
Drawings
Fig. 1 is a diagram of the overall framework of the time-sequence-aware heterogeneous graph neural rumor detection model (the SHGN model).
Detailed Description
For a better understanding of the objects, aspects and advantages of the present invention, reference is made to the following detailed description taken in conjunction with the accompanying drawing, which is included to provide a further understanding of the invention and is not intended to limit its scope.
FIG. 1 is a block diagram of the overall framework of the time-sequence-aware heterogeneous graph neural rumor detection model SHGN (Sequence-aware Heterogeneous Graph Neural rumor detection). As shown in FIG. 1, it comprises 4 modules: a heterogeneous graph construction module, a local timing information encoding module, a global structure information encoding module, and a rumor classification module. Specifically:
S01, constructing a heterogeneous graph; this module comprises two parts: constructing a heterogeneous graph based on the interactions between the forwarding (comment) posts and the source post within an event and the interactions between users and events; and initializing the representation of each type of node in the heterogeneous graph.
Specifically, the S01 heterogeneous graph construction comprises the following steps:
S11, first abstracting events and the related users into two different types of nodes in a network, and establishing connecting edges between user nodes and event nodes according to the users' participation in the events (a user has forwarded or commented on posts in the event). Each event contains a source post and a series of response posts. The response posts are arranged into a time sequence according to their time delay after the source post is published, so that each source post corresponds to one response sequence. Finally, an event-user heterogeneous graph with timing information is constructed.
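As a concrete illustration of S11, the following minimal Python sketch builds the event-user edges and the time-ordered response sequences; the input layout, field names, and helper structures are assumptions made for illustration and are not part of the claimed method.

```python
from collections import defaultdict

def build_heterogeneous_graph(events):
    """Build an event-user heterogeneous graph with timing information.

    `events` is assumed to be a list of dicts of the form
    {"id": ..., "source": {...}, "responses": [{"user": ..., "delay": ...}, ...]},
    where `delay` is the time elapsed since the source post was published.
    """
    event_user_edges = defaultdict(set)   # event id -> set of participating user ids
    response_sequences = {}               # event id -> responses ordered by time delay

    for event in events:
        # A user is linked to an event if they forwarded or commented on a post in it.
        for response in event["responses"]:
            event_user_edges[event["id"]].add(response["user"])
        # Order the response posts by their delay after the source post, so that
        # each source post corresponds to exactly one response sequence.
        response_sequences[event["id"]] = sorted(
            event["responses"], key=lambda r: r["delay"]
        )
    return event_user_edges, response_sequences
```

Each event's delay-sorted response list is the response sequence that the timing-aware self-attention of S02 later consumes.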
S12, initializing the representation of each node in the heterogeneous graph, specifically comprising the following steps:
S12-1, initializing the event nodes. The substance of an event is the text content of its source post and response posts, which are initialized as word vectors and encoded with a CNN. Specifically, the number of words per post is fixed at L: posts with fewer than L words are padded with 0, and posts with more than L words are truncated. Vector representations of the words are then obtained by training the Word2Vec algorithm on a domain-specific corpus; words that do not appear in the pre-trained word-vector vocabulary are initialized from a uniform distribution, and the word vectors are kept fine-tunable during training. Denoting the initial vector of the j-th word in a post as $x_j \in \mathbb{R}^{d_w}$, a post of L words can be represented as

$x_{1:L} = [x_1; x_2; \dots; x_L]$,

where ";" is the concatenation operation.
Further, the sentence sequence is encoded with the CNN. Given the sentence sequence $x_{1:L}$ composed of word vectors, a one-dimensional convolution is performed on every possible window through the convolutional layer of the CNN:

$e_i = \sigma(W * x_{i:i+h-1})$,

yielding the feature map $e = [e_1, e_2, \dots, e_{L-h+1}]$, where $W \in \mathbb{R}^{h \times d_w}$ is a convolution kernel of size h. A max-pooling operation $\hat{e} = \max(e)$ then selects the maximum value of each feature map, and the initialization vector representation of each post is obtained by concatenation. For the i-th event, the source post is represented as $s_i \in \mathbb{R}^{d}$, each response post as $m_j \in \mathbb{R}^{d}$, and the matrix formed by the response posts in the event is denoted $M_i = [m_1; m_2; \dots; m_n] \in \mathbb{R}^{n \times d}$.
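For illustration only, a minimal PyTorch sketch of the S12-1 encoder follows, assuming each post has already been padded or truncated to L word vectors; the dimension defaults and the use of several kernel sizes are assumptions, not part of the claims.

```python
import torch
import torch.nn as nn

class PostEncoder(nn.Module):
    """CNN encoder: word-vector sequence -> fixed-size post representation."""

    def __init__(self, word_dim=300, num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # One Conv1d per window size h, realizing e_i = sigma(W * x_{i:i+h-1}).
        self.convs = nn.ModuleList(
            [nn.Conv1d(word_dim, num_filters, h) for h in kernel_sizes]
        )

    def forward(self, x):            # x: (batch, L, word_dim)
        x = x.transpose(1, 2)        # Conv1d expects (batch, word_dim, L)
        feats = []
        for conv in self.convs:
            e = torch.sigmoid(conv(x))           # feature map over all windows
            feats.append(e.max(dim=2).values)    # max pooling: e_hat = max(e)
        return torch.cat(feats, dim=1)           # concatenation -> post vector
```

A post tensor of shape (batch, L, word_dim) thus yields one fixed-size vector per post, serving as the initialization representation of the source and response posts.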
And S12-2, performing initialization representation on the user node. And encoding attribute information (including gender, age, fan number, attention number and the like) of the user to obtain an initialization vector representation of the user node. And initializing the user information which cannot be acquired through normal distribution.
S02, extracting local time sequence characteristics of the event, wherein the module comprises two parts: mining local time sequence information in an event by adopting a time sequence perception self-attention mechanism to obtain a response paste representation with time sequence information
Figure BDA0003710386120000049
Then response paste with time sequence information
Figure BDA00037103861200000410
Fusion into source-pasted representations to obtain local temporal characterization of events
Figure BDA00037103861200000411
Specifically, the S02 local timing information encoding comprises the following steps:
S21, mining the local timing information within an event with a timing-aware self-attention mechanism, capturing the differences between the response posts generated by rumor and non-rumor events at different time stages and the latent timing dependencies between response posts.
S21-1, to encode the time-delay information of each response post, a position embedding is generated for each response post using the Position Encoding (PE) formula of the Transformer model:

$PE_{(pos,\,2k)} = \sin\big(pos / 10000^{2k/d}\big)$,
$PE_{(pos,\,2k+1)} = \cos\big(pos / 10000^{2k/d}\big)$,

where pos denotes the position of the response post in the sequence, d denotes the dimension of the PE, and 2k and 2k+1 index the even and odd dimensions respectively (i.e., 2k ≤ d, 2k+1 ≤ d).
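A minimal sketch of this sinusoidal position encoding, following the standard Transformer formula; the function name and tensor layout are illustrative assumptions.

```python
import torch

def position_encoding(seq_len, d):
    """PE(pos, 2k) = sin(pos / 10000^(2k/d)); PE(pos, 2k+1) = cos(pos / 10000^(2k/d))."""
    pos = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)   # (seq_len, 1)
    k2 = torch.arange(0, d, 2, dtype=torch.float)                 # even dimensions 2k
    div = torch.pow(10000.0, k2 / d)
    pe = torch.zeros(seq_len, d)
    pe[:, 0::2] = torch.sin(pos / div)             # even dimensions
    pe[:, 1::2] = torch.cos(pos / div[: d // 2])   # odd dimensions
    return pe                                      # (seq_len, d)
```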
S21-2, the embedding of each response post is combined with the corresponding position embedding to capture the timing information between response posts:

$m_j' = m_j + PE_j$.
S21-3, a multi-head attention mechanism is used to focus attention on the important response posts. The self-attention mechanism explicitly assigns larger weights to information with greater influence on a node and weights that information into the node's own representation, greatly enriching node representations, while multiple heads account for the influence of external information from as many aspects as possible:

$\text{head}_t = \text{Attention}(M_i' W_t^Q,\, M_i' W_t^K,\, M_i' W_t^V)$,
$\hat{M}_i = [\text{head}_1; \text{head}_2; \dots; \text{head}_T]\, W^O$,

where $M_i'$ is the matrix of position-aware response post embeddings, $\text{Attention}(Q,K,V) = \text{softmax}(QK^\top/\sqrt{d_k})\,V$, and $\hat{M}_i$ contains the timing-aware response post representations.
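To make S21-2 and S21-3 concrete, the following sketch adds the position embeddings to the time-ordered response post embeddings and applies multi-head self-attention; it reuses the `position_encoding` helper sketched above and relies on PyTorch's built-in `nn.MultiheadAttention`, both of which are implementation assumptions rather than part of the claims.

```python
import torch.nn as nn

class TimingAwareSelfAttention(nn.Module):
    """Timing-aware self-attention over a sequence of response posts."""

    def __init__(self, d_model=100, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, responses):     # responses: (batch, n, d_model), time-ordered
        # S21-2: inject time-delay information via position embeddings.
        pe = position_encoding(responses.size(1), responses.size(2)).to(responses.device)
        m = responses + pe
        # S21-3: multi-head self-attention weights the influential response posts.
        out, _ = self.attn(m, m, m)
        return out                    # timing-aware response representations
```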
S22, the timing-aware response posts are fused into the representation of the source post to obtain the local temporal representation $\tilde{m}_i$ of the event. The invention regards the series of response posts as first-order neighbor nodes of the corresponding source post and fuses them with the aggregation function of the graph attention network, computed as:

$\alpha_{ii} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_i]))$,
$\alpha_{ij} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_j]))$,
$\tilde{m}_i = \sigma\big(\alpha_{ii} W m_i + \textstyle\sum_{j \in N(m_i)} \alpha_{ij} W m_j\big)$,

where $\sigma$ is the sigmoid activation function, $\alpha_{ii}$ and $\alpha_{ij}$ denote the attention scores between node i and itself and between node i and node j respectively, $N(m_i)$ is the set of response posts corresponding to the current source post, and $W$ is the weight parameter of this layer's node feature transformation.
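A minimal sketch of the S22 fusion under the formulas above, treating the response posts as first-order neighbors of the source post in a single graph-attention-style layer; the module name, dimensions, and the default LeakyReLU slope are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SourceResponseFusion(nn.Module):
    """Fuse timing-aware response posts into the source post representation."""

    def __init__(self, d_in, d_out):
        super().__init__()
        self.W = nn.Linear(d_in, d_out, bias=False)   # node feature transformation W
        self.a = nn.Linear(2 * d_out, 1, bias=False)  # attention vector a

    def forward(self, source, responses):
        # source: (d_in,); responses: (n, d_in), the neighbors of the source post.
        nodes = torch.cat([source.unsqueeze(0), responses], dim=0)  # self + neighbors
        h = self.W(nodes)                                           # W m_j
        # alpha_ij = softmax_j(LeakyReLU(a^T [W m_i ; W m_j])), with i = source post.
        pairs = torch.cat([h[0].expand_as(h), h], dim=1)
        scores = F.leaky_relu(self.a(pairs)).squeeze(-1)
        alpha = torch.softmax(scores, dim=0)
        # Attention-weighted aggregation gives the local temporal representation.
        return torch.sigmoid((alpha.unsqueeze(1) * h).sum(dim=0))
```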
S03, extracting global structural features of the event, wherein the module comprises two parts: computing participation event c i Specific user u in j Attention vector y in different aspects j (ii) a By attention vector gamma j And participate in event c i All users in (2) are aggregated in an element product manner to capture events and global structural relationships between events.
Specifically, the S03 global structure information encoding comprises the following steps:
S31, based on the global structural relations between events established through common users, the method considers how to learn the global structural features $\hat{c}_i$ of the event nodes. Inspired by the element-level attention mechanisms widely used in recommendation system tasks, the invention proposes an attention mechanism oriented to the elements of user embeddings. It assumes that each dimension of a user embedding reflects a different aspect of the user's information, and that these different attributes of the user have different effects on message propagation. The specific process is as follows:

For a specific user $u_j \in U_i$ participating in event $c_i$, the attention vector $\gamma_j$ of user $u_j$ over different aspects is computed:
$\gamma_j = \tanh(W_c \cdot u_j + b)$,

where $W_c \in \mathbb{R}^{d \times d}$ is the feature transformation matrix and $\gamma_j \in \mathbb{R}^{d}$ is the attention vector over the different aspects. A larger $\gamma_j^{(k)}$ indicates that the k-th aspect of the user embedding $u_j$ has a greater impact on message propagation.
S32, by means of the attention vector $\gamma_j$, all users participating in event $c_i$ are aggregated via element products to capture the global structural relations between events:

$\hat{c}_i = \sum_{u_j \in U_i} \gamma_j \odot u_j$.
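A minimal sketch of the element-level attention of S31 and S32, assuming the users of one event are stacked into a single matrix; names and shapes are illustrative.

```python
import torch
import torch.nn as nn

class ElementLevelAttention(nn.Module):
    """Aggregate the users of an event with element-wise (aspect-level) attention."""

    def __init__(self, d_user):
        super().__init__()
        self.W_c = nn.Linear(d_user, d_user)   # feature transformation W_c (bias plays the role of b)

    def forward(self, users):                  # users: (num_users, d_user)
        gamma = torch.tanh(self.W_c(users))    # gamma_j = tanh(W_c u_j + b)
        # The element product weights each aspect (dimension) of each user embedding;
        # summing over users yields the event's global structural representation.
        return (gamma * users).sum(dim=0)      # (d_user,)
```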
S04, after the local temporal representation and the global structural representation of each event node are obtained, the two features are concatenated as the final representation of the event node, and the prediction for the event, i.e., the probability $\hat{y}_i$ of the event being each label, is computed through a fully connected layer and a softmax function. Finally, a loss function is defined, and the model parameters are continuously updated to obtain their optimal values.
Specifically, the S04 rumor classification comprises the following steps:
S41, the local temporal representation and the global structural representation of the event are concatenated as the final representation of the event node, and the prediction for the event, i.e., the probability of the event being each label, is computed through a fully connected layer and a softmax function:

$\hat{y}_i = \text{softmax}(\text{Fc}([\tilde{m}_i ; \hat{c}_i]))$,

where Fc(·) is a fully connected layer whose output dimension matches the number of classification categories.
S42, finally, the loss function of the model is defined as the cross entropy between the prediction and the true label:

$\mathcal{L}(\theta) = -\sum_i \sum_{k=1}^{r} y_i^{(k)} \log \hat{y}_i^{(k)}$,

where r is the number of classes, θ is the set of parameters of the entire model, and $y_i \in \{0,1,2,3\}$ (Twitter) or $y_i \in \{0,1\}$ (Weibo) is the true label value.
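To illustrate S41 and S42, a sketch of the classification head and loss; note that PyTorch's `nn.CrossEntropyLoss` fuses the softmax of S41 with the cross entropy of S42, an implementation convenience assumed here.

```python
import torch
import torch.nn as nn

class RumorClassifier(nn.Module):
    """Concatenate local temporal and global structural features, then classify."""

    def __init__(self, d_local, d_global, num_classes):
        super().__init__()
        # Output dimension of the fully connected layer matches the class count.
        self.fc = nn.Linear(d_local + d_global, num_classes)

    def forward(self, local_repr, global_repr):
        logits = self.fc(torch.cat([local_repr, global_repr], dim=-1))
        return logits   # softmax is applied inside the loss below

# Cross entropy between prediction and true label
# (4 classes on Twitter, 2 classes on Weibo).
criterion = nn.CrossEntropyLoss()
# loss = criterion(logits, true_labels)
```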
The above description covers only the preferred embodiments of the present invention. It should be noted that various modifications and adaptations apparent to those skilled in the art may be made without departing from the principles of the invention, and these are intended to fall within the scope of the invention.

Claims (5)

1. A time-sequence-aware heterogeneous graph neural rumor detection model, comprising 4 modules: a heterogeneous graph construction module, a local timing information encoding module, a global structure information encoding module, and a rumor classification module, wherein the heterogeneous graph construction module constructs an event-user heterogeneous graph based on a rumor detection data set and initializes the representation of each node in the graph using embedding techniques; the local timing information encoding module, based on the temporal relations between forwarding or comment posts within an event, learns response post representations carrying timing information with a timing-aware self-attention mechanism and then fuses the content information of the source post to obtain the local temporal representation $\tilde{m}_i$ of each event node; the global structure information encoding module, based on the interactions between users and events, learns the global structural representation $\hat{c}_i$ of each event node with an element-level attention mechanism;
the rumor classification module fuses the local temporal representation and the global structural representation of an event and predicts the probability that the current event is a rumor; characterized in that the model specifically comprises the following steps:
S01, constructing a heterogeneous graph: constructing a heterogeneous graph based on the temporal relations between the forwarding or comment posts and the source post within an event and the interactions between users and events; initializing the representation of each type of node in the heterogeneous graph;
S02, extracting the local temporal features of the event: mining the local timing information within an event with a timing-aware self-attention mechanism to obtain response post representations carrying timing information, and then fusing the timing-aware response posts into the source post representation to obtain the local temporal representation $\tilde{m}_i$ of the event;
S03, extracting the global structural features of the event: computing, for a specific user $u_j$ participating in event $c_i$, the attention vector $\gamma_j$ over different aspects; aggregating, by means of the attention vector $\gamma_j$, all users participating in event $c_i$ via element products to capture the global structural relations between events;
S04, after the local temporal representation and the global structural representation of each event node are obtained, concatenating the two features as the final representation of the event node, computing the prediction for the event, i.e., the probability $\hat{y}_i$ of the event being each label, through a fully connected layer and a softmax function, and finally defining a loss function and continuously updating the model parameters to obtain their optimal values.
2. The time-sequence-aware heterogeneous graph neural rumor detection model of claim 1, wherein the step S01 comprises:
S11, first abstracting events and the related users into two different types of nodes in a network, and establishing connecting edges between user nodes and event nodes according to the users' participation in the events, namely the users' behavior of forwarding or commenting on posts in an event; each event contains a source post and a series of response posts; the response posts are arranged into a time sequence according to their time delay after the source post is published, so that each source post corresponds to one response sequence; finally, an event-user heterogeneous graph with timing information is constructed;
S12, initializing the representation of each node in the heterogeneous graph, specifically comprising the following steps:
S12-1, initializing the event nodes: initializing the text as word vectors and encoding the word vectors with a CNN; specifically, the number of words per post is fixed at L, posts with fewer than L words are padded with 0, and posts with more than L words are truncated; vector representations of the words are then obtained by training the Word2Vec algorithm on the corpus, words that do not appear in the pre-trained word-vector vocabulary are initialized from a uniform distribution, and the word vectors are kept fine-tunable during training; denoting the initial vector of the j-th word in a post as $x_j \in \mathbb{R}^{d_w}$, a post of L words is represented as

$x_{1:L} = [x_1; x_2; \dots; x_L]$,

where ";" is the concatenation operation;
further, the sentence sequence is encoded with the CNN: given the sentence sequence $x_{1:L}$ composed of word vectors, a one-dimensional convolution is performed on every possible window through the convolutional layer of the CNN:

$e_i = \sigma(W * x_{i:i+h-1})$,

yielding the feature map $e = [e_1, e_2, \dots, e_{L-h+1}]$, where $W \in \mathbb{R}^{h \times d_w}$ is a convolution kernel of size h; a max-pooling operation $\hat{e} = \max(e)$ then selects the maximum value of each feature map, and the initialization vector representation of each post is obtained by concatenation; for the i-th event, the source post is denoted $s_i \in \mathbb{R}^{d}$, each response post is denoted $m_j \in \mathbb{R}^{d}$, and the matrix formed by the response posts in the event is denoted $M_i = [m_1; m_2; \dots; m_n] \in \mathbb{R}^{n \times d}$;
S12-2, initializing the user node: and coding the attribute information of the user, including gender, age, fan number and attention number, to obtain an initialization vector representation of the user node, and initializing the unavailable user information through normal distribution.
3. The time-sequence-aware heterogeneous graph neural rumor detection model of claim 1, wherein the step S02 comprises:
S21, mining the local timing information within an event with a timing-aware self-attention mechanism, capturing the differences between the response posts generated by rumor and non-rumor events at different time stages and the latent timing dependencies between response posts;
S21-1, to encode the time-delay information of each response post, generating a position embedding for each response post using the position encoding formula of the Transformer model:

$PE_{(pos,\,2k)} = \sin\big(pos / 10000^{2k/d}\big)$,
$PE_{(pos,\,2k+1)} = \cos\big(pos / 10000^{2k/d}\big)$,

where pos denotes the position of the response post in the sequence, d denotes the dimension of the PE, and 2k and 2k+1 index the even and odd dimensions respectively, i.e., 2k ≤ d, 2k+1 ≤ d;
S21-2, combining the embedding of each response post with the corresponding position embedding to capture the timing information between response posts:

$m_j' = m_j + PE_j$;
S21-3, focusing attention on the important response posts with a multi-head attention mechanism:

$\text{head}_t = \text{Attention}(M_i' W_t^Q,\, M_i' W_t^K,\, M_i' W_t^V)$,
$\hat{M}_i = [\text{head}_1; \text{head}_2; \dots; \text{head}_T]\, W^O$;
S22, fusing the timing-aware response posts into the representation of the source post to obtain the local temporal representation $\tilde{m}_i$ of the event; the series of response posts is regarded as first-order neighbor nodes of the corresponding source post and fused with the aggregation function of the graph attention network, computed as:

$\alpha_{ii} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_i]))$,
$\alpha_{ij} = \text{softmax}(\text{LeakyReLU}(a^T [m_i ; m_j]))$,
$\tilde{m}_i = \sigma\big(\alpha_{ii} W m_i + \textstyle\sum_{j \in N(m_i)} \alpha_{ij} W m_j\big)$,

where $\sigma$ is the sigmoid activation function, $\alpha_{ii}$ and $\alpha_{ij}$ denote the attention scores between node i and itself and between node i and node j respectively, $N(m_i)$ is the set of response posts corresponding to the current source post, and $W$ is the weight parameter of this layer's node feature transformation.
4. The time-sequence-aware heterogeneous graph neural rumor detection model of claim 1, wherein the step S03 further comprises:
S31, establishing the global structural relations between events based on common users, the specific process being as follows:

for a specific user $u_j \in U_i$ participating in event $c_i$, computing the attention vector $\gamma_j$ of user $u_j$ over different aspects:
$\gamma_j = \tanh(W_c \cdot u_j + b)$,

where $W_c \in \mathbb{R}^{d \times d}$ is the feature transformation matrix and $\gamma_j \in \mathbb{R}^{d}$ is the attention vector over the different aspects; a larger $\gamma_j^{(k)}$ indicates that the k-th aspect of the user embedding $u_j$ has a greater impact on message propagation;
S32, by means of the attention vector $\gamma_j$, aggregating all users participating in event $c_i$ via element products to capture the global structural relations between events:

$\hat{c}_i = \sum_{u_j \in U_i} \gamma_j \odot u_j$.
5. The time-sequence-aware heterogeneous graph neural rumor detection model of claim 1, wherein the step S04 specifically comprises:
S41, concatenating the local temporal representation and the global structural representation of the event as the final representation of the event node, and computing the prediction for the event, i.e., the probability of the event being each label, through a fully connected layer and a softmax function:

$\hat{y}_i = \text{softmax}(\text{Fc}([\tilde{m}_i ; \hat{c}_i]))$,

where Fc(·) is a fully connected layer whose output dimension matches the number of classification categories;
S42, finally, defining the loss function of the model as the cross entropy between the prediction and the true label:

$\mathcal{L}(\theta) = -\sum_i \sum_{k=1}^{r} y_i^{(k)} \log \hat{y}_i^{(k)}$,

where r is the number of classes, θ is the set of parameters of the entire model, and $y_i \in \{0,1,2,3\}$ (Twitter) or $y_i \in \{0,1\}$ (Weibo) is the true label value.
CN202210721077.2A 2022-06-23 2022-06-23 Time-sequence-aware heterogeneous graph neural rumor detection model Pending CN115168678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210721077.2A CN115168678A (en) 2022-06-23 Time-sequence-aware heterogeneous graph neural rumor detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210721077.2A CN115168678A (en) 2022-06-23 Time-sequence-aware heterogeneous graph neural rumor detection model

Publications (1)

Publication Number Publication Date
CN115168678A true CN115168678A (en) 2022-10-11

Family

ID=83487561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210721077.2A Pending CN115168678A (en) 2022-06-23 2022-06-23 Time-sequence-aware heterogeneous graph neural rumor detection model

Country Status (1)

Country Link
CN (1) CN115168678A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115809327A (en) * 2023-02-08 2023-03-17 Sichuan University Real-time social network rumor detection method for multi-modal fusion and topics
CN115809327B (en) * 2023-02-08 2023-05-05 Sichuan University Real-time social network rumor detection method based on multi-modal fusion and topics


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination