CN115422945A - Rumor detection method and system integrating emotion mining - Google Patents
- Publication number
- CN115422945A (application number CN202211139407.3A)
- Authority
- CN
- China
- Prior art keywords
- vector
- content
- knowledge
- text content
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/30—Semantic analysis
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/30—Information retrieval of unstructured textual data > G06F16/36—Creation of semantic tools, e.g. ontology or thesauri > G06F16/367—Ontology
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/20—Natural language analysis > G06F40/205—Parsing > G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/08—Learning methods
Abstract
The invention provides a rumor detection method integrating emotion mining, which comprises the following steps. Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT. Step B: use the training data set DT to train a deep learning network model N based on multi-level attention and a knowledge graph, the training comprising analyzing the source posts and predicting their authenticity labels. Step C: input the text content and comment content of a source post into the trained deep learning network model N to obtain the authenticity label of the source post. The invention can improve the accuracy of rumor detection on microblogs.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a rumor detection method and system integrating emotion mining.
Background
Rumor detection, also known as fake news detection, is an important task in the field of natural language processing (NLP). Rumor detection can be regarded as a supervised text classification problem whose instances generally fall into two classes, rumor and non-rumor. With the development of internet technology, social network platforms such as Weibo (microblog) and Twitter have rapidly become part of everyday life. On social network platforms, people are not just recipients of information but also creators of content, and these platforms greatly accelerate the speed and depth of information exchange. Because social network platforms provide timely and comprehensive information about events occurring around the world, an increasing number of people participate in discussions of hot topics on them. On the one hand, such discussion facilitates the dissemination of news, enabling people to learn more easily and quickly what is happening. On the other hand, this convenient environment also lowers the cost of spreading untruthful information. False rumors typically use fabricated or forged images and provocative language to mislead readers and spread quickly, and their spread can have large-scale negative effects on society, causing social unrest.
In recent years, with the rise of deep learning, deep learning techniques have also been widely applied to rumor detection. The most common are convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Since CNNs perform well at capturing semantic information from text, some researchers have applied them to content-based rumor detection. However, CNNs cannot take full advantage of the contextual information in a sentence, which is critical for modeling the semantic relationship between an aspect and its context, so the performance of CNN-based neural network models is limited in rumor detection tasks. To address this problem, many researchers have employed RNNs, particularly long short-term memory (LSTM) networks and gated recurrent units (GRUs), to extract contextual semantic information for rumors. Unlike a CNN, an RNN treats a sentence as a sequence of words, consumes each word in temporal order, feeds the output of one hidden step as the input of the next, and thereby continuously learns the contextual information in sequence data. Ma et al. used recurrent neural networks to capture the semantic changes between each source post and its forwarded comments and to predict based on those changes. RNN-based neural network models are significantly better than CNN-based models at rumor detection.
Researchers have noted that the rumor characteristics of a given post are often determined by a few keywords rather than by all words in the context, yet an RNN cannot accurately estimate the contribution of different context words to the overall semantics. In contrast, an attention mechanism can capture the importance of each context word by computing an attention weight between that word and the semantics of the given post, and then use these weights to compute the semantic representation of the post.
However, most of these neural network models ignore the emotional information in a post, which reflects the publisher's feeling toward the post content and is especially important for correctly judging the post's authenticity label. Recent studies have therefore focused on finding emotional characteristics that distinguish false rumors from true information. Ajao et al. verified that there is a relationship between the authenticity of news (true or false) and the use of emotional words, and designed an emotion feature (the ratio of the number of negative words to the number of positive words) to help detect fake news. Furthermore, Giachanou et al. extracted emotional features from news content for rumor detection based on an emotion dictionary. However, this prior work ignores the syntactic dependency information and external knowledge required for emotion analysis, so the emotional information is not sufficiently extracted.
Disclosure of Invention
The invention provides a rumor detection method and system integrating emotion mining, which can improve the accuracy of rumor detection on microblogs.
The invention adopts the following technical scheme.
A rumor detection method integrating emotion mining, the method comprising the following steps:
Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT;
Step B: train a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
Step C: input the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
Step B comprises the following steps:
Step B1: encode each training sample in the training data set DT to obtain the initial characterization vector T_st of the text content, the initial characterization vector T_rt of the comment content, and the syntactic adjacency matrix A_st;
Step B2: generate the syntactic knowledge subgraph SK corresponding to the text content from the knowledge graph and the syntactic dependency graph according to a syntactic-knowledge-subgraph construction algorithm, obtain its adjacency matrix A_SK, and then encode its nodes to obtain the node knowledge representation vector H_SK of the syntactic knowledge subgraph SK;
Step B3: input the initial text content characterization vector T_st obtained in step B1 into a bidirectional long short-term memory network (Bi-LSTM) to obtain a context-enhanced text content characterization vector H_st, and let U_st = H_st; then input T_st together with the initial comment content characterization vector T_rt into a multi-head cross-attention mechanism to obtain the text-content-based comment characterization vector P_sr, while inputting T_st into a multi-head self-attention mechanism to obtain the enhanced text content characterization vector P_s; then input P_sr and P_s separately into a pooling layer for average pooling to obtain the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s);
Step B4: input the node knowledge representation vector H_SK of the subgraph SK and the characterization vector U_st obtained in step B3 into two K-layer graph convolution networks, denoted the text knowledge graph convolution network SKGCN and the text content graph convolution network SCGCN, which learn external knowledge information and extract syntactic information respectively; meanwhile, apply a knowledge guidance mechanism to each layer of nodes of SCGCN and SKGCN to obtain the graph knowledge characterization vector V_sks of the source post;
Step B5: use a cross-attention mechanism to fuse the graph knowledge characterization vector V_sks obtained in step B4 with the sentence characterization vector U_st into a knowledge-enhanced sentence-level characterization vector E_sd, further improving the model's ability to extract information; then strengthen E_sd with a multi-head self-attention mechanism to obtain the sentence representation E_mt that aggregates word-level information; finally, reduce the noise from irregular sentences through a gating mechanism to obtain the source-post emotion characterization vector E_sf;
Step B6: input the average-pooled comment content sentence characterization vector and the average-pooled enhanced text content characterization vector corresponding to the source post together into a multi-head cross-attention mechanism and, through average pooling, obtain the comprehensive semantic representation C_sr of the comment content; then input the average-pooled enhanced text content characterization vector and C_sr into a fusion gating mechanism to obtain the fine-grained semantic representation vector V_t of the source post;
Step B7: combine the emotion characterization vector E_sf obtained in step B5 with the fine-grained semantic representation vector V_t obtained in step B6 to obtain the final characterization vector E_f; then feed E_f through a fully connected layer and a softmax function to obtain the prediction result; compute the gradient of each parameter in the deep learning network model from the target loss function loss by back-propagation, and update the parameters by stochastic gradient descent;
Step B8: terminate the training of the deep learning network model N when the iterative change of its loss value is smaller than a given threshold or the maximum number of iterations is reached.
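To make the data flow of steps B1 to B8 concrete, the following is a minimal PyTorch sketch of the forward pass under stated assumptions: the knowledge-graph branch, knowledge guidance, and gating components are omitted, the dimension d = 128 and head count are illustrative, and all module and variable names (e.g. RumorDetectorSketch) are placeholders rather than the patent's reference implementation.

```python
import torch
import torch.nn as nn

class RumorDetectorSketch(nn.Module):
    """Reduced sketch of model N: Bi-LSTM text encoding (step B3),
    text-to-comment cross attention (steps B3/B6), pooling, and a
    classifier over the four authenticity labels (step B7)."""

    def __init__(self, d: int = 128, num_classes: int = 4):
        super().__init__()
        self.bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * d, num_classes)

    def forward(self, T_st, T_rt):
        U_st, _ = self.bilstm(T_st)                  # context-enhanced text vectors
        P_sr, _ = self.cross_attn(T_st, T_rt, T_rt)  # comment repr. guided by the text
        features = torch.cat([U_st.mean(dim=1), P_sr.mean(dim=1)], dim=-1)
        return self.classifier(features)             # logits over the 4 labels

# toy run: batch of 2 posts, 10 text words, 30 comment words, d = 128
logits = RumorDetectorSketch()(torch.randn(2, 10, 128), torch.randn(2, 30, 128))
```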
Step B1 comprises the following steps:
Step B11: traverse the training set DT, perform word segmentation on the text content and comment content of each source post, and remove stop words; each training sample in DT is represented as dt = (st, rt, l), where st is the text content of the source post, rt is the comment content corresponding to the source post, and l is the authenticity label corresponding to the source post, l ∈ {non-rumor, false rumor, unverified rumor, debunked rumor};
the text content st of the source post is represented as st = {w_1^st, w_2^st, …, w_n^st}, where w_i^st is the i-th word in the text content st, i = 1, 2, …, n, and n is the number of words in the source post's text content st;
the comment content rt of the source post is represented as rt = {w_1^rt, w_2^rt, …, w_m^rt}, where w_j^rt is the j-th word in the comment content rt, j = 1, 2, …, m, and m is the number of words in the comment content rt;
Step B12: encode the text content obtained in step B11 to obtain the initial characterization vector T_st of the text content st, T_st = [x_1^st, x_2^st, …, x_n^st], where x_i^st is the word vector corresponding to the i-th word w_i^st, found by lookup in the pre-trained word vector matrix E ∈ R^(d×|V|); d represents the dimensionality of the word vectors and |V| is the number of words in the dictionary V;
Step B13: encode the comment content obtained in step B11 to obtain the initial characterization vector T_rt of the comment content rt, T_rt = [x_1^rt, x_2^rt, …, x_m^rt], where x_j^rt is the word vector corresponding to the j-th word w_j^rt, found by lookup in the pre-trained word vector matrix E ∈ R^(d×|V|); d represents the dimensionality of the word vectors and |V| is the number of words in the dictionary V;
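A minimal sketch of the lookup in steps B12 and B13 follows, assuming the pre-trained matrix is stored row-wise as (|V|, d) (the transpose of the d×|V| layout described above) and that out-of-vocabulary words fall back to index 0; the names encode, word2id and E are illustrative only.

```python
import numpy as np

def encode(words, word2id, E):
    """Steps B12/B13: stack the pre-trained word vector of each word to
    form the initial characterization vector (shape: len(words) x d)."""
    ids = [word2id.get(w, 0) for w in words]   # assumed OOV fallback to index 0
    return E[ids]

# toy dictionary with |V| = 3 words and d = 4
E = np.random.rand(3, 4).astype("float32")
T_st = encode(["rumor", "spreads"], {"rumor": 1, "spreads": 2}, E)
```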
Step B14: perform syntactic dependency analysis on the text content st to obtain the corresponding syntactic dependency tree DTD and the n×n syntactic adjacency matrix A_st; the syntactic dependency tree is represented as DTD = {(w_i^st, w_j^st)}, where each pair (w_i^st, w_j^st) indicates that a syntactic dependency exists between the text content words w_i^st and w_j^st.
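The adjacency matrix of step B14 can be sketched as follows, assuming the dependency tree is given as 0-based (i, j) word-index pairs; the self-loops and symmetric edges are common GCN practice and an assumption here, not something the patent states.

```python
import numpy as np

def syntactic_adjacency(n, dtd):
    """Step B14: build the n x n syntactic adjacency matrix A_st from
    dependency pairs; A[i][j] = 1 where words i and j are syntactically
    dependent, with assumed self-loops on the diagonal."""
    A = np.eye(n, dtype="float32")
    for i, j in dtd:
        A[i, j] = A[j, i] = 1.0
    return A

A_st = syntactic_adjacency(4, [(0, 1), (1, 2), (1, 3)])  # toy 4-word sentence
```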
Step B2 comprises the following steps:
Step B21: take each original word node in the syntactic dependency tree DTD as a root node and expand hop layers outward from the knowledge graph to generate child nodes; at each layer, select u nodes of the knowledge graph that share an edge with the nodes of the previous layer as the nodes of that layer, so that each original word node is expanded into q child nodes in total; this finally yields a syntactic knowledge subgraph SK whose total number of nodes is z = n + n × q, together with a z×z adjacency matrix A_SK; the syntactic knowledge subgraph SK contains three kinds of node pairs: (w_i^st, w_j^st), meaning a syntactic dependency exists between the text content words w_i^st and w_j^st; (w_i^st, e_j), meaning knowledge node word e_j is an expanded child node of text content word w_i^st; and (e_i, e_j), meaning knowledge node word e_j is an expanded knowledge child node of knowledge node word e_i; u is the number of nodes selected from the knowledge graph at each layer and hop is the number of expansion layers of the topology;
Step B22: encode the nodes of the syntactic knowledge subgraph SK with knowledge graph embeddings to obtain the node knowledge representation vector H_SK, and let G_SK,0 = H_SK serve as the initial input of the text knowledge graph convolution network SKGCN; the knowledge word vector of the i-th node word is found by lookup in the pre-trained knowledge word vector matrix, where d represents the dimensionality of the knowledge word vectors and |V| is the number of words in the knowledge embedding vocabulary V.
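The hop-layer expansion of step B21 might look like the sketch below, where the knowledge graph is a plain neighbour dictionary; since the patent does not say how the u neighbours are chosen, this sketch simply takes the first u, and repeated nodes are not deduplicated.

```python
def expand_subgraph(words, kg, u, hop):
    """Step B21 sketch: expand each original word node through `hop`
    layers, taking at most u knowledge-graph neighbours per node; the
    returned nodes and edges feed the z x z adjacency matrix A_SK."""
    nodes, frontier, edges = list(words), list(words), []
    for _ in range(hop):
        nxt = []
        for node in frontier:
            for nb in kg.get(node, [])[:u]:   # assumed selection rule: first u
                edges.append((node, nb))
                nodes.append(nb)
                nxt.append(nb)
        frontier = nxt
    return nodes, edges

kg = {"vaccine": ["medicine", "immunity"], "medicine": ["science"]}
nodes, edges = expand_subgraph(["vaccine"], kg, u=2, hop=2)
```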
Step B3 comprises the following steps:
Step B31: input the initial text content characterization vector T_st = [x_1^st, …, x_n^st] into the forward layer and backward layer of the first bidirectional long short-term memory network to obtain the forward hidden-state sequence and the backward hidden-state sequence, each hidden state being produced by the activation function f for i = 1, 2, …, n; the context-enhanced text content characterization vector H_st = [h_1, h_2, …, h_n] is obtained by concatenating the two directions, h_i = [h_i^fw : h_i^bw], i = 1, 2, …, n, where ":" denotes the vector concatenation operation; H_st is U_st;
Step B32: an initial characterization vector T of the text content st st And an initial token vector T of the review content rt rt Inputting the two into a multi-head cross attention mechanism together to obtain a comment characterization vector P based on text content sr The calculation formula is as follows:
P sr =MultiHead(T st ,T rt ,T rt ) A formula seven;
MultiHead(Q′,K′,V′)=Concat(head 1 ,head 2 ,…,head h )W o a formula eight;
head i =Attention(Q′W i Q ,K′W i K ,V′W i v ) A formula of nine;
wherein, multihead represents a multi-head attention mechanism, Q ', K ' and V ' represent input vectors of the multi-head attention mechanism, and an initial characterization vector T of text content st As a matrix Q', the initial token vector T of the corresponding review content rt rt As K 'and V'; head i The output vector calculated for the ith sub-vector of Q ', K ', V ' using Attention mechanism Attention (·), h being the number of heads of the multi-head Attention mechanism, W o Training parameters for a multi-headed attention mechanism, W i Q ,W i K ,Is a weight matrix of the linear projection,is a scale factor;
Step B33: input the initial text content characterization vector T_st into a multi-head self-attention mechanism to obtain the enhanced text content characterization vector P_s, computed as follows:
P_s = MultiHead(T_st, T_st, T_st)    (Formula 11)
MultiHead(Q′, K′, V′) = Concat(head_1, head_2, …, head_h) W_1    (Formula 12)
head_i = Attention(Q′W_i^Q, K′W_i^K, V′W_i^V)    (Formula 13)
where MultiHead denotes the multi-head attention mechanism and Q′, K′, V′ denote its input matrices: the initial text content characterization vector T_st serves as the matrices Q′, K′ and V′; head_i is the output vector computed with the attention mechanism Attention(·) on the i-th sub-vectors of Q′, K′, V′; h is the number of heads; W_1 is a training parameter of the multi-head attention mechanism; W_i^Q, W_i^K, W_i^V are the weight matrices of the linear projections; and √d_k is the scale factor;
Step B34: input the text-content-based comment characterization vector P_sr and the enhanced text content characterization vector P_s separately into a pooling layer for average pooling to obtain the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s), where MeanPool is the average pooling function.
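Steps B31 to B34 map directly onto standard PyTorch modules; the sketch below uses nn.MultiheadAttention, whose projected scaled dot-product attention matches the form of Formulas 7 to 13, with an illustrative dimension and head count.

```python
import torch
import torch.nn as nn

d, h = 128, 4                                              # assumed d and head count
bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
cross_attn = nn.MultiheadAttention(d, h, batch_first=True)
self_attn = nn.MultiheadAttention(d, h, batch_first=True)

T_st, T_rt = torch.randn(1, 10, d), torch.randn(1, 30, d)  # toy T_st and T_rt

U_st, _ = bilstm(T_st)                       # step B31: H_st (= U_st)
P_sr, _ = cross_attn(T_st, T_rt, T_rt)       # step B32: Q' = T_st, K' = V' = T_rt
P_s, _ = self_attn(T_st, T_st, T_st)         # step B33: Q' = K' = V' = T_st
P_sr_avg = P_sr.mean(dim=1)                  # step B34: MeanPool(P_sr)
P_s_avg = P_s.mean(dim=1)                    # step B34: MeanPool(P_s)
```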
Step B4 comprises the following steps:
Step B41: input the subgraph node knowledge characterization vector G_SK,0 obtained in step B22 into the first graph convolution layer of the text knowledge graph convolution network SKGCN, update the vector representation of each subgraph node using the adjacency matrix A_SK, and output G_SK,1 as the input of the next graph convolution layer;
the output g_i^(SK,1) of node i in the first graph convolution layer is computed as follows:
g_i^(SK,1) = relu( (Σ_(j=1)^z A_ij^SK · W_SK · g_j^(SK,0)) / (d_i + 1) + b_SK )
where b_SK is a bias term; W_SK and b_SK are both learnable parameters, W_SK being a weight matrix; relu is the activation function; node i in SKGCN corresponds to the i-th word in the comment content, and the edges between nodes represent the knowledge connections between words; d_i denotes the degree of node i, and d_i + 1 is used as the divisor to prevent a degree of 0 from causing a division error;
Step B42: for the text content graph convolution network SCGCN, input the context-enhanced text content characterization vector U_st obtained in step B31 into the first graph convolution layer of SCGCN, update the vector representation of each word using the syntactic adjacency matrix A_st, and output U_st,1;
the output u_i^(st,1) of node i in the first graph convolution layer is computed as follows:
u_i^(st,1) = relu( (Σ_(j=1)^n A_ij^st · W_st · u_j^(st,0)) / (d_i + 1) + b_st )
where u_j^(st,0) is the j-th vector of U_st; W_st and b_st are both learnable parameters, W_st being a weight matrix and b_st a bias term; relu is the activation function; node i in the graph convolution network corresponds to the i-th word in the comment content, and the edges between nodes represent the syntactic dependencies between the words in the comment content; d_i denotes the degree of node i, and d_i + 1 is used as the divisor to prevent a degree of 0 from causing a division error;
For the knowledge guidance mechanism, the first-layer output G_SK,1 of SKGCN is stripped of all content except the words in the current comment content sentence, yielding the first-layer knowledge representation of the text content; this representation is then combined with the first-layer output U_st,1 of SCGCN through a cross-attention mechanism to obtain the knowledge-aware comment content sentence representation G_SD,1, which serves as the input of the next SCGCN layer;
the output g_i^(SD,1) of node i of the first SCGCN graph convolution layer after the knowledge guidance mechanism is computed by weighting the knowledge representation with attention, where (·)^T denotes the transpose operation and α_i is the attention weight of the knowledge for the i-th word in the comment content s;
Step B43: feed G_SK,1 and G_SD,1 as the inputs of the next graph convolution layers of SKGCN and SCGCN respectively, and repeat steps B41 and B42;
for SKGCN, the output G_SK,k of the k-th graph convolution layer serves as the input of the (k+1)-th layer, and when the iteration ends the graph convolution characterization vector G_SK,K is obtained; for SCGCN, U_st,k is the output of the k-th graph convolution layer, and through the knowledge interaction mechanism U_st,k and G_SD,k serve as the input of the (k+1)-th layer; after the iterations finish, the graph knowledge characterization vector V_sks is obtained, where 1 ≤ k ≤ K and K is the number of layers of the graph convolution networks.
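One layer of SKGCN or SCGCN (steps B41 and B42) reduces to the update below; placing the bias inside the linear transform rather than after the aggregation is a sketch-level choice, while the degree-plus-one divisor follows the formulas above.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Steps B41/B42 sketch: relu of the adjacency-aggregated, linearly
    transformed node features, divided by node degree + 1 so that a
    degree of 0 cannot cause a division error."""

    def __init__(self, d: int):
        super().__init__()
        self.W = nn.Linear(d, d)             # learnable weight matrix and bias

    def forward(self, X, A):
        deg = A.sum(dim=-1, keepdim=True)    # d_i, the degree of node i
        return torch.relu(A @ self.W(X) / (deg + 1.0))

# toy use: z = 5 subgraph nodes with 16-dimensional features
G_SK_1 = GCNLayer(16)(torch.randn(5, 16), torch.eye(5))
```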
Step B5 comprises the following steps:
Step B51: input the context-enhanced text content characterization vector U_st obtained in step B31 and the vector V_sks obtained in step B43 into an attention network, which selects the important knowledge information to obtain the knowledge-enhanced sentence-level characterization vector E_sd; in the attention computation, (·)^T denotes the transpose operation and ε_i is the attention weight of the i-th word in the comment content s;
Step B52: input the knowledge-enhanced sentence-level characterization vector E_sd obtained in step B51 into a multi-head self-attention mechanism to obtain the sentence characterization vector E_mt that aggregates word-level information,
E_mt = MultiHead(E_sd, E_sd, E_sd)    (Formula 29)
Step B53: to handle the noise that irregular sentences bring to the model, input the sentence characterization vector E_mt into a gating function to filter out irrelevant information and obtain the vector E_sda; then input E_sda into a multi-layer perceptron (MLP) to obtain the emotion characterization vector E_sf of the source post; the weight matrices and bias terms of the gating function and the MLP are all learnable parameters.
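Because the exact gate of step B53 (Formulas 30 to 32) was carried by figures that are missing here, the sketch below uses a common sigmoid-gate formulation as a stand-in, followed by a small MLP; all dimensions are illustrative.

```python
import torch
import torch.nn as nn

d = 128                                          # assumed hidden dimension
gate = nn.Sequential(nn.Linear(d, d), nn.Sigmoid())
mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))

E_mt = torch.randn(1, d)        # aggregated sentence representation (step B52)
E_sda = gate(E_mt) * E_mt       # step B53: gate filters noisy components
E_sf = mlp(E_sda)               # step B53: source-post emotion characterization
```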
Step B6 comprises the following steps:
Step B61: input the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s) corresponding to the source post together into a multi-head cross-attention mechanism, giving the cross-attention output C′, and obtain the comprehensive semantic representation of the comment content through average pooling:
C_sr = MeanPool(C′)    (Formula 33)
where C′ is the output of the multi-head cross-attention mechanism and MeanPool is the average pooling function;
Step B62: input the average-pooled enhanced text content characterization vector MeanPool(P_s) and the comprehensive semantic representation C_sr of the comment content together into a fusion gating mechanism to obtain the fine-grained semantic representation vector V_t of the source post;
in the fusion gate, σ is the sigmoid activation function, w_1 and the associated bias are the learnable parameters of the fusion gating mechanism, and ⊙ is the element-wise (dot) product operation.
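Formulas 34 and 35 are likewise lost with the figures; a standard gated interpolation, sketched below under that assumption, matches the description of the fusion gating mechanism (sigmoid gate, learnable w_1, element-wise product).

```python
import torch
import torch.nn as nn

d = 128                                      # assumed dimension
w1 = nn.Linear(2 * d, d)                     # learnable fusion parameters

P_s_avg, C_sr = torch.randn(1, d), torch.randn(1, d)       # toy inputs
g = torch.sigmoid(w1(torch.cat([P_s_avg, C_sr], dim=-1)))  # fusion gate
V_t = g * P_s_avg + (1.0 - g) * C_sr         # fine-grained semantic vector
```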
Step B7 comprises the following steps:
Step B71: connect the source-post emotion characterization vector E_sf obtained in step B53 with the vector V_t obtained in step B62 to obtain the final characterization vector E_f, computed as follows:
E_f = Concat(E_sf, V_t)    (Formula 36)
Step B72: input the final characterization vector E_f into a fully connected layer and normalize with softmax to calculate the probability that the text content belongs to each category, computed as follows:
y = W_3 · E_f + b    (Formula 37)
p_c(y) = softmax(y)    (Formula 38)
where y is the output vector of the fully connected layer, W_3 is the weight matrix of the fully connected layer, b is the bias term of the fully connected layer, and p_c(y) is the predicted probability that the text content belongs to category c, with 0 ≤ p_c(y) ≤ 1 and c ∈ {non-rumor, false rumor, unverified rumor, debunked rumor};
Step B73: use cross entropy as the loss function to calculate the loss value, update the learning rate with the gradient optimization algorithm Adam, and update the model parameters by back-propagation iterations, so that the model is trained by minimizing the loss function; the loss function loss is calculated as follows:
loss = −Σ_((st,rt,l)∈DT) log p_c(y) + λ‖θ‖₂²
where λ‖θ‖₂² is the L2 regularization term, λ is the regularization coefficient, θ contains all the parameters, and c is the authenticity label corresponding to the text content.
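A single training step of step B73 can be sketched as follows; the Adam optimizer's weight_decay stands in for the λ‖θ‖² regularization term (step B7 also mentions stochastic gradient descent, so the optimizer choice here follows step B73), and the stand-in classifier and batch shapes are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(256, 4)                  # stand-in for the full network N
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
loss_fn = nn.CrossEntropyLoss()            # cross entropy over the 4 labels

E_f = torch.randn(8, 256)                  # batch of final characterization vectors
labels = torch.randint(0, 4, (8,))         # gold authenticity labels c

opt.zero_grad()
loss = loss_fn(model(E_f), labels)         # softmax + negative log-likelihood
loss.backward()                            # back-propagate parameter gradients
opt.step()                                 # update the model parameters
```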
A rumor detection system integrating emotion mining adopts the above rumor detection method, the social network medium being a microblog platform, and comprises the following modules:
a data collection module: used to extract the text content and comment content of source posts in the microblog, mark the authenticity of the source posts, and construct the training set;
a preprocessing module: used to preprocess the training samples in the training set, including word segmentation and stop-word removal;
an encoding module: used to look up, in the pre-trained word vector matrix, the word vectors of the words in the preprocessed text content and comment content to obtain the initial characterization vector of the text content and the initial characterization vector of the comment content, and to look up, in the pre-trained knowledge graph word vector matrix, the word vectors of the nodes in the syntactic knowledge subgraph to obtain the initial characterization vector of the syntactic knowledge subgraph related to the comment content;
a network training module: used to input the initial characterization vector of the text content, the initial characterization vector of the comment content, and the initial characterization vector of the syntactic knowledge subgraph into the deep learning network to obtain the final characterization vector, and to train the whole deep learning network with the probability that this characterization vector belongs to each class, together with the labels in the training set, as the loss, taking loss minimization as the objective, so as to obtain the deep learning network model based on multi-level attention and the knowledge graph;
a rumor detection module: used to extract the semantic and emotional information in the input source-post text content and comment content with NLP tools, analyze them with the trained deep learning network model based on multi-level attention and the knowledge graph, and output the predicted authenticity label of the source post.
Compared with the prior art, the invention has the following beneficial effects: the method obtains the syntactic knowledge subgraph of the corresponding comment sentence using a knowledge graph and a subgraph generation strategy, then encodes the comment content and the text content separately, learns the syntactic dependencies and external knowledge in the comment content through two graph convolution networks and a knowledge guidance mechanism, and filters sentence noise with a gating mechanism to enhance the representation of the comment sentence. The invention also learns the fine-grained semantic information between the text content and the comment content using a multi-level attention mechanism. Compared with the prior art, the method can enhance the feature representation of rumors with fine-grained semantic information and rich emotional information, further improving the precision of rumor detection and enhancing the robustness of the model.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic flow chart of a method implementation of an embodiment of the present invention;
FIG. 2 is a schematic diagram of a model architecture in an embodiment of the invention;
fig. 3 is a schematic system configuration of an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in the figures, a rumor detection method integrating emotion mining comprises the following steps:
Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT;
Step B: train a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
Step C: input the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
In this embodiment, steps B1 to B8 of step B, together with their substeps and the modules of the rumor detection system, are implemented exactly as described in the foregoing disclosure and are not repeated here.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change or refinement made to the above embodiments in accordance with the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.
Claims (10)
1. A rumor detection method integrating emotion mining, characterized in that the method comprises the following steps:
step A: collecting and extracting the text content and comment content of a source post in a social network medium, and manually marking the authenticity label of the source post to form a training data set DT;
step B: training a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
step C: inputting the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
2. The rumor detection method integrating emotion mining according to claim 1, characterized in that step B comprises the following steps:
Step B1: encoding each training sample in the training data set DT to obtain the initial characterization vector $T_{st}$ of the text content, the initial characterization vector $T_{rt}$ of the comment content, and the syntactic adjacency matrix $A_{st}$;
Step B2: generating the syntactic knowledge subgraph SK of the text content from the knowledge graph and the syntactic dependency graph according to the syntactic knowledge subgraph construction algorithm, obtaining its adjacency matrix $A_{SK}$, and then encoding the nodes to obtain the node knowledge representation vector $H_{SK}$ of the syntactic knowledge subgraph SK;
Step B3: inputting the text content initial characterization vector $T_{st}$ obtained in step B1 into a bidirectional long short-term memory network Bi-LSTM to obtain the context-enhanced text content representation vector $H_{st}$, and letting $U_{st} = H_{st}$; then inputting $T_{st}$ together with the comment content initial characterization vector $T_{rt}$ into a multi-head cross attention mechanism to obtain the text-content-based comment characterization vector $P_{sr}$, while inputting $T_{st}$ into a multi-head self-attention mechanism to obtain the text content enhanced representation vector $P_s$; then inputting $P_{sr}$ and $P_s$ respectively into a pooling layer for average pooling to obtain the average pooled comment content sentence representation vector $\overline{P}_{sr}$ and the average pooled text content enhanced representation vector $\overline{P}_s$;
Step B4: inputting the node knowledge representation vector $H_{SK}$ of the subgraph SK and the characterization vector $U_{st}$ obtained in step B3 respectively into two K-layer graph convolution networks, denoted the text knowledge graph convolution network SKGCN and the text content graph convolution network SCGCN, which learn external knowledge information and extract syntactic information; meanwhile, applying a knowledge guidance mechanism between each layer of the SCGCN and the SKGCN to obtain the graph knowledge characterization vector $V_{sks}$ of the source post;
Step B5: fusing the graph knowledge characterization vector $V_{sks}$ obtained in step B4 with the sentence characterization vector $U_{st}$ by a cross attention mechanism to obtain the knowledge-enhanced sentence-level characterization vector $E_{sd}$, further improving the model's ability to extract information; then further strengthening $E_{sd}$ with a multi-head self-attention mechanism to obtain the sentence representation $E_{mt}$ that aggregates word-level information; and reducing the noise from irregular sentences through a gating mechanism to obtain the source post emotion characterization vector $E_{sf}$;
Step B6: inputting the average pooled comment content sentence representation vector $\overline{P}_{sr}$ and the average pooled text content enhanced representation vector $\overline{P}_s$ corresponding to the source post together into a multi-head cross attention mechanism, and obtaining the comprehensive semantic representation $C_{sr}$ of the comment content through average pooling; then inputting $\overline{P}_s$ and $C_{sr}$ into a fusion gating mechanism to obtain the fine-grained semantic representation vector $V_t$ of the source post;
Step B7: combining the emotion characterization vector $E_{sf}$ obtained in step B5 with the fine-grained semantic representation vector $V_t$ obtained in step B6 to obtain the final characterization vector $E_f$; then inputting $E_f$ into a fully connected layer and a softmax function to obtain the prediction result; computing the gradient of each parameter in the deep learning network model by back propagation according to the target loss function loss, and updating each parameter by stochastic gradient descent;
Step B8: terminating the training process of the deep learning network model N when the change of the loss value between iterations is smaller than a given threshold or the maximum number of iterations is reached.
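A minimal sketch of the training schedule of steps B7 and B8 follows, assuming a PyTorch model and data loader are defined elsewhere; the Adam optimizer is used per step B73, and the stopping rule implements the loss-change threshold and maximum iteration count of step B8. Hyperparameter values are illustrative.

```python
# Sketch of the step B7/B8 schedule: iterate until the change in loss falls
# below a threshold or the maximum number of iterations is reached.
# `model` and `train_loader` are assumed to be defined elsewhere.
import torch

def train_model(model, train_loader, max_iters=100, tol=1e-4, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam, as in step B73
    criterion = torch.nn.CrossEntropyLoss()                  # cross entropy loss
    prev_loss = float("inf")
    for epoch in range(max_iters):
        total_loss = 0.0
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            logits = model(*inputs)          # final characterization -> class scores
            loss = criterion(logits, labels)
            loss.backward()                  # back propagation (step B7)
            optimizer.step()
            total_loss += loss.item()
        # Step B8: stop when the iteration-to-iteration loss change is small.
        if abs(prev_loss - total_loss) < tol:
            break
        prev_loss = total_loss
    return model
```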
3. The rumor detection method integrating emotion mining according to claim 2, characterized in that step B1 comprises the following steps:
Step B11: traversing the training set DT, performing word segmentation on the text content and comment content of each source post and removing stop words, each training sample in DT being represented as dt = (st, rt, l), where st is the text content of the source post, rt is the comment content corresponding to the source post, and l is the authenticity label of the source post, l ∈ {non-rumor, rumor, unverified rumor, debunked rumor};
the text content st of the source post is represented as

$$st = \{w_1^{st}, w_2^{st}, \ldots, w_n^{st}\}$$

where $w_i^{st}$ is the ith word in the text content st, i = 1, 2, …, n, and n is the number of words in the text content st of the source post;
the comment content rt of the source post is represented as

$$rt = \{w_1^{rt}, w_2^{rt}, \ldots, w_m^{rt}\}$$

where $w_j^{rt}$ is the jth word in the comment content rt, j = 1, 2, …, m, and m is the number of words in the comment content rt;
Step B12: encoding the text content $\{w_1^{st}, \ldots, w_n^{st}\}$ obtained in step B11 to obtain the initial characterization vector $T_{st}$ of the text content st, expressed as

$$T_{st} = \{t_1^{st}, t_2^{st}, \ldots, t_n^{st}\}$$

where each $t_i^{st} \in \mathbb{R}^{d}$ is the word vector corresponding to the ith word $w_i^{st}$, looked up in the pre-trained word vector matrix $E \in \mathbb{R}^{d \times |V|}$, d is the dimensionality of the word vectors, and |V| is the number of words in the dictionary V;
Step B13: encoding the comment content $\{w_1^{rt}, \ldots, w_m^{rt}\}$ obtained in step B11 to obtain the initial characterization vector $T_{rt}$ of the comment content rt, expressed as

$$T_{rt} = \{t_1^{rt}, t_2^{rt}, \ldots, t_m^{rt}\}$$

where each $t_j^{rt} \in \mathbb{R}^{d}$ is the word vector corresponding to the jth word $w_j^{rt}$, looked up in the pre-trained word vector matrix $E \in \mathbb{R}^{d \times |V|}$, d is the dimensionality of the word vectors, and |V| is the number of words in the dictionary V;
Step B14: performing syntactic dependency analysis on the text content st to obtain the corresponding syntactic dependency tree DTD and the n-order syntactic adjacency matrix $A_{st}$; the syntactic dependency tree DTD is represented as the set of word pairs $(w_i^{st}, w_j^{st})$ between which a syntactic dependency relationship holds.
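The following sketch illustrates the embedding lookup of steps B12–B13 and the adjacency matrix construction of step B14. spaCy is used only as an assumed stand-in for a dependency parser (the claim does not prescribe a tool, and the source data would call for a Chinese parser); word2id and embedding_matrix are hypothetical inputs.

```python
# Sketch of steps B11-B14: embedding lookup and construction of the n-order
# syntactic adjacency matrix A_st from a dependency parse.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")  # illustrative stand-in parser

def encode_text(words, word2id, embedding_matrix):
    """Look up each word's vector in a pre-trained embedding matrix (steps B12/B13)."""
    ids = [word2id.get(w, 0) for w in words]  # 0 = out-of-vocabulary index
    return embedding_matrix[ids]              # shape: (len(words), d)

def syntactic_adjacency(sentence):
    """Build a symmetric n-order adjacency matrix from the dependency tree (step B14)."""
    doc = nlp(sentence)
    n = len(doc)
    A = np.eye(n, dtype=np.float32)           # self-loops added here as a common choice
    for token in doc:
        if token.i != token.head.i:           # each dependency edge links word and head
            A[token.i, token.head.i] = 1.0
            A[token.head.i, token.i] = 1.0
    return A
```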
4. The rumor detection method integrating emotion mining according to claim 3, characterized in that step B2 comprises the following steps:
Step B21: taking each original word node in the syntactic dependency tree DTD as a root node, expanding hop layers from the knowledge graph to generate child nodes, and at each layer selecting u nodes that are connected by an edge in the knowledge graph to the nodes of the previous layer as the nodes of that layer, so that each root node has $q = \sum_{l=1}^{hop} u^{l}$ expanded child nodes, finally obtaining the syntactic knowledge subgraph SK with a total node count of z = n + n × q and its z-order adjacency matrix $A_{SK}$; in the syntactic knowledge subgraph SK, a knowledge node word may be an expanded node of a text content word or a knowledge-expanded child node of another knowledge node word, and an edge between two text content words indicates a syntactic dependency relationship between them; u is the number of nodes selected in the knowledge graph at each layer, and hop is the number of expansion layers;
Step B22: encoding the nodes of the syntactic knowledge subgraph SK with knowledge graph embeddings to obtain the node knowledge representation vector $G_{SK,0}$, and letting $G_{SK,0}$ serve as the initial input of the text knowledge graph convolution network SKGCN; the knowledge word vector corresponding to the ith word $w_i^{kg}$ is looked up in the pre-trained knowledge word vector matrix $E_{kg} \in \mathbb{R}^{d \times |V|}$, where d is the dimension of the knowledge word vectors and |V| is the number of words in the knowledge word embedding vocabulary V.
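A sketch of the hop-layer expansion of step B21 follows, assuming the knowledge graph is available as a neighbor lookup table kg_neighbors; the neighbor selection order is illustrative, and the actual expansion may yield fewer than u children when a node has fewer knowledge-graph neighbors than u.

```python
# Sketch of the step B21 subgraph construction: starting from each word node of
# the dependency tree, expand `hop` layers in the knowledge graph, taking up to
# `u` neighbors per node per layer. `kg_neighbors` (word -> related words) is an
# assumed lookup built from the knowledge graph.
import numpy as np

def build_syntactic_knowledge_subgraph(words, syntactic_edges, kg_neighbors, u=2, hop=2):
    nodes = list(words)                       # the n original word nodes
    edges = list(syntactic_edges)             # (i, j) index pairs from the dependency tree
    frontier = list(range(len(words)))        # current layer, as node indices
    for _ in range(hop):
        next_frontier = []
        for i in frontier:
            for nb in kg_neighbors.get(nodes[i], [])[:u]:  # pick u connected KG nodes
                nodes.append(nb)
                j = len(nodes) - 1
                edges.append((i, j))          # edge between parent node and expansion
                next_frontier.append(j)
        frontier = next_frontier
    z = len(nodes)                            # total node count (z = n + expansions)
    A = np.eye(z, dtype=np.float32)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return nodes, A                           # subgraph SK and its z-order adjacency A_SK
```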
5. The rumor detection method integrating emotion mining according to claim 4, characterized in that step B3 comprises the following steps:
Step B31: inputting the initial characterization vector $T_{st}$ of the text content sequentially into the forward layer and the backward layer of the bidirectional long short-term memory network to obtain the forward hidden state vector sequence $\overrightarrow{H}_{st}$ and the backward hidden state vector sequence $\overleftarrow{H}_{st}$, i.e.

$$\overrightarrow{H}_{st} = \overrightarrow{\mathrm{LSTM}}(T_{st}), \qquad \overleftarrow{H}_{st} = \overleftarrow{\mathrm{LSTM}}(T_{st})$$

the context-enhanced text content characterization vector is obtained by concatenation, $H_{st} = [\overrightarrow{H}_{st} : \overleftarrow{H}_{st}]$, where ":" denotes the vector concatenation operation; $H_{st}$ is $U_{st}$;
Step B32: inputting the initial characterization vector $T_{st}$ of the text content st and the initial characterization vector $T_{rt}$ of the comment content rt together into a multi-head cross attention mechanism to obtain the text-content-based comment characterization vector $P_{sr}$, computed as follows:

$$P_{sr} = \mathrm{MultiHead}(T_{st}, T_{rt}, T_{rt}) \quad \text{(formula seven)}$$

$$\mathrm{MultiHead}(Q', K', V') = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)W^{o} \quad \text{(formula eight)}$$

where MultiHead denotes the multi-head attention mechanism; Q', K' and V' are its input vectors, with the text content initial characterization vector $T_{st}$ serving as the matrix Q' and the comment content initial characterization vector $T_{rt}$ serving as K' and V'; $\mathrm{head}_i = \mathrm{Attention}(Q'W_i^{Q}, K'W_i^{K}, V'W_i^{V})$ is the output vector computed from the ith sub-vectors of Q', K', V' with the attention mechanism $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^{T}/\sqrt{d_k})V$; h is the number of heads of the multi-head attention mechanism; $W^{o}$ is a training parameter of the multi-head attention mechanism; $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ are the weight matrices of the linear projections; and $\sqrt{d_k}$ is a scale factor;
Step B33: inputting the text content initial characterization vector $T_{st}$ into a multi-head self-attention mechanism to obtain the text content enhanced representation vector $P_s$, computed as follows:

$$P_s = \mathrm{MultiHead}(T_{st}, T_{st}, T_{st}) \quad \text{(formula eleven)}$$

$$\mathrm{MultiHead}(Q', K', V') = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)W^{1} \quad \text{(formula twelve)}$$

where MultiHead denotes the multi-head attention mechanism; Q', K' and V' are its input vectors, with the text content initial characterization vector $T_{st}$ serving as the matrices Q', K' and V'; $\mathrm{head}_i$ is the output vector computed from the ith sub-vectors of Q', K', V' with the attention mechanism Attention(·) defined as above; h is the number of heads of the multi-head attention mechanism; $W^{1}$ is a training parameter of the multi-head attention mechanism, together with the weight matrices of the linear projections and the scale factor;
Step B34: inputting the text-content-based comment characterization vector $P_{sr}$ and the text content enhanced representation vector $P_s$ respectively into a pooling layer for average pooling to obtain the average pooled comment content sentence representation vector $\overline{P}_{sr} = \mathrm{MeanPool}(P_{sr})$ and the average pooled text content enhanced representation vector $\overline{P}_s = \mathrm{MeanPool}(P_s)$.
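The following PyTorch sketch mirrors step B3, using torch.nn.LSTM and torch.nn.MultiheadAttention as stand-ins for the Bi-LSTM and the MultiHead(Q', K', V') of formulas seven to twelve; the model dimension and head count are illustrative assumptions.

```python
# Sketch of step B3: Bi-LSTM context encoding, multi-head cross attention between
# source text and comments, multi-head self-attention on the text, and mean pooling.
import torch
import torch.nn as nn

class TextCommentEncoder(nn.Module):
    def __init__(self, d_model=128, heads=4):
        super().__init__()
        # Bi-LSTM whose concatenated directions recover d_model (step B31)
        self.bilstm = nn.LSTM(d_model, d_model // 2, bidirectional=True, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, t_st, t_rt):
        # t_st: (batch, n, d) source text vectors; t_rt: (batch, m, d) comment vectors
        u_st, _ = self.bilstm(t_st)                  # context-enhanced H_st = U_st
        p_sr, _ = self.cross_attn(t_st, t_rt, t_rt)  # P_sr = MultiHead(T_st, T_rt, T_rt)
        p_s, _ = self.self_attn(t_st, t_st, t_st)    # P_s = MultiHead(T_st, T_st, T_st)
        p_sr_bar = p_sr.mean(dim=1)                  # average pooled comment representation
        p_s_bar = p_s.mean(dim=1)                    # average pooled text representation
        return u_st, p_sr_bar, p_s_bar

# Usage: enc = TextCommentEncoder(); u, a, b = enc(torch.randn(2, 10, 128), torch.randn(2, 20, 128))
```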
6. The rumor detection method integrating emotion mining according to claim 5, characterized in that step B4 comprises the following steps:
Step B41: inputting the subgraph node knowledge characterization vector $G_{SK,0}$ obtained in step B22 into the first graph convolution layer of the text knowledge graph convolution network SKGCN, updating the vector representation of each subgraph node with the adjacency matrix $A_{SK}$, and outputting $G_{SK,1}$, which serves as the input of the next graph convolution layer;
where the output $g_i^{SK,1}$ of node i in the first graph convolution layer is computed as

$$g_i^{SK,1} = \mathrm{relu}\left(\sum_{j=1}^{z} \frac{A_{ij}^{SK}\, W^{SK} g_j^{SK,0}}{d_i + 1} + b^{SK}\right)$$

where $b^{SK}$ is a bias term; $W^{SK}$ and $b^{SK}$ are learnable parameters, $W^{SK}$ being a weight matrix, and relu is the activation function; node i in the SKGCN corresponds to the ith word node of the syntactic knowledge subgraph, the edges between nodes represent the knowledge connection relationships between words, $d_i$ denotes the degree of node i, and $d_i + 1$ is chosen as the divisor to prevent a degree of 0 from causing an arithmetic error;
Step B42: for the text content graph convolution network SCGCN, inputting the context-enhanced text content characterization vector $U_{st}$ obtained in step B31 into the first graph convolution layer of the SCGCN, updating the vector representation of each word with the syntactic adjacency matrix $A_{st}$, and outputting $U_{st,1}$,
where the output $u_i^{st,1}$ of node i in the first graph convolution layer is computed as

$$u_i^{st,1} = \mathrm{relu}\left(\sum_{j=1}^{n} \frac{A_{ij}^{st}\, W^{st} u_j^{st,0}}{d_i + 1} + b^{st}\right)$$

where $W^{st}$ and $b^{st}$ are learnable parameters, $W^{st}$ being a weight matrix and $b^{st}$ a bias term; relu is the activation function; node i in the graph convolution network corresponds to the ith word of the text content, the edges between nodes represent the syntactic dependency relationships between words, $d_i$ denotes the degree of node i, and $d_i + 1$ is chosen as the divisor to prevent a degree of 0 from causing an arithmetic error;
for the knowledge guidance mechanism, discarding from the first-layer output $G_{SK,1}$ of the SKGCN all content except the words of the current sentence to obtain the first-layer knowledge representation $\hat{G}_{SK,1}$ of the text content; then combining it with the first-layer output $U_{st,1}$ of the SCGCN through a cross attention mechanism to obtain the knowledge-aware sentence representation $G_{SD,1}$, which serves as the input of the next layer of the SCGCN;
where the output of node i of the first SCGCN graph convolution layer after the knowledge guidance mechanism is $g_i^{SD,1}$, obtained by weighting the knowledge representation with the attention weight $\alpha_i$, where $(\cdot)^{T}$ denotes the transpose operation and $\alpha_i$ is the attention weight of the knowledge with respect to the ith word of the sentence s;
Step B43: using $G_{SK,1}$ and $G_{SD,1}$ as the inputs of the next graph convolution layers of the SKGCN and the SCGCN respectively, and repeating steps B41 and B42;
where, for the SKGCN, the output $G_{SK,k}$ of the kth graph convolution layer is used as the input of the (k+1)th layer, and a graph convolution characterization vector is obtained when the iteration ends; for the SCGCN, $G_{SD,k}$, obtained from the kth-layer output $U_{st,k}$ through the knowledge guidance mechanism, is used as the input of the (k+1)th layer, and the graph convolution characterization vector $V_{sks}$ is obtained when the iteration finally ends, with 1 ≤ k ≤ K, K being the number of layers of the graph convolution network.
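A sketch of the degree-normalized graph convolution of steps B41 and B42 follows; the layer divides the aggregated neighbor messages by d_i + 1 and applies relu, and a K-layer stack stands in for the SKGCN/SCGCN towers. The knowledge guidance mechanism between the towers is omitted here for brevity.

```python
# Sketch of the graph convolution used in steps B41/B42: each node aggregates its
# neighbors through the adjacency matrix, divides by (degree + 1) to avoid a zero
# divisor, and applies relu.
import torch
import torch.nn as nn

class DegreeNormGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=True)  # learnable weight matrix and bias term

    def forward(self, H, A):
        # H: (num_nodes, dim) node representations; A: (num_nodes, num_nodes) adjacency
        degree = A.sum(dim=1, keepdim=True)          # d_i for each node i
        aggregated = A @ self.W(H) / (degree + 1.0)  # divide by d_i + 1 (never zero)
        return torch.relu(aggregated)

class GCNTower(nn.Module):
    """K stacked layers, as in the SKGCN / SCGCN of step B4."""
    def __init__(self, dim, k_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(DegreeNormGCNLayer(dim) for _ in range(k_layers))

    def forward(self, H, A):
        for layer in self.layers:
            H = layer(H, A)
        return H
```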
7. The rumor detection method integrating emotion mining according to claim 6, characterized in that step B5 comprises the following steps:
Step B51: inputting the context-enhanced text content characterization vector $U_{st}$ obtained in step B31 and the $V_{sks}$ obtained in step B43 into an attention network, which selects the important knowledge information to obtain the knowledge-enhanced sentence-level characterization vector $E_{sd}$, where $(\cdot)^{T}$ denotes the transpose operation and $\varepsilon_i$ is the attention weight of the ith word in the sentence s;
Step B52: inputting the knowledge-enhanced sentence-level characterization vector $E_{sd}$ obtained in step B51 into a multi-head self-attention mechanism to obtain the sentence characterization vector $E_{mt}$ that aggregates word-level information,

$$E_{mt} = \mathrm{MultiHead}(E_{sd}, E_{sd}, E_{sd}) \quad \text{(formula twenty-nine)}$$

Step B53: to counter the noise that irregular sentences introduce into the model, inputting the sentence characterization vector $E_{mt}$ that aggregates word-level information into a gating function to filter out the irrelevant information, obtaining the vector $E_{sda}$; then inputting it into a multi-layer perceptron (MLP) to obtain the source post emotion characterization vector $E_{sf}$.
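The exact form of the step B53 gating function is not spelled out in the claim; the sketch below assumes a common element-wise sigmoid gate followed by a two-layer perceptron, with the names EmotionHead, gate and mlp being illustrative.

```python
# Sketch of step B53: a sigmoid gate filters noise from the aggregated sentence
# representation E_mt, and an MLP maps the gated vector to the emotion
# characterization E_sf. The gate formulation is an assumption.
import torch
import torch.nn as nn

class EmotionHead(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.gate = nn.Linear(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, e_mt):
        g = torch.sigmoid(self.gate(e_mt))  # element-wise gate in (0, 1)
        e_sda = g * e_mt                    # filter out irrelevant information
        return self.mlp(e_sda)              # emotion characterization vector E_sf
```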
8. The rumor detection method integrating emotion mining according to claim 7, characterized in that step B6 comprises the following steps:
Step B61: inputting all the average pooled comment content sentence representation vectors $\overline{P}_{sr}$ corresponding to the source post and the average pooled text content enhanced representation vector $\overline{P}_s$ together into a multi-head cross attention mechanism, and obtaining the comprehensive semantic representation $C_{sr}$ of the comment content through average pooling:

$$C_{sr} = \mathrm{MeanPool}(C') \quad \text{(formula thirty-three)}$$

where C' is the output of the multi-head cross attention mechanism;
Step B62: inputting the average pooled text content enhanced representation vector $\overline{P}_s$ and the comprehensive semantic representation $C_{sr}$ of the comment content together into a fusion gating mechanism to obtain the fine-grained semantic representation vector $V_t$ of the source post.
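Likewise, the fusion gating mechanism of step B62 is only named in the claim; the sketch below assumes a common convex-combination gate over the two representations, with FusionGate as an illustrative name.

```python
# Sketch of the step B62 fusion gate: the text representation and the comments'
# comprehensive semantic representation are mixed by a learned gate. The convex
# combination below is one common gating form, assumed here for illustration.
import torch
import torch.nn as nn

class FusionGate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, p_s_bar, c_sr):
        g = torch.sigmoid(self.gate(torch.cat([p_s_bar, c_sr], dim=-1)))
        return g * p_s_bar + (1.0 - g) * c_sr  # fine-grained semantic vector V_t
```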
9. The rumor detection method integrating emotion mining according to claim 8, characterized in that step B7 comprises the following steps:
Step B71: connecting the source post emotion characterization vector $E_{sf}$ obtained in step B53 with the $V_t$ obtained in step B62 to obtain the final characterization vector $E_f$, computed as

$$E_f = \mathrm{Concat}(E_{sf}, V_t) \quad \text{(formula thirty-six)}$$

Step B72: inputting the final characterization vector $E_f$ into a fully connected layer, normalizing with softmax, and computing the probability that the text content belongs to each category, with the calculation formulas

$$y = W_3 E_f + b \quad \text{(formula thirty-seven)}$$

$$p_c(y) = \mathrm{softmax}(y) \quad \text{(formula thirty-eight)}$$

where y is the output vector of the fully connected layer, $W_3$ is the weight matrix of the fully connected layer, b is the bias term of the fully connected layer, and $p_c(y)$ is the predicted probability that the text content belongs to class c, with 0 ≤ $p_c(y)$ ≤ 1 and c ∈ {non-rumor, rumor, unverified rumor, debunked rumor};
Step B73: calculating the loss value with cross entropy as the loss function, updating the learning rate with the gradient optimization algorithm Adam, and updating the model parameters iteratively by back propagation, so that the model is trained by minimizing the loss function; the minimized loss is the cross entropy

$$loss = -\sum_{dt \in DT} \sum_{c} y_c \log p_c(y)$$

where $y_c$ is 1 if c is the true label of the sample and 0 otherwise.
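A sketch of the classification head and training objective of steps B71–B73 follows; the feature dimension, batch size and learning rate are illustrative, and softmax is folded into torch.nn.CrossEntropyLoss as is idiomatic in PyTorch.

```python
# Sketch of steps B71-B73: concatenate the emotion and semantic vectors, apply a
# fully connected layer (formulas thirty-six to thirty-eight), and train with
# cross entropy and Adam. Dimensions and class count are illustrative.
import torch
import torch.nn as nn

NUM_CLASSES = 4  # non-rumor, rumor, unverified rumor, debunked rumor

class ClassifierHead(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(2 * dim, NUM_CLASSES)  # W_3 and bias b of formula thirty-seven

    def forward(self, e_sf, v_t):
        e_f = torch.cat([e_sf, v_t], dim=-1)       # E_f = Concat(E_sf, V_t)
        return self.fc(e_f)                        # logits y; softmax applied in the loss

head = ClassifierHead(dim=128)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()                  # softmax + cross entropy of step B73

logits = head(torch.randn(8, 128), torch.randn(8, 128))
loss = criterion(logits, torch.randint(0, NUM_CLASSES, (8,)))
loss.backward()
optimizer.step()
```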
10. A rumor detection system integrating emotion mining, employing the rumor detection method of any one of claims 1 to 9, characterized in that the social network medium is a microblog and the rumor detection system comprises the following modules:
a data collection module: used for extracting the text content and comment content of source posts in the microblog, marking the authenticity of the source posts, and constructing the training set;
a preprocessing module: used for preprocessing the training samples in the training set, including word segmentation and stop word removal;
an encoding module: used for looking up the word vectors of the words in the preprocessed text content and comment content in a pre-trained word vector matrix to obtain the initial characterization vector of the text content and the initial characterization vector of the comment content, and for looking up the word vectors of the nodes of the syntactic knowledge subgraph in a pre-trained knowledge graph word vector matrix to obtain the initial characterization vector of the syntactic knowledge subgraph related to the text content;
a network training module: used for inputting the initial characterization vector of the text content, the initial characterization vector of the comment content and the initial characterization vector of the syntactic knowledge subgraph into the deep learning network to obtain a final characterization vector, taking as the loss the discrepancy between the probability that the characterization vector belongs to each class and the labels in the training set, and training the whole deep learning network with the goal of minimizing this loss, so as to obtain a deep learning network model based on multi-level attention and a knowledge graph;
a rumor detection module: used for extracting the semantic and emotional information in the input source post text content and comment content with NLP tools, analyzing the input source post text content and comment content with the trained deep learning network model based on multi-level attention and the knowledge graph, and outputting the predicted authenticity label of the source post.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211139407.3A | 2022-09-19 | 2022-09-19 | Rumor detection method and system integrating emotion mining
Publications (1)

Publication Number | Publication Date
---|---
CN115422945A | 2022-12-02
Family
ID=84204734
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202211139407.3A | Rumor detection method and system integrating emotion mining | 2022-09-19 | 2022-09-19

Country Status (1)

Country | Link
---|---
CN | CN115422945A (en)
Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117252264A | 2023-11-20 | 2023-12-19 | 神思电子技术股份有限公司 | Relation extraction method combining language model and graph neural network
CN117252264B | 2023-11-20 | 2024-02-02 | 神思电子技术股份有限公司 | Relation extraction method combining language model and graph neural network
Similar Documents

Publication | Title
---|---
CN108984724B | Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
CN109376242B | Text classification method based on cyclic neural network variant and convolutional neural network
WO2023024412A1 | Visual question answering method and apparatus based on deep learning model, and medium and device
CN111274398B | Method and system for analyzing comment emotion of aspect-level user product
CN112613303B | Knowledge distillation-based cross-modal image aesthetic quality evaluation method
CN108549658A | A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN111985205A | Aspect level emotion classification model
CN114443827A | Local information perception dialogue method and system based on pre-training language model
CN111914553B | Financial information negative main body judging method based on machine learning
CN114841151B | Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN115659966A | Rumor detection method and system based on dynamic heteromorphic graph and multi-level attention
CN116028604A | Answer selection method and system based on knowledge enhancement graph convolution network
CN114492459A | Comment emotion analysis method and system based on convolution of knowledge graph and interaction graph
CN114742069A | Code similarity detection method and device
CN113051904B | Link prediction method for small-scale knowledge graph
CN115422945A | Rumor detection method and system integrating emotion mining
CN116661805B | Code representation generation method and device, storage medium and electronic equipment
CN114881038B | Chinese entity and relation extraction method and device based on span and attention mechanism
CN116663539A | Chinese entity and relationship joint extraction method and system based on Roberta and pointer network
CN116737897A | Intelligent building knowledge extraction model and method based on multiple modes
CN116258147A | Multimode comment emotion analysis method and system based on heterogram convolution
CN116662924A | Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism
CN116644760A | Dialogue text emotion analysis method based on Bert model and double-channel model
CN116579347A | Comment text emotion analysis method, system, equipment and medium based on dynamic semantic feature fusion
CN113254575B | Machine reading understanding method and system based on multi-step evidence reasoning
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination