CN115422945A - Rumor detection method and system integrating emotion mining - Google Patents
- Publication number
- CN115422945A (application number CN202211139407.3A)
- Authority
- CN
- China
- Prior art keywords
- vector
- content
- knowledge
- text content
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/30—Semantic analysis
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/30—Information retrieval of unstructured textual data > G06F16/36—Creation of semantic tools, e.g. ontology or thesauri > G06F16/367—Ontology
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/20—Natural language analysis > G06F40/205—Parsing > G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/08—Learning methods
Abstract
The invention provides a rumor detection method integrating emotion mining, which comprises the following steps. Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT. Step B: use the training data set DT to train a deep learning network model N based on multi-level attention and a knowledge graph, the training comprising analyzing the source posts and predicting their authenticity labels. Step C: input the text content and comment content of a source post into the trained deep learning network model N to obtain the authenticity label of the source post. The invention can improve the accuracy of rumor detection on microblogs.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a rumor detection method and system integrating emotion mining.
Background
Rumor detection, also known as fake news detection, is an important task in the field of natural language processing (NLP). Rumor detection can be regarded as a supervised text classification problem whose instances generally fall into two classes, rumor and non-rumor. With the development of internet technology, social network platforms such as Weibo (microblog) and Twitter have rapidly become part of everyday life. On social network platforms, people are not just recipients of information but also creators of content, and these platforms greatly accelerate the speed and depth of information exchange. Because social network platforms provide timely and comprehensive information about events occurring around the world, an increasing number of people participate in discussions of hot topics on them. On the one hand, such discussion facilitates the dissemination of news, enabling people to learn more easily and quickly what is happening. On the other hand, this convenient environment also lowers the cost of spreading untruthful information. False rumors typically use fabricated or forged images and provocative language to mislead readers and spread quickly, and their spread can have large-scale negative effects on society, causing social unrest.
In recent years, with the rise of deep learning, deep learning techniques have also been widely applied to rumor detection. The most common are convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Since CNNs perform well at capturing semantic information from text, some researchers have applied them to content-based rumor detection. However, CNNs cannot take full advantage of the contextual information in a sentence, which is critical for modeling the semantic relationship between an aspect and its context, so the performance of CNN-based neural network models is limited in rumor detection tasks. To address this problem, many researchers have employed RNNs, particularly long short-term memory (LSTM) networks and gated recurrent units (GRUs), to extract contextual semantic information for rumors. Unlike a CNN, an RNN treats a sentence as a sequence of words, consumes each word in temporal order, feeds the output of one hidden step as the input of the next, and thereby continuously learns the contextual information in sequence data. Ma et al. used recurrent neural networks to capture the semantic changes between each source post and its forwarded comments and to predict based on those changes. RNN-based neural network models are significantly better than CNN-based models at rumor detection.
Researchers have noted that the rumor characteristics of a given post are often determined by a few keywords rather than by all words in the context, yet an RNN cannot accurately estimate the contribution of different context words to the overall semantics. In contrast, an attention mechanism can capture the importance of each context word by computing an attention weight between that word and the semantics of the given post, and then use these weights to compute the semantic representation of the post.
However, most of these neural network models ignore the emotional information in a post, which reflects the publisher's feeling toward the post content and is especially important for correctly judging the post's authenticity label. Recent studies have therefore focused on finding emotional characteristics that distinguish false rumors from true information. Ajao et al. verified that there is a relationship between the authenticity of news (true or false) and the use of emotional words, and designed an emotion feature (the ratio of the number of negative words to the number of positive words) to help detect fake news. Furthermore, Giachanou et al. extracted emotional features from news content for rumor detection based on an emotion dictionary. However, this prior work ignores the syntactic dependency information and external knowledge required for emotion analysis, so the emotional information is not sufficiently extracted.
Disclosure of Invention
The invention provides a rumor detection method and system integrating emotion mining, which can improve the accuracy of rumor detection on microblogs.
The invention adopts the following technical scheme.
A rumor detection method integrating emotion mining, the method comprising the following steps:
Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT;
Step B: train a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
Step C: input the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
Step B comprises the following steps:
Step B1: encode each training sample in the training data set DT to obtain the initial characterization vector T_st of the text content, the initial characterization vector T_rt of the comment content, and the syntactic adjacency matrix A_st;
Step B2: generate the syntactic knowledge subgraph SK corresponding to the text content from the knowledge graph and the syntactic dependency graph according to a syntactic-knowledge-subgraph construction algorithm, obtain its adjacency matrix A_SK, and then encode its nodes to obtain the node knowledge representation vector H_SK of the syntactic knowledge subgraph SK;
Step B3: input the initial text content characterization vector T_st obtained in step B1 into a bidirectional long short-term memory network (Bi-LSTM) to obtain a context-enhanced text content characterization vector H_st, and let U_st = H_st; then input T_st together with the initial comment content characterization vector T_rt into a multi-head cross-attention mechanism to obtain the text-content-based comment characterization vector P_sr, while inputting T_st into a multi-head self-attention mechanism to obtain the enhanced text content characterization vector P_s; then input P_sr and P_s separately into a pooling layer for average pooling to obtain the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s);
Step B4: input the node knowledge representation vector H_SK of the subgraph SK and the characterization vector U_st obtained in step B3 into two K-layer graph convolution networks, denoted the text knowledge graph convolution network SKGCN and the text content graph convolution network SCGCN, which learn external knowledge information and extract syntactic information respectively; meanwhile, apply a knowledge guidance mechanism to each layer of nodes of SCGCN and SKGCN to obtain the graph knowledge characterization vector V_sks of the source post;
Step B5: use a cross-attention mechanism to fuse the graph knowledge characterization vector V_sks obtained in step B4 with the sentence characterization vector U_st into a knowledge-enhanced sentence-level characterization vector E_sd, further improving the model's ability to extract information; then strengthen E_sd with a multi-head self-attention mechanism to obtain the sentence representation E_mt that aggregates word-level information; finally, reduce the noise from irregular sentences through a gating mechanism to obtain the source-post emotion characterization vector E_sf;
Step B6: input the average-pooled comment content sentence characterization vector and the average-pooled enhanced text content characterization vector corresponding to the source post together into a multi-head cross-attention mechanism and, through average pooling, obtain the comprehensive semantic representation C_sr of the comment content; then input the average-pooled enhanced text content characterization vector and C_sr into a fusion gating mechanism to obtain the fine-grained semantic representation vector V_t of the source post;
Step B7: combine the emotion characterization vector E_sf obtained in step B5 with the fine-grained semantic representation vector V_t obtained in step B6 to obtain the final characterization vector E_f; then feed E_f through a fully connected layer and a softmax function to obtain the prediction result; compute the gradient of each parameter in the deep learning network model from the target loss function loss by back-propagation, and update the parameters by stochastic gradient descent;
Step B8: terminate the training of the deep learning network model N when the iterative change of its loss value is smaller than a given threshold or the maximum number of iterations is reached.
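To make the data flow of steps B1 to B8 concrete, the following is a minimal PyTorch sketch of the forward pass under stated assumptions: the knowledge-graph branch, knowledge guidance, and gating components are omitted, the dimension d = 128 and head count are illustrative, and all module and variable names (e.g. RumorDetectorSketch) are placeholders rather than the patent's reference implementation.

```python
import torch
import torch.nn as nn

class RumorDetectorSketch(nn.Module):
    """Reduced sketch of model N: Bi-LSTM text encoding (step B3),
    text-to-comment cross attention (steps B3/B6), pooling, and a
    classifier over the four authenticity labels (step B7)."""

    def __init__(self, d: int = 128, num_classes: int = 4):
        super().__init__()
        self.bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * d, num_classes)

    def forward(self, T_st, T_rt):
        U_st, _ = self.bilstm(T_st)                  # context-enhanced text vectors
        P_sr, _ = self.cross_attn(T_st, T_rt, T_rt)  # comment repr. guided by the text
        features = torch.cat([U_st.mean(dim=1), P_sr.mean(dim=1)], dim=-1)
        return self.classifier(features)             # logits over the 4 labels

# toy run: batch of 2 posts, 10 text words, 30 comment words, d = 128
logits = RumorDetectorSketch()(torch.randn(2, 10, 128), torch.randn(2, 30, 128))
```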
Step B1 comprises the following steps:
Step B11: traverse the training set DT, perform word segmentation on the text content and comment content of each source post, and remove stop words; each training sample in DT is represented as dt = (st, rt, l), where st is the text content of the source post, rt is the comment content corresponding to the source post, and l is the authenticity label corresponding to the source post, l ∈ {non-rumor, false rumor, unverified rumor, debunked rumor};
the text content st of the source post is represented as st = {w_1^st, w_2^st, …, w_n^st}, where w_i^st is the i-th word in the text content st, i = 1, 2, …, n, and n is the number of words in the source post's text content st;
the comment content rt of the source post is represented as rt = {w_1^rt, w_2^rt, …, w_m^rt}, where w_j^rt is the j-th word in the comment content rt, j = 1, 2, …, m, and m is the number of words in the comment content rt;
Step B12: encode the text content obtained in step B11 to obtain the initial characterization vector T_st of the text content st, T_st = [x_1^st, x_2^st, …, x_n^st], where x_i^st is the word vector corresponding to the i-th word w_i^st, found by lookup in the pre-trained word vector matrix E ∈ R^(d×|V|); d represents the dimensionality of the word vectors and |V| is the number of words in the dictionary V;
Step B13: encode the comment content obtained in step B11 to obtain the initial characterization vector T_rt of the comment content rt, T_rt = [x_1^rt, x_2^rt, …, x_m^rt], where x_j^rt is the word vector corresponding to the j-th word w_j^rt, found by lookup in the pre-trained word vector matrix E ∈ R^(d×|V|); d represents the dimensionality of the word vectors and |V| is the number of words in the dictionary V;
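A minimal sketch of the lookup in steps B12 and B13 follows, assuming the pre-trained matrix is stored row-wise as (|V|, d) (the transpose of the d×|V| layout described above) and that out-of-vocabulary words fall back to index 0; the names encode, word2id and E are illustrative only.

```python
import numpy as np

def encode(words, word2id, E):
    """Steps B12/B13: stack the pre-trained word vector of each word to
    form the initial characterization vector (shape: len(words) x d)."""
    ids = [word2id.get(w, 0) for w in words]   # assumed OOV fallback to index 0
    return E[ids]

# toy dictionary with |V| = 3 words and d = 4
E = np.random.rand(3, 4).astype("float32")
T_st = encode(["rumor", "spreads"], {"rumor": 1, "spreads": 2}, E)
```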
Step B14: perform syntactic dependency analysis on the text content st to obtain the corresponding syntactic dependency tree DTD and the n×n syntactic adjacency matrix A_st; the syntactic dependency tree is represented as DTD = {(w_i^st, w_j^st)}, where each pair (w_i^st, w_j^st) indicates that a syntactic dependency exists between the text content words w_i^st and w_j^st.
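The adjacency matrix of step B14 can be sketched as follows, assuming the dependency tree is given as 0-based (i, j) word-index pairs; the self-loops and symmetric edges are common GCN practice and an assumption here, not something the patent states.

```python
import numpy as np

def syntactic_adjacency(n, dtd):
    """Step B14: build the n x n syntactic adjacency matrix A_st from
    dependency pairs; A[i][j] = 1 where words i and j are syntactically
    dependent, with assumed self-loops on the diagonal."""
    A = np.eye(n, dtype="float32")
    for i, j in dtd:
        A[i, j] = A[j, i] = 1.0
    return A

A_st = syntactic_adjacency(4, [(0, 1), (1, 2), (1, 3)])  # toy 4-word sentence
```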
Step B2 comprises the following steps:
Step B21: take each original word node in the syntactic dependency tree DTD as a root node and expand hop layers outward from the knowledge graph to generate child nodes; at each layer, select u nodes of the knowledge graph that share an edge with the nodes of the previous layer as the nodes of that layer, so that each original word node is expanded into q child nodes in total; this finally yields a syntactic knowledge subgraph SK whose total number of nodes is z = n + n × q, together with a z×z adjacency matrix A_SK; the syntactic knowledge subgraph SK contains three kinds of node pairs: (w_i^st, w_j^st), meaning a syntactic dependency exists between the text content words w_i^st and w_j^st; (w_i^st, e_j), meaning knowledge node word e_j is an expanded child node of text content word w_i^st; and (e_i, e_j), meaning knowledge node word e_j is an expanded knowledge child node of knowledge node word e_i; u is the number of nodes selected from the knowledge graph at each layer and hop is the number of expansion layers of the topology;
Step B22: encode the nodes of the syntactic knowledge subgraph SK with knowledge graph embeddings to obtain the node knowledge representation vector H_SK, and let G_SK,0 = H_SK serve as the initial input of the text knowledge graph convolution network SKGCN; the knowledge word vector of the i-th node word is found by lookup in the pre-trained knowledge word vector matrix, where d represents the dimensionality of the knowledge word vectors and |V| is the number of words in the knowledge embedding vocabulary V.
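The hop-layer expansion of step B21 might look like the sketch below, where the knowledge graph is a plain neighbour dictionary; since the patent does not say how the u neighbours are chosen, this sketch simply takes the first u, and repeated nodes are not deduplicated.

```python
def expand_subgraph(words, kg, u, hop):
    """Step B21 sketch: expand each original word node through `hop`
    layers, taking at most u knowledge-graph neighbours per node; the
    returned nodes and edges feed the z x z adjacency matrix A_SK."""
    nodes, frontier, edges = list(words), list(words), []
    for _ in range(hop):
        nxt = []
        for node in frontier:
            for nb in kg.get(node, [])[:u]:   # assumed selection rule: first u
                edges.append((node, nb))
                nodes.append(nb)
                nxt.append(nb)
        frontier = nxt
    return nodes, edges

kg = {"vaccine": ["medicine", "immunity"], "medicine": ["science"]}
nodes, edges = expand_subgraph(["vaccine"], kg, u=2, hop=2)
```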
Step B3 comprises the following steps:
Step B31: input the initial text content characterization vector T_st = [x_1^st, …, x_n^st] into the forward layer and backward layer of the first bidirectional long short-term memory network to obtain the forward hidden-state sequence and the backward hidden-state sequence, each hidden state being produced by the activation function f for i = 1, 2, …, n; the context-enhanced text content characterization vector H_st = [h_1, h_2, …, h_n] is obtained by concatenating the two directions, h_i = [h_i^fw : h_i^bw], i = 1, 2, …, n, where ":" denotes the vector concatenation operation; H_st is U_st;
Step B32: an initial characterization vector T of the text content st st And an initial token vector T of the review content rt rt Inputting the two into a multi-head cross attention mechanism together to obtain a comment characterization vector P based on text content sr The calculation formula is as follows:
P sr =MultiHead(T st ,T rt ,T rt ) A formula seven;
MultiHead(Q′,K′,V′)=Concat(head 1 ,head 2 ,…,head h )W o a formula eight;
head i =Attention(Q′W i Q ,K′W i K ,V′W i v ) A formula of nine;
wherein, multihead represents a multi-head attention mechanism, Q ', K ' and V ' represent input vectors of the multi-head attention mechanism, and an initial characterization vector T of text content st As a matrix Q', the initial token vector T of the corresponding review content rt rt As K 'and V'; head i The output vector calculated for the ith sub-vector of Q ', K ', V ' using Attention mechanism Attention (·), h being the number of heads of the multi-head Attention mechanism, W o Training parameters for a multi-headed attention mechanism, W i Q ,W i K ,Is a weight matrix of the linear projection,is a scale factor;
Step B33: input the initial text content characterization vector T_st into a multi-head self-attention mechanism to obtain the enhanced text content characterization vector P_s, computed as follows:
P_s = MultiHead(T_st, T_st, T_st)    (Formula 11)
MultiHead(Q′, K′, V′) = Concat(head_1, head_2, …, head_h) W_1    (Formula 12)
head_i = Attention(Q′W_i^Q, K′W_i^K, V′W_i^V)    (Formula 13)
where MultiHead denotes the multi-head attention mechanism and Q′, K′, V′ denote its input matrices: the initial text content characterization vector T_st serves as the matrices Q′, K′ and V′; head_i is the output vector computed with the attention mechanism Attention(·) on the i-th sub-vectors of Q′, K′, V′; h is the number of heads; W_1 is a training parameter of the multi-head attention mechanism; W_i^Q, W_i^K, W_i^V are the weight matrices of the linear projections; and √d_k is the scale factor;
Step B34: input the text-content-based comment characterization vector P_sr and the enhanced text content characterization vector P_s separately into a pooling layer for average pooling to obtain the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s), where MeanPool is the average pooling function.
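Steps B31 to B34 map directly onto standard PyTorch modules; the sketch below uses nn.MultiheadAttention, whose projected scaled dot-product attention matches the form of Formulas 7 to 13, with an illustrative dimension and head count.

```python
import torch
import torch.nn as nn

d, h = 128, 4                                              # assumed d and head count
bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
cross_attn = nn.MultiheadAttention(d, h, batch_first=True)
self_attn = nn.MultiheadAttention(d, h, batch_first=True)

T_st, T_rt = torch.randn(1, 10, d), torch.randn(1, 30, d)  # toy T_st and T_rt

U_st, _ = bilstm(T_st)                       # step B31: H_st (= U_st)
P_sr, _ = cross_attn(T_st, T_rt, T_rt)       # step B32: Q' = T_st, K' = V' = T_rt
P_s, _ = self_attn(T_st, T_st, T_st)         # step B33: Q' = K' = V' = T_st
P_sr_avg = P_sr.mean(dim=1)                  # step B34: MeanPool(P_sr)
P_s_avg = P_s.mean(dim=1)                    # step B34: MeanPool(P_s)
```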
Step B4 comprises the following steps:
Step B41: input the subgraph node knowledge characterization vector G_SK,0 obtained in step B22 into the first graph convolution layer of the text knowledge graph convolution network SKGCN, update the vector representation of each subgraph node using the adjacency matrix A_SK, and output G_SK,1 as the input of the next graph convolution layer;
the output g_i^(SK,1) of node i in the first graph convolution layer is computed as follows:
g_i^(SK,1) = relu( (Σ_(j=1)^z A_ij^SK · W_SK · g_j^(SK,0)) / (d_i + 1) + b_SK )
where b_SK is a bias term; W_SK and b_SK are both learnable parameters, W_SK being a weight matrix; relu is the activation function; node i in SKGCN corresponds to the i-th word in the comment content, and the edges between nodes represent the knowledge connections between words; d_i denotes the degree of node i, and d_i + 1 is used as the divisor to prevent a degree of 0 from causing a division error;
Step B42: for the text content graph convolution network SCGCN, input the context-enhanced text content characterization vector U_st obtained in step B31 into the first graph convolution layer of SCGCN, update the vector representation of each word using the syntactic adjacency matrix A_st, and output U_st,1;
the output u_i^(st,1) of node i in the first graph convolution layer is computed as follows:
u_i^(st,1) = relu( (Σ_(j=1)^n A_ij^st · W_st · u_j^(st,0)) / (d_i + 1) + b_st )
where u_j^(st,0) is the j-th vector of U_st; W_st and b_st are both learnable parameters, W_st being a weight matrix and b_st a bias term; relu is the activation function; node i in the graph convolution network corresponds to the i-th word in the comment content, and the edges between nodes represent the syntactic dependencies between the words in the comment content; d_i denotes the degree of node i, and d_i + 1 is used as the divisor to prevent a degree of 0 from causing a division error;
For the knowledge guidance mechanism, the first-layer output G_SK,1 of SKGCN is stripped of all content except the words in the current comment content sentence, yielding the first-layer knowledge representation of the text content; this representation is then combined with the first-layer output U_st,1 of SCGCN through a cross-attention mechanism to obtain the knowledge-aware comment content sentence representation G_SD,1, which serves as the input of the next SCGCN layer;
the output g_i^(SD,1) of node i of the first SCGCN graph convolution layer after the knowledge guidance mechanism is computed by weighting the knowledge representation with attention, where (·)^T denotes the transpose operation and α_i is the attention weight of the knowledge for the i-th word in the comment content s;
Step B43: feed G_SK,1 and G_SD,1 as the inputs of the next graph convolution layers of SKGCN and SCGCN respectively, and repeat steps B41 and B42;
for SKGCN, the output G_SK,k of the k-th graph convolution layer serves as the input of the (k+1)-th layer, and when the iteration ends the graph convolution characterization vector G_SK,K is obtained; for SCGCN, U_st,k is the output of the k-th graph convolution layer, and through the knowledge interaction mechanism U_st,k and G_SD,k serve as the input of the (k+1)-th layer; after the iterations finish, the graph knowledge characterization vector V_sks is obtained, where 1 ≤ k ≤ K and K is the number of layers of the graph convolution networks.
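One layer of SKGCN or SCGCN (steps B41 and B42) reduces to the update below; placing the bias inside the linear transform rather than after the aggregation is a sketch-level choice, while the degree-plus-one divisor follows the formulas above.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Steps B41/B42 sketch: relu of the adjacency-aggregated, linearly
    transformed node features, divided by node degree + 1 so that a
    degree of 0 cannot cause a division error."""

    def __init__(self, d: int):
        super().__init__()
        self.W = nn.Linear(d, d)             # learnable weight matrix and bias

    def forward(self, X, A):
        deg = A.sum(dim=-1, keepdim=True)    # d_i, the degree of node i
        return torch.relu(A @ self.W(X) / (deg + 1.0))

# toy use: z = 5 subgraph nodes with 16-dimensional features
G_SK_1 = GCNLayer(16)(torch.randn(5, 16), torch.eye(5))
```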
Step B5 comprises the following steps:
Step B51: input the context-enhanced text content characterization vector U_st obtained in step B31 and the vector V_sks obtained in step B43 into an attention network, which selects the important knowledge information to obtain the knowledge-enhanced sentence-level characterization vector E_sd; in the attention computation, (·)^T denotes the transpose operation and ε_i is the attention weight of the i-th word in the comment content s;
Step B52: input the knowledge-enhanced sentence-level characterization vector E_sd obtained in step B51 into a multi-head self-attention mechanism to obtain the sentence characterization vector E_mt that aggregates word-level information,
E_mt = MultiHead(E_sd, E_sd, E_sd)    (Formula 29)
Step B53: to handle the noise that irregular sentences bring to the model, input the sentence characterization vector E_mt into a gating function to filter out irrelevant information and obtain the vector E_sda; then input E_sda into a multi-layer perceptron (MLP) to obtain the emotion characterization vector E_sf of the source post; the weight matrices and bias terms of the gating function and the MLP are all learnable parameters.
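Because the exact gate of step B53 (Formulas 30 to 32) was carried by figures that are missing here, the sketch below uses a common sigmoid-gate formulation as a stand-in, followed by a small MLP; all dimensions are illustrative.

```python
import torch
import torch.nn as nn

d = 128                                          # assumed hidden dimension
gate = nn.Sequential(nn.Linear(d, d), nn.Sigmoid())
mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))

E_mt = torch.randn(1, d)        # aggregated sentence representation (step B52)
E_sda = gate(E_mt) * E_mt       # step B53: gate filters noisy components
E_sf = mlp(E_sda)               # step B53: source-post emotion characterization
```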
Step B6 comprises the following steps:
Step B61: input the average-pooled comment content sentence characterization vector MeanPool(P_sr) and the average-pooled enhanced text content characterization vector MeanPool(P_s) corresponding to the source post together into a multi-head cross-attention mechanism, giving the cross-attention output C′, and obtain the comprehensive semantic representation of the comment content through average pooling:
C_sr = MeanPool(C′)    (Formula 33)
where C′ is the output of the multi-head cross-attention mechanism and MeanPool is the average pooling function;
Step B62: input the average-pooled enhanced text content characterization vector MeanPool(P_s) and the comprehensive semantic representation C_sr of the comment content together into a fusion gating mechanism to obtain the fine-grained semantic representation vector V_t of the source post;
in the fusion gate, σ is the sigmoid activation function, w_1 and the associated bias are the learnable parameters of the fusion gating mechanism, and ⊙ is the element-wise (dot) product operation.
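Formulas 34 and 35 are likewise lost with the figures; a standard gated interpolation, sketched below under that assumption, matches the description of the fusion gating mechanism (sigmoid gate, learnable w_1, element-wise product).

```python
import torch
import torch.nn as nn

d = 128                                      # assumed dimension
w1 = nn.Linear(2 * d, d)                     # learnable fusion parameters

P_s_avg, C_sr = torch.randn(1, d), torch.randn(1, d)       # toy inputs
g = torch.sigmoid(w1(torch.cat([P_s_avg, C_sr], dim=-1)))  # fusion gate
V_t = g * P_s_avg + (1.0 - g) * C_sr         # fine-grained semantic vector
```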
Step B7 comprises the following steps:
Step B71: connect the source-post emotion characterization vector E_sf obtained in step B53 with the vector V_t obtained in step B62 to obtain the final characterization vector E_f, computed as follows:
E_f = Concat(E_sf, V_t)    (Formula 36)
Step B72: input the final characterization vector E_f into a fully connected layer and normalize with softmax to calculate the probability that the text content belongs to each category, computed as follows:
y = W_3 · E_f + b    (Formula 37)
p_c(y) = softmax(y)    (Formula 38)
where y is the output vector of the fully connected layer, W_3 is the weight matrix of the fully connected layer, b is the bias term of the fully connected layer, and p_c(y) is the predicted probability that the text content belongs to category c, with 0 ≤ p_c(y) ≤ 1 and c ∈ {non-rumor, false rumor, unverified rumor, debunked rumor};
Step B73: use cross entropy as the loss function to calculate the loss value, update the learning rate with the gradient optimization algorithm Adam, and update the model parameters by back-propagation iterations, so that the model is trained by minimizing the loss function; the loss function loss is calculated as follows:
loss = −Σ_((st,rt,l)∈DT) log p_c(y) + λ‖θ‖₂²
where λ‖θ‖₂² is the L2 regularization term, λ is the regularization coefficient, θ contains all the parameters, and c is the authenticity label corresponding to the text content.
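A single training step of step B73 can be sketched as follows; the Adam optimizer's weight_decay stands in for the λ‖θ‖² regularization term (step B7 also mentions stochastic gradient descent, so the optimizer choice here follows step B73), and the stand-in classifier and batch shapes are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(256, 4)                  # stand-in for the full network N
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
loss_fn = nn.CrossEntropyLoss()            # cross entropy over the 4 labels

E_f = torch.randn(8, 256)                  # batch of final characterization vectors
labels = torch.randint(0, 4, (8,))         # gold authenticity labels c

opt.zero_grad()
loss = loss_fn(model(E_f), labels)         # softmax + negative log-likelihood
loss.backward()                            # back-propagate parameter gradients
opt.step()                                 # update the model parameters
```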
A rumor detection system integrating emotion mining adopts the above rumor detection method, the social network medium being a microblog platform, and comprises the following modules:
a data collection module: used to extract the text content and comment content of source posts in the microblog, mark the authenticity of the source posts, and construct the training set;
a preprocessing module: used to preprocess the training samples in the training set, including word segmentation and stop-word removal;
an encoding module: used to look up, in the pre-trained word vector matrix, the word vectors of the words in the preprocessed text content and comment content to obtain the initial characterization vector of the text content and the initial characterization vector of the comment content, and to look up, in the pre-trained knowledge graph word vector matrix, the word vectors of the nodes in the syntactic knowledge subgraph to obtain the initial characterization vector of the syntactic knowledge subgraph related to the comment content;
a network training module: used to input the initial characterization vector of the text content, the initial characterization vector of the comment content, and the initial characterization vector of the syntactic knowledge subgraph into the deep learning network to obtain the final characterization vector, and to train the whole deep learning network with the probability that this characterization vector belongs to each class, together with the labels in the training set, as the loss, taking loss minimization as the objective, so as to obtain the deep learning network model based on multi-level attention and the knowledge graph;
a rumor detection module: used to extract the semantic and emotional information in the input source-post text content and comment content with NLP tools, analyze them with the trained deep learning network model based on multi-level attention and the knowledge graph, and output the predicted authenticity label of the source post.
Compared with the prior art, the invention has the following beneficial effects: the method obtains the syntactic knowledge subgraph of the corresponding comment sentence using a knowledge graph and a subgraph generation strategy, then encodes the comment content and the text content separately, learns the syntactic dependencies and external knowledge in the comment content through two graph convolution networks and a knowledge guidance mechanism, and filters sentence noise with a gating mechanism to enhance the representation of the comment sentence. The invention also learns the fine-grained semantic information between the text content and the comment content using a multi-level attention mechanism. Compared with the prior art, the method can enhance the feature representation of rumors with fine-grained semantic information and rich emotional information, further improving the precision of rumor detection and enhancing the robustness of the model.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic flow chart of a method implementation of an embodiment of the present invention;
FIG. 2 is a schematic diagram of a model architecture in an embodiment of the invention;
fig. 3 is a schematic system configuration of an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in the figures, a rumor detection method integrating emotion mining comprises the following steps:
Step A: collect and extract the text content and comment content of a source post in a social network medium, and manually mark the authenticity label of the source post to form a training data set DT;
Step B: train a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
Step C: input the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
In this embodiment, steps B1 to B8 of step B, together with their substeps and the modules of the rumor detection system, are implemented exactly as described in the foregoing disclosure and are not repeated here.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change or refinement made to the above embodiments in accordance with the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.
Claims (10)
1. A rumor detection method integrating emotion mining, characterized in that the method comprises the following steps:
step A: collecting and extracting the text content and comment content of a source post in a social network medium, and manually marking the authenticity label of the source post to form a training data set DT;
step B: training a deep learning network model N based on multi-level attention and a knowledge graph with the training data set DT, the training comprising analyzing the source posts and predicting their authenticity labels;
step C: inputting the text content and comment content of the source post into the trained deep learning network model N to obtain the authenticity label of the source post.
2. The rumor detection method integrating emotion mining according to claim 1, characterized in that step B comprises the following steps:
Step B1: encoding each training sample in the training data set DT to obtain the initial characterization vector $T_{st}$ of the text content, the initial characterization vector $T_{rt}$ of the comment content, and the syntactic adjacency matrix $A_{st}$;
Step B2: generating the syntactic knowledge subgraph SK of the text content from the knowledge graph and the syntactic dependency graph according to the syntactic knowledge subgraph construction algorithm, obtaining its adjacency matrix $A_{SK}$, and then encoding the nodes to obtain the node knowledge representation vector $H_{SK}$ of the syntactic knowledge subgraph SK;
Step B3: inputting the text content initial characterization vector $T_{st}$ obtained in step B1 into a bidirectional long short-term memory network Bi-LSTM to obtain the context-enhanced text content representation vector $H_{st}$, and letting $U_{st} = H_{st}$; then inputting $T_{st}$ together with the comment content initial characterization vector $T_{rt}$ into a multi-head cross attention mechanism to obtain the text-content-based comment characterization vector $P_{sr}$, while inputting $T_{st}$ into a multi-head self-attention mechanism to obtain the text content enhanced representation vector $P_s$; then inputting $P_{sr}$ and $P_s$ respectively into a pooling layer for average pooling to obtain the average pooled comment content sentence representation vector $\overline{P}_{sr}$ and the average pooled text content enhanced representation vector $\overline{P}_s$;
Step B4: inputting the node knowledge representation vector $H_{SK}$ of the subgraph SK and the characterization vector $U_{st}$ obtained in step B3 respectively into two K-layer graph convolution networks, denoted the text knowledge graph convolution network SKGCN and the text content graph convolution network SCGCN, which learn external knowledge information and extract syntactic information; meanwhile, applying a knowledge guidance mechanism between each layer of the SCGCN and the SKGCN to obtain the graph knowledge characterization vector $V_{sks}$ of the source post;
Step B5: fusing the graph knowledge characterization vector $V_{sks}$ obtained in step B4 with the sentence characterization vector $U_{st}$ by a cross attention mechanism to obtain the knowledge-enhanced sentence-level characterization vector $E_{sd}$, further improving the model's ability to extract information; then further strengthening $E_{sd}$ with a multi-head self-attention mechanism to obtain the sentence representation $E_{mt}$ that aggregates word-level information; and reducing the noise from irregular sentences through a gating mechanism to obtain the source post emotion characterization vector $E_{sf}$;
Step B6: inputting the average pooled comment content sentence representation vector $\overline{P}_{sr}$ and the average pooled text content enhanced representation vector $\overline{P}_s$ corresponding to the source post together into a multi-head cross attention mechanism, and obtaining the comprehensive semantic representation $C_{sr}$ of the comment content through average pooling; then inputting $\overline{P}_s$ and $C_{sr}$ into a fusion gating mechanism to obtain the fine-grained semantic representation vector $V_t$ of the source post;
Step B7: combining the emotion characterization vector $E_{sf}$ obtained in step B5 with the fine-grained semantic representation vector $V_t$ obtained in step B6 to obtain the final characterization vector $E_f$; then inputting $E_f$ into a fully connected layer and a softmax function to obtain the prediction result; computing the gradient of each parameter in the deep learning network model by back propagation according to the target loss function loss, and updating each parameter by stochastic gradient descent;
Step B8: terminating the training process of the deep learning network model N when the change of the loss value between iterations is smaller than a given threshold or the maximum number of iterations is reached.
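A minimal sketch of the training schedule of steps B7 and B8 follows, assuming a PyTorch model and data loader are defined elsewhere; the Adam optimizer is used per step B73, and the stopping rule implements the loss-change threshold and maximum iteration count of step B8. Hyperparameter values are illustrative.

```python
# Sketch of the step B7/B8 schedule: iterate until the change in loss falls
# below a threshold or the maximum number of iterations is reached.
# `model` and `train_loader` are assumed to be defined elsewhere.
import torch

def train_model(model, train_loader, max_iters=100, tol=1e-4, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam, as in step B73
    criterion = torch.nn.CrossEntropyLoss()                  # cross entropy loss
    prev_loss = float("inf")
    for epoch in range(max_iters):
        total_loss = 0.0
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            logits = model(*inputs)          # final characterization -> class scores
            loss = criterion(logits, labels)
            loss.backward()                  # back propagation (step B7)
            optimizer.step()
            total_loss += loss.item()
        # Step B8: stop when the iteration-to-iteration loss change is small.
        if abs(prev_loss - total_loss) < tol:
            break
        prev_loss = total_loss
    return model
```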
3. The rumor detection method integrating emotion mining according to claim 2, characterized in that step B1 comprises the following steps:
Step B11: traversing the training set DT, performing word segmentation on the text content and comment content of each source post and removing stop words, each training sample in DT being represented as dt = (st, rt, l), where st is the text content of the source post, rt is the comment content corresponding to the source post, and l is the authenticity label of the source post, l ∈ {non-rumor, rumor, unverified rumor, debunked rumor};
the text content st of the source post is represented as

$$st = \{w_1^{st}, w_2^{st}, \ldots, w_n^{st}\}$$

where $w_i^{st}$ is the ith word in the text content st, i = 1, 2, …, n, and n is the number of words in the text content st of the source post;
the comment content rt of the source post is represented as

$$rt = \{w_1^{rt}, w_2^{rt}, \ldots, w_m^{rt}\}$$

where $w_j^{rt}$ is the jth word in the comment content rt, j = 1, 2, …, m, and m is the number of words in the comment content rt;
Step B12: encoding the text content $\{w_1^{st}, \ldots, w_n^{st}\}$ obtained in step B11 to obtain the initial characterization vector $T_{st}$ of the text content st, expressed as

$$T_{st} = \{t_1^{st}, t_2^{st}, \ldots, t_n^{st}\}$$

where each $t_i^{st} \in \mathbb{R}^{d}$ is the word vector corresponding to the ith word $w_i^{st}$, looked up in the pre-trained word vector matrix $E \in \mathbb{R}^{d \times |V|}$, d is the dimensionality of the word vectors, and |V| is the number of words in the dictionary V;
Step B13: encoding the comment content $\{w_1^{rt}, \ldots, w_m^{rt}\}$ obtained in step B11 to obtain the initial characterization vector $T_{rt}$ of the comment content rt, expressed as

$$T_{rt} = \{t_1^{rt}, t_2^{rt}, \ldots, t_m^{rt}\}$$

where each $t_j^{rt} \in \mathbb{R}^{d}$ is the word vector corresponding to the jth word $w_j^{rt}$, looked up in the pre-trained word vector matrix $E \in \mathbb{R}^{d \times |V|}$, d is the dimensionality of the word vectors, and |V| is the number of words in the dictionary V;
Step B14: performing syntactic dependency analysis on the text content st to obtain the corresponding syntactic dependency tree DTD and the n-order syntactic adjacency matrix $A_{st}$; the syntactic dependency tree DTD is represented as the set of word pairs $(w_i^{st}, w_j^{st})$ between which a syntactic dependency relationship holds.
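The following sketch illustrates the embedding lookup of steps B12–B13 and the adjacency matrix construction of step B14. spaCy is used only as an assumed stand-in for a dependency parser (the claim does not prescribe a tool, and the source data would call for a Chinese parser); word2id and embedding_matrix are hypothetical inputs.

```python
# Sketch of steps B11-B14: embedding lookup and construction of the n-order
# syntactic adjacency matrix A_st from a dependency parse.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")  # illustrative stand-in parser

def encode_text(words, word2id, embedding_matrix):
    """Look up each word's vector in a pre-trained embedding matrix (steps B12/B13)."""
    ids = [word2id.get(w, 0) for w in words]  # 0 = out-of-vocabulary index
    return embedding_matrix[ids]              # shape: (len(words), d)

def syntactic_adjacency(sentence):
    """Build a symmetric n-order adjacency matrix from the dependency tree (step B14)."""
    doc = nlp(sentence)
    n = len(doc)
    A = np.eye(n, dtype=np.float32)           # self-loops added here as a common choice
    for token in doc:
        if token.i != token.head.i:           # each dependency edge links word and head
            A[token.i, token.head.i] = 1.0
            A[token.head.i, token.i] = 1.0
    return A
```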
4. The rumor detection method integrating emotion mining according to claim 3, characterized in that step B2 comprises the following steps:
Step B21: taking each original word node in the syntactic dependency tree DTD as a root node, expanding hop layers from the knowledge graph to generate child nodes, and at each layer selecting u nodes that are connected by an edge in the knowledge graph to the nodes of the previous layer as the nodes of that layer, so that each root node has $q = \sum_{l=1}^{hop} u^{l}$ expanded child nodes, finally obtaining the syntactic knowledge subgraph SK with a total node count of z = n + n × q and its z-order adjacency matrix $A_{SK}$; in the syntactic knowledge subgraph SK, a knowledge node word may be an expanded node of a text content word or a knowledge-expanded child node of another knowledge node word, and an edge between two text content words indicates a syntactic dependency relationship between them; u is the number of nodes selected in the knowledge graph at each layer, and hop is the number of expansion layers;
Step B22: encoding the nodes of the syntactic knowledge subgraph SK with knowledge graph embeddings to obtain the node knowledge representation vector $G_{SK,0}$, and letting $G_{SK,0}$ serve as the initial input of the text knowledge graph convolution network SKGCN; the knowledge word vector corresponding to the ith word $w_i^{kg}$ is looked up in the pre-trained knowledge word vector matrix $E_{kg} \in \mathbb{R}^{d \times |V|}$, where d is the dimension of the knowledge word vectors and |V| is the number of words in the knowledge word embedding vocabulary V.
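A sketch of the hop-layer expansion of step B21 follows, assuming the knowledge graph is available as a neighbor lookup table kg_neighbors; the neighbor selection order is illustrative, and the actual expansion may yield fewer than u children when a node has fewer knowledge-graph neighbors than u.

```python
# Sketch of the step B21 subgraph construction: starting from each word node of
# the dependency tree, expand `hop` layers in the knowledge graph, taking up to
# `u` neighbors per node per layer. `kg_neighbors` (word -> related words) is an
# assumed lookup built from the knowledge graph.
import numpy as np

def build_syntactic_knowledge_subgraph(words, syntactic_edges, kg_neighbors, u=2, hop=2):
    nodes = list(words)                       # the n original word nodes
    edges = list(syntactic_edges)             # (i, j) index pairs from the dependency tree
    frontier = list(range(len(words)))        # current layer, as node indices
    for _ in range(hop):
        next_frontier = []
        for i in frontier:
            for nb in kg_neighbors.get(nodes[i], [])[:u]:  # pick u connected KG nodes
                nodes.append(nb)
                j = len(nodes) - 1
                edges.append((i, j))          # edge between parent node and expansion
                next_frontier.append(j)
        frontier = next_frontier
    z = len(nodes)                            # total node count (z = n + expansions)
    A = np.eye(z, dtype=np.float32)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return nodes, A                           # subgraph SK and its z-order adjacency A_SK
```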
5. The rumor detection method integrating emotion mining according to claim 4, characterized in that step B3 comprises the following steps:
Step B31: inputting the initial characterization vector $T_{st}$ of the text content sequentially into the forward layer and the backward layer of the bidirectional long short-term memory network to obtain the forward hidden state vector sequence $\overrightarrow{H}_{st}$ and the backward hidden state vector sequence $\overleftarrow{H}_{st}$, i.e.

$$\overrightarrow{H}_{st} = \overrightarrow{\mathrm{LSTM}}(T_{st}), \qquad \overleftarrow{H}_{st} = \overleftarrow{\mathrm{LSTM}}(T_{st})$$

the context-enhanced text content characterization vector is obtained by concatenation, $H_{st} = [\overrightarrow{H}_{st} : \overleftarrow{H}_{st}]$, where ":" denotes the vector concatenation operation; $H_{st}$ is $U_{st}$;
Step B32: inputting the initial characterization vector $T_{st}$ of the text content st and the initial characterization vector $T_{rt}$ of the comment content rt together into a multi-head cross attention mechanism to obtain the text-content-based comment characterization vector $P_{sr}$, computed as follows:

$$P_{sr} = \mathrm{MultiHead}(T_{st}, T_{rt}, T_{rt}) \quad \text{(formula seven)}$$

$$\mathrm{MultiHead}(Q', K', V') = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)W^{o} \quad \text{(formula eight)}$$

where MultiHead denotes the multi-head attention mechanism; Q', K' and V' are its input vectors, with the text content initial characterization vector $T_{st}$ serving as the matrix Q' and the comment content initial characterization vector $T_{rt}$ serving as K' and V'; $\mathrm{head}_i = \mathrm{Attention}(Q'W_i^{Q}, K'W_i^{K}, V'W_i^{V})$ is the output vector computed from the ith sub-vectors of Q', K', V' with the attention mechanism $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^{T}/\sqrt{d_k})V$; h is the number of heads of the multi-head attention mechanism; $W^{o}$ is a training parameter of the multi-head attention mechanism; $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ are the weight matrices of the linear projections; and $\sqrt{d_k}$ is a scale factor;
Step B33: inputting the text content initial characterization vector $T_{st}$ into a multi-head self-attention mechanism to obtain the text content enhanced representation vector $P_s$, computed as follows:

$$P_s = \mathrm{MultiHead}(T_{st}, T_{st}, T_{st}) \quad \text{(formula eleven)}$$

$$\mathrm{MultiHead}(Q', K', V') = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)W^{1} \quad \text{(formula twelve)}$$

where MultiHead denotes the multi-head attention mechanism; Q', K' and V' are its input vectors, with the text content initial characterization vector $T_{st}$ serving as the matrices Q', K' and V'; $\mathrm{head}_i$ is the output vector computed from the ith sub-vectors of Q', K', V' with the attention mechanism Attention(·) defined as above; h is the number of heads of the multi-head attention mechanism; $W^{1}$ is a training parameter of the multi-head attention mechanism, together with the weight matrices of the linear projections and the scale factor;
Step B34: inputting the text-content-based comment characterization vector $P_{sr}$ and the text content enhanced representation vector $P_s$ respectively into a pooling layer for average pooling to obtain the average pooled comment content sentence representation vector $\overline{P}_{sr} = \mathrm{MeanPool}(P_{sr})$ and the average pooled text content enhanced representation vector $\overline{P}_s = \mathrm{MeanPool}(P_s)$.
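The following PyTorch sketch mirrors step B3, using torch.nn.LSTM and torch.nn.MultiheadAttention as stand-ins for the Bi-LSTM and the MultiHead(Q', K', V') of formulas seven to twelve; the model dimension and head count are illustrative assumptions.

```python
# Sketch of step B3: Bi-LSTM context encoding, multi-head cross attention between
# source text and comments, multi-head self-attention on the text, and mean pooling.
import torch
import torch.nn as nn

class TextCommentEncoder(nn.Module):
    def __init__(self, d_model=128, heads=4):
        super().__init__()
        # Bi-LSTM whose concatenated directions recover d_model (step B31)
        self.bilstm = nn.LSTM(d_model, d_model // 2, bidirectional=True, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, t_st, t_rt):
        # t_st: (batch, n, d) source text vectors; t_rt: (batch, m, d) comment vectors
        u_st, _ = self.bilstm(t_st)                  # context-enhanced H_st = U_st
        p_sr, _ = self.cross_attn(t_st, t_rt, t_rt)  # P_sr = MultiHead(T_st, T_rt, T_rt)
        p_s, _ = self.self_attn(t_st, t_st, t_st)    # P_s = MultiHead(T_st, T_st, T_st)
        p_sr_bar = p_sr.mean(dim=1)                  # average pooled comment representation
        p_s_bar = p_s.mean(dim=1)                    # average pooled text representation
        return u_st, p_sr_bar, p_s_bar

# Usage: enc = TextCommentEncoder(); u, a, b = enc(torch.randn(2, 10, 128), torch.randn(2, 20, 128))
```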
6. The rumor detection method integrating emotion mining according to claim 5, characterized in that step B4 comprises the following steps:
Step B41: inputting the subgraph node knowledge characterization vector $G_{SK,0}$ obtained in step B22 into the first graph convolution layer of the text knowledge graph convolution network SKGCN, updating the vector representation of each subgraph node with the adjacency matrix $A_{SK}$, and outputting $G_{SK,1}$, which serves as the input of the next graph convolution layer;
where the output $g_i^{SK,1}$ of node i in the first graph convolution layer is computed as

$$g_i^{SK,1} = \mathrm{relu}\left(\sum_{j=1}^{z} \frac{A_{ij}^{SK}\, W^{SK} g_j^{SK,0}}{d_i + 1} + b^{SK}\right)$$

where $b^{SK}$ is a bias term; $W^{SK}$ and $b^{SK}$ are learnable parameters, $W^{SK}$ being a weight matrix, and relu is the activation function; node i in the SKGCN corresponds to the ith word node of the syntactic knowledge subgraph, the edges between nodes represent the knowledge connection relationships between words, $d_i$ denotes the degree of node i, and $d_i + 1$ is chosen as the divisor to prevent a degree of 0 from causing an arithmetic error;
Step B42: for the text content graph convolution network SCGCN, inputting the context-enhanced text content characterization vector $U_{st}$ obtained in step B31 into the first graph convolution layer of the SCGCN, updating the vector representation of each word with the syntactic adjacency matrix $A_{st}$, and outputting $U_{st,1}$,
where the output $u_i^{st,1}$ of node i in the first graph convolution layer is computed as

$$u_i^{st,1} = \mathrm{relu}\left(\sum_{j=1}^{n} \frac{A_{ij}^{st}\, W^{st} u_j^{st,0}}{d_i + 1} + b^{st}\right)$$

where $W^{st}$ and $b^{st}$ are learnable parameters, $W^{st}$ being a weight matrix and $b^{st}$ a bias term; relu is the activation function; node i in the graph convolution network corresponds to the ith word of the text content, the edges between nodes represent the syntactic dependency relationships between words, $d_i$ denotes the degree of node i, and $d_i + 1$ is chosen as the divisor to prevent a degree of 0 from causing an arithmetic error;
for the knowledge guidance mechanism, discarding from the first-layer output $G_{SK,1}$ of the SKGCN all content except the words of the current sentence to obtain the first-layer knowledge representation $\hat{G}_{SK,1}$ of the text content; then combining it with the first-layer output $U_{st,1}$ of the SCGCN through a cross attention mechanism to obtain the knowledge-aware sentence representation $G_{SD,1}$, which serves as the input of the next layer of the SCGCN;
where the output of node i of the first SCGCN graph convolution layer after the knowledge guidance mechanism is $g_i^{SD,1}$, obtained by weighting the knowledge representation with the attention weight $\alpha_i$, where $(\cdot)^{T}$ denotes the transpose operation and $\alpha_i$ is the attention weight of the knowledge with respect to the ith word of the sentence s;
Step B43: using $G_{SK,1}$ and $G_{SD,1}$ as the inputs of the next graph convolution layers of the SKGCN and the SCGCN respectively, and repeating steps B41 and B42;
where, for the SKGCN, the output $G_{SK,k}$ of the kth graph convolution layer is used as the input of the (k+1)th layer, and a graph convolution characterization vector is obtained when the iteration ends; for the SCGCN, $G_{SD,k}$, obtained from the kth-layer output $U_{st,k}$ through the knowledge guidance mechanism, is used as the input of the (k+1)th layer, and the graph convolution characterization vector $V_{sks}$ is obtained when the iteration finally ends, with 1 ≤ k ≤ K, K being the number of layers of the graph convolution network.
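A sketch of the degree-normalized graph convolution of steps B41 and B42 follows; the layer divides the aggregated neighbor messages by d_i + 1 and applies relu, and a K-layer stack stands in for the SKGCN/SCGCN towers. The knowledge guidance mechanism between the towers is omitted here for brevity.

```python
# Sketch of the graph convolution used in steps B41/B42: each node aggregates its
# neighbors through the adjacency matrix, divides by (degree + 1) to avoid a zero
# divisor, and applies relu.
import torch
import torch.nn as nn

class DegreeNormGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=True)  # learnable weight matrix and bias term

    def forward(self, H, A):
        # H: (num_nodes, dim) node representations; A: (num_nodes, num_nodes) adjacency
        degree = A.sum(dim=1, keepdim=True)          # d_i for each node i
        aggregated = A @ self.W(H) / (degree + 1.0)  # divide by d_i + 1 (never zero)
        return torch.relu(aggregated)

class GCNTower(nn.Module):
    """K stacked layers, as in the SKGCN / SCGCN of step B4."""
    def __init__(self, dim, k_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(DegreeNormGCNLayer(dim) for _ in range(k_layers))

    def forward(self, H, A):
        for layer in self.layers:
            H = layer(H, A)
        return H
```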
7. The rumor detection method integrating emotion mining according to claim 6, characterized in that step B5 comprises the following steps:
Step B51: inputting the context-enhanced text content characterization vector $U_{st}$ obtained in step B31 and the $V_{sks}$ obtained in step B43 into an attention network, which selects the important knowledge information to obtain the knowledge-enhanced sentence-level characterization vector $E_{sd}$, where $(\cdot)^{T}$ denotes the transpose operation and $\varepsilon_i$ is the attention weight of the ith word in the sentence s;
Step B52: inputting the knowledge-enhanced sentence-level characterization vector $E_{sd}$ obtained in step B51 into a multi-head self-attention mechanism to obtain the sentence characterization vector $E_{mt}$ that aggregates word-level information,

$$E_{mt} = \mathrm{MultiHead}(E_{sd}, E_{sd}, E_{sd}) \quad \text{(formula twenty-nine)}$$

Step B53: to counter the noise that irregular sentences introduce into the model, inputting the sentence characterization vector $E_{mt}$ that aggregates word-level information into a gating function to filter out the irrelevant information, obtaining the vector $E_{sda}$; then inputting it into a multi-layer perceptron (MLP) to obtain the source post emotion characterization vector $E_{sf}$.
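The exact form of the step B53 gating function is not spelled out in the claim; the sketch below assumes a common element-wise sigmoid gate followed by a two-layer perceptron, with the names EmotionHead, gate and mlp being illustrative.

```python
# Sketch of step B53: a sigmoid gate filters noise from the aggregated sentence
# representation E_mt, and an MLP maps the gated vector to the emotion
# characterization E_sf. The gate formulation is an assumption.
import torch
import torch.nn as nn

class EmotionHead(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.gate = nn.Linear(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, e_mt):
        g = torch.sigmoid(self.gate(e_mt))  # element-wise gate in (0, 1)
        e_sda = g * e_mt                    # filter out irrelevant information
        return self.mlp(e_sda)              # emotion characterization vector E_sf
```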
8. The rumor detection method integrating emotion mining according to claim 7, characterized in that step B6 comprises the following steps:
Step B61: inputting all the average pooled comment content sentence representation vectors $\overline{P}_{sr}$ corresponding to the source post and the average pooled text content enhanced representation vector $\overline{P}_s$ together into a multi-head cross attention mechanism, and obtaining the comprehensive semantic representation $C_{sr}$ of the comment content through average pooling:

$$C_{sr} = \mathrm{MeanPool}(C') \quad \text{(formula thirty-three)}$$

where C' is the output of the multi-head cross attention mechanism;
Step B62: inputting the average pooled text content enhanced representation vector $\overline{P}_s$ and the comprehensive semantic representation $C_{sr}$ of the comment content together into a fusion gating mechanism to obtain the fine-grained semantic representation vector $V_t$ of the source post.
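Likewise, the fusion gating mechanism of step B62 is only named in the claim; the sketch below assumes a common convex-combination gate over the two representations, with FusionGate as an illustrative name.

```python
# Sketch of the step B62 fusion gate: the text representation and the comments'
# comprehensive semantic representation are mixed by a learned gate. The convex
# combination below is one common gating form, assumed here for illustration.
import torch
import torch.nn as nn

class FusionGate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, p_s_bar, c_sr):
        g = torch.sigmoid(self.gate(torch.cat([p_s_bar, c_sr], dim=-1)))
        return g * p_s_bar + (1.0 - g) * c_sr  # fine-grained semantic vector V_t
```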
9. The rumor detection method integrating emotion mining according to claim 8, characterized in that step B7 comprises the following steps:
Step B71: connecting the source post emotion characterization vector $E_{sf}$ obtained in step B53 with the $V_t$ obtained in step B62 to obtain the final characterization vector $E_f$, computed as

$$E_f = \mathrm{Concat}(E_{sf}, V_t) \quad \text{(formula thirty-six)}$$

Step B72: inputting the final characterization vector $E_f$ into a fully connected layer, normalizing with softmax, and computing the probability that the text content belongs to each category, with the calculation formulas

$$y = W_3 E_f + b \quad \text{(formula thirty-seven)}$$

$$p_c(y) = \mathrm{softmax}(y) \quad \text{(formula thirty-eight)}$$

where y is the output vector of the fully connected layer, $W_3$ is the weight matrix of the fully connected layer, b is the bias term of the fully connected layer, and $p_c(y)$ is the predicted probability that the text content belongs to class c, with 0 ≤ $p_c(y)$ ≤ 1 and c ∈ {non-rumor, rumor, unverified rumor, debunked rumor};
Step B73: calculating the loss value with cross entropy as the loss function, updating the learning rate with the gradient optimization algorithm Adam, and updating the model parameters iteratively by back propagation, so that the model is trained by minimizing the loss function; the minimized loss is the cross entropy

$$loss = -\sum_{dt \in DT} \sum_{c} y_c \log p_c(y)$$

where $y_c$ is 1 if c is the true label of the sample and 0 otherwise.
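A sketch of the classification head and training objective of steps B71–B73 follows; the feature dimension, batch size and learning rate are illustrative, and softmax is folded into torch.nn.CrossEntropyLoss as is idiomatic in PyTorch.

```python
# Sketch of steps B71-B73: concatenate the emotion and semantic vectors, apply a
# fully connected layer (formulas thirty-six to thirty-eight), and train with
# cross entropy and Adam. Dimensions and class count are illustrative.
import torch
import torch.nn as nn

NUM_CLASSES = 4  # non-rumor, rumor, unverified rumor, debunked rumor

class ClassifierHead(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(2 * dim, NUM_CLASSES)  # W_3 and bias b of formula thirty-seven

    def forward(self, e_sf, v_t):
        e_f = torch.cat([e_sf, v_t], dim=-1)       # E_f = Concat(E_sf, V_t)
        return self.fc(e_f)                        # logits y; softmax applied in the loss

head = ClassifierHead(dim=128)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()                  # softmax + cross entropy of step B73

logits = head(torch.randn(8, 128), torch.randn(8, 128))
loss = criterion(logits, torch.randint(0, NUM_CLASSES, (8,)))
loss.backward()
optimizer.step()
```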
10. A rumor detection system integrating emotion mining, employing the rumor detection method of any one of claims 1 to 9, characterized in that the social network medium is a microblog and the rumor detection system comprises the following modules:
a data collection module: used for extracting the text content and comment content of source posts in the microblog, marking the authenticity of the source posts, and constructing the training set;
a preprocessing module: used for preprocessing the training samples in the training set, including word segmentation and stop word removal;
an encoding module: used for looking up the word vectors of the words in the preprocessed text content and comment content in a pre-trained word vector matrix to obtain the initial characterization vector of the text content and the initial characterization vector of the comment content, and for looking up the word vectors of the nodes of the syntactic knowledge subgraph in a pre-trained knowledge graph word vector matrix to obtain the initial characterization vector of the syntactic knowledge subgraph related to the text content;
a network training module: used for inputting the initial characterization vector of the text content, the initial characterization vector of the comment content and the initial characterization vector of the syntactic knowledge subgraph into the deep learning network to obtain a final characterization vector, taking as the loss the discrepancy between the probability that the characterization vector belongs to each class and the labels in the training set, and training the whole deep learning network with the goal of minimizing this loss, so as to obtain a deep learning network model based on multi-level attention and a knowledge graph;
a rumor detection module: used for extracting the semantic and emotional information in the input source post text content and comment content with NLP tools, analyzing the input source post text content and comment content with the trained deep learning network model based on multi-level attention and the knowledge graph, and outputting the predicted authenticity label of the source post.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211139407.3A | 2022-09-19 | 2022-09-19 | Rumor detection method and system integrating emotion mining
Publications (1)

Publication Number | Publication Date
---|---
CN115422945A | 2022-12-02
Family
ID=84204734
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202211139407.3A | Rumor detection method and system integrating emotion mining | 2022-09-19 | 2022-09-19

Country Status (1)

Country | Link
---|---
CN | CN115422945A (en)
Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117252264A | 2023-11-20 | 2023-12-19 | 神思电子技术股份有限公司 | Relation extraction method combining language model and graph neural network
CN117252264B | 2023-11-20 | 2024-02-02 | 神思电子技术股份有限公司 | Relation extraction method combining language model and graph neural network
Similar Documents

Publication | Title
---|---
CN108984724B | Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
CN109376242B | Text classification method based on cyclic neural network variant and convolutional neural network
WO2023024412A1 | Visual question answering method and apparatus based on deep learning model, and medium and device
CN111274398B | Method and system for analyzing comment emotion of aspect-level user product
CN112613303B | Knowledge distillation-based cross-modal image aesthetic quality evaluation method
CN108549658A | A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN111985205A | Aspect level emotion classification model
CN114443827A | Local information perception dialogue method and system based on pre-training language model
CN111914553B | Financial information negative main body judging method based on machine learning
CN114841151B | Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN115659966A | Rumor detection method and system based on dynamic heteromorphic graph and multi-level attention
CN116028604A | Answer selection method and system based on knowledge enhancement graph convolution network
CN114492459A | Comment emotion analysis method and system based on convolution of knowledge graph and interaction graph
CN114742069A | Code similarity detection method and device
CN113051904B | Link prediction method for small-scale knowledge graph
CN115422945A | Rumor detection method and system integrating emotion mining
CN116661805B | Code representation generation method and device, storage medium and electronic equipment
CN114881038B | Chinese entity and relation extraction method and device based on span and attention mechanism
CN116663539A | Chinese entity and relationship joint extraction method and system based on Roberta and pointer network
CN116737897A | Intelligent building knowledge extraction model and method based on multiple modes
CN116258147A | Multimode comment emotion analysis method and system based on heterogram convolution
CN116662924A | Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism
CN116644760A | Dialogue text emotion analysis method based on Bert model and double-channel model
CN116579347A | Comment text emotion analysis method, system, equipment and medium based on dynamic semantic feature fusion
CN113254575B | Machine reading understanding method and system based on multi-step evidence reasoning
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination