CN111966786B - Microblog rumor detection method - Google Patents

Microblog rumor detection method

Info

Publication number
CN111966786B
CN111966786B
Authority
CN
China
Prior art keywords
microblog
model
training
text
sentence
Prior art date
Legal status
Active
Application number
CN202010757089.1A
Other languages
Chinese (zh)
Other versions
CN111966786A (en)
Inventor
宋玉蓉
潘德宇
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202010757089.1A
Publication of CN111966786A
Application granted
Publication of CN111966786B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Abstract

The invention provides a microblog rumor detection method that incorporates an attention mechanism and comprises the following steps: collecting microblog events and corresponding comment data sets as sample data; preprocessing the sample data and extracting the text contents of the original microblog and the comments; pre-training the text with a BERT pre-training model to generate a fixed-length sentence vector for each sentence of the text; constructing a dictionary and extracting the original microblog and a plurality of corresponding comments to form a microblog event vector matrix; training the vector matrix with the deep learning method Text CNN-Attention to construct a multi-level training model; and classifying the vector matrix with the multi-level training model to obtain the rumor detection result corresponding to the social network data. Compared with traditional rumor detection methods, the method improves accuracy.

Description

Microblog rumor detection method
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a microblog rumor detection method.
Background
Rumors generally refer to unverified statements or descriptions, often concerning an event. With the rapid development of social media, rumors can spread through it at fission-like speed. The microblog, a form of social media, is an emerging class of open Internet social service of the Web 2.0 era. Users can update their microblogs with short texts anytime and anywhere via the Internet or mobile devices, and share information with other users. Compared with the traditional blog, the microblog exhibits the following propagation characteristics: instant sharing, innovative interaction, and vivid real-time presentation; and the following propagation effects: rapid accumulation of attention, low cost, and speed. However, the freedom to publish content, the low barrier for publishers, the wide audience and the diverse distribution channels also promote the publication and diffusion of rumors on microblogs. Rumors propagate on microblogs mostly through comments and the forwarding of information between users, and widely propagated false rumors can have negative effects on society.
Approaches to rumor detection generally fall into two categories. The first is machine learning based on traditional manual feature extraction, which mines features from factors such as rumor content, rumor users, rumor propagation, emotional polarity and user influence, and performs rumor detection with classifiers such as naive Bayes and decision trees. The second is based on deep learning: latent features in the text are learned by constructing a neural network with nonlinear functions, feature representations of the text sequence are learned through neural network models such as CNN and RNN, and rumor detection is finally performed by a nonlinear classifier. At present, research that constructs neural networks for rumor detection through deep learning mostly adopts word2vec word vectors or ELMo as the pre-training model. The word vectors obtained by the former cannot resolve polysemy, so each trained word corresponds to only one vector representation; the latter can adjust word embeddings dynamically according to context, but it uses LSTM rather than the Transformer for feature extraction, and because ELMo splices context vectors to form the current vector, the fused vector features are weaker. The training model mostly adopts a CNN or an RNN; although the CNN can extract sentence-meaning features, it ignores context and word-order features, and after the pooled features are spliced by the fully connected operation, the CNN cannot distinguish the features with the most pronounced influence. Aiming at these existing challenges, the invention provides a new rumor detection model that considers the attention mechanism: for text preprocessing it selects the BERT pre-training model, which can extract latent features of the text; on the training model it introduces the attention mechanism into the CNN model, which can automatically assign different weights according to the influence of different events; and finally a Softmax classifier performs rumor detection.
In view of the above, a method for detecting microblog rumors is needed to solve the above problems.
Disclosure of Invention
The invention aims to provide a microblog rumor detection method with high accuracy.
In order to achieve the above object, the present invention provides a microblog rumor detection method, comprising the following steps:
A. collecting microblog events and corresponding comment data sets as sample data;
B. preprocessing sample data, and respectively extracting text contents of an original microblog and a comment;
C. pre-training the text by adopting a BERT pre-training model, and generating a sentence vector with a fixed length for each sentence of the text;
D. constructing a dictionary, and extracting an original microblog and a plurality of corresponding comments to form a microblog event vector matrix;
E. training the vector matrix by adopting a deep learning method Text CNN-Attention, and constructing a multi-level training model;
F. carrying out classification detection on the vector matrix according to the multi-level training model to obtain a rumor detection result corresponding to the social network data.
As a further improvement of the invention, the sample data comprises rumor sample data and non-rumor sample data.
As a further improvement of the present invention, in the step B, a regular expression is used to remove noise in the json file.
As a further improvement of the present invention, the whole pre-trained text is divided into training data and test data at a ratio of 4:1 for subsequent model processing.
As a further improvement of the present invention, the pre-trained BERT model and code enable the embedding of word vectors.
As a further improvement of the invention, the BERT model serves as the word vector model and can fully characterize character-level, word-level, sentence-level and inter-sentence relationship features, gradually moving the NLP task onto the pre-training generated sentence vectors.
As a further improvement of the invention, the BERT model proposes a pre-training objective, the Masked Language Model (MLM), which overcomes the traditional unidirectional limitation; the MLM objective allows representations that fuse left and right context, so that a deep bidirectional Transformer can be pre-trained.
As a further improvement of the present invention, the BERT model introduces a "next sentence prediction" task, which can train representations of text pairs jointly with the MLM.
As a further improvement of the invention, the BERT model uses sentence-level negative sampling to predict whether the two text segments input to BERT are consecutive; during training, the second segment input to the model is selected at random from the whole corpus with a probability of 50%, and with the remaining 50% probability the text that actually follows the first segment is selected.
As a further improvement of the invention, the constructed multi-level training model consists of Text CNN and an attention mechanism. The Text CNN model performs convolution on the vector matrix to be detected using three convolution kernels of sizes 3, 4 and 5, obtaining different feature representations of the vector matrix for the different kernels; the pooling operation keeps only the single maximum feature produced by each convolution kernel over the input matrix, and the feature representations obtained by the differently sized kernels are connected through the fully connected operation. The attention mechanism assigns different weights to the features produced after full connection according to each feature's influence on the output, so that features with greater influence carry more weight when rumor detection is performed.
The invention has the following beneficial effects. In the microblog rumor detection method, a BERT pre-training model is applied in the text preprocessing stage: the Transformer captures longer-distance dependencies more efficiently and can mine deep context information, so the sentence vectors obtained through pre-training carry better latent features. The training model introduces an attention mechanism that assigns different weights to different features according to their influence, so that features with greater influence on the output result receive more weight and contribute more to the result, which facilitates rumor detection and improves detection accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention. Wherein:
FIG. 1 is a general flow chart for rumor detection;
FIG. 2 is a schematic diagram of the structure of the BERT model;
FIG. 3 is a flow chart of a microblog rumor detection method in consideration of attention mechanism according to the present invention;
FIG. 4 is a schematic structural diagram of a neural network Text CNN model;
FIG. 5 is a schematic diagram of a drawing attention mechanism;
FIG. 6 is a MATLAB simulation chart of the experimental results of embodiment one;
FIG. 7 is a MATLAB simulation chart of the experimental results of embodiment two.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention discloses a microblog rumor detection method that considers the attention mechanism; the overall flow of the method is shown in FIG. 1 and mainly comprises the following steps:
step 1, collecting microblog events and corresponding comment data as sample data;
sample data here includes rumor sample data and non-rumor sample data;
the rumor sample data label is "1" and the non-rumor sample data label is "0".
Step 2, preprocessing the sample data and extracting the corresponding text content with regular expressions;
The main purpose of preprocessing is to remove noise from the text, including non-Chinese characters, punctuation, stop words, etc. The sample data is stored in json-format files; a json file stores data as "key-value pairs", with the data name as the key and the crawled data value as the value, e.g., "text": "breakfast. No association is allowed to avoid crossing provinces.";
All data of a single original microblog event forms one json file, and all data of all comments on that event forms another json file;
Regular expressions remove the noise in the json files, and the text contents of the original microblog events and of all their comments are extracted and stored correspondingly;
All texts are divided into training data and test data at a ratio of 4:1 for subsequent model processing.
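As an illustration of this preprocessing step, the following is a minimal sketch; the file layout, the "text" field name and the cleaning pattern are assumptions for illustration rather than the exact implementation:

import json
import re

# Keep only Chinese characters; this one pattern removes the noise named
# above (non-Chinese characters and punctuation). Stop-word removal would
# need an additional word list and is omitted here.
CLEAN_PATTERN = re.compile(r"[^\u4e00-\u9fa5]")

def load_texts(json_path):
    """Read one event json file (assumed to hold a list of key-value
    records) and return the cleaned text contents."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)
    texts = []
    for record in records:
        cleaned = CLEAN_PATTERN.sub("", record.get("text", ""))
        if cleaned:
            texts.append(cleaned)
    return texts

def split_4_to_1(items):
    """Divide a list into training and test portions at the 4:1 ratio."""
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]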
Step 3, downloading a BERT pre-training model, and converting the text into corresponding sentence vectors;
The BERT model can be obtained by downloading Google's pre-trained BERT; the pre-trained Chinese BERT model and code come from Google Research's BERT release and enable word vector embedding. The basic structure of the model is shown in FIG. 2;
BERT, short for Bidirectional Encoder Representations from Transformers, is a method that improves on the fine-tuning-based approach. The BERT model serves as the word vector model, can fully characterize character-level, word-level, sentence-level and inter-sentence relationship features, and aims to gradually move downstream NLP tasks onto the pre-training generated sentence vectors;
The BERT model includes the following features. It proposes a new pre-training objective, the Masked Language Model (MLM), which overcomes the traditional unidirectional limitation: the MLM objective allows representations that fuse left and right context, so that a deep bidirectional Transformer can be pre-trained. It introduces a "next sentence prediction" task, which can train representations of text pairs jointly with the MLM. It applies sentence-level negative sampling for sentence-level continuity prediction, predicting whether the two text segments input to BERT are consecutive: during training, the second segment input to the model is selected at random from the whole corpus with a probability of 50%, and with the remaining 50% probability the text that actually follows the first segment is selected.
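To make the 50% sampling scheme concrete, the following is a minimal sketch of how such next-sentence-prediction pairs can be constructed (the function and the corpus layout are illustrative assumptions; the method itself uses the already pre-trained BERT and does not repeat this pre-training step):

import random

def make_nsp_pair(documents, doc_idx, sent_idx):
    """Build one next-sentence-prediction example: with probability 50% the
    second segment is the sentence that actually follows the first (label 1),
    otherwise a random sentence from the whole corpus (label 0).
    `documents` is a list of sentence lists."""
    first = documents[doc_idx][sent_idx]
    if random.random() < 0.5 and sent_idx + 1 < len(documents[doc_idx]):
        second, is_next = documents[doc_idx][sent_idx + 1], 1
    else:
        rand_doc = random.randrange(len(documents))
        rand_sent = random.randrange(len(documents[rand_doc]))
        second, is_next = documents[rand_doc][rand_sent], 0
    return first, second, is_next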
Step 4, constructing a corresponding input matrix according to the selected sentence length and the sentence vector dimension;
The BERT-base model is adopted, with 12 network layers; the trained sentence vectors are 768-dimensional;
A fixed number of sentence vectors is selected from the original microblog text and the sentence vectors corresponding to all its comments to form the input matrix.
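The two steps above can be sketched as follows. For convenience the sketch uses the Hugging Face transformers port of the Chinese BERT-base checkpoint, while the patent uses Google Research's released model and code; the library choice, the use of the [CLS] vector as the sentence vector, the event size m = 50 and the zero-padding of short events are assumptions for illustration:

import numpy as np
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese").eval()

def sentence_vector(text):
    """Return a fixed-length 768-dim sentence vector (the [CLS] representation)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[0, 0].numpy()  # [CLS] token, 768-dim

def event_matrix(original_post, comments, m=50):
    """Stack the original microblog and up to m-1 comments into an m x 768
    input matrix, zero-padding events that have fewer comments."""
    sentences = [original_post] + comments[: m - 1]
    rows = [sentence_vector(s) for s in sentences]
    rows += [np.zeros(768)] * (m - len(rows))
    return np.stack(rows)  # shape (m, 768)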
Step 5, constructing a Text CNN-Attention multi-level training model by adopting a deep learning method.
FIG. 3 is a detailed flowchart of the rumor detection method considering the attention mechanism proposed by the present invention. The first layer of the model is the input layer, consisting mainly of the sentence vectors generated by the BERT pre-training model; a whole microblog event is formed from the original microblog plus a selected number of corresponding comments. Next come the convolution layers, where the sentence vectors of the input layer are convolved with filters of different sizes to learn feature representations based on the different filters. Features belonging to the same window are spliced to obtain the feature vector of that window, and a feature sequence is obtained from the different windows in order. The third layer introduces an attention mechanism over the feature sequence: each feature can be given a different weight according to its attention distribution, so that features with greater influence on the output result receive more weight and contribute more to the result. Finally the output is passed to a classifier to judge whether the event is a rumor.
FIG. 4 shows the structure of the Text CNN model; the detailed process is as follows:
(1) For all rumor and non-rumor events in the dataset and their corresponding comments, sentence vectors are trained by the BERT pre-training model. For each microblog event, the original microblog and a number of corresponding comments under the event are selected as input and passed to the input layer, which is an m×n matrix, where m is the number of selected sentences and n is the length of a single sentence vector.
(2) Convolution is performed with three filters of different sizes to obtain the features corresponding to each filter. A filter slides continuously over the m×n input matrix; to facilitate feature extraction, the length of a filter is set to k and its width to n, the width of the input matrix, so a filter can be expressed as h ∈ R^{k×n}. The window starting at any position u in m is then:

w_u = (x_u, x_{u+1}, …, x_{u+k-1})

After the convolution over the input matrix is completed, a feature list c is generated, with each convolution producing one feature of c:

c_u = f(w_u * h + b)

where f is the ReLU function and b is a bias term.
(3) When a filter slides over an input of length m, the feature list has length (m-k+1). With q filters, q feature lists are generated, which are spliced into a matrix:

W_1 = [c_1, c_2, …, c_q]

where c_q denotes the feature list generated by the q-th filter. Since three filter sizes are used in total, the final overall matrix is:

W = [W_1, W_2, W_3] = [c_1, c_2, …, c_q, c_{q+1}, …, c_{2q}, c_{2q+1}, …, c_{3q}]
(4) A maximum pooling operation is applied to the features obtained from each filter to produce the output features, and the output features of the different filters are fully connected to obtain the CNN output:

W' = [c_11, c_22, …, c_kk].
(5) An attention layer performs a weighted summation over the output of the CNN layer to obtain a hidden-layer representation of the microblog sequence; the structure with the introduced attention mechanism is shown in FIG. 5. Introducing an attention mechanism over the CNN assigns different weights to the hidden state sequence W' output by the CNN, so that the model can draw on the microblog sequence information with emphasis when learning the representation of the microblog sequence. The attention layer takes the CNN output c_kk as input and outputs the corresponding representation v_kk of the microblog sequence:

h_i = tanh(W_A · c_kk + b_A)

a_i = exp(h_i^T · h_A) / Σ_i exp(h_i^T · h_A)

v_kk = a_i · c_kk

These compose the matrix V = [v_11, v_22, …, v_kk], where W_A is a weight matrix, b_A is a bias value, h_i is the hidden representation of c_kk, a_i is the similarity of h_i to the context vector h_A, and v_i is the output vector.
(6) The output is sent to a fully connected layer, and Softmax yields the probability outputs for rumor and non-rumor, so that rumor events can be judged.
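Steps (1) to (6) can be gathered into one model. The following PyTorch sketch follows the structure described above, with three kernel sizes 3, 4 and 5, one max-pooled feature per filter, attention weighting of the fully connected features, and a Softmax output; the number of filters per size (q = 100) and the attention dimension are assumptions, and the sketch illustrates the technique rather than reproducing the patented implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNNAttention(nn.Module):
    """Sketch of the Text CNN-Attention model of steps (1)-(6)."""

    def __init__(self, embed_dim=768, num_filters=100, kernel_sizes=(3, 4, 5),
                 attn_dim=64, num_classes=2):
        super().__init__()
        # One Conv2d per kernel size; each filter spans the full embedding width n.
        self.convs = nn.ModuleList(
            nn.Conv2d(1, num_filters, (k, embed_dim)) for k in kernel_sizes
        )
        total = num_filters * len(kernel_sizes)          # length of W'
        self.W_A = nn.Linear(1, attn_dim)                # h_i = tanh(W_A c + b_A)
        self.h_A = nn.Parameter(torch.randn(attn_dim))   # context vector
        self.fc = nn.Linear(total, num_classes)

    def forward(self, x):                      # x: (batch, m, embed_dim)
        x = x.unsqueeze(1)                     # (batch, 1, m, embed_dim)
        pooled = []
        for conv in self.convs:
            c = F.relu(conv(x)).squeeze(3)     # (batch, q, m-k+1): feature lists
            pooled.append(F.max_pool1d(c, c.size(2)).squeeze(2))  # one max per filter
        w = torch.cat(pooled, dim=1)           # W' = [c_1, ..., c_3q], (batch, 3q)
        h = torch.tanh(self.W_A(w.unsqueeze(2)))   # (batch, 3q, attn_dim)
        alpha = F.softmax(h @ self.h_A, dim=1)     # similarity to context h_A
        v = alpha * w                          # weighted features V
        return F.softmax(self.fc(v), dim=1)    # rumor / non-rumor probabilities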
Step 6, training and testing the input matrix with the multi-level training model to obtain the corresponding rumor detection result.
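Assuming the TextCNNAttention class and the event matrices from the sketches above, step 6 could look as follows; the batch size, learning rate and epoch count are illustrative assumptions, and NLLLoss over the log of the probabilities is used because the sketched model already ends in a Softmax:

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

def train_and_test(model, X_train, y_train, X_test, y_test, epochs=10):
    """X_* are float32 tensors of shape (N, m, 768); y_* hold the 1/0 labels."""
    loader = DataLoader(TensorDataset(X_train, y_train), batch_size=32, shuffle=True)
    optimizer = optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.NLLLoss()  # model already outputs Softmax probabilities
    for _ in range(epochs):
        for xb, yb in loader:
            optimizer.zero_grad()
            probs = model(xb)
            loss = criterion(torch.log(probs + 1e-9), yb)
            loss.backward()
            optimizer.step()
    with torch.no_grad():
        preds = model(X_test).argmax(dim=1)
    return (preds == y_test).float().mean().item()  # test accuracy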
Embodiment one:
To demonstrate the effectiveness of the present invention, we selected a series of microblog-platform event data collated by Ma et al. and used in their paper. The data set consists of raw information captured through the microblog API, including all forwards and replies to a given event; general subject posts that were not reported as rumors were also captured, giving a number of non-rumor events similar to the number of rumor events. Detailed statistics are listed in the following table:
[Table omitted: statistics of the collected rumor and non-rumor events]
We divided all data into training and test sets at a ratio of 4:1; the specific division is listed in the following table:
[Table omitted: 4:1 division into training and test sets]
The evaluation indexes used to evaluate the effectiveness of the model are the four values of accuracy, precision, recall and F1; the conditions generated by the predicted versus the actual results are listed in the following table:
[Table omitted: confusion matrix of predicted versus actual results]
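For reference, the four evaluation indexes follow from the confusion-matrix counts in the table above in the standard way; a minimal sketch:

def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from the TP/FP/FN/TN counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1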
four baseline methods, SVM-TS, CNN-1, CNN-2, CNN-GRU, were used for comparison, and the detailed data for the effect of our method on rumor testing compared to the baseline method are shown in the following table, and the MATLAB simulation graph of the experimental results is shown in fig. 6:
[Table omitted: detection results of the proposed method versus the baseline methods]
As can be seen from the table, the conventional SVM-TS method, which uses a classifier, reaches a final rumor-detection accuracy of only 85.7%, which is not particularly strong. Comparing the GRU-1, GRU-2 and CNN-GRU models shows that after a convolutional neural network is added to the training model, accuracy improves to 95.7%, because different latent features in the input can be extracted through the filters. After the attention mechanism is introduced into the model, different weights are assigned to the CNN outputs, so that features with greater influence on the output result receive more weight and contribute more to the result, which facilitates rumor detection; the results show that the model reaches an accuracy of 96.8%, with the recall rate and the F1 value improved as well.
Embodiment two:
In order to prove the feasibility of the method, another microblog data set, CED_Dataset [23], was selected for testing; sentence vectors obtained with the same pre-training model were trained on different training models and the resulting accuracies were compared. The data set contains 1538 rumor events and 1849 non-rumor events, divided into training and test sets at a ratio of 4:1. The experimental data are listed in the following table, and the MATLAB simulation chart of the experimental results is shown in FIG. 7:
[Table omitted: accuracy of the different training models on CED_Dataset]
The experimental results show that sentence vectors obtained through the BERT pre-training model still deviate somewhat in accuracy when trained on different training models, but the deviation is small compared with that between different pre-training models. The experiments give SVM-TS an accuracy of about 86.7%, followed in turn by the GRU-1, CNN-GRU and GRU-2 models; the CNN-Attention model performs best, reaching an accuracy of 95.3% and showing the best recall rate and F1 value among the compared models.
In conclusion, the model shows the best effect on the two different data sets: using the BERT pre-training model greatly improves the feature-expression effect of the preprocessed sentence vectors, and pairing it with the CNN model incorporating the attention mechanism extracts the latent features in the text more effectively, which is of great significance for rumor detection tasks.
The microblog rumor event detection problem has been explained mainly from two aspects, the pre-training model and the training model. Regarding the pre-training model, its influence on the experimental results is explained, and moving part of the downstream NLP tasks onto the pre-training model yields better results. On the training model side, a novel rumor detection model introducing an attention mechanism is proposed on the basis of the traditional Text CNN model; it can assign different weights to the input sentence vectors according to their degree of influence on the output, thereby positively influencing the prediction of whether an event is a rumor. Experimental verification on real microblog data sets shows that the method has a good rumor detection effect.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (9)

1. A microblog rumor detection method is characterized by comprising the following steps:
A. collecting microblog events and corresponding comment data sets as sample data;
B. preprocessing the sample data, and respectively extracting text contents of the original microblog and the comment;
C. pre-training the text by adopting a BERT pre-training model, and generating a sentence vector with a fixed length for each sentence of the text;
D. constructing a dictionary, and extracting an original microblog and a plurality of corresponding comments to form a microblog event vector matrix;
E. training the vector matrix by adopting a deep learning method Text CNN-Attention, and constructing a multi-level training model;
the constructed multi-level training model consists of Text CNN and an attention mechanism; the Text CNN model performs convolution on the vector matrix to be detected using three convolution kernels of sizes 3, 4 and 5, obtaining different feature representations of the vector matrix for the different kernels; the pooling operation keeps only the single maximum feature produced by each convolution kernel over the input matrix, and the feature representations obtained by the differently sized kernels are connected through the fully connected operation; the attention mechanism assigns different weights to the features produced after full connection according to each feature's influence on the output, so that features with greater influence carry more weight when rumor detection is performed;
F. carrying out classification detection on the vector matrix according to the multi-level training model to obtain a rumor detection result corresponding to the social network data.
2. The microblog rumor detection method according to claim 1, wherein: the sample data includes rumor sample data and non-rumor sample data.
3. The microblog rumor detection method of claim 1, wherein: in the step B, the noise in the json file is eliminated by using a regular expression.
4. The microblog rumor detection method of claim 3, wherein: all the pre-trained texts are divided into training data and test data at a ratio of 4:1 for subsequent model processing.
5. The microblog rumor detection method according to claim 4, wherein: the pre-trained BERT model and code enable the embedding of word vectors.
6. The microblog rumor detection method according to claim 5, wherein: the BERT model is used as a word vector model, can fully describe character level, word level, sentence level and sentence-sentence relation characteristics, and gradually moves NLP tasks to pre-training generated sentence vectors.
7. The microblog rumor detection method according to claim 1, wherein: the BERT model proposes a pre-training objective, the Masked Language Model (MLM), which overcomes the traditional unidirectional limitation; the MLM objective allows representations that fuse left and right context, so that a deep bidirectional Transformer can be pre-trained.
8. The microblog rumor detection method according to claim 7, wherein: the BERT model introduces a "next sentence prediction" task that can be used to train the representation of text pairs with MLM.
9. The microblog rumor detection method according to claim 8, wherein: the BERT model uses sentence-level negative sampling to predict whether the two text segments input to BERT are consecutive; during training, the second segment input to the model is selected at random from the whole corpus with a probability of 50%, and with the remaining 50% probability the text that actually follows the first segment is selected.
CN202010757089.1A 2020-07-31 2020-07-31 Microblog rumor detection method Active CN111966786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010757089.1A CN111966786B (en) 2020-07-31 2020-07-31 Microblog rumor detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010757089.1A CN111966786B (en) 2020-07-31 2020-07-31 Microblog rumor detection method

Publications (2)

Publication Number Publication Date
CN111966786A CN111966786A (en) 2020-11-20
CN111966786B (en) 2022-10-25

Family

ID=73363172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010757089.1A Active CN111966786B (en) 2020-07-31 2020-07-31 Microblog rumor detection method

Country Status (1)

Country Link
CN (1) CN111966786B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560495B (en) * 2020-12-09 2024-03-15 新疆师范大学 Microblog rumor detection method based on emotion analysis
CN112818011B (en) * 2021-01-12 2022-03-08 南京邮电大学 Improved TextCNN and TextRNN rumor identification method
CN113158075A (en) * 2021-03-30 2021-07-23 昆明理工大学 Comment-fused multitask joint rumor detection method
CN113204641B (en) * 2021-04-12 2022-09-02 武汉大学 Annealing attention rumor identification method and device based on user characteristics
CN113705099B (en) * 2021-05-09 2023-06-13 电子科技大学 Social platform rumor detection model construction method and detection method based on contrast learning
CN113127643A (en) * 2021-05-11 2021-07-16 江南大学 Deep learning rumor detection method integrating microblog themes and comments
CN113326437B (en) * 2021-06-22 2022-06-21 哈尔滨工程大学 Microblog early rumor detection method based on dual-engine network and DRQN
CN113377959B (en) * 2021-07-07 2022-12-09 江南大学 Few-sample social media rumor detection method based on meta learning and deep learning
CN116401339A (en) * 2023-06-07 2023-07-07 北京百度网讯科技有限公司 Data processing method, device, electronic equipment, medium and program product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280057A (en) * 2017-12-26 2018-07-13 厦门大学 A kind of microblogging rumour detection method based on BLSTM
CN111144131A (en) * 2019-12-25 2020-05-12 北京中科研究院 Network rumor detection method based on pre-training language model
CN111159338A (en) * 2019-12-23 2020-05-15 北京达佳互联信息技术有限公司 Malicious text detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111966786A (en) 2020-11-20

Similar Documents

Publication Publication Date Title
CN111966786B (en) Microblog rumor detection method
CN110119765B (en) Keyword extraction method based on Seq2Seq framework
Chen et al. Call attention to rumors: Deep attention based recurrent neural networks for early rumor detection
CN111144131B (en) Network rumor detection method based on pre-training language model
CN113051916B (en) Interactive microblog text emotion mining method based on emotion offset perception in social network
CN111310476B (en) Public opinion monitoring method and system using aspect-based emotion analysis method
Riadi Detection of cyberbullying on social media using data mining techniques
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN105183717A (en) OSN user emotion analysis method based on random forest and user relationship
CN109325125B (en) Social network rumor detection method based on CNN optimization
CN111914553B (en) Financial information negative main body judging method based on machine learning
Zhang et al. Exploring deep recurrent convolution neural networks for subjectivity classification
Rauf et al. Using bert for checking the polarity of movie reviews
Ke et al. A novel approach for cantonese rumor detection based on deep neural network
CN115329085A (en) Social robot classification method and system
Huang A CNN model for SMS spam detection
CN113220964B (en) Viewpoint mining method based on short text in network message field
CN113535960A (en) Text classification method, device and equipment
CN111859955A (en) Public opinion data analysis model based on deep learning
CN116644760A (en) Dialogue text emotion analysis method based on Bert model and double-channel model
CN115659990A (en) Tobacco emotion analysis method, device and medium
Mo et al. Large language model (llm) ai text generation detection based on transformer deep learning algorithm
CN114238738A (en) Rumor detection method based on attention mechanism and bidirectional GRU
Kavatagi et al. A context aware embedding for the detection of hate speech in social media networks
Al Azhar et al. Identifying Author in Bengali Literature by Bi-LSTM with Attention Mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant