CN112766359A - Word double-dimensional microblog rumor recognition method for food safety public sentiment - Google Patents

Info

Publication number
CN112766359A
Authority
CN
China
Prior art keywords
word
model
text
embedding
food safety
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110050517.1A
Other languages
Chinese (zh)
Other versions
CN112766359B (en)
Inventor
左敏
何思宇
张青川
颜文婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University
Priority to CN202110050517.1A (patent CN112766359B)
Publication of CN112766359A
Application granted
Publication of CN112766359B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention relates to a word double-dimensional (character-level and word-level) microblog rumor recognition method for food safety public sentiment, comprising the following steps: preprocessing data crawled from the Internet; constructing a word embedding resource library for the food safety field on the basis of an open-domain word embedding resource library; crawling multi-level encyclopedia corpora to incrementally train the word embedding resource library; extracting character-dimension text features with a BERT network; extracting word-dimension text features with a BLSTM network augmented with a position attention mechanism; and finally combining the two into a character-word dual-dimension text feature vector, which is used to classify whether a microblog text is a rumor. The method addresses the heavily colloquial, weakly structured, strongly domain-specific and hard-to-vectorize nature of microblog text corpora in the field of food safety public sentiment. By constructing a domain lexicon and applying multi-granularity vectorization, it extracts corpus features more fully and improves the accuracy of rumor recognition.

Description

Word double-dimensional microblog rumor recognition method for food safety public sentiment
Technical Field
The invention relates to the field of artificial intelligence, in particular to a word double-dimensional microblog rumor recognition method for food safety public sentiments.
Background
Microblogs are popular because they are convenient, open, timely and anonymous, and more and more people use them to publish opinions and share stories. However, because the threshold for user registration is low and the user base is diverse, the quality of the information users publish is difficult to monitor and control. Microblogs have therefore become a hotbed for the growth and spread of online rumors, which not only seriously interfere with people's lives but also disturb social order.
Food bears directly on the national economy and people's livelihood, so microblog rumors related to food safety have a particularly serious impact. Building a rumor recognition model with natural language processing technology is therefore of great significance for identifying food safety microblog rumors.
Text classification is an important and practical research direction in natural language processing. Before the rise of deep learning, conventional machine learning methods such as the naive Bayes model and the support vector machine were applied to text classification. However, conventional machine learning models depend on manual corpus annotation, which consumes a large amount of manpower and material resources, and their text feature extraction results are unsatisfactory.
With the development in recent years of technologies such as deep learning, cloud computing and artificial intelligence, deep neural networks have been applied in many fields with good results. In natural language processing, multi-layer network models trained on large-scale corpora mine text feature information automatically; the deep neural network has become one of the key technologies of the field and also performs well in text semantic classification tasks. The development and use of the long short-term memory (LSTM) network and the attention mechanism in natural language processing lay the foundation for the invention.
In addition, in text semantic classification, many researchers have studied whether the two embedding granularities, character level and word level, affect the classification result. Kim proposed a model that extracts text semantic information with a character-level CNN, and Liu Longfei et al. demonstrated the advantage of character-level feature representation in Chinese text processing.
Because microblog texts are mostly unstructured and lack standardized text corpora, they are difficult to vectorize. Extracting the semantic features of a text at the character dimension alone or at the word dimension alone yields incomplete features and loses classification accuracy, and existing language models have difficulty processing texts in the food safety field accurately. The invention therefore processes food safety microblog texts by combining a character-word dual-dimension neural network model with a word library constructed for the food field.
Disclosure of Invention
Problem solved by the invention: overcoming the deficiencies of the prior art, the invention provides a word double-dimensional microblog rumor recognition method for food safety public sentiment. It meets the need to identify and supervise food-safety-related rumors on microblogs, identifies and judges rumors quickly and accurately, greatly improves the working efficiency of supervisors, and assists them in making judgments.
The invention relates to a word double-dimensional microblog rumor recognition method for food safety public sentiments, which comprises the following steps of:
step 1, preprocessing the original text data acquired from the Internet by a web crawler, wherein the preprocessing comprises removing the large number of special symbols, stop words and the like contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of the open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory (BLSTM) network fused with a position-aware attention mechanism as the neural network model end that obtains the word-vector-dimension text features: first, the semantic roles and positions of the domain keywords are determined with the domain word library constructed in step 2, and position-aware attention is generated; then, the word vectors generated by word embedding of the text corpus are input into the BLSTM model and take part in the computation of the intermediate hidden layer, and the vectors output by the hidden layer are further computed under the influence of the attention mechanism to obtain word-level text semantic features;
step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural network model end that obtains the character-vector-dimension text features, wherein the BERT model converts each character in the text into a vector by looking up a character vector table and uses it as model input; the model output is the vector representation of each input character after full-text semantic information has been fused in;
step 5, using SoftMax as the classifier: after the corpus has been processed by the two-way neural network of BERT and BLSTM, the word-dimension text feature information obtained in step 3 and the character-dimension text feature information obtained in step 4 are merged at a connection layer and then input into the classifier for classification, giving the final rumor classification result.
Further, in step 2, on the basis of the open-domain word embedding resource library, the word embedding resource library for the food safety field is constructed by combining a skip-gram model with word semantic representation. The corpus is then expanded: the public Baidu Baike (Baidu Encyclopedia) corpus is added, and food-field encyclopedia entries and news corpora are crawled from the web to train the word vector model. After a period of time, once a certain amount of food safety public opinion corpus has accumulated, the word vector model is incrementally trained.
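As an illustration of the skip-gram idea mentioned in step 2, the following minimal sketch generates the (center, context) training pairs that a skip-gram model learns from. The function name `skipgram_pairs`, the window size and the example sentence are hypothetical and are not taken from the patent; the actual embedding training and the incremental update are not shown.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical, already segmented sentence from a food-domain corpus.
sentence = ["melamine", "detected", "in", "milk", "powder"]
pairs = skipgram_pairs(sentence, window=1)
print(pairs[:3])
```

Under this view, incremental training amounts to generating such pairs from the newly accumulated public opinion corpus and continuing to update the existing vectors rather than retraining from scratch.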
Further, in step 3, a bidirectional long short-term memory network model fused with a position-aware attention mechanism is trained as the word-dimension text feature extraction model. The microblog text corpus is converted into vector representations and used as the network input to train the neural network model. This network forms one of the two branches of the overall model, and training on the existing microblog text corpus gives its branch output: the word-dimension text feature vector representation.
Further, in step 4, a BERT network model is trained as the character-dimension text feature extraction model. In addition to the character vector (Token Embedding), the model input contains two parts. The first is the segment embedding (Segment Embedding): the value of this vector is learned automatically during model training, describes the global semantic information of the text, and is fused with the semantic information of the individual characters. The second is the position embedding (Position Embedding): because characters appearing at different positions in a text carry different semantic information, the BERT model adds a different vector to the character at each position to distinguish them. Finally, the BERT model takes the sum of Token Embedding, Segment Embedding and Position Embedding as the input representation of the sentence and yields the other branch output of the overall model: the character-dimension text feature vector representation.
Further, the BERT network is used as a pre-trained model. In a text classification task, the Token Embedding layer of the BERT network requires the input to mark the head of a sentence with [CLS] and the boundary between sentences with [SEP]. The Segment Embedding and Position Embedding layers take part in the computation with pre-trained model parameters.
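The input layout described above can be sketched as follows. This is a simplified illustration only: it assumes each Chinese character becomes one token, a single-sentence input (segment id 0 throughout) and a hypothetical helper name `build_bert_input`; a real BERT tokenizer additionally maps tokens to vocabulary ids.

```python
from typing import Dict, List


def build_bert_input(text: str, max_len: int = 16) -> Dict[str, List]:
    """Wrap the characters of `text` in [CLS] ... [SEP] and attach segment/position ids."""
    chars = list(text)[: max_len - 2]  # leave room for the two special tokens
    tokens = ["[CLS]"] + chars + ["[SEP]"]
    return {
        "tokens": tokens,
        "segment_ids": [0] * len(tokens),          # one sentence, so one segment
        "position_ids": list(range(len(tokens))),  # 0, 1, 2, ...
    }


# Hypothetical four-character microblog fragment.
example = build_bert_input("奶粉含毒")
print(example["tokens"])
```

Each token then receives the sum of its token, segment and position embeddings as its input vector.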
Further, in step 5, two neural network models are trained: the bidirectional long short-term memory network model fused with the position-aware attention mechanism, which extracts the word-dimension text feature vector, and the BERT model, which extracts the character-dimension text feature vector. At the start of training the weights are randomly initialized. After the two branch results have been computed by the neural networks, they are joined at a connection layer, and a SoftMax function converts the numerical output of the neural network into class probabilities, from which the training loss is computed. To avoid overfitting during training, Dropout with a certain probability is applied, that is, part of the weights or hidden-layer outputs are randomly set to zero during model training, which reduces the interdependence among nodes and improves the generalization of the model.
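The Dropout step can be illustrated with a minimal sketch. The rescaling of surviving values by 1/(1-p) ("inverted dropout") is a common convention and an assumption here; the patent states only that part of the weights or outputs are randomly zeroed. The function name and the example values are hypothetical.

```python
import random
from typing import List


def dropout(values: List[float], p: float, rng: random.Random) -> List[float]:
    """Zero each value with probability p; scale the survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


rng = random.Random(0)            # fixed seed so the run is repeatable
hidden = [1.0] * 1000             # hypothetical hidden-layer outputs
dropped = dropout(hidden, p=0.5, rng=rng)
print(sum(1 for v in dropped if v == 0.0), "of", len(dropped), "outputs zeroed")
```

Dropout is applied only during training; at inference time the outputs are used unchanged.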
Compared with the prior art, the invention has the following advantages. A character-word two-way text semantic classification model, built from a BLSTM network fused with a position-aware attention mechanism and a BERT network, quickly judges whether a food-safety-related microblog is a rumor. A more comprehensive and more targeted embedding resource library for food safety public opinion is constructed for rumors in this field, the two embedding granularities of character level and word level are used as model input, and the text is finally classified by combining the feature extraction results of the two branches. The model makes full use of the characteristics of the BLSTM to mine the semantic features of the text at the word vector level. Through training, the BLSTM captures detailed feature information in the microblog text, and the position attention mechanism lets the words related to the food safety field play a decisive role in the text as a whole. Meanwhile, the BERT network further mines text semantics at the character vector level, which avoids the loss of classification accuracy caused by incomplete feature extraction from unstructured, non-standard text corpora and effectively improves text semantic classification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a schematic flow chart of the word double-dimensional microblog rumor recognition method for food safety public sentiment according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the bidirectional long short-term memory network fused with the position attention mechanism at the word vector end;
FIG. 3 is a schematic diagram of the BERT network at the character vector end;
FIG. 4 is a schematic diagram of the connection layer network.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, rather than all embodiments, and all other embodiments obtained by a person skilled in the art based on the embodiments of the present invention belong to the protection scope of the present invention without creative efforts.
As shown in fig. 1, the invention provides a word two-dimensional microblog rumor identification method for food safety public sentiment, which comprises the following steps:
step 1, preprocessing the original text data acquired from the Internet by a web crawler, wherein the preprocessing comprises removing the large number of special symbols, stop words and the like contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of the open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory (BLSTM) network fused with a position-aware attention mechanism as the neural network model end that obtains the word-vector-dimension text features: first, the semantic roles and positions of the domain keywords are determined with the domain word library constructed in step 2, and position-aware attention is generated; then, the word vectors generated by word embedding of the text corpus are input into the BLSTM model and take part in the computation of the intermediate hidden layer, and the vectors output by the hidden layer are further computed under the influence of the attention mechanism to obtain word-level text semantic features;
step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural network model end that obtains the character-vector-dimension text features, wherein the BERT model converts each character in the text into a vector by looking up a character vector table and uses it as model input; the model output is the vector representation of each input character after full-text semantic information has been fused in;
step 5, using SoftMax as the classifier: after the corpus has been processed by the two-way neural network of BERT and BLSTM, the word-dimension text feature information obtained in step 3 and the character-dimension text feature information obtained in step 4 are merged at a connection layer and then input into the classifier for classification, giving the final rumor classification result.
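Step 1 above can be sketched as a small cleaning function. The stop-word list, the regular expressions and the whitespace tokenization are illustrative assumptions made for an English example; the patent does not specify them, and Chinese microblog text would need a word segmenter in place of `split()`.

```python
import re
from typing import Iterable, List

# Hypothetical stop-word list; a real system would load a much fuller one.
STOP_WORDS = {"the", "a", "is", "of"}


def preprocess(text: str, stop_words: Iterable[str] = STOP_WORDS) -> List[str]:
    """Strip URLs, @mentions and special symbols, then drop stop words."""
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove @mentions
    text = re.sub(r"[^\w\s]", " ", text)       # remove remaining special symbols
    stop = set(stop_words)
    return [tok for tok in text.lower().split() if tok not in stop]


tokens = preprocess("The milk is toxic!!! #foodsafety @user http://t.cn/abc")
print(tokens)
```

Removing URLs and mentions before the generic symbol filter keeps their fragments from leaking into the token list.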
Fig. 1 is an overall schematic diagram of the method provided by the invention. The crawled food safety public sentiment microblog data are preprocessed; a word embedding resource library for the food safety field is constructed on the basis of an open-domain word embedding resource library; multi-level encyclopedia corpora are then crawled to incrementally train the word embedding resource library; character-dimension text features are obtained with a BERT network, and word-dimension text features are obtained with a BLSTM network augmented with a position attention mechanism; finally, the character-word dual-dimension text feature vector is obtained and used to classify whether a microblog text is a rumor.
In the embodiment shown in fig. 2, the model first determines the semantic roles and positions of the domain keywords with the domain lexicon and generates position-aware attention. Word embedding converts the microblog text into word-dimension vectors, which are input into the bidirectional long short-term memory network. The word vectors pass through the intermediate hidden layer, which outputs hidden-layer vectors, and these are combined with the position attention to compute the word-dimension text semantic features.
For the microblog rumor problem in the field of food safety public sentiment studied here, keywords of the food safety field are very important, and the words adjacent to a keyword also have a non-negligible effect. In a text classification task each word in the text influences the final classification result differently, so increasing the attention paid to keywords lets them play a fuller role. The keyword positions are therefore located with the domain lexicon so that the model learns more position information, and a position-based attention mechanism is introduced into the model. The influence of a keyword at a given distance, in each hidden-layer dimension, is assumed to follow a Gaussian distribution. An influence basis matrix K is defined, each column of which is the influence basis vector for a particular distance. K is defined in formula (1):
$$K(i,u) \sim N(\mathrm{Kernel}(u), \sigma) \tag{1}$$
where $K(i,u)$ is the influence, in the $i$-th dimension, of a food safety domain keyword at distance $u$, and $N$ denotes a normal distribution with expectation $\mathrm{Kernel}(u)$ and standard deviation $\sigma$. $\mathrm{Kernel}(u)$ is a Gaussian kernel function that models the position-aware propagation of influence, defined in formula (2):
$$\mathrm{Kernel}(u) = \exp\left(-\frac{u^{2}}{2\sigma^{2}}\right) \tag{2}$$
When $u = 0$ the current word is itself a keyword of the food safety field and receives the largest propagated influence; the influence weakens as the distance increases.
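The behaviour described above can be checked numerically with a short sketch. It assumes the Gaussian kernel takes the common form exp(-u²/(2σ²)), and it uses two separate parameters, `sigma` for the kernel width and `noise_std` for the spread of the normal distribution in formula (1), although the text writes a single σ. All names and values are illustrative.

```python
import math
import random
from typing import List


def kernel(u: int, sigma: float) -> float:
    """Assumed Gaussian kernel: 1 at distance 0, decaying as |u| grows."""
    return math.exp(-(u ** 2) / (2.0 * sigma ** 2))


def influence_basis(dim: int, max_dist: int, sigma: float,
                    noise_std: float, rng: random.Random) -> List[List[float]]:
    """Sample K[i][u] ~ N(kernel(u), noise_std); column u holds distance u."""
    return [[rng.gauss(kernel(u, sigma), noise_std) for u in range(max_dist + 1)]
            for _ in range(dim)]


K = influence_basis(dim=4, max_dist=5, sigma=2.0, noise_std=0.01,
                    rng=random.Random(0))
print([round(kernel(u, 2.0), 3) for u in range(6)])
```

The printed kernel values fall monotonically from 1.0 at u = 0, matching the statement that the propagated influence is largest at the keyword itself.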
Using the influence basis matrix and the positions of the food safety domain keywords, the influence vector at each position is obtained by accumulation:
$$p_j = K c_j \tag{3}$$
where $p_j$ is the accumulated influence vector of the word at position $j$, and $c_j$ is a distance count vector: $c_j(u)$ counts all keywords at distance $u$ from the word at position $j$ and is computed as follows:
$$c_j(u) = \sum_{q \in Q} \big( [(j-u) \in \mathrm{pos}(q)] + [(j+u) \in \mathrm{pos}(q)] \big) \tag{4}$$
where $Q$ is the set of all keywords contained in a microblog text related to food safety public sentiment, $q$ is one of these keywords, $\mathrm{pos}(q)$ is the set of positions at which keyword $q$ appears in the sentence, and $[\cdot]$ is the indicator function, equal to 1 if the condition holds and 0 otherwise.
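A small worked example may make the distance count vector and the product of formula (3) concrete. The sketch follows formula (4) literally, reading pos(·) as the position set of keyword q; the keyword positions, the one-row basis matrix and the function names are hypothetical.

```python
from typing import Dict, List


def distance_counts(j: int, keyword_positions: Dict[str, List[int]],
                    max_dist: int) -> List[int]:
    """c_j(u) for u = 0..max_dist: keywords found at positions j-u and j+u."""
    counts = []
    for u in range(max_dist + 1):
        total = 0
        for positions in keyword_positions.values():
            pos = set(positions)
            total += int((j - u) in pos) + int((j + u) in pos)
        counts.append(total)
    return counts


def influence_vector(K: List[List[float]], c: List[int]) -> List[float]:
    """p_j = K c_j, written as a plain matrix-vector product."""
    return [sum(k_iu * c_u for k_iu, c_u in zip(row, c)) for row in K]


# Hypothetical sentence with domain keywords at positions 1 and 4.
positions = {"melamine": [1], "milk": [4]}
c2 = distance_counts(j=2, keyword_positions=positions, max_dist=3)
p2 = influence_vector([[1.0, 0.5, 0.25, 0.125]], c2)
print(c2, p2)
```

Taken literally, the formula counts a keyword twice at u = 0, because j-u and j+u coincide there; an implementation may choose to count it once.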
The attention of the word at position $j$ in a microblog text related to food safety public sentiment is computed as in formula (5):
$$\alpha_j = \frac{\exp\big(a(h_j, p_j)\big)}{\sum_{k=1}^{len} \exp\big(a(h_k, p_k)\big)} \tag{5}$$
where $h_j$ is the hidden-layer vector of the word at position $j$, $p_j$ is the accumulated position-aware influence vector, $len$ is the number of word vectors in a sentence of microblog text related to food safety public sentiment, and $a(\cdot)$ measures the importance of a word from its hidden-layer vector and its position-aware influence vector. The specific form of $a(\cdot)$ is given in formula (6):
$$a(h_j, p_j) = v^{T} \varphi(W_H h_j + W_p p_j + b_1) + b_2 \tag{6}$$
where $W_H$ and $W_p$ are the weight matrices of $h_j$ and $p_j$, $b_1$ is the bias vector of the first-layer parameters, $\varphi$ is the ReLU function, $v$ is a global vector, and $b_2$ is the bias of the second-layer parameters. After the weight of the word at each position has been calculated, all hidden-layer vectors in the sentence are weighted to obtain the final attention value:
$$r = \sum_{j=1}^{len} \alpha_j h_j \tag{7}$$
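The attention computation can be sketched in plain Python under stated assumptions: the score is taken to be a(h, p) = vᵀ ReLU(W_H h + W_p p + b1) + b2 with a scalar b2, the weights are a softmax over the scores, and the result is the weighted sum of the hidden vectors. The symbol names and the toy parameters below are illustrative and are not values from the patent.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(M: Matrix, x: Vector) -> Vector:
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def score(h: Vector, p: Vector, W_H: Matrix, W_p: Matrix,
          b1: Vector, v: Vector, b2: float) -> float:
    """Assumed scoring function: v^T ReLU(W_H h + W_p p + b1) + b2."""
    pre = [a + b + c for a, b, c in zip(matvec(W_H, h), matvec(W_p, p), b1)]
    act = [max(0.0, z) for z in pre]  # ReLU
    return sum(vi * ai for vi, ai in zip(v, act)) + b2


def attend(H: List[Vector], P: List[Vector], W_H: Matrix, W_p: Matrix,
           b1: Vector, v: Vector, b2: float) -> Tuple[Vector, Vector]:
    """Return the attention weights and the weighted sum of hidden vectors."""
    scores = [score(h, p, W_H, W_p, b1, v, b2) for h, p in zip(H, P)]
    m = max(scores)                            # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    r = [sum(a * h[d] for a, h in zip(alpha, H)) for d in range(len(H[0]))]
    return alpha, r


# Toy example: two positions; only the second is near a keyword (p = [1.0]).
H = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.0], [1.0]]
alpha, r = attend(H, P, W_H=[[0.0, 0.0], [0.0, 0.0]], W_p=[[1.0], [1.0]],
                  b1=[0.0, 0.0], v=[1.0, 1.0], b2=0.0)
print(alpha, r)
```

In this toy setting the position with the larger influence vector receives the larger weight, which is the intended effect of position-aware attention.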
In another embodiment, shown in fig. 3, a BERT network is used at the character-dimension text feature extraction end. For the text classification task, the Token Embedding layer in BERT requires the input to mark the head of a sentence with [CLS] and the boundary between sentences with [SEP]. The character vectors pass through the Token Embedding, Segment Embedding and Position Embedding layers, and the Segment Embedding and Position Embedding layers take part in the computation with pre-trained model parameters. The result is a character-level feature representation of the microblog text related to food safety public sentiment. In FIG. 3, Tok denotes the individual tokens, E denotes an embedding vector, and $T_i$ denotes the feature vector obtained for the $i$-th token after BERT processing. For text classification in general, BERT takes the final hidden-layer vector $C$ of the first token [CLS], applies a weight layer $W$, and then uses the SoftMax function as the activation function, where $b_c$ is a bias vector, as in formula (8):
$$P = \mathrm{SoftMax}(C W^{T} + b_c) \tag{8}$$
The model is fine-tuned by adjusting the model parameters on the specific task.
In the embodiment shown in fig. 4, the character-vector-level and word-vector-level text vectors are joined at the connection layer, and the probability that a microblog text related to food safety public sentiment is a rumor is finally obtained through the SoftMax function, whose formula is:
$$P(s_i) = \frac{\exp(g_i)}{\sum_{j=1}^{n} \exp(g_j)} \tag{9}$$
The function maps the outputs of the neurons into the interval (0, 1), where $n$ is the number of classes, $i$ is one of the classes indexed by $j$, $g_i$ is the value for class $i$, and $P(s_i)$ is the probability of the $i$-th class.
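The connection layer and the SoftMax classifier can be sketched together as follows. The sketch assumes the fusion is a simple concatenation followed by one linear layer, and it uses the standard softmax exp(g_i) / Σ_j exp(g_j); the feature values, the weights and the class order (rumor, non-rumor) are made up for illustration.

```python
import math
from typing import List


def softmax(g: List[float]) -> List[float]:
    """exp(g_i) / sum_j exp(g_j), shifted by max(g) for numerical stability."""
    m = max(g)
    exps = [math.exp(x - m) for x in g]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(char_feat: List[float], word_feat: List[float],
                      W: List[List[float]], b: List[float]) -> List[float]:
    """Concatenate both feature vectors, apply a linear layer, then softmax."""
    fused = char_feat + word_feat  # list concatenation as the connection layer
    logits = [sum(w * x for w, x in zip(row, fused)) + bi
              for row, bi in zip(W, b)]
    return softmax(logits)


# Hypothetical BERT (character) and BLSTM (word) features and a 2-class layer.
probs = fuse_and_classify(char_feat=[0.2, 0.4], word_feat=[0.1],
                          W=[[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]], b=[0.0, 0.0])
print(probs)
```

Because every output lies in (0, 1) and the outputs sum to 1, the larger entry can be read directly as the predicted class.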
Although illustrative embodiments of the present invention have been described above to facilitate the understanding of the present invention by those skilled in the art, it should be understood that the present invention is not limited to the scope of the embodiments, but various changes may be apparent to those skilled in the art, and it is intended that all inventive concepts utilizing the inventive concepts set forth herein be protected without departing from the spirit and scope of the present invention as defined and limited by the appended claims.

Claims (6)

1. A word double-dimensional microblog rumor recognition method for food safety public sentiments is characterized by comprising the following steps:
step 1, preprocessing original text data acquired from the Internet by a web crawler, wherein the preprocessing comprises removing special symbols and stop words contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of an open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory (BLSTM) network fused with a position-aware attention mechanism as the neural network model end for obtaining word-vector-dimension text features, specifically: first, determining the semantic roles and positions of domain keywords with the domain word library constructed in step 2 to generate position-aware attention; then inputting the word vectors generated by word embedding of the text corpus into the BLSTM model so that the word vectors take part in the computation of the intermediate hidden layer, and further computing the vectors output by the hidden layer under the influence of the attention mechanism to obtain word-level text semantic features;
step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural network model end for obtaining character-vector-dimension text features, wherein the BERT model converts each character in the text into a vector by looking up a character vector table and uses it as model input; the model output is the vector representation of each input character after full-text semantic information has been fused in;
step 5, using SoftMax as the classifier: after the corpus has been processed by the two-way neural network of BERT and BLSTM, merging at a connection layer the word-dimension text feature information obtained in step 3 and the character-dimension text feature information obtained in step 4, and then inputting the merged information into the classifier for classification to obtain the final rumor classification result.
2. The word two-dimensional microblog rumor identification method for food safety public opinion of claim 1, wherein the method comprises the following steps: in the step 2, on the basis of an open domain word embedding resource library, a word embedding resource library in the food safety domain is constructed by combining a skip-gram model and word semantic expression, corpus expansion is performed on the basis, the published encyclopedia corpus is added, vocabulary encyclopedia and news corpus in the food field are crawled from a network, word vector model training is performed, and after a period of time, when certain food safety public opinion corpus is accumulated, incremental training is performed on the word vector model.
3. The word two-dimensional microblog rumor identification method for food safety public opinion of claim 1, wherein the method comprises the following steps: in the step 3, a bidirectional long-time and short-time memory network model based on a fusion position perception attention mechanism is trained to serve as a word dimension text feature extraction model, microblog text corpora are converted into vector representations to serve as input of a network, a neural network model is trained, one of two-way network models forming an integral model is built by using the bidirectional long-time and short-time memory network of the fusion position perception mechanism, and a word dimension text feature vector representation is obtained through training of the existing microblog text corpora.
4. The word two-dimensional microblog rumor identification method for food safety public opinion of claim 1, wherein the method comprises the following steps: in step 4, the BERT network model is trained as a word dimension text feature extraction model, and the model input includes two parts except a word vector (Token Embedding), one of which is segmentation Embedding (Segment Embedding): the value of the vector is automatically learned in the model training process, is used for depicting the global semantic information of the text and is fused with the semantic information of the single character; the second is Position Embedding (Position Embedding): because semantic information carried by words appearing at different positions of a text is different, the BERT model adds different vectors to the words at different positions respectively for distinguishing; and finally, the BERT model takes the sum of Token Embedding, Segment Embedding and Position Embedding as a sentence vector to obtain one of two-way network outputs of the overall model, namely character dimension text feature vector representation.
5. The word double-dimension microblog rumor identification method for food safety public opinion of claim 1, wherein the BERT network is used as a pre-trained model; in the text classification task, the Token Embedding layer of the BERT network marks the start of the input sentence with [CLS] and the boundaries between multiple sentences with [SEP], and the Segment Embedding and Position Embedding layers use the pre-trained model parameters in the computation.
6. The word double-dimension microblog rumor identification method for food safety public opinion of claim 1, wherein in step 5, two neural network models are trained: the bidirectional long short-term memory network model with the position-aware attention mechanism, which extracts the word-dimension text feature vectors, and the BERT model, which extracts the character-dimension text feature vectors; at the start of training the weights are randomly initialized; after the neural network computation produces the results of the two branches, the two results are joined by a connection layer, and a SoftMax function, used as the loss function, converts the numerical output of the neural network into class probability output; to avoid overfitting during training, Dropout with a set probability is applied, that is, part of the weights or hidden-layer outputs are randomly set to zero during model training, which reduces the interdependence among nodes and improves the generalization of the model.
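The fusion step described in claim 6 can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`dropout`, `softmax`, `fuse_and_classify`), the tiny feature vectors, the two-class output, and the single linear layer after the connection layer are all assumptions made for illustration. The dropout shown is the common "inverted" variant that rescales the surviving values, which the claim does not specify.

```python
import math
import random


def dropout(vec, p, rng, training=True):
    """Inverted dropout (assumed variant): zero each value with probability p
    and rescale the survivors by 1 / (1 - p). A no-op outside training."""
    if not training or p == 0.0:
        return list(vec)
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in vec]


def softmax(logits):
    """Numerically stable softmax: turns raw scores into class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(word_feat, char_feat, weights, bias, p_drop, rng, training=True):
    """Concatenate the two branch outputs, apply dropout, then a linear layer
    (hypothetical classifier head) followed by softmax."""
    fused = list(word_feat) + list(char_feat)  # connection (concatenation) layer
    fused = dropout(fused, p_drop, rng, training)
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


rng = random.Random(0)
word_feat = [0.2, -0.1, 0.4]  # stand-in for the BiLSTM (word-dimension) features
char_feat = [0.3, 0.5]        # stand-in for the BERT (character-dimension) features
n_classes = 2                 # assumed: rumor / non-rumor
dim = len(word_feat) + len(char_feat)

# Random weight initialisation, as the claim describes for the start of training.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_classes)]
bias = [0.0] * n_classes

# Inference-mode call: dropout is disabled, so the output is deterministic.
probs = fuse_and_classify(word_feat, char_feat, weights, bias,
                          p_drop=0.5, rng=rng, training=False)
print(probs)
```

Calling `fuse_and_classify` with `training=True` applies the random zeroing, so repeated calls give different probabilities; with `training=False` the concatenated vector passes through unchanged.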
CN202110050517.1A 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion Active CN112766359B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110050517.1A CN112766359B (en) 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion

Publications (2)

Publication Number Publication Date
CN112766359A (en) 2021-05-07
CN112766359B (en) 2023-07-25

Family

ID=75700739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110050517.1A Active CN112766359B (en) 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion

Country Status (1)

Country Link
CN (1) CN112766359B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090046A (en) * 2017-12-29 2018-05-29 武汉大学 A kind of microblogging rumour recognition methods based on LDA and random forest
CN108614855A (en) * 2018-03-19 2018-10-02 众安信息技术服务有限公司 A kind of rumour recognition methods
US20200321002A1 (en) * 2019-04-05 2020-10-08 Samsung Electronics Co., Ltd. System and method for context-enriched attentive memory network with global and local encoding for dialogue breakdown detection
CN112069397A (en) * 2020-08-21 2020-12-11 三峡大学 Rumor detection method combining self-attention mechanism with generation of confrontation network
CN112200197A (en) * 2020-11-10 2021-01-08 天津大学 Rumor detection method based on deep learning and multi-mode
US20210012199A1 (en) * 2019-07-04 2021-01-14 Zhejiang University Address information feature extraction method based on deep neural network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
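谌志群; 鞠婷: "Research on tendency analysis of microblog comments based on BERT and bidirectional LSTM" (基于BERT和双向LSTM的微博评论倾向性分析研究), 情报理论与实践 (Information Studies: Theory & Application), no. 08 *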

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378024A (en) * 2021-05-24 2021-09-10 哈尔滨工业大学 Deep learning-based public inspection field-oriented related event identification method
CN113378024B (en) * 2021-05-24 2023-09-01 哈尔滨工业大学 Deep learning-oriented public inspection method field-based related event identification method
CN113592338A (en) * 2021-08-09 2021-11-02 新疆大学 Food quality management safety risk pre-screening model
CN113592338B (en) * 2021-08-09 2023-09-12 新疆大学 Food quality management safety risk pre-screening model
CN113946680A (en) * 2021-10-20 2022-01-18 河南师范大学 Online network rumor identification method based on graph embedding and information flow analysis
CN113946680B (en) * 2021-10-20 2024-04-16 河南师范大学 Online network rumor identification method based on graph embedding and information flow analysis
CN115082947A (en) * 2022-07-12 2022-09-20 江苏楚淮软件科技开发有限公司 Paper letter rapid collecting, sorting and reading system
CN115082947B (en) * 2022-07-12 2023-08-15 江苏楚淮软件科技开发有限公司 Paper letter quick collecting, sorting and reading system

Also Published As

Publication number Publication date
CN112766359B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
US11631007B2 (en) Method and device for text-enhanced knowledge graph joint representation learning
CN107578106B (en) Neural network natural language reasoning method fusing word semantic knowledge
Liu et al. Review of intent detection methods in the human-machine dialogue system
CN112766359B (en) Word double-dimension microblog rumor identification method for food safety public opinion
Sivakumar et al. Review on word2vec word embedding neural net
CN112487203A (en) Relation extraction system integrated with dynamic word vectors
CN110929030A (en) Text abstract and emotion classification combined training method
CN103226580A (en) Interactive-text-oriented topic detection method
Cai et al. Intelligent question answering in restricted domains using deep learning and question pair matching
Wahid et al. Topic2Labels: A framework to annotate and classify the social media data through LDA topics and deep learning models for crisis response
CN109543017A (en) Legal issue keyword generation method and its system
CN108874896A (en) A kind of humorous recognition methods based on neural network and humorous feature
CN113515632B (en) Text classification method based on graph path knowledge extraction
Zhou Research on sentiment analysis model of short text based on deep learning
CN115238691A (en) Knowledge fusion based embedded multi-intention recognition and slot filling model
Kang et al. Utilization strategy of user engagements in korean fake news detection
CN116522165B (en) Public opinion text matching system and method based on twin structure
Sun et al. A hybrid approach to news recommendation based on knowledge graph and long short-term user preferences
Chowanda et al. Generative Indonesian conversation model using recurrent neural network with attention mechanism
Zhang et al. Combining the attention network and semantic representation for Chinese verb metaphor identification
CN116757218A (en) Short text event coreference resolution method based on sentence relation prediction
CN116595166A (en) Dual-channel short text classification method and system combining feature improvement and expansion
Cai et al. Multi-view and attention-based bi-lstm for weibo emotion recognition
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN113869040A (en) Voice recognition method for power grid dispatching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant