CN112766359B - Word double-dimension microblog rumor identification method for food safety public opinion - Google Patents

Word double-dimension microblog rumor identification method for food safety public opinion

Info

Publication number
CN112766359B
CN112766359B CN202110050517.1A
Authority
CN
China
Prior art keywords
word
model
text
dimension
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110050517.1A
Other languages
Chinese (zh)
Other versions
CN112766359A (en
Inventor
左敏
何思宇
张青川
颜文婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University filed Critical Beijing Technology and Business University
Priority to CN202110050517.1A priority Critical patent/CN112766359B/en
Publication of CN112766359A publication Critical patent/CN112766359A/en
Application granted granted Critical
Publication of CN112766359B publication Critical patent/CN112766359B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention relates to a character-word dual-dimension microblog rumor identification method for food safety public opinion, which comprises the following steps: preprocessing data crawled from the Internet; constructing a word embedding library for the food safety field by combining an open-domain word embedding library; crawling multi-level Baidu Baike encyclopedia corpus to perform incremental training of the embedding library; extracting character-dimension text features with a BERT network; extracting word-dimension text features with a BLSTM network augmented by a position attention mechanism; finally obtaining character-word dual-dimension text feature vectors and classifying whether a microblog text is a rumor or not. The method addresses the heavily colloquial style, weak structure, strong domain specificity and difficult vectorization of microblog text in the field of food safety public opinion, and improves rumor recognition accuracy by constructing a domain word embedding library and a multi-granularity vectorization method that extracts corpus features more fully.

Description

Word double-dimension microblog rumor identification method for food safety public opinion
Technical Field
The invention relates to the field of artificial intelligence, in particular to a character-word dual-dimension microblog rumor identification method for food safety public opinion.
Background
Microblogs are popular with the public because of their convenience, openness, timeliness and anonymity, and more and more people choose microblogs to publish views and share stories. However, because of the low threshold for microblog user registration and the diversity of user groups, the quality of information released on microblogs is difficult to monitor and control, so microblogs have become a hotbed for the propagation of network rumors, which not only seriously interfere with people's lives but also disturb social order.
The food field concerns everyone's daily life, so the influence of food safety related microblog rumors is particularly severe. Establishing a rumor identification model with natural language processing technology is therefore of great significance for identifying food safety microblog rumors.
Text classification recognition is an important and practical research direction for natural language processing. Prior to the advent of deep learning, traditional machine learning methods were applied in the field of text classification, such as naive bayes models and support vector machine models. However, the traditional machine learning model depends on manual corpus labeling, so that a large amount of manpower and material resources are consumed, and the text feature extraction result is not satisfactory.
With the development of deep learning, cloud computing, artificial intelligence and other technologies in recent years, the deep neural network is applied to various fields and achieves good results. In the field of natural language processing, under the condition of large-scale corpus, a multi-level network model realizes automatic mining of text characteristic information, and a deep neural network becomes one of key technologies in the field of natural language processing, and achieves good effects in text semantic classification tasks. The development and the use of long-short-term memory networks and attention mechanisms in the field of natural language processing lay a foundation for the invention.
In addition, in text semantic classification, many researchers have studied whether the character-level and word-level embedding granularities each influence the classification effect. Kim proposed a model that extracts text semantic information through a character-level CNN, and Liu Longfei et al. demonstrated the superiority of character-level feature representation in Chinese text processing.
Because microblog texts are mostly unstructured corpora lacking in specification, vectorization is difficult; extracting text semantic features with the character dimension or the word dimension alone leaves feature extraction incomplete and loses classification precision, and existing language models struggle to process text in the food safety field accurately. Therefore, the invention adopts a character-word dual-dimension neural network model combined with a constructed food-field word embedding library to process microblog text in the food safety field.
Disclosure of Invention
The invention solves the following technical problem: aiming at the current need to identify and supervise food safety related rumors on microblogs, the character-word dual-dimension microblog rumor identification method for food safety public opinion can quickly and accurately identify and judge rumors, greatly improving the working efficiency of supervisors and assisting them in making judgments.
The invention discloses a character-word dual-dimension microblog rumor identification method for food safety public opinion, which comprises the following steps:
step 1, preprocessing original text data acquired from web crawlers on the Internet, wherein the preprocessing comprises removing a large number of special symbols, stop words and the like contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of an open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory network with a fused position-aware attention mechanism as the neural-network end that obtains word-vector-dimension text features. First, the semantic roles and positions of field keywords are determined by combining the field word library constructed in step 2, generating position-aware attention. Then, word vectors generated by word embedding of the text corpus are input into the BLSTM model and participate in the computation of the intermediate hidden layer; the vectors computed by the hidden layer are further computed under the influence of the attention mechanism to obtain word-level text semantic features.
Step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural-network end that obtains character-vector-dimension text features; the BERT model converts each character in the text into a vector by querying a character vector table and uses it as model input; the model output is the vector representation of each input character after fusing full-text semantic information.
And 5, using Softmax as the classifier: the corpus is processed and output through the BERT and BLSTM two-branch neural networks, the word-dimension text feature information obtained in step 3 is combined with the character-dimension text feature information obtained in step 4 in a connection layer, and the result is input into the classifier for classification to obtain the final rumor identification result.
Further, in step 2, on the basis of an open-domain word embedding resource library, a skip-gram model is combined with word semantic representation to construct a word embedding resource library for the food safety field; corpus expansion is performed on this basis by crawling published Baidu Baike encyclopedia entries and news corpus in the food field from the network for word vector model training. Thereafter, at intervals, when a certain amount of food safety public opinion corpus has accumulated, incremental training is performed on the word vector model.
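The skip-gram extension described above can be illustrated with a minimal sketch of how (center, context) training pairs are generated from a tokenized sentence; the tokens and the window size below are hypothetical placeholders, not values taken from the patent:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs for a skip-gram model.

    For each center word, every word within `window` positions on either
    side is emitted as a context word; such pairs would then be used to
    update word vectors, including in later incremental training runs.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical tokenized food-safety sentence (placeholder tokens).
sentence = ["food", "additive", "causes", "illness"]
pairs = skipgram_pairs(sentence, window=1)
```

In an incremental-training setting, newly accumulated public opinion corpus would simply contribute additional pairs on top of the existing model's vocabulary.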
Further, in step 3, a bidirectional long short-term memory network model with a fused position-aware attention mechanism is trained as the word-dimension text feature extraction model. The microblog text corpus is converted into vector representations used as network input; the neural network model is trained, with the position-attention BLSTM forming one of the two branch networks of the overall model; training on the existing microblog text corpus yields this branch's output: the word-dimension text feature vector representation.
Further, in step 4, the BERT network model is trained as the character-dimension text feature extraction model. Besides the character vector (Token Embedding), the model input contains two parts. One is the segmentation embedding (Segment Embedding): the value of this vector is learned automatically during model training and is used to describe the global semantic information of the text and fuse it with the semantic information of individual characters. The other is the position embedding (Position Embedding): since the semantic information carried by characters differs with their positions in the text, the BERT model attaches a different vector to characters at different positions to distinguish them. Finally, the BERT model takes the sum of Token Embedding, Segment Embedding and Position Embedding as the sentence vector, obtaining one of the two branch outputs of the overall model: the character-dimension text feature vector representation.
Further, the BERT network is used as a pre-training model; in the text classification task, the Token Embedding layer in the BERT network marks the head of each input sentence with [CLS] and places [SEP] marks between multiple sentences. The Segment Embedding and Position Embedding layers participate in the computation using pre-trained model parameters.
Further, in step 5, two neural network models are trained: the position-attention BLSTM model for extracting word-dimension text feature vectors, and the BERT model for extracting character-dimension text feature vectors. At the start of training, weights are randomly initialized; after the two branch results are obtained through neural network computation, they are connected through a connection layer, and a Softmax function converts the numerical output of the neural network into classified probability output on which the loss is computed. To avoid overfitting during training, dropout with a certain probability is applied, i.e., part of the hidden-layer weights or outputs are randomly zeroed during model training, reducing interdependence among nodes and improving model generalization.
Compared with the prior art, the invention has the following advantages. Through a character-word two-branch text semantic classification model based on a BLSTM network with a fused position-aware attention mechanism and a BERT network, whether food safety related microblogs are rumors can be judged rapidly. For rumor identification in the food safety public opinion field, a more comprehensive and specific food safety field embedding resource library is built, both character-level and word-level embedding granularities are used as model input, and the feature extraction results of the two branch networks are combined to classify the text. The proposed model makes full use of the characteristics of the BLSTM, mining text semantic features at the word-vector level; combined with the position attention mechanism, detailed feature information in the microblog text is acquired through BLSTM training, and the position attention computation lets words related to the food safety field play a decisive role in the whole text. Meanwhile, the BERT network further mines text semantics at the character-vector level, avoiding the loss of classification accuracy caused by incomplete feature extraction on unstructured, poorly specified text corpora, and effectively improving the text semantic classification effect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flow chart of the character-word dual-dimension microblog rumor identification method for food safety public opinion provided by the embodiment of the invention;
FIG. 2 is a schematic diagram of the bidirectional long short-term memory network with fused position attention mechanism at the word-vector end;
FIG. 3 is a diagram of the BERT network at the character-vector end;
fig. 4 is a schematic diagram of a connection layer network.
Detailed Description
The technical solutions of the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, embodiments of the present invention; all other embodiments obtained by those skilled in the art without inventive effort based on these embodiments fall within the scope of protection of the present invention.
As shown in fig. 1, the invention provides a character-word dual-dimension microblog rumor identification method for food safety public opinion, comprising the following steps:
step 1, preprocessing original text data acquired from web crawlers on the Internet, wherein the preprocessing comprises removing a large number of special symbols, stop words and the like contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of an open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory network with a fused position-aware attention mechanism as the neural-network end that obtains word-vector-dimension text features: first, the semantic roles and positions of field keywords are determined by combining the field word library constructed in step 2, generating position-aware attention; then, word vectors generated by word embedding of the text corpus are input into the BLSTM model and participate in the computation of the intermediate hidden layer; the vectors computed by the hidden layer are further computed under the influence of the attention mechanism to obtain word-level text semantic features;
step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural-network end that obtains character-vector-dimension text features; the BERT model converts each character in the text into a vector by querying a character vector table and uses it as model input; the model output is the vector representation of each input character after fusing full-text semantic information;
and step 5, using Softmax as the classifier: the corpus is processed and output through the BERT and BLSTM two-branch neural networks, the word-dimension text feature information obtained in step 3 is combined with the character-dimension text feature information obtained in step 4 in a connection layer, and the result is input into the classifier for classification to obtain the final rumor identification result.
Referring to fig. 1, which shows an overall schematic diagram of the method: the crawled food safety public opinion microblog data is preprocessed; an open-domain word embedding resource library is combined to construct a word embedding resource library for the food safety field; then multi-level Baidu Baike encyclopedia corpus is crawled to perform incremental training of the embedding library; character-dimension text features based on the BERT network and word-dimension text features based on the BLSTM network with an added position attention mechanism are obtained; finally character-word dual-dimension text feature vectors are obtained and classification of whether the microblog text is a rumor is performed.
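A minimal sketch of the preprocessing in step 1; the regular expression and the stop-word list are illustrative assumptions (the patent does not specify them), and Chinese word segmentation is assumed to happen separately:

```python
import re

# Hypothetical stop-word list; a real system would load a full Chinese list.
STOP_WORDS = {"的", "了", "是"}

def preprocess(text):
    """Strip special symbols, then drop stop words from the remaining tokens."""
    # Keep word characters (including CJK); every other run becomes a space.
    cleaned = re.sub(r"[^\w\u4e00-\u9fff]+", " ", text)
    tokens = cleaned.split()
    return [t for t in tokens if t not in STOP_WORDS]
```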
In the embodiment shown in fig. 2, the model first generates position-aware attention by determining the semantic roles and positions of field keywords in conjunction with the field word library. Word embedding is performed on the microblog text to generate word-dimension vectors; the word vectors are input into the bidirectional long short-term memory network and computed through the intermediate hidden layer, which outputs hidden-layer vectors; the word-dimension text semantic features are then obtained by computation with the position attention.
For the problem of identifying microblog rumors in the food safety public opinion field studied here, keywords of the food safety field are very important, and the words adjacent to keywords also have a non-negligible effect. This is because, in a text classification task, each word in the text influences the final classification result differently, and increasing attention to keywords lets them exert their full effect. Therefore, the invention locates keywords according to the field word library so that the model learns more position information, and a position-based attention mechanism is introduced into the model. Assume that the influence of a keyword at a particular distance, in each hidden-layer dimension, follows a Gaussian distribution. A basis matrix K of influences is defined, each column of which is an influence basis vector corresponding to a specific distance. K is defined as formula (1):
K(i,u)~N(Kernel(u),σ) (1)
where K(i, u) represents the influence, in dimension i, of a food safety field keyword at distance u, and N denotes a normal distribution with mean Kernel(u) and standard deviation σ. Kernel(u) is a Gaussian kernel function used to model position-aware influence propagation, defined as formula (2):

Kernel(u) = exp(−u² / (2σ²)) (2)
when u=0, the current word is a keyword in the food safety domain, and the obtained propagation influence is maximum, and the propagation influence is weakened along with the increase of the distance.
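Under the Gaussian-distribution assumption above, the kernel and the influence basis matrix K of formula (1) could be sketched as follows; the value of σ and the matrix dimensions are hypothetical settings:

```python
import math
import random

SIGMA = 1.0  # standard deviation; a hypothetical setting

def kernel(u, sigma=SIGMA):
    """Gaussian kernel: influence is maximal at distance 0 and decays with |u|."""
    return math.exp(-(u * u) / (2 * sigma * sigma))

def influence_basis(dims, max_dist, sigma=SIGMA):
    """Basis matrix K with K[i][u] ~ N(kernel(u), sigma):
    one row per hidden dimension, one column per keyword distance."""
    return [[random.gauss(kernel(u), sigma) for u in range(max_dist + 1)]
            for _ in range(dims)]
```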
And obtaining the influence vector of the keywords at each specific position by using the influence basic matrix and according to the position relation of the keywords in the food safety field through cumulative calculation:
p_j = K c_j (3)
where p_j is the cumulative influence vector of the word at position j and c_j is the distance count vector, whose entry c_j(u) records the count of all keywords at distance u from position j; c_j(u) is calculated as follows:
c_j(u) = Σ_{q∈Q} [ (j−u) ∈ pos(q) ] + [ (j+u) ∈ pos(q) ] (4)
where Q is the set of all keywords contained in a food safety public opinion related microblog text, q is one of those keywords, pos(q) is the set of positions at which keyword q appears in the sentence, and [·] is an indicator function equal to 1 when its condition is satisfied and 0 otherwise.
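The distance count vector c_j of formula (4) can be computed as in this sketch; the keyword position sets are toy assumptions:

```python
def count_vector(j, keyword_positions, max_dist):
    """c_j(u): number of keyword occurrences at distance u from position j,
    counting both the left (j - u) and right (j + u) sides.
    Note: at u = 0 both terms refer to the same position, as in formula (4)."""
    c = [0] * (max_dist + 1)
    for pos_set in keyword_positions:   # one position set per keyword q in Q
        for u in range(max_dist + 1):
            if (j - u) in pos_set:
                c[u] += 1
            if (j + u) in pos_set:
                c[u] += 1
    return c
```

For example, with a single keyword occurring at positions {0, 4}, the word at position 2 sees two keywords at distance 2 and none closer.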
The attention weight of the word at position j in a food safety public opinion related microblog text is computed as formula (5):

α_j = exp(a(h_j, p_j)) / Σ_{k=1}^{len} exp(a(h_k, p_k)) (5)
in the formula, h j Is the hidden layer vector of the j-position word, p j The method is an accumulated position perception influence vector, len is the number of word vectors in a sentence of food safety public opinion related microblog text, and a (-) is the importance of words used for measuring based on hidden layer vectors and position perception influence vectors. The specific form of a (.) is as shown in formula (6):
a(h_j, p_j) = vᵀ tanh(W_H h_j + W_p p_j + b_1) + b_2 (6)

where W_H and W_p are the weight matrices of h_j and p_j, b_1 is the bias vector belonging to the first-layer parameters, v is a global vector, and b_2 is the bias vector belonging to the second-layer parameters. After the weight of the word at each position is calculated, all hidden-layer vectors in the sentence are weighted to obtain the final attention value:

Value = Σ_{j=1}^{len} α_j h_j (7)
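The attention weighting pipeline, a softmax over the importance scores followed by a weighted sum of the hidden vectors, can be sketched as follows; the score list stands in for the values of a(h_j, p_j) and all numbers are illustrative:

```python
import math

def attention_value(scores, hidden):
    """Softmax the importance scores, then return the weighted sum of the
    hidden-layer vectors (the final attention value)."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]        # attention weights, summing to 1
    dim = len(hidden[0])
    return [sum(a * h[d] for a, h in zip(alphas, hidden)) for d in range(dim)]

# Toy example: two words with 2-dimensional hidden vectors and equal scores.
value = attention_value([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```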
in another embodiment shown in FIG. 3, a BERT network is employed at the word dimension text feature extraction end, and for text classification tasks, the Token Embedding layer in BERT is labeled [ CLS ] for the input requirement sentence header]Inter-sentence label [ SEP ]]. The word vectors are calculated through Token embedded-bands, segment Embeddings, position Embedding, segment Embedding, and Position Embedding, respectively, using pre-trained model parameters. Finally obtaining the character-level food safety public opinion related microblog textAnd (5) sign representation. In FIG. 3, tok represents a different Token, E represents an embedded vector, T i Representing the feature vector of the i-th Token after the BERT processing. For text classification in general, BERT takes the first [ CLS directly]Adding a layer of weight W to the final hidden layer vector C in the model, and then using a softMax function as an activation function, b c Is a bias vector, as in equation (8):
P=SoftMax(CW T +b c ) (8)
model fine tuning is achieved by adjusting model parameters in specific tasks.
In the embodiment shown in fig. 4, the obtained word-vector-level and character-vector-level text vectors are connected at the connection layer, and finally the probability of whether the food safety public opinion related microblog text is a rumor is obtained through the Softmax function, defined as formula (9):

P(s_i) = e^{g_i} / Σ_{k=1}^{n} e^{g_k} (9)

The function maps the outputs of the neurons into the interval (0, 1), where n represents the number of classes, i indexes a class, g_i represents the value for class i, and P(s_i) represents the probability of the i-th class.
While the foregoing describes illustrative embodiments to facilitate understanding of the present invention by those skilled in the art, it should be understood that the invention is not limited in scope to those embodiments; all changes that are apparent to those skilled in the art fall within the scope of the invention as defined by the appended claims.

Claims (4)

1. A character-word dual-dimension microblog rumor identification method for food safety public opinion, characterized by comprising the following steps:
step 1, preprocessing original text data acquired from web crawlers on the Internet, wherein the preprocessing comprises the steps of removing special symbols and stop words contained in the original text data;
step 2, constructing a word embedding resource library for the food safety field on the basis of an open-domain word embedding resource library, and performing incremental training;
step 3, constructing a bidirectional long short-term memory network with a fused position-aware attention mechanism as the neural-network end that obtains word-vector-dimension text features, specifically realized as follows: first, the semantic roles and positions of field keywords are determined by combining the field word library constructed in step 2, generating position-aware attention; word vectors generated by word embedding of the text corpus are input into the BLSTM model and participate in the computation of the intermediate hidden layer; the vectors computed by the hidden layer are further computed under the influence of the attention mechanism to obtain word-level text semantic features;
step 4, independently of the BLSTM model constructed in step 3, constructing a BERT neural network model as the neural-network end that obtains character-vector-dimension text features; the BERT model converts each character in the text into a vector by querying a character vector table and uses it as model input; the model output is the vector representation of each input character after fusing full-text semantic information;
step 5, using Softmax as the classifier: the corpus is processed and output through the BERT and BLSTM two-branch neural networks, the word-dimension text feature information obtained in step 3 is combined with the character-dimension text feature information obtained in step 4 in a connection layer, and the result is input into the classifier for classification to obtain the final rumor identification result;
in step 3, a bidirectional long short-term memory network model with a fused position-aware attention mechanism is trained as the word-dimension text feature extraction model; the microblog text corpus is converted into vector representations used as network input; the neural network model is trained, with the position-attention BLSTM forming one of the two branch networks of the overall model; training on the existing microblog text corpus yields this branch's output, namely the word-dimension text feature vector representation;
in step 4, the BERT network model is trained as the character-dimension text feature extraction model; besides the character vector (Token Embedding), the model input contains two parts: first, the segmentation embedding (Segment Embedding), whose value is learned automatically during model training and is used to describe the global semantic information of the text and fuse it with the semantic information of individual characters; second, the position embedding (Position Embedding): because the semantic information carried by characters differs with their positions in the text, the BERT model attaches a different vector to characters at different positions to distinguish them; finally, the BERT model takes the sum of Token Embedding, Segment Embedding and Position Embedding as the sentence vector, obtaining one of the two branch outputs of the overall model, namely the character-dimension text feature vector representation.
2. The character-word dual-dimension microblog rumor identification method for food safety public opinion according to claim 1, characterized in that: in step 2, on the basis of an open-domain word embedding library, a skip-gram model is combined with word semantic representation to construct a word embedding library for the food safety field; corpus expansion is performed on this basis by crawling published Baidu Baike encyclopedia entries and news corpus in the food safety field from the network for word vector model training; thereafter, at intervals, when a certain amount of food safety public opinion corpus has accumulated, incremental training is performed on the word vector model.
3. The character-word dual-dimension microblog rumor identification method for food safety public opinion according to claim 1, characterized in that: the BERT network is used as a pre-training model; in the text classification task, the Token Embedding layer in the BERT network marks the head of each input sentence with [CLS] and places [SEP] marks between multiple sentences, and the Segment Embedding and Position Embedding layers participate in the computation using pre-trained model parameters.
4. The method for identifying word bi-dimensional microblog rumors for food safety public opinion according to claim 1, characterized in that: in the step 5, two neural network models are trained, namely a bidirectional long short-term memory network model with a fused position-aware attention mechanism for extracting the word-dimension text feature vector, and a BERT model for extracting the character-dimension text feature vector. When training starts, the weights are randomly initialized; after the results of the two parallel networks are obtained through neural network calculation, they are joined through a connecting layer, and the softmax function converts the numerical output of the neural network into a classification probability output for computing the loss. To avoid overfitting during training, dropout with a certain probability is set, i.e., part of the weights or outputs of the hidden layers are randomly zeroed during model training, which reduces the interdependence among nodes and improves model generalization.
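The fusion, dropout and softmax steps in this claim can be sketched numerically. The feature sizes, random feature vectors and two-class output here are illustrative assumptions, not the patented network:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    """Convert raw scores into a probability distribution (numerically stable)."""
    e = np.exp(z - z.max())
    return e / e.sum()

def dropout(x, p=0.5):
    """Inverted dropout: zero each element with probability p, rescale survivors."""
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

# Hypothetical feature vectors from the two branches (sizes are illustrative).
word_feat = rng.normal(size=64)   # word dimension: BiLSTM + position-aware attention
char_feat = rng.normal(size=64)   # character dimension: BERT
fused = np.concatenate([word_feat, char_feat])   # connecting-layer input

hidden = dropout(fused, p=0.5)                   # applied during training only
W, b = rng.normal(size=(2, 128)), np.zeros(2)    # randomly initialized weights
probs = softmax(W @ hidden + b)                  # e.g. P(rumor), P(non-rumor)
print(probs)
```

At inference time the dropout step is skipped (or equivalently applied with p = 0), so all nodes contribute to the prediction.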
CN202110050517.1A 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion Active CN112766359B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110050517.1A CN112766359B (en) 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion


Publications (2)

Publication Number Publication Date
CN112766359A CN112766359A (en) 2021-05-07
CN112766359B true CN112766359B (en) 2023-07-25

Family

ID=75700739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110050517.1A Active CN112766359B (en) 2021-01-14 2021-01-14 Word double-dimension microblog rumor identification method for food safety public opinion

Country Status (1)

Country Link
CN (1) CN112766359B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378024B (en) * 2021-05-24 2023-09-01 哈尔滨工业大学 Deep learning-oriented public inspection method field-based related event identification method
CN113592338B (en) * 2021-08-09 2023-09-12 新疆大学 Food quality management safety risk pre-screening model
CN113946680B (en) * 2021-10-20 2024-04-16 河南师范大学 Online network rumor identification method based on graph embedding and information flow analysis
CN115082947B (en) * 2022-07-12 2023-08-15 江苏楚淮软件科技开发有限公司 Paper letter quick collecting, sorting and reading system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090046A (en) * 2017-12-29 2018-05-29 武汉大学 A microblog rumor identification method based on LDA and random forest
CN108614855A (en) * 2018-03-19 2018-10-02 众安信息技术服务有限公司 A rumor identification method
CN112069397A (en) * 2020-08-21 2020-12-11 三峡大学 Rumor detection method combining a self-attention mechanism with a generative adversarial network
CN112200197A (en) * 2020-11-10 2021-01-08 天津大学 Rumor detection method based on deep learning and multi-modality

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11580970B2 (en) * 2019-04-05 2023-02-14 Samsung Electronics Co., Ltd. System and method for context-enriched attentive memory network with global and local encoding for dialogue breakdown detection
CN110377686B (en) * 2019-07-04 2021-09-17 浙江大学 Address information feature extraction method based on deep neural network model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于BERT和双向LSTM的微博评论倾向性分析研究 [Research on sentiment analysis of microblog comments based on BERT and bidirectional LSTM]; 谌志群; 鞠婷; 情报理论与实践 (Issue 08); full text *

Also Published As

Publication number Publication date
CN112766359A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN112766359B (en) Word double-dimension microblog rumor identification method for food safety public opinion
CN109871451B (en) Method and system for extracting relation of dynamic word vectors
CN110633409B (en) Automobile news event extraction method integrating rules and deep learning
CN108959270A (en) A kind of entity link method based on deep learning
CN103226580A (en) Interactive-text-oriented topic detection method
CN111639252A (en) False news identification method based on news-comment relevance analysis
CN113360582B (en) Relation classification method and system based on BERT model fusion multi-entity information
CN111914553B (en) Financial information negative main body judging method based on machine learning
Zhang et al. A BERT fine-tuning model for targeted sentiment analysis of Chinese online course reviews
Chen et al. Clause sentiment identification based on convolutional neural network with context embedding
CN114490991A (en) Dialog structure perception dialog method and system based on fine-grained local information enhancement
Zhi et al. Financial fake news detection with multi fact CNN-LSTM model
Sadr et al. Unified topic-based semantic models: A study in computing the semantic relatedness of geographic terms
El Desouki et al. Exploring the recent trends of paraphrase detection
CN113051922A (en) Triple extraction method and system based on deep learning
Yong et al. A new emotion analysis fusion and complementary model based on online food reviews
CN114398900A (en) Long text semantic similarity calculation method based on RoBERTA model
CN116304064A (en) Text classification method based on extraction
KR20210053539A (en) Apparatus and method for estimation of patent novelty
CN115906816A (en) Text emotion analysis method of two-channel Attention model based on Bert
CN113408289B (en) Multi-feature fusion supply chain management entity knowledge extraction method and system
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN113204971B (en) Scene self-adaptive Attention multi-intention recognition method based on deep learning
Shan Social network text sentiment analysis method based on CNN-BiGRU in big data environment
Chen Semantic matching efficiency of supply and demand text on cross-border E-commerce online technology trading platforms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant