CN113343118A - Hot event discovery method under mixed new media - Google Patents

Publication number
CN113343118A
Authority
CN
China
Prior art keywords
event
topic
topics
modeling
news
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110444596.4A
Other languages
Chinese (zh)
Inventor
曹玖新
洪智高
刘佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202110444596.4A
Publication of CN113343118A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for discovering hot events in a mixed new media environment, comprising the following steps: first, online news portal data from a specific time period are segmented into words and divided into time slices, and topic events are discovered and mined with a probabilistic topic model; next, using each event's topic, keywords, named entities and similar information, social posts related to the event and the associated user behavior and relationship data are retrieved from social network media; finally, whether the event is a hot event is judged from the volume of reports on it in news portals and the scale of its propagation in the social network. The results of the method provide important support for practical applications such as network event retrieval, online public opinion monitoring, emergency detection and related security decision-making.

Description

Hot event discovery method under mixed new media
Technical Field
The invention relates to a method for discovering social hot events in a mixed media environment, belonging to the technical field of internet monitoring.
Background
Currently, social networks (such as Weibo and WeChat) are the most active new social media, with the richest content and the widest influence among users, and together with the various online news portals they form a mixed online new media environment. Some social events first become known through news portal reports and are then reposted and amplified across social media, provoking heated discussion among netizens and contests of online public opinion, and finally become social hot events on the Internet.
The invention constructs a mixed new media environment by jointly considering the functions of, and the interaction between, new social media and news portals on the Internet. On this basis, event topics are discovered by mining news portals, event-oriented news corpus data and social media data are acquired, and social hot events are identified, helping people understand and grasp the current state and future development trend of social hot events in the network environment. The results of the invention provide important support for practical applications such as network event retrieval, online public opinion monitoring, emergency detection and related security decision-making.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a model that can effectively extract the latent topic information in documents and judge whether a topic is a hot event.
To solve this technical problem, the technical scheme adopted by the invention is as follows: after the data are preprocessed, the documents are vectorized and modeled with a neural topic model, and the topics obtained by modeling are then merged.
To achieve this purpose, the technical scheme of the invention is a method for discovering hot events under mixed new media. The method jointly considers the functions of, and the relations between, new social media and news portals on the Internet, constructs a mixed new media environment, acquires news corpus data and social media data oriented to social hot events, and discovers topics from the mixed media data, thereby helping people understand and grasp the current state and future development trend of social hot events in the network environment. The method comprises the following steps:
step 1) preprocessing the collected news data, including removing useless information such as hyperlinks, stop words, punctuation marks and digits, and performing word segmentation with the HanLP natural language processing tool;
step 2) distributing the documents into time slices in chronological order, with a time interval of 1 day to facilitate subsequent evolution analysis; for every event, the documents within 30 days of its occurrence are examined, i.e., 30 time slices;
step 3) text vectorization: representing the text with document representations pre-trained by BERT to improve topic coherence;
step 4) topic modeling: performing topic modeling with a neural topic model in which the input bag-of-words representation is replaced by contextual embeddings;
step 5) merging the topics obtained by the modeling in step 4);
step 6) after event detection on the news portals is complete, associating each event with its microblog content in the social network and with the social relationships of the users involved;
step 7) judging, according to a given criterion, that an event is a hot event when a given threshold is exceeded.
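The cleaning part of step 1) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the stop-word set and the `whitespace_segment` function are hypothetical stand-ins (the patent uses HanLP for Chinese word segmentation, which would be passed in through the `segment` argument), and the regular expressions are one plausible choice.

```python
import re

# Hypothetical, tiny stop-word set; a real pipeline would load a full list.
STOP_WORDS = {"the", "a", "of", "and", "in"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Runs of non-word characters, digits and underscores (punctuation, numbers).
NOISE_RE = re.compile(r"[\W\d_]+", re.UNICODE)

def whitespace_segment(text):
    """Stand-in segmenter (assumption); replace with a HanLP-based tokenizer."""
    return text.split()

def preprocess(text, segment=whitespace_segment, stop_words=STOP_WORDS):
    """Remove hyperlinks, punctuation, digits and stop words, then segment."""
    text = URL_RE.sub(" ", text)      # strip hyperlinks first
    text = NOISE_RE.sub(" ", text)    # strip punctuation and digits
    tokens = segment(text.lower())
    return [t for t in tokens if t not in stop_words]

tokens = preprocess("The flood of 2021: see https://example.com/news for details!")
```

Hyperlinks are removed before punctuation so that URL fragments are not left behind as spurious tokens.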
The division of time slices in step 2) has an important influence on how the evolution of an event over a period of time and the change of its heat are handled. In the invention the window is fixed at 30 days, and it can also be set adaptively according to the time span of the crawled news content.
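A minimal sketch of the daily slicing in step 2), assuming each document is available as a `(publish_date, text)` pair and that the event's start date is known; the function name `slice_by_day` and this input format are illustrative assumptions, not part of the patent.

```python
from collections import defaultdict
from datetime import date

def slice_by_day(docs, start, num_slices=30):
    """Group (publish_date, text) pairs into 1-day slices counted from `start`.

    Documents outside the window [start, start + num_slices days) are dropped.
    """
    slices = defaultdict(list)
    for published, text in docs:
        index = (published - start).days   # slice 0 is the day the event starts
        if 0 <= index < num_slices:
            slices[index].append(text)
    return dict(slices)

docs = [
    (date(2021, 4, 1), "first report"),
    (date(2021, 4, 1), "second report"),
    (date(2021, 4, 3), "follow-up"),
    (date(2021, 5, 1), "too late"),      # day 30, outside the 30-slice window
]
slices = slice_by_day(docs, start=date(2021, 4, 1))
```

Setting `num_slices` from the span of the crawled data, rather than leaving it at 30, corresponds to the adaptive setting mentioned in the text.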
The text vectorization in step 3) replaces the bag-of-words representation fed to the topic model with contextual embeddings; that is, a neural encoding layer that produces document representations from a pre-trained BERT language model is introduced before the topic modeling process. First, a dictionary of the topic corpus is built by calling the bert_serving package, and a BERT word vector model is trained. Each document then yields a matrix of word vectors, and the matched data are stored for the subsequent topic modeling task.
In step 4), the vectorized text data from step 3) are fed to the topic model as contextual embeddings. The neural topic model used in the invention is a generative model based on the neural variational inference framework and inspired by the variational autoencoder; it uses a Gaussian distribution for its latent variable, whose parameters are obtained by linear computation.
In step 5), the topics obtained by modeling are merged. A threshold ζ is set on the distance between two topics: if the distance between two topics is less than the threshold, they are judged to be the same topic and are merged; otherwise they are different topics and are not merged.
In steps 6) and 7), the microblog platform provides rich topic classification and content tag information. The time, named entity and keyword information obtained during event detection is combined and used to search the microblog platform for content related to the event's key information; the cosine distance between the event's key information and the content, classification and tags of each search result is then calculated to measure the similarity between the event and the microblog, and the event-news-microblog association is established. To identify hot events, the invention takes the social network attributes of the event into account and calculates the heat value of each topic obtained in step 5) using formula (1):
$$H_e = \alpha \frac{N_e}{N} + \beta \frac{S_e}{S} + \gamma \frac{C_e}{C} \qquad (1)$$
where $H_e$ denotes the heat value of event e; $N_e$, $S_e$ and $C_e$ are, respectively, the number of news reports, the number of user reposts and the number of comments for event e; N, S and C are the totals of the corresponding indicators; and α, β, γ are weighting coefficients (e.g., 0.6, 0.2, 0.2) set according to the importance of these factors. When the combined heat value (range [0, 1]) exceeds 0.4, i.e., when event e's share of reports and discussion exceeds 40 percent, event e is judged to be a hot event.
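The heat value can be sketched as a weighted sum of the event's shares of news reports, reposts and comments, assuming formula (1) has that form. The function names and the zero-total guard are illustrative assumptions; the default weights and the 0.4 threshold are the example values given in the text.

```python
def heat_value(n_e, s_e, c_e, n_total, s_total, c_total,
               alpha=0.6, beta=0.2, gamma=0.2):
    """Weighted share of news reports, reposts and comments for one event."""
    def share(part, total):
        # Guard against an empty corpus (assumption: a zero total means zero share).
        return part / total if total else 0.0

    return (alpha * share(n_e, n_total)
            + beta * share(s_e, s_total)
            + gamma * share(c_e, c_total))

def is_hot(heat, threshold=0.4):
    """An event is a hot event when its heat value exceeds the threshold."""
    return heat > threshold

# 60 of 100 reports, 500 of 1000 reposts, 400 of 1000 comments:
# 0.6 * 0.6 + 0.2 * 0.5 + 0.2 * 0.4 = 0.54
h = heat_value(60, 500, 400, 100, 1000, 1000)
```

Because the weights sum to 1 and each share lies in [0, 1], the result stays in [0, 1], which matches the stated range.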
Compared with the prior art, the invention has the following advantages:
1. The invention improves topic modeling under mixed media: it jointly considers the functions of, and relations between, new social media and news portals on the Internet, acquires news corpus data and social media data oriented to social hot events, and discovers current hot topics from the mixed media data.
2. The NTM (neural topic model) is built on the variational autoencoder framework. Because the encoder and decoder of a variational autoencoder can be trained jointly by backpropagation, training the NTM involves a less complex mathematical derivation than a traditional probabilistic model, and the model is easy to extend.
3. The NTM used by the invention takes BERT-derived document representations as input. Its topic modeling part consists of an encoder and a decoder, and the way the NTM generates topics resembles data reconstruction. The bag-of-words representation fed to the topic model is replaced by contextual embeddings, that is, a neural encoding layer producing document representations from the pre-trained BERT language model is introduced before topic modeling, which improves the interpretability and coherence of the topics.
4. The method crawls news reports published by major news media over a period of time for a given keyword, so the evolution of the news over that period can be tracked; the time slices of the news evolution are divided adaptively, and the stage changes of a hot event are judged according to whether topics are merged.
Drawings
Fig. 1 is a flowchart illustrating a hot event determination process according to the present invention.
FIG. 2 is a topic model diagram of the present invention.
Detailed Description
The invention is further illustrated below with reference to specific embodiments. These embodiments are illustrative only and do not limit the scope of the invention; equivalent modifications made by those skilled in the art after reading this disclosure fall within the scope defined by the appended claims.
Example 1: referring to Fig. 1 and Fig. 2, a method for discovering hot events under mixed new media comprises steps 1) to 7) as set out in the Disclosure of Invention above, together with the time-slice setting of step 2), the BERT text vectorization of step 3), the neural topic model of step 4), the threshold-based topic merging of step 5), and the event-news-microblog association and heat value formula (1) of steps 6) and 7).
Application example 1: referring to Fig. 2, the method of the invention for topic modeling of documents based on a neural topic model comprises the following steps:
Step 1. encoding procedure
The encoder estimates, for document d, the Gaussian parameters μ and σ from which the topic distribution θ is later derived:
1) A document representation s is obtained by processing d with BERT.
$$s = \mathrm{BERT}(d) \qquad (1)$$
2) The document representation s is projected to the hidden layer, where it is concatenated with the bag-of-words representation BoW of document d.
$$h = [s, \mathrm{BoW}] \qquad (2)$$
3) μ and log σ, which parameterize the Gaussian distribution (its mean and variance), are obtained from two independent feed-forward layers. Here f(·) denotes a neural perceptron with a ReLU activation function, and the weights $W_1$, $W_2$ and biases $b_1$, $b_2$ are learnable parameters shared across different inputs.
$$\mu = W_1 f(h) + b_1 \qquad (3)$$
$$\log \sigma = W_2 f(h) + b_2 \qquad (4)$$
4) A latent variable $z \sim N(\mu, \sigma^2)$ is sampled, where $N(\mu, \sigma^2)$ is a multivariate Gaussian distribution whose components are mutually independent Gaussian random variables. The latent variable z can be expressed as:
$$z = \mu + \sigma \odot \varepsilon \qquad (5)$$
where ε can be regarded as an auxiliary noise variable, sampled from the normal distribution N(0, I).
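The encoding procedure of equations (2) to (5) can be sketched in plain Python as below. This is a toy illustration under stated assumptions: f(·) is taken to be a single linear layer followed by ReLU, the weight matrices are passed in by hand rather than learned, and the BERT representation s is supplied as a ready-made vector. The names `encode`, `linear`, `W_f` and `b_f` are hypothetical.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, x, b):
    """Compute W x + b for a row-major matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def encode(s, bow, W_f, b_f, W1, b1, W2, b2, rng):
    h = list(s) + list(bow)                  # eq. (2): concatenate s and BoW
    hidden = relu(linear(W_f, h, b_f))       # f(h), assumed linear + ReLU
    mu = linear(W1, hidden, b1)              # eq. (3)
    log_sigma = linear(W2, hidden, b2)       # eq. (4)
    eps = [rng.gauss(0.0, 1.0) for _ in mu]  # noise from N(0, I)
    # eq. (5): reparameterization, z = mu + sigma * eps (element-wise)
    z = [m + math.exp(ls) * e for m, ls, e in zip(mu, log_sigma, eps)]
    return mu, log_sigma, z

mu, log_sigma, z = encode(
    s=[1.0, -1.0], bow=[2.0],
    W_f=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], b_f=[0.0, 0.0],
    W1=[[2.0, 0.0], [0.0, 3.0]], b1=[0.5, 0.5],
    W2=[[0.0, 0.0], [0.0, 0.0]], b2=[-50.0, -50.0],  # sigma is close to 0
    rng=random.Random(0),
)
```

Sampling ε separately and combining it deterministically with μ and σ keeps the randomness outside the learnable path, which is what allows gradients to flow through μ and log σ.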
Step 2. decoding procedure
Assume that a given corpus C contains K topics. Each topic k is represented by a topic-word distribution $\phi_k$ over the vocabulary, and each document d in C has a topic mixture represented by the variable θ, a K-dimensional distribution vector constructed by a Gaussian softmax. The decoder therefore simulates the generation of each document d in the following steps:
1) The topic distribution θ is derived from the latent variable z, where $w_\theta$ is a trainable parameter.
$$\theta = \mathrm{softmax}(w_\theta z) \qquad (6)$$
2) Each word w in document d is drawn according to the variable θ, where $f_\phi(\cdot)$ is a neural layer whose weight matrix holds the topic-word distributions $\phi_k$.
$$w \sim \mathrm{softmax}(f_\phi(\theta)) \qquad (7)$$
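The decoding procedure can be sketched as two softmax layers, assuming equation (7) draws words from softmax(f_φ(θ)) and that f_φ is a bias-free linear layer. The matrices `W_theta` (K × dim z) and `W_phi` (vocabulary size × K) are hypothetical hand-set weights used only for illustration.

```python
import math

def softmax(v):
    m = max(v)                               # subtract the max for stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def decode(z, W_theta, W_phi):
    theta = softmax(matvec(W_theta, z))          # eq. (6): topic mixture
    word_probs = softmax(matvec(W_phi, theta))   # eq. (7): word distribution
    return theta, word_probs

theta, word_probs = decode(
    z=[1.0, 0.0],
    W_theta=[[0.0, 0.0], [0.0, 0.0]],            # two topics, uniform mixture
    W_phi=[[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],  # three-word vocabulary
)
```

Column k of `W_phi` plays the role of the unnormalized topic-word weights of topic k, so inspecting its largest entries gives the topic's top words.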
In summary, based on the variational lower bound, the objective function of the NTM model defined by the invention is:
$$\mathcal{L}_{NTM} = \mathbb{E}_{q(z|d)}[p(d|z)] - D_{KL}[q(z|d) \,\|\, p(z|\mu,\sigma)] \qquad (8)$$
The first term in equation (8) is the reconstruction term and the second is the Kullback-Leibler divergence loss, in which p(z | μ, σ) represents the standard normal prior. q(z | d) and p(d | z) describe the encoding process and the decoding process, respectively.
To allow backpropagation during model training, the reparameterization technique of equation (5) is used: the noise ε is sampled from the normal distribution N(0, I) to obtain z, and hence θ. The gradient of $\mathcal{L}_{NTM}$ is then computed, and the model is optimized with the Adam gradient descent algorithm.
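For reference, when the approximate posterior q(z | d) is a diagonal Gaussian with parameters μ and σ and the prior is the standard normal N(0, I), the Kullback-Leibler term of equation (8) has a well-known closed form. The derivation below is the standard variational autoencoder result, stated here under the assumption that the latent dimension equals the number of topics K; it is not quoted from the patent.

```latex
D_{KL}\left[ N(\mu, \operatorname{diag}(\sigma^2)) \,\|\, N(0, I) \right]
  = \frac{1}{2} \sum_{k=1}^{K} \left( \mu_k^2 + \sigma_k^2 - 1 - \log \sigma_k^2 \right)
```

Each summand is zero exactly when $\mu_k = 0$ and $\sigma_k = 1$, so the term penalizes posteriors that drift away from the prior and needs no sampling to evaluate.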
Step 3. merging of the same topics
Identical topics are usually identified by computing a distance between distributions, because the topics obtained by modeling are distributions over the same dimensions. Since the distance between two topics should be well defined and independent of the order in which the topics are taken, the similarity between topics is measured with the symmetric Kullback-Leibler distance.
Let $\phi_k$ be the topic-word distribution of the k-th topic and $\phi_k(w_i)$ the probability it assigns to the i-th word $w_i$. The KL distance from topic $k_1$ to topic $k_2$ can then be calculated by equation (9):
$$D_{KL}(k_1 \,\|\, k_2) = \sum_i \phi_{k_1}(w_i) \log \frac{\phi_{k_1}(w_i)}{\phi_{k_2}(w_i)} \qquad (9)$$
while the symmetric KL distance can be further calculated from the KL distance:
$$D_S(k_1, k_2) = \frac{1}{2} \left[ D_{KL}(k_1 \,\|\, k_2) + D_{KL}(k_2 \,\|\, k_1) \right] \qquad (10)$$
As equations (9) and (10) show, the smaller the KL distance between two topics, i.e., the closer it is to 0, the closer the two probability distributions are and the higher the similarity between the two topics; the larger the KL distance, the more the two distributions differ. A threshold ζ is set: if the KL distance between two topics is less than the threshold, the two topics are judged to be the same topic and are merged; otherwise the two topics are different and are not merged.
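A minimal sketch of the merging step, assuming the symmetric KL distance is the average of the two directed KL distances and that topics closer than ζ are treated as the same topic. The greedy grouping strategy, the smoothing constant `eps` and the function names are illustrative assumptions; the patent does not specify how groups are formed.

```python
import math

def kl(p, q, eps=1e-12):
    """Directed KL distance between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Average of the two directed KL distances (order-independent)."""
    return 0.5 * (kl(p, q) + kl(q, p))

def merge_topics(topics, zeta):
    """Greedy grouping: a topic joins the first group whose first member
    lies within distance zeta; otherwise it starts a new group."""
    groups = []
    for idx, dist in enumerate(topics):
        for group in groups:
            if symmetric_kl(topics[group[0]], dist) < zeta:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

topics = [
    [0.70, 0.20, 0.10],
    [0.69, 0.21, 0.10],   # nearly identical to topic 0
    [0.10, 0.20, 0.70],   # clearly different
]
groups = merge_topics(topics, zeta=0.05)
```

Each returned group lists the indices of topics judged identical; a merged topic could then be formed, for example, by averaging the distributions in a group.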
It should be noted that the above embodiments are only preferred embodiments of the invention and do not limit its scope; all equivalent replacements or substitutions made on the basis of the above technical solutions fall within the scope of the invention.

Claims (6)

1. A method for discovering hot events under mixed new media, characterized by comprising the following steps:
step 1) preprocessing the collected news data, including removing useless information such as hyperlinks, stop words, punctuation marks and digits, and performing word segmentation with the HanLP natural language processing tool;
step 2) distributing the documents into time slices in chronological order, with a time interval of 1 day to facilitate subsequent evolution analysis, wherein for every event the documents within 30 days of its occurrence are examined, i.e., 30 time slices;
step 3) text vectorization: representing the text with document representations pre-trained by BERT to improve topic coherence;
step 4) topic modeling: performing topic modeling with a neural topic model in which the input bag-of-words representation is replaced by contextual embeddings;
step 5) merging the topics obtained by the modeling in step 4);
step 6) after event detection on the news portals is complete, associating each event with its microblog content in the social network and with the social relationships of the users involved;
step 7) calculating the heat value of the topic, and judging the topic to be a hot event when the heat value exceeds a given threshold.
2. The method for discovering social hotspot events in the mixed media environment according to claim 1, wherein the division of time slices in step 2) has an important influence on how the evolution of an event over a period of time and the change of its heat are handled, and the window can be fixed at 30 days or set adaptively according to the time span of the crawled news content.
3. The method for discovering social hotspot events in the mixed media environment according to claim 1, wherein in step 3) the text vectorization replaces the bag-of-words representation fed to the topic model with contextual embeddings, that is, a neural encoding layer that produces document representations from a pre-trained BERT language model is introduced before the topic modeling process; first, a dictionary of the topic corpus is built by calling the bert_serving package and a BERT word vector model is trained; each document then yields a matrix of word vectors, and the matched data are stored for the subsequent topic modeling task.
4. The method for discovering social hotspot events in the mixed media environment according to claim 1, wherein during topic modeling in step 4) the vectorized text data from step 3) are fed to the model as contextual embeddings, and the neural topic model used is a generative model based on the neural variational inference framework and inspired by the variational autoencoder, which uses a Gaussian distribution for its latent variable, the Gaussian parameters being obtained by linear computation.
5. The method for discovering social hotspot events in the mixed media environment according to claim 1, wherein in step 5) the topics obtained by modeling are merged: a threshold ζ is set on the distance between two topics, and if the distance between two topics is less than the threshold, the two topics are judged to be the same topic and are merged; otherwise the two topics are different and are not merged.
6. The method for discovering social hotspot events in the mixed media environment according to claim 1, wherein in steps 6) and 7) the microblog platform provides rich topic classification and content tag information; the time, named entity and keyword information obtained during event detection is combined and used to search the microblog platform for content related to the event's key information; the cosine distance between the event's key information and the content, classification and tags of each search result is then calculated to measure the similarity between the event and the microblog, and the event-news-microblog association is established; and, to identify hot events, the social network attributes of the event are taken into account and the heat value of each topic obtained in step 5) is calculated using formula (1):
$$H_e = \alpha \frac{N_e}{N} + \beta \frac{S_e}{S} + \gamma \frac{C_e}{C} \qquad (1)$$
where $H_e$ denotes the heat value of event e; $N_e$, $S_e$ and $C_e$ are, respectively, the number of news reports, the number of user reposts and the number of comments for event e; N, S and C are the totals of the corresponding indicators; and α, β, γ are weighting coefficients (e.g., 0.6, 0.2, 0.2) set according to the importance of these factors; when the combined heat value (range [0, 1]) exceeds 0.4, i.e., when event e's share of reports and discussion exceeds 40 percent, event e is judged to be a hot event.
CN202110444596.4A 2021-04-23 2021-04-23 Hot event discovery method under mixed new media Pending CN113343118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110444596.4A CN113343118A (en) 2021-04-23 2021-04-23 Hot event discovery method under mixed new media

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110444596.4A CN113343118A (en) 2021-04-23 2021-04-23 Hot event discovery method under mixed new media

Publications (1)

Publication Number Publication Date
CN113343118A true CN113343118A (en) 2021-09-03

Family

ID=77468472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110444596.4A Pending CN113343118A (en) 2021-04-23 2021-04-23 Hot event discovery method under mixed new media

Country Status (1)

Country Link
CN (1) CN113343118A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120079020A1 (en) * 2010-09-27 2012-03-29 Korea Institute Of Science And Technology Highlight providing system and method based on hot topic event detection
CN107203513A (en) * 2017-06-06 2017-09-26 中国人民解放军国防科学技术大学 Microblogging text data fine granularity topic evolution analysis method based on probabilistic model
CN107644089A (en) * 2017-09-26 2018-01-30 武大吉奥信息技术有限公司 A kind of hot ticket extracting method based on the network media
CN111324801A (en) * 2020-02-17 2020-06-23 昆明理工大学 Hot event discovery method in judicial field based on hot words

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
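JUNCHI ZHANG et al.: "Topic-informed neural approach for biomedical event extraction", ARTIFICIAL INTELLIGENCE IN MEDICINE, 31 December 2020 (2020-12-31), pages 1 - 9 *
张洪宽 et al.: "BERT-based end-to-end Chinese document-level event extraction" (基于BERT的端到端中文篇章事件抽取), China National Conference on Computational Linguistics (中国计算语言学大会), 1 November 2020 (2020-11-01), pages 1 - 12 *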

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113822069A (en) * 2021-09-17 2021-12-21 国家计算机网络与信息安全管理中心 Emergency early warning method and device based on meta-knowledge and electronic device
CN113822069B (en) * 2021-09-17 2024-03-12 国家计算机网络与信息安全管理中心 Sudden event early warning method and device based on meta-knowledge and electronic device
TWI825535B (en) * 2021-12-22 2023-12-11 中華電信股份有限公司 System, method and computer-readable medium for formulating potential hot word degree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination