CN111694930A - Dynamic knowledge hotspot evolution and trend analysis method - Google Patents

Info

Publication number
CN111694930A
Authority
CN
China
Prior art keywords
topic
words
hotspot
dynamic
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010528034.3A
Other languages
Chinese (zh)
Other versions
CN111694930B (en)
Inventor
侯颖
崔运鹏
刘娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Information Institute of CAAS
Original Assignee
Agricultural Information Institute of CAAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Information Institute of CAAS
Priority to CN202010528034.3A
Publication of CN111694930A
Application granted
Publication of CN111694930B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/358 Browsing; Visualisation therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dynamic knowledge hotspot evolution and trend analysis method. The method dynamically models the latent topics in a given set of documents over time and captures how the topics evolve. The dynamic modeling also yields the topic preference of every document, so that a user can locate document information through the hot words under a topic. The method presents the change trend of words within a topic as a graph, which helps the user understand or predict the development trend of topic words, locate the literature related to those words through the hot words in the topic, and quickly evaluate and understand the target subject field.

Description

Dynamic knowledge hotspot evolution and trend analysis method
Technical Field
The invention relates to the field of natural language processing and information extraction, in particular to a dynamic knowledge hotspot evolution and trend analysis method.
Background
With the continuous development of information technology, large amounts of information resources keep emerging from scientific and technical literature, books, news, blogs, web pages and other sources. To extract useful information effectively from this explosively growing body of electronic documents, new technologies and tools are urgently needed that help users analyze massive data sets and quickly evaluate and understand a target subject field.
Many texts in a corpus (e.g., scientific literature) have temporal attributes, and certain text information appears only in a specific time period. Text visualization, an important branch of information visualization, extracts key information by analyzing text resources and displays it graphically.
At present, dynamic topic modeling of texts with time attributes cannot effectively visualize how hot words evolve over a time series, and cannot retrieve the corresponding document metadata through the hot words. Therefore, for the literature collected by a user, a method is needed that helps the user quickly understand the target field and accurately retrieve the corresponding document metadata by hot word.
Disclosure of Invention
The invention aims to provide a dynamic knowledge hotspot evolution and trend analysis method. The method dynamically models text over time, captures the dynamic evolution of topics, analyzes how words in different topics change over time or predicts and extracts the potential development trend of a topic, and locates the literature related to the hot words through the topic.
The purpose of the invention is realized by the following technical scheme.
The invention comprises the following steps:
S10, the user collects document metadata as required and exports or builds a tab-separated, UTF-8 encoded record file containing fields such as title and abstract;
S20, the exported document metadata is preprocessed;
S30, the abstracts and publication years of the preprocessed document metadata are selected, and dynamic modeling analysis of the latent topics and document topic preference calculation are performed to obtain hot words;
S40, the topic clusters of the hot words are visualized, displaying the hot words most relevant to each topic in each year;
S50, the change trend of hot words within a topic is visualized: the user selects a word of interest in the topic, and the change trend of that word over the time series is displayed as a curve graph.
Furthermore, the collected document metadata mainly comprises fields such as title, abstract and publication year. The file is stored as tab-separated, UTF-8 encoded plain text (csv or txt), and the data set can be exported in this format from the Web of Science core database or be any other user-defined data set that meets the format requirements.
Further, the preprocessing comprises deleting invalid metadata, stemming words, removing stop words, removing meaningless characters and recognizing phrases.
Further, the topic modeling analysis employs variational inference to approximate the posterior distribution. The method is based on the following assumptions:
1) the data is divided into time slices;
2) the topics associated with time slice t evolve from the topics associated with time slice t-1;
3) the documents in each time slice are modeled with a K-component topic model.
Further, the visualization of the topic clusters of the hot words displays the hot words from the model analysis result: the hot words of each time slice (such as a year) are grouped by topic and listed in order of their probability in the model analysis result.
Further, the visualization method comprises the following specific steps:
1) acquiring a hot word selected by a user;
2) performing additional graph calculation on hot word information in the subject dynamic modeling analysis result based on the received first interactive instruction, wherein the graph comprises an equivalent point; rendering based on hot word information in the subject dynamic modeling analysis result to obtain a corresponding phase point value;
3) and based on the received second interactive instruction, rendering the additional graph by connecting a plurality of phase points on the grid graph to obtain a curve trend graph.
Further, the dynamic topic modeling computes coherence values for candidate topic numbers of 5, 10, 15, 20 and 25 and selects the optimal number of topics.
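The selection of the topic number can be sketched as follows. This is a minimal illustration, not the patented implementation: `coherence_of` is a hypothetical callable that would train a model with k topics and return its coherence value, and the scores used in the example are made up.

```python
from typing import Callable, Iterable


def pick_num_topics(candidates: Iterable[int],
                    coherence_of: Callable[[int], float]) -> int:
    """Return the candidate topic number with the highest coherence value."""
    scores = {k: coherence_of(k) for k in candidates}
    return max(scores, key=scores.get)


# Made-up coherence values, for illustration only.
example_scores = {5: 0.41, 10: 0.47, 15: 0.52, 20: 0.49, 25: 0.44}

best_k = pick_num_topics([5, 10, 15, 20, 25], example_scores.__getitem__)
print(best_k)
```

In practice the callable would wrap the model training and whichever coherence measure is chosen; the text does not say which measure is used.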
Further, in the dynamic topic modeling, the generative process for the sequential corpus in time slice t is as follows:
1) generate the topic-vocabulary probability distribution $\beta_t$ for time slice t according to $\beta_t \mid \beta_{t-1} \sim \mathcal{N}(\beta_{t-1}, \sigma^2 I)$;
2) generate the topic prior distribution $\alpha_t$ for time slice t according to $\alpha_t \mid \alpha_{t-1} \sim \mathcal{N}(\alpha_{t-1}, \delta^2 I)$;
3) for each document d in time slice t, generate the document-topic probability distribution $\eta$ according to $\eta \sim \mathcal{N}(\alpha_t, a^2 I)$;
4) for each word n in document d, generate the word-topic indicator $Z$ according to $Z \sim \mathrm{Mult}(\pi(\eta))$, and generate the word $W_{t,d,n}$ according to $W_{t,d,n} \sim \mathrm{Mult}(\pi(\beta_{t,Z}))$.
Further, the approximate variational posterior used in the dynamic topic modeling analysis or the document topic preference calculation is as follows:
[Formula image BDA0002534267440000031]
The variational approach described above optimizes the parameters of the distributions over the latent variables (the topics $\beta_{t,k}$, the mixture proportions $\theta_{t,d}$ and the topic indicators $Z_{t,d,n}$). In the variational distribution of $\{\beta_{k,1},\ldots,\beta_{k,T}\}$, the sequential structure of the topics is preserved by a dynamic model with Gaussian "variational observations" [formula image BDA0002534267440000032]. In the variational distribution of the document-level latent variables, each proportion vector $\theta_{t,d}$ is given a free Dirichlet parameter $\gamma_{t,d}$, and each topic indicator $Z_{t,d,n}$ is given a free multinomial parameter $\phi_{t,d,n}$. The topic variational observations are optimized with the conjugate gradient method, and the resulting variational approximation of the natural topic parameters $\{\beta_{k,1},\ldots,\beta_{k,T}\}$ incorporates the temporal dynamics.
Compared with the prior art, one or more embodiments of the present invention may have the following advantages:
The dynamic knowledge hotspot evolution and trend analysis method provided by the invention dynamically models text over time and visualizes the modeling analysis result. It analyzes how words in different topics change over time or predicts and extracts the potential development trend of a topic, and helps the user locate the document information related to the hot words through the topic, so that the user can quickly evaluate and understand the target subject field.
Drawings
FIG. 1 is a flow chart of the dynamic knowledge hotspot evolution and trend analysis method;
FIG. 2 is a flow chart of the preprocessing in the method;
FIG. 3 is a diagram of the generative process for the sequential corpus in time slice t in the dynamic topic modeling analysis of the method;
FIG. 4 is a visualization of the dynamic topic modeling analysis results of the method;
FIG. 5 is a graph of the change trend of hot words in the method;
FIG. 6 is a diagram of looking up document metadata in the method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
As shown in FIG. 1, the dynamic knowledge hotspot evolution and trend analysis method includes:
Step S10: collecting document metadata;
The user collects document metadata as required. The metadata mainly comprises fields such as title, abstract and publication year. The file is stored as tab-separated, UTF-8 encoded plain text (csv or txt), and the data set can be exported in this format from the Web of Science core database or be any other user-defined data set that meets the format requirements.
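Reading such a record file can be sketched as follows. This is a minimal example under stated assumptions: the column names `title`, `abstract` and `year` are hypothetical (an actual export may label its fields differently), and the sample rows are made up.

```python
import csv
import io


def load_records(handle):
    """Read tab-separated records with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# Made-up sample in the assumed layout; a real file would be opened with
# open(path, encoding="utf-8", newline="") instead of io.StringIO.
sample = (
    "title\tabstract\tyear\n"
    "Doc A\tStream processing of video data\t2008\n"
    "Doc B\t\t2009\n"
    "Doc C\tTopic models over time\t2010\n"
)

records = load_records(io.StringIO(sample))
print(len(records), records[0]["year"])
```

The loader keeps every row, including the one with an empty abstract; removing such rows belongs to the preprocessing of step S20.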
Step S20: preprocessing the collected document metadata;
This step preprocesses the abstract and publication year fields so that they meet the format requirements of the dynamic modeling analysis of the latent topics in the next step, as shown in FIG. 2. Preprocessing comprises deleting invalid metadata, stemming words, removing stop words, removing meaningless characters and recognizing phrases.
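The preprocessing operations can be sketched as follows, using only the standard library. Everything here is an illustrative assumption rather than the patented implementation: `crude_stem` is a toy suffix stripper standing in for a real stemmer, and the stop-word list and phrase table are tiny hypothetical examples.

```python
import re

STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}
PHRASES = {("topic", "model"): "topic_model"}  # hypothetical phrase table


def crude_stem(word):
    """Strip a few common English suffixes (stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(abstract):
    """Clean one abstract into a list of tokens."""
    # Remove meaningless characters: keep only letters and whitespace.
    text = re.sub(r"[^a-z\s]", " ", abstract.lower())
    # Remove stop words, then stem what is left.
    tokens = [crude_stem(t) for t in text.split() if t not in STOP_WORDS]
    # Recognize phrases: merge adjacent tokens listed in the phrase table.
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            merged.append(PHRASES[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def clean_records(records):
    """Delete records lacking an abstract or a numeric year; tokenize the rest."""
    cleaned = []
    for rec in records:
        abstract = (rec.get("abstract") or "").strip()
        year = str(rec.get("year") or "").strip()
        if abstract and year.isdigit():
            cleaned.append({"year": int(year), "tokens": preprocess(abstract)})
    return cleaned


tokens = preprocess("Topic models for streaming video (2008)!")
print(tokens)
```

A production pipeline would normally replace the toy stemmer and the hand-written lists with an established NLP library.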
Step S30: dynamic modeling analysis of the topics;
This step is the core analysis step of the system and performs its main computation.
The dynamic topic modeling analysis employs variational inference to approximate the posterior distribution. The method is based on the following assumptions:
1) the data is divided into time slices, for example by year;
2) the topics associated with time slice t evolve from the topics associated with time slice t-1;
3) the documents in each time slice are modeled with a K-component topic model.
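The division into time slices can be sketched as follows. This is a minimal illustration that assumes each document carries an integer publication year; the function name `build_time_slices` and the sample years are hypothetical.

```python
from collections import Counter


def build_time_slices(years):
    """Return the sorted distinct years and the number of documents in each."""
    counts = Counter(years)
    labels = sorted(counts)
    return labels, [counts[y] for y in labels]


labels, sizes = build_time_slices([2009, 2008, 2010, 2008, 2009, 2008])
print(labels, sizes)
```

Sequential topic model implementations commonly expect the documents sorted by time together with such per-slice counts, so the corpus would be ordered by year before modeling.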
The generative process for the sequential corpus in time slice t is as follows, as shown in FIG. 3:
1) generate the topic-vocabulary probability distribution $\beta_t$ for time slice t according to $\beta_t \mid \beta_{t-1} \sim \mathcal{N}(\beta_{t-1}, \sigma^2 I)$;
2) generate the topic prior distribution $\alpha_t$ for time slice t according to $\alpha_t \mid \alpha_{t-1} \sim \mathcal{N}(\alpha_{t-1}, \delta^2 I)$;
3) for each document d in time slice t, generate the document-topic probability distribution $\eta$ according to $\eta \sim \mathcal{N}(\alpha_t, a^2 I)$;
4) for each word n in document d, generate the word-topic indicator $Z$ according to $Z \sim \mathrm{Mult}(\pi(\eta))$, and generate the word $W_{t,d,n}$ according to $W_{t,d,n} \sim \mathrm{Mult}(\pi(\beta_{t,Z}))$.
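The four generative steps can be simulated with the standard library as a sanity check of the notation. This is a sketch under assumptions: the names `sigma`, `delta` and `a` for the three standard deviations, the softmax form of the mapping π, and all numeric values are illustrative choices, not values given in the text.

```python
import math
import random


def softmax(values):
    """Map natural parameters onto the probability simplex (the pi mapping)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_step(mean, std, rng):
    """Draw a vector from N(mean, std^2 I)."""
    return [rng.gauss(m, std) for m in mean]


def generate_slice(beta_prev, alpha_prev, n_docs, n_words, sigma, delta, a, rng):
    """Generate one time slice of word-id documents from the previous slice."""
    beta_t = [gaussian_step(topic, sigma, rng) for topic in beta_prev]  # step 1
    alpha_t = gaussian_step(alpha_prev, delta, rng)                     # step 2
    topic_word = [softmax(topic) for topic in beta_t]
    vocab_ids = range(len(beta_t[0]))
    docs = []
    for _ in range(n_docs):
        eta = gaussian_step(alpha_t, a, rng)                            # step 3
        theta = softmax(eta)
        words = []
        for _ in range(n_words):
            z = rng.choices(range(len(theta)), weights=theta)[0]        # step 4: topic
            w = rng.choices(vocab_ids, weights=topic_word[z])[0]        # step 4: word
            words.append(w)
        docs.append(words)
    return beta_t, alpha_t, docs


rng = random.Random(0)
K, V = 2, 5  # number of topics and vocabulary size (arbitrary)
beta0 = [[0.0] * V for _ in range(K)]
alpha0 = [0.0] * K
beta1, alpha1, docs = generate_slice(
    beta0, alpha0, n_docs=3, n_words=8, sigma=0.1, delta=0.1, a=0.5, rng=rng
)
print(len(docs), len(docs[0]))
```

Calling `generate_slice` repeatedly, feeding each slice's `beta_t` and `alpha_t` into the next call, produces a corpus whose topics drift from one time slice to the next.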
Therefore, the approximate variational posterior of the entire model is:
[Formula image BDA0002534267440000051]
The variational approach described above optimizes the parameters of the distributions over the latent variables (the topics $\beta_{t,k}$, the mixture proportions $\theta_{t,d}$ and the topic indicators $Z_{t,d,n}$). In the variational distribution of $\{\beta_{k,1},\ldots,\beta_{k,T}\}$, the sequential structure of the topics is preserved by a dynamic model with Gaussian "variational observations" [formula image BDA0002534267440000052]. In the variational distribution of the document-level latent variables, each proportion vector $\theta_{t,d}$ is given a free Dirichlet parameter $\gamma_{t,d}$, and each topic indicator $Z_{t,d,n}$ is given a free multinomial parameter $\phi_{t,d,n}$. The topic variational observations are optimized with the conjugate gradient method, and the resulting variational approximation of the natural topic parameters $\{\beta_{k,1},\ldots,\beta_{k,T}\}$ incorporates the temporal dynamics.
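The formula image itself is not reproduced in this text. For orientation only, a variational family with exactly the ingredients named in the paragraph above is the one used in the standard dynamic topic model formulation of Blei and Lafferty, which can be written as below. The symbols $\hat{\beta}_{k,t}$ (the Gaussian variational observations), $D_t$ (number of documents in slice t) and $N_{t,d}$ (number of words in document d) are assumptions taken from that formulation, not from the patent text.

```latex
\prod_{k=1}^{K} q\bigl(\beta_{k,1},\ldots,\beta_{k,T} \mid \hat{\beta}_{k,1},\ldots,\hat{\beta}_{k,T}\bigr)
\times
\prod_{t=1}^{T} \Biggl( \prod_{d=1}^{D_t} q\bigl(\theta_{t,d} \mid \gamma_{t,d}\bigr)
\prod_{n=1}^{N_{t,d}} q\bigl(z_{t,d,n} \mid \phi_{t,d,n}\bigr) \Biggr)
```

The first product couples each topic across all T time slices, which is where the sequential structure enters; the second factorizes over documents and words within each slice.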
Examples of the dynamic topic modeling analysis results of this step are as follows:
1) The time slice sequence, divided by year, e.g. [2008, 2009, 2010].
2) For each topic and each time slice, the words most relevant to the topic and their probabilities, e.g. (because there are too many actual hot words, only the first 3 hot words for each time slice are listed here):
{0:['0.0140231014*application+0.0138825359*stream+0.0123572007*datum','0.0140471977*application+0.0138904899*stream+0.0124764708*datum','0.0139453390*stream+0.0138278045*application+0.0128339716*datum'],
1:['0.0125233824*video+0.0118972892*propose+0.0103776871*network','0.0128266652*video+0.0116539875*propose+0.0104339393*network','0.0132288953*video+0.0113926101*propose+0.0103314936*network'],
2:['0.0201108175*stream+0.0160505421*use+0.0143336972*compute','0.0204567699*stream+0.0159369303*use+0.0145109152*compute','0.0204072031*stream+0.0159959192*use+0.0144690685*compute'],
3:['0.0224408733*algorithm+0.0203485369*stream+0.0184875342*compute','0.0227468752*algorithm+0.0205000072*stream+0.0185545889*compute','0.0230975940*algorithm+0.0206288220*stream+0.0185272671*compute'],
4:['0.0209717427*use+0.0150956938*stream+0.0111105387*propose','0.0207826879*use+0.0151531082*stream+0.0112516701*propose','0.0203461357*use+0.0151239365*stream+0.0117703962*propose']
}
3) The document topic preference; for example, the topic distribution of document 20 is:
[1.17577895e-04,9.99529688e-01,1.17577895e-04,1.17577895e-04,1.17577895e-04]
It can be seen that, among the 5 topics, document 20 prefers topic 1. The topic preference of each document is computed in turn and stored in a table together with the document metadata.
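The example results above can be consumed as follows. This is a minimal sketch: the helper names `parse_topic_string` and `preferred_topic` are hypothetical, and the input values are copied from the example output above.

```python
def parse_topic_string(text):
    """Split 'p1*w1+p2*w2+...' into a list of (word, probability) pairs."""
    pairs = []
    for term in text.split("+"):
        prob, word = term.split("*", 1)
        pairs.append((word.strip(), float(prob)))
    return pairs


def preferred_topic(distribution):
    """Return (topic_index, probability) for the largest entry."""
    index = max(range(len(distribution)), key=distribution.__getitem__)
    return index, distribution[index]


# Topic 1, first time slice (2008), from the example output.
top_words = parse_topic_string(
    "0.0125233824*video+0.0118972892*propose+0.0103776871*network"
)

# Topic distribution of document 20, from the example output.
doc_20 = [1.17577895e-04, 9.99529688e-01, 1.17577895e-04,
          1.17577895e-04, 1.17577895e-04]

print(top_words[0], preferred_topic(doc_20))
```

Storing the index returned by `preferred_topic` next to each document's metadata gives the table described above.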
Step S40: visualizing the hot word topic clustering result;
The result returned by the dynamic topic modeling analysis in the previous step includes the time slice sequence and, for each topic and each time slice, the most relevant words and their probabilities. Based on the hot words in the analysis result, the top 50 hot words most relevant to each topic in each year are displayed, as shown in FIG. 4.
Step S50: visualizing the change trend of the hot words within a topic;
As shown in FIG. 5, the user selects a word of interest in a topic, and the change trend of that hot word over the time series is drawn as a curve graph from the time slice sequence, hot words and corresponding probabilities returned by the dynamic topic modeling analysis.
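Extracting the data points for such a curve can be sketched as follows. The function name `word_trend` is hypothetical and the inputs are the topic 1 strings from the example results; drawing the curve itself is left to whichever plotting library is used.

```python
def word_trend(years, topic_strings, word):
    """Return (year, probability) points for one word in one topic.

    The probability is None for a year in which the word is not listed.
    """
    points = []
    for year, text in zip(years, topic_strings):
        probs = {}
        for term in text.split("+"):
            prob, token = term.split("*", 1)
            probs[token] = float(prob)
        points.append((year, probs.get(word)))
    return points


years = [2008, 2009, 2010]
topic_1 = [
    "0.0125233824*video+0.0118972892*propose+0.0103776871*network",
    "0.0128266652*video+0.0116539875*propose+0.0104339393*network",
    "0.0132288953*video+0.0113926101*propose+0.0103314936*network",
]

trend = word_trend(years, topic_1, "video")
print(trend)
```

For the example data the probability of "video" in topic 1 rises each year, which is the kind of trend the curve graph is meant to expose.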
As shown in FIG. 6, the user selects one or more words of interest in a topic and the relationship between them (and, or); based on the document topic preference calculation result, the document metadata that contains the selected words in the specified relationship is queried under the selected topic.
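The lookup can be sketched as follows. The record layout (fields `title`, `topic` and `tokens`) and the function name `find_documents` are hypothetical; `topic` is assumed to hold each document's preferred topic from the preference calculation.

```python
def find_documents(records, topic, words, mode="and"):
    """Return records whose preferred topic matches and whose tokens satisfy
    the word relationship ('and': all words present, 'or': any word present)."""
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    combine = all if mode == "and" else any
    return [
        rec for rec in records
        if rec["topic"] == topic and combine(w in rec["tokens"] for w in words)
    ]


# Made-up records for illustration.
records = [
    {"title": "Doc A", "topic": 1, "tokens": {"video", "network"}},
    {"title": "Doc B", "topic": 1, "tokens": {"video", "propose"}},
    {"title": "Doc C", "topic": 3, "tokens": {"video", "network"}},
]

both = find_documents(records, 1, ["video", "network"], mode="and")
either = find_documents(records, 1, ["network", "propose"], mode="or")
print([r["title"] for r in both], [r["title"] for r in either])
```

Filtering by preferred topic first keeps documents that merely mention a word under a different topic (Doc C here) out of the result.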
In this process, a metadata search request carrying search keywords is received, and the search keywords are matched against the retrieval keywords of the target documents. When the search keywords successfully match the retrieval keywords of a target document, the description information of that document is returned; the target document is the document that matches the search keywords. In summary, the method captures the dynamic evolution of topics over time, obtains the topic preference of all documents through dynamic modeling, presents the change trend of words within a topic as a curve graph, and finds the corresponding document metadata when the user selects one or more words of interest in a topic and the relationship among them. It helps the user understand or predict the development trend of topic words, locate the relevant document information through the hot words under a topic, and quickly evaluate and understand the target subject field.
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (9)

1. A dynamic knowledge hotspot evolution and trend analysis method, characterized in that the method comprises the following steps:
S10, the user collects document metadata as required and exports or builds a tab-separated, UTF-8 encoded record file containing fields such as title and abstract;
S20, the exported document metadata is preprocessed;
S30, the abstracts and publication years of the preprocessed document metadata are selected, and dynamic modeling analysis of the latent topics and document topic preference calculation are performed to obtain hot words;
S40, the topic clusters of the hot words are visualized, displaying the hot words most relevant to each topic in each year;
S50, the change trend of hot words within a topic is visualized: the user selects a word of interest in the topic, and the change trend of that word over the time series is displayed as a curve graph.
2. The method of claim 1, wherein the collected document metadata mainly comprises fields such as title, abstract and publication year, the file is stored as tab-separated, UTF-8 encoded plain text (csv or txt), and the data set can be exported in this format from the Web of Science core database or be any other user-defined data set that meets the format requirements.
3. The method of claim 1, wherein the preprocessing comprises deleting invalid metadata, stemming words, removing stop words, removing meaningless characters and recognizing phrases.
4. The method of claim 1, wherein the topic modeling analysis uses variational inference to approximate the posterior distribution, based on the following assumptions:
1) the data is divided into time slices;
2) the topics associated with time slice t evolve from the topics associated with time slice t-1;
3) the documents in each time slice are modeled with a K-component topic model.
5. The dynamic knowledge hotspot evolution and trend analysis method of claim 1, wherein the visualization of the topic clusters of the hot words displays the hot words from the model analysis result: the hot words of each time slice (such as a year) are grouped by topic and listed in order of their probability in the model analysis result.
6. The dynamic knowledge hotspot evolution and trend analysis method of claim 1, wherein the visualization method comprises the following specific steps:
1) acquiring a hot word selected by a user;
2) performing additional graph calculation on hot word information in the subject dynamic modeling analysis result based on the received first interactive instruction, wherein the graph comprises an equivalent point; rendering based on hot word information in the subject dynamic modeling analysis result to obtain a corresponding phase point value;
3) and based on the received second interactive instruction, rendering the additional graph by connecting a plurality of phase points on the grid graph to obtain a curve trend graph.
7. The method of claim 1 or 4, wherein the dynamic topic modeling computes coherence values for candidate topic numbers of 5, 10, 15, 20 and 25 and selects the optimal number of topics.
8. The dynamic knowledge hotspot evolution and trend analysis method of claim 1 or 4, wherein, in the dynamic topic modeling, the generative process for the sequential corpus in time slice t is as follows:
1) generate the topic-vocabulary probability distribution $\beta_t$ for time slice t according to $\beta_t \mid \beta_{t-1} \sim \mathcal{N}(\beta_{t-1}, \sigma^2 I)$;
2) generate the topic prior distribution $\alpha_t$ for time slice t according to $\alpha_t \mid \alpha_{t-1} \sim \mathcal{N}(\alpha_{t-1}, \delta^2 I)$;
3) for each document d in time slice t, generate the document-topic probability distribution $\eta$ according to $\eta \sim \mathcal{N}(\alpha_t, a^2 I)$;
4) for each word n in document d, generate the word-topic indicator $Z$ according to $Z \sim \mathrm{Mult}(\pi(\eta))$, and generate the word $W_{t,d,n}$ according to $W_{t,d,n} \sim \mathrm{Mult}(\pi(\beta_{t,Z}))$.
9. The method of claim 1, wherein the approximate variational posterior used in the dynamic topic modeling analysis or the document topic preference calculation is:
[Formula image FDA0002534267430000021]
The variational approach described above optimizes the parameters of the distributions over the latent variables (the topics $\beta_{t,k}$, the mixture proportions $\theta_{t,d}$ and the topic indicators $Z_{t,d,n}$). In the variational distribution of $\{\beta_{k,1},\ldots,\beta_{k,T}\}$, the sequential structure of the topics is preserved by a dynamic model with Gaussian "variational observations" [formula image FDA0002534267430000031]. In the variational distribution of the document-level latent variables, each proportion vector $\theta_{t,d}$ is given a free Dirichlet parameter $\gamma_{t,d}$, and each topic indicator $Z_{t,d,n}$ is given a free multinomial parameter $\phi_{t,d,n}$. The topic variational observations are optimized with the conjugate gradient method, and the resulting variational approximation of the natural topic parameters $\{\beta_{k,1},\ldots,\beta_{k,T}\}$ incorporates the temporal dynamics.
CN202010528034.3A 2020-06-11 2020-06-11 Dynamic knowledge hot-spot evolution and trend analysis method Active CN111694930B (en)

Publications (2)

Publication Number    Publication Date
CN111694930A (en)    2020-09-22
CN111694930B (en)    2023-11-14

Family

ID=72480257


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant