CN113961694A - Conference-based auxiliary analysis method and system for operation condition of each company unit

Publication number
CN113961694A
CN113961694A CN202111105581.1A CN202111105581A CN113961694A CN 113961694 A CN113961694 A CN 113961694A CN 202111105581 A CN202111105581 A CN 202111105581A CN 113961694 A CN113961694 A CN 113961694A
Authority
CN
China
Prior art keywords
data
task
conference
subject
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111105581.1A
Other languages
Chinese (zh)
Inventor
杨梦琳
周峰
杨迪
梁懿
彭放
陈红
赵鹏
闫崇峰
陈雪萍
翁贞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Big Data Center Of State Grid Corp Of China
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Shandong Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
State Grid Shanghai Electric Power Co Ltd
Weifang Power Supply Co of State Grid Shandong Electric Power Co Ltd
Fujian Yirong Information Technology Co Ltd
Original Assignee
Big Data Center Of State Grid Corp Of China
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Shandong Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
State Grid Shanghai Electric Power Co Ltd
Weifang Power Supply Co of State Grid Shandong Electric Power Co Ltd
Fujian Yirong Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Big Data Center Of State Grid Corp Of China, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, State Grid Shandong Electric Power Co Ltd, State Grid Fujian Electric Power Co Ltd, State Grid Shanghai Electric Power Co Ltd, Weifang Power Supply Co of State Grid Shandong Electric Power Co Ltd, Fujian Yirong Information Technology Co Ltd
Priority to CN202111105581.1A
Publication of CN113961694A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a conference-based auxiliary analysis method and system for the operation condition of each company unit, and relates to the technical field of document analysis. The method comprises a word2vec model training process, a subject word library construction process, a subject word-task association process, a subject word-conference association process and a task-conference association process. Based on the conference data of each unit, the embodiment of the invention analyzes the work progress of each unit in combination with work tasks and subject labels, so as to understand how the actual business of subordinate units is developing and to promote comprehensive control over the specific working conditions of subordinate units and process tracking of the execution of key work. The embodiment of the invention mainly applies text mining, natural language processing, machine learning, deep learning and similar technologies to analyze, on the basis of the conference data, the completion of the key work tasks and special work tasks of each unit, and improves the ability of conference data to assist company decision-making.

Description

Conference-based auxiliary analysis method and system for operation condition of each company unit
Technical Field
The invention relates to the technical field of document analysis, in particular to a conference-based auxiliary analysis method and system for the operation condition of each company unit.
Background
Conferences are an important vehicle for driving government affairs and enterprise business activities, and a hub for important business activities. The main information and related content of a conference can serve as important evidence of how an enterprise carries out higher-level policies and the company's special work.
In the prior art, intelligent association analysis of conference data mainly analyzes traditional conference information through keyword extraction and clustering. It lacks association relations across multiple related topics and levels, and cannot reflect, from a global level, the association between a company's key work tasks and subject labels.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a conference-based auxiliary analysis method and system for the operation condition of each company unit, which analyzes conference information in combination with the company's key work tasks and subject labels and mines the internal associations among the conferences, the key work tasks and the subject labels, thereby improving the application value of conference data analysis.
In a first aspect, the present invention provides a conference-based method for assisting in analyzing the operation conditions of each company unit, including:
word2vec model training process: filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
subject word library construction process: first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
subject word-task association process: importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis on the subject word library and the task data to obtain subject word-task association data;
subject word-conference association process: importing network province conference data, and performing association analysis on the subject word library and the network province conference data to obtain subject word-conference association data;
task-conference association process: performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data, classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
Further, the word2vec model training process specifically comprises: first filtering the office system data against a user-defined stop word library, then performing jieba word segmentation and combining the result with a user-defined term library to obtain the segmented text of the office system data, and training on the segmented text with the word2vec algorithm to obtain the word2vec model.
Further, in the subject word library construction process, the extraction of the subject words in the headquarter task data to form the subject word set is specifically realized through the TF-IDF algorithm, the import of a hot word library, and manual curation.
In a second aspect, the present invention provides a conference-based auxiliary analysis system for the operation condition of each company unit, comprising: a word2vec model training module, a subject word library construction module, a subject word-task association module, a subject word-conference association module and a task-conference association module;
the word2vec model training module is used for filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
the subject word library construction module is used for first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
the subject word-task association module is used for importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis on the subject word library and the task data to obtain subject word-task association data;
the subject word-conference association module is used for importing network province conference data and performing association analysis on the subject word library and the network province conference data to obtain subject word-conference association data;
the task-conference association module is used for performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data, classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
Further, the word2vec model training module is specifically configured to: first filter the office system data against a user-defined stop word library, then perform jieba word segmentation and combine the result with a user-defined term library to obtain the segmented text of the office system data, and train on the segmented text with the word2vec algorithm to obtain the word2vec model.
Further, in the subject word library construction module, the extraction of the subject words in the headquarter task data to form the subject word set is specifically realized through the TF-IDF algorithm, the import of a hot word library, and manual curation.
The technical scheme provided by the embodiment of the invention has the following technical effects or advantages:
by constructing a subject word library and mining the internal associations between subject words and tasks (headquarter tasks and network province tasks), between subject words and conferences, and between tasks and conferences, the association between the company's key work tasks and subject labels is reflected from a global level. This provides an important basis for supporting the company's auxiliary decision-making and a reference for other similar data analysis scenarios, thereby improving the application value of conference data analysis.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
The invention is further described below through embodiments, with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of the technical route for task-conference association according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a system according to a second embodiment of the present invention.
Detailed Description
The technical scheme in the embodiment of the invention has the following general idea:
Based on the conference data of each unit, the embodiment of the invention analyzes the work progress of each unit in combination with work tasks and subject labels, so as to understand how the actual business of subordinate units is developing and to promote comprehensive control over the specific working conditions of subordinate units and process tracking of the execution of key work. The embodiment of the invention mainly applies text mining, natural language processing, machine learning, deep learning and similar technologies to analyze, on the basis of the conference data, the completion of the key work tasks and special work tasks of each unit, and improves the ability of conference data to assist company decision-making.
The embodiment of the invention takes conference information as its basis, analyzes it in combination with the company's key work tasks and subject labels, and applies machine learning and deep learning algorithms to mine the internal associations among conferences, key work tasks and subject labels, thereby improving the application value of conference data analysis.
The technology mainly involves constructing a subject word library and mining the internal associations between subject words and tasks (headquarter tasks and network province tasks), between subject words and conferences, and between tasks and conferences. It mainly uses techniques such as manual curation and the word2vec algorithm to extract the subject words and keywords (attributes of the subject words) from the headquarter task data, and then analyzes the associations between subject words and tasks, between subject words and conferences, and between tasks and conferences, to finally obtain the conference data associated with each task, the headquarter key tasks, and the network province special tasks.
Embodiment one
Referring to FIG. 1 and FIG. 2, this embodiment provides a conference-based auxiliary analysis method for the operation condition of each company unit, comprising:
S1, word2vec model training process: filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
word2vec is a group of related models used to generate word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. The network takes a word as input and predicts the words in adjacent positions; under the bag-of-words assumption in word2vec, the order of these words is unimportant. After training is completed, the word2vec model can map each word to a vector that can be used to represent word-to-word relationships; this vector is taken from the hidden layer of the neural network.
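The prediction of adjacent words described above can be illustrated by generating the (input word, adjacent word) training pairs that such a network would learn from. This is a minimal sketch that assumes the skip-gram variant of word2vec, which the description suggests but the source does not name; the function name `skipgram_pairs`, the window size and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (input word, adjacent word) pairs for one segmented sentence."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical segmented sentence; real input would be segmented office system text.
sentence = ["power", "grid", "safety", "inspection"]
pairs = skipgram_pairs(sentence, window=1)
print(pairs)
```

Within the window, every neighbour is treated the same regardless of whether it precedes or follows the input word, which is the sense in which word order is unimportant.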
In a specific embodiment, the word2vec model training process further includes: first filtering the office system data against a user-defined stop word library, then performing jieba word segmentation and combining the result with a user-defined term library to obtain the segmented text of the office system data, and training on the segmented text with the word2vec algorithm to obtain the word2vec model.
The user-defined stop word library is used to filter out useless information. The user-defined term library is a library of professional subject terms defined by the user; it enlarges the subject library and diversifies the subject terms it contains, so that the subsequent association operations are more accurate.
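The preprocessing that precedes training can be sketched as follows. This is an illustrative sketch only: a whitespace tokenizer stands in for jieba so the example has no third-party dependency, the user-defined terms are protected by joining them with underscores (an assumption about what "combining with the term library" achieves), and the names `prepare_corpus` and `whitespace_tokenize` and the sample data are hypothetical.

```python
from typing import Callable, Iterable, List, Set


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in for a real Chinese word segmenter such as jieba.
    return text.split()


def prepare_corpus(
    documents: Iterable[str],
    stop_words: Set[str],
    custom_terms: Set[str],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> List[List[str]]:
    """Segment each document, keep custom terms whole, and drop stop words."""
    corpus = []
    for doc in documents:
        # Longest terms first so that a longer term is not split by a shorter one.
        for term in sorted(custom_terms, key=len, reverse=True):
            doc = doc.replace(term, term.replace(" ", "_"))
        tokens = [t for t in tokenize(doc) if t not in stop_words]
        if tokens:
            corpus.append(tokens)
    return corpus


docs = [
    "the smart grid inspection of the substation",
    "the meeting on smart grid safety",
]
corpus = prepare_corpus(docs, stop_words={"the", "of", "on"}, custom_terms={"smart grid"})
print(corpus)
```

The resulting list of token lists is the kind of segmented text that a word2vec implementation would then be trained on.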
S2, subject word library construction process: first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
In a specific embodiment, in the subject word library construction process, the extraction of the subject words in the headquarter task data to form the subject word set is specifically realized through the TF-IDF algorithm, the import of a hot word library, and manual curation.
TF-IDF (term frequency-inverse document frequency) is a weighting technique commonly used in information retrieval and data mining. TF is the term frequency, and IDF is the inverse document frequency. The main idea of TF-IDF is: if a word or phrase appears frequently in one article (high TF) and rarely in other articles, it is considered to discriminate well between categories and is suitable for classification.
TF-IDF is a statistical method for evaluating how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases as the frequency with which it appears across the corpus increases. Search engines often apply various forms of TF-IDF weighting as a measure or rating of the degree of relevance between a document and a user query.
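The weighting described above can be made concrete with a small computation. The source does not say which TF-IDF variant is used, so this sketch assumes the plain form tf(w, d) × ln(N / df(w)) without smoothing; the function name `tf_idf` and the sample task texts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score every word of every segmented document with tf * ln(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each word once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append(
            {w: (c / total) * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}
        )
    return scores


# Hypothetical segmented task descriptions.
tasks = [
    ["grid", "safety", "inspection", "safety"],
    ["grid", "digital", "transformation"],
    ["grid", "meeting", "summary"],
]
scores = tf_idf(tasks)
top_word = max(scores[0], key=scores[0].get)
print(top_word, scores[0])
```

In this toy corpus "grid" appears in every document, so its score is zero, while "safety" is frequent in the first document and absent elsewhere, which makes it the natural subject word candidate for that document.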
On the basis of the word2vec model and the TF-IDF algorithm, a hot word library and manual curation are introduced to supplement and correct the word set, so as to ensure the correctness of the subsequent association steps.
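The keyword step of S2 can be sketched as a nearest-neighbour lookup in the word vector space. The source only says that keywords are obtained with the trained word2vec model plus manual review, so the use of cosine similarity, the similarity threshold, the toy three-dimensional vectors and the names `cosine` and `related_keywords` are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_keywords(
    subject: str, vectors: Dict[str, List[float]], top_n: int = 2, min_sim: float = 0.8
) -> List[str]:
    """Return up to top_n words whose vectors are closest to the subject word."""
    target = vectors[subject]
    scored = [(w, cosine(target, v)) for w, v in vectors.items() if w != subject]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [w for w, sim in scored[:top_n] if sim >= min_sim]


# Toy vectors standing in for the output of a trained word2vec model.
vectors = {
    "safety": [0.9, 0.1, 0.0],
    "inspection": [0.8, 0.2, 0.1],
    "hazard": [0.85, 0.05, 0.1],
    "budget": [0.0, 0.1, 0.9],
}
keywords = related_keywords("safety", vectors)
subject_word_library = {"safety"} | set(keywords)
print(keywords, subject_word_library)
```

The candidates returned this way would still pass through the manual review the text describes before being merged into the library.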
S3-1, subject word-task association process: importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis (for example, taking the intersection) on the subject word library and the task data to obtain subject word-task association data (for example, displayed as an association tree diagram);
S3-2, subject word-conference association process: importing network province conference data, and performing association analysis (for example, taking the intersection) on the subject word library and the network province conference data to obtain subject word-conference association data (for example, displayed as an association tree diagram);
S4, task-conference association process: performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data (the conference data corresponding to each task), classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
The task-conference association data comprises the conference data corresponding to each task. For any task in the task-conference association data, whether it is a headquarter key task or a network province special task can be determined by whether the subject word associated with the conference is associated with both a headquarter task and a network province task. The content of the conferences associated with the task can then be obtained and analyzed to determine the progress of the task.
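Steps S3-1, S3-2 and S4 can be sketched together as set operations. This is one possible reading of the text, not the patented implementation: each task or conference is represented by the set of words in its segmented text, association is the intersection with the subject word library, and a task that would qualify under both rules is treated as a headquarter key task (an assumption, since the source does not address that case). All names and sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Records = Dict[str, Set[str]]  # record id -> words of its segmented text


def associate(library: Set[str], records: Records) -> Dict[str, Set[str]]:
    """Map each subject word to the ids of the records that contain it."""
    links = defaultdict(set)
    for rec_id, words in records.items():
        for word in library & words:  # intersection with the library
            links[word].add(rec_id)
    return dict(links)


def task_conference_links(
    library: Set[str], hq_tasks: Records, province_tasks: Records, conferences: Records
) -> Tuple[Dict[str, Set[str]], Set[str], Set[str]]:
    hq = associate(library, hq_tasks)
    prov = associate(library, province_tasks)
    conf = associate(library, conferences)
    task_to_conf = defaultdict(set)
    key_tasks, special_tasks = set(), set()
    for word, conf_ids in conf.items():
        hq_ids, prov_ids = hq.get(word, set()), prov.get(word, set())
        for task_id in hq_ids | prov_ids:
            task_to_conf[task_id] |= conf_ids  # conferences sharing the subject word
        if hq_ids and prov_ids:
            key_tasks |= hq_ids | prov_ids  # word links both task levels
        elif prov_ids:
            special_tasks |= prov_ids  # word links network province tasks only
    # Assumption: the headquarter key classification takes precedence.
    return dict(task_to_conf), key_tasks, special_tasks - key_tasks


library = {"safety", "digital", "flood"}
hq_tasks = {"HQ-1": {"safety", "inspection"}, "HQ-2": {"digital", "platform"}}
province_tasks = {"P-1": {"safety", "training"}, "P-2": {"flood", "control"}}
conferences = {"C-1": {"safety", "review"}, "C-2": {"flood", "drill"}, "C-3": {"budget"}}

task_to_conf, key_tasks, special_tasks = task_conference_links(
    library, hq_tasks, province_tasks, conferences
)
print(task_to_conf, key_tasks, special_tasks)
```

Here "HQ-2" has no associated conference and therefore does not appear in the result, which would signal that no meeting content is available for tracking that task.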
Based on the same inventive concept, the application also provides a system corresponding to the method of the first embodiment, which is detailed in the second embodiment.
Embodiment two
In this embodiment, a conference-based auxiliary analysis system for the operation condition of each company unit is provided, as shown in FIG. 3, comprising: a word2vec model training module, a subject word library construction module, a subject word-task association module, a subject word-conference association module and a task-conference association module;
the word2vec model training module is used for filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
the subject word library construction module is used for first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
the subject word-task association module is used for importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis on the subject word library and the task data to obtain subject word-task association data;
the subject word-conference association module is used for importing network province conference data and performing association analysis on the subject word library and the network province conference data to obtain subject word-conference association data;
the task-conference association module is used for performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data, classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
In a specific embodiment, the word2vec model training module is specifically configured to: first filter the office system data against a user-defined stop word library, then perform jieba word segmentation and combine the result with a user-defined term library to obtain the segmented text of the office system data, and train on the segmented text with the word2vec algorithm to obtain the word2vec model.
In a specific embodiment, in the subject word library construction module, the extraction of the subject words in the headquarter task data to form the subject word set is specifically realized through the TF-IDF algorithm, the import of a hot word library, and manual curation.
Since the system described in the second embodiment of the present invention is the system used to implement the method of the first embodiment, a person skilled in the art can, based on the method described in the first embodiment, understand the specific structure and variations of the system, so the details are not repeated here. All systems used to carry out the method of the first embodiment of the present invention fall within the protection scope of the present invention.
The technical scheme provided by the embodiment of the invention has the following technical effects or advantages:
by constructing a subject word library and mining the internal associations between subject words and tasks (headquarter tasks and network province tasks), between subject words and conferences, and between tasks and conferences, the association between the company's key work tasks and subject labels is reflected from a global level. This provides an important basis for supporting the company's auxiliary decision-making and a reference for other similar data analysis scenarios, thereby improving the application value of conference data analysis.
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (6)

1. A conference-based auxiliary analysis method for operation conditions of each company unit is characterized by comprising the following steps:
word2vec model training process: filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
subject word library construction process: first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
subject word-task association process: importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis on the subject word library and the task data to obtain subject word-task association data;
subject word-conference association process: importing network province conference data, and performing association analysis on the subject word library and the network province conference data to obtain subject word-conference association data;
task-conference association process: performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data, classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
2. The method of claim 1, wherein the word2vec model training process specifically comprises: first filtering the office system data against a user-defined stop word library, then performing jieba word segmentation and combining the result with a user-defined term library to obtain the segmented text of the office system data, and training on the segmented text with the word2vec algorithm to obtain the word2vec model.
3. The method according to claim 1 or 2, characterized in that: in the subject word library construction process, the extraction of the subject words in the headquarter task data to form the subject word set is specifically realized through the TF-IDF algorithm, the import of a hot word library, and manual curation.
4. A conference-based auxiliary analysis system for the operation condition of each company unit, characterized by comprising: a word2vec model training module, a subject word library construction module, a subject word-task association module, a subject word-conference association module and a task-conference association module;
the word2vec model training module is used for filtering office system data, segmenting it into words, combining it with a user-defined term library, and training with the word2vec algorithm to obtain a word2vec model;
the subject word library construction module is used for first extracting the subject words in the headquarter task data to form a subject word set, then obtaining the keywords associated with each subject word by combining the trained word2vec model with manual curation to form a keyword set, and finally merging the subject word set and the keyword set to obtain a subject word library;
the subject word-task association module is used for importing task data, wherein the task data comprises headquarter task data and network province task data, and performing association analysis on the subject word library and the task data to obtain subject word-task association data;
the subject word-conference association module is used for importing network province conference data and performing association analysis on the subject word library and the network province conference data to obtain subject word-conference association data;
the task-conference association module is used for performing association analysis on the subject word-task association data and the subject word-conference association data, summarizing the task data and the conference data associated with the same subject word into task-conference association data, classifying the tasks corresponding to subject words of conferences that are associated with both a headquarter task and a network province task as headquarter key tasks, and classifying the tasks corresponding to subject words of conferences that are associated only with a network province task as network province special tasks.
5. The system of claim 4, wherein: the word2vec model training module is further specifically configured to: the method comprises the steps of firstly importing office system data into a user-defined stop word bank for filtering, then performing jieba word segmentation, merging the office system data with a user-defined term bank to obtain office system data word segmentation texts, and training the office system data word segmentation texts through a word2vec algorithm to obtain a word2vec model.
6. The system according to claim 4 or 5, characterized in that: in the topic word library construction module, extracting the topic words in the headquarter task data to form the topic word set is specifically realized through a TF-IDF algorithm, hot word bank import, and manual combing.
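As an illustrative sketch only, and not part of the claims, the classification performed by the task-conference association module might be expressed in Python as follows. The claims do not specify how the association analysis is carried out, so plain substring matching is assumed here; the function names `associate` and `classify_topics`, the data layout, and the sample records are likewise hypothetical.

```python
from collections import defaultdict


def associate(topic_library, documents):
    """Map each topic word to the ids of documents mentioning it or one of its keywords.

    Assumption: "association" is approximated by substring matching.
    """
    hits = defaultdict(set)
    for topic, keywords in topic_library.items():
        terms = {topic, *keywords}
        for doc_id, text in documents.items():
            if any(term in text for term in terms):
                hits[topic].add(doc_id)
    return dict(hits)


def classify_topics(topic_library, hq_tasks, province_tasks, conferences):
    """Label conference-associated topic words by the kinds of task they touch."""
    hq = associate(topic_library, hq_tasks)
    prov = associate(topic_library, province_tasks)
    conf = associate(topic_library, conferences)

    result = {}
    for topic in topic_library:
        if topic not in conf:
            continue  # topic word never appears in conference data
        hq_hits = hq.get(topic, set())
        prov_hits = prov.get(topic, set())
        if hq_hits and prov_hits:
            label = "headquarter key task"
        elif prov_hits:
            label = "network province special task"
        else:
            continue  # the claim defines no category for this case
        result[topic] = {
            "label": label,
            "tasks": sorted(hq_hits | prov_hits),
            "conferences": sorted(conf[topic]),
        }
    return result


# Hypothetical sample data: topic word -> associated keywords, and id -> text.
topic_library = {"safety": ["inspection"], "billing": ["tariff"], "storm": ["typhoon"]}
hq_tasks = {"HQ-1": "strengthen safety management"}
province_tasks = {
    "P-1": "carry out inspection of substations",
    "P-2": "typhoon emergency response",
}
conferences = {"M-1": "discussed safety and typhoon preparations"}

result = classify_topics(topic_library, hq_tasks, province_tasks, conferences)
```

With this sample data, "safety" is labelled a headquarter key task because it matches a headquarter task, a network province task, and a conference. "storm" is labelled a network province special task because it matches only a network province task and a conference. "billing" is dropped because no conference mentions it.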
CN202111105581.1A 2021-09-22 2021-09-22 Conference-based auxiliary analysis method and system for operation condition of each company unit Pending CN113961694A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111105581.1A CN113961694A (en) 2021-09-22 2021-09-22 Conference-based auxiliary analysis method and system for operation condition of each company unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111105581.1A CN113961694A (en) 2021-09-22 2021-09-22 Conference-based auxiliary analysis method and system for operation condition of each company unit

Publications (1)

Publication Number Publication Date
CN113961694A (en) 2022-01-21

Family

ID=79461841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111105581.1A Pending CN113961694A (en) 2021-09-22 2021-09-22 Conference-based auxiliary analysis method and system for operation condition of each company unit

Country Status (1)

Country Link
CN (1) CN113961694A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170120389A (en) * 2016-04-21 2017-10-31 (주)원제로소프트 Method and system for managing total financial information
CN107861945A (en) * 2017-11-01 2018-03-30 平安科技(深圳)有限公司 Finance data analysis method, application server and computer-readable recording medium
CN108538286A (en) * 2017-03-02 2018-09-14 腾讯科技(深圳)有限公司 A kind of method and computer of speech recognition
CN108595593A (en) * 2018-04-19 2018-09-28 南京大学 Meeting research hotspot based on topic model and development trend information analysis method
CN109800429A (en) * 2019-01-04 2019-05-24 平安科技(深圳)有限公司 Topics Crawling method, apparatus and storage medium, computer equipment
CN110705285A (en) * 2019-09-20 2020-01-17 北京市计算中心 Government affair text subject word bank construction method, device, server and readable storage medium
CN112749279A (en) * 2021-01-18 2021-05-04 南京中新赛克科技有限责任公司 Subject term extraction method based on text clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吕鹏飞; 王春宁; 周峰; 朱月琴: "Application of literature-based knowledge discovery in the field of metallogenic prediction", China Mining Magazine (中国矿业), no. 09, 15 September 2017 (2017-09-15) *

Similar Documents

Publication Publication Date Title
CN107153641B (en) Comment information determination method, comment information determination device, server and storage medium
Hammad et al. An approach for detecting spam in Arabic opinion reviews
Zhu et al. Mobile app classification with enriched contextual information
CN107784092A (en) A kind of method, server and computer-readable medium for recommending hot word
CN111325029A (en) Text similarity calculation method based on deep learning integration model
CN110162632B (en) Method for discovering news special events
CN113962293B (en) LightGBM classification and representation learning-based name disambiguation method and system
CN112989208B (en) Information recommendation method and device, electronic equipment and storage medium
WO2015084404A1 (en) Matching of an input document to documents in a document collection
CN114722137A (en) Security policy configuration method and device based on sensitive data identification and electronic equipment
CN113434636A (en) Semantic-based approximate text search method and device, computer equipment and medium
Krishnaraj et al. Conceptual semantic model for web document clustering using term frequency
CN114840677A (en) Short text classification and intelligent analysis system for multi-granularity requirements
CN107908749B (en) Character retrieval system and method based on search engine
Tian et al. Research of product ranking technology based on opinion mining
CN111767730B (en) Event type identification method and device
CN114298020A (en) Keyword vectorization method based on subject semantic information and application thereof
CN113961694A (en) Conference-based auxiliary analysis method and system for operation condition of each company unit
CN113934910A (en) Automatic optimization and updating theme library construction method and hot event real-time updating method
Shinde et al. Pattern discovery techniques for the text mining and its applications
Chen et al. FAQ system in specific domain based on concept hierarchy and question type
Ramachandran et al. Document Clustering Using Keyword Extraction
Wang et al. Ontology-assisted deep Web source selection
Yu et al. Interpretative topic categorization via deep multiple instance learning
Shi et al. Automatic Search Method for Unstructured Database of Natural Language Instructions Combining Cross-Step Matching Word Segmentation Algorithm and Web Text Clustering Analysis Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination