CN112307210A - Document tag prediction method, system, medium and electronic device - Google Patents
- Publication number
- CN112307210A (application CN202011232409.8A)
- Authority
- CN
- China
- Prior art keywords
- document
- predicted
- keywords
- keyword
- original
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a document label prediction method, system, medium and electronic device. The method comprises: extracting keywords from an original document to obtain a keyword set of the original document; classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords; labeling the classification system of the documents corresponding to the keywords to obtain a training data set; inputting the training data set into a document classification neural network for training to obtain a document label prediction model; and inputting a document to be predicted into the document label prediction model to perform label prediction on the document to be predicted. By processing the original document to obtain the document label prediction model and inputting the document to be predicted into the trained model, the method achieves label prediction for the document to be predicted, with a high matching degree between document and label, convenient implementation, and high accuracy.
Description
Technical Field
The present invention relates to the field of electronics, and in particular, to a method, a system, a medium, and an electronic device for predicting a document tag.
Background
Text is a carrier of human civilization and contains a large amount of valuable information, but it is typical unstructured data, which makes labeling text content and applying those labels very difficult. At present, labels are usually added to documents manually, so the labels match the document content poorly, accuracy is low, and working efficiency is low.
Disclosure of Invention
The invention provides a document tag prediction method, a system, a medium and an electronic device, which aim to solve the problems that tags are not convenient to add to documents and the matching degree is low in the prior art.
The document tag prediction method provided by the invention comprises the following steps:
extracting keywords according to an original document to obtain a keyword set of the original document;
classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords;
labeling a classification system of the document corresponding to the keyword to obtain a training data set;
inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
and inputting the document to be predicted into the document label prediction model, and performing label prediction on the document to be predicted.
Optionally, the step of obtaining the training data set includes:
acquiring associated vocabularies of the keywords in the keyword set, wherein the associated vocabularies and the keywords in the keyword set have a hypernym-hyponym (superordinate-subordinate) relation;
classifying original documents according to keywords in a keyword set and the associated vocabularies to obtain a document classification system, and taking the keywords in the keyword set and the associated vocabularies as associated keywords in the document classification system;
and labeling the document classification system to obtain the training data set.
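The grouping of keywords and their associated vocabularies into categories can be sketched as follows. This is a minimal illustration only: the `HYPERNYMS` mapping and the function name `build_classification_system` are hypothetical, and the text does not specify how the hypernym-hyponym relations are actually mined.

```python
from collections import defaultdict

# Hypothetical hyponym -> hypernym relations; the source does not say how they are mined.
HYPERNYMS = {
    "gearbox": "automobile",
    "engine": "automobile",
    "subsidy": "policy",
}

def build_classification_system(keywords, hypernyms):
    """Group each keyword under its top-level hypernym (or itself if it has none)."""
    system = defaultdict(set)
    for kw in keywords:
        root = kw
        seen = {root}
        # Walk up the hypernym chain, guarding against cycles.
        while root in hypernyms and hypernyms[root] not in seen:
            root = hypernyms[root]
            seen.add(root)
        # The keyword and its hypernym both become associated keywords of the category.
        system[root].update({kw, root})
    return {category: sorted(words) for category, words in system.items()}

system = build_classification_system(["gearbox", "engine", "subsidy", "news"], HYPERNYMS)
```

Each resulting category name, together with its associated keywords, could then serve as the label attached to the original documents of that category when the training data set is built.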
Optionally, the step of extracting the keyword according to the original document includes:
acquiring an original document;
performing word segmentation on the original document, acquiring a first original vocabulary set, and further acquiring word frequency of vocabularies in the first original vocabulary set;
determining irrelevant words according to the word frequency of the words in the first original vocabulary set, and further acquiring a stop word set;
extracting keywords according to an original document to obtain a keyword set of the original document;
and screening stop words in the keyword set according to the stop word set, and further determining the keyword set.
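One possible reading of the stop word steps above is sketched below. It assumes the documents have already been segmented into tokens, and it assumes that "irrelevant" words are those whose share of all tokens exceeds a threshold; the threshold `max_share` and both function names are hypothetical, since the text does not state the rule.

```python
from collections import Counter

def build_stop_word_set(tokenized_docs, max_share=0.2):
    """Treat words whose share of all tokens exceeds max_share as stop words (assumed rule)."""
    counts = Counter(tok for tokens in tokenized_docs for tok in tokens)
    total = sum(counts.values())
    return {word for word, count in counts.items() if count / total > max_share}

def filter_keywords(keywords, stop_words):
    """Screen the candidate keywords against the stop word set."""
    return [kw for kw in keywords if kw not in stop_words]

docs = [
    ["the", "gearbox", "of", "the", "car"],
    ["the", "policy", "of", "subsidy"],
    ["the", "car", "engine"],
]
stop_words = build_stop_word_set(docs)
keywords = filter_keywords(["the", "gearbox", "policy"], stop_words)
```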
Optionally, the step of performing label prediction on the document to be predicted includes:
performing word segmentation on the document to be predicted and removing stop words so as to obtain a word set to be predicted;
vectorizing the vocabulary to be predicted to obtain the vectorized vocabulary to be predicted;
vectorizing the document to be predicted according to the vectorized vocabulary to be predicted to obtain a document vector to be predicted;
and inputting the document vector to be predicted into the document label prediction model for training, and performing label prediction on the document to be predicted.
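The vectorization steps above can be illustrated with a small sketch. The text does not say how word vectors are combined into a document vector; averaging is used here purely as an assumption, and `WORD_VECTORS`, `STOP_WORDS` and `vectorize_document` are hypothetical names.

```python
# Hypothetical 3-dimensional word vectors; a real system would load trained embeddings.
WORD_VECTORS = {
    "gearbox": [1.0, 0.0, 0.0],
    "engine": [0.0, 1.0, 0.0],
    "policy": [0.0, 0.0, 1.0],
}
STOP_WORDS = {"the", "of"}

def vectorize_document(tokens, word_vectors, stop_words, dim=3):
    """Average the vectors of the non-stop-word tokens that have a known vector."""
    vectors = [word_vectors[t] for t in tokens
               if t not in stop_words and t in word_vectors]
    if not vectors:
        return [0.0] * dim  # no usable words: fall back to a zero vector
    return [sum(component) / len(vectors) for component in zip(*vectors)]

doc_vector = vectorize_document(
    ["the", "gearbox", "of", "the", "engine"], WORD_VECTORS, STOP_WORDS
)
```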
Optionally, the step of inputting the document vector to be predicted into the document tag prediction model for training, and the step of performing tag prediction on the document to be predicted includes:
extracting keywords from the document to be predicted according to the vector of the document to be predicted, and matching the obtained keywords with associated keywords in different categories to obtain a matching result;
classifying and labeling the documents to be predicted according to the matching result to obtain the categories of the documents to be predicted;
and performing label prediction on the document to be predicted according to the associated keywords corresponding to the category of the document to be predicted.
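The matching step above can be sketched as a simple overlap count between the keywords of the document to be predicted and the associated keywords of each category. Overlap counting is an assumption made for illustration; the text does not define the matching measure, and `CATEGORY_KEYWORDS` and `classify_by_keywords` are hypothetical names.

```python
# Hypothetical associated keywords per category.
CATEGORY_KEYWORDS = {
    "automobile": ["automobile", "gearbox", "engine"],
    "policy": ["policy", "subsidy"],
}

def classify_by_keywords(doc_keywords, category_keywords):
    """Return the category with the largest keyword overlap, or None if nothing matches."""
    doc_set = set(doc_keywords)
    overlaps = {cat: len(doc_set & set(words))
                for cat, words in category_keywords.items()}
    best = max(overlaps, key=overlaps.get)
    return (best if overlaps[best] > 0 else None), overlaps

category, overlaps = classify_by_keywords(
    ["gearbox", "engine", "subsidy"], CATEGORY_KEYWORDS
)
```

The associated keywords of the returned category are then the candidates from which labels for the document are predicted.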
Optionally, the step of performing label prediction on the document to be predicted according to the associated keyword corresponding to the category of the document to be predicted includes:
acquiring the weight of the associated keywords corresponding to the category of the document to be predicted;
acquiring the scores of the associated keywords according to the weights of the associated keywords and the word frequencies of the associated keywords in the original document;
performing label prediction on the document to be predicted according to the scores of the associated keywords;
the mathematical expression of the weight of the associated keyword corresponding to the category of the document to be predicted is obtained as follows:
wherein w is the weight of the associated keyword, n_word is the number of occurrences of the associated keyword, and n_doc is the number of original documents corresponding to the associated keywords of the same category.
Optionally, the step of performing label prediction on the document to be predicted according to the score of the associated keyword includes:
when the score of the associated keyword is larger than the preset score threshold value, performing label prediction on the document to be predicted, wherein the mathematical expression of obtaining the score of the associated keyword is as follows:
wherein s is the association score, n is the number of high-frequency words, w_i is the weight corresponding to the i-th high-frequency word, x_i is the word frequency of the i-th high-frequency word, and i is the serial number of the word.
The invention also provides a document tag prediction system, comprising:
the preprocessing module is used for extracting keywords according to an original document to obtain a keyword set of the original document; classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords;
the processing module is used for labeling the classification system of the document corresponding to the keyword to obtain a training data set; inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
the prediction module is used for inputting the document to be predicted into the document label prediction model and performing label prediction on the document to be predicted; the preprocessing module, the processing module and the prediction module are connected.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method as defined in any one of the above.
The present invention also provides an electronic terminal, comprising: a processor and a memory;
the memory is adapted to store a computer program and the processor is adapted to execute the computer program stored by the memory to cause the terminal to perform the method as defined in any one of the above.
The invention has the beneficial effects that: by processing the original document to obtain the document label prediction model and inputting the document to be predicted into the trained document label prediction model, the document label prediction method achieves label prediction for the document to be predicted, with a high matching degree between document and label, convenient implementation, and high accuracy.
Drawings
FIG. 1 is a flow chart of a document tag prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a document tag prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a document tag prediction system in an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The inventor has found that text is typical unstructured data, which makes labeling text content and applying those labels very difficult. At present, labels are usually added to documents manually, so the labels match the document content poorly, accuracy is low, and working efficiency is low.
As shown in fig. 1, the document tag prediction method in the present embodiment includes:
S101: extracting keywords according to an original document to obtain a keyword set of the original document; wherein the original document includes text data such as news, policies, and comments, and the content of the original document can be adjusted according to the application scenario, for example: when label prediction and/or recommendation is performed on policy files, related policy files can be used as the original documents, which increases the relevance of the original documents and improves the matching degree between the predicted labels and the document to be predicted;
according to the original document, the step of extracting the key words comprises the following steps:
acquiring the word frequency and the inverse document frequency of the vocabulary in the original document, acquiring the weight corresponding to the vocabulary in the original document according to the word frequency and the inverse document frequency of the vocabulary in the original document, and extracting the keywords according to the acquired weight, wherein the mathematical expression of the weight corresponding to the vocabulary in the original document is as follows:
TF-IDF=TF×IDF
the TF-IDF is the weight of the vocabulary in the original document, the TF is the word frequency of the vocabulary in the original document, and the IDF is the inverse document frequency corresponding to the vocabulary in the original document;
the mathematical expressions for obtaining the word frequency and the inverse document frequency are as follows:
calculating to obtain the TF-IDF value of each word according to the calculation formula, and extracting keywords according to the TF-IDF value, for example: taking a vocabulary with a larger TF-IDF value as a keyword of the document;
S102: classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords, and labeling the classification system of the documents corresponding to the keywords to obtain a training data set;
the step of establishing a classification system comprises the following steps: merging identical and similar vocabularies to form different categories, and mining the hypernym-hyponym relations among the different categories, for example, "automobile" is a superordinate concept of "gearbox"; collecting the categories of different topics to form classification trees of different topics, and further forming a complete classification system;
S103: inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
S104: inputting a document to be predicted into the document label prediction model, and performing label prediction on the document to be predicted. Keywords are extracted from the high-frequency vocabulary set of the original document, and the keywords in the keyword set are classified and labeled to obtain the training data set, so that the training data set covers the data comprehensively and document classification is highly accurate. The training data set is input into the document classification neural network for training to obtain the document label prediction model, which can, through deep learning, classify input documents and predict their labels. Inputting the document to be predicted into the trained document label prediction model achieves label prediction for that document, with a high matching degree between document and label, convenient implementation, high accuracy, and low cost.
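The TF-IDF weighting used for keyword extraction in S101 can be sketched as follows. The expressions for TF and IDF are not reproduced in the text, so this sketch assumes the common textbook definitions: TF is the count of a word divided by the number of words in the document, and IDF is the logarithm of the number of documents divided by one plus the number of documents containing the word. The function names `tf_idf` and `top_keywords` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(tokenized_docs):
    """Return one {word: TF-IDF} dict per document (assumed textbook definitions)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each word once per document
    scores = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # TF = count / total; IDF = log(N / (1 + df))  (assumption)
            word: (count / total) * math.log(n_docs / (1 + doc_freq[word]))
            for word, count in counts.items()
        })
    return scores

def top_keywords(doc_scores, k=2):
    """Pick the k words with the largest TF-IDF value, breaking ties alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]

docs = [
    ["the", "gearbox", "the", "car"],
    ["the", "policy", "subsidy"],
    ["the", "car", "engine"],
]
scores = tf_idf(docs)
keywords = top_keywords(scores[0], k=1)
```

With this smoothing, a word that appears in every document receives a negative weight and is therefore never ranked as a keyword, which matches the intent of taking words with larger TF-IDF values as keywords.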
As shown in FIG. 2, a document tag prediction method in some embodiments includes:
S201: acquiring an original document, performing word segmentation on the original document to acquire a first original vocabulary set, and further acquiring the word frequency of the vocabulary in the first original vocabulary set;
S202: determining a keyword set of the original document according to the word frequency of the vocabulary in the first original vocabulary set, namely: determining irrelevant vocabulary according to the word frequency of the vocabulary in the first original vocabulary set so as to obtain a stop word set; extracting keywords according to the original document to obtain a keyword set of the original document; and screening the keyword set for stop words according to the stop word set, thereby determining the keyword set;
S203: acquiring the associated vocabularies of the keywords in the keyword set, wherein the associated vocabularies have a hypernym-hyponym relation with the keywords in the keyword set; classifying the original documents according to the keywords in the keyword set and the associated vocabularies to obtain a document classification system, and taking the keywords in the keyword set and the associated vocabularies as the associated keywords of the different categories in the document classification system; classifying the original documents by both the keywords and the associated vocabularies having a hypernym-hyponym relation with them improves the accuracy of classifying the original documents;
S204: labeling the document classification system to obtain the training data set;
in some embodiments, the document classification system is also labeled to obtain a test data set, and the test data set is input into the document label prediction model to test the model, thereby ensuring the prediction precision of the document label prediction model;
S205: building a document classification neural network based on deep learning, inputting the training data set into the document classification neural network for training, and obtaining a document label prediction model;
S206: inputting the document to be predicted into the document label prediction model, and performing label prediction on the document to be predicted, which comprises the following steps: performing word segmentation on the document to be predicted and removing stop words so as to obtain a word set to be predicted; vectorizing the vocabulary to be predicted to obtain the vectorized vocabulary to be predicted; vectorizing the document to be predicted according to the vectorized vocabulary to be predicted to obtain a document vector to be predicted; inputting the document vector to be predicted into the document label prediction model, which extracts keywords from the document to be predicted according to the document vector to be predicted and matches the obtained keywords with the associated keywords of the different categories to obtain a matching result; classifying and labeling the document to be predicted according to the matching result to obtain the category of the document to be predicted;
performing label prediction on the document to be predicted according to the associated keywords corresponding to the category of the document to be predicted; acquiring the weight of the associated keywords corresponding to the category of the document to be predicted; acquiring the scores of the associated keywords according to the weights of the associated keywords and the word frequencies of the associated keywords in the original document; performing label prediction on the document to be predicted according to the scores of the associated keywords;
when the score of the associated keyword is greater than the preset score threshold, performing label prediction on the document to be predicted, wherein the mathematical expression of the weight of the associated keyword corresponding to the category of the document to be predicted is obtained as follows:
wherein w is the weight of the associated keyword, n_word is the number of occurrences of the associated keyword, and n_doc is the number of original documents corresponding to the associated keywords in the same category; in some embodiments, the obtained weights may be further normalized to obtain the normalized weights of the associated keywords in the keyword sets of the corresponding categories;
the mathematical expression of the score of the associated keyword is:
wherein s is the association score, n is the number of high-frequency words, w_i is the weight corresponding to the i-th high-frequency word, x_i is the word frequency of the i-th high-frequency word, and i is the serial number of the word.
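The weight and score computation in S206 can be sketched as below. Neither expression is reproduced in the text, so the sketch rests on two loudly stated assumptions inferred only from the symbol definitions: that the weight is the ratio w = n_word / n_doc, and that the score is the weighted sum s = w_1·x_1 + ... + w_n·x_n. The function names, the example counts and the threshold value are all hypothetical.

```python
def keyword_weight(n_word, n_doc):
    # ASSUMPTION: w = n_word / n_doc; the source formula is not reproduced in the text.
    return n_word / n_doc

def normalize(weights):
    """Optional normalization so the weights within one category sum to 1."""
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}

def association_score(weights, word_freqs):
    # ASSUMPTION: s = sum over i of w_i * x_i.
    return sum(w * word_freqs.get(word, 0) for word, w in weights.items())

# Hypothetical statistics for the category "automobile" built from 4 original documents.
occurrences = {"gearbox": 6, "engine": 2}
n_doc = 4
weights = normalize({w: keyword_weight(n, n_doc) for w, n in occurrences.items()})

# Word frequencies of the associated keywords (hypothetical values).
word_freqs = {"gearbox": 2, "engine": 4}
score = association_score(weights, word_freqs)

SCORE_THRESHOLD = 2.0  # hypothetical preset score threshold
should_predict = score > SCORE_THRESHOLD
```

Under these assumptions, label prediction proceeds only when the score exceeds the preset threshold, so categories whose associated keywords are rare in the document do not contribute labels.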
The document tag prediction method provided by this embodiment may also be applied to application scenarios such as document search, personalized recommendation, and knowledge graph construction. For example, keywords are input into the document label prediction model for classification and matching, and documents with a high matching degree are selected for recommendation. Alternatively, a document to be predicted is input into the document label prediction model, the keywords of the document to be predicted are extracted, the feature vectors of those keywords are obtained and spliced into a single feature vector of the document to be predicted, and the spliced feature vector is input into the document label prediction model to perform label prediction on the document to be predicted.
In some embodiments, after label prediction is performed on the document to be predicted, a label recommendation model may also recommend labels for the document to be predicted; the user's feedback on the recommended labels is received, and the document label prediction model is updated according to the feedback content, so as to improve the accuracy of document label prediction. The label recommendation model recommends labels according to the label prediction results and serves to prompt and assist the user during labeling. The label recommendation model is obtained by acquiring one or more label prediction results and inputting them into a deep learning neural network for training.
As shown in fig. 3, the present embodiment also provides a document tag prediction system, including:
the preprocessing module is used for extracting keywords according to an original document to obtain a keyword set of the original document; classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords;
the processing module is used for labeling the classification system of the document corresponding to the keyword to obtain a training data set; inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
the prediction module is used for inputting the document to be predicted into the document label prediction model and performing label prediction on the document to be predicted; the preprocessing module, the processing module and the prediction module are connected. The system processes the original document to obtain the document label prediction model and inputs the document to be predicted into the trained model, thereby achieving label prediction and/or recommendation for the document to be predicted, with a high matching degree between document and label, convenient implementation, and high accuracy.
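How the three modules hand data to one another can be sketched as follows. This is an illustration of the module boundaries only: the class names are hypothetical, the categories of the original documents are assumed to be given, and a plain keyword lookup table stands in for the document classification neural network described in the text.

```python
from collections import Counter

class PreprocessingModule:
    """Extracts keywords from original documents and groups them by category."""
    def run(self, docs_by_category, top_k=2):
        # docs_by_category: {category: [token lists]}; categories assumed given here.
        system = {}
        for category, docs in docs_by_category.items():
            counts = Counter(tok for tokens in docs for tok in tokens)
            system[category] = [word for word, _ in counts.most_common(top_k)]
        return system

class ProcessingModule:
    """Labels the classification system and fits a model on it."""
    def run(self, system):
        # STAND-IN: a keyword -> category lookup replaces the neural network.
        return {word: category
                for category, words in system.items() for word in words}

class PredictionModule:
    """Predicts the category label of a tokenized document."""
    def run(self, model, tokens):
        votes = Counter(model[t] for t in tokens if t in model)
        return votes.most_common(1)[0][0] if votes else None

original_docs = {
    "automobile": [["gearbox", "engine", "gearbox"]],
    "policy": [["subsidy", "policy", "subsidy"]],
}
system = PreprocessingModule().run(original_docs)
model = ProcessingModule().run(system)
label = PredictionModule().run(model, ["new", "gearbox", "engine", "subsidy"])
```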
The present embodiment also provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements any of the methods in the present embodiments.
The present embodiment further provides an electronic terminal, including: a processor and a memory;
the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the method in the embodiment.
The computer-readable storage medium in the present embodiment can be understood by those skilled in the art as follows: all or part of the steps for implementing the above method embodiments may be performed by hardware associated with a computer program. The aforementioned computer program may be stored in a computer readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The electronic terminal provided by this embodiment comprises a processor, a memory, a transceiver and a communication interface. The memory and the communication interface are connected with the processor and the transceiver and communicate with each other; the memory is used for storing a computer program, the communication interface is used for communication, and the processor and the transceiver are used for running the computer program so that the electronic terminal executes the steps of the method.
In this embodiment, the memory may include a Random Access Memory (RAM), and may also include a non-volatile memory, such as at least one disk memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; the processor may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (10)
1. A document tag prediction method, comprising:
extracting keywords according to an original document to obtain a keyword set of the original document;
classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords;
labeling a classification system of the document corresponding to the keyword to obtain a training data set;
inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
and inputting the document to be predicted into the document label prediction model, and performing label prediction on the document to be predicted.
2. The document tag prediction method of claim 1, wherein the step of obtaining a training data set comprises:
acquiring associated vocabularies of the keywords in the keyword set, wherein the associated vocabularies and the keywords in the keyword set have a hypernym-hyponym (superordinate-subordinate) relation;
classifying original documents according to keywords in a keyword set and the associated vocabularies to obtain a document classification system, and taking the keywords in the keyword set and the associated vocabularies as associated keywords in the document classification system;
and labeling the document classification system to obtain the training data set.
3. The method of claim 1, wherein the step of extracting keywords from the original document comprises:
acquiring an original document;
performing word segmentation on the original document, acquiring a first original vocabulary set, and further acquiring word frequency of vocabularies in the first original vocabulary set;
determining irrelevant words according to the word frequency of the words in the first original vocabulary set, and further acquiring a stop word set;
extracting keywords according to an original document to obtain a keyword set of the original document;
and screening stop words in the keyword set according to the stop word set, and further determining the keyword set.
4. The document tag prediction method according to claim 1, wherein the step of performing tag prediction on the document to be predicted comprises:
performing word segmentation on the document to be predicted and removing stop words so as to obtain a word set to be predicted;
vectorizing the vocabulary to be predicted to obtain the vectorized vocabulary to be predicted;
vectorizing the document to be predicted according to the vectorized vocabulary to be predicted to obtain a document vector to be predicted;
and inputting the document vector to be predicted into the document label prediction model for training, and performing label prediction on the document to be predicted.
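The vectorization in claim 4 might look like the sketch below. The claim does not say how word vectors are combined into a document vector; averaging is assumed here, and the tiny `WORD_VECTORS` table and `STOP_WORDS` set are invented for illustration.

```python
# Hypothetical 3-dimensional word vectors, invented for illustration; in
# practice they would come from a trained embedding model.
WORD_VECTORS = {
    "bank": [1.0, 0.0, 0.0],
    "loan": [0.0, 1.0, 0.0],
    "rate": [0.0, 0.0, 1.0],
}
STOP_WORDS = {"the", "a", "of"}

def document_vector(tokens, word_vectors, stop_words, dim=3):
    """Average the vectors of known, non-stop-word tokens; zero vector if none."""
    vectors = [
        word_vectors[t]
        for t in tokens
        if t not in stop_words and t in word_vectors
    ]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the values of each dimension together.
    return [sum(component) / len(vectors) for component in zip(*vectors)]

doc_vec = document_vector(["the", "bank", "loan"], WORD_VECTORS, STOP_WORDS)
```

The resulting vector is what would be passed to the document label prediction model.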
5. The document tag prediction method according to claim 4, wherein the step of inputting the document vector to be predicted into the document label prediction model for training and performing label prediction on the document to be predicted comprises:
extracting keywords from the document to be predicted according to the vector of the document to be predicted, and matching the obtained keywords with associated keywords in different categories to obtain a matching result;
classifying and labeling the documents to be predicted according to the matching result to obtain the categories of the documents to be predicted;
and performing label prediction on the document to be predicted according to the associated keywords corresponding to the category of the document to be predicted.
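The matching and labeling steps of claim 5 can be illustrated with a small sketch. The matching rule (counting shared keywords) and the `CATEGORY_KEYWORDS` table are assumptions for the example; the claim only states that extracted keywords are matched against the associated keywords of each category.

```python
# Hypothetical associated keywords per category, invented for illustration.
CATEGORY_KEYWORDS = {
    "finance": {"bank", "loan", "stock"},
    "medicine": {"hospital", "drug", "vaccine"},
}

def match_categories(doc_keywords, category_keywords):
    """Count how many of the document's keywords each category shares."""
    return {cat: len(doc_keywords & words) for cat, words in category_keywords.items()}

def predict_labels(doc_keywords, category_keywords):
    """Pick the best-matching category and return its keywords found in the document."""
    matches = match_categories(doc_keywords, category_keywords)
    category = max(matches, key=matches.get)
    if matches[category] == 0:
        return None, []
    return category, sorted(doc_keywords & category_keywords[category])

category, labels = predict_labels({"loan", "stock", "weather"}, CATEGORY_KEYWORDS)
```

Here the category is decided first, and the candidate labels are restricted to that category's associated keywords, which mirrors the order of the claim's steps.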
6. The document tag prediction method according to claim 5, wherein the step of performing tag prediction on the document to be predicted according to the associated keywords corresponding to the category of the document to be predicted comprises:
acquiring the weight of the associated keywords corresponding to the category of the document to be predicted;
acquiring the scores of the associated keywords according to the weights of the associated keywords and the word frequencies of the associated keywords in the original document;
performing label prediction on the document to be predicted according to the scores of the associated keywords;
the mathematical expression of the weight of the associated keyword corresponding to the category of the document to be predicted is obtained as follows:
wherein $w$ is the weight of the associated keyword, $n_{word}$ is the number of occurrences of the associated keyword, and $n_{doc}$ is the number of original documents corresponding to the associated keywords of the same category.
7. The document tag prediction method according to claim 6, wherein the step of performing tag prediction on the document to be predicted according to the score of the associated keyword comprises:
when the score of the associated keyword is larger than the preset score threshold value, performing label prediction on the document to be predicted, wherein the mathematical expression of obtaining the score of the associated keyword is as follows:
wherein $s$ is the association score, $n$ is the number of high-frequency words, $w_i$ is the weight corresponding to the $i$-th high-frequency word, $x_i$ is the word frequency of the $i$-th high-frequency word, and $i$ is the serial number of the word.
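The expressions for the weight and the score in claims 6 and 7 are not reproduced in this text. The sketch below therefore takes the weights as given and assumes, purely for illustration, that the score is a plain weighted sum of word frequencies over the high-frequency words. The function names, the sample values, and the threshold are hypothetical.

```python
def association_score(weights, frequencies):
    """Assumed form: sum of weight * word frequency over the weighted words."""
    return sum(weights[w] * frequencies.get(w, 0) for w in weights)

def predict_if_confident(weights, frequencies, threshold):
    """Return the keywords present in the document as labels if the score
    exceeds the preset threshold; otherwise return an empty list."""
    score = association_score(weights, frequencies)
    if score <= threshold:
        return score, []
    labels = sorted(w for w in weights if frequencies.get(w, 0) > 0)
    return score, labels

# Illustrative values only: weights of associated keywords for one category
# and their word frequencies in a document.
weights = {"loan": 0.5, "stock": 0.25}
frequencies = {"loan": 4, "stock": 2}
score, labels = predict_if_confident(weights, frequencies, threshold=2.0)
```

If the actual expressions differ from a weighted sum, only `association_score` would need to change; the threshold check follows the wording of claim 7.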
8. A document tag prediction system, comprising:
the preprocessing module is used for extracting keywords according to an original document to obtain a keyword set of the original document; classifying the keywords in the keyword set to obtain a classification system of the documents corresponding to the keywords;
the processing module is used for labeling the classification system of the document corresponding to the keyword to obtain a training data set; inputting the training data set into a document classification neural network for training to obtain a document label prediction model;
the prediction module is used for inputting the document to be predicted into the document label prediction model and performing label prediction on the document to be predicted; the preprocessing module, the processing module and the prediction module are connected.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
10. An electronic terminal, comprising: a processor and a memory;
the memory is for storing a computer program and the processor is for executing the computer program stored by the memory to cause the terminal to perform the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011232409.8A CN112307210B (en) | 2020-11-06 | 2020-11-06 | Document tag prediction method, system, medium and electronic device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112307210A (en) | 2021-02-02 |
CN112307210B (en) | 2024-07-30 |
Family
ID=74326508
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011232409.8A Active CN112307210B (en) | 2020-11-06 | 2020-11-06 | Document tag prediction method, system, medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112307210B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103064969A (en) * | 2012-12-31 | 2013-04-24 | 武汉传神信息技术有限公司 | Method for automatically creating keyword index table |
WO2017101342A1 (en) * | 2015-12-15 | 2017-06-22 | 乐视控股(北京)有限公司 | Sentiment classification method and apparatus |
CN106997340A (en) * | 2016-01-25 | 2017-08-01 | 阿里巴巴集团控股有限公司 | The generation of dictionary and the Document Classification Method and device using dictionary |
CN110196910A (en) * | 2019-05-30 | 2019-09-03 | 珠海天燕科技有限公司 | A kind of method and device of corpus classification |
CN110275935A (en) * | 2019-05-10 | 2019-09-24 | 平安科技(深圳)有限公司 | Processing method, device and storage medium, the electronic device of policy information |
CN110309303A (en) * | 2019-05-22 | 2019-10-08 | 浙江工业大学 | A kind of judicial dispute data visualization analysis method based on Weighted T F-IDF |
CN110717042A (en) * | 2019-09-24 | 2020-01-21 | 北京工商大学 | Method for constructing document-keyword heterogeneous network model |
CN110837601A (en) * | 2019-10-25 | 2020-02-25 | 杭州叙简科技股份有限公司 | Automatic classification and prediction method for alarm condition |
WO2020207431A1 (en) * | 2019-04-12 | 2020-10-15 | 智慧芽信息科技(苏州)有限公司 | Document classification method, apparatus and device, and storage medium |
Non-Patent Citations (1)
Title |
---|
宁建飞;刘降珍;: "融合Word2vec与TextRank的关键词抽取研究", 现代图书情报技术, no. 06, 25 June 2016 (2016-06-25) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113204653A (en) * | 2021-06-04 | 2021-08-03 | 中国银行股份有限公司 | Demand value labeling method and device, computer equipment and readable storage medium |
CN115861606A (en) * | 2022-05-09 | 2023-03-28 | 北京中关村科金技术有限公司 | Method and device for classifying long-tail distribution documents and storage medium |
CN115861606B (en) * | 2022-05-09 | 2023-09-08 | 北京中关村科金技术有限公司 | Classification method, device and storage medium for long-tail distributed documents |
Also Published As
Publication number | Publication date |
---|---|
CN112307210B (en) | 2024-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222160B (en) | Intelligent semantic document recommendation method and device and computer readable storage medium | |
CN108121700B (en) | Keyword extraction method and device and electronic equipment | |
CN110929038B (en) | Knowledge graph-based entity linking method, device, equipment and storage medium | |
CN109471944B (en) | Training method and device of text classification model and readable storage medium | |
CN110781276A (en) | Text extraction method, device, equipment and storage medium | |
CN110737756B (en) | Method, apparatus, device and medium for determining answer to user input data | |
CN110688452B (en) | Text semantic similarity evaluation method, system, medium and device | |
CN110399547B (en) | Method, apparatus, device and storage medium for updating model parameters | |
CN110688405A (en) | Expert recommendation method, device, terminal and medium based on artificial intelligence | |
WO2021190662A1 (en) | Medical text sorting method and apparatus, electronic device, and storage medium | |
CN112667782A (en) | Text classification method, device, equipment and storage medium | |
CN110765765A (en) | Contract key clause extraction method and device based on artificial intelligence and storage medium | |
CN113609847B (en) | Information extraction method, device, electronic equipment and storage medium | |
CN113434636A (en) | Semantic-based approximate text search method and device, computer equipment and medium | |
CN112579729B (en) | Training method and device for document quality evaluation model, electronic equipment and medium | |
CN112307210B (en) | Document tag prediction method, system, medium and electronic device | |
CN112800226A (en) | Method for obtaining text classification model, method, device and equipment for text classification | |
CN112380421A (en) | Resume searching method and device, electronic equipment and computer storage medium | |
CN110413992A (en) | A kind of semantic analysis recognition methods, system, medium and equipment | |
CN111931516A (en) | Text emotion analysis method and system based on reinforcement learning | |
CN114003725A (en) | Information annotation model construction method and information annotation generation method | |
CN108241650B (en) | Training method and device for training classification standard | |
CN111241848B (en) | Article reading comprehension answer retrieval method and device based on machine learning | |
CN114742062B (en) | Text keyword extraction processing method and system | |
CN114647739B (en) | Entity chain finger method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||