CN113204620A - Method, system, equipment and computer storage medium for automatically constructing narrative table - Google Patents

Method, system, equipment and computer storage medium for automatically constructing narrative table

Info

Publication number
CN113204620A
Authority
CN
China
Prior art keywords
word
words
narrative
occurrence
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110515734.3A
Other languages
Chinese (zh)
Inventor
张凯
周建设
刘杰
王伟丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Capital Normal University
Original Assignee
Capital Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Capital Normal University
Priority to CN202110515734.3A
Publication of CN113204620A
Legal status: Pending


Classifications

    • G06F16/3346: Information retrieval of unstructured textual data; query execution using a probabilistic model
    • G06F16/3344: Information retrieval of unstructured textual data; query execution using natural language analysis
    • G06F16/35: Information retrieval of unstructured textual data; clustering; classification
    • G06F18/23: Pattern recognition; clustering techniques
    • G06F40/216: Handling natural language data; parsing using statistical methods
    • G06F40/279: Handling natural language data; recognition of textual entities
    • G06F40/284: Handling natural language data; lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a method for automatically constructing a narrative word list. The method carries out word co-occurrence statistics and distributional-similarity calculation, and then identifies the hierarchical relations among words, so as to compile a natural-language narrative word list. First, the co-occurrence weight between words is calculated from the frequency of the words in the documents, the co-occurrence frequency between the words, and an adjustment factor. Next, feature vectors are constructed and semantic similarity is calculated, so that the words are merged into clusters. The words in each cluster are then assigned to levels according to a grade coefficient, and the broader-narrower relations between words are identified. Finally, the narrative word list is constructed from the relatedness relations among the words and the broader-narrower relations of the narrative word set.

Description

Method, system, equipment and computer storage medium for automatically constructing narrative table
Technical Field
The present application relates to the field of artificial intelligence, and in particular to a method, a system, a device, and a computer storage medium for automatically constructing a narrative word list.
Background
The rapid development of networks has brought explosive growth of information resources. While this offers great convenience, it has also made people gradually realize that they are submerged in an ocean of information, so that accurately and efficiently acquiring the required information from this mass has become an urgent problem. Most existing network information retrieval tools (such as search engines) adopt full-text retrieval based on literal keyword matching. This approach is simple, feasible, and convenient to use, and full-text retrieval achieves a high recall rate, but it returns too much information, of which only a small part meets the searcher's needs; precision is therefore low, and both missed and false retrievals occur. Applying a normalized, controlled narrative word list to the indexing and searching process can effectively improve retrieval performance. However, traditional narrative word lists are difficult to establish and maintain, and are difficult to apply in the network information retrieval environment, so research on how to automatically construct natural-language narrative word lists is of great significance.
At present, using computer technology to automatically identify the semantic relationships among narrative words (equivalence, hierarchy, and relatedness) is the key link in realizing automatic construction of a narrative word list, and it is also the main difficulty.
Disclosure of Invention
To address the difficulty of compiling and maintaining narrative word lists, the application provides a method, a system, a device, and a computer storage medium for automatically constructing a narrative word list.
In a first aspect of the present application, a method for automatically constructing a narrative table is provided, wherein the method comprises:
S1, collecting vocabulary: inputting the original data files required for building the narrative word list;
S2, extracting each word from the original data files to form a narrative word set;
S3, calculating the co-occurrence weight between words in the narrative word set according to the frequency of the words in the files, the co-occurrence frequency between the words, and an adjustment factor, thereby obtaining the degree of association between words;
S4, constructing, for each word, a feature vector over other words according to the degree of association, where the other words are the K most relevant words;
S5, performing hierarchical clustering on the words of the narrative word set: calculating the semantic similarity between words from the feature vectors, setting a threshold, and merging words whose semantic similarity value is smaller than the threshold to form clusters;
S6, dividing the words in each cluster into levels according to the grade coefficient and identifying the broader-narrower relations;
and S7, finally, constructing the narrative word list according to the relatedness relations among the words and the broader-narrower relations of the narrative word set.
Preferably, the co-occurrence weight between words is calculated by the following formula (rendered as an image in the original):
[co-occurrence weight formula, image in original]
where W(T_i, T_j) denotes the co-occurrence weight of the words T_i and T_j, tf(T_i T_j) denotes the frequency with which T_i and T_j co-occur in the corpus, tf(T_i) denotes the frequency of T_i in the corpus, and WeightingFactor(T_i, T_j) is the adjustment factor.
preferably, the formula of the adjustment factor is:
Figure BDA0003061603720000022
min(length(di) ) express a word TiAnd TjThe minimum length in the co-occurrence corpus,
Figure BDA0003061603720000023
represents the average length of the co-occurrence corpus, and k is the co-occurrence corpus length.
Preferably, the feature vector is calculated by the following formula:
V(T) = (<T_1, W_1>, <T_2, W_2>, …, <T_k, W_k>)
where T_1, T_2, …, T_k denote the words related to the word T, and W_1, W_2, …, W_k are the co-occurrence weights of T with T_1, T_2, …, T_k, respectively.
Preferably, the semantic similarity is calculated by the following formula (rendered as an image in the original):
[semantic similarity formula, image in original]
where Sim(T_1, T_2) denotes the semantic similarity of the words T_1 and T_2, W_1i denotes the weight of the i-th dimension of the feature vector of T_1, W_2i denotes the weight of the i-th dimension of the feature vector of T_2, k denotes the dimension of the feature vectors, and n denotes the number of identical words shared by the two feature vectors.
Preferably, the grade coefficient is calculated by the following formula (rendered as an image in the original):
[grade coefficient formula, image in original]
where H(T_i) is the grade coefficient of the word T_i, tf(T_i) denotes the word frequency of T_i, and len(T_i) denotes the length of the word T_i.
Preferably, the hierarchical clustering algorithm is one of: single linkage, complete linkage, and average linkage.
Preferably, average linkage is used.
Preferably, the threshold is 0.1.
Preferably, the algorithm flow for identifying the broader-narrower relations of the words in a cluster is:
S501, determining the number of levels and assigning the words in the cluster to word levels according to their grade coefficients, words with higher grade coefficients occupying higher levels; the highest level is L_0, and the remaining levels are L_1, L_2, …, L_i in sequence;
S502, generating broader-narrower relations between adjacent word levels: for a word T in level L_i, calculating the similarity between T and each word in level L_{i-1}, and taking the word with the maximum similarity as the broader term of the word T; continuing to take words from level L_i until broader-narrower relations have been established for all words in L_i; then examining the words in level L_{i-1} and moving any word that has no narrower term down to level L_i;
and S503, judging whether the bottom level has been reached; if so, ending; otherwise, continuing to execute the operation of S502.
In a second aspect, the present application provides a system for automatically constructing a narrative word list, the system comprising: an original file acquisition module, a word segmentation module, a narrative word extraction module, and a narrative word list construction module, wherein:
the original file acquisition module is used for acquiring original file data;
the word segmentation module is used for obtaining each word in the original files;
the narrative word extraction module is used for carrying out the calculations of the above method so as to determine the relatedness relations and broader-narrower relations among words;
and the narrative word list construction module is used for constructing the narrative word list according to the relatedness relations and broader-narrower relations among the words.
A third aspect of the present application provides an apparatus for automatically constructing a narrative table, the apparatus comprising:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute the method as described above.
A fourth aspect of the present application provides a computer storage medium having stored thereon computer instructions for executing the method as described above when the computer instructions are invoked.
The beneficial effects of the invention are:
Compared with existing approaches to constructing a narrative word list, analyzing and calculating the relatedness between words makes it possible to identify similarity even between words that share no characters. On this basis, the level-identification method can largely distinguish words that express different subject categories; the generated word clusters are more evenly distributed, and the similarity between words within a cluster is higher. The adopted level-recognition algorithm can, in general, assign the words in a cluster to appropriate levels. A narrative word list is thus constructed automatically from the relatedness relations and the broader-narrower relations among words.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flow chart of a method for automatically constructing a narrative table disclosed in the embodiment of the application.
Fig. 2 is a schematic diagram of an algorithm flow for identifying a context relationship of words in a cluster in a method for automatically constructing a narrative table disclosed in an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a system for automatically constructing a narrative table disclosed in an embodiment of the present application.
FIG. 4 is a schematic structural diagram of an apparatus for automatically constructing narrative tables disclosed in the embodiments of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. are used for indicating the orientation or positional relationship based on the orientation or positional relationship shown in the drawings or the orientation or positional relationship which the present invention product is usually put into use, it is only for convenience of describing the present application and simplifying the description, but it is not intended to indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and be operated, and thus, should not be construed as limiting the present application.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present application may be combined with each other without conflict.
Example 1
Referring to fig. 1, fig. 1 is a flow chart illustrating a method for automatically constructing a narrative table according to an embodiment of the present disclosure. As shown in fig. 1, a first aspect of the present application provides a method for automatically constructing a narrative table, the method comprising:
S1, collecting vocabulary: inputting the original data files required for building the narrative word list;
S2, extracting each word from the original data files to form a narrative word set;
S3, calculating the co-occurrence weight between words in the narrative word set according to the frequency of the words in the files, the co-occurrence frequency between the words, and an adjustment factor, thereby obtaining the degree of association between words;
S4, constructing, for each word, a feature vector over other words according to the degree of association, where the other words are the K most relevant words;
S5, performing hierarchical clustering on the words of the narrative word set: calculating the semantic similarity between words from the feature vectors, setting a threshold, and merging words whose semantic similarity value is smaller than the threshold to form clusters;
S6, dividing the words in each cluster into levels according to the grade coefficient and identifying the broader-narrower relations;
and S7, finally, constructing the narrative word list according to the relatedness relations among the words and the broader-narrower relations of the narrative word set.
In this embodiment, the co-occurrence weight between words is calculated by the following formula (rendered as an image in the original):
[co-occurrence weight formula, image in original]
where W(T_i, T_j) denotes the co-occurrence weight of the words T_i and T_j, tf(T_i T_j) denotes the frequency with which T_i and T_j co-occur in the corpus, tf(T_i) denotes the frequency of T_i in the corpus, and WeightingFactor(T_i, T_j) is the adjustment factor.
In this embodiment, the adjustment factor is calculated by the following formula (rendered as an image in the original):
[adjustment factor formula, image in original]
where min(length(d_i)) denotes the minimum length of the documents in which the words T_i and T_j co-occur, the average-length term (also an image in the original) denotes the average length of those co-occurring documents, and k is the number of documents in the co-occurrence corpus. By calculating the degree of co-occurrence association between words, an "associated concept space" can be constructed: an undirected graph with words as nodes and co-occurrence weights as edge weights.
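For illustration only, the co-occurrence statistics of step S3 and the resulting "associated concept space" graph could be computed as in the Python sketch below. The Dice-style normalization and the weighting_factor callable are assumptions: the patent renders the actual expressions as images, so this is not the patented formula.

    from collections import Counter
    from itertools import combinations

    def cooccurrence_weights(documents, weighting_factor):
        # documents: list of tokenized documents (lists of words).
        # weighting_factor: callable (ti, tj) -> float standing in for the
        # patent's WeightingFactor(T_i, T_j), whose exact form is an image.
        tf = Counter()    # tf(T_i): number of documents containing T_i
        cotf = Counter()  # tf(T_i T_j): number of documents containing both
        for doc in documents:
            words = sorted(set(doc))
            tf.update(words)
            cotf.update(combinations(words, 2))
        weights = {}
        for (ti, tj), co in cotf.items():
            # Dice-style normalization (an assumption, see above).
            weights[(ti, tj)] = 2 * co / (tf[ti] + tf[tj]) * weighting_factor(ti, tj)
        return weights

The returned dictionary is the undirected graph just described: each key is an edge between two word nodes, and each value is its edge weight.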
In this embodiment, the feature vector is constructed by the following formula:
V(T) = (<T_1, W_1>, <T_2, W_2>, …, <T_k, W_k>)
where T_1, T_2, …, T_k denote the words related to the word T, and W_1, W_2, …, W_k are the co-occurrence weights of T with T_1, T_2, …, T_k, respectively.
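A corresponding sketch of step S4, under the same assumptions; K = 10 is an illustrative choice, since the patent leaves K open:

    def feature_vector(word, weights, K=10):
        # V(T) = (<T_1, W_1>, ..., <T_K, W_K>): the K words most strongly
        # associated with `word`, together with their co-occurrence weights.
        related = {}
        for (ti, tj), w in weights.items():
            if ti == word:
                related[tj] = w
            elif tj == word:
                related[ti] = w
        pairs = sorted(related.items(), key=lambda kv: kv[1], reverse=True)
        return pairs[:K]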
In this embodiment, the semantic similarity between words is calculated by the following formula (rendered as an image in the original):
[semantic similarity formula, image in original]
where Sim(T_1, T_2) denotes the semantic similarity of the words T_1 and T_2, W_1i denotes the weight of the i-th dimension of the feature vector of T_1, W_2i denotes the weight of the i-th dimension of the feature vector of T_2, k denotes the dimension of the feature vectors, and n denotes the number of identical words shared by the two feature vectors.
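Because the similarity formula is likewise an image in the original, the sketch below substitutes a cosine measure over the sparse feature vectors. This matches the described ingredients (the n shared words and the vector dimension k) but is an assumption, not the patented expression:

    import math

    def semantic_similarity(v1, v2):
        # v1, v2: feature vectors as lists of (word, weight) pairs.
        d1, d2 = dict(v1), dict(v2)
        shared = d1.keys() & d2.keys()  # the n identical words
        dot = sum(d1[t] * d2[t] for t in shared)
        n1 = math.sqrt(sum(w * w for w in d1.values()))
        n2 = math.sqrt(sum(w * w for w in d2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0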
In this embodiment, the grade coefficient is calculated by the following formula (rendered as an image in the original):
[grade coefficient formula, image in original]
where H(T_i) is the grade coefficient of the word T_i, tf(T_i) denotes the word frequency of T_i, and len(T_i) denotes the length of the word T_i.
In this embodiment, the candidate hierarchical clustering algorithms include: single linkage, complete linkage, and average linkage.
Here, average-linkage hierarchical clustering is adopted, and the best results are obtained with a threshold of 0.1.
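The merging step of S5 can be sketched with SciPy's average-linkage implementation. Treating the 0.1 threshold as a cut on distance (1 - similarity) is an interpretation of the embodiment's wording, which is ambiguous on this point:

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    def cluster_terms(terms, sim, threshold=0.1):
        # Build a symmetric distance matrix from pairwise similarities.
        n = len(terms)
        dist = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                dist[i, j] = dist[j, i] = 1.0 - sim(terms[i], terms[j])
        z = linkage(squareform(dist), method="average")
        # Words whose linkage distance falls below the threshold share a cluster.
        return fcluster(z, t=threshold, criterion="distance")

fcluster returns one cluster label per input word, so words carrying equal labels form the clusters passed on to step S6.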
In this embodiment, the broader-narrower relations of the words in a cluster are identified by the following algorithm flow (a code sketch follows the steps):
S501, determining the number of levels and assigning the words in the cluster to word levels according to their grade coefficients, words with higher grade coefficients occupying higher levels; the highest level is L_0, and the remaining levels are L_1, L_2, …, L_i in sequence;
S502, generating broader-narrower relations between adjacent word levels: for a word T in level L_i, calculating the similarity between T and each word in level L_{i-1}, and taking the word with the maximum similarity as the broader term of the word T; continuing to take words from level L_i until broader-narrower relations have been established for all words in L_i; then examining the words in level L_{i-1} and moving any word that has no narrower term down to level L_i;
and S503, judging whether the bottom level has been reached; if so, ending; otherwise, continuing to execute the operation of S502.
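Putting S501-S503 together as a sketch: the patent does not specify how the number of levels is chosen or how grade coefficients map to levels, so the equal-size split and the grade and sim callables below are assumptions:

    def build_hierarchy(cluster_words, grade, sim, n_levels):
        # S501: rank by grade coefficient; high coefficients go to high levels.
        ranked = sorted(cluster_words, key=grade, reverse=True)
        size = max(1, len(ranked) // n_levels)
        levels = [ranked[i * size:(i + 1) * size] for i in range(n_levels - 1)]
        levels.append(ranked[(n_levels - 1) * size:])  # L_0, L_1, ..., L_i
        broader = {}
        for i in range(1, len(levels)):
            # S502: link each word to its most similar word one level up.
            for word in levels[i]:
                broader[word] = max(levels[i - 1], key=lambda u: sim(u, word))
            # Words in L_{i-1} without a narrower term move down to L_i.
            has_narrower = set(broader.values())
            dropped = [u for u in levels[i - 1] if u not in has_narrower]
            levels[i - 1] = [u for u in levels[i - 1] if u in has_narrower]
            levels[i].extend(dropped)
        # S503: the loop ends once the bottom level has been processed.
        return levels, broader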
Example 2
Referring to fig. 3, fig. 3 is a schematic structural diagram of a system for automatically constructing a narrative word list according to an embodiment of the present disclosure. As shown in fig. 3, a second aspect of the present application provides a system for automatically constructing a narrative word list, the system comprising: an original file acquisition module, a word segmentation module, a narrative word extraction module, and a narrative word list construction module, wherein:
the original file acquisition module is used for acquiring original file data;
the word segmentation module is used for obtaining each word in the original files;
the narrative word extraction module is used for carrying out the calculations of the above method so as to determine the relatedness relations and broader-narrower relations among words;
and the narrative word list construction module is used for constructing the narrative word list according to the relatedness relations and broader-narrower relations among the words.
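A minimal composition sketch of the four modules; the class and parameter names are illustrative, since the patent specifies module responsibilities rather than concrete interfaces:

    class ThesaurusBuilder:
        def __init__(self, read_file, segment, extract, assemble):
            self.read_file = read_file  # original file acquisition module
            self.segment = segment      # word segmentation module
            self.extract = extract      # narrative word extraction module
            self.assemble = assemble    # narrative word list construction module

        def build(self, paths):
            docs = [self.segment(self.read_file(p)) for p in paths]
            relatedness, hierarchy = self.extract(docs)  # method of Embodiment 1
            return self.assemble(relatedness, hierarchy)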
Example 3
Referring to fig. 4, fig. 4 is a schematic structural diagram of an apparatus for automatically constructing a narrative table disclosed in an embodiment of the present application. As shown in fig. 4, a third aspect of the present application provides an apparatus for automatically constructing a narrative table, comprising:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute the method for automatically constructing a narrative word list described in Embodiment 1.
Example 4
This embodiment provides a computer storage medium, wherein the computer storage medium stores computer instructions which, when invoked, execute the method for automatically constructing a narrative word list described in Embodiment 1.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for automatically constructing a narrative word list, characterized in that the method comprises the following steps:
S1, collecting vocabulary: inputting the original data files required for building the narrative word list;
S2, extracting each word from the original data files to form a narrative word set;
S3, calculating the co-occurrence weight between words in the narrative word set according to the frequency of the words in the files, the co-occurrence frequency between the words, and an adjustment factor, thereby obtaining the degree of association between words;
S4, constructing, for each word, a feature vector over other words according to the degree of association, where the other words are the K most relevant words;
S5, performing hierarchical clustering on the words of the narrative word set: calculating the semantic similarity between words from the feature vectors, setting a threshold, and merging words whose semantic similarity value is smaller than the threshold to form clusters;
S6, dividing the words in each cluster into levels according to the grade coefficient and identifying the broader-narrower relations;
and S7, finally, constructing the narrative word list according to the relatedness relations among the words and the broader-narrower relations of the narrative word set.
2. The method of claim 1, wherein the co-occurrence weight between words is calculated by the following formula (rendered as an image in the original):
[co-occurrence weight formula, image in original]
wherein W(T_i, T_j) denotes the co-occurrence weight of the words T_i and T_j, tf(T_i T_j) denotes the frequency with which T_i and T_j co-occur in the corpus, tf(T_i) denotes the frequency of T_i in the corpus, and WeightingFactor(T_i, T_j) is the adjustment factor.
3. The method of claim 2, wherein the adjustment factor is calculated by the following formula (rendered as an image in the original):
[adjustment factor formula, image in original]
wherein min(length(d_i)) denotes the minimum length of the documents in which the words T_i and T_j co-occur, the average-length term (also an image in the original) denotes the average length of those co-occurring documents, and k is the number of documents in the co-occurrence corpus.
4. The method of claim 1, wherein the feature vector is calculated by the following formula:
V(T) = (<T_1, W_1>, <T_2, W_2>, …, <T_k, W_k>)
wherein T_1, T_2, …, T_k denote the words related to the word T, and W_1, W_2, …, W_k are the co-occurrence weights of T with T_1, T_2, …, T_k, respectively.
5. The method of claim 4, wherein the semantic similarity is calculated by the following formula (rendered as an image in the original):
[semantic similarity formula, image in original]
wherein Sim(T_1, T_2) denotes the semantic similarity of the words T_1 and T_2, W_1i denotes the weight of the i-th dimension of the feature vector of T_1, W_2i denotes the weight of the i-th dimension of the feature vector of T_2, k denotes the dimension of the feature vectors, and n denotes the number of identical words shared by the two feature vectors.
6. The method of claim 1, wherein the grade coefficient is calculated by the following formula (rendered as an image in the original):
[grade coefficient formula, image in original]
wherein H(T_i) is the grade coefficient of the word T_i, tf(T_i) denotes the word frequency of T_i, and len(T_i) denotes the length of the word T_i.
7. The method of claim 1, wherein the hierarchical clustering algorithm comprises: single linkage, complete linkage, and average linkage.
8. The method of claim 7, wherein the hierarchical clustering algorithm is preferably average linkage.
9. The method of claim 8, wherein the threshold is preferably 0.1.
10. The method of claim 1, wherein the algorithm for identifying the broader-narrower relations of the words in a cluster comprises:
step 1: determining the number of levels and assigning the words in the cluster to word levels according to their grade coefficients, words with higher grade coefficients occupying higher levels; the highest level is L_0, and the remaining levels are L_1, L_2, …, L_i in sequence;
step 2: generating broader-narrower relations between adjacent word levels: for a word T in level L_i, calculating the similarity between T and each word in level L_{i-1}, and taking the word with the maximum similarity as the broader term of the word T; continuing to take words from level L_i until broader-narrower relations have been established for all words in L_i; then examining the words in level L_{i-1} and moving any word that has no narrower term down to level L_i;
and step 3: judging whether the bottom level has been reached; if so, ending; otherwise, returning to step 2.
CN202110515734.3A 2021-05-12 2021-05-12 Method, system, equipment and computer storage medium for automatically constructing narrative table Pending CN113204620A (en)

Priority Applications (1)

CN202110515734.3A (published as CN113204620A), priority date 2021-05-12, filing date 2021-05-12: Method, system, equipment and computer storage medium for automatically constructing narrative table

Applications Claiming Priority (1)

CN202110515734.3A (published as CN113204620A), priority date 2021-05-12, filing date 2021-05-12: Method, system, equipment and computer storage medium for automatically constructing narrative table

Publications (1)

CN113204620A, published 2021-08-03

Family

ID: 77031933

Family Applications (1)

CN202110515734.3A (CN113204620A, pending), priority date 2021-05-12, filing date 2021-05-12: Method, system, equipment and computer storage medium for automatically constructing narrative table

Country Status (1)

Country Link
CN (1) CN113204620A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004053735A1 (en) * 2002-12-12 2004-06-24 Honda Motor Co., Ltd. Information processing device, information processing method, and information processing program
CN104102847A (en) * 2014-07-25 2014-10-15 中国科学技术信息研究所 Chinese descriptor list building system
CN112307204A (en) * 2020-10-22 2021-02-02 首都师范大学 Clustering grade relation based automatic identification method, system, equipment and storage medium
CN112328736A (en) * 2020-11-13 2021-02-05 首都师范大学 Method and system for constructing theme word list and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杜慧平; 侯汉清: "网络环境中汉语叙词表的自动构建研究" (Research on automatic construction of Chinese thesauri in the network environment), 情报学报 (Journal of the China Society for Scientific and Technical Information), no. 06

Similar Documents

Publication Publication Date Title
CN109635273B (en) Text keyword extraction method, device, equipment and storage medium
CN108132927B (en) Keyword extraction method for combining graph structure and node association
CN108804421B (en) Text similarity analysis method and device, electronic equipment and computer storage medium
NZ524988A (en) A document categorisation system
CN112732871B (en) Multi-label classification method for acquiring client intention labels through robot induction
CN111858912A (en) Abstract generation method based on single long text
CN108647322B (en) Method for identifying similarity of mass Web text information based on word network
CN112347223B (en) Document retrieval method, apparatus, and computer-readable storage medium
JP5094830B2 (en) Image search apparatus, image search method and program
CN112836029A (en) Graph-based document retrieval method, system and related components thereof
CN112633011B (en) Research front edge identification method and device for fusing word semantics and word co-occurrence information
CN115048464A (en) User operation behavior data detection method and device and electronic equipment
CN114997288A (en) Design resource association method
CN116401345A (en) Intelligent question-answering method, device, storage medium and equipment
JP2012079186A (en) Image retrieval device, image retrieval method and program
AU2018226420B2 (en) Voice assisted intelligent searching in mobile documents
CN114461783A (en) Keyword generation method and device, computer equipment, storage medium and product
CN112307364B (en) Character representation-oriented news text place extraction method
CN111125329B (en) Text information screening method, device and equipment
CN112307204A (en) Clustering grade relation based automatic identification method, system, equipment and storage medium
US20220318318A1 (en) Systems and methods for automated information retrieval
CN114943285B (en) Intelligent auditing system for internet news content data
CN113204620A (en) Method, system, equipment and computer storage medium for automatically constructing narrative table
CN115794987A (en) Cross-language information retrieval system and equipment based on shared semantic model
CN112328736A (en) Method and system for constructing theme word list and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination