CN112784063A - Idiom knowledge graph construction method and device - Google Patents
Idiom knowledge graph construction method and device
- Publication number
- CN112784063A (application CN202110116596.1A)
- Authority
- CN
- China
- Prior art keywords
- processed
- word
- idiom
- idioms
- description information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Library & Information Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Computation (AREA)
- Animal Behavior & Ethology (AREA)
- Machine Translation (AREA)
Abstract
An embodiment of the invention provides an idiom knowledge graph construction method, comprising: acquiring a plurality of idioms to be processed and description information of each idiom to be processed; for each idiom to be processed, analyzing its description information and determining a label corresponding to the idiom; and constructing a knowledge graph of the plurality of idioms to be processed based on the association relationship between the idioms and their corresponding labels. In this way, a label can be determined for each idiom based on its description information, and the knowledge graph is built on the association relationship between labels and idioms. When a user queries idioms, multiple idioms can be found through a single label. Compared with searching by a specific idiom or a specific keyword, this helps the user obtain idiom information from more angles and better meets the user's idiom usage needs.
Description
Technical Field
The invention relates to the technical field of information storage, and in particular to an idiom knowledge graph construction method and device.
Background
Existing online idiom dictionaries store a large amount of idiom information, including the pronunciation, paraphrase, source, synonyms, and antonyms of each idiom, so that idiom-related services can be provided to users.
In the related art, a relational database is usually used to store idiom information, so that a user can look up information about a specific idiom by searching for that idiom, or search for idioms whose paraphrase or other related information contains a specific keyword.
However, with a relational database it is difficult for the user to obtain idiom information from more angles. For example, although "ancient rare year" (古稀之年) and "Mao die year" (耄耋之年) are both idioms about age, a user has difficulty retrieving both of them by searching for "age". The related art therefore struggles to meet users' idiom usage needs.
Disclosure of Invention
The embodiments of the invention aim to provide an idiom knowledge graph construction method and device, so that idiom information can be obtained from more angles and the idiom usage needs of users can be met. The specific technical scheme is as follows:
An embodiment of the invention provides an idiom knowledge graph construction method, which comprises the following steps:
acquiring a plurality of idioms to be processed and description information of each idiom to be processed;
analyzing the description information of the idiom to be processed aiming at each idiom to be processed, and determining a label corresponding to the idiom to be processed;
and constructing a knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
Optionally, the analyzing, for each idiom to be processed, the description information of the idiom to be processed to determine a tag corresponding to the idiom to be processed includes:
performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed;
and screening, from the word list, the words whose semantic similarity with the idiom to be processed satisfies a preset condition, to serve as the labels corresponding to the idiom to be processed.
Optionally, the performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed includes:
filtering stop words and symbols in the description information to obtain filtering information;
and performing word segmentation processing on the filtering information to obtain a word list corresponding to the idiom to be processed.
Optionally, before the words whose semantic similarity with the to-be-processed idiom satisfies a preset condition are screened from the word list and used as the tags corresponding to the to-be-processed idiom, the method further includes:
acquiring an associated word of each word in the word list, and adding the associated word to the word list;
judging whether the number of the words in the word list changes, if so, returning to the step of acquiring the associated words of each word in the word list and adding the associated words to the word list, and if not, executing the step of screening the words, the semantic similarity of which with the idiom to be processed meets the preset condition, from the word list to serve as the label corresponding to the idiom to be processed.
Optionally, the constructing a knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the tags corresponding to each idiom to be processed includes:
generating idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to labels corresponding to the idioms to be processed respectively;
and establishing an association relationship between each idiom entity and each label entity based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed, to obtain the knowledge graph of the multiple idioms to be processed.
Optionally, the description information includes: the pronunciation, paraphrase and origin of the idiom to be processed.
Optionally, after the knowledge graph is stored, the method further includes:
acquiring a term to be queried;
querying a label matched with the term to be queried in the knowledge graph as a target label;
and outputting the idioms to be processed corresponding to the target tags.
The embodiment of the invention also provides a device for constructing the idiom knowledge graph, which comprises the following components:
the acquisition module is used for acquiring a plurality of idioms to be processed and the description information of each idiom to be processed;
the determining module is used for analyzing the description information of the idiom to be processed aiming at each idiom to be processed and determining a label corresponding to the idiom to be processed;
and the construction module is used for constructing the knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
Optionally, the determining module is specifically configured to:
performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed;
and screening, from the word list, the words whose semantic similarity with the idiom to be processed satisfies a preset condition, to serve as the labels corresponding to the idiom to be processed.
Optionally, the determining module is specifically configured to:
filtering stop words and symbols in the description information to obtain filtering information;
and performing word segmentation processing on the filtering information to obtain a word list corresponding to the idiom to be processed.
Optionally, the determining module is further configured to:
acquiring an associated word of each word in the word list, and adding the associated word to the word list;
and judging whether the number of the words in the word list changes, if so, returning to the step of acquiring the associated words of each word in the word list and adding the associated words to the word list, and if not, executing the step of screening the words, the semantic similarity of which with the to-be-processed idiom meets the preset condition, from the word list to serve as the labels corresponding to the to-be-processed idiom.
Optionally, the building module is specifically configured to:
generating idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to labels corresponding to the idioms to be processed respectively;
and establishing an association relationship between each idiom entity and each label entity based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed, to obtain the knowledge graph of the multiple idioms to be processed.
Optionally, the description information includes: the pronunciation, paraphrase and origin of the idiom to be processed.
Optionally, the apparatus further comprises:
the query module is used for acquiring terms to be queried; querying a label matched with the term to be queried in the knowledge graph as a target label; and outputting the idioms to be processed corresponding to the target tags.
The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any idiom knowledge graph construction method when the program stored in the memory is executed.
The embodiment of the invention also provides a computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when being executed by a processor, the computer program realizes any one of the idiom knowledge graph construction methods.
Embodiments of the present invention further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute any one of the idiom knowledge-graph construction methods described above.
In the idiom knowledge graph construction method and device provided by the embodiments of the invention, a plurality of idioms to be processed and description information of each idiom are first acquired; for each idiom to be processed, its description information is analyzed and a label corresponding to the idiom is determined; a knowledge graph of the plurality of idioms to be processed is then constructed based on the idioms and the label corresponding to each idiom. In this way, a label can be determined for each idiom based on its description information, and the knowledge graph is built on the association relationship between labels and idioms. When a user queries idioms, multiple idioms can be found through a single label. Compared with searching by a specific idiom or a specific keyword, this helps the user obtain idiom information from more angles and better meets the user's idiom usage needs. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of an idiom knowledge graph construction method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of another idiom knowledge graph construction method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an idiom knowledge graph construction apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the related art, a relational database is usually used to store idiom information, so that a user can look up information about a specific idiom by searching for that idiom, or search for idioms whose paraphrase or other related information contains a specific keyword.
However, with a relational database it is difficult for the user to obtain idiom information from more angles. For example, although "ancient rare year" (古稀之年) and "Mao die year" (耄耋之年) are both idioms about age, a user has difficulty retrieving both of them by searching for "age". The related art therefore struggles to meet users' idiom usage needs.
To address this problem, the embodiment of the invention provides an idiom knowledge graph construction method, which a computer, a server, or other electronic equipment can use to construct an idiom knowledge graph.
The idiom knowledge graph construction method provided by the embodiment of the invention is first described in general terms below.
Acquiring a plurality of idioms to be processed and description information of each idiom to be processed;
analyzing the description information of the idiom to be processed aiming at each idiom to be processed, and determining a label corresponding to the idiom to be processed;
and constructing a knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
As can be seen from the above, the idiom knowledge graph construction method and apparatus provided in the embodiments of the present invention can determine a corresponding tag for each idiom to be processed based on the description information, and construct a knowledge graph based on the association relationship between the tags and the idioms to be processed. When a user queries idioms, multiple idioms can be found through a single tag. Compared with searching by a specific idiom or a specific keyword, this helps the user obtain idiom information from more angles and better meets the user's idiom usage needs.
The idiom knowledge graph construction method provided by the embodiment of the invention will be described in detail through specific embodiments.
As shown in FIG. 1, which is a schematic flow chart of an idiom knowledge graph construction method provided in an embodiment of the present invention, the method includes the following steps:
S101: acquiring a plurality of idioms to be processed and the description information of each idiom to be processed.
In some scenarios, the electronic device (the execution subject) may obtain a number of idioms to be processed and store them, so that users can query and browse them. Different ways of storing the idioms to be processed correspond to different ways of querying them.
In this step, the idioms to be processed may be idioms with any number of words, each idiom to be processed has its corresponding description information, and these description information may describe the idioms to be processed to distinguish them from other idioms. The description information may include one or more of pronunciation, paraphrase, and provenance information of the idiom to be processed, and is not limited specifically.
S102: for each idiom to be processed, analyzing the description information of the idiom to be processed and determining a label corresponding to the idiom to be processed.
After obtaining a plurality of idioms to be processed and the description information of each idiom to be processed, the description information of each idiom to be processed can be analyzed, and the label corresponding to each idiom to be processed is respectively determined, wherein each idiom to be processed can correspond to a unique label or a plurality of labels, and different idioms can correspond to the same label or different labels, and the details are not limited.
In one implementation manner, the manner of analyzing the description information of each idiom to be processed and determining the tag corresponding to the idiom to be processed may be: firstly, performing word segmentation processing on description information to obtain a word list corresponding to a to-be-processed idiom, and then screening words with semantic similarity meeting preset conditions with the to-be-processed idiom from the word list to serve as tags corresponding to the to-be-processed idiom.
For example, a shortest path algorithm may be used to perform word segmentation on the description information. First, the description information is split into a plurality of word strings, and an association graph is constructed according to the association relationships between the word strings. Then, a preset word frequency probability algorithm is applied to the association graph to obtain the word frequency probability of each associated word of each word string. Finally, for each word string, the ambiguity produced when splitting the description information is resolved according to these word frequency probabilities, so that the words in the original text are identified more accurately. Alternatively, an n-gram model, a maximum matching algorithm, a cross-ambiguity algorithm, or the like may be used; the embodiment of the present invention does not limit this.
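The shortest-path idea can be illustrated with a minimal sketch. It assumes a unigram model in which each candidate word is an edge whose cost is the negative log of its relative frequency, which is one common reading of a "word frequency probability" and not necessarily the algorithm intended by the patent. The function name `segment`, the frequency table `WORD_FREQ`, and the unspaced English sample text are all hypothetical.

```python
import math

# Hypothetical word-frequency table; a real system would load a large lexicon.
WORD_FREQ = {"no": 60, "now": 50, "here": 40, "where": 30, "nowhere": 5}


def segment(text, word_freq=WORD_FREQ, max_len=8):
    """Split text into words by finding the cheapest path through the word graph.

    Nodes are character positions 0..len(text). Every known word spanning
    positions start..end is an edge with cost -log(frequency / total).
    Unknown single characters get a penalty edge so a path always exists.
    """
    total = sum(word_freq.values())
    unknown_cost = math.log(total) + 1.0  # costlier than any known word
    n = len(text)
    best = [math.inf] * (n + 1)  # best[i]: cheapest cost to reach position i
    back = [0] * (n + 1)         # back[i]: start position of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in word_freq:
                cost = -math.log(word_freq[piece] / total)
            elif end - start == 1:
                cost = unknown_cost
            else:
                continue
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = start
    # Walk the back-pointers from the end to recover the chosen words.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]
```

With this toy table, the ambiguous string "nowhere" can be read as one word, as "no" + "where", or as "now" + "here"; the frequencies decide which reading is cheapest.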
In addition, the semantic similarity between each word and the idiom to be processed may be calculated with the Jaccard similarity coefficient or with cosine similarity, and the words whose semantic similarity with the idiom satisfies a preset condition are then screened from the word list. Alternatively, a worker may manually review the words in the word list against the idiom to be processed and select the words that satisfy the preset condition. The preset condition may be that the word has the highest semantic similarity, or that its semantic similarity reaches a preset threshold; this is not specifically limited.
When word segmentation is performed on the description information to obtain the word list corresponding to the idiom to be processed, the stop words and symbols in the description information can first be filtered out to obtain filtered information, and word segmentation is then performed on the filtered information to obtain the word list.
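As an illustration of the screening step, the sketch below implements both similarity measures and applies the two preset conditions mentioned above (highest similarity, or similarity at or above a threshold). How idioms and words are turned into sets or vectors is not specified in the source; the toy two-dimensional vectors and the names `jaccard`, `cosine`, and `screen_labels` are assumptions made for this example.

```python
import math


def jaccard(a, b):
    """Jaccard coefficient of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def screen_labels(idiom_vec, word_vecs, threshold=None):
    """Pick label words by cosine similarity to the idiom vector.

    threshold=None keeps the word(s) with the highest similarity;
    otherwise every word whose similarity reaches the threshold is kept.
    """
    scores = {word: cosine(idiom_vec, vec) for word, vec in word_vecs.items()}
    if not scores:
        return []
    if threshold is None:
        best = max(scores.values())
        return [word for word, s in scores.items() if s == best]
    return [word for word, s in scores.items() if s >= threshold]


# Hypothetical toy vectors; a real system might use learned word embeddings.
IDIOM_VEC = [1.0, 0.1]
WORD_VECS = {"age": [1.0, 0.0], "rare": [0.0, 1.0], "person": [1.0, 1.0]}
```

Calling `screen_labels(IDIOM_VEC, WORD_VECS)` keeps only the closest word, while passing a `threshold` keeps every word that is similar enough.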
Therefore, repeated or useless information in the description information can be filtered, a more effective word list is obtained, and the efficiency and the accuracy of label extraction are further improved.
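A minimal sketch of the filtering step follows. It assumes whitespace-delimited text and a tiny hypothetical stop-word list; for unsegmented Chinese descriptions, stop words would more likely be matched as substrings or removed after segmentation. The name `filter_description` is an assumption for this example.

```python
import unicodedata

# Hypothetical stop-word list; real lists are much longer and language-specific.
STOP_WORDS = {"a", "an", "the", "of", "to", "can", "is"}


def filter_description(text, stop_words=STOP_WORDS):
    """Remove symbols and stop words from a description string."""
    # Replace punctuation (Unicode categories P*) and symbols (S*) with spaces.
    cleaned = "".join(
        " " if unicodedata.category(ch)[0] in "PS" else ch for ch in text
    )
    # Drop stop-word tokens, comparing case-insensitively.
    kept = [tok for tok in cleaned.split() if tok.lower() not in stop_words]
    return " ".join(kept)
```

The filtered string can then be passed to whichever word segmentation routine is in use.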
In addition, before the words whose semantic similarity with the idiom to be processed satisfies the preset condition are screened from the word list as tags, the word list may be expanded with associated words: the associated words of each word in the word list are acquired and added to the word list, and this step is repeated until the number of words in the word list no longer changes.
The associated word of a word may be a synonym or near-synonym of the word; for example, if the word is "age", the associated word may be another word that also means "age". The associated word may also be a hypernym of the word; for example, if the word is "fifty", the associated word may be "age". To obtain the associated words of a word, the word may be looked up in a preset semantic dictionary, or input into a pre-trained algorithm model that computes its associated words; this is not specifically limited.
Therefore, words in the word list can be richer and more general, and the efficiency and the accuracy of label extraction are further improved.
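The expansion of the word list with associated words, repeated until the number of words stops changing, can be sketched as a simple fixed-point loop. The dictionary `RELATED` stands in for the preset semantic dictionary and its entries are hypothetical; `expand_word_list` and the `max_rounds` safety cap are likewise assumptions for this example.

```python
# Hypothetical semantic dictionary mapping a word to its associated words.
RELATED = {
    "seventy years old": ["age"],
    "fifty": ["age"],
    "age": ["attribute"],
}


def expand_word_list(words, related=RELATED, max_rounds=100):
    """Add associated words until the word list no longer grows."""
    expanded = list(dict.fromkeys(words))  # de-duplicate, keep order
    seen = set(expanded)
    for _ in range(max_rounds):
        size_before = len(expanded)
        # Iterate over a snapshot so words added this round are handled next round.
        for word in list(expanded):
            for rel in related.get(word, ()):
                if rel not in seen:
                    seen.add(rel)
                    expanded.append(rel)
        if len(expanded) == size_before:
            break  # no change in word count: proceed to screening
    return expanded
```

Because every word is added at most once, the loop terminates as soon as a full pass adds nothing new.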
For example, suppose the description information of "ancient rare year" is "a person living to the age of seventy has been rare since ancient times", and the description information of "Mao die year" is "very old". After the two idioms to be processed and their description information are obtained, word segmentation can be performed on the description information to obtain the word list corresponding to each idiom: the word list for "ancient rare year" may be "person \ seventy years old \ rare", and the word list for "Mao die year" may be "old \ big". The words in the word lists can then be generalized to a higher level; for example, the associated word of "seventy years old" is "age", and the associated word of "old" is also "age". Finally, the word with the highest similarity to each idiom to be processed can be screened from its word list as the label, so the labels corresponding to "ancient rare year" and "Mao die year" may both be "age".
S103: constructing a knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
A knowledge graph, also called a scientific knowledge map, is a visual mapping of a knowledge domain that uses visualization techniques to describe knowledge resources and their carriers. Based on the association relationship between the multiple idioms to be processed and the tags corresponding to each idiom, a knowledge graph of the idioms can be constructed that describes them visually, making it easier for a user to mine, analyze, construct, draw, and display the idioms and the relationships among them.
For example, the knowledge graph of the multiple idioms to be processed may be constructed as follows: first, idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to the labels of the idioms are generated; then, based on the association relationship between the idioms to be processed and their corresponding labels, an association relationship is established between each idiom entity and each of its label entities, to obtain the knowledge graph of the multiple idioms to be processed.
In one implementation, after the knowledge graph of the multiple idioms to be processed is constructed, a user can use the knowledge graph to query idioms.
For example, a user may input any term to be queried. After obtaining the term, the electronic device (the execution subject) may query the knowledge graph for a tag that matches the term, take it as the target tag, and then output the idioms to be processed that correspond to the target tag. The user can thus find all idioms related to the term.
For example, when the user inputs "age", the electronic device (the execution subject) may query the knowledge graph for the tag matching "age" and then output the idioms to be processed corresponding to that target tag, such as "ancient rare year" and "Mao die year", which further satisfies the user's idiom usage needs.
As can be seen from the above, the idiom knowledge graph construction method provided in the embodiment of the present invention determines a corresponding tag for each idiom to be processed, constructs a knowledge graph based on the association relationship between the tags and the idioms to be processed, and stores the knowledge graph. The stored idioms are thus better organized, and a user can query idioms by tag.
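One way to picture the entity and relationship construction is as a bipartite graph held in two adjacency maps. This is only a sketch under that assumption; the source does not prescribe a storage format, and the names `IdiomGraph`, `add_relation`, and `build_graph` are hypothetical.

```python
from collections import defaultdict


class IdiomGraph:
    """Bipartite graph with idiom entities on one side and label entities on the other."""

    def __init__(self):
        self.idiom_to_labels = defaultdict(set)
        self.label_to_idioms = defaultdict(set)

    def add_relation(self, idiom, label):
        """Record one idiom-label association in both directions."""
        self.idiom_to_labels[idiom].add(label)
        self.label_to_idioms[label].add(idiom)


def build_graph(idiom_labels):
    """Build the graph from a mapping of idiom -> iterable of labels."""
    graph = IdiomGraph()
    for idiom, labels in idiom_labels.items():
        graph.idiom_to_labels[idiom]  # create the idiom entity even without labels
        for label in labels:
            graph.add_relation(idiom, label)
    return graph
```

Keeping the reverse map `label_to_idioms` makes it cheap to list every idiom that shares a label.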
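The query flow can be sketched as a lookup from label to idioms. The matching rule used here (exact match after trimming whitespace and ignoring case) is an assumption, since the source only says the tag "matches" the term; `LABEL_INDEX` and `query_idioms` are hypothetical names, and the index holds only the two example idioms from the text.

```python
# Hypothetical label index extracted from a constructed knowledge graph.
LABEL_INDEX = {"age": {"ancient rare year", "Mao die year"}}


def query_idioms(label_to_idioms, term):
    """Return the idioms whose label matches the query term."""
    target = term.strip().lower()
    results = []
    for label, idioms in label_to_idioms.items():
        if label.lower() == target:  # this label is the target tag
            results.extend(sorted(idioms))
    return results
```

A term with no matching tag simply yields an empty list.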
As shown in FIG. 2, which is a schematic flow chart of another idiom knowledge graph construction method provided in an embodiment of the present invention, the method includes the following steps:
S201: acquiring a plurality of idioms to be processed and the description information of each idiom to be processed.
In some scenarios, the electronic device (the execution subject) may obtain a number of idioms to be processed and store them, so that users can query and browse them. Different ways of storing the idioms to be processed correspond to different ways of querying them.
In this step, the idioms to be processed may be idioms with any number of words, each idiom to be processed has its corresponding description information, and these description information may describe the idioms to be processed to distinguish them from other idioms. The description information may include one or more of pronunciation, paraphrase, and provenance information of the idiom to be processed, and is not limited specifically.
S202: and filtering stop words and symbols in the description information to obtain filtering information.
The stop words and symbols in the description information can be filtered to obtain filtered information, and then the filtered information is subjected to word segmentation to obtain a word list corresponding to the idioms to be processed.
Therefore, repeated or useless information in the description information can be filtered, a more effective word list is obtained, and the efficiency and the accuracy of label extraction are further improved.
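The filtering in S202 can be illustrated with a minimal sketch. The stop-word list, the function name `filter_description`, and the whitespace tokenization are assumptions made for illustration only; the embodiment does not specify them, and text without spaces between words would need a different tokenization.

```python
import re

# Hypothetical stop-word list; a real system would load a much fuller one.
STOP_WORDS = {"a", "the", "of", "to", "is", "since"}

def filter_description(description: str) -> str:
    """Remove symbols and stop words from description information (S202)."""
    # Replace every character that is neither a word character nor whitespace.
    # In Python 3, \w also matches non-ASCII letters, so CJK text is preserved.
    no_symbols = re.sub(r"[^\w\s]", " ", description)
    # Assumption: tokens are separated by whitespace.
    kept = [tok for tok in no_symbols.split() if tok.lower() not in STOP_WORDS]
    return " ".join(kept)

filtered = filter_description(
    "A person lives to the age of seventy, rare since ancient times."
)
print(filtered)  # person lives age seventy rare ancient times
```

The filtered string is what would then be passed to the word segmentation step.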
S203: Perform word segmentation on the filtered information to obtain a word list corresponding to the idiom to be processed.
For example, a shortest path algorithm may be used for the word segmentation. First, the description information is segmented into a plurality of word strings, and an association diagram between the word strings is constructed according to the association relationships between them. Then, a preset word frequency probability algorithm is applied to the association diagram to obtain the word frequency probability of each associated word of each word string. Finally, for each word string, the ambiguity produced when the description information was segmented is resolved according to the word frequency probabilities of its associated words, so that the words in the original text are identified more accurately.
Alternatively, an n-gram model, a maximum matching algorithm, a cross-ambiguity algorithm, or the like may be adopted, which is not limited in the embodiments of the present invention.
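One common way to realize shortest-path segmentation is sketched below. It assumes a unigram model: every dictionary word that occurs in the text is an edge whose weight is the negative log of its relative frequency, and the cheapest path from the start to the end of the text gives the segmentation. The toy dictionary `WORD_FREQ`, the function name `segment`, and the penalty for unknown characters are hypothetical; the embodiment does not state which word frequency probability algorithm is preset.

```python
import math

# Hypothetical toy dictionary with word counts (not taken from the source).
WORD_FREQ = {"seventy": 5, "years": 8, "old": 6, "year": 3, "sold": 1}

def segment(text, word_freq):
    """Segment text by finding the cheapest path through its word graph."""
    total = sum(word_freq.values())
    n = len(text)
    # best[i] = (cost of the cheapest path from position i to the end,
    #            end index of the first word on that path)
    best = [(math.inf, -1)] * (n + 1)
    best[n] = (0.0, n)
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            piece = text[i:j]
            if piece in word_freq:
                edge = -math.log(word_freq[piece] / total)
            elif j == i + 1:
                # Assumption: an unknown single character is allowed at a
                # heavy penalty so that a path always exists.
                edge = -math.log(0.5 / total)
            else:
                continue
            cost = edge + best[j][0]
            if cost < best[i][0]:
                best[i] = (cost, j)
    # Follow the stored end indices to recover the words.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words

print(segment("seventyyearsold", WORD_FREQ))  # ['seventy', 'years', 'old']
```

In this example the ambiguous span "yearsold" could be read as "year" + "sold" or "years" + "old"; the second reading wins because its words are more frequent in the toy dictionary.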
S204: Acquire the associated words of each word in the word list, and add the associated words to the word list.
The associated word of a word may be a synonym or near-synonym of the word; for example, for a word meaning "age", the associated word may be another word that also means "age". The associated word may also be a hypernym of the word; for example, if the word is "fifty", the associated word may be "age". To obtain the associated word of a word, the word may be looked up in a preset semantic dictionary, or it may be input into a pre-trained algorithm model for calculation; this is not specifically limited.
S205: Determine whether the number of words in the word list has changed. If so, return to S204; if not, execute S206.
For example, the associated word of each word in the word list is first acquired and added to the word list, and it is then determined whether the number of words in the word list has changed. If it has changed, the words in the word list continue to be summarized until the number of words no longer changes. Words whose semantic similarity to the idiom to be processed meets the preset condition are then screened from the word list as the tags corresponding to the idiom to be processed.
In this way, the words in the word list become richer and more general, which further improves the efficiency and accuracy of tag extraction.
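The loop formed by S204 and S205 is a fixed-point iteration: associated words are added until a full pass adds nothing new. A minimal sketch follows, assuming the preset semantic dictionary is a plain mapping from a word to its associated words; the mapping `ASSOCIATED` and the function name `expand_word_list` are hypothetical.

```python
# Hypothetical semantic dictionary: word -> associated words
# (synonyms or hypernyms). Not taken from the source.
ASSOCIATED = {
    "seventy years old": ["age"],
    "old": ["age"],
    "fifty": ["age"],
}

def expand_word_list(words, associated):
    """Add associated words until the word list stops growing (S204/S205)."""
    result = list(words)
    seen = set(result)
    while True:
        size_before = len(result)
        # S204: look up the associated words of every word currently listed.
        for word in list(result):
            for assoc in associated.get(word, []):
                if assoc not in seen:
                    seen.add(assoc)
                    result.append(assoc)
        # S205: stop once the number of words no longer changes.
        if len(result) == size_before:
            return result

print(expand_word_list(["people", "seventy years old", "rare"], ASSOCIATED))
# ['people', 'seventy years old', 'rare', 'age']
```

Because each word is added at most once, the loop terminates after a finite number of passes for any finite dictionary.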
S206: Screen, from the word list, the words whose semantic similarity to the idiom to be processed meets a preset condition, as the tags corresponding to the idiom to be processed.
For example, a Jaccard similarity coefficient algorithm or a cosine similarity algorithm may be used to calculate the semantic similarity between each word and the idiom to be processed, and the words whose semantic similarity meets the preset condition are then screened from the word list. Alternatively, a worker may manually review the words in the word list against the idiom to be processed and select the words whose semantic similarity meets the preset condition. The preset condition may be that the word has the highest semantic similarity, or that its semantic similarity reaches a preset threshold; this is not specifically limited.
Each idiom to be processed may correspond to a single tag or to multiple tags, and different idioms may correspond to the same tag or to different tags; this is not specifically limited.
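The two similarity measures and the two forms of the preset condition can be sketched as follows. How a word and an idiom are turned into comparable features is not specified in the embodiment; the feature sets below are hypothetical, as are the function names `jaccard`, `cosine`, and `select_tags`.

```python
import math
from collections import Counter

def jaccard(a, b):
    """Jaccard similarity coefficient of two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cosine(a, b):
    """Cosine similarity of two feature-count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_tags(scores, threshold=None):
    """Keep the highest-scoring words, or all words reaching the threshold."""
    if not scores:
        return []
    if threshold is None:
        top = max(scores.values())
        return [w for w, s in scores.items() if s == top]
    return [w for w, s in scores.items() if s >= threshold]

# Hypothetical features for one idiom and two candidate words.
idiom_features = {"age", "seventy", "rare", "old"}
word_features = {
    "age": {"age", "old", "seventy", "years"},
    "rare": {"rare", "scarce"},
}
scores = {w: jaccard(f, idiom_features) for w, f in word_features.items()}
print(select_tags(scores))                 # ['age']
print(select_tags(scores, threshold=0.1))  # ['age', 'rare']
```

With `threshold=None` the sketch applies the "highest semantic similarity" condition; passing a threshold applies the "reaches a preset threshold" condition instead.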
For example, suppose the description information of "ancient rare year" is "a person living to the age of seventy has been rare since ancient times", and the description information of "Mao die year" is "very old". After these two idioms to be processed and their description information are acquired, word segmentation may be performed on the description information to obtain the word list corresponding to each idiom: the word list of "ancient rare year" may be "people \ seventy years old \ rare", and the word list of "Mao die year" may be "old \ big". The words in the word lists may then be summarized at a higher level; for example, the associated word of "seventy years old" is "age", and the associated word of "old" is also "age". Finally, the word with the highest similarity to each idiom to be processed may be screened from its word list as the corresponding tag, so that the tags corresponding to "ancient rare year" and "Mao die year" may both be "age".
S207: Construct a knowledge graph of the plurality of idioms to be processed based on the association relationships between the idioms to be processed and the tags corresponding to each idiom to be processed.
A knowledge graph, also called a scientific knowledge map, is a visual map of a knowledge domain and can describe knowledge resources and their carriers by means of visualization technology. That is, based on the association relationships between the plurality of idioms to be processed and the tags corresponding to each idiom, a knowledge graph of the idioms can be constructed so that the idioms are described visually, which allows the idioms and the interrelations among them to be mined, analyzed, constructed, drawn, and displayed for the user.
For example, the knowledge graph may be constructed as follows: first, idiom entities corresponding to the idioms to be processed and tag entities corresponding to their tags are generated; then, association relationships between the idiom entities and the tag entities are established based on the association relationships between the idioms to be processed and their corresponding tags, yielding the knowledge graph of the plurality of idioms to be processed.
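The two construction steps (generate entities, then link them) can be sketched with plain Python collections. The dictionary layout and the function name `build_graph` are assumptions for illustration; the embodiment does not prescribe a storage format or a graph database.

```python
from collections import defaultdict

def build_graph(idiom_tags):
    """Build a small idiom/tag graph from a mapping idiom -> list of tags."""
    # Step 1: generate idiom entities and tag entities.
    idiom_entities = set(idiom_tags)
    tag_entities = {tag for tags in idiom_tags.values() for tag in tags}
    # Step 2: establish an edge for every idiom/tag association.
    edges = {(idiom, tag) for idiom, tags in idiom_tags.items() for tag in tags}
    # Reverse index so that idioms can later be looked up by tag.
    tag_to_idioms = defaultdict(set)
    for idiom, tag in edges:
        tag_to_idioms[tag].add(idiom)
    return {
        "idioms": idiom_entities,
        "tags": tag_entities,
        "edges": edges,
        "tag_to_idioms": dict(tag_to_idioms),
    }

graph = build_graph({
    "ancient rare year": ["age"],
    "Mao die year": ["age"],
})
print(sorted(graph["tag_to_idioms"]["age"]))
# ['Mao die year', 'ancient rare year']
```

Keeping a tag-to-idiom index alongside the edge set is one design choice that makes tag-based lookup a single dictionary access.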
In one implementation, after the knowledge graph of the plurality of idioms to be processed is constructed, a user can use the knowledge graph to query idioms.
For example, a user may input any term to be queried. After obtaining the term, the electronic device (the execution subject) may query the knowledge graph for a tag that matches the term, take that tag as the target tag, and then output the idioms to be processed that correspond to the target tag. In this way, the user can retrieve all idioms related to the queried term.
For example, when the user inputs "age", the electronic device (the execution subject) may query the knowledge graph for the tag matching "age" and then output the idioms to be processed that correspond to that target tag, such as "ancient rare year" and "Mao die year", thereby further meeting the user's idiom usage needs.
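The query flow can be sketched as a lookup in a tag index. The index `TAG_TO_IDIOMS`, the function name `query_idioms`, and the use of case-insensitive exact matching are assumptions; the embodiment only says that a tag "matching" the term is found and does not define the matching rule.

```python
# Hypothetical tag index, as might be derived from a stored knowledge graph.
TAG_TO_IDIOMS = {
    "age": ["ancient rare year", "Mao die year"],
}

def query_idioms(term, tag_to_idioms):
    """Return the idioms whose tag matches the queried term."""
    # Assumption: a tag matches when it equals the trimmed, lower-cased term.
    target_tag = term.strip().lower()
    return sorted(tag_to_idioms.get(target_tag, []))

print(query_idioms("age", TAG_TO_IDIOMS))
# ['Mao die year', 'ancient rare year']
print(query_idioms("weather", TAG_TO_IDIOMS))
# []
```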
As can be seen from the above, the idiom knowledge graph construction method provided in the embodiment of the present invention can determine a corresponding tag for each idiom to be processed, construct a knowledge graph based on the association relationships between the tags and the idioms to be processed, and store the knowledge graph, so that the stored idioms are better organized and a user can query idioms by tag.
An embodiment of the present invention further provides an idiom knowledge graph construction apparatus. Fig. 3 is a schematic structural diagram of the apparatus, which includes:
an obtaining module 301, configured to obtain multiple idioms to be processed and description information of each idiom to be processed;
a determining module 302, configured to analyze, for each idiom to be processed, description information of the idiom to be processed, and determine a tag corresponding to the idiom to be processed;
a building module 303, configured to build a knowledge graph of the multiple idioms to be processed based on the association relationship between the multiple idioms to be processed and the tags corresponding to each idiom to be processed.
In an implementation manner, the determining module 302 is specifically configured to:
performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed;
and screening the words with semantic similarity meeting preset conditions with the idioms to be processed from the word list to serve as labels corresponding to the idioms to be processed.
In an implementation manner, the determining module 302 is specifically configured to:
filtering stop words and symbols in the description information to obtain filtered information;
and performing word segmentation processing on the filtered information to obtain a word list corresponding to the idiom to be processed.
In one implementation, the determining module 302 is further configured to:
acquiring an associated word of each word in the word list, and adding the associated word to the word list;
determining whether the number of words in the word list has changed; if so, returning to the step of acquiring the associated word of each word in the word list and adding the associated word to the word list; if not, executing the step of screening, from the word list, the words whose semantic similarity to the idiom to be processed meets the preset condition as the tags corresponding to the idiom to be processed.
In an implementation manner, the building module 303 is specifically configured to:
generating idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to labels corresponding to the idioms to be processed respectively;
and establishing an association relationship between each idiom entity and each label entity based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed to obtain the knowledge graph of the multiple idioms to be processed.
In one implementation, the description information includes: the pronunciation, paraphrase and origin of the idiom to be processed.
In one implementation, the apparatus further includes:
a query module 304, configured to obtain a term to be queried; querying a label matched with the term to be queried in the knowledge graph as a target label; and outputting the idioms to be processed corresponding to the target tags.
As can be seen from the above, the idiom knowledge graph construction apparatus provided in the embodiment of the present invention can determine a corresponding tag for each idiom to be processed, construct a knowledge graph based on the association relationships between the tags and the idioms to be processed, and store the knowledge graph, so that the stored idioms are better organized and a user can query idioms by tag.
An embodiment of the present invention further provides an electronic device, as shown in fig. 4, including a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 complete mutual communication through the communication bus 404,
a memory 403 for storing a computer program;
the processor 401, when executing the program stored in the memory 403, implements the following steps:
acquiring a plurality of idioms to be processed and description information of each idiom to be processed;
analyzing the description information of the idiom to be processed aiming at each idiom to be processed, and determining a label corresponding to the idiom to be processed;
and constructing a knowledge graph of the multiple idioms to be processed based on the incidence relation between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
As can be seen from the above, the idiom knowledge graph construction method and apparatus provided in the embodiments of the present invention can determine a corresponding tag for each idiom to be processed, construct a knowledge graph based on the association relationships between the tags and the idioms to be processed, and store the knowledge graph, so that the stored idioms are better organized and a user can query idioms by tag.
In another embodiment of the present invention, there is further provided a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to execute the idiom knowledge graph construction method according to any one of the above embodiments.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to execute the idiom knowledge graph construction method according to any one of the above embodiments.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, the electronic device embodiment and the storage medium embodiment, since they are substantially similar to the method embodiment, the description is relatively simple, and in relation to the description, reference may be made to some portions of the description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (20)
1. An idiom knowledge graph construction method, characterized by comprising the following steps:
acquiring a plurality of idioms to be processed and description information of each idiom to be processed;
performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed;
for each word in the word list, performing association summarization processing on the word list based on the associated word of the word;
screening words with semantic similarity meeting preset conditions with the idiom to be processed from the processed word list, and taking the words as labels corresponding to the idiom to be processed;
and constructing the knowledge graphs of the multiple idioms to be processed based on the incidence relation between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
2. The method according to claim 1, wherein the constructing the knowledge graph of the multiple idioms based on the association relationship between the multiple idioms to be processed and the corresponding tags of each idiom to be processed comprises:
generating idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to labels corresponding to the idioms to be processed respectively;
and establishing an association relationship between each idiom entity and each label entity based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed to obtain the knowledge maps of the multiple idioms to be processed.
3. The method of claim 1, wherein the description information comprises: the pronunciation, paraphrase and origin of the idiom to be processed.
4. The method of claim 1, further comprising:
after the knowledge graph is stored, obtaining terms to be inquired;
querying a label matched with the term to be queried in the knowledge graph as a target label;
and outputting the idioms to be processed corresponding to the target tags.
5. The method according to claim 1, wherein the performing word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed comprises:
and performing word segmentation processing on the description information by adopting a shortest path algorithm to obtain a word list corresponding to the idiom to be processed.
6. The method of claim 5, wherein the performing word segmentation processing on the description information by using a shortest path algorithm to obtain a word list corresponding to the idiom to be processed comprises:
segmenting the description information to obtain a plurality of word string data;
constructing an association diagram among the word string data according to the association relationship among the word string data;
calculating the association diagram by using a preset word frequency probability algorithm to obtain the word frequency probability of each associated word of the word string data;
and aiming at each word string data, obtaining a word list corresponding to the idiom to be processed according to the word frequency probability of each relevant word of the word string data.
7. The method according to claim 1, wherein before the filtering, from the processed word list, words whose semantic similarity to the idiom to be processed satisfies a preset condition as tags corresponding to the idiom to be processed, the method further comprises:
calculating the semantic similarity between each word in the processed word list and the to-be-processed idiom based on a preset similarity algorithm; wherein the preset similarity algorithm is a Jaccard similarity coefficient algorithm or a cosine similarity algorithm.
8. The method of claim 1, wherein the associated word of the word comprises at least one of: a synonym of the term, and a hypernym of the term.
9. The method of claim 1, wherein before the associating and summarizing the word list based on the associated words of each word in the word list, the method further comprises:
inquiring each word in the word list in a preset semantic dictionary to obtain a relevant word of the word; or,
and aiming at each word in the word list, inputting the word into a pre-trained algorithm model to obtain the associated word of the word.
10. An idiom knowledge graph building apparatus, the apparatus comprising:
the acquisition module is used for acquiring a plurality of idioms to be processed and the description information of each idiom to be processed;
the word list generating module is used for carrying out word segmentation processing on the description information to obtain a word list corresponding to the idiom to be processed;
the association summarization processing module is used for carrying out association summarization processing on the word list based on the associated words of each word in the word list;
the label acquisition module is used for screening the words of which the semantic similarity with the idiom to be processed meets the preset condition from the processed word list to be used as labels corresponding to the idiom to be processed;
and the construction module is used for constructing the knowledge graph of the multiple idioms to be processed based on the incidence relation between the multiple idioms to be processed and the labels corresponding to the idioms to be processed.
11. The apparatus according to claim 10, wherein the building block is specifically configured to:
generating idiom entities corresponding to the multiple idioms to be processed and label entities corresponding to labels corresponding to the idioms to be processed respectively;
and establishing an association relationship between each idiom entity and each label entity based on the association relationship between the multiple idioms to be processed and the labels corresponding to the idioms to be processed to obtain the knowledge maps of the multiple idioms to be processed.
12. The apparatus of claim 10, wherein the description information comprises: the pronunciation, paraphrase and origin of the idiom to be processed.
13. The apparatus of claim 10, further comprising:
the query module is used for acquiring terms to be queried; querying a label matched with the term to be queried in the knowledge graph as a target label; and outputting the idioms to be processed corresponding to the target tags.
14. The apparatus according to claim 10, wherein the word list generating module is specifically configured to perform word segmentation processing on the description information by using a shortest path algorithm to obtain a word list corresponding to the to-be-processed idiom.
15. The apparatus according to claim 14, wherein the word list generating module is specifically configured to perform segmentation processing on the description information to obtain a plurality of word string data;
constructing an association diagram among the word string data according to the association relationship among the word string data;
calculating the association diagram by using a preset word frequency probability algorithm to obtain the word frequency probability of each associated word of the word string data;
and aiming at each word string data, obtaining a word list corresponding to the idiom to be processed according to the word frequency probability of each relevant word of the word string data.
16. The apparatus of claim 10, further comprising:
the semantic similarity calculation module is used for calculating, based on a preset similarity algorithm, the semantic similarity between each word in the word list and the idiom to be processed before the words whose semantic similarity to the idiom to be processed meets the preset condition are screened from the word list as the labels corresponding to the idiom to be processed; wherein the preset similarity algorithm is a Jaccard similarity coefficient algorithm or a cosine similarity algorithm.
17. The apparatus of claim 10, wherein the associated word of the word comprises at least one of: a synonym of the term, and a hypernym of the term.
18. The apparatus of claim 10, further comprising:
the relevant word acquisition module is used for inquiring each word in the word list in a preset semantic dictionary to obtain a relevant word of the word; or,
and aiming at each word in the word list, inputting the word into a pre-trained algorithm model to obtain the associated word of the word.
19. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method of any one of claims 1 to 9 when executing a program stored in a memory.
20. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116596.1A CN112784063B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116596.1A CN112784063B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
CN201910200003.2A CN109977233B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910200003.2A Division CN109977233B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112784063A (en) | 2021-05-11 |
CN112784063B (en) | 2024-07-02 |
Family
ID=67079105
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116596.1A Active CN112784063B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
CN202110116579.8A Active CN112784062B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
CN201910200003.2A Active CN109977233B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116579.8A Active CN112784062B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
CN201910200003.2A Active CN109977233B (en) | 2019-03-15 | 2019-03-15 | Idiom knowledge graph construction method and device |
Country Status (1)
Country | Link |
---|---|
CN (3) | CN112784063B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110442735B (en) * | 2019-08-13 | 2022-05-13 | 北京金山数字娱乐科技有限公司 | Idiom synonym recommendation method and device |
CN110688838B (en) * | 2019-10-08 | 2023-07-18 | 北京金山数字娱乐科技有限公司 | Idiom synonym list generation method and device |
CN111125369A (en) * | 2019-11-25 | 2020-05-08 | 深圳壹账通智能科技有限公司 | Tacit degree detection method, equipment, server and readable storage medium |
CN111309872B (en) * | 2020-03-26 | 2023-08-08 | 北京百度网讯科技有限公司 | Search processing method, device and equipment |
CN113569051A (en) * | 2020-04-29 | 2021-10-29 | 北京金山数字娱乐科技有限公司 | Knowledge graph construction method and device |
CN113127626B (en) * | 2021-04-22 | 2024-04-30 | 广联达科技股份有限公司 | Recommendation method, device, equipment and readable storage medium based on knowledge graph |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009230173A (en) * | 2008-03-19 | 2009-10-08 | Nec Corp | Synonym conversion system, synonym conversion method and synonym-converting program |
CN103853702A (en) * | 2012-12-06 | 2014-06-11 | 富士通株式会社 | Device and method for correcting idiom error in linguistic data |
CN104484411A (en) * | 2014-12-16 | 2015-04-01 | 中国科学院自动化研究所 | Building method for semantic knowledge base based on a dictionary |
CN105589728A (en) * | 2015-12-16 | 2016-05-18 | 西安文理学院 | Sub-graph semantic isomorphism based instruction idiom identification method |
CN106778862A (en) * | 2016-12-12 | 2017-05-31 | 上海智臻智能网络科技股份有限公司 | A kind of information classification approach and device |
US20170193393A1 (en) * | 2016-01-04 | 2017-07-06 | International Business Machines Corporation | Automated Knowledge Graph Creation |
CN107451126A (en) * | 2017-08-21 | 2017-12-08 | 广州多益网络股份有限公司 | A kind of near synonym screening technique and system |
CN107526812A (en) * | 2017-08-24 | 2017-12-29 | 北京奇艺世纪科技有限公司 | A kind of searching method, device and electronic equipment |
CN107748754A (en) * | 2017-09-15 | 2018-03-02 | 广州唯品会研究院有限公司 | A kind of knowledge mapping improving method and device |
WO2019041524A1 (en) * | 2017-08-31 | 2019-03-07 | 平安科技(深圳)有限公司 | Method, electronic apparatus, and computer readable storage medium for generating cluster tag |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102074235B (en) * | 2010-12-20 | 2013-04-03 | 上海华勤通讯技术有限公司 | Method of video speech recognition and search |
US11074495B2 (en) * | 2013-02-28 | 2021-07-27 | Z Advanced Computing, Inc. (Zac) | System and method for extremely efficient image and pattern recognition and artificial intelligence platform |
US9904579B2 (en) * | 2013-03-15 | 2018-02-27 | Advanced Elemental Technologies, Inc. | Methods and systems for purposeful computing |
CN104516904B (en) * | 2013-09-29 | 2018-04-03 | 北大方正集团有限公司 | Key point recommendation method and system |
CN103577549B (en) * | 2013-10-16 | 2017-02-15 | 复旦大学 | Crowd portrayal system and method based on microblog label |
CN104090955A (en) * | 2014-07-07 | 2014-10-08 | 科大讯飞股份有限公司 | Automatic audio/video label labeling method and system |
CN104484459B (en) * | 2014-12-29 | 2019-07-23 | 北京奇虎科技有限公司 | Method and device for merging entities in a knowledge graph |
EP3101534A1 (en) * | 2015-06-01 | 2016-12-07 | Siemens Aktiengesellschaft | Method and computer program product for semantically representing a system of devices |
CN107967267A (en) * | 2016-10-18 | 2018-04-27 | 中兴通讯股份有限公司 | Knowledge graph construction method, apparatus and system |
CN106815293A (en) * | 2016-12-08 | 2017-06-09 | 中国电子科技集团公司第三十二研究所 | System and method for constructing knowledge graph for information analysis |
CN106919689B (en) * | 2017-03-03 | 2018-05-11 | 中国科学技术信息研究所 | Professional domain knowledge mapping dynamic fixing method based on definitions blocks of knowledge |
US10614106B2 (en) * | 2017-03-10 | 2020-04-07 | Eduworks Corporation | Automated tool for question generation |
CN107368468B (en) * | 2017-06-06 | 2020-11-24 | 广东广业开元科技有限公司 | Operation and maintenance knowledge map generation method and system |
CN107562918A (en) * | 2017-09-12 | 2018-01-09 | 北京点易通科技有限公司 | Mathematical problem knowledge point discovery and batch label acquisition method |
CN107665252B (en) * | 2017-09-27 | 2020-08-25 | 深圳证券信息有限公司 | Method and device for creating knowledge graph |
US20190179878A1 (en) * | 2017-12-12 | 2019-06-13 | Google Llc | Generating organization-specific tags for communications from various sources of an organization using an expanded knowledge graph for organization-specific content |
CN108255813B (en) * | 2018-01-23 | 2021-11-16 | 重庆邮电大学 | Text matching method based on word frequency-inverse document and CRF |
CN109189939A (en) * | 2018-09-05 | 2019-01-11 | 安阳师范学院 | Chinese character semantic knowledge graph construction method, device, equipment and storage medium |
2019
- 2019-03-15 CN application CN202110116596.1A, patent CN112784063B, status Active
- 2019-03-15 CN application CN202110116579.8A, patent CN112784062B, status Active
- 2019-03-15 CN application CN201910200003.2A, patent CN109977233B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN109977233B (en) | 2021-02-23 |
CN109977233A (en) | 2019-07-05 |
CN112784062B (en) | 2024-06-04 |
CN112784062A (en) | 2021-05-11 |
CN112784063B (en) | 2024-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977233B (en) | Idiom knowledge graph construction method and device |
CN111581976B (en) | Medical term standardization method, device, computer equipment and storage medium |
CN108121700B (en) | Keyword extraction method and device and electronic equipment |
WO2020001373A1 (en) | Method and apparatus for ontology construction |
WO2018157805A1 (en) | Automatic questioning and answering processing method and automatic questioning and answering system |
WO2019091026A1 (en) | Knowledge base document rapid search method, application server, and computer readable storage medium |
CN106033416B (en) | Character string processing method and device |
CN111797214A (en) | FAQ database-based problem screening method and device, computer equipment and medium |
US20130060769A1 (en) | System and method for identifying social media interactions |
CN107102993B (en) | User appeal analysis method and device |
CN110019474B (en) | Automatic synonymy data association method and device in heterogeneous database and electronic equipment |
US11556812B2 (en) | Method and device for acquiring data model in knowledge graph, and medium |
CN109947903B (en) | Idiom query method and device |
CN110990532A (en) | Method and device for processing text |
CN113609847B (en) | Information extraction method, device, electronic equipment and storage medium |
CN108804550B (en) | Query term expansion method and device and electronic equipment |
CN116226350A (en) | Document query method, device, equipment and storage medium |
CN111553556A (en) | Business data analysis method and device, computer equipment and storage medium |
CN114091426A (en) | Method and device for processing field data in data warehouse |
CN117112595A (en) | Information query method and device, electronic equipment and storage medium |
CN110705285B (en) | Government affair text subject word library construction method, device, server and readable storage medium |
CN112148841A (en) | Object classification and classification model construction method and device |
CN118093629A (en) | Database query statement generation method, device, equipment and medium |
CN116383412B (en) | Functional point amplification method and system based on knowledge graph |
CN117216275A (en) | Text processing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |