CN106844658A - Automatic construction method and system of Chinese text knowledge graph - Google Patents

Automatic construction method and system of Chinese text knowledge graph

Info

Publication number
CN106844658A
Authority
CN
China
Prior art keywords
document
entity
relation
word
storehouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710050095.1A
Other languages
Chinese (zh)
Other versions
CN106844658B (en)
Inventor
苏晓恒
万海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201710050095.1A priority Critical patent/CN106844658B/en
Publication of CN106844658A publication Critical patent/CN106844658A/en
Application granted granted Critical
Publication of CN106844658B publication Critical patent/CN106844658B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 - Ontology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/237 - Lexical tools
    • G06F40/247 - Thesauruses; Synonyms
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The method provided by the present invention can construct knowledge graphs from Chinese text. Moreover, as the method is used more often, the text library, relation library and entity library of each field are progressively expanded, so the quality of the constructed knowledge graphs keeps improving.

Description

Automatic construction method and system of Chinese text knowledge graph
Technical field
The present invention relates to the field of knowledge graph construction, and more particularly to a method and system for automatically constructing a knowledge graph from Chinese text.
Background art
A knowledge graph is an organizational structure for knowledge, named for its graph-like form. A typical knowledge graph usually contains a series of concepts, instances and relations. Compared with plain text, a knowledge graph is structured: the nodes in the graph represent concepts or instances, and the edges between nodes represent the relations between them, whereas text is generally regarded as unstructured. Knowledge graphs are widely applied, including in semantic search, intelligent question answering, knowledge engineering, data mining and digital libraries. In general, knowledge graph construction is divided into manual construction, automatic construction and semi-automatic construction. Manual construction consumes great manpower and material resources and is difficult to adjust as knowledge changes. Automatic construction relies on knowledge acquisition, machine learning and statistical techniques to obtain a knowledge graph from unstructured data resources. Semi-automatic construction lies between manual and automatic construction, because full automation is difficult to achieve.
At present, the main methods for constructing knowledge graphs are methods based on lexical patterns, methods based on clustering and methods based on distributional similarity. Lexical-pattern methods predefine some patterns and then extract the corresponding concepts and relations from text; for example, the pattern "Fruit such as apple" indicates that an apple is a kind of fruit. Clustering methods cluster concepts or instances according to their features and typically yield a knowledge graph of hierarchical relations. Distributional-similarity methods rest mainly on the assumption that words with similar contexts have similar meanings; for example, given that Beijing is the capital of China and Tokyo is the capital of Japan, Beijing and China have contexts similar to those of Tokyo and Japan. Knowledge graph construction technology started early and has developed quickly abroad, but in China there is currently no complete system that can automatically extract a knowledge graph from Chinese text. The main reason is that English has a fixed form and simple expression and needs no word segmentation, whereas Chinese has a complex structure and varied forms of expression and requires word segmentation.
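The lexical-pattern approach mentioned above can be illustrated with a minimal sketch. The regular expression and the function name `extract_is_a` are hypothetical and chosen only for this illustration; they handle just the single English pattern "X such as Y" from the example and are not part of the invention.

```python
import re

# Hypothetical pattern for illustration only: "<hypernym> such as <hyponym>".
SUCH_AS = re.compile(r"(\w+) such as (\w+)", re.IGNORECASE)


def extract_is_a(text):
    """Return (hyponym, 'is-a', hypernym) triples for every pattern match."""
    return [(m.group(2), "is-a", m.group(1)) for m in SUCH_AS.finditer(text)]


triples = extract_is_a("Fruit such as apple")
```

Here `triples` holds one triple stating that "apple" is a kind of "Fruit". Real pattern-based systems use many more patterns and operate on segmented, tagged text.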
Summary of the invention
To solve the above problems of the prior art, the present invention provides a method for automatically constructing a knowledge graph from Chinese text; with this method, the knowledge graph of a Chinese text can be built.
To achieve the above object, the following technical solution is adopted:
A method for automatically constructing a knowledge graph from Chinese text, comprising the following steps:
S1. Crawl documents of every field from online encyclopedias, then extract entities and relations according to the knowledge organization structure of the encyclopedia pages and store them in the entity library and relation library of the corresponding field; the crawled documents of each field are also stored in the text library of the corresponding field;
S2. If a document j needs a knowledge graph to be constructed for it, perform the following processing on it;
S3. Perform word segmentation on document j;
S4. Extract the core words of document j;
S5. Extract the primary words of document j using TF-IDF;
S6. Determine the field to which document j belongs:
S61. Find all words of document j, calculate their TF-IDF values, and obtain the vocabulary vector representation of document j according to the order of the words;
S62. Obtain the vocabulary vector representation of the documents of each field using the method of step S61, then calculate the cosine between the vocabulary vector representation of document j and that of the documents of each field; the field whose documents give the largest cosine is the field to which document j belongs; document j is then stored in the text library of that field;
S7. Extract the (entity, relation, entity) triples in document j:
S71. Select from document j the sentences in which field words occur as transactions, a transaction being the set of all terms in a selected sentence, where the field words are the terms collected in the entity library and relation library of the field to which document j belongs;
S72. Calculate the support of each term in the transactions, and treat the terms whose support exceeds a threshold as frequent items;
S73. Calculate the confidence between every two frequent items; if the confidence between two frequent items exceeds a threshold, extract the two frequent items as a word pair;
S74. Combine the words of the word pairs, the core words and the primary words into a term set, locate all sentences in document j that contain a term from the term set, then perform coreference resolution on these sentences and delete their minor components, obtaining the nouns and verbs needed to extract the (entity, relation, entity) triples;
S75. First find the verb in a sentence, then combine the nouns before and after the verb with the verb into a candidate (noun, verb, noun) triple; then use similarity analysis to calculate the similarity between the relations in the relation library of the field to which document j belongs and the verb in the candidate triple; if the similarity exceeds a threshold, put the verb into the relation library of the field to which document j belongs and put the nouns of the candidate triple into the entity library of that field; the candidate (noun, verb, noun) triple then becomes a formal (entity, relation, entity) triple extracted from document j;
S76. If step S75 fails to extract an (entity, relation, entity) triple, find another noun in the sentence besides the core word, then use similarity analysis to calculate the similarity between the entities in the entity library of the field to which document j belongs and that noun; if the similarity exceeds a threshold, find the word between the two nouns and use similarity analysis to calculate its similarity to the relations in the relation library of the field to which document j belongs; if that similarity exceeds a threshold, put the word into the relation library of that field and put the noun extracted in step S75 into the entity library of that field; an (entity, relation, entity) triple extracted from document j is thereby obtained;
S8. Generate the knowledge graph of document j from the extracted (entity, relation, entity) triples.
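Steps S61 and S62 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each field is represented by one concatenated token list taken from its text library, the IDF term uses a smoothed logarithmic form that the source does not specify, and the names `tfidf_vector`, `cosine` and `classify` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vector(tokens, vocabulary, doc_freq, n_docs):
    """Build a TF-IDF vector over a fixed vocabulary order (step S61)."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    vector = []
    for word in vocabulary:
        tf = counts[word] / total
        # Smoothed IDF: an assumption, the source does not give a formula.
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(word, 0))) + 1.0
        vector.append(tf * idf)
    return vector


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(doc_tokens, field_docs):
    """Return the field with the largest cosine and all scores (step S62).

    field_docs maps a field name to the token list of that field's texts.
    """
    all_docs = list(field_docs.values()) + [doc_tokens]
    vocabulary = sorted({w for doc in all_docs for w in doc})
    doc_freq = Counter(w for doc in all_docs for w in set(doc))
    n_docs = len(all_docs)
    target = tfidf_vector(doc_tokens, vocabulary, doc_freq, n_docs)
    scores = {
        name: cosine(target, tfidf_vector(tokens, vocabulary, doc_freq, n_docs))
        for name, tokens in field_docs.items()
    }
    return max(scores, key=scores.get), scores


field_docs = {
    "science": ["memory", "disk", "cpu", "memory"],
    "sports": ["football", "goal", "team"],
}
best_field, scores = classify(["memory", "cpu", "cache"], field_docs)
```

With this toy data the document shares words only with the "science" field, so that field receives the largest cosine.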
Preferably, step S3 uses the jieba tool to perform word segmentation on document j.
Preferably, the similarity analysis uses Word2vec or the Chinese synonym thesaurus Tongyici Cilin.
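The candidate-triple logic of step S75 can be sketched as below. The sketch assumes the sentence has already been segmented and part-of-speech tagged (tags `"n"` for nouns and `"v"` for verbs are an assumption), and the similarity function is passed in as a parameter because the trained Word2vec model and the thesaurus are not reproduced here. The function names and the default threshold are hypothetical.

```python
def candidate_triples(tagged):
    """Build (noun, verb, noun) candidates from a list of (word, pos) pairs.

    For each verb, take the nearest noun before it and the nearest noun
    after it; skip the verb if either side has no noun.
    """
    triples = []
    for i, (word, pos) in enumerate(tagged):
        if pos != "v":
            continue
        before = next((w for w, p in reversed(tagged[:i]) if p == "n"), None)
        after = next((w for w, p in tagged[i + 1:] if p == "n"), None)
        if before is not None and after is not None:
            triples.append((before, word, after))
    return triples


def accept(triple, relation_library, similarity, threshold=0.7):
    """Accept a candidate if its verb is similar enough to a known relation."""
    _, verb, _ = triple
    return any(similarity(verb, rel) >= threshold for rel in relation_library)


def combined_similarity(sim_a, sim_b, weight=0.5):
    """Weighted combination of two word-similarity functions."""
    return lambda x, y: weight * sim_a(x, y) + (1 - weight) * sim_b(x, y)


# Toy similarity used only for this illustration: exact match or nothing.
exact = lambda x, y: 1.0 if x == y else 0.0

tagged = [("老师", "n"), ("看见", "v"), ("小明", "n")]
candidates = candidate_triples(tagged)
accepted = [t for t in candidates if accept(t, {"看见"}, exact)]
```

In practice `similarity` would wrap a word-vector model or a thesaurus lookup; the exact-match function above only stands in for one.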
The present invention also provides a system that applies the above method, as follows:
The system comprises a knowledge database module for every field, a document processing module, an entity and relation extraction module and a knowledge graph generation module. The knowledge database module for every field comprises the entity library, relation library and text library of each field; the document processing module performs steps S3 to S62; the entity and relation extraction module performs steps S7 to S76; and the knowledge graph generation module performs step S8.
Compared with the prior art, the beneficial effects of the present invention are:
The method provided by the present invention can construct knowledge graphs from Chinese text. Moreover, as the method is used more often, the text library, relation library and entity library of each field are progressively expanded, so the quality of the constructed knowledge graphs keeps improving.
Brief description of the drawings
Fig. 1 is a schematic diagram of the knowledge database module.
Fig. 2 is the schematic diagram of document process module.
Fig. 3 is a schematic diagram of the entity and relation extraction module and the knowledge graph generation module.
Detailed description of the embodiments
The accompanying drawings are for illustration only and shall not be construed as limiting this patent.
The present invention is further described below with reference to the drawings and embodiments.
Embodiment 1
The system provided by the present invention mainly comprises four modules: the document processing module, the entity and relation extraction module, the knowledge graph generation module and the knowledge database module for every field. In the workflow, the document processing module first preprocesses the input document, the entity and relation extraction module then extracts the entities and relations in the document, and finally the extracted entities and relations are sent to the knowledge graph generation module, which builds the complete knowledge graph, returns it to the user and updates the data in the knowledge graph generation module. Each module is described in detail below.
The knowledge database module for every field mines the knowledge data of each field from online encyclopedias and saves it for use in building the knowledge graphs of documents. Following the encyclopedia categories, the invention divides the knowledge database into 11 major categories (art, science, nature, culture, geography, life, society, people, economy, sports and history), and each major category is further divided into more detailed groups; for example, the science category is divided into 16 groups, including health care, electronic information, aerospace, automotive engineering and biomedicine. The purpose is to sort the knowledge databases of all areas out of the encyclopedias, so that any given document can be mapped by a classification algorithm to the knowledge database module of a specific field, and the knowledge graph of the document can then be built heuristically using the knowledge database module of that field. As shown in Fig. 1, the knowledge database module is further divided into a system knowledge database and a user knowledge database. The system knowledge database is the knowledge database that ships with the system at initialization; it is used to heuristically discover new knowledge data in user documents, and users have no permission to operate on it. The user knowledge database holds the knowledge data obtained while the user extracts documents; users have permission to operate on it and can customize it or add dictionaries on their own according to their work requirements, thereby improving the overall construction quality. The network processing layer is responsible for crawling all documents of each category from online encyclopedias (Baidu Baike, Hudong Baike and Chinese Wikipedia), extracting entities and relations according to the knowledge organization structure of the encyclopedia pages and putting them into the entity library and relation library of the knowledge database module; the processed encyclopedia text is also retained and put into the text library. Models such as word2vec are then trained and put into the knowledge database module. The knowledge database module therefore consists of four parts: the entity library, the relation library, the text library and the models.
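One possible in-memory layout for the knowledge database module described above is sketched here. The class and method names are hypothetical, the trained models are omitted, and the read-only status of the system layer is only a convention in this sketch rather than an enforced permission.

```python
from dataclasses import dataclass, field


@dataclass
class FieldKnowledgeBase:
    """Entity library, relation library and text library of one field."""
    entities: set = field(default_factory=set)
    relations: set = field(default_factory=set)
    texts: list = field(default_factory=list)


@dataclass
class KnowledgeDatabase:
    """System layer (shipped with the system) plus user layer (writable)."""
    system: dict = field(default_factory=dict)  # field name -> FieldKnowledgeBase
    user: dict = field(default_factory=dict)    # field name -> FieldKnowledgeBase

    def field_words(self, name):
        """All terms in the entity and relation libraries of a field."""
        words = set()
        for layer in (self.system, self.user):
            kb = layer.get(name)
            if kb is not None:
                words |= kb.entities | kb.relations
        return words

    def add_user_triple(self, name, head, relation, tail):
        """Record a newly extracted triple in the user layer only."""
        kb = self.user.setdefault(name, FieldKnowledgeBase())
        kb.entities.update((head, tail))
        kb.relations.add(relation)


db = KnowledgeDatabase()
db.system["science"] = FieldKnowledgeBase(entities={"内存"}, relations={"包含"})
db.add_user_triple("science", "内存", "包含", "内存条")
```

Keeping the two layers in separate dictionaries mirrors the description: extraction results grow the user layer while the system layer stays as initialized.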
The document processing module preprocesses a document into smaller processing units. It mainly comprises four parts: preprocessing, core word extraction, primary word extraction and document classification. As shown in Fig. 2, in the preprocessing part the invention places each sentence of the document on its own line and performs Chinese word segmentation on the document using the jieba tool. jieba performs well, segmenting efficiently and accurately, and it allows a term dictionary to be imported, whose vocabulary jieba gives priority to during segmentation; the invention merges the entity library and relation library of the knowledge database module into jieba's term dictionary. Besides segmentation, stop words also need to be filtered out. The core word extraction part mainly extracts the core (subject) words of the document. The invention proposes a core word extraction algorithm based on the assumption that every document is written around core (subject) words and that core words are distributed in blocks; for example, paragraphs 1 to 5 describe memory and paragraphs 6 to 12 describe hard disks. In addition, each core word block contains corresponding knowledge that further explains the term; for example, the core word block for memory is likely to mention the concept of a memory module, or the relation between memory and memory modules. This is also the basis on which the invention builds the knowledge graph. The core word extraction algorithm has two aspects. First, the term density of each segmented word is calculated (i.e., the total number of times the term occurs in the document divided by the number of lines the term spans), and a threshold is defined to select candidate core words. Then the uniformity of each candidate word is calculated, i.e., whether the candidate word is evenly distributed within its word block: the invention divides the candidate word block into multiple segments, for example one segment per three lines, and the uniformity of the candidate word equals the proportion of segments in which the candidate word occurs. The document classification part finds the knowledge database module of a particular category for the document; because the system library holds initial information, it can heuristically help discover more new entities and relations under that category. Here the invention classifies text using cosine similarity, as follows. First, all words (content words) in the document are found and their TF-IDF values are calculated; the vocabulary vector representation of the document is then obtained according to the order of the overall vocabulary. The value of each element represents the contribution of that word to the document: documents of each category are essentially composed of fixed combinations of specialized vocabulary, so the vector reflects how the specialized vocabulary of a category contributes to documents of that category. Finally, using the text library in the knowledge database module, the invention calculates the cosine between the document vector and each category vector in the text library and assigns the document to the field with the largest value. As mentioned above, a core word block usually contains second-tier important words that help describe the concept of the core word; for example, "memory module" helps describe the concept of memory. The main function of the primary word extraction part is therefore to extract these second-tier important words, which are the primary words referred to in this document. The invention extracts them using TF-IDF, which is also used in the document classification described above. TF (Term Frequency) is the frequency with which the term occurs in the document; IDF (Inverse Document Frequency) reflects how frequently the term occurs in the documents of the knowledge database of the same category; TF-IDF is the product of the two values and represents the overall importance of the term in the document. The invention sets a relatively high threshold and then selects the important words. The invention then puts the core words and primary words into the knowledge database module.
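The density and uniformity measures used for core word extraction can be sketched as follows. The sketch assumes the document is already a list of token lists, one per line, and treats the span from a word's first to its last occurrence as its word block. The thresholds and the `min_count` filter (which keeps single-occurrence words from scoring a perfect density) are assumptions that the source does not specify.

```python
def occurrence_lines(lines, term):
    """Indices of the lines in which the term occurs."""
    return [i for i, tokens in enumerate(lines) if term in tokens]


def term_density(lines, term):
    """Total occurrences divided by the number of lines the term spans."""
    hits = occurrence_lines(lines, term)
    if not hits:
        return 0.0
    occurrences = sum(tokens.count(term) for tokens in lines)
    return occurrences / (hits[-1] - hits[0] + 1)


def uniformity(lines, term, segment_size=3):
    """Share of fixed-size segments of the word block that contain the term."""
    hits = occurrence_lines(lines, term)
    if not hits:
        return 0.0
    block = lines[hits[0]:hits[-1] + 1]
    segments = [block[i:i + segment_size] for i in range(0, len(block), segment_size)]
    covered = sum(1 for seg in segments if any(term in tokens for tokens in seg))
    return covered / len(segments)


def core_words(lines, min_density=0.5, min_uniformity=0.5, min_count=2):
    """Terms passing the density, uniformity and occurrence filters."""
    vocabulary = {t for tokens in lines for t in tokens}
    return sorted(
        t for t in vocabulary
        if sum(tokens.count(t) for tokens in lines) >= min_count
        and term_density(lines, t) >= min_density
        and uniformity(lines, t, 3) >= min_uniformity
    )


lines = [["内存", "是", "部件"], ["内存", "速度"], ["硬盘", "容量"], ["内存", "条"]]
selected = core_words(lines)
```

In this toy document "内存" occurs three times across a four-line span, giving a density of 0.75, and it appears in both three-line segments of its block, giving a uniformity of 1.0.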
The entity and relation extraction module is responsible for extracting the (entity, relation, entity) triples from the document. As shown in Fig. 3, the association analysis part mines word pairs in the document that may be related, since these pairs carry the relations the invention aims to extract. The invention adapts the Apriori association analysis technique. First, the sentences of the document in which field words occur are selected as transactions (field words here are the terms collected in the entity libraries and relation libraries of the system knowledge database and the user knowledge database); a transaction is the set of all terms in a selected sentence, so each transaction contains at least one field word. Then the support of each term in the document is calculated, i.e., the proportion of all transactions that contain the term, and terms whose support exceeds a certain threshold are treated as frequent items. Next, the confidence of each pair of frequent items is calculated, i.e., the ratio relating the transactions that contain one of the frequent items to the transactions that contain both; pairs whose confidence exceeds a certain threshold are extracted as word pairs. Most such word pairs include a core word, but some do not. The sentence reduction part deletes the minor components of important sentences in order to uncover the relation within each word pair. The invention combines the words of the frequent word pairs with the core words and primary words of the document into a term set and locates all sentences in the document that contain a term from this set; association analysis does not consider the structure and composition of sentences, but it lets the invention determine which words may be related. Coreference resolution is performed first: a pronoun in a clause of a sentence is replaced with the noun it refers to. For example, "Beijing is the capital of China, and the Chinese people love it deeply" becomes, after coreference resolution, "Beijing is the capital of China, and the Chinese people love Beijing deeply". A sentence is one processing unit, but a long sentence usually consists of several clauses, and clauses also carry information; resolving references enriches the meaning of each clause, which can then serve as a processing unit of the invention. The invention directly calls the coreference resolution package in the Stanford natural language processing toolkit. The invention then prunes nouns or verbs from the sentence according to predefined patterns, i.e., deletes the minor components of the sentence so as to retain the nouns and verbs the invention actually needs. Take the "preposition + noun" form: in "The teacher saw Xiao Ming in the park.", the core of the sentence is "the teacher saw Xiao Ming", while "in the park" is a minor component, so the noun "park" must be deleted; otherwise it would interfere with entity extraction. The invention has summarized several patterns that need to be pruned, such as "preposition + adjective + noun" and "preposition + verb". The entity and relation extraction part extracts nouns or verbs from the pruned sentences as entities or relations. The verb in the sentence is found first, and the noun immediately before the verb and the noun immediately after it are combined with the verb into a candidate (noun, verb, noun) triple. Similarity analysis is then used to calculate the similarity between the relations in the relation libraries of the system knowledge database and the user knowledge database and the verb in the candidate triple; if the similarity exceeds a certain threshold, the invention puts the verb into the relation library of the user knowledge database and puts the other noun into the entity library of the user knowledge database. The system also fully considers the case in which a noun acts as the relation: when the first extraction method fails, the invention finds another noun in the sentence besides the core word and uses similarity analysis to calculate the similarity between the entities in the entity libraries of the system knowledge database and the user knowledge database and that noun. If the similarity exceeds a certain threshold (generally set relatively high), the invention looks for the words (verbs or nouns) between the two nouns and calculates their similarity to the relation library; if it exceeds a certain threshold, the corresponding noun or verb is put into the relation library and the previously extracted noun is put into the entity library. The similarity analysis part is concerned with computing similarity, i.e., given two words, computing the semantic similarity between them. Two techniques are mainly used: word2vec and the Chinese synonym thesaurus Tongyici Cilin. word2vec is a tool open-sourced by Google in 2013 that represents words as real-valued vectors; drawing on ideas from deep learning, it reduces, through training, the processing of text content to vector operations in a K-dimensional vector space, and similarity in the vector space can be used to represent semantic similarity between texts. The invention makes full use of the text library in the system knowledge database to train this model, thereby obtaining the model stored in the system knowledge database; when performing similarity analysis, the invention directly calls the model in the knowledge database. The other similarity tool is the extended edition of Tongyici Cilin compiled by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology. This thesaurus is organized into five levels, and the invention uses the hierarchical relations to compute the similarity of two words. The invention combines the two techniques, word2vec and the thesaurus, by weighting to compute the similarity of two words.
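The support and confidence computation of the association analysis part can be sketched as below. The sketch uses the standard Apriori definitions, which is an assumption about the intended direction of the ratio: support is the share of transactions containing a term, and the confidence of a pair is the share of transactions containing one term that also contain the other (the larger of the two directions is used). The function name and thresholds are hypothetical.

```python
from itertools import combinations


def mine_word_pairs(sentences, field_words, min_support=0.5, min_confidence=0.6):
    """Return pairs of frequent terms whose confidence passes the threshold.

    sentences is a list of token lists; only sentences containing at least
    one field word become transactions.
    """
    transactions = [set(s) for s in sentences if field_words & set(s)]
    if not transactions:
        return []
    n = len(transactions)

    def count(*terms):
        # Number of transactions that contain all the given terms.
        return sum(1 for t in transactions if all(x in t for x in terms))

    vocabulary = set().union(*transactions)
    frequent = sorted(t for t in vocabulary if count(t) / n >= min_support)

    pairs = []
    for a, b in combinations(frequent, 2):
        both = count(a, b)
        confidence = max(both / count(a), both / count(b))
        if confidence >= min_confidence:
            pairs.append((a, b))
    return pairs


sentences = [
    ["内存", "包含", "内存条"],
    ["内存", "速度", "快"],
    ["内存条", "插入", "内存"],
    ["今天", "天气"],
]
pairs = mine_word_pairs(sentences, {"内存"})
```

The last sentence contains no field word and is therefore not a transaction. Of the remaining three, "内存" appears in all and "内存条" in two, so both are frequent and form a word pair.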
After the (entity, relation, entity) triples in the document have been extracted, the knowledge graph generation module generates the knowledge graph of the document from the extracted triples.
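Generating a graph from extracted triples can be sketched with a plain adjacency list. The source does not state how the graph is stored or rendered, so the representation and the function name `build_graph` are assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(triples):
    """Map each entity to its outgoing (relation, entity) edges.

    Duplicate triples are stored once, and entities that only appear as
    the tail of a triple still get a node with no outgoing edges.
    """
    graph = defaultdict(list)
    for head, relation, tail in triples:
        if (relation, tail) not in graph[head]:
            graph[head].append((relation, tail))
        graph.setdefault(tail, [])
    return dict(graph)


triples = [
    ("北京", "是首都", "中国"),
    ("中国人", "热爱", "北京"),
    ("北京", "是首都", "中国"),
]
graph = build_graph(triples)
```

Each key of `graph` is a node and each stored pair is a labeled edge, which matches the description of nodes as concepts or instances and edges as relations.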
Obviously, the above embodiment is merely an example given to illustrate the present invention clearly and does not limit the ways in which the invention may be implemented. Those of ordinary skill in the art can make other changes in different forms on the basis of the above description. It is neither necessary nor possible to list all implementations exhaustively here. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (4)

1. A method for automatically constructing a knowledge graph from Chinese text, characterised in that it comprises the following steps:
S1. Crawl documents of every field from online encyclopedias, then extract entities and relations according to the knowledge organization structure of the encyclopedia pages and store them in the entity library and relation library of the corresponding field; the crawled documents of each field are also stored in the text library of the corresponding field;
S2. If a document j needs a knowledge graph to be constructed for it, perform the following processing on it;
S3. Perform word segmentation on document j;
S4. Extract the core words of document j;
S5. Extract the primary words of document j using TF-IDF;
S6. Determine the field to which document j belongs:
S61. Find all words of document j, calculate their TF-IDF values, and obtain the vocabulary vector representation of document j according to the order of the words;
S62. Obtain the vocabulary vector representation of the documents of each field using the method of step S61, then calculate the cosine between the vocabulary vector representation of document j and that of the documents of each field; the field whose documents give the largest cosine is the field to which document j belongs; document j is then stored in the text library of that field;
S7. Extract the (entity, relation, entity) triples in document j:
S71. Select from document j the sentences in which field words occur as transactions, a transaction being the set of all terms in a selected sentence, where the field words are the terms collected in the entity library and relation library of the field to which document j belongs;
S72. Calculate the support of each term in the transactions, and treat the terms whose support exceeds a threshold as frequent items;
S73. Calculate the confidence between every two frequent items; if the confidence between two frequent items exceeds a threshold, extract the two frequent items as a word pair;
S74. Combine the words of the word pairs, the core words and the primary words into a term set, locate all sentences in document j that contain a term from the term set, then perform coreference resolution on these sentences and delete their minor components, obtaining the nouns and verbs needed to extract the (entity, relation, entity) triples;
S75. First find the verb in a sentence, then combine the nouns before and after the verb with the verb into a candidate (noun, verb, noun) triple; then use similarity analysis to calculate the similarity between the relations in the relation library of the field to which document j belongs and the verb in the candidate triple; if the similarity exceeds a threshold, put the verb into the relation library of the field to which document j belongs and put the nouns of the candidate triple into the entity library of that field; the candidate (noun, verb, noun) triple then becomes a formal (entity, relation, entity) triple extracted from document j;
S76. If step S75 fails to extract an (entity, relation, entity) triple, find another noun in the sentence besides the core word, then use similarity analysis to calculate the similarity between the entities in the entity library of the field to which document j belongs and that noun; if the similarity exceeds a threshold, find the word between the two nouns and use similarity analysis to calculate its similarity to the relations in the relation library of the field to which document j belongs; if that similarity exceeds a threshold, put the word into the relation library of that field and put the noun extracted in step S75 into the entity library of that field; an (entity, relation, entity) triple extracted from document j is thereby obtained;
S8. Generate the knowledge graph of document j from the extracted (entity, relation, entity) triples.
2. The method for automatically constructing a knowledge graph from Chinese text according to claim 1, characterised in that step S3 uses the jieba tool to perform word segmentation on document j.
3. The method for automatically constructing a knowledge graph from Chinese text according to claim 1, characterised in that the similarity analysis uses Word2vec or the Chinese synonym thesaurus Tongyici Cilin.
4. A system applying the method of any one of claims 1 to 3, characterised in that it comprises a knowledge database module for every field, a document processing module, an entity and relation extraction module and a knowledge graph generation module; the knowledge database module for every field comprises the entity library, relation library and text library of each field; the document processing module performs steps S3 to S62; the entity and relation extraction module performs steps S7 to S76; and the knowledge graph generation module performs step S8.
CN201710050095.1A 2017-01-23 2017-01-23 Automatic construction method and system of Chinese text knowledge graph Active CN106844658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710050095.1A CN106844658B (en) 2017-01-23 2017-01-23 Automatic construction method and system of Chinese text knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710050095.1A CN106844658B (en) 2017-01-23 2017-01-23 Automatic construction method and system of Chinese text knowledge graph

Publications (2)

Publication Number Publication Date
CN106844658A true CN106844658A (en) 2017-06-13
CN106844658B CN106844658B (en) 2019-12-13

Family

ID=59120209

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710050095.1A Active CN106844658B (en) 2017-01-23 2017-01-23 Automatic construction method and system of Chinese text knowledge graph

Country Status (1)

Country Link
CN (1) CN106844658B (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247813A (en) * 2017-07-26 2017-10-13 北京理工大学 A kind of network struction and evolution method based on weighting technique
CN107391693A (en) * 2017-07-26 2017-11-24 北京理工大学 A kind of information extraction and description method for English patent
CN107894986A (en) * 2017-09-26 2018-04-10 北京纳人网络科技有限公司 A kind of business connection division methods, server and client based on vectorization
CN107943874A (en) * 2017-11-13 2018-04-20 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN107967290A (en) * 2017-10-09 2018-04-27 国家计算机网络与信息安全管理中心 A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data
CN108415950A (en) * 2018-02-01 2018-08-17 腾讯科技(深圳)有限公司 A kind of hypernym polymerization and device
CN108563710A (en) * 2018-03-27 2018-09-21 腾讯科技(深圳)有限公司 A kind of knowledge mapping construction method, device and storage medium
CN108664595A (en) * 2018-05-08 2018-10-16 和美(深圳)信息技术股份有限公司 Domain knowledge base construction method, device, computer equipment and storage medium
CN109145003A (en) * 2018-08-24 2019-01-04 蜜小蜂智慧(北京)科技有限公司 A kind of method and device constructing knowledge mapping
CN109145071A (en) * 2018-08-06 2019-01-04 中国地质大学(武汉) A kind of automated construction method and system towards geophysics field knowledge mapping
CN109189939A (en) * 2018-09-05 2019-01-11 安阳师范学院 A kind of Chinese Character Semantics knowledge mapping construction method, device, equipment, storage medium
CN109271557A (en) * 2018-08-31 2019-01-25 北京字节跳动网络技术有限公司 Method and apparatus for output information
CN109508390A (en) * 2018-12-28 2019-03-22 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109508391A (en) * 2018-12-28 2019-03-22 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109726298A (en) * 2019-01-08 2019-05-07 上海市研发公共服务平台管理中心 Knowledge mapping construction method, system, terminal and medium suitable for scientific and technical literature
CN109800317A (en) * 2018-03-19 2019-05-24 中山大学 A kind of image querying answer method based on the alignment of image scene map
CN109857793A (en) * 2018-12-28 2019-06-07 考拉征信服务有限公司 Processing method, device, electronic equipment and the storage medium of technical background data
CN109918436A (en) * 2019-03-08 2019-06-21 上海一健事信息科技有限公司 A kind of Medical Knowledge management and inquiry system
CN109993381A (en) * 2017-12-29 2019-07-09 中国移动通信集团湖北有限公司 Demand management application method, device, equipment and the medium of knowledge based map
CN110059310A (en) * 2018-01-19 2019-07-26 腾讯科技(深圳)有限公司 Extending method and device, electronic equipment, the storage medium of hypernym network
CN110245239A (en) * 2019-05-13 2019-09-17 吉林大学 A kind of construction method and system towards automotive field knowledge mapping
CN110543574A (en) * 2019-08-30 2019-12-06 北京百度网讯科技有限公司 knowledge graph construction method, device, equipment and medium
CN110598002A (en) * 2019-08-14 2019-12-20 广州视源电子科技股份有限公司 Knowledge graph library construction method and device, computer storage medium and electronic equipment
CN110781310A (en) * 2019-09-09 2020-02-11 深圳壹账通智能科技有限公司 Target concept graph construction method and device, computer equipment and storage medium
CN111209411A (en) * 2020-01-03 2020-05-29 北京明略软件系统有限公司 Document analysis method and device
CN111209412A (en) * 2020-02-10 2020-05-29 同方知网(北京)技术有限公司 Method for building knowledge graph of periodical literature by cyclic updating iteration
CN111242554A (en) * 2020-01-17 2020-06-05 秒针信息技术有限公司 Method and device for determining type of picking mode
CN111859976A (en) * 2019-04-30 2020-10-30 广东小天才科技有限公司 Method and device for expanding regular expression based on knowledge graph
CN112988974A (en) * 2021-03-25 2021-06-18 上海园域信息科技有限公司 Method and device for constructing industry chain knowledge graph based on vector space
CN114969385A (en) * 2022-08-03 2022-08-30 北京长河数智科技有限责任公司 Knowledge graph optimization method and device based on document attribute assignment entity weight
CN116010587A (en) * 2023-03-23 2023-04-25 中国人民解放军63921部队 Method, device, medium and equipment for pushing spaceflight test issuing guarantee condition knowledge

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593792A (en) * 2013-11-13 2014-02-19 复旦大学 Individual recommendation method and system based on Chinese knowledge mapping
KR20150084706A (en) * 2015-06-26 2015-07-22 경북대학교 산학협력단 Apparatus for knowledge learning of ontology and method thereof
CN104933164A (en) * 2015-06-26 2015-09-23 华南理工大学 Method for extracting relations among named entities in Internet massive data and system thereof
CN105630901A (en) * 2015-12-21 2016-06-01 清华大学 Knowledge graph representation learning method
CN106250412A (en) * 2016-07-22 2016-12-21 浙江大学 The knowledge mapping construction method merged based on many source entities

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
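Xu Zenglin et al.: "Review on Knowledge Graph Techniques" (知识图谱技术综述), Journal of University of Electronic Science and Technology of China (《电子科技大学学报》) *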

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107391693A (en) * 2017-07-26 2017-11-24 北京理工大学 A kind of information extraction and description method for English patent
CN107247813A (en) * 2017-07-26 2017-10-13 北京理工大学 A kind of network struction and evolution method based on weighting technique
CN107894986A (en) * 2017-09-26 2018-04-10 北京纳人网络科技有限公司 A kind of business connection division methods, server and client based on vectorization
CN107894986B (en) * 2017-09-26 2021-03-30 北京纳人网络科技有限公司 Enterprise relation division method based on vectorization, server and client
CN107967290A (en) * 2017-10-09 2018-04-27 国家计算机网络与信息安全管理中心 A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data
CN107943874A (en) * 2017-11-13 2018-04-20 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN109993381A (en) * 2017-12-29 2019-07-09 中国移动通信集团湖北有限公司 Demand management application method, device, equipment and the medium of knowledge based map
CN109993381B (en) * 2017-12-29 2021-11-30 中国移动通信集团湖北有限公司 Demand management application method, device, equipment and medium based on knowledge graph
CN110059310B (en) * 2018-01-19 2022-10-28 腾讯科技(深圳)有限公司 Hypernym network expansion method and device, electronic equipment and storage medium
CN110059310A (en) * 2018-01-19 2019-07-26 腾讯科技(深圳)有限公司 Extending method and device, electronic equipment, the storage medium of hypernym network
CN108415950A (en) * 2018-02-01 2018-08-17 腾讯科技(深圳)有限公司 A kind of hypernym polymerization and device
CN108415950B (en) * 2018-02-01 2021-03-23 腾讯科技(深圳)有限公司 Hypernym aggregation method and device
CN109800317A (en) * 2018-03-19 2019-05-24 中山大学 A kind of image querying answer method based on the alignment of image scene map
CN108563710B (en) * 2018-03-27 2021-02-02 腾讯科技(深圳)有限公司 Knowledge graph construction method and device and storage medium
CN108563710A (en) * 2018-03-27 2018-09-21 腾讯科技(深圳)有限公司 A kind of knowledge mapping construction method, device and storage medium
CN108664595A (en) * 2018-05-08 2018-10-16 和美(深圳)信息技术股份有限公司 Domain knowledge base construction method, device, computer equipment and storage medium
CN108664595B (en) * 2018-05-08 2020-10-16 和美(深圳)信息技术股份有限公司 Domain knowledge base construction method and device, computer equipment and storage medium
CN109145071A (en) * 2018-08-06 2019-01-04 中国地质大学(武汉) A kind of automated construction method and system towards geophysics field knowledge mapping
CN109145071B (en) * 2018-08-06 2021-08-27 中国地质大学(武汉) Automatic construction method and system for geophysical field knowledge graph
CN109145003A (en) * 2018-08-24 2019-01-04 蜜小蜂智慧(北京)科技有限公司 A kind of method and device constructing knowledge mapping
CN109271557A (en) * 2018-08-31 2019-01-25 北京字节跳动网络技术有限公司 Method and apparatus for output information
CN109189939A (en) * 2018-09-05 2019-01-11 安阳师范学院 A kind of Chinese Character Semantics knowledge mapping construction method, device, equipment, storage medium
CN109857793A (en) * 2018-12-28 2019-06-07 考拉征信服务有限公司 Processing method, device, electronic equipment and the storage medium of technical background data
CN109508391B (en) * 2018-12-28 2022-04-08 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109508390B (en) * 2018-12-28 2021-12-14 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109508390A (en) * 2018-12-28 2019-03-22 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109508391A (en) * 2018-12-28 2019-03-22 北京金山安全软件有限公司 Input prediction method and device based on knowledge graph and electronic equipment
CN109726298B (en) * 2019-01-08 2020-12-29 上海市研发公共服务平台管理中心 Knowledge graph construction method, system, terminal and medium suitable for scientific and technical literature
CN109726298A (en) * 2019-01-08 2019-05-07 上海市研发公共服务平台管理中心 Knowledge mapping construction method, system, terminal and medium suitable for scientific and technical literature
CN109918436B (en) * 2019-03-08 2022-12-20 麦博(上海)健康科技有限公司 Medical knowledge management and query system
CN109918436A (en) * 2019-03-08 2019-06-21 上海一健事信息科技有限公司 A kind of Medical Knowledge management and inquiry system
CN111859976A (en) * 2019-04-30 2020-10-30 广东小天才科技有限公司 Method and device for expanding regular expression based on knowledge graph
CN110245239A (en) * 2019-05-13 2019-09-17 吉林大学 A kind of construction method and system towards automotive field knowledge mapping
CN110598002A (en) * 2019-08-14 2019-12-20 广州视源电子科技股份有限公司 Knowledge graph library construction method and device, computer storage medium and electronic equipment
CN110543574B (en) * 2019-08-30 2022-05-17 北京百度网讯科技有限公司 Knowledge graph construction method, device, equipment and medium
CN110543574A (en) * 2019-08-30 2019-12-06 北京百度网讯科技有限公司 knowledge graph construction method, device, equipment and medium
WO2021047327A1 (en) * 2019-09-09 2021-03-18 深圳壹账通智能科技有限公司 Method and apparatus for constructing target concept map, computer device, and storage medium
CN110781310A (en) * 2019-09-09 2020-02-11 深圳壹账通智能科技有限公司 Target concept graph construction method and device, computer equipment and storage medium
CN111209411A (en) * 2020-01-03 2020-05-29 北京明略软件系统有限公司 Document analysis method and device
CN111242554A (en) * 2020-01-17 2020-06-05 秒针信息技术有限公司 Method and device for determining type of picking mode
CN111242554B (en) * 2020-01-17 2023-10-17 秒针信息技术有限公司 Method and device for determining type of picking mode
CN111209412A (en) * 2020-02-10 2020-05-29 同方知网(北京)技术有限公司 Method for building knowledge graph of periodical literature by cyclic updating iteration
CN111209412B (en) * 2020-02-10 2023-05-12 同方知网数字出版技术股份有限公司 Periodical literature knowledge graph construction method for cyclic updating iteration
CN112988974A (en) * 2021-03-25 2021-06-18 上海园域信息科技有限公司 Method and device for constructing industry chain knowledge graph based on vector space
CN114969385A (en) * 2022-08-03 2022-08-30 北京长河数智科技有限责任公司 Knowledge graph optimization method and device based on document attribute assignment entity weight
CN116010587A (en) * 2023-03-23 2023-04-25 中国人民解放军63921部队 Method, device, medium and equipment for pushing spaceflight test issuing guarantee condition knowledge

Also Published As

Publication number Publication date
CN106844658B (en) 2019-12-13

Similar Documents

Publication Publication Date Title
CN106844658A (en) A kind of Chinese text knowledge mapping method for auto constructing and system
Wang et al. K-adapter: Infusing knowledge into pre-trained models with adapters
CN109190117B (en) Short text semantic similarity calculation method based on word vector
Grishman Information extraction
CN103207856B (en) A kind of Ontological concept and hierarchical relationship generation method
US9317593B2 (en) Modeling topics using statistical distributions
CN103914548B (en) Information search method and device
CN105528437B (en) A kind of question answering system construction method extracted based on structured text knowledge
CN108681574B (en) Text abstract-based non-fact question-answer selection method and system
CN109344236A (en) One kind being based on the problem of various features similarity calculating method
CN106776562A (en) A kind of keyword extracting method and extraction system
CN103886099B (en) Semantic retrieval system and method of vague concepts
EP2224361A1 (en) Generating a domain corpus and a dictionary for an automated ontology
EP2224360A1 (en) Generating a dictionary and determining a co-occurrence context for an automated ontology
KR20060122276A (en) Relation extraction from documents for the automatic construction of ontologies
CN111625622B (en) Domain ontology construction method and device, electronic equipment and storage medium
CN109522396B (en) Knowledge processing method and system for national defense science and technology field
CN107688583A (en) The method and apparatus for creating the training data for natural language processing device
Alian et al. Arabic semantic similarity approaches-review
CN109508460A (en) Unsupervised composition based on Subject Clustering is digressed from the subject detection method and system
Yusuf et al. Query expansion method for quran search using semantic search and lucene ranking
CN112883182A (en) Question-answer matching method and device based on machine reading
Michelbacher Multi-word tokenization for natural language processing
CN106776590A (en) A kind of method and system for obtaining entry translation
Tian et al. Measuring the similarity of short texts by word similarity and tree kernels

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant