CN106844658A - Automatic construction method and system of Chinese text knowledge graph - Google Patents
Automatic construction method and system of Chinese text knowledge graph
- Publication number: CN106844658A
- Application number: CN201710050095.1A
- Authority: CN (China)
- Prior art keywords: document, entity, relation, word, library
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Abstract
The method provided by the invention constructs a knowledge graph from Chinese text. As the method is used more often, the text library, relation library, and entity library of each domain are progressively expanded, and the quality of the constructed knowledge graphs improves.
Description
Technical field
The present invention relates to the field of knowledge graph construction, and more particularly to a method and system for automatically constructing a knowledge graph from Chinese text.
Background technology
A knowledge graph is an organizational structure for knowledge, named for its graph-like form. A typical knowledge graph contains a set of concepts, instances, and relations. Compared with plain text, a knowledge graph is structured: nodes in the graph represent concepts or instances, and edges between nodes represent the relations between them, whereas text is generally regarded as unstructured. Knowledge graphs are widely applied in fields such as semantic search, intelligent question answering, knowledge engineering, data mining, and digital libraries. Knowledge graph construction is generally divided into manual, automatic, and semi-automatic construction. Manual construction consumes a great deal of labor and material resources and is hard to adjust as knowledge changes. Automatic construction relies on knowledge acquisition, machine learning, and statistical techniques to derive a knowledge graph from unstructured data resources. Semi-automatic construction sits between manual and automatic construction, because full automation is difficult to achieve.
At present, the main methods for building knowledge graphs are methods based on lexical patterns, methods based on clustering, and methods based on distributional similarity. Methods based on lexical patterns predefine a number of patterns and then extract the corresponding concepts and relations from text; for example, the pattern in "fruit such as apple" indicates that an apple is a kind of fruit. Methods based on clustering cluster concepts or instances according to their features and typically yield a knowledge graph of hierarchical relations. Methods based on distributional similarity rest on the assumption that words with similar contexts have similar meanings; for example, Beijing is the capital of China and Tokyo is the capital of Japan, so the pair Beijing and China has a context similar to that of the pair Tokyo and Japan. Knowledge graph construction technology started early and developed quickly abroad, but there is currently no complete domestic system that can automatically extract a knowledge graph from Chinese text. The main reason is that English has a fixed form, is simple in expression, and needs no word segmentation, whereas Chinese has a complex structure and varied forms of expression and requires word segmentation.
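As a rough illustration of the lexical-pattern approach described above, the sketch below matches the "such as" pattern with a regular expression. The pattern, the function name `extract_is_a`, and the `is_a` label are assumptions made for this example and are not taken from the patent; a real system would use Chinese patterns over segmented text.

```python
import re

# Hypothetical English pattern "<category> such as <instance>".
SUCH_AS = re.compile(r"(\w+) such as (\w+)")


def extract_is_a(sentence):
    """Return (instance, "is_a", category) triples matched by the pattern."""
    return [(m.group(2), "is_a", m.group(1)) for m in SUCH_AS.finditer(sentence)]
```

Calling `extract_is_a("Fruit such as apple")` yields a single triple stating that apple is a kind of Fruit.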
Summary of the invention
To solve the above problems of the prior art, the present invention provides a method for automatically constructing a Chinese text knowledge graph, with which the knowledge graph of a Chinese text can be built.
To achieve the above object, the technical solution adopted is:
An automatic construction method of a Chinese text knowledge graph, comprising the following steps:
S1. Crawl documents of each domain from online encyclopedias, extract entities and relations according to the knowledge organization structure of the encyclopedia pages and store them in the entity library and relation library of the corresponding domain, and store the crawled documents of each domain in the text library of the corresponding domain;
S2. If a document j requires knowledge graph construction, process it as follows;
S3. Perform word segmentation on document j;
S4. Extract the core words of document j;
S5. Extract the primary words of document j using TF-IDF;
S6. Determine the domain to which document j belongs:
S61. Find all words of document j, calculate their TF-IDF values, and obtain the vocabulary vector representation of document j according to the order of the words;
S62. Obtain the vocabulary vector representation of the documents of each domain using the method of step S61, then calculate the cosine between the vocabulary vector of document j and the vocabulary vector of the documents of each domain; the domain whose documents give the largest cosine is the domain of document j; then store document j in the text library of that domain;
S7. Extract the (entity, relation, entity) triples in document j:
S71. Select from document j the sentences in which domain words appear as transactions, where a transaction is the set of all terms in a selected sentence, and the domain words are the terms collected in the entity library and relation library of the domain of document j;
S72. Calculate the support of each term in the transactions, and treat terms whose support is above a threshold as frequent items;
S73. Calculate the confidence between every two frequent items; if the confidence between two frequent items is above a threshold, extract the two frequent items as a word pair;
S74. Form a term set from the words of the word pairs, the core words, and the primary words; locate all sentences in document j that contain terms in the set; then perform coreference resolution on these sentences and delete their secondary components, obtaining the nouns and verbs needed to extract the (entity, relation, entity) triples;
S75. First find the verb in a sentence, then form a candidate (noun, verb, noun) triple from the nouns before and after the verb; then use similarity analysis to calculate the similarity between the relations in the relation library of the domain of document j and the verb in the candidate triple; if the similarity is greater than a threshold, put the verb into the relation library of the domain of document j and put the nouns in the candidate triple into the entity library of that domain; the candidate (noun, verb, noun) triple then becomes a formal (entity, relation, entity) triple extracted from document j;
S76. If step S75 extracts no (entity, relation, entity) triple, find another noun in the sentence besides the core word, then use similarity analysis to calculate the similarity between the entities in the entity library of the domain of document j and that noun; if the similarity is greater than a threshold, find the word between the two nouns and use similarity analysis to calculate its similarity to the relations in the relation library of the domain of document j; if that similarity is greater than a threshold, put the word into the relation library of the domain of document j and put the noun extracted in step S75 into the entity library of that domain; this yields an (entity, relation, entity) triple extracted from document j;
S8. Generate the knowledge graph of document j from the extracted (entity, relation, entity) triples.
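The association analysis in steps S71 to S73 can be sketched in a few lines. This is a minimal illustration, not the patented implementation: sentences are assumed to be already segmented into token lists, confidence is assumed to follow the standard Apriori definition (support of both items divided by support of the first), and all function names and thresholds are hypothetical.

```python
from itertools import permutations


def build_transactions(sentences, domain_words):
    """S71: keep sentences containing a domain word; each becomes a set of terms."""
    return [set(tokens) for tokens in sentences if domain_words & set(tokens)]


def support(transactions, *terms):
    """Share of transactions that contain every one of the given terms."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if all(term in t for term in terms))
    return hits / len(transactions)


def frequent_items(transactions, min_support):
    """S72: terms whose support is above the threshold."""
    vocabulary = set().union(*transactions)
    return {w for w in vocabulary if support(transactions, w) > min_support}


def word_pairs(transactions, min_support, min_confidence):
    """S73: pairs of frequent items whose confidence is above the threshold."""
    pairs = set()
    for a, b in permutations(sorted(frequent_items(transactions, min_support)), 2):
        # Assumed standard confidence(a -> b) = support(a, b) / support(a).
        confidence = support(transactions, a, b) / support(transactions, a)
        if confidence > min_confidence:
            pairs.add(frozenset((a, b)))
    return pairs
```

With the sentences `[["memory", "memory_module", "capacity"], ["memory", "memory_module"], ["disk", "capacity"], ["weather", "nice"]]` and domain words `{"memory", "disk"}`, the last sentence is discarded, and with a support threshold of 0.5 and a confidence threshold of 0.8 the only word pair kept is memory and memory_module.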
Preferably, step S3 uses the jieba tool to perform word segmentation on document j.
Preferably, the similarity analysis uses Word2vec or the Chinese synonym thesaurus Tongyici Cilin.
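One way to combine an embedding-based score with a thesaurus-based score is sketched below. It is only an illustration under stated assumptions: word vectors are passed in as plain lists (as a trained Word2vec model could supply), each word's thesaurus position is represented as a hypothetical tuple of level labels, the thesaurus score is taken as the share of leading levels two words have in common, and the `weight` parameter is invented for the example.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hierarchy_similarity(path_a, path_b):
    """Share of leading thesaurus levels that two words have in common."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    depth = max(len(path_a), len(path_b))
    return shared / depth if depth else 0.0


def combined_similarity(vec_a, vec_b, path_a, path_b, weight=0.5):
    """Weighted mix of embedding similarity and thesaurus similarity."""
    return weight * cosine(vec_a, vec_b) + (1 - weight) * hierarchy_similarity(path_a, path_b)
```

The weight and the form of the thesaurus score would need tuning against real data; the patent text does not specify them.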
The invention also provides a system that applies the above method, with the following scheme:
The system comprises a knowledge database module for each domain, a document processing module, an entity and relation extraction module, and a knowledge graph generation module. The knowledge database module of each domain includes the entity library, relation library, and text library of that domain; the document processing module performs steps S3 to S62; the entity and relation extraction module performs steps S7 to S76; and the knowledge graph generation module performs step S8.
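The domain assignment of steps S61 and S62, which the document processing module performs, might look like the following sketch. The patent does not give the exact TF-IDF formula, so a smoothed IDF is assumed here; each domain's text library is assumed to be flattened into one token list; `classify` and the other names are hypothetical, and `domain_texts` is assumed to be non-empty.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tf_idf_vector(tokens, vocabulary, doc_freq, n_docs):
    """S61: one TF-IDF value per vocabulary word, in vocabulary order."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
        (counts[w] / total) * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1)
        for w in vocabulary
    ]


def classify(doc_tokens, domain_texts):
    """S62: pick the domain whose vector has the largest cosine with the document."""
    vocabulary = sorted(set(doc_tokens).union(*domain_texts.values()))
    doc_freq = Counter(w for tokens in domain_texts.values() for w in set(tokens))
    n_docs = len(domain_texts)
    doc_vec = tf_idf_vector(doc_tokens, vocabulary, doc_freq, n_docs)
    scores = {
        domain: cosine(doc_vec, tf_idf_vector(tokens, vocabulary, doc_freq, n_docs))
        for domain, tokens in domain_texts.items()
    }
    return max(scores, key=scores.get), scores
```

For example, a document tokenized as `["memory", "cpu", "upgrade"]` shares words only with a computing text library and none with a sports one, so it is assigned to computing.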
Compared with the prior art, the beneficial effects of the invention are:
The method provided by the invention constructs a knowledge graph from Chinese text. As the method is used more often, the text library, relation library, and entity library of each domain are progressively expanded, and the quality of the constructed knowledge graphs improves.
Brief description of the drawings
Fig. 1 is a schematic diagram of the knowledge database module.
Fig. 2 is a schematic diagram of the document processing module.
Fig. 3 is a schematic diagram of the entity and relation extraction module and the knowledge graph generation module.
Detailed description of the embodiments
The drawings are for illustration only and shall not be construed as limiting this patent.
The invention is further described below with reference to the drawings and embodiments.
Embodiment 1
The system provided by the invention mainly comprises four modules: a document processing module, an entity and relation extraction module, a knowledge graph generation module, and a knowledge database module for each domain. In the workflow, the document processing module first preprocesses the input document; the entity and relation extraction module then extracts the entities and relations in the document; finally, the extracted entities and relations are sent to the knowledge graph generation module, which builds the complete knowledge graph, returns it to the user, and updates the data in the knowledge graph generation module. Each module is described in detail below.
The knowledge database module of each domain mines the knowledge data of each domain from online encyclopedias and stores it for use in building the knowledge graph of a document. Following the encyclopedia's categories, the invention divides the knowledge database into 11 major categories (art, science, nature, culture, geography, life, society, people, economy, sports, and history) and divides each major category into finer subcategories; for example, the science category is further divided into 16 subcategories such as health care, electronic information, aerospace, automotive engineering, and biomedicine. The purpose is to organize a knowledge database for every area from the encyclopedia, so that any given document can be mapped by a classification algorithm to the knowledge database module of a specific domain, and that module can then be used heuristically to build the document's knowledge graph. As shown in Fig. 1, the knowledge database module is further divided into a system knowledge database and a user knowledge database. The system knowledge database is the knowledge database shipped at system initialization; it is used to heuristically discover new knowledge data in user documents, and users have no permission to modify it. The user knowledge database holds the knowledge data obtained while extracting the user's documents; users have permission to modify it and can customize it or add their own dictionaries according to their work requirements, thereby improving the overall construction quality. The network processing layer is responsible for crawling all documents of each category from online encyclopedias (Baidu Baike, Hudong Baike, and Chinese Wikipedia), extracting entities and relations according to the knowledge organization structure of the encyclopedia pages, and putting them into the entity library and relation library of the knowledge database module; the processed encyclopedia text is also retained and put into the text library. Models such as word2vec are then trained and put into the knowledge database module. The knowledge database module therefore consists of four parts: the entity library, the relation library, the text library, and the models.
The document processing module preprocesses a document into smaller processing units. It mainly comprises four parts: preprocessing, core word extraction, primary word extraction, and document classification.

As shown in Fig. 2, the preprocessing part puts each sentence of the document on its own line and performs Chinese word segmentation with the jieba tool. jieba performs well, segmenting efficiently and accurately, and allows terms to be imported; during segmentation, jieba gives priority to the imported terms. The invention merges the entity library and relation library of the knowledge database module into jieba's dictionary. Besides segmentation, stop words also need to be filtered out.

The core word extraction part extracts the core (topic) words of the document using a core word extraction algorithm proposed by the invention. The algorithm assumes that every document is organized around core (topic) words and that core words are distributed in blocks; for example, paragraphs 1 to 5 describe memory and paragraphs 6 to 12 describe hard disks. In addition, each core word block contains knowledge that further explains the term; for example, the core word block for memory is likely to mention the concept of a memory module, or the relation between memory and a memory module. This is also the basis on which the invention builds the knowledge graph. The algorithm has two aspects. First, the term density of each segmented word is calculated (the total number of times the term occurs in the document divided by the number of lines the term spans), and candidate core words are selected against a defined threshold. Then the uniformity of each candidate word is calculated, that is, whether the candidate word is evenly distributed within its word block: the invention divides the candidate word block into multiple segments, for example one segment per three lines, and the uniformity of the candidate word equals the proportion of segments in which it appears.

The document classification part finds the knowledge database module of the matching category for the document; because the system library holds initial information, it can heuristically help discover more new entities and relations in that category. The invention classifies text using cosine similarity. Specifically, all words (content words) in the document are found and their TF-IDF values are calculated; the vocabulary vector of the document is then obtained according to the order of the whole vocabulary, where the value of each element represents the contribution of that word to the document. Each category of document is essentially composed of a fixed set of specialized vocabulary, so the vector captures how much each specialized word contributes to documents of that category. Finally, using the text library in the knowledge database module, the invention calculates the cosine between the document vector and each category vector in the text library and assigns the document to the domain with the largest value.

As noted above, a core word block usually contains secondary words that help describe the concept of the core word; for example, memory module helps describe the concept of memory. The main function of the primary word extraction part is to extract these secondary words. The invention extracts them with TF-IDF, the same technique used for document classification. TF (term frequency) is the frequency with which the term occurs in the document; IDF (inverse document frequency) reflects how often the term occurs across the documents of the knowledge database of the same category; TF-IDF is the product of the two values and expresses the overall importance of the term in the document. The invention sets a relatively high threshold and selects the important words. The core words and primary words are then put into the knowledge database module.
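The two measures used by the core word extraction algorithm, density and uniformity, can be sketched as follows. Several details are assumptions for illustration only: a word's block is taken to run from the line of its first occurrence to the line of its last, the span counts both end lines, and the `min_occurrences` filter (which drops words that appear only once and would otherwise score a density of 1) is not in the original text.

```python
def hit_lines(lines, term):
    """Indices of the lines (token lists) in which the term occurs."""
    return [i for i, tokens in enumerate(lines) if term in tokens]


def term_density(lines, term):
    """Total occurrences of the term divided by the number of lines it spans."""
    hits = hit_lines(lines, term)
    if not hits:
        return 0.0
    occurrences = sum(tokens.count(term) for tokens in lines)
    return occurrences / (hits[-1] - hits[0] + 1)


def term_uniformity(lines, term, segment_size=3):
    """Share of fixed-size segments of the term's block that contain the term."""
    hits = hit_lines(lines, term)
    if not hits:
        return 0.0
    block = lines[hits[0]:hits[-1] + 1]
    segments = [block[i:i + segment_size] for i in range(0, len(block), segment_size)]
    covered = sum(1 for seg in segments if any(term in tokens for tokens in seg))
    return covered / len(segments)


def core_words(lines, min_density, min_uniformity, min_occurrences=2):
    """Words passing the (assumed) occurrence filter and both thresholds."""
    vocabulary = {w for tokens in lines for w in tokens}
    return sorted(
        w for w in vocabulary
        if sum(tokens.count(w) for tokens in lines) >= min_occurrences
        and term_density(lines, w) >= min_density
        and term_uniformity(lines, w) >= min_uniformity
    )
```

Here `lines` is the preprocessed document, one segmented sentence per line with stop words already removed.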
The entity and relation extraction module is responsible for extracting the (entity, relation, entity) triples in the document. As shown in Fig. 3, the association analysis part mines word pairs in the document that may be related, among which are the relations the invention wants to extract. The invention improves the Apriori association analysis technique. First, sentences in which domain words appear (here, the terms collected in the entity libraries and relation libraries of the system knowledge database and the user knowledge database) are selected from the document as transactions; a transaction is the set of all terms in a selected sentence, so every transaction contains at least one domain word. Then the support of each term in the document is calculated, that is, the proportion of all transactions that contain the term; terms whose support is above a certain threshold are treated as frequent items. Next, the confidence between two frequent items is calculated, that is, the share of the transactions containing one of the frequent items that also contain the other; pairs above a certain threshold are extracted as word pairs. Most such word pairs include a core word, but some do not.

The sentence reduction part deletes the secondary components of important sentences in order to uncover the relation within a word pair. The invention forms a term set from the words of the frequent word pairs and the core words and primary words of the document, and locates all sentences in the document that contain terms in the set; association analysis does not consider the structure and composition of a sentence, but it does indicate which words may be related. Coreference resolution is performed first: a pronoun in a clause of a sentence is replaced with the noun it refers to. For example, "Beijing is the capital of China, and the Chinese people love it" becomes "Beijing is the capital of China, and the Chinese people love Beijing". A sentence is one processing unit, but a long sentence usually consists of several clauses, and clauses also carry information; coreference resolution makes the meaning of each clause complete so that it can serve as a processing unit. The invention directly calls the coreference resolution package in the Stanford natural language processing toolkit. The invention then trims nouns or verbs from the sentence according to predefined patterns, that is, deletes the secondary components of the sentence to leave the nouns and verbs that are actually needed. Take the "preposition + noun" form: in "The teacher sees Xiao Ming in the park", the core of the sentence is "The teacher sees Xiao Ming", while "in the park" is a secondary component; the noun "park" must be deleted, otherwise it would interfere with entity extraction. The invention has summarized several patterns that need to be trimmed, such as "preposition + adjective + noun" and "preposition + verb".

The entity and relation extraction part extracts nouns or verbs from the reduced sentences as entities or relations. The verb in the sentence is found first; the noun immediately before the verb and the noun after it then form a candidate (noun, verb, noun) triple. Similarity analysis is then used to calculate the similarity between the relations in the relation libraries of the system knowledge database and the relation knowledge database and the verb in the candidate triple; if the similarity is greater than a certain threshold, the verb is put into the relation library of the user knowledge database, and the nouns are put into the entity library of the user knowledge database. The system also takes into account the case where a noun acts as the relation: when the first extraction method fails, the invention finds another noun in the sentence besides the core word, then uses similarity analysis to calculate the similarity between the entities in the entity libraries of the system knowledge database and the relation knowledge database and that noun; if it is above a certain threshold (which is generally set high), the words (verbs or nouns) between the two nouns are found and their similarity to the relation library is calculated; if it is greater than a certain threshold, the corresponding noun or verb is put into the relation library and the previously extracted noun is put into the entity library.

The similarity analysis part computes similarity: given two words, it calculates the semantic similarity between them. Two techniques are mainly used: word2vec and the Chinese synonym thesaurus Tongyici Cilin. word2vec is a tool open-sourced by Google in 2013 that represents words as real-valued vectors; drawing on ideas from deep learning, it reduces the processing of text content, through training, to vector operations in a K-dimensional vector space, and similarity in the vector space can be used to represent semantic similarity of text. The invention uses the text library in the system knowledge database to train the model, thereby obtaining the model stored in the system knowledge database; during similarity analysis, the invention directly calls the model in the knowledge database. The other similarity tool is the extended edition of Tongyici Cilin compiled by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology. The thesaurus is organized into 5 levels, and the invention uses this hierarchy to obtain the similarity of two words. The invention calculates the similarity of two words by weighting the two techniques, word2vec and Tongyici Cilin.
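The first extraction method (verb plus the nearest noun on each side, accepted when the verb is similar enough to a known relation) can be sketched as below. This is an illustrative reading rather than the patented code: the input is assumed to be a reduced sentence already tagged with part-of-speech labels in which noun tags start with "n" and verb tags start with "v", and `similarity` is a caller-supplied function standing in for the similarity analysis part.

```python
def candidate_triples(tagged):
    """Form (noun, verb, noun) candidates from a list of (word, tag) pairs."""
    triples = []
    for i, (word, tag) in enumerate(tagged):
        if not tag.startswith("v"):
            continue
        # Nearest noun before the verb and nearest noun after it.
        before = next((w for w, t in reversed(tagged[:i]) if t.startswith("n")), None)
        after = next((w for w, t in tagged[i + 1:] if t.startswith("n")), None)
        if before and after:
            triples.append((before, word, after))
    return triples


def accept_triples(candidates, relation_library, entity_library, similarity, threshold):
    """Keep candidates whose verb resembles a known relation; grow both libraries."""
    accepted = []
    for head, verb, tail in candidates:
        best = max((similarity(verb, r) for r in relation_library), default=0.0)
        if best > threshold:
            relation_library.add(verb)
            entity_library.update((head, tail))
            accepted.append((head, verb, tail))
    return accepted
```

Because accepted verbs and nouns are added to the libraries, later documents in the same domain are matched against a larger set of relations and entities.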
After the (entity, relation, entity) triples in the document have been extracted, the knowledge graph generation module generates the knowledge graph of the document from them.
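The generation step itself is not detailed in the text. One simple possibility, shown here only as an assumption, is to store the graph as an adjacency mapping from each entity to its outgoing (relation, entity) edges; the function name `build_graph` is hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Turn (entity, relation, entity) triples into an adjacency mapping."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        edge = (relation, tail)
        if edge not in graph[head]:  # skip duplicate edges
            graph[head].append(edge)
        graph.setdefault(tail, [])  # make sure the tail entity is a node too
    return dict(graph)
```

Each key of the returned dictionary is a node, and each list entry is a labeled edge to another node, which matches the node and edge view of a knowledge graph given in the background section.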
Obviously, the above embodiments are merely examples given to illustrate the invention clearly and do not limit its implementations. Those of ordinary skill in the art can make other changes in different forms on the basis of the above description. It is neither necessary nor possible to list all implementations exhaustively. Any modification, equivalent replacement, or improvement made within the spirit and principle of the invention shall fall within the protection scope of the claims of the invention.
Claims (4)
1. An automatic construction method of a Chinese text knowledge graph, characterized by comprising the following steps:
S1. Crawl documents of each domain from online encyclopedias, extract entities and relations according to the knowledge organization structure of the encyclopedia pages and store them in the entity library and relation library of the corresponding domain, and store the crawled documents of each domain in the text library of the corresponding domain;
S2. If a document j requires knowledge graph construction, process it as follows;
S3. Perform word segmentation on document j;
S4. Extract the core words of document j;
S5. Extract the primary words of document j using TF-IDF;
S6. Determine the domain to which document j belongs:
S61. Find all words of document j, calculate their TF-IDF values, and obtain the vocabulary vector representation of document j according to the order of the words;
S62. Obtain the vocabulary vector representation of the documents of each domain using the method of step S61, then calculate the cosine between the vocabulary vector of document j and the vocabulary vector of the documents of each domain; the domain whose documents give the largest cosine is the domain of document j; then store document j in the text library of that domain;
S7. Extract the (entity, relation, entity) triples in document j:
S71. Select from document j the sentences in which domain words appear as transactions, where a transaction is the set of all terms in a selected sentence, and the domain words are the terms collected in the entity library and relation library of the domain of document j;
S72. Calculate the support of each term in the transactions, and treat terms whose support is above a threshold as frequent items;
S73. Calculate the confidence between every two frequent items; if the confidence between two frequent items is above a threshold, extract the two frequent items as a word pair;
S74. Form a term set from the words of the word pairs, the core words, and the primary words; locate all sentences in document j that contain terms in the set; then perform coreference resolution on these sentences and delete their secondary components, obtaining the nouns and verbs needed to extract the (entity, relation, entity) triples;
S75. First find the verb in a sentence, then form a candidate (noun, verb, noun) triple from the nouns before and after the verb; then use similarity analysis to calculate the similarity between the relations in the relation library of the domain of document j and the verb in the candidate triple; if the similarity is greater than a threshold, put the verb into the relation library of the domain of document j and put the nouns in the candidate triple into the entity library of that domain; the candidate (noun, verb, noun) triple then becomes a formal (entity, relation, entity) triple extracted from document j;
S76. If step S75 extracts no (entity, relation, entity) triple, find another noun in the sentence besides the core word, then use similarity analysis to calculate the similarity between the entities in the entity library of the domain of document j and that noun; if the similarity is greater than a threshold, find the word between the two nouns and use similarity analysis to calculate its similarity to the relations in the relation library of the domain of document j; if that similarity is greater than a threshold, put the word into the relation library of the domain of document j and put the noun extracted in step S75 into the entity library of that domain; this yields an (entity, relation, entity) triple extracted from document j;
S8. Generate the knowledge graph of document j from the extracted (entity, relation, entity) triples.
2. The automatic construction method of a Chinese text knowledge graph according to claim 1, characterized in that step S3 uses the jieba tool to perform word segmentation on document j.
3. The automatic construction method of a Chinese text knowledge graph according to claim 1, characterized in that the similarity analysis uses Word2vec or the Chinese synonym thesaurus Tongyici Cilin.
4. A system applying the method of any one of claims 1 to 3, characterized by comprising a knowledge database module for each domain, a document processing module, an entity and relation extraction module, and a knowledge graph generation module; wherein the knowledge database module of each domain includes the entity library, relation library, and text library of that domain, the document processing module performs steps S3 to S62, the entity and relation extraction module performs steps S7 to S76, and the knowledge graph generation module performs step S8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710050095.1A CN106844658B (en) | 2017-01-23 | 2017-01-23 | Automatic construction method and system of Chinese text knowledge graph |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710050095.1A CN106844658B (en) | 2017-01-23 | 2017-01-23 | Automatic construction method and system of Chinese text knowledge graph |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106844658A true CN106844658A (en) | 2017-06-13 |
CN106844658B CN106844658B (en) | 2019-12-13 |
Family
ID=59120209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710050095.1A Active CN106844658B (en) | 2017-01-23 | 2017-01-23 | Automatic construction method and system of Chinese text knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844658B (en) |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247813A (en) * | 2017-07-26 | 2017-10-13 | 北京理工大学 | A kind of network struction and evolution method based on weighting technique |
CN107391693A (en) * | 2017-07-26 | 2017-11-24 | 北京理工大学 | A kind of information extraction and description method for English patent |
CN107894986A (en) * | 2017-09-26 | 2018-04-10 | 北京纳人网络科技有限公司 | A kind of business connection division methods, server and client based on vectorization |
CN107943874A (en) * | 2017-11-13 | 2018-04-20 | 平安科技(深圳)有限公司 | Knowledge mapping processing method, device, computer equipment and storage medium |
CN107967290A (en) * | 2017-10-09 | 2018-04-27 | 国家计算机网络与信息安全管理中心 | A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data |
CN108415950A (en) * | 2018-02-01 | 2018-08-17 | 腾讯科技(深圳)有限公司 | A kind of hypernym polymerization and device |
CN108563710A (en) * | 2018-03-27 | 2018-09-21 | 腾讯科技(深圳)有限公司 | A kind of knowledge mapping construction method, device and storage medium |
CN108664595A (en) * | 2018-05-08 | 2018-10-16 | 和美(深圳)信息技术股份有限公司 | Domain knowledge base construction method, device, computer equipment and storage medium |
CN109145003A (en) * | 2018-08-24 | 2019-01-04 | 蜜小蜂智慧(北京)科技有限公司 | A kind of method and device constructing knowledge mapping |
CN109145071A (en) * | 2018-08-06 | 2019-01-04 | 中国地质大学(武汉) | A kind of automated construction method and system towards geophysics field knowledge mapping |
CN109189939A (en) * | 2018-09-05 | 2019-01-11 | 安阳师范学院 | A kind of Chinese Character Semantics knowledge mapping construction method, device, equipment, storage medium |
CN109271557A (en) * | 2018-08-31 | 2019-01-25 | 北京字节跳动网络技术有限公司 | Method and apparatus for output information |
CN109508390A (en) * | 2018-12-28 | 2019-03-22 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109508391A (en) * | 2018-12-28 | 2019-03-22 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109726298A (en) * | 2019-01-08 | 2019-05-07 | 上海市研发公共服务平台管理中心 | Knowledge mapping construction method, system, terminal and medium suitable for scientific and technical literature |
CN109800317A (en) * | 2018-03-19 | 2019-05-24 | 中山大学 | A kind of image querying answer method based on the alignment of image scene map |
CN109857793A (en) * | 2018-12-28 | 2019-06-07 | 考拉征信服务有限公司 | Processing method, device, electronic equipment and the storage medium of technical background data |
CN109918436A (en) * | 2019-03-08 | 2019-06-21 | 上海一健事信息科技有限公司 | A kind of Medical Knowledge management and inquiry system |
CN109993381A (en) * | 2017-12-29 | 2019-07-09 | 中国移动通信集团湖北有限公司 | Demand management application method, device, equipment and the medium of knowledge based map |
CN110059310A (en) * | 2018-01-19 | 2019-07-26 | 腾讯科技(深圳)有限公司 | Extending method and device, electronic equipment, the storage medium of hypernym network |
CN110245239A (en) * | 2019-05-13 | 2019-09-17 | 吉林大学 | A kind of construction method and system towards automotive field knowledge mapping |
CN110543574A (en) * | 2019-08-30 | 2019-12-06 | 北京百度网讯科技有限公司 | knowledge graph construction method, device, equipment and medium |
CN110598002A (en) * | 2019-08-14 | 2019-12-20 | 广州视源电子科技股份有限公司 | Knowledge graph library construction method and device, computer storage medium and electronic equipment |
CN110781310A (en) * | 2019-09-09 | 2020-02-11 | 深圳壹账通智能科技有限公司 | Target concept graph construction method and device, computer equipment and storage medium |
CN111209411A (en) * | 2020-01-03 | 2020-05-29 | 北京明略软件系统有限公司 | Document analysis method and device |
CN111209412A (en) * | 2020-02-10 | 2020-05-29 | 同方知网(北京)技术有限公司 | Method for building knowledge graph of periodical literature by cyclic updating iteration |
CN111242554A (en) * | 2020-01-17 | 2020-06-05 | 秒针信息技术有限公司 | Method and device for determining type of picking mode |
CN111859976A (en) * | 2019-04-30 | 2020-10-30 | 广东小天才科技有限公司 | Method and device for expanding regular expression based on knowledge graph |
CN112988974A (en) * | 2021-03-25 | 2021-06-18 | 上海园域信息科技有限公司 | Method and device for constructing industry chain knowledge graph based on vector space |
CN114969385A (en) * | 2022-08-03 | 2022-08-30 | 北京长河数智科技有限责任公司 | Knowledge graph optimization method and device based on document attribute assignment entity weight |
CN116010587A (en) * | 2023-03-23 | 2023-04-25 | 中国人民解放军63921部队 | Method, device, medium and equipment for pushing spaceflight test issuing guarantee condition knowledge |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593792A (en) * | 2013-11-13 | 2014-02-19 | 复旦大学 | Individual recommendation method and system based on Chinese knowledge mapping |
KR20150084706A (en) * | 2015-06-26 | 2015-07-22 | 경북대학교 산학협력단 | Apparatus for knowledge learning of ontology and method thereof |
CN104933164A (en) * | 2015-06-26 | 2015-09-23 | 华南理工大学 | Method for extracting relations among named entities in Internet massive data and system thereof |
CN105630901A (en) * | 2015-12-21 | 2016-06-01 | 清华大学 | Knowledge graph representation learning method |
CN106250412A (en) * | 2016-07-22 | 2016-12-21 | 浙江大学 | The knowledge mapping construction method merged based on many source entities |
Non-Patent Citations (1)
Title |
---|
徐增林等: "知识图谱技术综述", 《电子科技大学学报》 * |
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107391693A (en) * | 2017-07-26 | 2017-11-24 | 北京理工大学 | A kind of information extraction and description method for English patent |
CN107247813A (en) * | 2017-07-26 | 2017-10-13 | 北京理工大学 | A kind of network struction and evolution method based on weighting technique |
CN107894986A (en) * | 2017-09-26 | 2018-04-10 | 北京纳人网络科技有限公司 | A kind of business connection division methods, server and client based on vectorization |
CN107894986B (en) * | 2017-09-26 | 2021-03-30 | 北京纳人网络科技有限公司 | Enterprise relation division method based on vectorization, server and client |
CN107967290A (en) * | 2017-10-09 | 2018-04-27 | 国家计算机网络与信息安全管理中心 | A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data |
CN107943874A (en) * | 2017-11-13 | 2018-04-20 | 平安科技(深圳)有限公司 | Knowledge mapping processing method, device, computer equipment and storage medium |
CN109993381A (en) * | 2017-12-29 | 2019-07-09 | 中国移动通信集团湖北有限公司 | Demand management application method, device, equipment and the medium of knowledge based map |
CN109993381B (en) * | 2017-12-29 | 2021-11-30 | 中国移动通信集团湖北有限公司 | Demand management application method, device, equipment and medium based on knowledge graph |
CN110059310B (en) * | 2018-01-19 | 2022-10-28 | 腾讯科技(深圳)有限公司 | Hypernym network expansion method and device, electronic equipment and storage medium |
CN110059310A (en) * | 2018-01-19 | 2019-07-26 | 腾讯科技(深圳)有限公司 | Extending method and device, electronic equipment, the storage medium of hypernym network |
CN108415950A (en) * | 2018-02-01 | 2018-08-17 | 腾讯科技(深圳)有限公司 | A kind of hypernym polymerization and device |
CN108415950B (en) * | 2018-02-01 | 2021-03-23 | 腾讯科技(深圳)有限公司 | Hypernym aggregation method and device |
CN109800317A (en) * | 2018-03-19 | 2019-05-24 | 中山大学 | A kind of image querying answer method based on the alignment of image scene map |
CN108563710B (en) * | 2018-03-27 | 2021-02-02 | 腾讯科技(深圳)有限公司 | Knowledge graph construction method and device and storage medium |
CN108563710A (en) * | 2018-03-27 | 2018-09-21 | 腾讯科技(深圳)有限公司 | A kind of knowledge mapping construction method, device and storage medium |
CN108664595A (en) * | 2018-05-08 | 2018-10-16 | 和美(深圳)信息技术股份有限公司 | Domain knowledge base construction method, device, computer equipment and storage medium |
CN108664595B (en) * | 2018-05-08 | 2020-10-16 | 和美(深圳)信息技术股份有限公司 | Domain knowledge base construction method and device, computer equipment and storage medium |
CN109145071A (en) * | 2018-08-06 | 2019-01-04 | 中国地质大学(武汉) | A kind of automated construction method and system towards geophysics field knowledge mapping |
CN109145071B (en) * | 2018-08-06 | 2021-08-27 | 中国地质大学(武汉) | Automatic construction method and system for geophysical field knowledge graph |
CN109145003A (en) * | 2018-08-24 | 2019-01-04 | 蜜小蜂智慧(北京)科技有限公司 | A kind of method and device constructing knowledge mapping |
CN109271557A (en) * | 2018-08-31 | 2019-01-25 | 北京字节跳动网络技术有限公司 | Method and apparatus for output information |
CN109189939A (en) * | 2018-09-05 | 2019-01-11 | 安阳师范学院 | A kind of Chinese Character Semantics knowledge mapping construction method, device, equipment, storage medium |
CN109857793A (en) * | 2018-12-28 | 2019-06-07 | 考拉征信服务有限公司 | Processing method, device, electronic equipment and the storage medium of technical background data |
CN109508391B (en) * | 2018-12-28 | 2022-04-08 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109508390B (en) * | 2018-12-28 | 2021-12-14 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109508390A (en) * | 2018-12-28 | 2019-03-22 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109508391A (en) * | 2018-12-28 | 2019-03-22 | 北京金山安全软件有限公司 | Input prediction method and device based on knowledge graph and electronic equipment |
CN109726298B (en) * | 2019-01-08 | 2020-12-29 | 上海市研发公共服务平台管理中心 | Knowledge graph construction method, system, terminal and medium suitable for scientific and technical literature |
CN109726298A (en) * | 2019-01-08 | 2019-05-07 | 上海市研发公共服务平台管理中心 | Knowledge mapping construction method, system, terminal and medium suitable for scientific and technical literature |
CN109918436B (en) * | 2019-03-08 | 2022-12-20 | 麦博(上海)健康科技有限公司 | Medical knowledge management and query system |
CN109918436A (en) * | 2019-03-08 | 2019-06-21 | 上海一健事信息科技有限公司 | A kind of Medical Knowledge management and inquiry system |
CN111859976A (en) * | 2019-04-30 | 2020-10-30 | 广东小天才科技有限公司 | Method and device for expanding regular expression based on knowledge graph |
CN110245239A (en) * | 2019-05-13 | 2019-09-17 | 吉林大学 | A kind of construction method and system towards automotive field knowledge mapping |
CN110598002A (en) * | 2019-08-14 | 2019-12-20 | 广州视源电子科技股份有限公司 | Knowledge graph library construction method and device, computer storage medium and electronic equipment |
CN110543574B (en) * | 2019-08-30 | 2022-05-17 | 北京百度网讯科技有限公司 | Knowledge graph construction method, device, equipment and medium |
CN110543574A (en) * | 2019-08-30 | 2019-12-06 | 北京百度网讯科技有限公司 | knowledge graph construction method, device, equipment and medium |
WO2021047327A1 (en) * | 2019-09-09 | 2021-03-18 | 深圳壹账通智能科技有限公司 | Method and apparatus for constructing target concept map, computer device, and storage medium |
CN110781310A (en) * | 2019-09-09 | 2020-02-11 | 深圳壹账通智能科技有限公司 | Target concept graph construction method and device, computer equipment and storage medium |
CN111209411A (en) * | 2020-01-03 | 2020-05-29 | 北京明略软件系统有限公司 | Document analysis method and device |
CN111242554A (en) * | 2020-01-17 | 2020-06-05 | 秒针信息技术有限公司 | Method and device for determining type of picking mode |
CN111242554B (en) * | 2020-01-17 | 2023-10-17 | 秒针信息技术有限公司 | Method and device for determining type of picking mode |
CN111209412A (en) * | 2020-02-10 | 2020-05-29 | 同方知网(北京)技术有限公司 | Method for building knowledge graph of periodical literature by cyclic updating iteration |
CN111209412B (en) * | 2020-02-10 | 2023-05-12 | 同方知网数字出版技术股份有限公司 | Periodical literature knowledge graph construction method for cyclic updating iteration |
CN112988974A (en) * | 2021-03-25 | 2021-06-18 | 上海园域信息科技有限公司 | Method and device for constructing industry chain knowledge graph based on vector space |
CN114969385A (en) * | 2022-08-03 | 2022-08-30 | 北京长河数智科技有限责任公司 | Knowledge graph optimization method and device based on document attribute assignment entity weight |
CN116010587A (en) * | 2023-03-23 | 2023-04-25 | 中国人民解放军63921部队 | Method, device, medium and equipment for pushing spaceflight test issuing guarantee condition knowledge |
Also Published As
Publication number | Publication date |
---|---|
CN106844658B (en) | 2019-12-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106844658A (en) | A kind of Chinese text knowledge mapping method for auto constructing and system | |
Wang et al. | K-adapter: Infusing knowledge into pre-trained models with adapters | |
CN109190117B (en) | Short text semantic similarity calculation method based on word vector | |
Grishman | Information extraction | |
CN103207856B (en) | A kind of Ontological concept and hierarchical relationship generation method | |
US9317593B2 (en) | Modeling topics using statistical distributions | |
CN103914548B (en) | Information search method and device | |
CN105528437B (en) | A kind of question answering system construction method extracted based on structured text knowledge | |
CN108681574B (en) | Text abstract-based non-fact question-answer selection method and system | |
CN109344236A (en) | One kind being based on the problem of various features similarity calculating method | |
CN106776562A (en) | A kind of keyword extracting method and extraction system | |
CN103886099B (en) | Semantic retrieval system and method of vague concepts | |
EP2224361A1 (en) | Generating a domain corpus and a dictionary for an automated ontology | |
EP2224360A1 (en) | Generating a dictionary and determining a co-occurrence context for an automated ontology | |
KR20060122276A (en) | Relation extraction from documents for the automatic construction of ontologies | |
CN111625622B (en) | Domain ontology construction method and device, electronic equipment and storage medium | |
CN109522396B (en) | Knowledge processing method and system for national defense science and technology field | |
CN107688583A (en) | The method and apparatus for creating the training data for natural language processing device | |
Alian et al. | Arabic semantic similarity approaches-review | |
CN109508460A (en) | Unsupervised composition based on Subject Clustering is digressed from the subject detection method and system | |
Yusuf et al. | Query expansion method for quran search using semantic search and lucene ranking | |
CN112883182A (en) | Question-answer matching method and device based on machine reading | |
Michelbacher | Multi-word tokenization for natural language processing | |
CN106776590A (en) | A kind of method and system for obtaining entry translation | |
Tian et al. | Measuring the similarity of short texts by word similarity and tree kernels |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||