CN103488724B - Method for constructing a book-oriented reading-domain knowledge graph - Google Patents

Method for constructing a book-oriented reading-domain knowledge graph

Info

Publication number
CN103488724B
Authority
CN
China
Prior art keywords
entity
concept
knowledge
books
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310420375.9A
Other languages
Chinese (zh)
Other versions
CN103488724A (en
Inventor
肖仰华
张可尊
汪卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201310420375.9A
Publication of CN103488724A
Application granted
Publication of CN103488724B
Legal status: Expired - Fee Related
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The invention belongs to the technical field of Chinese knowledge base applications and relates to a method for constructing a book-oriented reading-domain knowledge graph. The method has three parts: general knowledge graph construction, domain knowledge graph construction, and intelligent reading recommendation. That is: knowledge is acquired from the Internet and integrated into a general knowledge graph; drawing on the general knowledge graph, the concepts and entities related to a book are expanded iteratively, and entity relations are extracted using entity Infobox tables together with conventional relation extraction; the core entities in the e-book are annotated in order from longest to shortest, and links between the entities and the book's knowledge graph are established to provide intelligent knowledge recommendation. By building a book-oriented reading-domain knowledge graph, the invention explains the entities in a book and recommends related knowledge, which deepens the knowledge available to the reader, makes electronic reading more convenient, intelligent and user-friendly, and gives a better user experience.

Description

Method for constructing a book-oriented reading-domain knowledge graph
Technical field
The invention belongs to the technical field of Chinese knowledge base applications, and specifically relates to a method for constructing a book-oriented reading-domain knowledge graph.
Background art
With the development of computer technology and the spread of mobile devices, the way people read has changed profoundly: electronic reading is gradually replacing traditional paper reading as one of the mainstream reading modes. Compared with traditional reading, electronic reading avoids wasting paper and is therefore more environmentally friendly, and it makes reading more convenient. Electronic reading has already become an important channel for acquiring knowledge, and shows signs of becoming the leading one.
However, knowledge acquisition in current electronic reading is limited to the book itself. When readers meet unfamiliar words or knowledge points, they must consult auxiliary tools such as dictionaries and encyclopedias to have them explained. This places an extra burden on reading. How to present explanations of the knowledge in a book to the reader intuitively has become a bottleneck for electronic reading; solving it would make electronic reading more convenient, intelligent and user-friendly.
Current e-readers do attempt to explain the knowledge in books. The Kindle reader links a word in an e-book to a Wikipedia search to produce an explanation of the word. Other readers link words to the Chinese encyclopedia Hudong Baike (Interactive Encyclopedia) for explanation. These improvements raise the intelligibility of e-books and the depth of knowledge to some extent. Although they extend knowledge and content beyond the book, they do not organize or recommend knowledge intelligently: the reader still has to sort through the search results for a word and pick out what is needed, and the encyclopedia pages may not even contain the knowledge the reader wants. Existing electronic reading is therefore not intelligent enough, and cannot screen or recommend knowledge automatically.
A knowledge graph is a semantic network with entities and concepts as nodes and semantic relations as edges. A knowledge graph makes knowledge acquisition more direct, so it can supply semantically associated knowledge for electronic reading and thereby make reading more convenient, intelligent and user-friendly. However, Chinese knowledge graphs are still under construction, and those that exist are general-purpose. A reading-domain knowledge graph therefore needs to be built for each book.
Summary of the invention
To address the problems of shallow knowledge and insufficiently intelligent knowledge recommendation in current electronic reading, the invention proposes a method that draws on a general knowledge graph to construct a book-oriented domain knowledge graph. It builds a knowledge network for an e-book so that words in the book can be explained and knowledge can be recommended intelligently.
The book-oriented reading-domain knowledge graph construction method proposed by the invention draws on an existing general knowledge graph, identifies and annotates the core entities and concepts in a book, and mines the semantic relations among those entities and concepts, thereby constructing the book's domain knowledge graph. When a reader selects an annotated core entity to query, semantically related knowledge is retrieved from the domain knowledge graph and recommended intelligently. The method comprises three parts (three modules): general knowledge graph construction, domain knowledge graph construction, and intelligent reading application. The architecture of the method is shown in Fig. 1.
I. General knowledge graph construction
A knowledge graph is a semantic network made up of a massive number of entities, concepts, and the semantic relations between them. It can provide comprehensive, interlinked knowledge and explanations for an entity, so the general knowledge graph is used to build a domain knowledge graph for a book and thereby give sound explanations of the words and knowledge points in the book.
Existing Chinese knowledge graphs include Google's Chinese knowledge graph, the Baidu knowledge graph, and Sogou Zhilifang (Knowledge Cube). Existing knowledge sources are used as the basis for building the book's domain knowledge graph: Chinese entities, concepts, and relations are obtained from Baidu Baike, Hudong Baike, and Wikipedia, then integrated and cleaned to produce a high-quality general Chinese knowledge graph.
II. Domain knowledge graph construction
Drawing on the general knowledge graph, this module iteratively expands the core concepts and core entities and then mines the semantic relations between entities, thereby building the domain knowledge graph. It consists of two steps: concept and entity recognition, and relation extraction.
Concept and entity recognition
The goal of concept recognition is to identify all concepts closely related to the book. The invention does this using the open category information of entities in the general knowledge graph.
Book keyword definition
First, to identify the concepts related to the book, a small number of keywords closely related to the book are defined manually. The book title, or a keyword within the title, may be chosen. This step yields the keyword set KEYWORD (definition: the set of keywords related to the book title).
Seed concept identification
A seed concept is a concept in the knowledge graph whose name directly contains a keyword string. Every such concept is added to the category seed concept set SEEDCONCEPT (definition: the set of concepts in the knowledge graph that contain, as a substring, a keyword in KEYWORD).
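The keyword and seed-concept steps amount to a substring filter over concept names. The following is a minimal sketch under stated assumptions: the function name `find_seed_concepts` and the English concept names are hypothetical stand-ins for illustration, not data from the patent.

```python
def find_seed_concepts(keywords, concepts):
    """Return the concepts whose name contains at least one keyword as a substring."""
    return {c for c in concepts if any(k in c for k in keywords)}


# Toy data (assumed): English stand-ins for concept names in a general knowledge graph.
KEYWORD = {"Red Chamber"}
all_concepts = {
    "Dream of the Red Chamber",
    "Dream of the Red Chamber characters",
    "Dream of the Red Chamber costumes",
    "Qing dynasty novels",
}

SEEDCONCEPT = find_seed_concepts(KEYWORD, all_concepts)
```

Only concepts whose names literally contain a keyword are kept at this stage; related concepts with different names are picked up later by the iterative expansion.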
Iterative expansion of concepts and entities
Starting from the seed concepts, iterative expansion collects from the general knowledge graph all concepts and entities related to the book. It proceeds as follows (flowchart in Fig. 2):
First, the entities belonging to the concepts in SEEDCONCEPT are obtained and added to the core entity set COREENTITY (definition: the set of entities under the seed concepts).
Second, the core entities in COREENTITY are scanned. Concepts they belong to that are not in SEEDCONCEPT are called candidate concepts and are added to the candidate concept set CANDIDATECONCEPT (definition: the set of concepts that a core entity belongs to and that do not appear in the core concept set).
Then the semantic relevance between each candidate concept in CANDIDATECONCEPT and the core concept set CORECONCEPT is computed (definition: CORECONCEPT is the set of concepts closely related to the book, consisting of the seed concepts and the concepts highly similar to them). Candidate concepts whose relevance exceeds a given threshold (definition: the semantic relevance threshold; a concept is considered semantically related to the set if its relevance exceeds this value) are treated as related concepts and added to CORECONCEPT. The semantic relevance between a candidate concept c (any candidate concept) and the core concept set CS (that is, CORECONCEPT) is denoted Rel(c, CS). [The defining formula is not reproduced in this text.]
In that formula, Num(c, k) denotes the number of entities belonging to both category c and category k, Num(c) and Num(k) denote the number of entities belonging to category c and to category k respectively, and c and k are open categories of entities in the knowledge graph.
Finally, CORECONCEPT and COREENTITY are expanded incrementally and iteratively until no new concepts or entities are produced, which yields all concepts and entities related to the book.
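The expansion loop can be sketched as a fixed-point iteration. The formula for Rel(c, CS) is not available in this text, so the `relevance` function below substitutes a Jaccard-style overlap built from Num(c, k), Num(c) and Num(k), maximized over the core concepts; that choice, the function names, the threshold value, and the toy data are all assumptions for illustration.

```python
from collections import defaultdict


def relevance(c, core, members):
    """Assumed stand-in for Rel(c, CS): best Jaccard overlap of c with any core concept."""
    best = 0.0
    for k in core:
        both = len(members[c] & members[k])               # Num(c, k)
        union = len(members[c]) + len(members[k]) - both  # Num(c) + Num(k) - Num(c, k)
        if union:
            best = max(best, both / union)
    return best


def expand(seed_concepts, entity_concepts, threshold):
    """Grow the core concept and entity sets until neither changes."""
    members = defaultdict(set)  # concept -> entities carrying that category
    for entity, concepts in entity_concepts.items():
        for c in concepts:
            members[c].add(entity)
    core_concepts, core_entities = set(seed_concepts), set()
    while True:
        # Entities under the current core concepts join the core entity set.
        new_entities = {e for c in core_concepts for e in members[c]} - core_entities
        core_entities |= new_entities
        # Categories of core entities that are not yet core concepts are candidates.
        candidates = {c for e in core_entities for c in entity_concepts[e]} - core_concepts
        accepted = {c for c in candidates
                    if relevance(c, core_concepts, members) > threshold}
        core_concepts |= accepted
        if not new_entities and not accepted:
            return core_concepts, core_entities


# Toy data (assumed): entity -> open categories.
entity_concepts = {
    "Lin Daiyu": {"Red Chamber characters", "Twelve Beauties of Jinling"},
    "Wang Xifeng": {"Red Chamber characters", "Twelve Beauties of Jinling"},
    "Jia Baoyu": {"Red Chamber characters", "Fictional characters"},
    "Xue Baochai": {"Twelve Beauties of Jinling"},
    "Sherlock Holmes": {"Fictional characters"},
    "Harry Potter": {"Fictional characters"},
    "Hamlet": {"Fictional characters"},
}
CORECONCEPT, COREENTITY = expand({"Red Chamber characters"}, entity_concepts, 0.3)
```

In this toy run the overlapping category is accepted and brings in one further entity, while the broad "Fictional characters" category stays below the threshold and is left out.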
However, these entities and concepts may include some that are common and only weakly related to the topic, so they need to be cleaned. Cleaning is done by computing the IDF value of each entity or concept and treating those with low IDF values as noise. [The IDF formula is not reproduced in this text.]
Here Num denotes the total number of entities in the knowledge graph, Num(e) denotes the number of entities in the knowledge graph that contain a link to entity e, and Num(c) denotes the number of entities in the knowledge graph that carry category c; e is an entity in the knowledge graph and c is an open category of entities. This penalizes entities and concepts that are too general, so that only the most relevant entities and concepts are retained.
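As a rough illustration of the cleaning step, the sketch below uses the textbook inverse document frequency, log(Num / Num(x)). The exact formula is not available in this text, so this form, the cutoff `min_idf`, and the counts are assumptions.

```python
import math


def idf(total, count):
    """Textbook IDF, log(Num / Num(x)); assumed form, not taken from the patent."""
    return math.log(total / count)


def clean(items, counts, total, min_idf):
    """Keep only the items whose IDF reaches the cutoff; low-IDF items are noise."""
    return {x for x in items if idf(total, counts[x]) >= min_idf}


# Toy counts (assumed): how many of 1000 entities link to each entity.
total_entities = 1000
link_counts = {"Lin Daiyu": 20, "Jia Baoyu": 50, "China": 600}

kept = clean(link_counts.keys(), link_counts, total_entities, min_idf=1.0)
```

An entity that most of the graph links to, such as the very general "China" here, receives a low IDF and is dropped.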
Entity and concept relation extraction
Relation extraction builds semantic relations among the acquired entities and concepts and is an important step in constructing the knowledge graph. An entity relation is expressed as a triple <source, relation, target>, where source is the source entity, target is the target entity, relation is the entity relation, and r denotes the set of entity relation descriptions. A book-related relation is a triple whose source or target is in COREENTITY. Drawing on the general knowledge graph, the invention mainly uses two relation extraction methods: one based on the Infobox (definition: the attribute table of an entity in the knowledge graph) and one based on patterns.
Infobox-based relation extraction
An Infobox describes the basic attribute information of an entity in tabular form. An Infobox row, <entity, attribute, value>, has the same form as an entity relation: entity corresponds to source, attribute to relation, and value to target. Here entity is the entity, attribute is an attribute of the entity, and value is the corresponding attribute value. The Infobox tables are checked; if entity or value belongs to COREENTITY, the row is added to the set R (definition: the set of entity relation triples) and attribute is added to the entity relation description set r.
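The Infobox check can be sketched as a single pass over attribute rows. The dictionary layout and the function name `extract_infobox_relations` are assumptions; real Infobox data would need parsing first.

```python
def extract_infobox_relations(infoboxes, core_entities):
    """Collect triples whose entity or value is a core entity, plus their attribute names."""
    R, r = set(), set()
    for entity, attributes in infoboxes.items():
        for attribute, value in attributes.items():
            if entity in core_entities or value in core_entities:
                R.add((entity, attribute, value))  # <source, relation, target>
                r.add(attribute)                   # relation description
    return R, r


# Toy Infobox tables (assumed layout): entity -> {attribute: value}.
infoboxes = {
    "Lin Daiyu": {"father": "Lin Ruhai", "mother": "Jia Min"},
    "Sherlock Holmes": {"creator": "Arthur Conan Doyle"},
}
R, r = extract_infobox_relations(infoboxes, {"Lin Daiyu", "Jia Baoyu"})
```

The attribute names collected in `r` are what the pattern-based step reuses as relation description words.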
Pattern-based relation extraction
The Infobox yields the entity relation triple set R and the entity relation description set r. To mine more relations, the invention also uses a pattern-based relation extraction method.
The challenge in entity relation extraction is extracting the "relation description", and the Infobox-based method has already produced the relation description set r. Natural language processing is therefore combined with Chinese word segmentation to identify entities: the position of a relation description is first located in a sentence, and the nearest core entity or noun entity is then sought both before and after it. Under the extraction pattern (not reproduced in this text), a relation description word together with the nearest entity before it and the nearest entity after it forms a relation triple.
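The nearest-entity rule can be sketched on an already segmented sentence. The token list below stands in for the output of a Chinese word segmenter, and treating the preceding entity as the source and the following entity as the target is an assumption, since the pattern itself is not available in this text.

```python
def extract_by_pattern(tokens, relation_words, entities):
    """For each relation word, pair the nearest entity before it with the nearest after it."""
    triples = []
    for i, token in enumerate(tokens):
        if token not in relation_words:
            continue
        # Scan backwards and forwards from the relation word for the closest entity.
        before = next((t for t in reversed(tokens[:i]) if t in entities), None)
        after = next((t for t in tokens[i + 1:] if t in entities), None)
        if before is not None and after is not None:
            triples.append((before, token, after))
    return triples


# Toy segmented sentence (assumed): "Jia Baoyu's father is Jia Zheng".
tokens = ["Jia Baoyu", "'s", "father", "is", "Jia Zheng"]
triples = extract_by_pattern(tokens, {"father"}, {"Jia Baoyu", "Jia Zheng"})
```

A relation word with no entity on one side produces no triple, which keeps partial matches out of the result.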
For extracting character relations in books in particular, the extraction patterns in Table 1 are used. The corpus text is the entity's "business card" introduction, and r here denotes the set of character relations (for example "father", "husband", "wife"):
Table 1. Character relation extraction patterns
Note: ** denotes any word; /nr, /u, /uj and /v are part-of-speech tags produced by Chinese word segmentation; {r} is a relation description word in the relation description set r.
Entity alias relation extraction
In a book, some entities have aliases or special forms of address that all refer to the same entity. To identify these referring expressions, entity reference resolution is needed. It relies mainly on the synonym mappings of entities in the knowledge graph and on the synonym-describing attributes in the entity Infobox tables ("alias", "original name", "formal name", "pen name", etc.) to link each referring expression to its core entity.
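Alias resolution from Infobox attributes can be sketched as building a lookup table from alias to core entity. The attribute names, the list-valued layout, and the romanized aliases are assumptions for illustration.

```python
# Synonym-describing attribute names (assumed English renderings).
SYNONYM_ATTRIBUTES = ("alias", "original name", "formal name", "pen name")


def build_alias_map(infoboxes, core_entities):
    """Map every alias found in a core entity's Infobox to that core entity."""
    alias_to_entity = {}
    for entity in core_entities:
        attributes = infoboxes.get(entity, {})
        for attribute in SYNONYM_ATTRIBUTES:
            for alias in attributes.get(attribute, []):
                alias_to_entity[alias] = entity
    return alias_to_entity


# Toy Infobox (assumed layout): synonym attributes hold lists of names.
infoboxes = {"Wang Xifeng": {"alias": ["Feng Jie", "Feng Lazi"]}}
alias_map = build_alias_map(infoboxes, {"Wang Xifeng", "Lin Daiyu"})
```

A mention found in the text can then be normalized with `alias_map.get(mention, mention)` before it is looked up in the domain knowledge graph.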
III. Intelligent reading application
The main purpose of this module is to annotate the entities in the e-book and map them to entity knowledge in the book's domain knowledge graph. When a reader selects an entity in the book, the corresponding knowledge is selected from the book's domain knowledge graph, recommended, and displayed. The module comprises entity annotation and entity explanation:
Entity annotation marks the entities of the core entity set COREENTITY in the e-book. To improve the accuracy and speed of annotation, the entities are sorted by length and annotated from longest to shortest, which avoids the wrong annotations that arise when one entity name contains another;
Entity explanation finds, for each entity annotated in the e-book, the corresponding explanation in the knowledge graph; when the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended.
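Longest-first annotation can be sketched with plain string search and an occupancy mask. The function name `annotate`, the span format, and the sample sentence are assumptions; a production system would more likely use a trie or an Aho-Corasick matcher.

```python
def annotate(text, entities):
    """Mark entity mentions, longest names first, never overlapping an earlier mark."""
    spans = []
    taken = [False] * len(text)  # characters already covered by a longer entity
    for entity in sorted(entities, key=len, reverse=True):
        start = text.find(entity)
        while start != -1:
            end = start + len(entity)
            if not any(taken[start:end]):
                spans.append((start, end, entity))
                taken[start:end] = [True] * (end - start)
            start = text.find(entity, start + 1)
    return sorted(spans)


# Toy sentence (assumed): "Baoyu" also occurs inside the longer name "Jia Baoyu".
text = "Jia Baoyu met Baoyu's cousin Lin Daiyu."
spans = annotate(text, {"Jia Baoyu", "Baoyu", "Lin Daiyu"})
```

Because "Jia Baoyu" is placed before "Baoyu", the shorter name is only marked where it stands alone, which is the containment error the sorting is meant to avoid.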
In summary, the book-oriented domain knowledge graph constructed by the invention supports accurate extraction of the entities, concepts, and entity relations in an e-book and accurate annotation of core entities. Together with the knowledge graph, it provides accurate, intelligent knowledge recommendation for the entities in the e-book, greatly improving the convenience and intelligibility of the book. Existing electronic reading systems do not provide this function.
Based on the foregoing, the book-oriented reading-domain knowledge graph construction method of the invention is summarized as follows:
(1) For a given e-book and general knowledge graph, the knowledge related to the e-book is identified and extracted in order to provide intelligent knowledge recommendation. This knowledge includes the entities and concepts to be explained and their related semantic relations, which together form the semantic network of the book, i.e. the book's domain knowledge graph.
(2) From the constructed domain knowledge graph and the e-book, an intelligent reading system is generated. The core vocabulary in the e-book (entities and concepts in the knowledge graph) is annotated, and the explanations in the knowledge graph are linked into the e-book. When a reader requests the explanation of a term, semantically related explanations are selected from the domain knowledge graph and recommended.
The book domain knowledge graph construction method in step (1) has the following steps:
(a) Concept and entity recognition uses the category information in the general knowledge graph: book keywords are defined first, the seed concepts related to the book are then obtained, and concepts and entities are then expanded iteratively. Whether a candidate concept is added to the core concept set is decided by the relevance Rel(c, CS) defined between the candidate concept and the core concept set, where Num(c, k) denotes the number of entities belonging to both category c and category k, and Num(c) and Num(k) denote the number of entities under category c and under category k respectively.
Finally, the resulting entities and concepts are cleaned using the IDF measure, leaving the entities and concepts closely related to the book.
(b) Entity and concept relation extraction uses the entity Infobox information in the general knowledge graph to extract the description set {relation} for the relation element of the triple <source, relation, target>, along with part of the entity relations. Pattern-based relation extraction is then applied to the text of the general knowledge graph entities to extract more relations.
(c) Entity alias relations are resolved mainly through the synonym information of entities in the general knowledge graph and the synonym-describing attributes in the Infobox tables, linking each referring expression to the core entity it refers to.
The intelligent reading system generation method in step (2) has the following steps:
(a) The vocabulary that needs explaining is annotated in the e-book: the entities and concepts in the book's domain knowledge graph are sorted by character length and matched and annotated in the e-book from longest to shortest.
(b) The knowledge about book-related entities and concepts is taken from the general knowledge graph and integrated into the book's domain knowledge graph, and the links from the book's vocabulary to the related knowledge are then completed.
Brief description of the drawings
Fig. 1 is the architecture diagram of the book-oriented reading-domain knowledge graph.
Fig. 2 is the flowchart of concept and entity extraction.
Fig. 3 shows the effect of entity annotation and knowledge recommendation for the book Dream of the Red Chamber.
Fig. 4 shows part of the character relation graph of Dream of the Red Chamber obtained with the relation extraction method.
Detailed description of the invention
The invention is further described below, taking the e-book Dream of the Red Chamber as an example:
Module 1: General knowledge graph construction
The Baidu Chinese knowledge graph is used as the knowledge source, supplemented by Hudong Baike and Chinese Wikipedia. The encyclopedia data is crawled and parsed, and the resulting encyclopedia entities are integrated and cleaned to obtain high-quality entities, concepts, and entity relations, from which the general knowledge graph is built.
Module 2: Domain knowledge graph construction
1. Entity and concept extraction
First, for the e-book Dream of the Red Chamber, the keyword set is defined manually as KEYWORD = {"Dream of the Red Chamber"}. The entity categories of the knowledge graph are then searched for concepts containing the keyword "Dream of the Red Chamber", giving the core concept set CORECONCEPT = {"Dream of the Red Chamber", "Dream of the Red Chamber characters", "Dream of the Red Chamber costumes", ...}. Second, the entities belonging to the core concepts are looked up in the knowledge graph to form the core entity set COREENTITY = {"Jia Mansion", "Baoyu", "Lin Daiyu", ...}. The concepts that the entities in COREENTITY belong to and that are not in CORECONCEPT are added to CANDIDATECONCEPT. For each concept in CANDIDATECONCEPT, its semantic relevance Rel(c, CS) to CORECONCEPT is computed, and concepts whose relevance exceeds the threshold are added to CORECONCEPT. Finally, COREENTITY and CORECONCEPT are expanded iteratively until convergence, when no new core entity or concept joins the sets. The scale of the entities related to Dream of the Red Chamber is shown in Table 2.
2. Relation extraction
First, relations are extracted with the Infobox-based method: for each Infobox row <entity, attribute, value>, check whether entity or value belongs to the core entity set. If so, the entity relation triple <entity, attribute, value> is added to the set R and the relation description attribute is added to the relation description set r. For example, <Lin Daiyu, father, Lin Ruhai> gives the relation "Lin Daiyu-father-Lin Ruhai" and the relation description "father".
Second, the pattern-based method is used to extend the relations from text with the extraction pattern. For example, with the relation description "father", the entity relation "Jia Baoyu-father-Jia Zheng" can be extracted.
Then the patterns in Table 1 are used to extract the character relations in Dream of the Red Chamber from text. The scale of the resulting character relation graph is shown in Table 2, and the character relation graphs centred on "Wang Xifeng" and "Lin Daiyu" are shown in Fig. 4.
Finally, entity references in the book are resolved; in Dream of the Red Chamber, for example, "Feng Jie" (Sister Feng) and "Feng Lazi" (Hot Pepper Feng) both refer to "Wang Xifeng". To identify such referring expressions, the synonym mappings of entities in the knowledge graph and the synonym-describing attributes in the entity Infobox tables ("alias", "original name", "formal name", "pen name", etc.) are used to link each referring expression to its core entity.
The scale of the knowledge graph built with this module's entity and concept recognition and relation extraction methods is shown in Table 2.
Table 2. Scale of the Dream of the Red Chamber domain knowledge graph
Domain knowledge graph                         Entities   Entity relations   Concepts
Dream of the Red Chamber entity graph          1560       2731               85
Dream of the Red Chamber character subgraph    804        1530               2
Module 3: Intelligent reading application
The entities related to Dream of the Red Chamber obtained in Module 2 are sorted by length from longest to shortest and annotated in that order in the Dream of the Red Chamber e-book. This avoids the errors caused by one entity name containing another and improves both the accuracy and the efficiency of annotation. The annotation result is shown in Fig. 3: related entities such as "Baoyu" and "Lin Daiyu" are all annotated accurately.
For each entity annotated in the book, the corresponding explanation is found in the knowledge graph; when the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended. Fig. 3 shows the explanation for the entity "Lin Daiyu" in Dream of the Red Chamber.

Claims (3)

1. A method for constructing a book-oriented reading-domain knowledge graph, characterized in that it comprises the following steps: general knowledge graph construction, domain knowledge graph construction, and intelligent reading application;
I. General knowledge graph construction
A knowledge graph is a semantic network made up of a massive number of entities, concepts, and the semantic relations between them; a domain knowledge graph is built for a book from the general knowledge graph, so as to give sound explanations of the words and knowledge points in the book; the general knowledge graph is built using existing Chinese knowledge graphs, including Google's Chinese knowledge graph, the Baidu knowledge graph, and Sogou Zhilifang, as existing knowledge sources;
II. Domain knowledge graph construction
Drawing on the general knowledge graph, the core concepts and core entities are expanded iteratively, and the semantic relations between entities are then mined, thereby building the domain knowledge graph; this comprises concept and entity recognition and relation extraction:
2.1 Concept and entity recognition
The goal of concept recognition is to identify all concepts closely related to the book; concept recognition is carried out using the open category information of entities in the general knowledge graph;
2.1.1 Book keyword definition
First, to identify the concepts related to the book, a small number of keywords closely related to the book are defined manually; the book title, or a keyword within the title, is chosen as the keyword; this step yields the keyword set KEYWORD;
2.1.2 Seed concept identification
A seed concept is a concept in the knowledge graph whose name directly contains a keyword string; every such concept is added to the category seed concept set SEEDCONCEPT, which is the set of concepts in the knowledge graph that contain a keyword string in KEYWORD;
2.1.3 Iterative expansion of concepts and entities
Starting from the seed concepts, iterative expansion collects from the general knowledge graph all concepts and entities related to the book; the specific method is as follows:
First, the entities belonging to the concepts in SEEDCONCEPT are obtained and added to the core entity set COREENTITY, which is the set of entities under the seed concepts;
Second, the core entities in COREENTITY are scanned; concepts they belong to that are not in SEEDCONCEPT are called candidate concepts and are added to the candidate concept set CANDIDATECONCEPT, which is the set of concepts that a core entity belongs to and that do not appear in the core concept set;
Then, the semantic relevance between each candidate concept in CANDIDATECONCEPT and the core concept set CORECONCEPT is computed, the core concept set being the set of concepts closely related to the book, consisting of the seed concepts and the concepts highly similar to them; candidate concepts whose relevance exceeds a given threshold are treated as related concepts and added to CORECONCEPT; the semantic relevance between a candidate concept c and the core concept set CS is denoted Rel(c, CS) [the defining formula is not reproduced in this text];
wherein Num(c, k) denotes the number of entities belonging to both category c and category k, and Num(c) and Num(k) denote the number of entities belonging to category c and to category k respectively;
Finally, CORECONCEPT and COREENTITY are expanded incrementally and iteratively until no new concepts or entities are produced, which yields all concepts and entities related to the book;
-2.2 entitative concept Relation extractions
Entitative concept Relation extraction is for acquired entitative concept constructing semantic relation, and entity relationship is expressed as tlv triple, wherein r presentation-entity relationship description set;Books are correlated with Relation refers to that in tlv triple, source or target is in COREENTITY;Two kinds of Relation extractions are used in conjunction with world knowledge collection of illustrative plates Method: Relation extraction method based on Infobox and Relation extraction method based on pattern;
-2.2.1 Relation extraction based on Infobox method
Infobox refers to entity attributes table, describes the base attribute information of entity in table form;The expression of Infobox, represent identical with entity relationship, i.e. entity correspondence source, Attribute correspondence relation, value correspondence target;First, check Infobox table, if entity or value belongs to In COREENTITY, then this attribute is added the set that set R, R are made up of entity relationship tlv triple, by attribute Add entity relationship description collections r;
-2.2.2 Relation extraction based on pattern method
The method of Infobox has been obtained for " relationship description " set r, and Relation extraction based on pattern uses natural language processing Method and combine Chinese word segmentation identification entity, from a sentence, i.e. first find out the position of " relationship description ", the most respectively to Before, find nearest kernel entity or noun entity backward;Relation extraction pattern is:;I.e. entity Relationship description word and its forward, the most nearest entity constitute a relation tlv triple;
-2.2.3 Entity reference relation extraction
In a book, some entities have aliases or special forms of address, all of which refer to the same entity. To identify these referring entities, entity reference resolution is required: the synonym mappings of entities in the knowledge graph, together with the synonym-describing attributes in the entity Infobox tables, including "alias", "former name", "scientific name" and "pen name", are used to associate each referring entity with its core entity;
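The alias association can be sketched as building a lookup table from alias to core entity. This is a minimal illustration: the English attribute names follow the translation above, and the data layout (a synonym map plus Infobox dictionaries) and sample data are assumptions.

```python
from typing import Dict, Set

# Infobox attributes assumed to carry synonyms (names as translated above).
SYNONYM_ATTRIBUTES = ("alias", "former name", "scientific name", "pen name")


def build_alias_map(
    infoboxes: Dict[str, Dict[str, str]],
    synonym_map: Dict[str, str],
    core_entities: Set[str],
) -> Dict[str, str]:
    """Map every known alias to the core entity it refers to."""
    alias_to_core: Dict[str, str] = {}
    # Source 1: synonym mappings already present in the knowledge graph.
    for alias, entity in synonym_map.items():
        if entity in core_entities:
            alias_to_core[alias] = entity
    # Source 2: synonym-describing attributes in each core entity's Infobox.
    for entity, table in infoboxes.items():
        if entity not in core_entities:
            continue
        for attribute in SYNONYM_ATTRIBUTES:
            value = table.get(attribute)
            if value:
                alias_to_core[value] = entity
    return alias_to_core


# Invented sample data for illustration only.
alias_map = build_alias_map(
    infoboxes={"Sun Wukong": {"alias": "Monkey King", "weapon": "staff"}},
    synonym_map={"Great Sage Equal to Heaven": "Sun Wukong", "Poet Immortal": "Li Bai"},
    core_entities={"Sun Wukong"},
)
```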
Three, intelligent reading applications
These include entity annotation and entity explanation:
Entity annotation marks, in the electronic book, the entities in the core entity set COREENTITY. To improve annotation accuracy and speed, the entities are sorted by length and annotated in order from longest to shortest, so as to avoid erroneous annotations caused by one entity containing another;
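Longest-first annotation can be sketched with plain substring matching, which suits unsegmented Chinese text. This is a minimal illustration under that assumption; the function name and the span representation are hypothetical.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, entity), end exclusive


def annotate(text: str, core_entities: Iterable[str]) -> List[Span]:
    """Mark entities longest first so a short entity never splits a longer one."""
    taken = [False] * len(text)  # characters already covered by an annotation
    spans: List[Span] = []
    for entity in sorted(core_entities, key=len, reverse=True):
        if not entity:
            continue
        start = text.find(entity)
        while start != -1:
            end = start + len(entity)
            if not any(taken[start:end]):
                spans.append((start, end, entity))
                for i in range(start, end):
                    taken[i] = True
            start = text.find(entity, start + 1)
    return sorted(spans)


# "Jia" is contained in "Jia Baoyu"; only the standalone occurrence is marked.
sample_spans = annotate("Jia Baoyu of the Jia family", ["Jia", "Jia Baoyu"])
```

If the entities were processed shortest first, "Jia" would claim the first three characters and the longer name "Jia Baoyu" could no longer be annotated as a whole.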
Entity explanation looks up, in the knowledge graph, the explanation corresponding to each entity annotated in the e-book; when the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended.
The book-oriented reading-domain knowledge graph construction method according to claim 1, characterized in that, in the concept and entity iterative expansion step, the entities and concepts may include some that are fairly common but only weakly related to the topic, and these therefore need to be cleaned out; the cleaning is carried out by computing the IDF value of each entity or concept, i.e. entities or concepts with low IDF values are treated as noise, as shown below:
Num denotes the total number of entities in the knowledge graph, Num(e) denotes the number of entities in the knowledge graph that contain a link to entity e, and Num(c) denotes the number of entities in the knowledge graph that contain category c.
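The IDF-based cleaning can be sketched as follows. The formula itself is not reproduced in this text, so the sketch assumes the standard form IDF(x) = log(Num / Num(x)); the natural logarithm, the threshold value and the sample counts are all assumptions made for illustration.

```python
import math
from typing import Dict, Set


def idf(total: int, containing: int) -> float:
    """Assumed standard IDF: log(Num / Num(x)), natural logarithm."""
    return math.log(total / containing)


def clean_by_idf(counts: Dict[str, int], total: int, threshold: float) -> Set[str]:
    """Keep entities or concepts whose IDF reaches the (assumed) threshold."""
    return {
        name
        for name, containing in counts.items()
        if containing > 0 and idf(total, containing) >= threshold
    }


# Invented counts: how many knowledge-graph entities link to each entity.
link_counts = {"China": 500, "Grand View Garden": 5}
kept = clean_by_idf(link_counts, total=1000, threshold=2.0)
```

An entity linked from half of the knowledge graph gets an IDF close to zero and is dropped as noise, while a rarely linked, topic-specific entity is kept.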
The book-oriented reading-domain knowledge graph construction method according to claim 1, characterized in that, in the pattern-based relation extraction, character relations in the book are extracted using the extraction patterns in Table 1; the corpus text is the entity's profile introduction, and r here denotes the set of character relations,
Table 1. Character relation extraction patterns
* denotes any word; /nr, /u, /uj and /v denote part-of-speech tags assigned after Chinese word segmentation; and {r} is a relation description word in the relation description set r.
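Matching a pattern of this notation against part-of-speech-tagged text can be sketched with a regular expression. The patterns of Table 1 are not reproduced in this text, so the pattern `*/nr */uj {r} */nr` used below is purely illustrative, as are the function names, the `word/tag` input format and the sample sentence.

```python
import re
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (person, relation word, person)


def compile_pattern(pattern: str, relation_words: Iterable[str]) -> re.Pattern:
    """Turn a pattern such as '*/nr */uj {r} */nr' into a regex over 'word/tag' text."""
    rel_alt = "|".join(
        re.escape(w) for w in sorted(relation_words, key=len, reverse=True)
    )
    pieces = []
    person_slots = 0
    for item in pattern.split():
        if item == "{r}":
            # A relation description word followed by whatever tag it received.
            pieces.append(rf"(?P<rel>{rel_alt})/\S+")
            continue
        word, tag = item.rsplit("/", 1)
        word_re = r"[^\s/]+" if word == "*" else re.escape(word)
        if tag == "nr":  # person-name slot: capture it as p0, p1, ...
            word_re = f"(?P<p{person_slots}>{word_re})"
            person_slots += 1
        pieces.append(f"{word_re}/{re.escape(tag)}")
    return re.compile(r"\s+".join(pieces))


def extract_character_relations(
    tagged_text: str, pattern: str, relation_words: Iterable[str]
) -> List[Triple]:
    # This sketch assumes the pattern contains exactly two /nr slots.
    regex = compile_pattern(pattern, relation_words)
    return [
        (m.group("p0"), m.group("rel"), m.group("p1"))
        for m in regex.finditer(tagged_text)
    ]


# Illustrative tagged sentence: "Jia Baoyu's father Jia Zheng".
sample_relations = extract_character_relations(
    "贾宝玉/nr 的/uj 父亲/n 贾政/nr", "*/nr */uj {r} */nr", {"父亲", "母亲"}
)
```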
CN201310420375.9A 2013-09-16 2013-09-16 A kind of reading domain knowledge map construction method towards books Expired - Fee Related CN103488724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310420375.9A CN103488724B (en) 2013-09-16 2013-09-16 A kind of reading domain knowledge map construction method towards books

Publications (2)

Publication Number Publication Date
CN103488724A CN103488724A (en) 2014-01-01
CN103488724B CN103488724B (en) 2016-09-28

Family

ID=49828950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310420375.9A Expired - Fee Related CN103488724B (en) 2013-09-16 2013-09-16 A kind of reading domain knowledge map construction method towards books

Country Status (1)

Country Link
CN (1) CN103488724B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423820A (en) * 2016-05-24 2017-12-01 清华大学 The knowledge mapping of binding entity stratigraphic classification represents learning method

Families Citing this family (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102713B (en) * 2014-07-16 2018-01-19 百度在线网络技术(北京)有限公司 Recommendation results show method and apparatus
CN104133916B (en) * 2014-08-14 2019-01-15 百度在线网络技术(北京)有限公司 Search result information method for organizing and device
CN104462227A (en) * 2014-11-13 2015-03-25 中国测绘科学研究院 Automatic construction method of graphic knowledge genealogy
CN104408148B (en) * 2014-12-03 2017-12-01 复旦大学 A kind of field encyclopaedia constructing system based on general encyclopaedia website
CN105117115B (en) * 2015-08-07 2018-05-08 小米科技有限责任公司 A kind of method and apparatus for showing electronic document
CN105574098B (en) * 2015-12-11 2019-02-12 百度在线网络技术(北京)有限公司 The generation method and device of knowledge mapping, entity control methods and device
CN105824802B (en) * 2016-03-31 2018-10-30 清华大学 It is a kind of to obtain the method and device that knowledge mapping vectorization indicates
CN105912656B (en) * 2016-04-07 2020-03-17 桂林电子科技大学 Method for constructing commodity knowledge graph
CN107391512B (en) * 2016-05-17 2021-05-11 北京邮电大学 Method and device for predicting knowledge graph
CN106095748B (en) * 2016-06-06 2019-08-27 东软集团股份有限公司 A kind of method and device generating event relation map
CN107894884A (en) * 2016-09-30 2018-04-10 中国电子科技集团公司信息科学研究院 Object representation device and its description method
WO2018076348A1 (en) * 2016-10-31 2018-05-03 Microsoft Technology Licensing, Llc Building and updating a connected segment graph
CN108073587B (en) * 2016-11-09 2022-05-27 阿里巴巴集团控股有限公司 Automatic question answering method and device and electronic equipment
CN106776564B (en) * 2016-12-21 2020-04-24 张永成 Semantic recognition method and system based on knowledge graph
CN106874378B (en) * 2017-01-05 2020-06-02 北京工商大学 Method for constructing knowledge graph based on entity extraction and relation mining of rule model
US10579689B2 (en) 2017-02-08 2020-03-03 International Business Machines Corporation Visualization and augmentation of human knowledge construction during material consumption
CN106875014B (en) * 2017-03-02 2021-06-15 上海交通大学 Automatic construction implementation method of software engineering knowledge base based on semi-supervised learning
CN106934032B (en) * 2017-03-14 2019-10-18 北京软通智城科技有限公司 A kind of city knowledge mapping construction method and device
CN106933806A (en) * 2017-03-15 2017-07-07 北京大数医达科技有限公司 The determination method and apparatus of medical synonym
CN106951526B (en) * 2017-03-21 2020-08-07 北京邮电大学 Entity set extension method and device
CN108694177B (en) * 2017-04-06 2022-02-18 北大方正集团有限公司 Knowledge graph construction method and system
CN107038261B (en) * 2017-05-28 2019-09-20 海南大学 A kind of processing framework resource based on data map, Information Atlas and knowledge mapping can Dynamic and Abstract Semantic Modeling Method
CN107103100B (en) * 2017-06-10 2019-07-30 海南大学 A kind of fault-tolerant intelligent semantic searching method based on map framework
CN109241289A (en) * 2017-07-04 2019-01-18 北京国双科技有限公司 Entity information map extending method and device
CN107330125B (en) * 2017-07-20 2020-06-30 云南电网有限责任公司电力科学研究院 Mass unstructured distribution network data integration method based on knowledge graph technology
CN107609052B (en) * 2017-08-23 2019-09-24 中国科学院软件研究所 A kind of generation method and device of the domain knowledge map based on semantic triangle
CN107704637B (en) * 2017-11-20 2019-12-13 中国人民解放军国防科技大学 knowledge graph construction method for emergency
CN108052576B (en) * 2017-12-08 2021-04-23 国家计算机网络与信息安全管理中心 Method and system for constructing affair knowledge graph
CN108182245A (en) * 2017-12-28 2018-06-19 北京锐安科技有限公司 The construction method and device of people's object properties classificating knowledge collection of illustrative plates
CN108171213A (en) * 2018-01-22 2018-06-15 北京邮电大学 A kind of Relation extraction method for being applicable in picture and text knowledge mapping
CN108415971B (en) * 2018-02-08 2021-07-23 兰州智豆信息科技有限公司 Method and device for recommending supply and demand information by using knowledge graph
CN108959242B (en) * 2018-05-08 2021-07-27 中国科学院信息工程研究所 Target entity identification method and device based on part-of-speech characteristics of Chinese characters
CN108710695B (en) * 2018-05-23 2019-08-06 掌阅科技股份有限公司 Mind map generation method and electronic equipment based on e-book
CN108920527A (en) * 2018-06-07 2018-11-30 桂林电子科技大学 A kind of personalized recommendation method of knowledge based map
CN108829865B (en) * 2018-06-22 2021-04-09 海信集团有限公司 Information retrieval method and device
CN109189942B (en) * 2018-09-12 2021-07-09 山东大学 Construction method and device of patent data knowledge graph
US10923114B2 (en) * 2018-10-10 2021-02-16 N3, Llc Semantic jargon
CN109522551B (en) * 2018-11-09 2024-02-20 天津新开心生活科技有限公司 Entity linking method and device, storage medium and electronic equipment
CN109597895B (en) * 2018-11-09 2021-10-22 中电科大数据研究院有限公司 Knowledge graph-based official document searching method
CN111209407B (en) * 2018-11-21 2023-06-16 北京嘀嘀无限科技发展有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN111274812B (en) * 2018-12-03 2023-04-18 阿里巴巴集团控股有限公司 Figure relation recognition method, equipment and storage medium
CN109766444B (en) * 2018-12-10 2021-02-23 北京百度网讯科技有限公司 Application database generation method and device of knowledge graph
CN109739994B (en) * 2018-12-14 2023-05-02 复旦大学 API knowledge graph construction method based on reference document
CN109726298B (en) * 2019-01-08 2020-12-29 上海市研发公共服务平台管理中心 Knowledge graph construction method, system, terminal and medium suitable for scientific and technical literature
CN109885691A (en) * 2019-01-08 2019-06-14 平安科技(深圳)有限公司 Knowledge mapping complementing method, device, computer equipment and storage medium
CN109710748B (en) * 2019-01-17 2021-04-27 北京光年无限科技有限公司 Intelligent robot-oriented picture book reading interaction method and system
CN109885660B (en) * 2019-02-22 2020-10-02 上海乐言信息科技有限公司 Knowledge graph energizing question-answering system and method based on information retrieval
CN110008354B (en) * 2019-04-10 2022-06-07 华侨大学 Method for constructing foreign Chinese learning content based on knowledge graph
CN110162640A (en) * 2019-04-28 2019-08-23 北京百度网讯科技有限公司 Novel entities method for digging, device, computer equipment and storage medium
CN110245239A (en) * 2019-05-13 2019-09-17 吉林大学 A kind of construction method and system towards automotive field knowledge mapping
CN110275966B (en) * 2019-07-01 2021-10-01 科大讯飞(苏州)科技有限公司 Knowledge extraction method and device
CN110532399A (en) * 2019-08-07 2019-12-03 广州多益网络股份有限公司 Knowledge mapping update method, system and the device of object game question answering system
CN110598002A (en) * 2019-08-14 2019-12-20 广州视源电子科技股份有限公司 Knowledge graph library construction method and device, computer storage medium and electronic equipment
CN110489032B (en) * 2019-08-14 2021-08-24 掌阅科技股份有限公司 Dictionary query method for electronic book and electronic equipment
CN110929038B (en) * 2019-10-18 2023-07-21 平安科技(深圳)有限公司 Knowledge graph-based entity linking method, device, equipment and storage medium
CN111091006B (en) * 2019-12-20 2023-08-29 北京百度网讯科技有限公司 Method, device, equipment and medium for establishing entity intention system
TWI747220B (en) * 2020-03-31 2021-11-21 股感生活金融科技股份有限公司 Knowledge graph association search method and system
CN111708892B (en) * 2020-04-24 2021-08-03 陆洋 Database system based on depth knowledge graph
CN112100396B (en) * 2020-08-28 2023-10-27 泰康保险集团股份有限公司 Data processing method and device
CN112256835B (en) * 2020-10-29 2021-07-23 东南大学 Subgraph extraction method for accurately describing element semantics in knowledge graph
CN112463980A (en) * 2020-11-25 2021-03-09 南京摄星智能科技有限公司 Intelligent plan recommendation method based on knowledge graph
CN112597285B (en) * 2020-12-10 2021-08-10 太极计算机股份有限公司 Man-machine interaction method and system based on knowledge graph
CN112487212A (en) * 2020-12-18 2021-03-12 清华大学 Method and device for constructing domain knowledge graph
CN112632226B (en) * 2020-12-29 2021-10-26 天津汇智星源信息技术有限公司 Semantic search method and device based on legal knowledge graph and electronic equipment
CN112749284B (en) * 2020-12-31 2021-12-17 平安科技(深圳)有限公司 Knowledge graph construction method, device, equipment and storage medium
CN113326697A (en) * 2021-05-31 2021-08-31 云南电网有限责任公司电力科学研究院 Knowledge graph-based electric power text entity semantic understanding method
CN113297347A (en) * 2021-06-29 2021-08-24 中国人民解放军国防科技大学 Intelligent auxiliary method, system and storage medium for professional document reading

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6408302B1 (en) * 1999-06-28 2002-06-18 Davox Corporation System and method of mapping database fields to a knowledge base using a graphical user interface
CN1448863A (en) * 2002-04-04 2003-10-15 迪吉科技有限公司 Establishing, editing, searching of knowledge map and editing method of corresponding network information content
CN1466046A (en) * 2002-07-01 2004-01-07 财团法人资讯工业策进会 Knowledge pattern system and method using ontologics as basis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060277176A1 (en) * 2005-06-01 2006-12-07 Mydrew Inc. System, method and apparatus of constructing user knowledge base for the purpose of creating an electronic marketplace over a public network

Also Published As

Publication number Publication date
CN103488724A (en) 2014-01-01

Similar Documents

Publication Publication Date Title
CN103488724B (en) A kind of reading domain knowledge map construction method towards books
CN106776711B (en) Chinese medical knowledge map construction method based on deep learning
CN110399457B (en) Intelligent question answering method and system
CN110633409B (en) Automobile news event extraction method integrating rules and deep learning
CN104391942B (en) Short essay eigen extended method based on semantic collection of illustrative plates
CN108959256B (en) Short text generation method and device, storage medium and terminal equipment
CN101593200B (en) Method for classifying Chinese webpages based on keyword frequency analysis
CN105528437B (en) A kind of question answering system construction method extracted based on structured text knowledge
CN105760495B (en) A kind of knowledge based map carries out exploratory searching method for bug problem
CN107832229A (en) A kind of system testing case automatic generating method based on NLP
CN101299217B (en) Method, apparatus and system for processing map information
CN109190117A (en) A kind of short text semantic similarity calculation method based on term vector
Zheng et al. Template-independent news extraction based on visual consistency
CN105706078A (en) Automatic definition of entity collections
CN110298395B (en) Image-text matching method based on three-modal confrontation network
CN106570191A (en) Wikipedia-based Chinese and English cross-language entity matching method
CN104462063B (en) Positional information structuring extracting method based on semantic locations model and system
CN106021392A (en) News key information extraction method and system
CN108073576A (en) Intelligent search method, searcher and search engine system
CN104268283A (en) Method for automatically analyzing Internet web page
CN107656921A (en) A kind of short text dependency analysis method based on deep learning
CN101556596A (en) Input method system and intelligent word making method
CN108920482A (en) Microblogging short text classification method based on Lexical Chains feature extension and LDA model
CN112966091A (en) Knowledge graph recommendation system fusing entity information and heat
CN113722490A (en) Visual rich document information extraction method based on key value matching relation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160928

Termination date: 20190916