CN103488724B - A method for constructing a book-oriented reading domain knowledge graph - Google Patents
- Publication number
- CN103488724B (application CN201310420375.9A)
- Authority
- CN
- China
- Prior art keywords
- entity
- concept
- knowledge
- books
- relation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention belongs to the technical field of Chinese knowledge base applications and provides a method for constructing a book-oriented reading domain knowledge graph. The method has three parts: general knowledge graph construction, domain knowledge graph construction, and intelligent reading recommendation. Specifically: knowledge is collected from the Internet and integrated into a general knowledge graph; the concepts and entities related to a book are expanded iteratively with the help of the general knowledge graph, and entity relations are extracted from entity Infobox tables and by conventional relation extraction; the core entities in the e-book are annotated in order from longest to shortest and linked to the book's knowledge graph, enabling intelligent knowledge recommendation. By building a book-oriented reading domain knowledge graph, the invention explains the entities in a book or recommends related knowledge, deepens the knowledge available to the reader, makes electronic reading more convenient, intelligent, and user-friendly, and provides a better user experience.
Description
Technical field
The invention belongs to the technical field of Chinese knowledge base applications, and specifically relates to a method for constructing a book-oriented reading domain knowledge graph.
Background art
With the development of computer technology and the spread of mobile devices, the way people read has changed profoundly: electronic reading is gradually replacing traditional paper reading as one of the mainstream reading modes. Compared with traditional reading, electronic reading avoids wasting paper and is therefore more environmentally friendly, and it lets readers read more conveniently. Electronic reading has already become an important channel of knowledge acquisition, and it shows a trend toward becoming the leading one.
However, knowledge acquisition in current electronic reading is limited to the book itself. When readers meet unfamiliar vocabulary or knowledge points, they must consult auxiliary tools such as dictionaries and encyclopedias to have them explained, which places an extra burden on reading. How to present explanations of the knowledge in a book to the reader intuitively has become the bottleneck of current electronic reading; solving this problem will make electronic reading more convenient, intelligent, and user-friendly.
Current e-readers do attempt to explain the knowledge in books. The Kindle reader links a word in an e-book to a Wikipedia search to produce an explanation of the word. Other readers link words to the Chinese Hudong Baike (Interactive Encyclopedia) for explanation. These improvements raise the intelligibility of e-books and the depth of knowledge to some extent. Although they extend knowledge and content beyond the book, they do not organize or recommend knowledge intelligently: readers still have to sort through the search results for a word and pick out what they need, and the encyclopedia pages may not even contain the knowledge the reader wants. Existing electronic reading is therefore not intelligent enough; it cannot screen or recommend knowledge automatically.
A knowledge graph is a semantic network whose nodes are entities and concepts and whose edges are semantic relations. A knowledge graph makes knowledge acquisition more direct, so it can supply semantically associated knowledge for electronic reading and thereby make reading more convenient, intelligent, and user-friendly. However, Chinese knowledge graphs are still under construction, and they are general-purpose. A reading domain knowledge graph therefore needs to be built for each book.
Summary of the invention
To address the shallow level of knowledge and the insufficiently intelligent knowledge recommendation in current electronic reading, the invention proposes a method that draws on a general knowledge graph to construct a book-oriented domain knowledge graph, building a knowledge network for an e-book so that the words in the book can be explained and knowledge can be recommended intelligently.
The book-oriented reading domain knowledge graph construction method proposed by the invention uses an existing general knowledge graph to identify and annotate the core entities and concepts in a book and to mine the semantic relations between entities and concepts, thereby constructing the domain knowledge graph of the book. When a reader selects an annotated core entity to query, semantically related knowledge is retrieved from the domain knowledge graph and recommended to the reader. The method comprises three parts (three modules): general knowledge graph construction, domain knowledge graph construction, and intelligent reading application. The architecture of the method is shown in Figure 1.
I. General knowledge graph construction
A knowledge graph is a semantic network composed of a massive number of entities and concepts and the semantic relations between them. Because a knowledge graph can provide comprehensive, interlinked knowledge and explanations about an entity, the general knowledge graph is used to build a domain knowledge graph for a book, so that the words and knowledge points in the book can be explained appropriately.
Existing Chinese knowledge graphs include Google's Chinese Knowledge Graph, Baidu's knowledge graph, and Sogou's Zhilifang (Knowledge Cube). Existing knowledge sources are used to build the knowledge base for the book domain knowledge graph: the Chinese entities, concepts, and relations of Baidu Baike, Hudong Baike, and Wikipedia are collected, then integrated and cleaned to obtain a high-quality general Chinese knowledge graph.
II. Domain knowledge graph construction
This module uses the general knowledge graph and an iterative method to keep expanding the core concepts and core entities, then mines the semantic relations between entities, thereby building the domain knowledge graph. It is carried out in two steps: concept and entity recognition, and relation extraction.
Concept and entity recognition
The goal of concept recognition is to identify all concepts closely related to the book. The invention achieves this using the open category information of entities in the general knowledge graph.
Book keyword definition
First, to identify the concepts related to a book, a small number of keywords closely related to the book are defined manually. A keyword may be the book title or a keyword within the title. This step yields the keyword set KEYWORD (definition: the set of keywords related to the book title).
Seed concept recognition
A seed concept is a concept in the knowledge graph whose name directly contains a keyword string. Every concept in the knowledge graph that contains a keyword string is added to the category seed concept set SEEDCONCEPT (definition: the set of concepts in the knowledge graph that contain a keyword from KEYWORD as a substring).
Iterative expansion of concepts and entities
Iterative expansion starts from the seed concepts and expands, from the general knowledge graph, all concepts and entities related to the book. It is implemented as follows; the expansion flow chart is shown in Figure 2:
First, the entities belonging to the concepts in the seed concept set SEEDCONCEPT are obtained and added to the core entity set COREENTITY (definition: the set of entities under the seed concepts).
Second, the core entities in COREENTITY are scanned. This yields concepts that are not in SEEDCONCEPT, called candidate concepts, which are added to the candidate concept set CANDIDATECONCEPT (definition: the set of concepts that a core entity belongs to but that do not appear in the core concept set).
Then, the semantic relatedness between each candidate concept in CANDIDATECONCEPT and the core concept set CORECONCEPT is calculated (definition: the core concept set is the set of concepts closely related to the book, made up of the seed concepts and the concepts highly similar to them). Candidate concepts whose relatedness exceeds a given threshold (the semantic relatedness threshold: a concept is considered semantically related to the set if its relatedness is greater than this value) are treated as related concepts and added to CORECONCEPT. The semantic relatedness between a candidate concept c (any candidate concept) and the core concept set CS (that is, CORECONCEPT) is written Rel(c, CS).
In this definition, Num(c, k) is the number of entities that belong to both category c and category k, Num(c) and Num(k) are the numbers of entities that belong to category c and to category k respectively, and c and k are open categories of entities in the knowledge graph.
Finally, CORECONCEPT and COREENTITY are expanded incrementally and iteratively until no new concept or entity is produced, which yields all concepts and entities related to the book.
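The expansion loop can be sketched as a fixed-point iteration. The text defines Rel(c, CS) only through the counts Num(c, k), Num(c), and Num(k); how they are combined is not reproduced here, so the sketch below assumes a Jaccard-style ratio maximized over the core concepts. That choice, the function names, the sample data, and the threshold value are all assumptions made for illustration.

```python
from collections import defaultdict


def jaccard_rel(c, core_concepts, members):
    """ASSUMED relatedness: max over k of Num(c,k) / (Num(c) + Num(k) - Num(c,k))."""
    best = 0.0
    for k in core_concepts:
        both = len(members[c] & members[k])               # Num(c, k)
        union = len(members[c]) + len(members[k]) - both  # entities in c or k
        if union:
            best = max(best, both / union)
    return best


def expand(seed_concepts, members, threshold):
    """Grow CORECONCEPT and COREENTITY until neither set changes."""
    # Invert the concept -> entities map into entity -> concepts.
    concepts_of = defaultdict(set)
    for concept, entities in members.items():
        for e in entities:
            concepts_of[e].add(concept)

    core_concepts = set(seed_concepts)
    core_entities = set()
    while True:
        # Entities under any core concept become core entities.
        under_core = set().union(*(members.get(k, set()) for k in core_concepts))
        new_entities = under_core - core_entities
        core_entities |= new_entities
        # Concepts of core entities that are not yet core are the candidates.
        candidates = set().union(*(concepts_of[e] for e in core_entities)) - core_concepts
        accepted = {c for c in candidates
                    if jaccard_rel(c, core_concepts, members) > threshold}
        core_concepts |= accepted
        if not new_entities and not accepted:
            return core_concepts, core_entities


# Hypothetical category membership (concept -> entities).
MEMBERS = {
    "Red Mansions characters": {"Lin Daiyu", "Jia Baoyu", "Wang Xifeng"},
    "Jia family": {"Jia Baoyu", "Wang Xifeng", "Jia Zheng"},
    "Fictional characters": {"Lin Daiyu", "Sun Wukong", "Zhu Bajie",
                             "Hamlet", "Sherlock Holmes", "Harry Potter"},
}
CORECONCEPT, COREENTITY = expand({"Red Mansions characters"}, MEMBERS, threshold=0.3)
```

With this sample data the closely overlapping category is absorbed and contributes one further entity, while the broad category stays below the threshold and is left out.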
However, these entities and concepts may include some fairly generic ones that are only weakly related to the topic, so they need to be cleaned. Cleaning is done by computing the IDF value of each entity or concept and treating entities or concepts with a low IDF value as noise. The IDF value is computed from the following quantities:
Num is the total number of entities in the knowledge graph, Num(e) is the number of entities in the knowledge graph that contain a link to entity e, and Num(c) is the number of entities in the knowledge graph that carry category c. Here e denotes an entity in the knowledge graph and c denotes an open category of entities in the knowledge graph. This penalizes highly generic entities and concepts, so that only the most relevant entities and concepts are retained.
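The cleaning step can be illustrated with the textbook inverse-document-frequency form, log(Num / Num(x)). Whether the patent uses exactly this form (log base, smoothing) is not recoverable from the text, so the formula, the cutoff min_idf, and the sample counts below are assumptions.

```python
import math


def idf(total, containing):
    """Textbook IDF: log(total / containing). ASSUMED form, see the note above."""
    return math.log(total / containing)


def drop_generic(items, counts, total, min_idf):
    """Keep only items whose IDF reaches min_idf; low-IDF items are treated as noise."""
    return {x for x in items if idf(total, counts[x]) >= min_idf}


# Hypothetical counts: Num = 1000 entities; Num(e) = entities linking to e.
NUM = 1000
LINK_COUNTS = {"Lin Daiyu": 20, "China": 600}
KEPT = drop_generic({"Lin Daiyu", "China"}, LINK_COUNTS, NUM, min_idf=1.0)
```

An entity linked from most of the graph, such as a country name, scores near zero and is dropped, while a book-specific character scores high and is kept.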
Entity and concept relation extraction
Relation extraction builds semantic relations for the entities and concepts already obtained and is an important step in constructing the knowledge graph. An entity relation is expressed as a triple <source, relation, target>, where source is the source entity, target is the target entity, and relation is the entity relation; r denotes the set of entity relation descriptions. A relation is relevant to the book if the source or the target of its triple is in COREENTITY. Drawing on the general knowledge graph, the invention mainly uses two relation extraction methods: an Infobox-based method (definition: an Infobox is the attribute table of an entity in the knowledge graph) and a pattern-based method.
Infobox-based relation extraction
An Infobox describes the basic attribute information of an entity in tabular form. An Infobox row, <entity, attribute, value>, has the same form as an entity relation: entity corresponds to source, attribute corresponds to relation, and value corresponds to target, where entity is the entity, attribute is an attribute of the entity, and value is the corresponding attribute value. The Infobox table is checked first: if entity or value belongs to COREENTITY, the row is added to the set R (definition: the set of entity relation triples), and attribute is added to the entity relation description set r.
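The Infobox check maps directly onto a single pass over the rows. The sketch below assumes the Infobox data is already available as (entity, attribute, value) tuples; the function name and the sample rows are hypothetical.

```python
def extract_infobox_relations(infobox_rows, core_entities):
    """Collect the triple set R and the relation description set r from Infobox rows."""
    R, r = set(), set()
    for entity, attribute, value in infobox_rows:
        # A row is book-relevant if either end of the triple is a core entity.
        if entity in core_entities or value in core_entities:
            R.add((entity, attribute, value))
            r.add(attribute)
    return R, r


# Hypothetical rows; only the first touches a core entity.
ROWS = [
    ("Lin Daiyu", "father", "Lin Ruhai"),
    ("Beijing", "country", "China"),
]
R, r = extract_infobox_relations(ROWS, core_entities={"Lin Daiyu"})
```

The set r collected here is what the pattern-based method reuses as its vocabulary of relation description words.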
Pattern-based relation extraction
The Infobox yields the entity relation triple set R and the entity relation description set r. To mine more relations, the invention also uses a pattern-based relation extraction method.
The main challenge in entity relation extraction is extracting the relation description, and the Infobox-based method has already produced the relation description set r. Natural language processing combined with Chinese word segmentation is therefore used here to identify entities: the position of a relation description is first located in a sentence, and the nearest core entity or noun entity is then sought in each direction, forward and backward. Under this extraction pattern, the relation description word together with the nearest entity before it and the nearest entity after it forms one relation triple.
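The nearest-entity pattern can be sketched over a sentence that has already been segmented and part-of-speech tagged. The sketch assumes tokens arrive as (word, tag) pairs, that the tag nr marks a person name, and that the entity before the relation word is the source and the one after it is the target; the segmenter itself, the function name, and the English sample sentence are placeholders.

```python
def extract_by_pattern(tokens, relation_words, core_entities):
    """For each relation word, pair the nearest entity before it with the nearest after it."""

    def is_entity(word, tag):
        # ASSUMPTION: a core entity, or any token tagged as a person name (nr).
        return word in core_entities or tag == "nr"

    triples = []
    for i, (word, _tag) in enumerate(tokens):
        if word not in relation_words:
            continue
        before = next((w for w, t in reversed(tokens[:i]) if is_entity(w, t)), None)
        after = next((w for w, t in tokens[i + 1:] if is_entity(w, t)), None)
        if before and after:
            triples.append((before, word, after))
    return triples


# Hypothetical pre-segmented sentence: "Jia Baoyu's father is Jia Zheng".
TOKENS = [("Jia Baoyu", "nr"), ("'s", "uj"), ("father", "n"),
          ("is", "v"), ("Jia Zheng", "nr")]
TRIPLES = extract_by_pattern(TOKENS, {"father"}, core_entities={"Jia Baoyu"})
```

Because the search stops at the nearest entity on each side, a sentence with several names between the relation word and its true argument can produce a wrong triple, which is a known limit of this kind of pattern.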
In particular, character relations in a book are extracted with the extraction patterns in Table 1. The corpus text is the summary card ("business card") introduction of each entity, and r here denotes the set of character relations (for example "father", "husband", "wife"):
Table 1. Character relation extraction patterns
Note: ** denotes any word; /nr, /u, /uj, and /v are part-of-speech tags after Chinese word segmentation; and {r} is one relation description word in the relation description set r.
Entity coreference relation extraction
Some entities in a book have aliases or special forms of address that all refer to the same entity. To identify these referring entities, coreference resolution is needed. It relies mainly on the synonym mappings of entities in the knowledge graph and on the synonym-describing attributes in entity Infobox tables ("alias", "former name", "formal name", "pseudonym", and so on) to link each referring entity to its core entity.
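Alias linking amounts to building a lookup table from referring names to core entities. The sketch below assumes Infobox rows are (entity, attribute, value) tuples and that the graph's synonym mappings arrive as groups of interchangeable names; the attribute labels follow the text, while the function names and sample names are illustrative.

```python
SYNONYM_ATTRIBUTES = {"alias", "former name", "formal name", "pseudonym"}


def build_alias_map(infobox_rows, synonym_groups, core_entities):
    """Map each alias to the core entity it refers to."""
    alias_to_core = {}
    # Source 1: synonym-describing attributes in the Infobox of a core entity.
    for entity, attribute, value in infobox_rows:
        if entity in core_entities and attribute in SYNONYM_ATTRIBUTES:
            alias_to_core[value] = entity
    # Source 2: synonym groups that contain a core entity.
    for group in synonym_groups:
        cores = sorted(name for name in group if name in core_entities)
        if cores:
            for name in group:
                if name not in core_entities:
                    alias_to_core.setdefault(name, cores[0])
    return alias_to_core


def resolve(mention, alias_to_core):
    """Return the core entity for a mention, or the mention itself if it is no alias."""
    return alias_to_core.get(mention, mention)


# Hypothetical data for one character.
ALIASES = build_alias_map(
    infobox_rows=[("Wang Xifeng", "alias", "Sister Feng")],
    synonym_groups=[{"Wang Xifeng", "Feng Lazi"}],
    core_entities={"Wang Xifeng", "Lin Daiyu"},
)
```

Calling resolve("Sister Feng", ALIASES) then returns the core entity name, so annotation and explanation can treat every alias as the same node.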
III. Intelligent reading application
The main purpose of this module is to annotate the entities in the e-book and to map them to entity knowledge in the book domain knowledge graph. When a reader selects an entity in the book, the corresponding knowledge is selected from the book domain knowledge graph, recommended, and displayed. The module comprises entity annotation and entity explanation:
Entity annotation marks the entities of the core entity set COREENTITY in the e-book. To improve annotation accuracy and speed, the entities are sorted by length and annotated in order from longest to shortest, which avoids the annotation errors caused when one entity name contains another;
Entity explanation finds, for each entity annotated in the e-book, the corresponding explanation in the knowledge graph. When the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended.
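Longest-first annotation can be sketched as a greedy matcher that never lets a shorter name claim characters already covered by a longer one. The text does not specify the matching procedure, so the exact string search, the function name, and the English sample sentence below are assumptions.

```python
def annotate(text, entities):
    """Return sorted (start, end, entity) spans, matching longer entity names first."""
    taken = [False] * len(text)  # characters already covered by an annotation
    spans = []
    for name in sorted(entities, key=len, reverse=True):
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            if not any(taken[start:end]):
                spans.append((start, end, name))
                taken[start:end] = [True] * (end - start)
            start = text.find(name, start + 1)
    return sorted(spans)


# Hypothetical sample: "Baoyu" is contained in "Jia Baoyu".
TEXT = "Jia Baoyu met Lin Daiyu; Baoyu smiled."
SPANS = annotate(TEXT, ["Baoyu", "Jia Baoyu", "Lin Daiyu"])
```

The first occurrence is annotated as the full name and only the standalone second occurrence is annotated with the short name, which is the containment error the longest-first order is meant to avoid.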
In summary, the book-oriented domain knowledge graph constructed by the invention supports accurate recognition and extraction of the entities, concepts, and entity relations in an e-book and accurate annotation of core entities. Combined with the knowledge graph, it provides accurate, intelligent knowledge recommendation for the entities in an e-book and greatly improves the convenience and intelligibility of books. No existing electronic reading system implements this function.
Based on the above, the book-oriented reading domain knowledge graph construction method of the invention is summarized as follows:
(1) Given an e-book and a general knowledge graph, identify and extract the knowledge relevant to the e-book in order to provide intelligent knowledge recommendation. This relevant knowledge includes entities, concepts, their explanations, and the related semantic relations, which together form the semantic network relevant to the book, that is, the book domain knowledge graph.
(2) From the constructed domain knowledge graph and the e-book, generate the intelligent reading system: annotate the core vocabulary in the e-book (entities and concepts in the knowledge graph) and link the vocabulary explanations in the knowledge graph to the e-book. When the reader requests an explanation of a word, semantically related explanations are selected from the domain knowledge graph and recommended.
The book domain knowledge graph construction method of step (1) comprises the following steps:
(a) Concept and entity recognition uses the category information in the general knowledge graph: book keywords are defined first, the seed concepts related to the book are then obtained, and concepts and entities are then expanded iteratively. Whether a candidate concept is added to the core concept set is decided by the relatedness Rel(c, CS) defined between the candidate concept and the core concept set, where Num(c, k) is the number of entities that belong to both category c and category k, and Num(c) and Num(k) are the numbers of entities under category c and category k respectively.
Finally, the resulting entities and concepts are cleaned using the IDF measure to obtain the entities and concepts closely related to the book.
(b) Entity and concept relation extraction uses the entity Infobox information in the general knowledge graph to extract the description set {relation} of the relation element of the triple <source, relation, target>, together with part of the entity relations. A pattern-based relation extraction method is then applied to the text of the entities in the general knowledge graph to extract more relations.
(c) Entity coreference relations are obtained mainly from the synonym information of entities in the general knowledge graph and the synonym-describing attributes in Infobox tables, linking each referring entity to the core entity it refers to.
The intelligent reading system generation method of step (2) comprises the following steps:
(a) The vocabulary that needs explanation is annotated in the e-book: the entities and concepts in the book domain knowledge graph are sorted by character length and matched and annotated in the e-book from longest to shortest.
(b) The knowledge about the book's entities and concepts is extracted from the general knowledge graph and integrated into the domain knowledge graph of the book, and the book's vocabulary is then linked to the relevant knowledge.
Brief description of the drawings
Fig. 1 is the architecture diagram of the book-oriented reading domain knowledge graph.
Fig. 2 is the flow chart of concept and entity extraction.
Fig. 3 shows the effect of entity annotation and knowledge recommendation for the book A Dream of Red Mansions.
Fig. 4 shows part of the A Dream of Red Mansions character relation graph obtained with the relation extraction method.
Detailed description of the invention
The invention is described further below, taking the e-book A Dream of Red Mansions as an example:
Module 1: General knowledge graph construction
Baidu's Chinese knowledge graph is used as the knowledge source, supplemented by Hudong Baike and Chinese Wikipedia. The encyclopedia data are crawled and parsed, and the resulting encyclopedia entities are integrated and cleaned to obtain high-quality entities, concepts, and entity relations, from which the general knowledge graph is constructed.
Module 2: Domain knowledge graph construction
1. Entity and concept extraction
First, for the e-book A Dream of Red Mansions, the keyword set is defined manually as KEYWORD = {"A Dream of Red Mansions"}. The entity categories in the knowledge graph are then searched for categories containing the keyword "A Dream of Red Mansions", giving the core concept set CORECONCEPT = {"A Dream of Red Mansions", "A Dream of Red Mansions characters", "A Dream of Red Mansions costumes", ...}. Second, the entities belonging to the core concepts are looked up in the knowledge graph to form the core entity set COREENTITY = {"Jia Fu", "Baoyu", "Lin Daiyu", ...}. The concepts that the entities in COREENTITY belong to but that are not in CORECONCEPT are added to CANDIDATECONCEPT. For each concept in CANDIDATECONCEPT, its semantic relatedness Rel(c, CS) to CORECONCEPT is calculated, and the concepts whose relatedness exceeds the threshold are added to CORECONCEPT. Finally, COREENTITY and CORECONCEPT are expanded iteratively until convergence, when no new core entity or concept joins either set. The scale of the entities related to the book A Dream of Red Mansions is shown in Table 2.
2. Relation extraction
First, relations are extracted with the Infobox-based method: for each Infobox row <entity, attribute, value>, check whether entity or value belongs to the core entity set. If so, the triple <entity, attribute, value> is added to the set R and the relation description attribute is added to the relation description set r. For example, <Lin Daiyu, father, Lin Ruhai> yields the relation "Lin Daiyu-father-Lin Ruhai" and the relation description "father".
Second, the pattern-based method is used to extend the relations from text. For example, with the relation description "father", the entity relation "Jia Baoyu-father-Jia Zheng" can be extracted.
Then, the patterns in Table 1 are used to extract the character relations of A Dream of Red Mansions from descriptive text. The scale of the resulting character relation graph is shown in Table 2, and the character relation graphs centered on "Wang Xifeng" and "Lin Daiyu" are shown in Figure 4.
Finally, coreferent entities in the book are handled. In A Dream of Red Mansions, for example, "Sister Feng" and "Feng Lazi" both refer to "Wang Xifeng". To identify such referring entities, the synonym mappings of entities in the knowledge graph and the synonym-describing attributes in entity Infobox tables ("alias", "former name", "formal name", "pseudonym", and so on) are used to associate each referring entity with its core entity.
The scale of the knowledge graph built with the entity and concept recognition and relation extraction methods of this module is shown in Table 2.
Table 2. Scale of the A Dream of Red Mansions domain knowledge graph
Domain knowledge graph | Entities | Entity relations | Concepts |
---|---|---|---|
A Dream of Red Mansions entity graph | 1560 | 2731 | 85 |
A Dream of Red Mansions character subgraph | 804 | 1530 | 2 |
Module 3: Intelligent reading application
The entities related to A Dream of Red Mansions obtained by Module 2 are sorted by length from longest to shortest and annotated in that order in the A Dream of Red Mansions e-book. This avoids the errors caused when one entity name contains another and improves both the accuracy and the efficiency of annotation. The annotation result is shown in Figure 3: related entities in A Dream of Red Mansions such as "Baoyu" and "Lin Daiyu" are all annotated accurately.
For each entity annotated in the A Dream of Red Mansions e-book, the corresponding explanation is found in the knowledge graph, and when the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended. Figure 3 shows the explanation for the entity "Lin Daiyu" in A Dream of Red Mansions.
Claims (3)
1. A method for constructing a book-oriented reading domain knowledge graph, characterized in that it comprises the following steps: general knowledge graph construction, domain knowledge graph construction, and intelligent reading application;
I. General knowledge graph construction
A knowledge graph is a semantic network composed of a massive number of entities and concepts and the semantic relations between them; the general knowledge graph is used to build a domain knowledge graph for a book, so that the words and knowledge points in the book can be explained appropriately; the general knowledge graph is built using existing Chinese knowledge graphs, including Google's Chinese Knowledge Graph, Baidu's knowledge graph, and Sogou's Zhilifang (Knowledge Cube), as existing knowledge sources;
II. Domain knowledge graph construction
Using the general knowledge graph, an iterative method keeps expanding the core concepts and core entities, and the semantic relations between entities are then mined, thereby building the domain knowledge graph; this comprises concept and entity recognition and relation extraction:
2.1 Concept and entity recognition
The goal of concept recognition is to identify all concepts closely related to the book; concept recognition is achieved using the open category information of entities in the general knowledge graph;
2.1.1 Book keyword definition
First, to identify the concepts related to a book, a small number of keywords closely related to the book are defined manually; a keyword is the book title or a keyword within the title; this step yields the keyword set KEYWORD;
2.1.2 Seed concept recognition
A seed concept is a concept in the knowledge graph whose name directly contains a keyword string; every concept in the knowledge graph that contains a keyword string is added to the category seed concept set SEEDCONCEPT, which is the set of concepts in the knowledge graph that contain a keyword string from the set KEYWORD;
2.1.3 Iterative expansion of concepts and entities
Iterative expansion starts from the seed concepts and expands, from the general knowledge graph, all concepts and entities related to the book; the specific method is as follows:
First, the entities belonging to the concepts in the seed concept set SEEDCONCEPT are obtained and added to the core entity set COREENTITY, which is the set of entities under the seed concepts;
Second, the core entities in COREENTITY are scanned, yielding concepts that are not in SEEDCONCEPT, called candidate concepts, which are added to the candidate concept set CANDIDATECONCEPT; the candidate concept set is the set of concepts that a core entity belongs to but that do not appear in the core concept set;
Then, the semantic relatedness between each candidate concept in CANDIDATECONCEPT and the core concept set CORECONCEPT is calculated, the core concept set being the set of concepts closely related to the book, made up of the seed concepts and the concepts highly similar to them; candidate concepts whose relatedness exceeds a given threshold are treated as related concepts and added to CORECONCEPT; the semantic relatedness between a candidate concept c and the core concept set CS is written Rel(c, CS);
where Num(c, k) is the number of entities that belong to both category c and category k, and Num(c) and Num(k) are the numbers of entities that belong to category c and to category k respectively;
Finally, CORECONCEPT and COREENTITY are expanded incrementally and iteratively until no new concept or entity is produced, which yields all concepts and entities related to the book;
2.2 Entity and concept relation extraction
Relation extraction builds semantic relations for the entities and concepts already obtained; an entity relation is expressed as a triple <source, relation, target>, where r denotes the set of entity relation descriptions; a relation is relevant to the book if the source or the target of its triple is in COREENTITY; drawing on the general knowledge graph, two relation extraction methods are used: an Infobox-based method and a pattern-based method;
2.2.1 Infobox-based relation extraction
An Infobox is the attribute table of an entity and describes the basic attribute information of the entity in tabular form; an Infobox row, <entity, attribute, value>, has the same form as an entity relation: entity corresponds to source, attribute corresponds to relation, and value corresponds to target; the Infobox table is checked first, and if entity or value belongs to COREENTITY, the row is added to the set R, which is the set of entity relation triples, and attribute is added to the entity relation description set r;
2.2.2 Pattern-based relation extraction
The Infobox-based method has already produced the relation description set r; pattern-based relation extraction uses natural language processing combined with Chinese word segmentation to identify entities: the position of a relation description is first located in a sentence, and the nearest core entity or noun entity is then sought in each direction, forward and backward; under this extraction pattern, the relation description word together with the nearest entity before it and the nearest entity after it forms one relation triple;
2.2.3 Entity coreference relation extraction
Some entities in a book have aliases or special forms of address that all refer to the same entity; to identify these referring entities, coreference resolution is needed, which uses the synonym mappings of entities in the knowledge graph and the synonym-describing attributes in entity Infobox tables, including "alias", "former name", "formal name", and "pseudonym", to associate each referring entity with its core entity;
III. Intelligent reading application
This comprises entity annotation and entity explanation:
Entity annotation marks the entities of the core entity set COREENTITY in the e-book; to improve annotation accuracy and speed, the entities are sorted by length and annotated in order from longest to shortest, which avoids the annotation errors caused when one entity name contains another;
Entity explanation finds, for each entity annotated in the e-book, the corresponding explanation in the knowledge graph; when the user selects an entity that needs explaining, the corresponding knowledge is selected and recommended.
2. The book-oriented reading-domain knowledge graph construction method according to claim 1, characterized in that, in the concept and entity iterative expansion step, the expanded entities and concepts may include some relatively common entities and concepts that are not strongly related to the topic and therefore need to be cleaned out. Cleaning is performed by computing the IDF value of each entity or concept, and entities or concepts with a low IDF value are treated as noise. The IDF value is computed from the following quantities: Num denotes the total number of entities in the knowledge graph, Num(e) denotes the number of entities in the knowledge graph that contain a link to entity e, and Num(c) denotes the number of entities in the knowledge graph that belong to category c.
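The formula itself is not reproduced in this text. Assuming the conventional definition of inverse document frequency, it would take the form IDF(e) = log(Num / Num(e)) for an entity and IDF(c) = log(Num / Num(c)) for a concept. The sketch below applies that assumed form; the function names and the cut-off parameter `threshold` are hypothetical, since the text gives no threshold value.

```python
import math
from typing import Dict, Set


def idf(total: int, containing: int) -> float:
    """Assumed standard form: log(Num / Num(x))."""
    return math.log(total / containing)


def filter_by_idf(counts: Dict[str, int], total: int, threshold: float) -> Set[str]:
    """Keep entities or concepts whose IDF reaches the (hypothetical) threshold."""
    return {
        name
        for name, containing in counts.items()
        if containing > 0 and idf(total, containing) >= threshold
    }
```

Under this form, an item linked from half of all entities scores about 0.69 and would be dropped by any threshold above that, while an item linked from only a handful of entities scores much higher and is kept.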
3. The book-oriented reading-domain knowledge graph construction method according to claim 1, characterized in that, in the pattern-based relation extraction, character relations in books are extracted using the extraction patterns in Table 1; the corpus text is the entity's profile introduction, and r here denotes the set of character relations,
Table 1. Character relation extraction patterns
In the patterns, * denotes any word; /nr, /u, /uj, and /v are part-of-speech tags produced by Chinese word segmentation; and {r} is a relation description word from the relation description set r.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310420375.9A CN103488724B (en) | 2013-09-16 | 2013-09-16 | A kind of reading domain knowledge map construction method towards books |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103488724A (en) | 2014-01-01 |
CN103488724B (en) | 2016-09-28 |
Family
ID=49828950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310420375.9A Expired - Fee Related CN103488724B (en) | 2013-09-16 | 2013-09-16 | A kind of reading domain knowledge map construction method towards books |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103488724B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107423820A (en) * | 2016-05-24 | 2017-12-01 | 清华大学 | The knowledge mapping of binding entity stratigraphic classification represents learning method |
Families Citing this family (67)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102713B (en) * | 2014-07-16 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | Recommendation results show method and apparatus |
CN104133916B (en) * | 2014-08-14 | 2019-01-15 | 百度在线网络技术(北京)有限公司 | Search result information method for organizing and device |
CN104462227A (en) * | 2014-11-13 | 2015-03-25 | 中国测绘科学研究院 | Automatic construction method of graphic knowledge genealogy |
CN104408148B (en) * | 2014-12-03 | 2017-12-01 | 复旦大学 | A kind of field encyclopaedia constructing system based on general encyclopaedia website |
CN105117115B (en) * | 2015-08-07 | 2018-05-08 | 小米科技有限责任公司 | A kind of method and apparatus for showing electronic document |
CN105574098B (en) * | 2015-12-11 | 2019-02-12 | 百度在线网络技术(北京)有限公司 | The generation method and device of knowledge mapping, entity control methods and device |
CN105824802B (en) * | 2016-03-31 | 2018-10-30 | 清华大学 | It is a kind of to obtain the method and device that knowledge mapping vectorization indicates |
CN105912656B (en) * | 2016-04-07 | 2020-03-17 | 桂林电子科技大学 | Method for constructing commodity knowledge graph |
CN107391512B (en) * | 2016-05-17 | 2021-05-11 | 北京邮电大学 | Method and device for predicting knowledge graph |
CN106095748B (en) * | 2016-06-06 | 2019-08-27 | 东软集团股份有限公司 | A kind of method and device generating event relation map |
CN107894884A (en) * | 2016-09-30 | 2018-04-10 | 中国电子科技集团公司信息科学研究院 | Object representation device and its description method |
WO2018076348A1 (en) * | 2016-10-31 | 2018-05-03 | Microsoft Technology Licensing, Llc | Building and updating a connected segment graph |
CN108073587B (en) * | 2016-11-09 | 2022-05-27 | 阿里巴巴集团控股有限公司 | Automatic question answering method and device and electronic equipment |
CN106776564B (en) * | 2016-12-21 | 2020-04-24 | 张永成 | Semantic recognition method and system based on knowledge graph |
CN106874378B (en) * | 2017-01-05 | 2020-06-02 | 北京工商大学 | Method for constructing knowledge graph based on entity extraction and relation mining of rule model |
US10579689B2 (en) | 2017-02-08 | 2020-03-03 | International Business Machines Corporation | Visualization and augmentation of human knowledge construction during material consumption |
CN106875014B (en) * | 2017-03-02 | 2021-06-15 | 上海交通大学 | Automatic construction implementation method of software engineering knowledge base based on semi-supervised learning |
CN106934032B (en) * | 2017-03-14 | 2019-10-18 | 北京软通智城科技有限公司 | A kind of city knowledge mapping construction method and device |
CN106933806A (en) * | 2017-03-15 | 2017-07-07 | 北京大数医达科技有限公司 | The determination method and apparatus of medical synonym |
CN106951526B (en) * | 2017-03-21 | 2020-08-07 | 北京邮电大学 | Entity set extension method and device |
CN108694177B (en) * | 2017-04-06 | 2022-02-18 | 北大方正集团有限公司 | Knowledge graph construction method and system |
CN107038261B (en) * | 2017-05-28 | 2019-09-20 | 海南大学 | A kind of processing framework resource based on data map, Information Atlas and knowledge mapping can Dynamic and Abstract Semantic Modeling Method |
CN107103100B (en) * | 2017-06-10 | 2019-07-30 | 海南大学 | A kind of fault-tolerant intelligent semantic searching method based on map framework |
CN109241289A (en) * | 2017-07-04 | 2019-01-18 | 北京国双科技有限公司 | Entity information map extending method and device |
CN107330125B (en) * | 2017-07-20 | 2020-06-30 | 云南电网有限责任公司电力科学研究院 | Mass unstructured distribution network data integration method based on knowledge graph technology |
CN107609052B (en) * | 2017-08-23 | 2019-09-24 | 中国科学院软件研究所 | A kind of generation method and device of the domain knowledge map based on semantic triangle |
CN107704637B (en) * | 2017-11-20 | 2019-12-13 | 中国人民解放军国防科技大学 | knowledge graph construction method for emergency |
CN108052576B (en) * | 2017-12-08 | 2021-04-23 | 国家计算机网络与信息安全管理中心 | Method and system for constructing affair knowledge graph |
CN108182245A (en) * | 2017-12-28 | 2018-06-19 | 北京锐安科技有限公司 | The construction method and device of people's object properties classificating knowledge collection of illustrative plates |
CN108171213A (en) * | 2018-01-22 | 2018-06-15 | 北京邮电大学 | A kind of Relation extraction method for being applicable in picture and text knowledge mapping |
CN108415971B (en) * | 2018-02-08 | 2021-07-23 | 兰州智豆信息科技有限公司 | Method and device for recommending supply and demand information by using knowledge graph |
CN108959242B (en) * | 2018-05-08 | 2021-07-27 | 中国科学院信息工程研究所 | Target entity identification method and device based on part-of-speech characteristics of Chinese characters |
CN108710695B (en) * | 2018-05-23 | 2019-08-06 | 掌阅科技股份有限公司 | Mind map generation method and electronic equipment based on e-book |
CN108920527A (en) * | 2018-06-07 | 2018-11-30 | 桂林电子科技大学 | A kind of personalized recommendation method of knowledge based map |
CN108829865B (en) * | 2018-06-22 | 2021-04-09 | 海信集团有限公司 | Information retrieval method and device |
CN109189942B (en) * | 2018-09-12 | 2021-07-09 | 山东大学 | Construction method and device of patent data knowledge graph |
US10923114B2 (en) * | 2018-10-10 | 2021-02-16 | N3, Llc | Semantic jargon |
CN109522551B (en) * | 2018-11-09 | 2024-02-20 | 天津新开心生活科技有限公司 | Entity linking method and device, storage medium and electronic equipment |
CN109597895B (en) * | 2018-11-09 | 2021-10-22 | 中电科大数据研究院有限公司 | Knowledge graph-based official document searching method |
CN111209407B (en) * | 2018-11-21 | 2023-06-16 | 北京嘀嘀无限科技发展有限公司 | Data processing method, device, electronic equipment and computer readable storage medium |
CN111274812B (en) * | 2018-12-03 | 2023-04-18 | 阿里巴巴集团控股有限公司 | Figure relation recognition method, equipment and storage medium |
CN109766444B (en) * | 2018-12-10 | 2021-02-23 | 北京百度网讯科技有限公司 | Application database generation method and device of knowledge graph |
CN109739994B (en) * | 2018-12-14 | 2023-05-02 | 复旦大学 | API knowledge graph construction method based on reference document |
CN109726298B (en) * | 2019-01-08 | 2020-12-29 | 上海市研发公共服务平台管理中心 | Knowledge graph construction method, system, terminal and medium suitable for scientific and technical literature |
CN109885691A (en) * | 2019-01-08 | 2019-06-14 | 平安科技(深圳)有限公司 | Knowledge mapping complementing method, device, computer equipment and storage medium |
CN109710748B (en) * | 2019-01-17 | 2021-04-27 | 北京光年无限科技有限公司 | Intelligent robot-oriented picture book reading interaction method and system |
CN109885660B (en) * | 2019-02-22 | 2020-10-02 | 上海乐言信息科技有限公司 | Knowledge graph energizing question-answering system and method based on information retrieval |
CN110008354B (en) * | 2019-04-10 | 2022-06-07 | 华侨大学 | Method for constructing foreign Chinese learning content based on knowledge graph |
CN110162640A (en) * | 2019-04-28 | 2019-08-23 | 北京百度网讯科技有限公司 | Novel entities method for digging, device, computer equipment and storage medium |
CN110245239A (en) * | 2019-05-13 | 2019-09-17 | 吉林大学 | A kind of construction method and system towards automotive field knowledge mapping |
CN110275966B (en) * | 2019-07-01 | 2021-10-01 | 科大讯飞(苏州)科技有限公司 | Knowledge extraction method and device |
CN110532399A (en) * | 2019-08-07 | 2019-12-03 | 广州多益网络股份有限公司 | Knowledge mapping update method, system and the device of object game question answering system |
CN110598002A (en) * | 2019-08-14 | 2019-12-20 | 广州视源电子科技股份有限公司 | Knowledge graph library construction method and device, computer storage medium and electronic equipment |
CN110489032B (en) * | 2019-08-14 | 2021-08-24 | 掌阅科技股份有限公司 | Dictionary query method for electronic book and electronic equipment |
CN110929038B (en) * | 2019-10-18 | 2023-07-21 | 平安科技(深圳)有限公司 | Knowledge graph-based entity linking method, device, equipment and storage medium |
CN111091006B (en) * | 2019-12-20 | 2023-08-29 | 北京百度网讯科技有限公司 | Method, device, equipment and medium for establishing entity intention system |
TWI747220B (en) * | 2020-03-31 | 2021-11-21 | 股感生活金融科技股份有限公司 | Knowledge graph association search method and system |
CN111708892B (en) * | 2020-04-24 | 2021-08-03 | 陆洋 | Database system based on depth knowledge graph |
CN112100396B (en) * | 2020-08-28 | 2023-10-27 | 泰康保险集团股份有限公司 | Data processing method and device |
CN112256835B (en) * | 2020-10-29 | 2021-07-23 | 东南大学 | Subgraph extraction method for accurately describing element semantics in knowledge graph |
CN112463980A (en) * | 2020-11-25 | 2021-03-09 | 南京摄星智能科技有限公司 | Intelligent plan recommendation method based on knowledge graph |
CN112597285B (en) * | 2020-12-10 | 2021-08-10 | 太极计算机股份有限公司 | Man-machine interaction method and system based on knowledge graph |
CN112487212A (en) * | 2020-12-18 | 2021-03-12 | 清华大学 | Method and device for constructing domain knowledge graph |
CN112632226B (en) * | 2020-12-29 | 2021-10-26 | 天津汇智星源信息技术有限公司 | Semantic search method and device based on legal knowledge graph and electronic equipment |
CN112749284B (en) * | 2020-12-31 | 2021-12-17 | 平安科技(深圳)有限公司 | Knowledge graph construction method, device, equipment and storage medium |
CN113326697A (en) * | 2021-05-31 | 2021-08-31 | 云南电网有限责任公司电力科学研究院 | Knowledge graph-based electric power text entity semantic understanding method |
CN113297347A (en) * | 2021-06-29 | 2021-08-24 | 中国人民解放军国防科技大学 | Intelligent auxiliary method, system and storage medium for professional document reading |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6408302B1 (en) * | 1999-06-28 | 2002-06-18 | Davox Corporation | System and method of mapping database fields to a knowledge base using a graphical user interface |
CN1448863A (en) * | 2002-04-04 | 2003-10-15 | 迪吉科技有限公司 | Establishing, editing, searching of knowledge map and editing method of corresponding network information content |
CN1466046A (en) * | 2002-07-01 | 2004-01-07 | 财团法人资讯工业策进会 | Knowledge pattern system and method using ontologics as basis |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060277176A1 (en) * | 2005-06-01 | 2006-12-07 | Mydrew Inc. | System, method and apparatus of constructing user knowledge base for the purpose of creating an electronic marketplace over a public network |
Also Published As
Publication number | Publication date |
---|---|
CN103488724A (en) | 2014-01-01 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160928; Termination date: 20190916 |