KR101739540B1 - System and method for building integration knowledge base based - Google Patents
- Publication number
- KR101739540B1
- Authority
- KR
- South Korea
- Prior art keywords
- integrated
- data
- entity
- knowledge
- knowledge data
Classifications
- G06F17/30286
- G06F17/30289
- G06F17/30345
- G06F17/30604
Abstract
An integrated knowledge base construction system according to the present invention includes: an individual knowledge base building module that converts external data received from a first data server to generate first internal knowledge data; and a knowledge base integration module that integrates integrated knowledge data and the first internal knowledge data. The knowledge base integration module includes: an entity similarity analysis unit that compares the entities of the integrated knowledge data with a target entity of the first internal knowledge data, searches the integrated knowledge data to select integrated entity candidates, generates a similarity between the target entity and each integrated entity candidate, and selects an integrated entity from among the integrated entity candidates based on the similarity; and an integrated entity conversion unit that adds the data related to the target entity, among the first internal knowledge data, to the integrated knowledge data using integration information including the selection result for the integrated entity.
Description
Technical aspects of the present invention relate to an integrated knowledge base building system, and more particularly, to a system and method for building an integrated knowledge base including a knowledge base integration module.
The present invention is derived from research conducted by Saltlux Co., Ltd. as part of the SW Technology Computing Industry Source Technology Development Project (SW) of the Ministry of Science, ICT and Future Planning. [Research period: 2015.03.01 ~ 2016.02.29; research institute: Information and Communication Technology Promotion Center; research title: WiseKB: Development of self-learning knowledge base and reasoning technology based on big data understanding; 15-0054]
Korean knowledge LOD (Linked Open Data) is limited to certain specialized fields of knowledge. Knowledge bases that cover general knowledge, such as DBPedia, also exist, but they do not build a knowledge base from a variety of data sources. Therefore, there is a need for a means of integrating data from various data sources into one set of integrated knowledge data and building an integrated knowledge base.
The technical idea of the present invention is to provide a system and method for constructing an integrated knowledge base system.
An integrated knowledge base construction system according to the present invention includes: an individual knowledge base building module that converts external data received from a first data server to generate first internal knowledge data; and a knowledge base integration module that integrates integrated knowledge data and the first internal knowledge data. The knowledge base integration module includes: an entity similarity analysis unit that compares the entities of the integrated knowledge data with a target entity of the first internal knowledge data, searches the integrated knowledge data to select integrated entity candidates, generates a similarity between the target entity and each integrated entity candidate, and selects an integrated entity from among the integrated entity candidates based on the similarity; and an integrated entity conversion unit that adds the data related to the target entity, among the first internal knowledge data, to the integrated knowledge data based on integration information including the selection result for the integrated entity.
In addition, the individual knowledge base building module may convert external data received from a second data server to generate second internal knowledge data, and the knowledge base integration module may integrate the integrated knowledge data and the second internal knowledge data.
The entity similarity analysis unit may compare a predetermined attribute (or relation) of the target entity of the first internal knowledge data and its corresponding value (or entity) with the predetermined attribute (or relation) of each entity of the integrated knowledge data and its corresponding value (or entity), and may select, as an integrated entity candidate, each entity of the integrated knowledge data that has the same attribute (or relation) and the same value (or entity) as the target entity.
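The candidate search described above might be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes knowledge data is held as plain (entity, attribute, value) tuples, and the function name `find_candidates`, the `key_predicates` parameter, and the sample labels are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (entity, attribute or relation, value or entity)


def find_candidates(target: str,
                    internal: List[Triple],
                    integrated: List[Triple],
                    key_predicates: Set[str]) -> Set[str]:
    """Return integrated entities that share an (attribute, value) pair with the target.

    Only the attributes listed in key_predicates (the "predetermined" ones) are compared.
    """
    # (attribute, value) pairs of the target entity, restricted to the key attributes.
    target_pairs = {(p, o) for s, p, o in internal
                    if s == target and p in key_predicates}
    # Every integrated entity holding an identical pair becomes a candidate.
    return {s for s, p, o in integrated if (p, o) in target_pairs}


# Hypothetical sample data: identifiers follow the figures, labels are invented.
internal_kb = [("URI_2_A2", "label", "Seoul"), ("URI_2_A2", "P1", "N2")]
integrated_kb = [
    ("CURI_A", "label", "Seoul"), ("CURI_A", "P1", "N1"),
    ("CURI_B", "label", "Busan"),
    ("CURI_C", "label", "Seoul"),
]
candidates = find_candidates("URI_2_A2", internal_kb, integrated_kb, {"label"})
print(sorted(candidates))
```

Exact matching on a label-like attribute keeps the candidate set small before the more expensive similarity step; which attributes count as "predetermined" is left open by the text.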
The entity similarity analysis unit may generate a similarity between the target entity and each integrated entity candidate based on at least one of graph information, ontology information, and syntax information corresponding to the target entity and the integrated entity candidates, and may select the integrated entity from among the integrated entity candidates based on the generated similarity.
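One way to picture this scoring step is a weighted mix of three partial scores. The sketch below is an assumption throughout: the text does not say how graph, ontology, or syntax information is measured, so label string similarity, neighbor overlap, and type overlap are stand-ins, and the weights, field names, and sample data are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(target: Dict, candidate: Dict,
               weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted mix of syntax, graph, and ontology scores (all assumed measures)."""
    syntax = SequenceMatcher(None, target["label"], candidate["label"]).ratio()
    graph = jaccard(target["neighbors"], candidate["neighbors"])
    ontology = jaccard(target["types"], candidate["types"])
    w_syntax, w_graph, w_ontology = weights
    return w_syntax * syntax + w_graph * graph + w_ontology * ontology


def select_integrated(target: Dict, candidates: Dict[str, Dict],
                      reference: float) -> List[str]:
    """Return ids of candidates whose similarity exceeds the reference value, best first."""
    scores = {cid: similarity(target, c) for cid, c in candidates.items()}
    passed = [cid for cid, score in scores.items() if score > reference]
    return sorted(passed, key=lambda cid: scores[cid], reverse=True)


# Hypothetical descriptions of one target entity and two candidates.
target = {"label": "Seoul", "neighbors": {"Korea", "Han River"}, "types": {"City"}}
candidates = {
    "CURI_A": {"label": "Seoul", "neighbors": {"Korea", "Han River"}, "types": {"City"}},
    "CURI_C": {"label": "Seoul", "neighbors": {"Film"}, "types": {"Movie"}},
}
selected = select_integrated(target, candidates, reference=0.8)
print(selected)
```

Here the two candidates share a label, so only the graph and ontology scores separate the city from the film of the same name.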
The integrated entity conversion unit may convert the identifier of the target entity into the integrated identifier of the integrated entity.
Also, when the entity similarity analysis unit fails to select an integrated entity candidate of the integrated knowledge data, or fails to select the integrated entity from among the integrated entity candidates, the integrated entity conversion unit generates a new integrated identifier and converts the identifier of the target entity into the new integrated identifier.
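The two conversion cases (a matched integrated entity, or none) might look like the sketch below. It assumes triples are stored in a Python set and rewrites only the subject position; the `CURI_NEW_n` naming scheme and both function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def new_integrated_id(integrated: Set[Triple]) -> str:
    """Return the first unused identifier of the (hypothetical) form CURI_NEW_n."""
    used = {s for s, _, _ in integrated}
    n = 1
    while f"CURI_NEW_{n}" in used:
        n += 1
    return f"CURI_NEW_{n}"


def convert_and_add(target: str,
                    internal: List[Triple],
                    integrated: Set[Triple],
                    selected: Optional[str]) -> str:
    """Rewrite the target's triples under an integrated identifier and add them."""
    # Reuse the selected integrated entity, or mint a new identifier if none was selected.
    unified = selected if selected is not None else new_integrated_id(integrated)
    for s, p, o in internal:
        if s == target:
            integrated.add((unified, p, o))  # a set silently ignores exact duplicates
    return unified


integrated = {("CURI_A", "P1", "N1")}
internal = [("URI_2_A2", "P1", "N2"), ("URI_2_B2", "P1", "N9")]
matched = convert_and_add("URI_2_A2", internal, integrated, selected="CURI_A")
unmatched = convert_and_add("URI_2_B2", internal, integrated, selected=None)
print(matched, unmatched)
```

The same fallback covers the first build, when the integrated knowledge base is still empty and every entity receives a new integrated identifier.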
The integrated knowledge base building system may further include a curation module that, when a plurality of integrated entities are selected, selects one of the plurality of integrated entities based on input data received from the outside.
In addition, the integrated knowledge base building system may further include a curation module that adds or deletes entity-related data of the integrated knowledge data based on input data received from the outside to change the integrated knowledge data, detects the changed integrated knowledge data, and generates change data information.
The knowledge base integration module may further include an integrated knowledge data updating unit that stores the change data information received from the curation module and updates the integrated knowledge data based on the change data information.
The integrated knowledge data updating unit compares the changed data information with the selected integrated entity related data, and determines whether to add the integrated entity related data based on the comparison result.
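The interplay between change data information and later integration could be read as follows. This is only one possible interpretation, sketched under assumptions: the change log format, the function names, and the rule "do not re-add what a curator deleted" are all hypothetical and are not stated in the text.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Change = Tuple[str, Triple]  # ("add" or "delete", triple), an assumed log format


def apply_changes(integrated: Set[Triple], changes: List[Change]) -> None:
    """Apply curator changes to the integrated knowledge data in order."""
    for op, triple in changes:
        if op == "add":
            integrated.add(triple)
        elif op == "delete":
            integrated.discard(triple)
        else:
            raise ValueError(f"unknown change operation: {op}")


def should_add(triple: Triple, integrated: Set[Triple], changes: List[Change]) -> bool:
    """Decide whether an incoming triple should be added to the integrated data."""
    deleted = {t for op, t in changes if op == "delete"}
    # Skip triples already stored and triples a curator has removed.
    return triple not in integrated and triple not in deleted


integrated = {("CURI_A", "P1", "N1"), ("CURI_A", "P2", "M1")}
changes = [("delete", ("CURI_A", "P2", "M1")), ("add", ("CURI_A", "P3", "K1"))]
apply_changes(integrated, changes)

incoming = [("CURI_A", "P2", "M1"), ("CURI_A", "P1", "N1"), ("CURI_A", "P4", "Q1")]
accepted = [t for t in incoming if should_add(t, integrated, changes)]
integrated.update(accepted)
print(accepted)
```

Keeping the change log alongside the data is what lets a later integration run respect earlier curation instead of undoing it.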
According to the technical idea of the present invention, it is possible to continuously update the integrated knowledge data based on input data and to add external data received from a new data server to the integrated knowledge data.
FIG. 1 is a block diagram illustrating an integrated knowledge base building system and its input/output relationship according to an embodiment of the present invention.
FIGS. 2A and 2B are block diagrams illustrating an embodiment of an individual knowledge base building module according to an embodiment of the present invention.
FIG. 2C is a view for explaining an operation method of an individual knowledge base building module according to an embodiment of the present invention.
FIG. 3A is a block diagram illustrating an embodiment of a knowledge base integration module according to an embodiment of the present invention.
FIG. 3B is a diagram illustrating an integrated knowledge base including integrated knowledge data generated by a knowledge base integration module according to an embodiment of the present invention.
FIG. 4A is a block diagram illustrating an embodiment of an entity similarity analysis unit according to an embodiment of the present invention.
FIG. 4B is a diagram for explaining the operation of the integrated entity candidate search unit according to an embodiment of the present invention.
FIG. 4C is a block diagram illustrating an embodiment of an entity similarity calculation unit according to an embodiment of the present invention.
FIGS. 4D and 4E are views for explaining the operation of the integrated entity selecting unit according to an embodiment of the present invention.
FIG. 5A is a block diagram illustrating an example of an integrated knowledge base building system according to an embodiment of the present invention.
FIGS. 5B and 5C are views for explaining the operation of the curation module according to an embodiment of the present invention.
FIG. 6A is a block diagram illustrating an exemplary implementation of a knowledge base integration module and a curation module according to an embodiment of the present invention.
FIG. 6B is a view for explaining the operation of the curation module and the knowledge base integration module according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating an operation of building an integrated knowledge base according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. Embodiments of the present invention are provided to more fully describe the present invention to those skilled in the art. The present invention is capable of various modifications and various forms, and specific embodiments are illustrated and described in detail in the drawings. It should be understood, however, that the invention is not intended to be limited to the particular forms disclosed, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for similar elements in describing each drawing. In the accompanying drawings, the dimensions of the structures are enlarged or reduced from the actual dimensions for the sake of clarity of the present invention.
The terminology used in this application is used only to describe specific embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly dictates otherwise. In this application, terms such as "comprises" and "having" specify that a feature, number, step, operation, element, part, or combination thereof described in the specification is present, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.
Also, the terms first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms may be used for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component.
Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in the present application.
FIG. 1 is a block diagram illustrating an integrated knowledge base building system and its input/output relationship according to an embodiment of the present invention.
Linked Open Data (LOD) may refer to structured data published on a network in a format for representing resource information, such as RDF (Resource Description Framework). Such LOD may include a plurality of entities, each of which may be accessed and used via an identifier such as a Uniform Resource Identifier (URI) and may be exposed on the web using the HTTP protocol or the like. LOD enables data sharing between systems, so that a knowledge base containing a vast amount of data can be implemented. Hereinafter, knowledge data may refer to LOD, that is, data in which knowledge information is formalized.
1, the integrated knowledge
The
The individual knowledge
The knowledge
The knowledge
In one embodiment, the
According to the integrated knowledge
FIGs. 2A and 2B are block diagrams illustrating an embodiment of an individual knowledge base building module according to an embodiment of the present invention, and FIG. 2C is a diagram illustrating an operation method of an individual knowledge base module.
As shown in FIG. 2A, the individual knowledge
The inner knowledge
FIG. 2C illustrates an internal knowledge base 10, 20 including internal knowledge data in which the individual knowledge
Internal knowledge data included in each internal knowledge base can be expressed in the form of triples, as in RDF. For example, the first internal knowledge data may be expressed as a triple consisting of a first entity, a relation between the first entity and a second entity, and the second entity, or as a triple consisting of a first entity, an attribute (or data type), and a value. In addition, the identifier of the internal knowledge data may include path information for accessing an entity included in such a triple, so that not only the entity but also the knowledge data related to the entity, that is, its triple data, can be accessed through the identifier. An entity may also have a text value for an attribute called a label, so the knowledge data may include triple data consisting of an entity, a label, and a value. Although the identifier and the entity are shown separately in the drawings for convenience of explanation, the identifier is one means of expressing the identity of the entity, and the identifier may be the same as the entity. For example, the identifier 'URI_1_A1' of the first internal knowledge data of the first internal knowledge base 10 and the entity 'A1' that it indicates may be the same.
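The three triple shapes described above can be pictured with plain tuples. This is a minimal sketch, assuming an in-memory list of (subject, predicate, object) tuples; the relation name `R1`, the second identifier `URI_1_B1`, and the helper `index_by_identifier` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# One entity, addressed by its identifier, with the three triple shapes.
triples: List[Triple] = [
    ("URI_1_A1", "R1", "URI_1_B1"),  # entity - relation - entity
    ("URI_1_A1", "P1", "N1"),        # entity - attribute - value
    ("URI_1_A1", "label", "A1"),     # entity - label - text value
]


def index_by_identifier(data: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (predicate, object) pairs under the identifier of their subject."""
    index = defaultdict(list)
    for subject, predicate, obj in data:
        index[subject].append((predicate, obj))
    return dict(index)


index = index_by_identifier(triples)
print(index["URI_1_A1"])
```

Looking up an identifier returns every triple about that entity, which mirrors the idea that the identifier gives access to the entity's related knowledge data.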
The first internal knowledge data of the first internal knowledge base 10 and the second internal knowledge data of the second internal knowledge base 20 may each include a plurality of triple data, and the triple data of the second internal knowledge data can be accessed through the second type identifier. The internal knowledge
In one embodiment, the first internal knowledge data of the first internal knowledge base 10 may include data corresponding to five triples for the entity 'A1' accessed by the identifier 'URI_1_A1'. For example, the first internal knowledge data may include a plurality of triple data, such as the triple (A1-P1-N1), which means that the entity 'A1' has the value 'N1' for the attribute 'P1'. Likewise, the second internal knowledge data of the second internal knowledge base 20 may include data corresponding to five triples for the entity 'A2' accessed by the identifier 'URI_2_A2'. For example, the second internal knowledge data may include a plurality of triple data, such as the triple (A2-P1-N2), which means that the entity 'A2' has the value 'N2' for the attribute 'P1'. However, the first internal knowledge base 10 and the second internal knowledge base 20 shown are examples only and may include any number of triple data.
FIG. 3A is a block diagram illustrating an exemplary embodiment of a knowledge base integration module according to an exemplary embodiment of the present invention, and FIG. 3B is a diagram illustrating an integrated knowledge base including integrated knowledge data generated by the knowledge base integration module.
As shown in FIG. 3A, the integrated knowledge
Referring to FIG. 3B, the knowledge
For example, when the knowledge
The integrated knowledge data stored in the integrated
However, in order to integrate the knowledge data of a plurality of internal knowledge bases, the triple data of the integrated knowledge data is accessed through an integrated identifier. Accordingly, the integrated entity conversion unit 136 first generates new integrated identifiers and converts the first type identifiers of the first internal knowledge data of the first internal knowledge base 10 into the new integrated identifiers, respectively. As described above, the identifier and the entity may be the same concept, so converting a first type identifier into an integrated identifier may be the same as converting an entity of the first internal knowledge data into an integrated entity.
In one embodiment, the aggregate knowledge data of the integrated
FIG. 4A is a block diagram illustrating an embodiment of an entity similarity analysis unit according to an embodiment of the present invention, and FIG. 4B is a diagram for explaining an operation of the integrated entity candidate search unit. FIG. 4C is a block diagram illustrating an embodiment of an entity similarity calculation unit according to an embodiment of the present invention, and FIGS. 4D to 4E are views for explaining operations of the integrated entity selection unit.
4A, the
The object
Based on the degree of similarity generated by the object
Furthermore, the integrated
For example, the integrated
It is possible to select an integrated entity candidate having the highest degree of similarity as an integrated entity. For example, referring to FIG. 4D, the integrated
According to another embodiment of the present invention, the entity
Referring to FIG. 3A, the
Referring to FIG. 4E, the second type identifier 'URI_2_A2' for accessing the 'A2' entity selected as the target entity among the entities of the second inner knowledge data stored in the second inner knowledge base 20 is replaced with the unified identifier 'CURI_A' And can integrate the second internal knowledge data with the integrated knowledge data of the integrated
Then, the integrated entity
FIG. 5A is a block diagram illustrating an example of an integrated knowledge base building system according to an embodiment of the present invention, and FIGS. 5B and 5C are views for explaining the operation of the curation module according to an embodiment of the present invention.
As shown in FIG. 5A, the integrated knowledge
FIG. 6A is a block diagram illustrating an exemplary embodiment of a knowledge base integration module and a curation module according to an embodiment of the present invention, and FIG. 6B is a flowchart illustrating an operation of the curation module and the knowledge base integration module.
As shown in FIG. 6A, the integrated knowledge
The integrated knowledge
Accordingly, when the same knowledge data as the integrated entity related data exists, the storage capacity of the integrated
FIG. 7 is a flowchart illustrating an operation of building an integrated knowledge base according to an embodiment of the present invention.
Referring to FIG. 7, external data is received from a plurality of data servers, each piece of external data is normalized and refined based on a predetermined conversion rule, and a plurality of internal knowledge bases are built (S100). The entities of the integrated knowledge data are compared with a target entity of the internal knowledge data, and the entities that have the same attribute (or relation) and the same value (or entity) as the target entity are selected as integrated entity candidates (S110). The degree of similarity between each selected integrated entity candidate and the target entity of the internal knowledge data is analyzed (S120). Based on the analyzed similarity, an integrated entity candidate whose similarity is higher than a reference value, or the candidate with the highest similarity, is selected as an integrated entity (S130). It is then determined whether the number of selected integrated entities is one or less (S140). If so (S140, YES), the knowledge data related to the target entity is integrated with the integrated knowledge data using integration information including the selection result for the integrated entity, thereby building the integrated knowledge base (S150). If more than one integrated entity is selected (S140, NO), the selection is modified based on input data received from the outside so that at most one entity remains selected as the integrated entity (S160), and the data related to the target entity is then integrated with the integrated knowledge data using integration information including the modified selection result (S150).
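Steps S110 to S160 can be strung together in a short sketch. This is an illustration under assumptions, not the claimed method: the similarity measure and the curator are passed in as callables, step S100 (normalization and refinement) is omitted, and the function name, parameter names, and the `CURI_NEW_` prefix are hypothetical.

```python
from typing import Callable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def integrate_entity(target: str,
                     internal: List[Triple],
                     integrated: Set[Triple],
                     key_predicates: Set[str],
                     similarity: Callable[[str, str], float],
                     reference: float,
                     curate: Callable[[str, List[str]], Optional[str]]) -> str:
    """Integrate one target entity and return the integrated identifier used."""
    # S110: candidates share a (predicate, value) pair with the target.
    pairs = {(p, o) for s, p, o in internal if s == target and p in key_predicates}
    candidates = sorted({s for s, p, o in integrated if (p, o) in pairs})
    # S120: score each candidate against the target.
    scores = {c: similarity(target, c) for c in candidates}
    # S130: keep candidates above the reference value.
    selected = [c for c in candidates if scores[c] > reference]
    # S140 / S160: more than one match is resolved by outside input.
    if len(selected) > 1:
        choice = curate(target, selected)
        selected = [choice] if choice is not None else []
    # S150: reuse the match or mint a new identifier, then add the triples.
    unified = selected[0] if selected else f"CURI_NEW_{target}"
    for s, p, o in internal:
        if s == target:
            integrated.add((unified, p, o))
    return unified


integrated = {("CURI_A", "label", "Seoul"), ("CURI_C", "label", "Seoul")}
internal = [("URI_2_A2", "label", "Seoul"), ("URI_2_A2", "P1", "N2")]
fixed_scores = {"CURI_A": 0.9, "CURI_C": 0.85}  # stand-in similarity values

result = integrate_entity(
    "URI_2_A2", internal, integrated, {"label"},
    similarity=lambda t, c: fixed_scores[c],
    reference=0.8,
    curate=lambda t, options: options[0],  # stand-in for a human curator
)
print(result)
```

Passing the curator as a callable keeps the automatic path and the manual path in one function, so the curation step runs only on the S140 "NO" branch.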
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims. Accordingly, the true scope of the present invention should be determined by the technical idea of the appended claims.
Claims (10)
The knowledge base integration module includes:
An entity similarity analysis unit that compares the target entity of the first internal knowledge data with the entities of the integrated knowledge data, searches the integrated knowledge data to select integrated entity candidates of the integrated knowledge data, generates a similarity between the target entity and each integrated entity candidate, and selects an integrated entity from among the integrated entity candidates based on the similarity; and
An integrated entity conversion unit that adds the data related to the target entity, among the first internal knowledge data, to the integrated knowledge data based on integration information including the selection result for the integrated entity,
Wherein the curation module comprises:
When a plurality of integrated entities are selected,
Selects one integrated entity of the plurality of integrated entities based on input data received from the outside.
Wherein the individual knowledge base building module comprises:
Converting the external data received from the second data server to generate second knowledge data individually,
The knowledge base integration module includes:
And integrates the integrated knowledge data and the second knowledge data.
Wherein the entity similarity analysis unit comprises:
Comparing a predetermined attribute (or relation) of the target entity of the first internal knowledge data and the value (or entity) corresponding thereto with the predetermined attribute (or relation) of each entity of the integrated knowledge data and the value (or entity) corresponding thereto, and selecting, as the integrated entity candidate, an entity of the integrated knowledge data that has the same attribute (or relation) and the same value (or entity) as the target entity.
Wherein the entity similarity analysis unit comprises:
Generating similarity between the target entity and the integrated entity candidate based on at least one of graph information, ontology information, and syntax information corresponding to the target entity and the integrated entity candidates,
And selecting the integrated entity from among the integrated entity candidates based on the generated similarity.
Wherein the integrated entity conversion unit comprises:
And converting an identifier for the target entity into an integrated identifier for the integrated entity.
Wherein the integrated entity conversion unit comprises:
When the entity similarity analysis unit fails to select an integrated entity candidate of the integrated knowledge data or fails to select the integrated entity from among the integrated entity candidates,
Generating a new integrated identifier and converting the identifier for the target entity into the new integrated identifier.
Wherein the curation module comprises:
Adds or deletes entity-related data of the integrated knowledge data based on input data received from the outside to change the integrated knowledge data, detects the changed integrated knowledge data, and generates change data information.
The knowledge base integration module includes:
And an integrated knowledge data updating unit for storing the change data information received from the curation module and updating the integrated knowledge data based on the change data information.
Wherein the integrated knowledge data updating unit comprises:
Comparing the change data information with the selected integrated entity related data, and determining whether to add the integrated entity related data based on the comparison result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160010071A KR101739540B1 (en) | 2016-01-27 | 2016-01-27 | System and method for building integration knowledge base based |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160010071A KR101739540B1 (en) | 2016-01-27 | 2016-01-27 | System and method for building integration knowledge base based |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101739540B1 (en) | 2017-06-08 |
Family
ID=59221161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020160010071A KR101739540B1 (en) | 2016-01-27 | 2016-01-27 | System and method for building integration knowledge base based |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101739540B1 (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101467707B1 (en) * | 2013-12-23 | 2014-12-02 | 포항공과대학교 산학협력단 | Method for instance-matching in knowledge base and device therefor |
2016-01-27 KR KR1020160010071A patent/KR101739540B1/en active IP Right Grant
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190079805A (en) * | 2017-12-28 | 2019-07-08 | 주식회사 솔트룩스 | System and method for building integration knowledge base based a plurality of data sources |
KR102006214B1 (en) | 2017-12-28 | 2019-08-02 | 주식회사 솔트룩스 | System and method for building integration knowledge base based a plurality of data sources |
KR102111734B1 (en) * | 2018-11-29 | 2020-05-15 | 주식회사 솔트룩스 | System and method for building integration knowledge base based |
WO2020111371A1 (en) * | 2018-11-29 | 2020-06-04 | 주식회사 솔트룩스 | Integrated knowledge base construction system and method |
KR102121504B1 (en) * | 2018-11-29 | 2020-06-10 | 주식회사 솔트룩스 | System and method for building integration knowledge data base based a plurality of data sources |
KR102098255B1 (en) * | 2018-11-30 | 2020-04-07 | 주식회사 솔트룩스 | System and method for consolidating knowledge based on knowledge embedding |
KR20210050206A (en) * | 2019-10-28 | 2021-05-07 | 주식회사 한글과컴퓨터 | Knowledge database management device for building a knowledge database using tables included in spreadsheet documents and enabling user access to the knowledge database, and operating method thereof |
KR102300467B1 (en) | 2019-10-28 | 2021-09-09 | 주식회사 한글과컴퓨터 | Knowledge database management device for building a knowledge database using tables included in spreadsheet documents and enabling user access to the knowledge database, and operating method thereof |
KR20210077251A (en) * | 2019-12-17 | 2021-06-25 | 주식회사 한글과컴퓨터 | Database building device that can build a knowledge database from a table-inserted image and operating method thereof |
KR102328034B1 (en) | 2019-12-17 | 2021-11-17 | 주식회사 한글과컴퓨터 | Database building device that can build a knowledge database from a table-inserted image and operating method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101739540B1 (en) | System and method for building integration knowledge base based | |
US11068439B2 (en) | Unsupervised method for enriching RDF data sources from denormalized data | |
CN111782965B (en) | Intention recommendation method, device, equipment and storage medium | |
US9495345B2 (en) | Methods and systems for modeling complex taxonomies with natural language understanding | |
US20200192727A1 (en) | Intent-Based Organisation Of APIs | |
US11960513B2 (en) | User-customized question-answering system based on knowledge graph | |
US10747958B2 (en) | Dependency graph based natural language processing | |
US20170262868A1 (en) | Methods and systems for analyzing customer care data | |
KR20220115046A (en) | Method and appartuas for semantic retrieval, device and storage medium | |
US11281864B2 (en) | Dependency graph based natural language processing | |
CN107679035B (en) | Information intention detection method, device, equipment and storage medium | |
US11836120B2 (en) | Machine learning techniques for schema mapping | |
US20170103125A1 (en) | Apparatus and method of exploring and accessing relevant data from big data repository | |
Dyvak et al. | Recognition of Relevance of Web Resource Content Based on Analysis of Semantic Components | |
CN114996549A (en) | Intelligent tracking method and system based on active object information mining | |
Rahmani et al. | Entity resolution in disjoint graphs: an application on genealogical data | |
US20170124090A1 (en) | Method of discovering and exploring feature knowledge | |
KR20150112442A (en) | System and method for generating knowledge | |
Kumar et al. | Efficient structuring of data in big data | |
US20150154198A1 (en) | Method for in-loop human validation of disambiguated features | |
US20230032208A1 (en) | Augmenting data sets for machine learning models | |
Shafi et al. | [WiP] Web Services Classification Using an Improved Text Mining Technique | |
US9910890B2 (en) | Synthetic events to chain queries against structured data | |
US20230142351A1 (en) | Methods and systems for searching and retrieving information | |
CN114648121A (en) | Data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GRNT | Written decision to grant |