WO2015047073A1 - Method for performing distributed reasoning over linked data

Info

Publication number: WO2015047073A1
Application number: PCT/MY2014/000125
Authority: WO
Inventors: Weng Onn Kow; Lukose Dickson
Original assignee: Mimos Berhad
Priority date: 2013-09-27
Filing date: 2014-05-28
Publication date: 2015-04-02
Also published as: MY185830A

Abstract

The present invention relates to a method for reasoning knowledge that is distributed across a plurality of linked knowledge bases. It handles queries in two ways. First, it handles a query by determining whether or not a fact is true. Secondly, it handles a query by returning query results that are not directly explicit within the triples (the subjects, predicates and objects) of the queried knowledge bases.

Description

METHOD FOR PERFORMING DISTRIBUTED REASONING OVER LINKED DATA

FIELD OF INVENTION

The present invention relates to a method for reasoning knowledge that is distributed across a plurality of linked knowledge bases.

BACKGROUND OF THE INVENTION

In recent times, the world wide web has made a vast number of information sources available, from which anyone can obtain information from all over the world. A knowledge base, also known as a KB, is a special database for knowledge management. It is an information repository that provides a means for information to be collected, organised, shared, searched and utilised. A knowledge base can either be machine-readable or intended for human use. Human-readable knowledge bases enable people to retrieve and use the knowledge they contain. A human-readable knowledge base can also be coupled with a machine-readable one.

Machine-readable knowledge bases store knowledge in a computer-readable form, to which automated deductive reasoning is usually applied. These knowledge bases contain a set of data, often in the form of rules that describe the knowledge in a logically consistent manner, whereby an ontology is used to define the structure of the stored data. One example is the Web Ontology Language (OWL), which is a family of knowledge representation languages for authoring ontologies. The languages are characterised by formal semantics and Resource Description Framework (RDF) / Extensible Markup Language (XML) based serializations for the semantic web.

A search engine is usually used to locate information in the system or to browse through a classification scheme. Search engines often provide compilation and retrieval services. Due to the growing amount of data contained within the world wide web and other information sources such as these knowledge bases, it is difficult for users to obtain all the information they need when only a single search request is used. Apart from that, reasoning on a distributed set of knowledge bases requires all the information, such as the rules and facts, to be stored in the same location before the inferencing for new facts can be performed. This raises the issue of how to decide which knowledge bases to bring together, and it limits the results, since results are returned only from the knowledge bases that were specified to be federated. Therefore, there is a need to provide a method that addresses the above mentioned drawbacks of the existing methods for reasoning knowledge.

SUMMARY OF INVENTION

The present invention relates to a method for reasoning knowledge that is distributed across a plurality of linked knowledge bases. The method is characterised by the steps of receiving an initial query from user; identifying an appropriate knowledge base to answer the query by a data source index; sending the query to the appropriate knowledge base for a local query; storing the results from the local query in a computer memory; performing a transitive reasoning within the same knowledge base if the query involves a transitive property; discovering equivalences of concepts and relations to the answers of the query in other knowledge bases; generating new queries from the discovered equivalences; distributing queries to knowledge bases to perform reasoning until a timeout is reached or if there are no more answers found from the knowledge bases; aggregating results from all knowledge bases; and returning the results to user.

Preferably, the step of performing a transitive reasoning within the same knowledge base if the query involves a transitive property includes replacing the subject of the initial query with the result received from the knowledge base to form a new query; and finding results of the new query within the same knowledge base.

Preferably, the step of discovering equivalences of concepts and relations to the answers in other knowledge bases includes finding explicit equivalent subjects, objects and predicates of the results of the queries; and finding implicit equivalent subjects, objects and predicates of the results of the queries.

Preferably, the explicit equivalent subjects, objects and predicates of the results of the queries are found by using Web Ontology Language (OWL).

Preferably, the implicit equivalent subjects, objects and predicates of the results of the queries are found by using functional and inverse functional properties.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 illustrates a flow chart of a method for reasoning knowledge according to an embodiment of the present invention.

FIG. 2 illustrates a flow chart of a sub-process for transitive reasoning.

FIG. 3 illustrates a flow chart of a sub-process for equivalences discovering.

FIG. 4 illustrates the aggregated results of an example of a query across a plurality of linked knowledge bases.

DESCRIPTION OF THE PREFERRED EMBODIMENT

A preferred embodiment of the present invention will be described herein below with reference to the accompanying drawings. In the following description, well known functions or constructions are not described in detail since they would obscure the description with unnecessary detail.

Reference is made initially to FIG. 1, which illustrates a flow chart of a method for reasoning knowledge according to an embodiment of the present invention. The method, which is used in a query system, enables the inference of knowledge using rules and facts that are distributed across a plurality of linked knowledge bases. It handles queries in two ways. First, it handles a query by determining whether or not a fact is true. Secondly, it handles a query by returning query results that are not directly explicit within the triples (the subjects, predicates and objects) of the queried knowledge bases.

Initially, when a query is received, a data source index is used to determine which data source or knowledge base contains the answer to the initial query, as in step 100. The data source index stores a list of namespaces, which is used to identify an appropriate knowledge base based on a query. After the appropriate knowledge base is determined, the query is sent to the knowledge base for a local query as in step 200. The results from the local query are then stored in a computer memory until the final aggregated results are returned as in step 700.
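The namespace lookup of step 100 can be sketched as follows. This is a minimal illustration only, assuming that the data source index is a mapping from namespace prefixes to knowledge base identifiers; the namespaces, the identifiers and the function name identify_knowledge_bases are hypothetical and do not appear in the specification.

```python
from typing import Dict, List, Optional, Tuple

# A query is a (subject, predicate, object) triple; None marks the unknown X.
Query = Tuple[Optional[str], Optional[str], Optional[str]]

# Hypothetical data source index: namespace prefix -> knowledge base identifier.
DATA_SOURCE_INDEX: Dict[str, str] = {
    "http://example.org/kb1/": "kb1",
    "http://example.org/kb2/": "kb2",
}


def identify_knowledge_bases(query: Query, index: Dict[str, str]) -> List[str]:
    """Return the knowledge bases whose namespace matches a term of the query."""
    matches: List[str] = []
    for term in query:
        if term is None:  # the unknown part of the query carries no namespace
            continue
        for namespace, kb in index.items():
            if term.startswith(namespace) and kb not in matches:
                matches.append(kb)
    return matches


# "PJ is located in X?" expressed with hypothetical kb1 identifiers.
query: Query = (
    "http://example.org/kb1/PJ",
    "http://example.org/kb1/locatedIn",
    None,
)
print(identify_knowledge_bases(query, DATA_SOURCE_INDEX))  # ['kb1']
```

A query whose terms match no indexed namespace yields an empty list, in which case no local query would be sent.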

If the query involves a transitive property, the system then performs transitive reasoning within the same knowledge base and retrieves more results as in step 300. For a transitive property, if "a" has ancestor "b" and "b" has ancestor "c", it can be inferred that "a" also has ancestor "c". Transitive properties are usually used for hierarchy relations such as subclass and for relations such as the "part of" or "located in" predicates. However, if the query has no transitive property, the system directly returns the results to the user without any further queries. After a query with a transitive property has completed its transitive reasoning, the equivalences of the subject of the query are looked up as in step 400. The subject is replaced with the results of the query to generate a set of new queries as in step 500. The entire process is repeated from step 100 for each new query, starting with the identification of the knowledge base that could answer the new query. The process repeats until a timeout is reached or no more answers to the query are found from the knowledge base, upon which the aggregated results are returned to the user as in step 600 and step 700.

Step 300 is further described in FIG. 2, which illustrates a flow chart of a sub-process for the transitive reasoning. Initially, from the original query in step 100, any query clause that contains a transitive property undergoes the replacement of the subject with the result as in step 310. An example of a query having a transitive property is "PJ is located in X?", wherein X is the answer that the system needs to find. Since "is located in" is a transitive predicate, as explained previously, the subject, i.e. "PJ", is replaced by the result of the query, i.e. "Selangor", which creates a new question, i.e. "Selangor is located in X?". The system continues performing queries and replacing subjects to create new queries until the knowledge base stops returning answers, as in steps 310 to 330. Each time a query runs, the timeout is checked as in step 320. If the query exceeds the timeout, whatever results were found within the timeout are returned.

When a knowledge base returns answers for the sent queries as shown in FIG. 2, the system continues to look for equivalences or similar concepts to the answers in another knowledge base as in step 400. From the previous query example, the query "PJ is located in X?" returns the result "Selangor." The system then checks whether "Selangor" exists in any other knowledge base. If the system finds that "Selangor" in the first knowledge base is the same as "Selangor" in the second knowledge base, the query that is created from the transitive reasoning, i.e. "Selangor is located in X?", is then sent to the second knowledge base to get more results.

Referring now to FIG. 3, there is shown a flow chart of a sub-process for finding explicit and implicit equivalences of the results from the queries. Explicit equivalences are found by using existing relations, such as those of the Web Ontology Language (OWL), to find the equivalent subjects, objects and predicates as in steps 410 to 430. OWL has predicates called "equivalentClass" and "sameAs" to indicate that a concept is equivalent to another. The previous result from the query example of "PJ is located in X?" is used to further explain how the equivalences are found by using OWL. The first knowledge base might have a triple consisting of "kb1:Selangor", "owl:sameAs" and "kb2:Selangor", wherein kb1 is a first knowledge base, owl is the Web Ontology Language and kb2 is a second knowledge base. This triple means that "Selangor" in the first knowledge base is equivalent to "Selangor" in the second knowledge base.

On the other hand, implicit equivalences are found by using functional and inverse functional properties as in steps 440 to 450. A functional property is a constraint that says a property can only have one object value for each unique subject. For example, a country can only have one capital city. In another example, a person can only have one biological mother. Inference can be performed by the following rule: if "a" has a functional property with value "b" and "a" has the same property with value "c", then "b" is equivalent to "c". As an example of a functional property, the first statement says "Malaysia's capital city is KL" and the second statement says "Malaysia's capital city is Kuala Lumpur." It can be deduced that "KL" is the same as "Kuala Lumpur". Similarly, an inverse functional property is a functional property in reverse, where the subject is constrained. The rule of an inverse functional property states that if "a" has a property with value "b" and "c" has the same property with value "b", then "a" is equivalent to "c". For example, one statement says "ferum has an atomic weight of 55.845" and another statement says "iron has an atomic weight of 55.845." Thus, it can be deduced that ferum is the same as iron.

After finding the explicit and implicit equivalences of the results, these equivalent concepts are substituted into the queries to generate a set of new queries as in step 500. The other knowledge bases will generate more queries and the entire process is repeated again for each new query from step 100 by determining the right knowledge base that could answer the new queries. The process repeats until a new timeout is reached or if there are no more answers to the query found from the knowledge base, upon which the aggregated results are returned to the user as in step 600 and step 700.
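The two kinds of equivalence discovery can be sketched as follows. This is a minimal illustration only, assuming that triples are held in memory and that the sets of functional and inverse functional predicates are known in advance; the predicate names hasCapital and atomicWeight and both function names are hypothetical. Each equivalence is returned as an unordered pair.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples mirroring the examples in the description.
TRIPLES: List[Triple] = [
    ("kb1:Selangor", "owl:sameAs", "kb2:Selangor"),
    ("Malaysia", "hasCapital", "KL"),
    ("Malaysia", "hasCapital", "Kuala Lumpur"),
    ("ferum", "atomicWeight", "55.845"),
    ("iron", "atomicWeight", "55.845"),
]
FUNCTIONAL = {"hasCapital"}            # one object per subject
INVERSE_FUNCTIONAL = {"atomicWeight"}  # one subject per object


def explicit_equivalences(triples: List[Triple]) -> Set[FrozenSet[str]]:
    """Steps 410 to 430: read equivalences stated directly with OWL predicates."""
    return {
        frozenset((s, o))
        for s, p, o in triples
        if p in ("owl:sameAs", "owl:equivalentClass")
    }


def implicit_equivalences(triples: List[Triple], functional: Set[str],
                          inverse_functional: Set[str]) -> Set[FrozenSet[str]]:
    """Steps 440 to 450: infer equivalences from (inverse) functional properties."""
    groups: Dict[Tuple[str, str, str], Set[str]] = {}
    for s, p, o in triples:
        if p in functional:  # same subject and predicate: objects are equal
            groups.setdefault(("f", s, p), set()).add(o)
        if p in inverse_functional:  # same predicate and object: subjects are equal
            groups.setdefault(("i", p, o), set()).add(s)
    pairs: Set[FrozenSet[str]] = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add(frozenset((a, b)))
    return pairs


print(explicit_equivalences(TRIPLES))
print(implicit_equivalences(TRIPLES, FUNCTIONAL, INVERSE_FUNCTIONAL))
```

With the hypothetical triples above, the explicit pass pairs "kb1:Selangor" with "kb2:Selangor", and the implicit pass pairs "KL" with "Kuala Lumpur" and "ferum" with "iron".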

Table 1 and Table 2 show another detailed example of finding the equivalent subjects, objects and predicates of the results of the queries according to an embodiment of the present invention, wherein kb1 is a first knowledge base, kb2 is a second knowledge base, owl is the Web Ontology Language and rdf is the Resource Description Framework.

Table 1

The query "KLCC located in X?" can be answered according to the steps as described earlier on, wherein X is the answer that we need to find. Initially, when the query "KLCC located in X?" is received, a data source index is used to determine which knowledge base contains the answer to this query as in step 100. This query is sent to the chosen knowledge base for a local query as in step 200. The answer "KL" from the local query is stored in a computer memory until the final aggregated results are returned as in step 700.

Since this query involves a transitive property, the system performs the transitive reasoning within the same knowledge base as in step 300 and replaces the subject of the query, which in this case is "KLCC", with the result, i.e. "KL". The system then creates a new question, i.e. "KL is located in X?", and repeats steps 310 to 330 until the knowledge base stops returning answers. From this step, the system retrieves a further result, i.e. "Malaysia". After the knowledge base returns the answers to the queries, i.e. "KL" and "Malaysia", the system continues to look for equivalences or similar concepts to these answers in another knowledge base as in step 400. The system finds out whether "KL" and "Malaysia" exist in any knowledge base other than the first knowledge base. Table 1 shows that the system finds that "Malaysia" exists in a second knowledge base.

The explicit equivalences are found by using OWL to find the equivalent subjects, objects and predicates as in steps 410 to 430. Table 1 shows that OWL is used to relate that "Malaysia" in the first knowledge base is the same as "Malaysia" in the second knowledge base. The queries are then sent to the second knowledge base to get more results. It is also computed that the predicate "located in" from the first knowledge base is the same as the predicate "has location" from the second knowledge base. On the other hand, implicit equivalences are found as in steps 440 to 450 by using a functional property. By using "has capital" as the predicate, the first knowledge base returns a result of "KL" while the second knowledge base returns a result of "Kuala Lumpur." By using the functional property, it is deduced that "KL" is the same as "Kuala Lumpur." These equivalent concepts are then substituted into the queries to generate a set of new queries as in step 500.

Table 2 shows that the second knowledge base generates two new queries, which are "Kuala Lumpur has location X?" and "Malaysia has location X?", wherein X is the answer that we need to find. The entire process is repeated again for each new query from step 100 with the identification of the appropriate knowledge base that could answer the new queries. The process repeats until a timeout is reached or if there are no more answers to the queries found from the knowledge bases, upon which the aggregated results are returned to the user as in step 600 and step 700. Table 2 shows that the results returned from the second knowledge base are "Klang Valley" and "Malaysia." The final aggregated results from the first knowledge base are "KL" and "Malaysia", while the aggregated results from the second knowledge base are "Kuala Lumpur", "Malaysia", "Klang Valley" and "South east Asia." The final result of the query is shown in FIG. 4.
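The overall loop of steps 100 to 700 for the "KLCC located in X?" example can be sketched as follows. This is a rough illustration under several assumptions, not the implementation of the invention. The triples are hypothetical and are not the contents of Table 1 or Table 2; the equivalences are supplied as a ready-made mapping rather than discovered; the namespace prefix of the subject stands in for the data source index; and recording an equivalent term as a result of its own knowledge base is an assumption made here so that the aggregation resembles the result lists quoted above.

```python
import time
from collections import deque
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples for two linked knowledge bases.
KBS: Dict[str, Set[Triple]] = {
    "kb1": {
        ("kb1:KLCC", "kb1:locatedIn", "kb1:KL"),
        ("kb1:KL", "kb1:locatedIn", "kb1:Malaysia"),
    },
    "kb2": {
        ("kb2:KualaLumpur", "kb2:hasLocation", "kb2:KlangValley"),
        ("kb2:KlangValley", "kb2:hasLocation", "kb2:Malaysia"),
        ("kb2:Malaysia", "kb2:hasLocation", "kb2:SouthEastAsia"),
    },
}
# Equivalences assumed to have been discovered already (step 400).
EQUIVALENT: Dict[str, Set[str]] = {
    "kb1:KL": {"kb2:KualaLumpur"},
    "kb1:Malaysia": {"kb2:Malaysia"},
    "kb1:locatedIn": {"kb2:hasLocation"},
}
TRANSITIVE = {"kb1:locatedIn", "kb2:hasLocation"}


def identify_kb(term: str) -> str:
    """Step 100 (simplified): use the namespace prefix as the index key."""
    return term.split(":", 1)[0]


def distributed_query(subject: str, predicate: str,
                      timeout_s: float = 1.0) -> Dict[str, List[str]]:
    deadline = time.monotonic() + timeout_s
    results: Dict[str, List[str]] = {kb: [] for kb in KBS}
    queue = deque([(subject, predicate)])
    seen = {(subject, predicate)}
    while queue and time.monotonic() <= deadline:  # step 600: stop conditions
        s, p = queue.popleft()
        kb = identify_kb(s)
        # Step 200: local query against the selected knowledge base.
        answers = sorted(o for ts, tp, o in KBS.get(kb, set())
                         if ts == s and tp == p)
        for answer in answers:
            if answer not in results[kb]:
                results[kb].append(answer)
            new_queries = []
            if p in TRANSITIVE:
                new_queries.append((answer, p))  # step 300: replace subject
            for eq_s in sorted(EQUIVALENT.get(answer, set())):
                eq_results = results.setdefault(identify_kb(eq_s), [])
                if eq_s not in eq_results:
                    eq_results.append(eq_s)  # assumption: keep the equivalent
                for eq_p in sorted(EQUIVALENT.get(p, set())):
                    new_queries.append((eq_s, eq_p))  # step 500: new query
            for q in new_queries:
                if q not in seen:
                    seen.add(q)
                    queue.append(q)
    return results  # step 700: aggregated results per knowledge base


print(distributed_query("kb1:KLCC", "kb1:locatedIn"))
```

With the hypothetical triples above, the sketch collects "kb1:KL" and "kb1:Malaysia" from the first knowledge base and "kb2:KualaLumpur", "kb2:Malaysia", "kb2:KlangValley" and "kb2:SouthEastAsia" from the second. A predicate with no known equivalent produces no cross knowledge base queries in this sketch.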

While embodiments of the invention have been illustrated and described, it is not intended that these embodiments illustrate and describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and various changes may be made without departing from the scope of the invention.

Claims

A method for reasoning knowledge that is distributed across a plurality of linked knowledge bases is characterised by the steps of:

a) receiving an initial query from user;

b) identifying an appropriate knowledge base to answer the query by a data source index;

c) sending the query to the appropriate knowledge base for a local query;

d) storing the results from the local query in a computer memory;

e) performing a transitive reasoning within the same knowledge base if the query involves a transitive property;

f) discovering equivalences of concepts and relations to the answers of the query in other knowledge bases;

g) generating new queries from the discovered equivalences;

h) distributing queries to knowledge bases to perform reasoning until a timeout is reached or if there are no more answers found from the knowledge bases;

i) aggregating results from all knowledge bases; and

j) returning the results to user.

The method as claimed in claim 1, wherein the step of performing a transitive reasoning within the same knowledge base if the query involves a transitive property includes:

a) replacing the subject of the initial query with the result received from the knowledge base to form a new query; and

b) finding results of the new query within the same knowledge base.

The method as claimed in claim 1, wherein the step of discovering equivalences of concepts and relations to the answers in other knowledge bases includes:

a) finding explicit equivalent subjects, objects and predicates of the results of the queries; and

b) finding implicit equivalent subjects, objects and predicates of the results of the queries.

The method as claimed in claim 3, wherein the explicit equivalent subjects, objects and predicates of the results of the queries are found by using Web Ontology Language (OWL).

The method as claimed in claim 3, wherein the implicit equivalent subjects, objects and predicates of the results of the queries are found by using functional and inverse functional properties.