WO2015060709A1

WO2015060709A1 - A method for translating a knowledge base

Info

Publication number: WO2015060709A1
Application number: PCT/MY2014/000196
Authority: WO
Inventors: Ben Mohamed KHALIL; Lukose Dickson
Original assignee: Mimos Berhad
Priority date: 2013-10-21
Filing date: 2014-06-26
Publication date: 2015-04-30
Also published as: MY168821A

Abstract

The present invention relates to a method for translating a knowledge base. The method comprises the steps of extracting a first semantic network notation (SN1) from a first knowledge base; translating SN1 into a second semantic network notation (SN2) by using a series of SN1 to SN2 translation rules; translating SN2 into SN1 by using a series of SN2 to SN1 translation rules; storing the translated SN1 in a second knowledge base; and validating the translations.

Description

A METHOD FOR TRANSLATING A KNOWLEDGE BASE

FIELD OF INVENTION

The present invention relates to a method for translating a knowledge base. Moreover, the present invention provides a method for validating the translations of the knowledge base between different semantic network notations.

BACKGROUND OF THE INVENTION

A knowledge base is an information repository that provides a means for information to be collected, organized, shared, searched and utilized. It can be either machine-readable or intended for human use. A knowledge base can be translated from one semantic network notation to another. For example, a knowledge base represented in a W3C vocabulary for representing web-based ontologies (W3C-KB) can be translated to an Extended Conceptual Graphs (ECG-KB) notation. A few methods exist for translating a knowledge base between different semantic network notations, but their translations are validated manually by an expert.

An example of such method is disclosed in US Publication No. 2013/0086100 A1 whereby the method includes receiving a request from a requestor to validate a data assemblage expressed in JavaScript Object Notation (JSON); translating the data assemblage expressed in JSON into an extensible mark-up language (XML) instance; validating the XML instance using syntactic schema and semantic schema specifications; and sending a response to the requestor. For a case where the data assemblage contains invalid data in at least one field the response includes an output array containing information for specifying valid data for the at least one field and a message explaining a reason why the field is invalid. A system for performing the method is also described as a computer program product that can be used to execute the method. However, the data translation depends on the request and response of the user whereby the translation is not automatically validated.

In another example, US Publication No. 2011/0314451 A1 discloses a method and system for validating translated files for inclusion in an application being developed. Translatable files having externalized content in a single base language are sent for translation into other languages. Translated files resulting from a translation of the translatable files are received. Each translated file is statically and dynamically validated to detect error(s). The static validation is based on comparing the translatable files to the translated files. The dynamic validation is based on a simulation of how a user interface of the application presents the externalized content, without including an actual presentation of the externalized content by the user interface. Modified translated files that correct the detected error(s) are received and provided for a presentation of the externalized content by the user interface. However, this prior art does not perform automated validation of the translations; instead, it performs the static validation by comparing the translatable files to the translated files, and the dynamic validation by simulating how the user interface of the application presents the externalized content, excluding an actual presentation of the externalized content by the user interface.

Therefore, there is a need to provide a method to validate the translation of the knowledge base that addresses the above mentioned drawbacks.

SUMMARY OF INVENTION

A method for translating a knowledge base comprises the steps of extracting a first semantic network notation (SN1) from a first knowledge base; translating SN1 into a second semantic network notation (SN2) by using a series of SN1 to SN2 translation rules; translating SN2 into SN1 by using a series of SN2 to SN1 translation rules; storing the translated SN1 in a second knowledge base; and validating the translations. Preferably, the method for validating the translations includes the steps of generating a set of queries based on SN1 from the first knowledge base and a set of preferences; querying SN1 from the first knowledge base and the second knowledge base if a set of queries has been generated; generating answers for each respective knowledge base; comparing the answers obtained from the first and second knowledge bases; and storing the generated query and the obtained answers in a repository if the second knowledge base results are different from the first knowledge base results.
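The validation steps summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the triple-based knowledge base model, the wildcard query patterns, and the names `answer` and `validate` are assumptions introduced only for illustration.

```python
from typing import FrozenSet, List, Optional, Tuple

# Assumed model: a knowledge base is a set of (subject, relation, object) triples.
Triple = Tuple[str, str, str]
# Assumed query form: a triple pattern in which None acts as a wildcard.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def answer(kb: FrozenSet[Triple], pattern: Pattern) -> FrozenSet[Triple]:
    """Return every triple in kb that matches the pattern."""
    return frozenset(
        t for t in kb
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def validate(kb1: FrozenSet[Triple], kb2: FrozenSet[Triple],
             queries: List[Pattern]) -> list:
    """Run each query on both knowledge bases and keep the ones that differ."""
    repository = []  # stands in for the repository of the method
    for query in queries:
        a1, a2 = answer(kb1, query), answer(kb2, query)
        if a1 != a2:
            repository.append((query, a1, a2))
    return repository


# kb2 plays the role of the round-trip translation, which lost one fact.
kb1 = frozenset({("Cat", "subClassOf", "Animal"), ("Cat", "hasColor", "Black")})
kb2 = frozenset({("Cat", "subClassOf", "Animal")})
queries = [("Cat", "subClassOf", None), ("Cat", "hasColor", None)]
report = validate(kb1, kb2, queries)
```

In this sketch, `report` holds only the `hasColor` query, because it is the only one whose answers differ between the two knowledge bases. Each stored entry therefore points at a piece of information lost in translation.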

Preferably, the method for generating the set of queries based on the KB SN1 Input from the input knowledge base and the set of preferences includes the steps of extracting facts from the first knowledge base; computing sub-graphs of the facts by using depth-first or breadth-first backtrack algorithms if there is at least one fact extracted from the first knowledge base; extracting a maximum number of determined sub-graphs based on the preferred size or based on the closest sizes if the preferences indicate a maximum number of sub-graphs and a preferred size; extracting the sub-graphs based on the maximum number indicated if the preferences only indicate the maximum number of sub-graphs; extracting the sub-graphs based on the preferred size if the preferences only indicate the preferred size; computing generalizations of the extracted sub-graphs if the preferences indicate the use of ontology; extracting the generalizations based on the maximum number of generalizations and generalizing all concepts as well as properties not more than the maximum ontology depth exploration if the preferences indicate the maximum number of generalizations and the maximum ontology depth exploration; extracting the generalizations based on the maximum number of generalizations if the preferences only indicate the maximum number of generalizations; generalizing all concepts and properties of the generalizations based on the maximum ontology depth exploration if the preferences only indicate the maximum ontology depth exploration; and extracting all possible generalizations if there are no preferences.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 illustrates a flowchart of a method for translating a knowledge base according to an embodiment of the present invention.

FIG. 2 illustrates a flowchart of the sub-steps for validating the translation of the knowledge base according to an embodiment of the present invention.

FIG. 3 illustrates a flowchart of the sub-steps for creating queries according to an embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT

A preferred embodiment of the present invention will be described herein below with reference to the accompanying drawings. In the following description, well known functions or constructions are not described in detail since they would obscure the description with unnecessary detail.

FIG. 1 shows a flowchart of a method for translating a knowledge base according to an embodiment of the present invention. The method translates a first semantic network notation (SN1) to a second semantic network notation (SN2). In step 200, SN1 is extracted from a first knowledge base and translated into SN2 by using a series of SN1 to SN2 translation rules. For example, the translation rules between the ECG and W3C notations are as follows: IF Concept c; THEN addTriple "c rdf:type rdfs:Class"; IF Concepts c1 and c2 and c1 < c2 (c1 is a direct specialization of c2); THEN addTriple "c1 rdfs:subClassOf c2". In step 300, SN2 is translated into SN1 using a series of SN2 to SN1 translation rules. The translated SN1 is stored in a second knowledge base. In step 400, the translations are validated and the information loss is returned. Information loss occurs when the expressivity of the source semantic notation and the target semantic notation differs, such that some information in the source semantic notation cannot be represented in the target semantic notation.
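The two example translation rules, together with a matching pair of reverse rules, can be sketched in Python as follows. The reverse rules are not spelled out in the description and are an assumption here, as are the function names `ecg_to_triples` and `triples_to_ecg` and the representation of the concept hierarchy as a set of (specialization, generalization) pairs.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]
SubTypeLink = Tuple[str, str]  # (c1, c2) meaning c1 is a direct specialization of c2


def ecg_to_triples(concepts: Set[str], links: Set[SubTypeLink]) -> Set[Triple]:
    """Apply the two example rules: one triple per concept, one per direct link."""
    triples: Set[Triple] = set()
    for c in concepts:
        # IF Concept c THEN addTriple "c rdf:type rdfs:Class"
        triples.add((c, "rdf:type", "rdfs:Class"))
    for c1, c2 in links:
        # IF c1 < c2 THEN addTriple "c1 rdfs:subClassOf c2"
        triples.add((c1, "rdfs:subClassOf", c2))
    return triples


def triples_to_ecg(triples: Set[Triple]) -> Tuple[Set[str], Set[SubTypeLink]]:
    """Assumed reverse rules: read concepts and direct links back from the triples."""
    concepts = {s for s, p, o in triples if p == "rdf:type" and o == "rdfs:Class"}
    links = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    return concepts, links


concepts = {"Cat", "Animal"}
links = {("Cat", "Animal")}
triples = ecg_to_triples(concepts, links)
round_trip = triples_to_ecg(triples)
```

For this small hierarchy the round trip returns the original concepts and links, so no information is lost. Any ECG construct without a corresponding rule would simply be absent from `round_trip`, which is the kind of loss that step 400 is meant to detect.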

FIG. 2 shows a flowchart of the sub-steps for validating the translation of the knowledge base according to an embodiment of the present invention. The flowchart shows the steps of the validation as provided in step 400 of FIG. 1. In step 410, a set of queries is generated based on SN1 from the first knowledge base and a set of preferences. In decision 420, if no set of queries is generated, the method ends. If a set of queries is generated, SN1 from the first knowledge base and the second knowledge base are queried as in step 430. Answers are then generated for each respective knowledge base. The answers obtained from the first and second knowledge bases in step 430 are compared as in step 440. In decision 450, if the second knowledge base answers are different from the first knowledge base answers, the generated query and the obtained answers are stored in a repository as in step 460. If there is no difference between the answers, the method returns to step 420.

FIG. 3 shows a flowchart of the sub-steps for creating queries as provided in step 410 of FIG. 2. In step 411, facts from the first knowledge base are extracted. In decision 412, if there are no facts extracted from the first knowledge base, the method ends. If there is at least one fact extracted from the first knowledge base, sub-graphs of the facts are computed by using depth-first or breadth-first backtrack algorithms, where at each step one node of the graph is removed to form a new sub-graph, as in step 413. A number of these sub-graphs are then extracted as queries according to the preferences. If the preferences indicate a maximum number of sub-graphs and a preferred size, the maximum number of determined sub-graphs is extracted based on the preferred size, or based on the closest sizes if there are not enough sub-graphs of the preferred size. If the preferences only indicate a maximum number of sub-graphs, the sub-graphs are extracted based on the maximum number indicated. If the preferences only indicate a preferred size, the sub-graphs are extracted based on the preferred size. If there are no preferences, all possible sub-graphs are extracted.
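The sub-graph computation of step 413 and the preference handling described above can be sketched in Python. This is a minimal illustration under stated assumptions: a fact is modelled as a set of labelled edges, a sub-graph is identified by its node set, the size of a sub-graph is its number of nodes, and the full graph is counted among the candidates. The names `sub_graphs` and `select` are hypothetical.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, relation, target node)


def sub_graphs(edges: FrozenSet[Edge]) -> List[FrozenSet[str]]:
    """Depth-first backtracking: each step removes one node to form a sub-graph."""
    all_nodes = frozenset(n for s, _, o in edges for n in (s, o))
    seen: Set[FrozenSet[str]] = set()
    found: List[FrozenSet[str]] = []

    def visit(nodes: FrozenSet[str]) -> None:
        if not nodes or nodes in seen:
            return  # backtrack on empty or already-visited node sets
        seen.add(nodes)
        found.append(nodes)
        for n in sorted(nodes):
            visit(nodes - {n})

    visit(all_nodes)
    return found


def select(candidates: List[FrozenSet[str]],
           max_count: Optional[int] = None,
           preferred_size: Optional[int] = None) -> List[FrozenSet[str]]:
    """Pick sub-graphs according to the four preference cases."""
    if max_count is not None and preferred_size is not None:
        # Preferred size first, then the closest sizes (sorted is stable).
        ranked = sorted(candidates, key=lambda g: abs(len(g) - preferred_size))
        return ranked[:max_count]
    if max_count is not None:
        return candidates[:max_count]
    if preferred_size is not None:
        return [g for g in candidates if len(g) == preferred_size]
    return list(candidates)


fact = frozenset({("Cat", "eats", "Fish"), ("Fish", "livesIn", "Water")})
candidates = sub_graphs(fact)
queries = select(candidates, max_count=4, preferred_size=2)
```

With three nodes there are seven non-empty node sets. Asking for four sub-graphs of preferred size two returns the three two-node sub-graphs plus one sub-graph of the closest other size.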

In decision 414, if the preferences do not indicate the use of ontology, the method returns to decision 412. The use of ontology is based on a maximum number of generalizations and a maximum ontology depth exploration. If the preferences indicate the use of the ontology, generalizations of the extracted sub-graphs are computed as in step 415, wherein each generalization is a query. If the preferences indicate a maximum number of generalizations and a maximum ontology depth exploration, the generalizations are extracted based on the maximum number of generalizations, and all concepts as well as properties are generalized not more than the maximum ontology depth exploration. The maximum number of generalizations is a threshold at which the creation of new generalizations stops, while the maximum ontology depth exploration is a threshold that prevents the creation of generalizations that are too general. If the preferences only indicate a maximum number of generalizations, the generalizations are only extracted based on the maximum number of generalizations. If the preferences only indicate a maximum ontology depth exploration, all concepts and properties of the generalizations are generalized based on the maximum ontology depth exploration. If there are no preferences, all possible generalizations are extracted.

While embodiments of the invention have been illustrated and described, it is not intended that these embodiments illustrate and describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and various changes may be made without departing from the scope of the invention.
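The generalization step 415 of FIG. 3 can be illustrated with a short Python sketch. It assumes that the ontology is an acyclic mapping from each concept or property to its single direct parent and that a query is one (concept, property, concept) triple. The names `ancestors` and `generalizations` are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Query = Tuple[str, str, str]


def ancestors(term: str, parents: Dict[str, str],
              max_depth: Optional[int] = None) -> List[str]:
    """Return the term followed by its ancestors, at most max_depth levels up."""
    chain = [term]
    while term in parents and (max_depth is None or len(chain) - 1 < max_depth):
        term = parents[term]
        chain.append(term)
    return chain


def generalizations(query: Query, parents: Dict[str, str],
                    max_count: Optional[int] = None,
                    max_depth: Optional[int] = None) -> List[Query]:
    """Generalize concepts and properties, honouring both optional thresholds."""
    options = [ancestors(term, parents, max_depth) for term in query]
    result: List[Query] = []
    for combo in product(*options):
        if combo == tuple(query):
            continue  # the original query is not a generalization of itself
        if max_count is not None and len(result) >= max_count:
            break  # threshold to stop creating new generalizations
        result.append(combo)
    return result


# Hypothetical ontology: child -> direct parent, for concepts and properties.
ontology = {"Cat": "Mammal", "Mammal": "Animal", "eats": "consumes"}
query = ("Cat", "eats", "Fish")
all_generalizations = generalizations(query, ontology)
shallow = generalizations(query, ontology, max_depth=1)
limited = generalizations(query, ontology, max_count=2)
```

Without preferences the sketch produces every combination of ancestors. `max_depth=1` keeps each term within one level of the original, and `max_count=2` stops after two generalizations.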

Claims

1. A method for translating a knowledge base comprises the steps of:

a) extracting a first semantic network notation (SN1) from a first knowledge base;

b) translating SN1 into a second network notation (SN2) by using a series of SN1 to SN2 translation rules;

c) translating a SN2 into SN1 by using a series of SN2 to SN1 translation rules;

d) storing the translated SN1 in a second knowledge base; and

e) validating the translations.

2. The method for translating a knowledge base as claimed in claim 1, wherein validating the translations includes the steps of:

a) generating a set of queries based on SN1 from the first knowledge base and a set of preferences;

b) querying SN1 from the first knowledge base and the second knowledge base if there is a set of queries generated;

c) generating answers for each respective knowledge base;

d) comparing obtained answers from the first and second knowledge base results; and

e) storing generated query and the obtained answers in a repository if the second knowledge base results are different from the first knowledge base results.

3. The method for translating a knowledge base as claimed in claim 2, wherein generating the set of queries based on the KB SN1 Input from the input knowledge base and the set of preferences includes the steps of:

a) extracting facts from the first knowledge base;

b) computing sub-graphs of the facts by using depth-first or breadth-first backtrack algorithms if there is at least one fact extracted from the first knowledge base;

c) extracting a maximum number of determined sub-graphs based on the preferred size or based on the closest sizes if preferences indicate a maximum number of sub-graphs and a preferred size;

d) extracting the sub-graphs based on the maximum number indicated if the preferences only indicate the maximum number of sub-graphs;

e) extracting the sub-graphs based on the preferred size if the preferences only indicate the preferred size;

f) computing generalizations of extracted sub-graphs if preferences indicate the use of ontology;

g) extracting the generalizations based on the maximum number of generalizations and generalizing all concepts as well as properties not more than the maximum ontology depth exploration if the preferences indicates the maximum number of generalizations and the maximum ontology depth exploration;

h) extracting the generalizations based on the maximum number of generalizations if the preferences only indicate the maximum number of generalizations;

i) generalizing all concepts and properties of the generalizations based on the maximum ontology depth exploration if the preferences only indicate the maximum ontology depth exploration; and

j) extracting all possible generalizations if there are no preferences.