WO2015060709A1 - A method for translating a knowledge base - Google Patents

A method for translating a knowledge base

Info

Publication number
WO2015060709A1
Authority
WO
WIPO (PCT)
Prior art keywords
knowledge base
preferences
generalizations
maximum number
extracting
Prior art date
Application number
PCT/MY2014/000196
Other languages
French (fr)
Inventor
Ben Mohamed KHALIL
Lukose Dickson
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2015060709A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation

Abstract

The present invention relates to a method for translating a knowledge base. The method comprises the steps of extracting a first semantic network notation (SN1) from a first knowledge base; translating SN1 into a second semantic network notation (SN2) by using a series of SN1 to SN2 translation rules; translating SN2 back into SN1 by using a series of SN2 to SN1 translation rules; storing the translated SN1 in a second knowledge base; and validating the translations.

Description

A METHOD FOR TRANSLATING A KNOWLEDGE BASE
FIELD OF INVENTION
The present invention relates to a method for translating a knowledge base. Moreover, the present invention provides a method for validating the translations of the knowledge base between different semantic network notations.
BACKGROUND OF THE INVENTION
A knowledge base is an information repository that provides a means for information to be collected, organized, shared, searched and utilized. It can be either machine-readable or intended for human use. A knowledge base can be translated from one semantic network notation to another, for example from a knowledge base represented in a W3C vocabulary for web-based ontologies (W3C-KB) to an Extended Conceptual Graphs (ECG-KB) notation. Several methods exist for translating knowledge bases between different semantic network notations, but their results are manually validated by an expert.
An example of such a method is disclosed in US Publication No. 2013/0086100 A1, whereby the method includes receiving a request from a requestor to validate a data assemblage expressed in JavaScript Object Notation (JSON); translating the data assemblage expressed in JSON into an Extensible Markup Language (XML) instance; validating the XML instance using syntactic schema and semantic schema specifications; and sending a response to the requestor. Where the data assemblage contains invalid data in at least one field, the response includes an output array containing information for specifying valid data for the at least one field and a message explaining why the field is invalid. A system for performing the method is also described, as is a computer program product that can be used to execute the method. However, the data translation depends on the request and response of the user, and the translation is not automatically validated.
In another example, US Publication No. 2011/0314451 A1 discloses a method and system for validating translated files for inclusion in an application being developed. Translatable files having externalized content in a single base language are sent for translation into other languages. Translated files resulting from a translation of the translatable files are received. Each translated file is statically and dynamically validated to detect error(s). The static validation is based on comparing the translatable files to the translated files. The dynamic validation is based on a simulation of how a user interface of the application presents the externalized content, without including an actual presentation of the externalized content by the user interface. Modified translated files that correct the detected error(s) are received and provided for a presentation of the externalized content by the user interface. However, this prior art does not perform automated validation of the translations themselves: its static validation only compares the translatable files to the translated files, and its dynamic validation only simulates how the user interface of the application presents the externalized content, excluding an actual presentation of that content by the user interface.
Therefore, there is a need to provide a method to validate the translation of the knowledge base that addresses the above-mentioned drawbacks.
SUMMARY OF INVENTION
A method for translating a knowledge base comprises the steps of extracting a first semantic network notation (SN1) from a first knowledge base; translating SN1 into a second semantic network notation (SN2) by using a series of SN1 to SN2 translation rules; translating SN2 back into SN1 by using a series of SN2 to SN1 translation rules; storing the translated SN1 in a second knowledge base; and validating the translations. Preferably, the method for validating the translations includes the steps of generating a set of queries based on SN1 from the first knowledge base and a set of preferences; querying SN1 from the first knowledge base and the second knowledge base if a set of queries has been generated; generating answers for each respective knowledge base; comparing the answers obtained from the first and second knowledge bases; and storing the generated query and the obtained answers in a repository if the second knowledge base results are different from the first knowledge base results.
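The validation steps summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the claimed implementation: representing each knowledge base as a set of (subject, predicate, object) triples, representing a query as a triple pattern with `None` as a wildcard, and the names `answer` and `validate` are all assumptions made for the example.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

# Assumed representation: a knowledge base is a set of triples and a
# query is a triple pattern in which None matches anything.
Triple = Tuple[str, str, str]
Query = Tuple[Optional[str], Optional[str], Optional[str]]


def answer(kb: Iterable[Triple], query: Query) -> FrozenSet[Triple]:
    """Return every triple in kb that matches the query pattern."""
    return frozenset(
        t for t in kb
        if all(q is None or q == v for q, v in zip(query, t))
    )


def validate(first_kb: Iterable[Triple],
             second_kb: Iterable[Triple],
             queries: Iterable[Query]) -> List[tuple]:
    """Query both knowledge bases and keep each query whose answers differ."""
    first_kb, second_kb = set(first_kb), set(second_kb)
    repository = []
    for query in queries:
        first_answers = answer(first_kb, query)
        second_answers = answer(second_kb, query)
        if first_answers != second_answers:
            # A difference indicates information lost in the round trip.
            repository.append((query, first_answers, second_answers))
    return repository
```

An empty repository means that none of the generated queries could distinguish the original knowledge base from the round-tripped one; it does not by itself prove that no information was lost.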
Preferably, the method for generating the set of queries based on SN1 from the first knowledge base and the set of preferences includes the steps of extracting facts from the first knowledge base; computing sub-graphs of the facts by using depth-first or breadth-first backtrack algorithms if there is at least one fact extracted from the first knowledge base; extracting a maximum number of determined sub-graphs based on the preferred size or based on the closest sizes if the preferences indicate a maximum number of sub-graphs and a preferred size; extracting the sub-graphs based on the maximum number indicated if the preferences only indicate the maximum number of sub-graphs; extracting the sub-graphs based on the preferred size if the preferences only indicate the preferred size; computing generalizations of the extracted sub-graphs if the preferences indicate the use of ontology; extracting the generalizations based on the maximum number of generalizations and generalizing all concepts as well as properties not more than the maximum ontology depth exploration if the preferences indicate the maximum number of generalizations and the maximum ontology depth exploration; extracting the generalizations based on the maximum number of generalizations if the preferences only indicate the maximum number of generalizations; generalizing all concepts and properties of the generalizations based on the maximum ontology depth exploration if the preferences only indicate the maximum ontology depth exploration; and extracting all possible generalizations if there are no preferences.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
FIG. 1 illustrates a flowchart of a method for translating a knowledge base according to an embodiment of the present invention.
FIG. 2 illustrates a flowchart of the sub-steps for validating the translation of the knowledge base according to an embodiment of the present invention.
FIG. 3 illustrates a flowchart of the sub-steps for creating queries according to an embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT
A preferred embodiment of the present invention will be described herein below with reference to the accompanying drawings. In the following description, well known functions or constructions are not described in detail since they would obscure the description with unnecessary detail.
FIG. 1 shows a flowchart of a method for translating a knowledge base according to an embodiment of the present invention. The method translates a first semantic network notation (SN1) to a second semantic network notation (SN2). In step 200, SN1 is extracted from a first knowledge base and translated into SN2 by using a series of SN1 to SN2 translation rules. For example, the translation rules between the ECG and W3C notations are as follows: IF Concept c THEN addTriple "c rdf:type rdfs:Class"; IF Concepts c1 and c2 and c1 < c2 (c1 is a direct specialization of c2) THEN addTriple "c1 rdfs:subClassOf c2". In step 300, SN2 is translated back into SN1 using a series of SN2 to SN1 translation rules. The translated SN1 is stored in a second knowledge base. In step 400, the translations are validated and the information loss is returned. Information loss occurs when the expressivity of the source and target semantic notations differs, so that some information in the source notation cannot be represented in the target notation.
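The two example translation rules, together with their inverses for the return translation of step 300, can be illustrated with a small Python sketch. The `ConceptHierarchy` class and the functions `to_triples` and `from_triples` are hypothetical names introduced only for this example, and reading the subject of the `rdfs:subClassOf` triple as c1 is an assumption drawn from the phrase "c1 is a direct specialization of c2". A real ECG knowledge base holds far more than a concept hierarchy.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class ConceptHierarchy:
    """Assumed minimal stand-in for the concept part of an ECG knowledge base."""
    concepts: FrozenSet[str]
    specializations: FrozenSet[Tuple[str, str]]  # (c1, c2): c1 directly specializes c2


def to_triples(hierarchy: ConceptHierarchy) -> Set[Triple]:
    """Apply the two example SN1 to SN2 rules."""
    # Rule 1: IF Concept c THEN addTriple "c rdf:type rdfs:Class"
    triples = {(c, "rdf:type", "rdfs:Class") for c in hierarchy.concepts}
    # Rule 2: IF c1 < c2 THEN addTriple "c1 rdfs:subClassOf c2"
    triples |= {(c1, "rdfs:subClassOf", c2)
                for c1, c2 in hierarchy.specializations}
    return triples


def from_triples(triples: Set[Triple]) -> ConceptHierarchy:
    """Apply the inverse (SN2 to SN1) rules to rebuild the hierarchy."""
    concepts = frozenset(s for s, p, o in triples
                         if p == "rdf:type" and o == "rdfs:Class")
    specializations = frozenset((s, o) for s, p, o in triples
                                if p == "rdfs:subClassOf")
    return ConceptHierarchy(concepts, specializations)
```

For this restricted fragment the round trip is lossless, so `from_triples(to_triples(h)) == h` holds. The validation of step 400 targets the cases where the two notations differ in expressivity and such an equality fails.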
FIG. 2 shows a flowchart of the sub-steps for validating the translation of the knowledge base according to an embodiment of the present invention. The flowchart shows the steps of the validation as provided in step 400 of FIG. 1. In step 410, a set of queries is generated based on SN1 from the first knowledge base and a set of preferences. In decision 420, if no set of queries has been generated, the method ends. If a set of queries has been generated, SN1 from the first knowledge base and the second knowledge base are queried as in step 430, and answers are generated for each respective knowledge base. The answers obtained from the first and second knowledge bases in step 430 are then compared as in step 440. In decision 450, if the second knowledge base answers are different from the first knowledge base answers, the generated query and the obtained answers are stored in a repository as in step 460. If there is no difference between the answers, the method returns to step 420.
FIG. 3 shows a flowchart of the sub-steps for creating queries as provided in step 410 of FIG. 2. In step 411, facts from the first knowledge base are extracted. In decision 412, if no facts are extracted from the first knowledge base, the method ends. If at least one fact is extracted from the first knowledge base, sub-graphs of the facts are computed as in step 413 by using depth-first or breadth-first backtrack algorithms, where at each step one node of the graph is removed to form a new sub-graph. The sub-graphs extracted from the facts according to the preferences are the queries. If the preferences indicate a maximum number of sub-graphs and a preferred size, up to the maximum number of sub-graphs are extracted based on the preferred size, or based on the closest sizes if there are not enough sub-graphs of the preferred size. If the preferences only indicate a maximum number of sub-graphs, the sub-graphs are extracted based on the maximum number indicated. If the preferences only indicate a preferred size, the sub-graphs are extracted based on the preferred size. If there are no preferences, all possible sub-graphs are extracted.
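The sub-graph computation and the preference-based selection described for step 413 can be sketched as follows. This is a simplified illustration under stated assumptions: a fact is reduced to its set of node labels, a sub-graph is identified by the nodes it keeps, and the function names `sub_graphs` and `select_sub_graphs` and the parameters `max_count` and `preferred_size` are hypothetical. The enumeration is exponential in the number of nodes, which is one reason the preferences matter in practice.

```python
from typing import FrozenSet, Iterable, List, Optional


def sub_graphs(nodes: Iterable[str]) -> List[FrozenSet[str]]:
    """Depth-first enumeration: each step removes one node to form a new sub-graph."""
    seen = set()
    stack = [frozenset(nodes)]
    while stack:
        current = stack.pop()
        if not current or current in seen:
            continue  # skip the empty graph and sub-graphs already visited
        seen.add(current)
        for node in current:
            stack.append(current - {node})
    # Deterministic order: largest sub-graphs first, then alphabetical.
    return sorted(seen, key=lambda g: (-len(g), sorted(g)))


def select_sub_graphs(graphs: List[FrozenSet[str]],
                      max_count: Optional[int] = None,
                      preferred_size: Optional[int] = None) -> List[FrozenSet[str]]:
    """Apply the four preference cases described for step 413."""
    if max_count is not None and preferred_size is not None:
        # Preferred size first, then the closest sizes if there are not enough.
        ranked = sorted(graphs, key=lambda g: abs(len(g) - preferred_size))
        return ranked[:max_count]
    if max_count is not None:
        return graphs[:max_count]
    if preferred_size is not None:
        return [g for g in graphs if len(g) == preferred_size]
    return list(graphs)  # no preferences: all possible sub-graphs
```

For a fact with three nodes, `sub_graphs` yields seven non-empty sub-graphs, and `select_sub_graphs(..., max_count=4, preferred_size=2)` returns the three two-node sub-graphs followed by one sub-graph of the closest size.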
In decision 414, if the preferences do not indicate the use of ontology, the method returns to decision 412. The use of ontology is governed by a maximum number of generalizations and a maximum ontology depth exploration. If the preferences indicate the use of the ontology, generalizations of the extracted sub-graphs are computed as in step 415, wherein each generalization is a query. If the preferences indicate a maximum number of generalizations and a maximum ontology depth exploration, the generalizations are extracted based on the maximum number of generalizations, and all concepts as well as properties are generalized by no more than the maximum ontology depth exploration. The maximum number of generalizations is a threshold at which the creation of new generalizations stops, while the maximum ontology depth exploration is a threshold that prevents the creation of generalizations that are too general. If the preferences only indicate a maximum number of generalizations, the generalizations are extracted based only on the maximum number of generalizations. If the preferences only indicate a maximum ontology depth exploration, all concepts and properties of the generalizations are generalized based on the maximum ontology depth exploration. If there are no preferences, all possible generalizations are extracted.
While embodiments of the invention have been illustrated and described, it is not intended that these embodiments illustrate and describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and various changes may be made without departing from the scope of the invention.
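The generalization of step 415, bounded by the two thresholds, can be illustrated with the sketch below. It rests on assumptions not stated in the specification: the ontology is modelled as an acyclic child-to-parent mapping covering both concepts and properties, a query is a tuple of labels, and the names `ancestors`, `generalizations`, `max_count` and `max_depth` are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence, Tuple


def ancestors(label: str, parent: Dict[str, str],
              max_depth: Optional[int] = None) -> List[str]:
    """Return [label, parent, grandparent, ...], climbing at most max_depth levels.

    Assumes the parent mapping is acyclic.
    """
    chain = [label]
    while label in parent and (max_depth is None or len(chain) - 1 < max_depth):
        label = parent[label]
        chain.append(label)
    return chain


def generalizations(query: Sequence[str], parent: Dict[str, str],
                    max_count: Optional[int] = None,
                    max_depth: Optional[int] = None) -> List[Tuple[str, ...]]:
    """Replace concepts and properties by ancestors, within both thresholds."""
    options = [ancestors(label, parent, max_depth) for label in query]
    result = []
    for candidate in product(*options):
        if candidate == tuple(query):
            continue  # the original query is not a generalization of itself
        if max_count is not None and len(result) >= max_count:
            break  # threshold reached: stop creating new generalizations
        result.append(candidate)
    return result
```

With `parent = {"Cat": "Mammal", "Mammal": "Animal", "owns": "relatedTo"}`, the query `("Cat", "owns", "Person")` has five generalizations without preferences and three when `max_depth=1`, because `Cat` can then only be lifted to `Mammal`.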

Claims

1. A method for translating a knowledge base, comprising the steps of:
a) extracting a first semantic network notation (SN1) from a first knowledge base;
b) translating SN1 into a second semantic network notation (SN2) by using a series of SN1 to SN2 translation rules;
c) translating SN2 into SN1 by using a series of SN2 to SN1 translation rules;
d) storing the translated SN1 in a second knowledge base; and
e) validating the translations.
2. The method for translating a knowledge base as claimed in claim 1, wherein validating the translations includes the steps of:
a) generating a set of queries based on SN1 from the first knowledge base and a set of preferences;
b) querying SN1 from the first knowledge base and the second knowledge base if a set of queries has been generated;
c) generating answers for each respective knowledge base;
d) comparing the answers obtained from the first and second knowledge bases; and
e) storing the generated query and the obtained answers in a repository if the second knowledge base results are different from the first knowledge base results.
3. The method for translating a knowledge base as claimed in claim 2, wherein generating the set of queries based on SN1 from the first knowledge base and the set of preferences includes the steps of:
a) extracting facts from the first knowledge base;
b) computing sub-graphs of the facts by using depth-first or breadth-first backtrack algorithms if there is at least one fact extracted from the first knowledge base;
c) extracting a maximum number of determined sub-graphs based on the preferred size or based on the closest sizes if the preferences indicate a maximum number of sub-graphs and a preferred size;
d) extracting the sub-graphs based on the maximum number indicated if the preferences only indicate the maximum number of sub-graphs;
e) extracting the sub-graphs based on the preferred size if the preferences only indicate the preferred size;
f) computing generalizations of the extracted sub-graphs if the preferences indicate the use of ontology;
g) extracting the generalizations based on the maximum number of generalizations and generalizing all concepts as well as properties not more than the maximum ontology depth exploration if the preferences indicate the maximum number of generalizations and the maximum ontology depth exploration;
h) extracting the generalizations based on the maximum number of generalizations if the preferences only indicate the maximum number of generalizations;
i) generalizing all concepts and properties of the generalizations based on the maximum ontology depth exploration if the preferences only indicate the maximum ontology depth exploration; and
j) extracting all possible generalizations if there are no preferences.
PCT/MY2014/000196 2013-10-21 2014-06-26 A method for translating a knowledge base WO2015060709A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2013701985 2013-10-21
MYPI2013701985A MY168821A (en) 2013-10-21 2013-10-21 A method for translating a knowledge base

Publications (1)

Publication Number Publication Date
WO2015060709A1 (en) 2015-04-30

Family

ID=51703372

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2014/000196 WO2015060709A1 (en) 2013-10-21 2014-06-26 A method for translating a knowledge base

Country Status (2)

Country Link
MY (1) MY168821A (en)
WO (1) WO2015060709A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110314451A1 (en) 2010-06-16 2011-12-22 International Business Machines Corporation Validating translations of externalized content for inclusion in an application
US20130086100A1 (en) 2011-09-29 2013-04-04 International Business Machines Corporation Method and System Providing Document Semantic Validation and Reporting of Schema Violations

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JEAN-FRANÇOIS BAGET ET AL: "Translations between RDF(S) and Conceptual Graphs", 26 July 2010, CONCEPTUAL STRUCTURES: FROM INFORMATION TO INTELLIGENCE, SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 28 - 41, ISBN: 978-3-642-14196-6, XP019146411 *
OLIVIER CORBY ET AL: "A Conceptual Graph Model for W3C Resource Description Framework", 31 January 2006, CONCEPTUAL STRUCTURES: LOGICAL, LINGUISTIC, AND COMPUTATIONAL ISSUES; LECTURE NOTES IN COMPUTER SCIENCE; LECTURE NOTES IN ARTIFICIAL INTELLIGENCE; LNCS, SPRINGER, BERLIN, DE, PAGE(S) 468 - 482, ISBN: 978-3-540-67859-5, XP019049201 *

Also Published As

Publication number Publication date
MY168821A (en) 2018-12-04

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14784124; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 14784124; Country of ref document: EP; Kind code of ref document: A1)