WO2014069983A2 - A system and method for distributed querying of linked semantic webs - Google Patents
A system and method for distributed querying of linked semantic webs
- Publication number
- WO2014069983A2 (application PCT/MY2013/000177)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- queries
- sub
- query
- index
- ontologies
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/81—Indexing, e.g. XML tags; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/83—Querying
- G06F16/835—Query processing
- G06F16/8365—Query optimisation
Abstract
A system (100) for distributed querying of linked semantic webs (110) comprising at least one LOD ontologies index (120) comprising LOD ontologies and metadata relating to said LOD ontologies; at least one concept index (130) comprising concepts and corresponding URIs; at least one relation index (140) comprising relations and corresponding URIs; query interface (160) for entering SPARQL queries; and a distributed query engine (150) in communication with said LOD ontologies index (120), concept index (130) and relations index (140) and adapted to receive queries from said query interface (160); characterised in that said distributed query engine (150) is adapted to: parse and rewrite queries received from said query interface (160) and generate a plurality of sub-queries; identify dependencies within said sub-queries and chunk sub-queries based on ontology; execute sub-queries by sending to relevant source ontology; and merge results obtained from execution of said sub-queries.
Description
A SYSTEM AND METHOD FOR DISTRIBUTED QUERYING OF LINKED SEMANTIC WEBS
FIELD OF INVENTION
The present invention relates to a system and method for distributed querying of linked semantic webs.
BACKGROUND ART
In recent years, there has been unprecedented growth in the amount of publicly available semantic information. This information is encoded in RDF form in dozens of ontologies and linked together to form the Linked Open Data (LOD) cloud. However, the ability to treat the entire LOD as a single super World Wide Semantic Web for the purpose of querying and interfacing is still lacking. The only method available to query the ontologies that make up an interconnected public semantic web, such as the LOD cloud, is to use the public SPARQL endpoints provided. According to this method, each ontology is queried separately and the user needs to know at least its basic internal structure before a query can be made.
As noted above, in currently available methodology each of the ontologies is queried separately and the results are therefore not aggregated, and the user also needs to know intimate details of each ontology in order to make a query. For example, to determine the population of Malaysia, the user needs to at least know: (i) which ontology might have the knowledge; (ii) how to query the ontology, that is, whether it has an endpoint; (iii) what the URL for Malaysia is in the ontology; and (iv) the property that is used to represent population.
As an example, United States Patent No. 5,600,329 describes a database system that provides independence between the query and physical structure of the database tables by captioning each database table with a partial query reflecting the contents of that table. In particular, the partial query is a query that if applied to a larger database of a standard configuration would produce the data of the table. Relevant tables for a particular query may be identified by piecing together the partial queries until the user query is obtained. As described in this patent, the database system may be integrated with an optimizer by comparing each of the identified tables against the others for the amount of overlap their sub-queries have with the user query and the cost of accessing the table and then repeating this process as the tables are joined in various combinations.
Other join processing and grouping techniques have been proposed to minimize the number of remote requests required, and to develop an effective solution for source selection in the absence of pre-processed metadata. In particular, frameworks have been proposed that enable SPARQL query processing on heterogeneous, virtually integrated Linked Data sources.
The present invention, at least in certain embodiments, advantageously provides a way for a user to perform SPARQL queries on a set of linked ontologies without needing to know the names of the ontologies, the location at which they are stored, or how they are internally structured.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
SUMMARY OF INVENTION
The present invention relates to a system and method for distributed querying of linked semantic webs.
One aspect of the present invention provides a system (100) for distributed querying of linked semantic webs (110) comprising at least one LOD ontologies index (120) comprising LOD ontologies and metadata relating to the LOD ontologies; at least one concept index (130) comprising concepts and corresponding URIs; at least one relation index (140) comprising relations and corresponding URIs; at least one query interface (160) for entering SPARQL queries; and a distributed query engine (150) in communication with the LOD ontologies index (120), concept index (130) and relations index (140) and adapted to receive queries from the query interface (160). The distributed query engine (150) is adapted to parse and rewrite queries received from the query interface (160) and generate a plurality of sub-queries; identify dependencies within the sub-queries and chunk sub-queries based on ontology; execute sub-queries by sending to relevant source ontology; and merge results obtained from execution of the sub-queries.
In another aspect the invention provides a system (100) wherein the metadata included in the LOD ontologies index comprises one or more of namespace(s), vocabulary used (RDF, OWL, SKOS, etc.), properties and domain ranges information, and SPARQL endpoint.
In a further aspect the invention provides a system wherein the metadata included in the LOD ontologies index is in the form of a database table, knowledge base and/or text file.
In yet another aspect the invention provides a system wherein the concept index and the relation index are incorporated into a single index.
In still another aspect the invention provides a system wherein the concept index includes classes and instances.
In a further aspect the invention provides a method (200) for distributed querying of linked semantic webs (110) comprising receiving an initial query from a user (210); parsing the query (220) and replacing generic terms (230); breaking the queries into sub-queries and chunking the sub-queries (240); executing the sub-queries (250) based on ontology; and merging the results obtained (260) to determine whether an answer is reached (270) and, if so, returning the answer (280) to the user. The method for breaking the queries into sub-queries and chunking the sub-queries (240) further comprises steps of selecting a clause in a query (241); determining the ontology of terms (242) and variables in the query clause (243); determining whether a variable is dependent (244); if the variable is not dependent, identifying any other clauses querying the same ontology (245) and grouping the clauses into a sub-query (246) or, if there are no other clauses querying the same ontology, establishing a new sub-query (247); if the variable is dependent, determining whether the dependent clause queries the same ontology (248) and, if the dependent clause does query the same ontology, grouping it into a sub-query (246) and, if not, sequencing it as a sub-query after the dependent clause (249).
In another aspect the invention provides a method wherein, if an answer is not reached, the steps of the method are repeated, other than the step of receiving the initial query. In yet another aspect the invention provides a method wherein the step of parsing the query (220) and replacing generic terms (230) comprises checking concepts and/or relations and replacing generic terms with their actual URIs.
In a further aspect the invention provides a method wherein the process of replacing generic terms (230) comprises determining whether a term is generic or not (232); if the term is generic, determining whether or not the term is a concept or relation (234), searching a concept index (235) or a relation index (236); and replacing the term with its actual URI (237), reiterating the steps until all generic terms are replaced.
In still another aspect of the invention there is provided a method comprising repeating the steps of the immediately preceding paragraph until all clauses are included in the chunked sub-queries.
The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:
FIG. 1 illustrates the top level architecture of an embodiment of the invention.
FIG. 2 illustrates a flowchart for a querying process according to an embodiment of the invention.
FIG. 3 illustrates a replace generic terms flowchart of an embodiment of the invention.
FIG. 4 illustrates a sub-query chunking flowchart of an embodiment of the invention.
Table 1 shows an example of a concept index (130) with three concepts in it.
Table 2 shows an example of a relations index (140) with three relations in it.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a system and method for distributed querying of linked semantic webs.
Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that modifications may be made without departing from the scope of the appended claims.
Referring to Figure 1, a system (100) for distributed querying of linked semantic webs, particularly the LOD (110), is illustrated. The system (100) includes a number of modules, each of which will be discussed below. The system (100) includes a LOD metadata module (120) that is provided with an index of ontologies that are included within the LOD. The metadata may include, but is not limited to, namespace(s), vocabulary used (RDF, OWL, SKOS, etc.), properties and domain ranges information, and SPARQL endpoint. The metadata may be provided in any suitable form, for example a database table, knowledge base, text file and so on.
A concept index (130) is also provided that includes an index of concepts, such as classes and instances, and their actual uniform resource identifier (URI) details. Each unique concept included in the index can appear in multiple ontologies with different URIs. Table 1 shows an example of a concept index (130) with three concepts in it.
In addition, the system (100) includes a relation index (140) that includes an index of relations and their actual URIs. Again, each of the unique relations can appear in multiple ontologies with different URIs. Table 2 shows an example of a relations index (140) with three relations in it.
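The three indexes described above can be pictured as simple lookup structures. The following Python sketch is illustrative only: Tables 1 and 2 are not reproduced in this text, so the index entries are taken from the worked Malaysia example later in the description, and the class name, field names and example.org endpoint URLs are assumptions rather than part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class OntologyMetadata:
    """One entry of the LOD ontologies index (120). Field names are hypothetical."""
    name: str
    namespaces: list
    vocabularies: list
    sparql_endpoint: str


# LOD ontologies index (120); the endpoint URLs are placeholders, not real services.
ONTOLOGY_INDEX = [
    OntologyMetadata("dbpedia", ["dbp", "dbpprop"], ["RDF", "OWL"],
                     "http://example.org/dbpedia/sparql"),
    OntologyMetadata("geonames", ["geonames"], ["RDF"],
                     "http://example.org/geonames/sparql"),
]

# Concept index (130): one generic concept maps to a URI in each ontology.
CONCEPT_INDEX = {
    "Malaysia": ["dbp:Malaysia", "geonames:Malaysia", "geoinfo:Malaysia"],
}

# Relation index (140): one generic relation maps to a URI in each ontology.
RELATION_INDEX = {
    "population": ["dbpprop:populationCensus", "geonames:population",
                   "geoinfo:populationTotal"],
}


def lookup(index, term):
    """Return every known URI for a generic term, or an empty list."""
    return list(index.get(term, []))
```

A single dictionary holding both concepts and relations would correspond to the aspect in which the two indexes are incorporated into one.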
In addition to the above-mentioned components, the system (100) includes a distributed query engine (150) which is adapted to receive a query from a user at a query interface (160) and to break the query down into sub-queries. The sub-queries may be parallel or sequential, based on dependencies of the sub-queries. Once the query is broken down into these sub-queries, the distributed query engine (150) searches the LOD metadata module (120) for all ontologies that may be able to provide answers for each sub-query. This search may, for example, include semantic matching of query terms to the properties and concepts in each of the ontologies. If there are a number of relevant ontologies for a particular sub-query, then that sub-query is sent to all of the matching ontologies. The distributed query engine (150) then merges the answers received for each of the sub-queries and forms a final answer to the query.
A flowchart illustrating the querying process (200) employed by the distributed query engine (150) is provided in Figure 2. Referring to Figure 2, an initial query is received (210) by the distributed query engine (150). The query is then parsed (220) and generic terms replaced (230), as described in more detail below with reference to Figure 3. Briefly, during this process the distributed query engine (150) checks the concept index (130) and relation index (140) and replaces generic terms with their actual URIs. Once this process is completed, the queries are chunked into sub-queries (240), as discussed above and described in more detail below with reference to Figure 4, and the sub-queries executed (250). The results of the sub-queries are merged (260) to determine whether an answer is reached (270) and, if so, the answer is returned (280) to the user. If an answer is not reached, the process may be repeated.
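As a rough illustration of the source selection and merging just described, the sketch below matches the namespace prefixes used in a sub-query against the metadata of each ontology, then merges the returned rows while dropping duplicates. Prefix matching is a deliberately simplified stand-in for the semantic matching mentioned above, and the dictionary layout, function names and example.org endpoints are assumptions.

```python
# Hypothetical metadata entries; the endpoint URLs are placeholders.
ONTOLOGY_INDEX = [
    {"name": "dbpedia", "prefixes": ["dbp", "dbpprop"],
     "endpoint": "http://example.org/dbpedia/sparql"},
    {"name": "geonames", "prefixes": ["geonames"],
     "endpoint": "http://example.org/geonames/sparql"},
]


def select_endpoints(sub_query_terms, ontology_index):
    """Return the endpoint of every ontology whose prefix occurs in the sub-query."""
    prefixes = {
        term.split(":", 1)[0]
        for term in sub_query_terms
        if ":" in term and not term.startswith("?")
    }
    return [meta["endpoint"] for meta in ontology_index
            if prefixes & set(meta["prefixes"])]


def merge_results(result_sets):
    """Combine rows from several endpoints, keeping first-seen order, no duplicates."""
    merged, seen = [], set()
    for rows in result_sets:
        for row in rows:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

If several ontologies match, `select_endpoints` returns all of them, mirroring the statement that a sub-query is sent to every matching ontology.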
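The overall flow of Figure 2 can be summarised as a small driver loop. This is only a sketch: the five stage functions are passed in as parameters because their internals are described separately, and the `max_rounds` limit is an assumption added here so that the "process may be repeated" step cannot loop forever.

```python
def answer_query(query, parse, replace_generic_terms, chunk, execute, merge,
                 max_rounds=3):
    """Run steps 220-280 of the querying process; return None if no answer is found."""
    for _ in range(max_rounds):
        clauses = parse(query)                          # step 220
        candidates = replace_generic_terms(clauses)     # step 230
        sub_queries = chunk(candidates)                 # step 240
        results = [execute(sq) for sq in sub_queries]   # step 250
        answer = merge(results)                         # step 260
        if answer:                                      # step 270
            return answer                               # step 280
    return None
```

The initial query is received once (210), outside the loop, which matches the aspect stating that the steps are repeated other than the step of receiving the initial query.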
Referring to Figure 3, the process of replacing generic terms (230) is illustrated. In this process, a term enters the process (231) and it is determined whether the term is generic or not (232). If the term is not generic, it is added to the queries (233). If the term is identified as being generic, it is determined whether or not the term is a concept (234). If the term is considered a concept, the concept index is searched (235) and, if not, the relation index is searched (236). Once searching is complete, the term is replaced in the query with its actual URI (237). The process then identifies any further terms requiring consideration (238) or ends to provide a list of queries.
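The per-term decision flow of Figure 3 might be sketched as follows. The test used for "generic" (a term that is neither a variable nor a prefixed name) and the use of index membership to decide between concept and relation are assumptions; the description does not state how these checks are made.

```python
def resolve_term(term, concept_index, relation_index):
    """Return the candidate replacements for one query term (steps 232 to 237)."""
    # Step 232 (assumed rule): variables and prefixed names are not generic.
    if term.startswith("?") or ":" in term:
        return [term]                        # step 233: keep the term unchanged
    # Step 234 (assumed rule): a term found in the concept index is a concept.
    if term in concept_index:
        return list(concept_index[term])     # steps 235 and 237
    if term in relation_index:
        return list(relation_index[term])    # steps 236 and 237
    return [term]                            # unknown generic term is left as-is


def resolve_clause(clause, concept_index, relation_index):
    """Apply resolve_term to each term of a clause (the loop closed by step 238)."""
    return [resolve_term(t, concept_index, relation_index) for t in clause]
```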
As an example, the query "What is the population of Malaysia?" can be written as a generic SPARQL query, such as:
SELECT ?population WHERE
{
Malaysia population ?population
}
The methodology of the invention identifies the generic concept, Malaysia, and searches the concept index (130). By replacing the concept Malaysia with URIs from the concept index (130), a total of three possible queries are formed:
SELECT ?population WHERE
{
dbp:Malaysia population ?population
}
SELECT ?population WHERE
{
geonames:Malaysia population ?population
}
SELECT ?population WHERE
{
geoinfo:Malaysia population ?population
}
Next, the generic term population is identified and the relation index (140) is searched for matches. This returns another three URIs. Replacing the generic term in each of the queries generated above produces a total of nine different queries, some of which include:
SELECT ?population WHERE
{
dbp:Malaysia dbpprop:populationCensus ?population
}
SELECT ?population WHERE
{
geonames:Malaysia geonames:population ?population
}
SELECT ?population WHERE
{
geoinfo:Malaysia geoinfo:populationTotal ?population
}
These are the legitimate SPARQL queries that can be sent to the various SPARQL endpoints in the ontology index provided in the LOD metadata module (120).
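The count of nine follows from pairing each of the three concept URIs with each of the three relation URIs. A minimal sketch of that expansion is given below; the function names are hypothetical, and the generated strings follow the abbreviated form used in this description (a real endpoint would also need PREFIX declarations).

```python
from itertools import product

CONCEPT_INDEX = {
    "Malaysia": ["dbp:Malaysia", "geonames:Malaysia", "geoinfo:Malaysia"],
}
RELATION_INDEX = {
    "population": ["dbpprop:populationCensus", "geonames:population",
                   "geoinfo:populationTotal"],
}


def expand_clause(clause, concept_index, relation_index):
    """Return every (subject, predicate, object) combination for one generic clause."""
    subject, predicate, obj = clause
    subjects = concept_index.get(subject, [subject])
    predicates = relation_index.get(predicate, [predicate])
    return [(s, p, obj) for s, p in product(subjects, predicates)]


def to_sparql(clause):
    """Render one clause as a single-line query selecting its variables."""
    variables = " ".join(term for term in clause if term.startswith("?"))
    s, p, o = clause
    return f"SELECT {variables} WHERE {{ {s} {p} {o} }}"


queries = expand_clause(("Malaysia", "population", "?population"),
                        CONCEPT_INDEX, RELATION_INDEX)
```

Here `queries` holds nine clauses, including cross-ontology pairings such as `dbp:Malaysia` with `geonames:population`, which are the combinations the engine may attempt in parallel.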
The methodology of embodiments of the invention may attempt to execute all possible query combinations in parallel. However, in some instances there are dependencies between SPARQL clauses. In such cases, the SPARQL queries must be executed in series. When there are dependencies, the clauses are rearranged and grouped into possible sub-queries. Information from the LOD metadata module (120) is used to determine which parts of the query can be resolved by querying a single ontology and which has to be distributed.
Referring to Figure 4, the chunking process (240) involves a clause in the query being selected (241). The ontology of terms is determined (242) and variables in the query clause obtained (243). Once obtained, the process determines whether the variable is dependent (244). If the variable is not dependent, the process checks whether any other clauses query the same ontology (245); if so, the clauses are grouped into a sub-query (246) and, if not, a new sub-query is established (247). If the variable is determined to be dependent, the process involves determining whether the dependent clause queries the same ontology (248). If the dependent clause does query the same ontology, it is grouped into a sub-query (246) and, if not, it is sequenced as a sub-query after the dependent clause (249). This process may be repeated as necessary until there are no more clauses to provide the chunked sub-queries.
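One possible reading of the chunking flow of Figure 4 is sketched below. Several points are assumptions rather than statements of the disclosure: clauses are (subject, predicate, object) tuples, the ontology of a clause is taken from the prefix of its predicate, a clause is treated as dependent when its subject is a variable bound as the object of an earlier clause, and clauses arrive ordered so that a binding clause precedes the clauses that use it.

```python
# Assumed mapping from property prefixes to their ontology.
PREFIX_TO_ONTOLOGY = {"dbpprop": "dbp"}


def ontology_of(clause):
    """Step 242 (assumed rule): derive the ontology from the predicate prefix."""
    prefix = clause[1].split(":", 1)[0]
    return PREFIX_TO_ONTOLOGY.get(prefix, prefix)


def chunk(clauses):
    """Group clauses into sub-queries; 'after' is the index of a prerequisite."""
    sub_queries = []   # each: {"ontology": str, "clauses": list, "after": int or None}
    producer = {}      # variable -> index of the sub-query that binds it
    for clause in clauses:                                      # step 241
        subject, _, obj = clause
        onto = ontology_of(clause)                              # step 242
        dep = producer.get(subject) if subject.startswith("?") else None  # 243, 244
        if dep is not None and sub_queries[dep]["ontology"] == onto:      # step 248
            target = dep                                        # step 246
        elif dep is not None:
            sub_queries.append({"ontology": onto, "clauses": [], "after": dep})
            target = len(sub_queries) - 1                       # step 249
        else:
            target = next((i for i, sq in enumerate(sub_queries)
                           if sq["ontology"] == onto and sq["after"] is None),
                          None)                                 # step 245
            if target is None:
                sub_queries.append({"ontology": onto, "clauses": [], "after": None})
                target = len(sub_queries) - 1                   # step 247
        sub_queries[target]["clauses"].append(clause)
        if obj.startswith("?"):
            producer.setdefault(obj, target)
    return sub_queries
```

Sub-queries whose `after` field is `None` can be dispatched together, while the others wait for the result of the sub-query they point to.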
For example, given the input query "What is the capital and population of Malaysia?", the following query may be obtained:
SELECT ?capital ?population WHERE
{
Malaysia capital_city ?capital
Malaysia population ?population
}
After replacement of generic terms (230), the query below is generated:
SELECT ?capital ?population WHERE
{
dbp:Malaysia dbpprop:capital ?capital
geonames:Malaysia geonames:population ?population
}
As the two clauses in the query are independent of each other, the query can be executed in parallel and, in this case, against two different ontologies.
If, on the other hand, the input query is "What is the population of the capital of Malaysia?", the following queries may be obtained:
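Because the two clauses do not share variables, each can be sent to its own endpoint at the same time. The sketch below uses a thread pool for this; `fake_execute` is a hypothetical stand-in for an HTTP call to a SPARQL endpoint, and the population value it returns is a placeholder string, not real data.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(sub_queries, execute):
    """Execute independent sub-queries concurrently and merge their bindings."""
    with ThreadPoolExecutor(max_workers=max(1, len(sub_queries))) as pool:
        results = list(pool.map(execute, sub_queries))
    merged = {}
    for bindings in results:
        merged.update(bindings)
    return merged


def fake_execute(clause):
    """Hypothetical executor returning canned bindings instead of querying a server."""
    canned = {
        ("dbp:Malaysia", "dbpprop:capital", "?capital"):
            {"?capital": "dbp:Kuala_Lumpur"},
        ("geonames:Malaysia", "geonames:population", "?population"):
            {"?population": "POPULATION_PLACEHOLDER"},
    }
    return canned[clause]


answer = run_parallel(
    [("dbp:Malaysia", "dbpprop:capital", "?capital"),
     ("geonames:Malaysia", "geonames:population", "?population")],
    fake_execute,
)
```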
SELECT ?capital ?population WHERE
{
Malaysia capital_city ?capital
?capital population ?population
}
After replacement of generic terms (230), the query below is generated:
SELECT ?capital ?population WHERE
{
dbp:Malaysia dbpprop:capital ?capital
?capital geonames:population ?population
}
In this case, the second clause is dependent on the first. As such, this query must be executed in sequence, first identifying the capital of Malaysia by executing the following sub-query:
SELECT ?capital WHERE
{
dbp:Malaysia dbpprop:capital ?capital
}
The result (dbp:Kuala_Lumpur) is then substituted for the variable and a second sub-query is generated:
SELECT ?population WHERE
{
geonames:Kuala_Lumpur geonames:population ?population
}
This sub-query may be executed to arrive at an answer to the original query.
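The sequential case can be sketched as a loop that substitutes the bindings of each completed sub-query into the next one before it is sent. This is a hedged illustration only: `substitute`, `to_sparql`, `run_in_sequence` and `fake_execute` are hypothetical names, the endpoint is a stub, and the population value is a placeholder. The sketch inserts the returned identifier unchanged; it does not model how a result from one ontology is translated into the corresponding identifier of another ontology.

```python
def substitute(clauses, bindings):
    """Replace already-bound variables in triple clauses with their values."""
    return [tuple(bindings.get(term, term) for term in clause)
            for clause in clauses]


def to_sparql(clauses):
    """Render triple clauses as a SELECT over the variables still unbound."""
    variables = sorted({t for c in clauses for t in c if t.startswith("?")})
    body = " . ".join(" ".join(c) for c in clauses)
    return f"SELECT {' '.join(variables)} WHERE {{ {body} }}"


def run_in_sequence(stages, execute):
    """Run dependent sub-queries in order, feeding results forward."""
    bindings = {}
    for clauses in stages:
        query = to_sparql(substitute(clauses, bindings))
        bindings.update(execute(query))
    return bindings


sent = []  # records the queries actually dispatched, for inspection


def fake_execute(query):
    """Stub endpoint returning canned bindings (values are placeholders)."""
    sent.append(query)
    if "dbpprop:capital" in query:
        return {"?capital": "dbp:Kuala_Lumpur"}
    return {"?population": "POPULATION_PLACEHOLDER"}


stages = [
    [("dbp:Malaysia", "dbpprop:capital", "?capital")],
    [("?capital", "geonames:population", "?population")],
]
result = run_in_sequence(stages, fake_execute)
```

After the run, `sent` holds the two queries in the order in which they were dispatched; the second no longer contains the variable `?capital`, because the binding from the first has been inserted in its place.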
Unless the context requires otherwise or it is specifically stated to the contrary, integers, steps or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.
Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term "comprising" is used in an inclusive sense and thus should be understood as meaning "including principally, but not necessarily solely".
It will be appreciated that the foregoing description has been given by way of illustrative example of the invention and that all such modifications and variations thereto as would be apparent to persons of skill in the art are deemed to fall within the broad scope and ambit of the invention as herein set forth.
Claims
1. A system (100) for distributed querying of linked semantic webs (110), the system comprising:
at least one LOD ontologies index (120) comprising LOD ontologies and metadata relating to said LOD ontologies;
at least one concept index (130) comprising concepts and corresponding URIs;
at least one relation index (140) comprising relations and corresponding URIs;
at least one query interface (160) for entering SPARQL queries; and
a distributed query engine (150) in communication with said LOD ontologies index (120), concept index (130) and relation index (140) and adapted to receive queries from said query interface (160);
characterised in that said distributed query engine (150) further has means to:
parse and rewrite queries received from said query interface (160) and generate a plurality of sub-queries;
identify dependencies within said sub-queries and chunk sub-queries based on ontology;
execute sub-queries by sending to relevant source ontology; and
merge results obtained from execution of said sub-queries.
2. A system (100) according to claim 1, wherein said metadata included in said LOD ontologies index comprises one or more of namespace(s), vocabulary used (RDF, OWL, SKOS, etc.), properties and domain ranges information, and SPARQL endpoint.
3. A system according to claim 1 or 2, wherein said metadata included in said LOD ontologies index is in the form of a database table, knowledge base and/or text file.
4. A system according to any of claim 1, wherein said concept index and said relation index are incorporated into a single index.
5. A system according to claim 1, wherein said concept index includes classes and instances.
6. A method (200) for distributed querying of linked semantic webs (110), the method comprising steps of:
receiving an initial query from a user (210);
parsing said query (220) and replacing generic terms (230);
breaking said queries into sub-queries and chunking said sub-queries (240);
executing said sub-queries (250) based on ontology; and
merging the results obtained (260) to determine whether an answer is reached (270) and, if so, returning the answer (280) to the user, characterized in that
breaking said queries into sub-queries and chunking said sub-queries (240) further comprises steps of:
selecting a clause in a query (241);
determining the ontology of terms (242) and variables in the query clause (243);
determining whether a variable is dependent (244);
if the variable is not dependent, identifying any other clauses querying the same ontology (245) and grouping said clauses into a sub-query (246) or, if there are no other clauses querying the same ontology, establishing a new sub-query (247);
if the variable is dependent, determining whether the dependent clause queries the same ontology (248) and, if the dependent clause does query the same ontology, grouping it into a sub-query (246) and, if not, sequencing it as a sub-query after the dependent clause (249).
7. A method according to claim 6, wherein merging the results obtained (260) to determine whether an answer is reached (270) further comprises repeating the steps of the method if an answer is not reached, other than said step of receiving said initial query.
8. A method according to claim 6, wherein said step of parsing said query (220) and replacing generic terms (230) further comprises checking concepts and/or relations and replacing generic terms with their actual URIs.
9. A method according to claim 8, wherein said step of replacing generic terms (230) further comprises steps of:
determining whether a term is generic or not (232);
if the term is generic, determining whether or not the term is a concept or relation (234);
searching a concept index (235) or a relation index (236); and
replacing the term with its actual URI (237) and reiterating the steps until all generic terms are replaced.
10. A method according to claim 6, further comprising repeating the steps of claim 6 until all clauses are included in said chunked sub-queries when breaking said queries into sub-queries and chunking said sub-queries (240).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2012004796A MY164083A (en) | 2012-11-01 | 2012-11-01 | A system and method for distributed querying of linked semantic webs |
MYPI2012004796 | 2012-11-01 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2014069983A2 (en) | 2014-05-08 |
WO2014069983A3 (en) | 2014-12-04 |
Family
ID=49551726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/MY2013/000177 WO2014069983A2 (en) | 2012-11-01 | 2013-09-30 | A system and method for distributed querying of linked semantic webs |
Country Status (2)
Country | Link |
---|---|
MY (1) | MY164083A (en) |
WO (1) | WO2014069983A2 (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002080026A1 (en) * | 2001-03-30 | 2002-10-10 | British Telecommunications Public Limited Company | Global database management system integrating heterogeneous data resources |
US20040243595A1 (en) * | 2001-09-28 | 2004-12-02 | Zhan Cui | Database management system |
Non-Patent Citations (2)
Title |
---|
BASTIAN QUILITZ ET AL: "Querying Distributed RDF Data Sources with SPARQL", 3 June 2007 (2007-06-03), THE SEMANTIC WEB: RESEARCH AND APPLICATIONS; [LECTURE NOTES IN COMPUTER SCIENCE], SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 524 - 538, XP019075716, ISBN: 978-3-540-68233-2 abstract; figure 1 page 2 - page 7 * |
Ian C Millard ET AL: "Consuming multiple linked data sources: Challenges and Experiences", , 7 November 2010 (2010-11-07), pages 1-12, XP055128961, Retrieved from the Internet: URL:http://eprints.soton.ac.uk/271681/1/cold2010-paper16-camera-ready.pdf [retrieved on 2014-07-15] * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111339334A (en) * | 2020-02-11 | 2020-06-26 | 支付宝(杭州)信息技术有限公司 | Data query method and system for heterogeneous graph database |
CN111339334B (en) * | 2020-02-11 | 2023-04-07 | 支付宝(杭州)信息技术有限公司 | Data query method and system for heterogeneous graph database |
Also Published As
Publication number | Publication date |
---|---|
WO2014069983A3 (en) | 2014-12-04 |
MY164083A (en) | 2017-11-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13786765 Country of ref document: EP Kind code of ref document: A2 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13786765 Country of ref document: EP Kind code of ref document: A2 |