EP2583195B1 - Method and server for handling database queries - Google Patents

Method and server for handling database queries

Info

Publication number
EP2583195B1
Authority
EP
European Patent Office
Prior art keywords
sub
query
queries
database
predicate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP10853759.8A
Other languages
German (de)
French (fr)
Other versions
EP2583195A4 (en)
EP2583195A1 (en)
Inventor
Vincent Huang
Xianwei Shen
Simon Moritz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB
Publication of EP2583195A1
Publication of EP2583195A4
Application granted
Publication of EP2583195B1
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G06F16/24534: Query rewriting; Transformation
    • G06F16/24535: Query rewriting; Transformation of sub-queries or views

Definitions

  • the invention relates to a method and a server for handling database queries with efficient search and retrieval of data in a semantic database.
  • the stored data may relate to any type of information that can be arranged logically in different classes or categories.
  • the term "resource" is used throughout this description, although other terms could also be used such as items, assets, entities, etc. Some practical examples of resources occurring in ordinary databases are inventory items, products for sale, pieces of equipment, humans, media items, telecommunication subscribers, etc.
  • a plurality of attributes are typically relevant for each resource, and data referring to such attributes is stored in the database available for retrieval in response to data queries related to particular resources or classes.
  • data in a conventional database is often organised in a hierarchical structure where different attributes typically form columns in one or more interrelated tables in the database.
  • attribute represents any aspect or characteristic relevant for describing a resource.
  • relational databases are typically used for holding data organized in the above class-based hierarchical manner, such as SQL (Structured Query Language) databases.
  • semantic annotation can be employed for different elements in the database information which can easily be identified and analysed logically by a "machine", i.e. using software executed by a processor or computer.
  • a concept called "RDF" (Resource Description Framework)
  • US 7363289 B2 discloses how a user query, made to a relational database with hierarchically arranged tables, can be evaluated by means of a query optimiser.
  • the query optimiser generates a query execution plan for the user query based on statistical information maintained for intermediate query components, in order to search different relevant tables in a sequence order that minimises the search cost.
  • the query execution plan is closely related to the hierarchical nature of the relational database which cannot be applied to a semantic database having the above-described "flat" ontology framework.
  • "OptARQ: A SPARQL Optimization Approach based on Triple Pattern Selectivity Estimation", Abraham Bernstein et al., reviews the general concept of selectivity of conditions and proposes adaptations to the selectivity of triple patterns and selectivity estimation of triple patterns.
  • a piece of information being stored in an RDF-based semantic database is expressed as a statement in the form of a triplet comprising a "subject", a "predicate" and an "object".
  • the subject identifies the entity that the statement concerns.
  • the predicate identifies a property or characteristic of the subject that the statement specifies.
  • the object identifies some "value" or range of that property.
  • the predicate in a stored triplet provides a relation between the subject and the object of that triplet.
  • One simple exemplary information triplet expressing a statement is: "Person X (subject) is a member (predicate) of Enterprise Z (object)", where both Person X and Enterprise Z are resources that are interrelated by the triplet above.
  • each element of the statement triplets may be annotated as an element identifier such as a URI (Uniform Resource Identifier) in HTTP (HyperText Transfer Protocol) format or alternatively as a literal name or the like.
  • a typical statement stored in a database may thus be represented by a triplet of element identifiers such as (URI"X", URI"Y", URI"Z").
  • the semantic annotation of the content in a database can further be represented in the form of ontologies of statements regarding resources, which will be explained by means of a simple example below.
  • ontology is a topic dealing with how entities or resources can be described in terms of properties.
  • an ontology is a framework of triplets including predefined resource classes of subjects and objects which are linked by the predicates, and this ontology can be seen as a template which can be filled with information on specific individual resources, also referred to as instances.
  • An illustrative RDF graph can be made with nodes representing the classes, which can be either subjects or objects in the statements, and the nodes are interconnected by properties representing some linking description or characteristic, i.e. the predicates. This type of information is thus annotated in the semantic database as ontologies by the above triplets of element identifiers.
  • Fig. 1 illustrates a simplified exemplary RDF ontology graph forming a framework with four nodes A-D where resource class A relates to resource class B by the linking property or predicate P1, and class B in turn relates to class C by the linking property or predicate P2 and further to class D by the linking property or predicate P3.
  • the ontology graph shown in Fig. 1 is thus basically a framework with three triplets (A,P1,B), (B,P2,C) and (B,P3,D) in which element identifiers such as URIs can be stored for individual instances, as described above.
  • each resource A-D and predicate P1-P3 is annotated by a respective element identifier or URI, as described above, in a "machine-readable" manner.
  • a URL (Uniform Resource Locator)
  • a URN (Uniform Resource Name)
  • SPARQL allows users to write globally intelligible queries in the above triplet format that can be processed in multiple heterogeneous databases.
  • SPARQL can generally be used to define queries on a high semantic level.
  • a SPARQL query typically comprises a set of triplets forming conditions for the query which can be formulated by means of the so-called "where clause", which is well-known in this field. These conditional triplets can be regarded as a set of sub-queries which are executed one by one in the database search.
  • a solution is provided in a server configured to handle database queries directed to a semantic database in which information is stored in the above-described triplet format and according to a known ontology structure.
  • semantic database is used to represent, without limitation, any number of internal, external and/or distributed semantic databases that can be accessed according to the procedures and embodiments described herein.
  • query server when used below should be understood as any server or functional entity that is configured to handle incoming database queries and operate according to any of the embodiments described below.
  • An incoming database query can be defined as a first set of sub-queries in triplet form with variables, forming an original search path of nodes/subclasses, and may be in the above-described SPARQL format.
  • an incoming SPARQL query with a set of sub-queries thus forms a search path and the sub-queries are executed one by one basically in the order given in the query, often resulting in excessive search costs since this combination and order of sub-queries is typically not very efficient in terms of search space required for the search.
  • the data set stored in the semantic database is analysed in a "configuring" phase in order to define, or find, semantic rules relating to conclusions that are implied in a logic sense by the information in the database.
  • semantic rules can be inferred from the information stored in the database. For example, if no films by a particular producer, e.g. Disney, can be found in a "horror" genre, a semantic rule can be defined basically saying "No films by Disney exist in the horror genre" and the object variable "Genre" is thus devoid of instances when Disney is the subject.
  • the data set may imply that all documentary movies are produced by BBC.
  • a new semantic rule can then be defined saying that "All movies with genre documentary have provider BBC".
  • statistics on the stored data are also collected in the configuring phase, including counting how many individual, i.e. distinct, subjects and objects appear in different triplets in the database.
  • the outcome of this configuring phase of finding semantic rules and collecting statistics is then utilised to handle an incoming search query in an efficient manner to reduce the search cost by modifying and improving the search query as follows. It should be noted that the activities of the configuring phase may continue in the background as data is added and removed in the database, to keep the above statistics and semantic rules up-to-date.
  • an incoming query forming a first set of sub-queries can be analysed and rewritten as a second set of sub-queries in the triplet form, preferably according to SPARQL, based on the above collection of statistics and defined semantic rules, and also using the known ontology structure, in order to improve and optimise the search path with respect to processing time and cost.
  • the second set of sub-queries thus forms a modified and hopefully improved query in terms of reducing the search space as early in the search as possible.
  • the ontology structure in the database may provide plural alternative paths from one node/subclass to another node/subclass, and the search path implied by the first set of sub-queries in the incoming query may be substituted by an equivalent search path involving fewer nodes or different nodes with fewer instances than in the original search path of the incoming query.
  • the incoming query can be rewritten as a second set of sub-queries forming a shorter search path and/or a search path with fewer instances in its nodes to search.
  • the previously defined semantic rules may suggest that a node in the original search path of the incoming query is devoid of instances and this node can therefore be omitted, or skipped, in the new search path.
  • a Reduction Rate "RR" is calculated for each of the sub-queries in the second set of sub-queries, based on the statistics collected and analysed in the preceding configuration phase.
  • the statistics may be collected and analysed on a more or less continuous basis, that is to keep the statistics up-to-date with data being added, modified and removed in the semantic database.
  • A procedure for handling database queries will now be described with reference to Fig. 2, the database queries being directed to a semantic database with a data set of stored information annotated as element identifiers comprising triplets with a subject, a predicate and an object.
  • the information has thus been stored according to a known preset ontology structure in a conventional manner.
  • the procedure in Fig. 2 is executed by a query server that may include or be otherwise associated with a semantic database, although the invention is not limited in this respect.
  • the query server may thus be operative to execute the modified search on the database or to provide the search to a separate search engine or the like which could be situated in a remote location, e.g. in a different country.
  • In a first action 200, the server generally collects statistics derived from the information being currently stored in the database in the above triplet form by analysing the stored triplets. In particular, it is determined how many distinct subjects S and how many distinct objects O appear as variables in triplets with a certain predicate P, which is made for all unique predicates appearing in the database. In practice, this information can be obtained by executing a query with unknown subject and object for each known predicate Pm, i.e. a query with subject and object variables in the form of a triplet "?S - Pm - ?O". A more detailed example of how action 200 can be performed will be provided below when describing Fig. 6.
  • the predicates can be denoted P1, P2,... PM.
  • one predicate may be "hasGenre" indicating the genre of a movie, and another may be "hasProvider" indicating the provider or producer of a movie.
  • the entire database is searched to find the total number of distinct subjects and objects currently appearing together with that predicate P in the database.
  • the number N_s^P of distinct subjects S appearing with predicate P and the number N_o^P of distinct objects O appearing in triplets with the predicate P are determined during the collection of statistics. This information may be stored in a statistics table or the like for later use in the query improvement process.
  • the server defines semantic rules relating to conclusions implied by the information in the database, i.e. rules that can be derived logically from the stored triplets in the database for later use in the query improvement process.
  • these rules may refer to certain triplets in the preset ontology structure with nodes in the ontology structure being devoid of instances for this data set. Any sub-queries involving these triplets can therefore be omitted when forming a new modified search path for an incoming query, i.e. the above-described second set of sub-queries.
  • Actions 200 and 202 basically conclude the configuring phase of this solution, although these actions may be further performed on a more or less continuous basis throughout the inventive process while the information in the data set is changed from time to time, to keep the statistics and semantic rules up-to-date.
  • the server receives a database query in the run-time phase, as indicated by a next action 204.
  • the received database query has typically been made by a user, or generally a "querying party" which could also be some computer-controlled application, and the received query has been defined such that a first set of sub-queries can be identified therefrom.
  • the received database query may have a format according to SPARQL, although the invention is not limited to this query format.
  • the sub-queries of the first set are typically defined in terms of "SELECT-WHERE" clauses, which is well known in this field.
  • the query server then rewrites the first set of sub-queries of the received database query, in a further action 206, by finding a second set of sub-queries in the triplet format based on at least one of: the known ontology structure in the database, the collected statistics and the above defined semantic rules.
  • the ontology structure may imply that the search path defined by the first set of sub-queries can be shortened, still starting and ending on the same nodes.
  • the above defined semantic rules may imply that certain sub-queries in the first set have nodes being devoid of instances, which would thus produce a null result and can therefore be omitted in the search path as explained above.
  • the collected statistics are used to determine a modified order of executing the sub-queries according to the following action.
  • the sub-queries of the second set may likewise be defined in terms of "SELECT-WHERE" clauses.
  • the query server will now determine in which order or sequence the sub-queries in the second set should be executed to provide a modified search which has hopefully been improved in terms of reducing the search space as early in the search as possible. It is thus recognised in this solution that for each sub-query it is most efficient to obtain a minimal amount of search results, thus reducing the search space for the next sub-query. It is therefore determined in the following actions which sub-queries can reduce the resulting search space the most in order to place the most efficient search space reducing sub-queries at the beginning of the execution order.
  • the query server calculates the Reduction Rate RR for the sub-queries of the second set based on the collected statistics.
  • an RR value is calculated for each sub-query in the second set and these RR values are then used to determine the more efficient and modified order of executing the sub-queries in the search.
  • the RR parameter thus relates to the number of distinct subjects and objects appearing in triplets in the database with the predicate provided in the sub-query, according to the collected statistics.
  • MUB(V_S) is a parameter called "Minimum Upper Bound" for a subject variable V_S in the sub-query
  • MUB(V_o) is a Minimum Upper Bound for an object variable V_o in the sub-query
  • N_s^Pm is the number of distinct subjects S appearing in triplets with the predicate Pm
  • N_o^Pm is the number of distinct objects O appearing in triplets with the predicate Pm.
  • the MUB(V_S) value of the subject variable V_S is first determined as the minimum value of the statistics collected for variable V_S and the MUB(V_o) value of the object variable V_o is likewise determined as the minimum value of the statistics collected for the object variable V_o.
  • the MUB value is determined in this way for all variables appearing in the sub-queries of the second set, regardless of whether they appear as subjects or objects or both.
  • the sub-queries of the second set may be divided into groups having a common variable, which is however somewhat outside the scope of this solution.
  • the query server finally provides the sub-queries of the second set as a modified query basically in an order according to decreasing RR values of the sub-queries, for execution basically in that order when searching the database.
  • the sub-query having the greatest RR value of all will be executed first, the sub-query having the next greatest RR value will be executed next, and so forth.
  • some sub-queries in the second set may have the same RR value and the relative order of those sub-queries may be selected freely.
  • the benefits of this solution can be achieved when the sub-queries of the second set are provided in an order at least partly according to the decreasing Reduction Rates of the sub-queries.
  • the sub-queries in the second set can be executed in a more efficient manner such that the search space can be reduced as much as possible early in the succession of sub-queries which thus provides for a most efficient search in terms of processing time and search cost, particularly since the total amount of data to search will be minimised.
  • a server 300 is illustrated when in operation according to this solution to handle and process a received search query directed to a semantic database 302, by turning the received query into a modified query basically in the manner described above.
  • the semantic database 302 thus contains stored information according to a known preset ontology structure and annotated as element identifiers comprising triplets with a subject, a predicate and an object.
  • the server 300 may be operative to itself execute the modified search on the database 302, or to provide the modified search to a separate search engine, not shown, that will execute the search.
  • the server 300 comprises a data analyser 300a adapted to collect statistics derived from the information in the database, and to define semantic rules relating to conclusions implied by the information in the database, basically as described above for actions 200 and 202 of the configuring phase of this solution.
  • the server 300 also comprises a communication module 300b adapted to receive, in the run-time phase, a database query forming a first set of sub-queries.
  • the server 300 further comprises a query optimiser 300c adapted to rewrite the received database query as a second set of sub-queries in the triplet format based on at least one of: the structure of the ontologies, the collected statistics and the defined rules.
  • the query optimiser 300c is further adapted to calculate the above-described Reduction Rate RR for the sub-queries in the second set based on the statistics collected in the data analyser, basically as described above for action 208.
  • the Reduction Rate thus relates to the number of distinct subjects and objects appearing in triplets in the database with predicates corresponding to the sub-queries.
  • the query optimiser 300c is also adapted to provide the sub-queries of the second set as a modified query in an order according to decreasing Reduction Rates of the sub-queries, for execution in that order when searching the database, basically as described above for action 210.
  • the server 300 may further comprise a search module 300d adapted to perform a search by executing the modified query with the sub-queries of the second set, on the database, the query thus being provided from the query optimiser 300c.
  • the communication module 300b may be further adapted to return a response with results from the search. This scenario is schematically indicated in the figure by dashed arrows. Otherwise, the communication module 300b may send the modified query, as provided from the query optimiser 300c, to an external search engine or the like, not shown, for execution on the database 302.
  • Fig. 3 merely illustrates various functional modules or units in the server 300 in a logical sense, although the skilled person is free to implement these functions in practice using suitable software and hardware means.
  • the invention is generally not limited to the shown structures of the server 300, while its functional modules 300a-d may be configured to operate according to the procedure described for Fig. 2 above, where appropriate.
  • the functional modules 300a-d described above can be implemented in the server 300 as program modules of a computer program CP 12 comprising code means which when run by a processor 14 in the server 300 causes the server 300 to perform the above-described functions and actions.
  • the processor 14 may be a single CPU (Central Processing Unit), or could comprise two or more processing units in the server 300.
  • the processor 14 may include general purpose microprocessors, instruction set processors and/or related chip sets and/or special purpose microprocessors such as ASICs (Application Specific Integrated Circuits).
  • the processor may also comprise board memory for caching purposes.
  • the computer program 12 may be carried by a computer program product CPP 16 in the server 300 connected to the processor 14.
  • the computer program product 16 comprises a computer readable medium on which the computer program 12 is stored.
  • the computer program product 16 may be a flash memory, a RAM (Random-Access Memory), a ROM (Read-Only Memory) or an EEPROM (Electrically Erasable Programmable ROM), and the program modules could in alternative embodiments be distributed on different computer program products in the form of memories within the server 300.
  • collecting statistics from the database in the configuring phase may be performed in practice by executing a predefined query with unknown subject and object for each known predicate Pm.
  • This predefined "statistics collecting query" can be configured as a triplet "?S - Pm - ?O" with both subject and object as variables, that will provide a search result when executed on the database from which it can be determined how many distinct subjects and objects appear in the database in any triplets having the predicate Pm.
  • the total number N_s^Pm of distinct subjects S appearing in triplets with a specific predicate Pm is determined and the total number N_o^Pm of distinct objects O appearing in triplets with the predicate Pm is determined as well.
  • this search is executed basically for all predicates P1, P2... PM.
  • This information may be stored in a statistics table or the like for later use in the query improvement process.
  • An exemplary statistics table for entering the above values N_s^Pm and N_o^Pm for each known predicate Pm is depicted in Fig. 5.
  • a more detailed exemplary procedure of collecting statistics from a semantic database in a statistics table such as the one shown in Fig. 5 will now be described with reference to the flow chart in Fig. 6.
  • a next action 602 illustrates that a first predefined query valid for a first known predicate P1, is executed on the database. Then, the number N_s^P1 of distinct subjects stored with predicate P1 of the query is determined from the search results in an action 604. Likewise, the number N_o^P1 of distinct objects stored with predicate P1 of the query is also determined in a further action 606. In a following action 608, the statistics table is updated by entering the determined values N_s^P1 and N_o^P1 for predicate P1, basically the first row of the table in Fig. 5.
  • a further action 610 illustrates that the next predefined query valid for the next known predicate P2 is taken for executing the next query on the database, thus repeating actions 602 - 608 for obtaining statistics for the new query and the next row in the table in Fig. 5 can be populated with the determined values.
  • This process according to actions 602 - 608 is thus repeated until all the predefined statistics collecting queries have been executed for all known predicates P1, P2... PM.
  • this solution does not exclude the possibility of omitting a statistics collecting query for any of the predicates P1, P2... PM, e.g. if that omitted predicate is of minor interest.
  • This expression contains the above-determined parameters N_s^P1 and N_o^P1 which are available from the table in Fig. 5 for each unique predicate Pm appearing in the sub-queries of the second set.
  • Expression (1) further contains the parameter MUB(V_S) being determined as the minimum value of the statistics collected for variable V_S, and the parameter MUB(V_o) being determined as the minimum value of the statistics collected for the object variable V_o.
  • In Fig. 7, an exemplary table of calculated RR values for a set of sub-queries is depicted, and an exemplary table of MUB values determined for different variables from the collected statistics is depicted in Fig. 8.
  • a second set of eight sub-queries in total has been defined when rewriting a received database query.
  • the table of Fig. 7 comprises a first column with triplet identities T1 - T8, a second column with triplets of the actual sub-queries in the second set, a third column with N_s^Pm values retrieved from the statistics table of Fig. 5 for each predicate Pm in the sub-queries, a fourth column with N_o^Pm values likewise retrieved from the statistics table of Fig. 5 for each predicate Pm, and a fifth column with the calculated RR values for each sub-query in the table.
  • the MUB values can be determined for different variables appearing in the eight sub-queries. As mentioned above, the MUB value for a variable is determined as the minimum value of the statistics collected for that variable.
  • considering variable "A" in Fig. 7, it appears in four sub-queries T1, T2, T3 and T6, either as subject (T1-T3) or as object (T6).
  • the minimum value of either N_s^Pm or N_o^Pm for variable "A" is in this example 1556, out of the values 1556, 2987, 2731 and 3051 for sub-queries T1, T2, T3 and T6, respectively, which is entered as the MUB value for variable "A" in the table of Fig. 8 as shown.
  • corresponding MUB values can be determined for the remaining variables G, C, U,... appearing in the sub-queries specified in Fig. 7.
  • these MUB values can be used in expression (1) to calculate the RR values for each sub-query in the table, which may be used to determine the optimal and most efficient order of executing the sub-queries in terms of search cost.
  • the sub-query having the greatest RR value of all should be executed first, the sub-query having the next greatest RR value should be executed next, and so forth. Consequently, the optimal order of execution in this example is T1, T4, T5, T8, T7, T3, T2 and finally T6. Therefore, according to this solution, the sub-queries of the second set can be provided basically in this order according to decreasing RR values of the sub-queries, as a modified query for execution in that order when searching the database. As said above, when sub-queries in the second set have the same RR value the relative order of those sub-queries may be selected freely. In this case, the relative order of sub-queries T1, T4, T5 and T8 having the same RR value of 1 can be selected freely without reducing the efficiency of the total query.
  • a more detailed exemplary procedure of performing improvement of a received database query will now be described with reference to the flow chart in Fig. 9, basically corresponding to actions 204 - 210 in Fig. 2. Accordingly, the procedure of Fig. 9 is performed by a server such as the one depicted in Fig. 3. It is also assumed that a configuring phase has been executed basically according to the flow chart of Fig. 6.
  • In a first shown action 900, the database query is received from a querying party, and the query is analysed in a next action 902 to identify a first set of sub-queries and their variables forming the received query.
  • a further action 904 illustrates that the identified first set of sub-queries and their variables are rewritten as a second set of sub-queries and their variables, based on the known ontology structure in the database, the collected statistics and the above defined semantic rules.
  • the above-described RR parameter is calculated for the sub-queries in the second set based on the collected statistics.
  • Actions 904 and 906 basically correspond to actions 206 and 208 above, respectively, and need not be described again here.
  • the server determines an execution order of the sub-queries in the second set according to decreasing RR values of the sub-queries to form a modified query, in a further action 908.
  • a example of how actions 906 and 908 can be put into practice was described in connection with Fig's 7 and 8.
  • the modified query is provided for execution on the database.
  • the actual search may be executed by the server, e.g. when having a search module according to Fig. 3 , or by an external search engine or the like.

Description

    Technical field
  • The invention relates to a method and a server for handling database queries with efficient search and retrieval of data in a semantic database.
  • Background
  • When large amounts of data are to be stored in a database for different resources, it is useful to organize the data in a way that can facilitate search and retrieval of relevant data when receiving data queries or requests referring to particular resources, either generic or specific. The stored data may relate to any type of information that can be arranged logically in different classes or categories. The term "resource" is used throughout this description, although other terms could also be used such as items, assets, entities, etc. Some practical examples of resources occurring in ordinary databases are inventory items, products for sale, pieces of equipment, humans, media items, telecommunication subscribers, etc.
  • A plurality of attributes are typically relevant for each resource, and data referring to such attributes is stored in the database available for retrieval in response to data queries related to particular resources or classes. In practice, the data in a conventional database is often organised in a hierarchical structure where different attributes typically form columns in one or more interrelated tables in the database. In this description, the term "attribute" represents any aspect or characteristic relevant for describing a resource. Today, relational databases are typically used for holding data organized in the above class-based hierarchical manner, such as SQL (Structured Query Language) databases.
  • In recent years, it has become increasingly interesting and useful to store information in a database in a way that allows for automatic machine-processing in a logical sense, such as interpreting and analysing the information in order to draw conclusions and infer new knowledge therefrom. This possibility may be useful, e.g., for creating and offering context-aware services which can be adapted to the consumer's current situation based on information in the database.
  • To enable uniform handling of information from multiple "heterogeneous" information sources, i.e. different databases, semantic annotation can be employed for different elements in the database information which can easily be identified and analysed logically by a "machine", i.e. using software executed by a processor or computer. A concept called "RDF" (Resource Description Framework) is typically used as a language for representing various pieces of information in databases regarding resources by means of semantic annotation.
  • Some problems associated with the typically huge databases of today are that great amounts of stored data must be searched in order to retrieve relevant information in response to database queries referring to particular resources, which is both time consuming and process intensive, often referred to as the searching "cost". In conventional relational databases, various solutions have been proposed for reducing the amount of data to search, referred to as the "search space", in response to database queries, although no efficient or relatively useful technique has been proposed to reduce the searching cost specifically for semantic databases.
  • US 7363289 B2 discloses how a user query, made to a relational database with hierarchically arranged tables, can be evaluated by means of a query optimiser. In this solution, the query optimiser generates a query execution plan for the user query based on statistical information maintained for intermediate query components, in order to search different relevant tables in a sequence order that minimises the search cost. In this document, the query execution plan is closely related to the hierarchical nature of the relational database which cannot be applied to a semantic database having the above-described "flat" ontology framework. "OptARQ: A SPARQL Optimization Approach based on Triple Pattern Selectivity Estimation", Abraham Bernstein ET AL, reviews the general concept of selectivity of conditions and proposes adaptations to the selectivity of triple patterns and selectivity estimation of triple patterns.
  • Summary
  • It is an object of the invention to address at least some of the problems and issues outlined above. For example, it is an object to provide a solution for handling database queries directed to a semantic database, that can be used to reduce the searching cost by avoiding undue processing delays and load for searching the database in an efficient manner. It is possible to achieve these objects and others by using a method and a server as defined in the attached independent claims.
  • Brief description of drawings
  • The invention will now be described in more detail by means of exemplary embodiments and with reference to the accompanying drawings, in which
    • Fig. 1 is a schematic diagram illustrating an exemplary ontology framework in a semantic database, according to the prior art.
    • Fig. 2 is a flow chart illustrating a procedure in a server, according to an exemplary embodiment.
    • Fig. 3 is a block diagram illustrating a server in more detail when in operation, according to further exemplary embodiments.
    • Fig. 4 is an illustration of collecting statistics for a particular unique predicate.
    • Fig. 5 is an exemplary table with collected statistics for different unique predicates.
    • Fig. 6 is a flow chart illustrating in more detail a process in the server for collecting statistics, according to another exemplary embodiment.
    • Fig. 7 is an exemplary table with collected statistics and calculated RR values for variables and predicates appearing in different sub-queries.
    • Fig. 8 is an exemplary table with calculated MUB values for the variables appearing in the table of Fig. 7.
    • Fig. 9 is a flow chart illustrating in more detail a process in the server for improving a query, according to an exemplary embodiment.
    Detailed description
  • A piece of information being stored in an RDF-based semantic database is expressed as a statement in the form of a triplet comprising a "subject", a "predicate" and an "object". Basically, the subject identifies the entity that the statement concerns, the predicate identifies a property or characteristic of the subject that the statement specifies, and the object identifies some "value" or range of that property. In other words, the predicate in a stored triplet provides a relation between the subject and the object of that triplet. One simple exemplary information triplet expressing a statement is: "Person X (subject) is a member (predicate) of Enterprise Z (object)", where both Person X and Enterprise Z are resources that are interrelated by the triplet above.
  • In semantic RDF databases, each element of the statement triplets may be annotated as an element identifier such as a URI (Uniform Resource Identifier) in HTTP (HyperText Transfer Protocol) format or alternatively as a literal name or the like. A typical statement stored in a database may thus be represented by a triplet of element identifiers such as (URI"X", URI"Y", URI"Z"). By representing different statements in a database with such triplets of element identifiers, the content of the database can be annotated in a machine-readable format to enable automatic analysis and processing of the information therein by means of software applications. The semantic RDF database model is thus distinctly different from the conventional relational database model.
  • The semantic annotation of the content in a database can further be represented in the form of ontologies of statements regarding resources, which will be explained by means of a simple example below. Generally, "ontology" is a topic dealing with how entities or resources can be described in terms of properties. In the context of semantic databases, an ontology is a framework of triplets including predefined resource classes of subjects and objects which are linked by the predicates, and this ontology can be seen as a template which can be filled with information on specific individual resources, also referred to as instances. An illustrative RDF graph can be made with nodes representing the classes, which can be either subjects or objects in the statements, and the nodes are interconnected by properties representing some linking description or characteristic, i.e. the predicates. This type of information is thus annotated in the semantic database as ontologies by the above triplets of element identifiers.
  • Fig. 1 illustrates a simplified exemplary RDF ontology graph forming a framework with four nodes A-D where resource class A relates to resource class B by the linking property or predicate P1, and class B in turn relates to class C by the linking property or predicate P2 and further to class D by the linking property or predicate P3. The ontology graph shown in Fig. 1 is thus basically a framework with three triplets (A,P1,B), (B,P2,C) and (B,P3,D) in which element identifiers such as URIs can be stored for individual instances, as described above. Thus, each resource A-D and predicate P1-P3 is annotated by a respective element identifier or URI, as described above, in a "machine-readable" manner. Alternatively, a URL (Uniform Resource Locator) or a URN (Uniform Resource Name) can be used in an HTTP format for the above annotation.
  • A query language called "SPARQL", a recursive acronym that stands for "SPARQL Protocol and RDF Query Language", has also been developed for database queries in RDF-based semantic databases, and is currently an official W3C (World Wide Web Consortium) recommendation. SPARQL allows users to write globally intelligible queries in the above triplet format that can be processed in multiple heterogeneous databases. SPARQL can generally be used to define queries on a high semantic level. A SPARQL query typically comprises a set of triplets forming conditions for the query which can be formulated by means of the so-called "where clause", which is well-known in this field. These conditional triplets can be regarded as a set of sub-queries which are executed one by one in the database search.
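  • To make the notion of conditional triplets as sub-queries concrete, the following minimal sketch holds a small SPARQL query in a Python string and splits its where clause into triple patterns. The prefix `ex:`, the predicate names and the helper `sub_queries` are hypothetical and chosen only for illustration. The splitting logic is deliberately naive: it handles only simple patterns separated by " ." and is not a SPARQL parser.

```python
import re

# Hypothetical query; the "ex:" vocabulary is an assumption for illustration.
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?movie ?provider
WHERE {
  ?movie ex:hasGenre ex:documentary .
  ?movie ex:hasProvider ?provider .
}
"""

def sub_queries(sparql):
    """Return the triple patterns of the where clause as (s, p, o) tuples."""
    # Take everything between the outermost braces of the where clause.
    body = re.search(r"\{(.*)\}", sparql, re.DOTALL).group(1)
    patterns = []
    for chunk in body.split(" ."):
        terms = chunk.split()
        if len(terms) == 3:  # keep only well-formed subject/predicate/object
            patterns.append(tuple(terms))
    return patterns

first_set = sub_queries(QUERY)
```

    Terms beginning with "?" are variables; each tuple in `first_set` corresponds to one sub-query in the sense used in this description.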
  • Briefly described, a solution is provided in a server configured to handle database queries directed to a semantic database in which information is stored in the above-described triplet format and according to a known ontology structure. In this description, the term "semantic database" is used to represent, without limitation, any number of internal, external and/or distributed semantic databases that can be accessed according to the procedures and embodiments described herein. Further, the term "query server" when used below should be understood as any server or functional entity that is configured to handle incoming database queries and operate according to any of the embodiments described below.
  • An incoming database query can be defined as a first set of sub-queries in triplet form with variables, forming an original search path of nodes/subclasses, and may be in the above-described SPARQL format. In conventional search procedures, an incoming SPARQL query with a set of sub-queries thus forms a search path and the sub-queries are executed one by one basically in the order given in the query, often resulting in excessive search costs since this combination and order of sub-queries is typically not very efficient in terms of search space required for the search.
  • In this solution, however, the data set stored in the semantic database is analysed in a "configuring" phase in order to define, or find, semantic rules relating to conclusions that are implied in a logic sense by the information in the database. In other words, such semantic rules can be inferred from the information stored in the database. For example, if no films by a particular producer, e.g. Disney, can be found in a "horror" genre, a semantic rule can be defined basically saying "No films by Disney exist in the horror genre" and the object variable "Genre" is thus devoid of instances when Disney is the subject. In another example, the data set may imply that all documentary movies are produced by BBC. A new semantic rule can then be defined saying that "All movies with genre documentary have provider BBC". When an incoming query contains a sub-query specifying that the movie genre is "documentary", we only need to search for movies that are produced by "BBC".
  • Further, statistics on the stored data are also collected in the configuring phase, including counting how many individual, i.e. distinct, subjects and objects appear in different triplets in the database. The outcome of this configuring phase of finding semantic rules and collecting statistics is then utilised to handle an incoming search query in an efficient manner to reduce the search cost by modifying and improving the search query as follows. It should be noted that the activities of the configuring phase may continue in the background as data is added and removed in the database, to keep the above statistics and semantic rules up-to-date.
  • Then, in a "run-time" phase of this solution, an incoming query forming a first set of sub-queries can be analysed and rewritten as a second set of sub-queries in the triplet form, preferably according to SPARQL, based on the above collection of statistics and defined semantic rules, and also using the known ontology structure, in order to improve and optimise the search path with respect to processing time and cost. The second set of sub-queries thus forms a modified and hopefully improved query in terms of reducing the search space as early in the search as possible.
  • For example, the ontology structure in the database may provide plural alternative paths from one node/subclass to another node/subclass, and the search path implied by the first set of sub-queries in the incoming query may be substituted by an equivalent search path involving fewer nodes or different nodes with fewer instances than in the original search path of the incoming query. In that case, the incoming query can be rewritten as a second set of sub-queries forming a shorter search path and/or a search path with fewer instances in its nodes to search. In another example, the previously defined semantic rules may suggest that a node in the original search path of the incoming query is devoid of instances and this node can therefore be omitted, or skipped, in the new search path.
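  • The description does not specify how semantic rules are derived from the data set. As one possible illustration, and purely as an assumption, the sketch below scans in-memory triplets and reports rules of the form "all subjects with value a for one predicate have value b for another predicate", matching the documentary/BBC example. The function name `infer_rules` and the sample data are hypothetical.

```python
from collections import defaultdict

def infer_rules(triples, if_pred, then_pred):
    """Find values a of if_pred whose subjects all share one then_pred value."""
    by_subject = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        by_subject[s][p].add(o)
    # For each if-value, record the distinct then-value sets seen per subject.
    seen = defaultdict(set)
    for preds in by_subject.values():
        for a in preds.get(if_pred, ()):
            seen[a].add(frozenset(preds.get(then_pred, ())))
    rules = {}
    for a, value_sets in seen.items():
        if len(value_sets) == 1:          # every subject agrees
            (only,) = value_sets
            if len(only) == 1:            # and on exactly one value
                (b,) = only
                rules[a] = b
    return rules

# Hypothetical sample data.
triples = [
    ("movie1", "hasGenre", "documentary"), ("movie1", "hasProvider", "BBC"),
    ("movie2", "hasGenre", "documentary"), ("movie2", "hasProvider", "BBC"),
    ("movie3", "hasGenre", "comedy"), ("movie3", "hasProvider", "BBC"),
    ("movie4", "hasGenre", "comedy"), ("movie4", "hasProvider", "Disney"),
]
rules = infer_rules(triples, "hasGenre", "hasProvider")
```

    With this sample data only the rule for "documentary" is found, since comedies have two different providers. A rule found this way holds only for the current data set, which is consistent with the need to keep rules up-to-date as data changes.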
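  • The idea of substituting a search path by one involving fewer nodes can be sketched as a breadth-first search over the ontology framework. This is only an illustration under stated assumptions: links are followed from subject class to object class, the triplet (A,P4,D) is a hypothetical addition to the framework of Fig. 1, and the sketch does not check that the shorter path is semantically equivalent to the original one.

```python
from collections import deque

def shortest_path(framework, start, goal):
    """Return the fewest-triplet path from start to goal, or None."""
    edges = {}
    for s, p, o in framework:
        edges.setdefault(s, []).append((p, o))
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for p, o in edges.get(node, []):
            if o not in visited:
                visited.add(o)
                queue.append((o, path + [(node, p, o)]))
    return None

# Framework of Fig. 1 plus one hypothetical direct link from A to D.
fig1 = [("A", "P1", "B"), ("B", "P2", "C"), ("B", "P3", "D")]
extended = fig1 + [("A", "P4", "D")]

path_original = shortest_path(fig1, "A", "D")
path_shorter = shortest_path(extended, "A", "D")
```

    In the unmodified framework the path from A to D passes through B, while the hypothetical direct link yields a path with a single triplet.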
  • Then, a Reduction Rate "RR" is calculated for each of the sub-queries in the second set of sub-queries, based on the statistics collected and analysed in the preceding configuration phase. The statistics may be collected and analysed on a more or less continuous basis, that is to keep the statistics up-to-date with data being added, modified and removed in the semantic database.
  • A procedure for handling database queries will now be described with reference to Fig. 2, the database queries being directed to a semantic database with a data set of stored information annotated as element identifiers comprising triplets with a subject, a predicate and an object. The information has thus been stored according to a known preset ontology structure in a conventional manner. The procedure in Fig. 2 is executed by a query server that may include or be otherwise associated with a semantic database, although the invention is not limited in this respect. The query server may thus be operative to execute the modified search on the database or to provide the search to a separate search engine or the like which could be situated in a remote location, e.g. in a different country. An example of the configuring phase will be described in more detail later with reference to Fig. 6 and an example of the run-time phase will be described in more detail later with reference to Fig. 9, while the basic procedure in Fig. 2 covers both the configuring phase and the run-time phase.
  • In a first action 200, the server generally collects statistics derived from the information being currently stored in the database in the above triplet form by analysing the stored triplets. In particular, it is determined how many distinct subjects S and how many distinct objects O appear as variables in triplets with a certain predicate P, which is made for all unique predicates appearing in the database. In practice, this information can be obtained by executing a query with unknown subject and object for each known predicate Pm, i.e. a query with subject and object variables in the form of a triplet "?S - Pm - ?O". A more detailed example of how action 200 can be performed will be provided below when describing Fig. 6.
  • Assuming there are M unique predicates appearing in triplets in the database, the predicates can be denoted P1, P2,... PM. For example, in a data set with information on movies, one predicate may be "hasGenre" indicating the genre of a movie, and another may be "hasProvider" indicating the provider or producer of a movie. For each unique predicate P, the entire database is searched to find the total number of distinct subjects and objects currently appearing together with that predicate P in the database. Thus, the number Ns P of distinct subjects S appearing with predicate P and the number No P of distinct objects O appearing in triplets with the predicate P are determined during the collection of statistics. This information may be stored in a statistics table or the like for later use in the query improvement process.
  • In a further action 202, the server defines semantic rules relating to conclusions implied by the information in the database, i.e. rules that can be derived logically from the stored triplets in the database for later use in the query improvement process. As exemplified above, these rules may refer to certain triplets in the preset ontology structure with nodes in the ontology structure being devoid of instances for this data set. Any sub-queries involving these triplets can therefore be omitted when forming a new modified search path for an incoming query, i.e. the above-described second set of sub-queries.
  • Actions 200 and 202 basically conclude the configuring phase of this solution, although these actions may be further performed on a more or less continuous basis throughout the inventive process while the information in the data set is changed from time to time, to keep the statistics and semantic rules up-to-date. At some point, the server receives a database query in the run-time phase, as indicated by a next action 204. The received database query has typically been made by a user, or generally a "querying party" which could also be some computer-controlled application, and the received query has been defined such that a first set of sub-queries can be identified therefrom. The received database query may have a format according to SPARQL, although the invention is not limited to this query format. Further, the sub-queries of the first set are typically defined in terms of "SELECT-WHERE" clauses, which is well known in this field.
  • The query server then rewrites the first set of sub-queries of the received database query, in a further action 206, by finding a second set of sub-queries in the triplet format based on at least one of: the known ontology structure in the database, the collected statistics and the above defined semantic rules. Firstly, the ontology structure may imply that the search path defined by the first set of sub-queries can be shortened, still starting and ending on the same nodes. As mentioned above, there may be plural alternative paths from a particular start node/subclass to a particular end node/subclass. Secondly, the above defined semantic rules may imply that certain sub-queries in the first set have nodes being devoid of instances, which would thus produce a null result and can therefore be omitted in the search path as explained above. Thirdly, the collected statistics are used to determine a modified order of executing the sub-queries according to the following action. The sub-queries of the second set may likewise be defined in terms of "SELECT-WHERE" clauses.
  • Having found the sub-queries to form a second set which is basically equivalent to the received first set of sub-queries but provides a more efficient search path, the query server will now determine in which order or sequence the sub-queries in the second set should be executed to provide a modified search which has hopefully been improved in terms of reducing the search space as early in the search as possible. It is thus recognised in this solution that for each sub-query it is most efficient to obtain a minimal amount of search results, thus reducing the search space for the next sub-query. It is therefore determined in the following actions which sub-queries can reduce the resulting search space the most in order to place the most efficient search space reducing sub-queries at the beginning of the execution order.
  • In a triple query pattern, either the subject or the object or both can be provided as variables. In a further shown action 208, the query server calculates the Reduction Rate RR for the sub-queries of the second set based on the collected statistics. Preferably, an RR value is calculated for each sub-query in the second set and these RR values are then used to determine the more efficient and modified order of executing the sub-queries in the search. The RR parameter thus relates to the number of distinct subjects and objects appearing in triplets in the database with the predicate provided in the sub-query, according to the collected statistics.
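  • The counting of distinct subjects and objects per predicate can be sketched as follows for a small in-memory list of triplets. This is a minimal sketch, assuming the triplets fit in memory; the function name `collect_statistics` and the sample data are hypothetical, and a real semantic database would be queried instead.

```python
from collections import defaultdict

def collect_statistics(triples):
    """Return {predicate: (Ns, No)} with distinct subject and object counts."""
    subjects = defaultdict(set)
    objects = defaultdict(set)
    for s, p, o in triples:
        subjects[p].add(s)
        objects[p].add(o)
    return {p: (len(subjects[p]), len(objects[p])) for p in subjects}

# Hypothetical sample data.
triples = [
    ("movie1", "hasGenre", "documentary"),
    ("movie2", "hasGenre", "documentary"),
    ("movie3", "hasGenre", "comedy"),
    ("movie1", "hasProvider", "BBC"),
    ("movie2", "hasProvider", "BBC"),
]
stats = collect_statistics(triples)
```

    Here "hasGenre" appears with three distinct subjects and two distinct objects, and "hasProvider" with two distinct subjects and one distinct object. The resulting dictionary plays the role of the statistics table.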
  • In this action, the Reduction Rate RR can be calculated for a specific sub-query of the second set with a predicate Pm as:
    $$RR(P_m) = \frac{MUB(V_s)}{N_s^{P_m}} \cdot \frac{MUB(V_o)}{N_o^{P_m}} \qquad (1)$$
  • In the above expression (1), MUB(Vs) is a parameter called "Minimum Upper Bound" for a subject variable Vs in the sub-query, MUB(Vo) is a Minimum Upper Bound for an object variable Vo in the sub-query, Ns Pm is the number of distinct subjects S appearing in triplets with the predicate Pm, and No Pm is the number of distinct objects O appearing in triplets with the predicate Pm. In this way, an RR value can thus be calculated for each sub-query in the second set.
  • In more detail, in order to calculate the RR value according to expression (1) above, the MUB(Vs) value of the subject variable Vs is first determined as the minimum value of the statistics collected for variable Vs, and the MUB(Vo) value of the object variable Vo is likewise determined as the minimum value of the statistics collected for the object variable Vo.
  • The MUB value is determined in this way for all variables appearing in the sub-queries of the second set, regardless of whether they appear as subjects or objects or both. In order to facilitate the following processing, the sub-queries of the second set may be divided into groups having a common variable, which is however somewhat outside the scope of this solution.
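  • The calculation of MUB values and Reduction Rates can be sketched as below. The sketch rests on two assumptions that are not spelled out in the text: the "statistics collected for a variable" are taken to be the Ns or No counts of every sub-query in which the variable appears, as subject or object respectively, and only sub-queries with both subject and object given as variables are handled. The names `Pattern`, `minimum_upper_bounds` and `reduction_rate` and all numbers are hypothetical.

```python
from collections import namedtuple

# A sub-query as a triple pattern; terms starting with "?" are variables.
Pattern = namedtuple("Pattern", "subject predicate obj")

def minimum_upper_bounds(patterns, stats):
    """MUB(V): smallest count collected for V over all patterns using V."""
    mub = {}
    for p in patterns:
        n_s, n_o = stats[p.predicate]
        for var, bound in ((p.subject, n_s), (p.obj, n_o)):
            mub[var] = min(mub.get(var, bound), bound)
    return mub

def reduction_rate(pattern, stats, mub):
    """RR(Pm) = MUB(Vs)/Ns * MUB(Vo)/No, following expression (1)."""
    n_s, n_o = stats[pattern.predicate]
    return (mub[pattern.subject] / n_s) * (mub[pattern.obj] / n_o)

# Hypothetical statistics table: predicate -> (Ns, No).
stats = {"hasGenre": (1000, 20), "hasProvider": (800, 50)}
patterns = [
    Pattern("?m", "hasGenre", "?g"),
    Pattern("?m", "hasProvider", "?p"),
]
mub = minimum_upper_bounds(patterns, stats)
rr = {p.predicate: reduction_rate(p, stats, mub) for p in patterns}
```

    In this example the variable ?m appears as subject in both sub-queries, so its MUB is the smaller of 1000 and 800. Because a MUB value never exceeds the corresponding count, each factor and hence each RR value lies between 0 and 1.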
  • In a last shown action 210, the query server finally provides the sub-queries of the second set as a modified query basically in an order according to decreasing RR values of the sub-queries, for execution basically in that order when searching the database. In other words, the sub-query having the greatest RR value of all will be executed first, the sub-query having the next greatest RR value will be executed next, and so forth. Of course, some sub-queries in the second set may have the same RR value and the relative order of those sub-queries may be selected freely. Further, the benefits of this solution can be achieved when the sub-queries of the second set are provided in an order at least partly according to the decreasing Reduction Rates of the sub-queries. Thereby, the sub-queries in the second set can be executed in a more efficient manner such that the search space can be reduced as much as possible early in the succession of sub-queries which thus provides for a most efficient search in terms of processing time and search cost, particularly since the total amount of data to search will be minimised.
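  • Ordering the sub-queries by decreasing RR value amounts to a sort. The sketch below uses hypothetical sub-query identities and RR values, not the values of Fig. 7. Python's sort is stable, so sub-queries with equal RR values keep their original relative order, which is one of the freely selectable orders.

```python
def execution_order(rr_values):
    """Return sub-query identities sorted by decreasing RR value."""
    # reverse=True keeps the original order of items with equal keys.
    return sorted(rr_values, key=lambda t: rr_values[t], reverse=True)

# Hypothetical RR values per sub-query identity.
rr_values = {"T1": 1.0, "T2": 0.1, "T3": 0.25, "T4": 1.0}
order = execution_order(rr_values)
```

    The two sub-queries with RR value 1.0 come first, followed by the remaining ones in decreasing order.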
  • In Fig. 3, a server 300 is illustrated when in operation according to this solution to handle and process a received search query directed to a semantic database 302, by turning the received query into a modified query basically in the manner described above. The semantic database 302 thus contains stored information according to a known preset ontology structure and annotated as element identifiers comprising triplets with a subject, a predicate and an object. As in the example described above, the server 300 may be operative to itself execute the modified search on the database 302, or to provide the modified search to a separate search engine, not shown, that will execute the search.
  • The server 300 comprises a data analyser 300a adapted to collect statistics derived from the information in the database, and to define semantic rules relating to conclusions implied by the information in the database, basically as described above for actions 200 and 202 of the configuring phase of this solution. The server 300 also comprises a communication module 300b adapted to receive, in the run-time phase, a database query forming a first set of sub-queries.
  • The server 300 further comprises a query optimiser 300c adapted to rewrite the received database query as a second set of sub-queries in the triplet format based on at least one of: the structure of the ontologies, the collected statistics and the defined rules. The query optimiser 300c is further adapted to calculate the above-described Reduction Rate RR for the sub-queries in the second set based on the statistics collected in the data analyser, basically as described above for action 208. The Reduction Rate thus relates to the number of distinct subjects and objects appearing in triplets in the database with predicates corresponding to the sub-queries. The query optimiser 300c is also adapted to provide the sub-queries of the second set as a modified query in an order according to decreasing Reduction Rates of the sub-queries, for execution in that order when searching the database, basically as described above for action 210.
  • Depending on the implementation, the server 300 may further comprise a search module 300d adapted to perform a search by executing the modified query with the sub-queries of the second set, on the database, the query thus being provided from the query optimiser 300c. In that case, the communication module 300b may be further adapted to return a response with results from the search. This scenario is schematically indicated in the figure by dashed arrows. Otherwise, the communication module 300b may send the modified query, as provided from the query optimiser 300c, to an external search engine or the like, not shown, for execution on the database 302.
  • It should be noted that Fig. 3 merely illustrates various functional modules or units in the server 300 in a logical sense, although the skilled person is free to implement these functions in practice using suitable software and hardware means. Thus, the invention is generally not limited to the shown structures of the server 300, while its functional modules 300a-d may be configured to operate according to the procedure described for Fig. 2 above, where appropriate.
  • With reference to Fig. 3 and the block diagram of Fig. 3a, the functional modules 300a-e described above can be implemented in the server 300 as program modules of a computer program CP 12 comprising code means which when run by a processor 14 in the server 300 causes the server 300 to perform the above-described functions and actions. The processor 14 may be a single CPU (Central Processing Unit), or could comprise two or more processing units in the server 300. For example, the processor 14 may include general purpose microprocessors, instruction set processors and/or related chip sets and/or special purpose microprocessors such as ASICs (Application Specific Integrated Circuits). The processor may also comprise board memory for caching purposes.
  • The computer program 12 may be carried by a computer program product CPP 16 in the server 300 connected to the processor 14. The computer program product 16 comprises a computer readable medium on which the computer program 12 is stored. For example, the computer program product 16 may be a flash memory, a RAM (Random-Access Memory), a ROM (Read-Only Memory) or an EEPROM (Electrically Erasable Programmable ROM), and the program modules could in alternative embodiments be distributed on different computer program products in the form of memories within the server 300.
  • As mentioned above, collecting statistics from the database in the configuring phase may be performed in practice by executing a predefined query with unknown subject and object for each known predicate Pm. This predefined "statistics collecting query" can be configured as a triplet "?S - Pm - ?O" with both subject and object as variables, that will provide a search result when executed on the database from which it can be determined how many distinct subjects and objects appear in the database in any triplets having the predicate Pm. As schematically illustrated in Fig. 4, in this process, the total number Ns Pm of distinct subjects S appearing in triplets with a specific predicate Pm is determined and the total number No Pm of distinct objects O appearing in triplets with the predicate Pm is determined as well.
  • As mentioned above, if there are M known predicates appearing in the database, this search is executed basically for all predicates P1, P2... PM. This information may be stored in a statistics table or the like for later use in the query improvement process. An exemplary statistics table for entering the above values Ns Pm and No Pm for each known predicate Pm is depicted in Fig. 5.
  • A more detailed exemplary procedure of collecting statistics from a semantic database in a statistics table such as the one shown in Fig. 5, will now be described with reference to the flow chart in Fig. 6. In a first shown action 600, specific "statistics collecting queries" with unique predicates are defined for collecting statistics in the database. For example, if M unique predicates have been identified as appearing in the database, these queries may be defined as triplets with both subject and object being variables, basically in the form of "?S - Pm - ?O" where m = 1, 2, 3... M in the different queries.
  • A next action 602 illustrates that a first predefined query valid for a first known predicate P1, is executed on the database. Then, the number Ns P1 of distinct subjects stored with predicate P1 of the query is determined from the search results in an action 604. likewise, the number No P1 of distinct objects stored with predicate P1 of the query is also determined in a further action 606. In a following action 608, the statistics table is updated by entering the determined values Ns P1 and No P1 for predicate P1, basically the first row of the table in Fig. 5.
  • A further action 610 illustrates that the next predefined query, valid for the next known predicate P2, is taken for execution on the database. Actions 602 - 608 are thus repeated to obtain statistics for that query, and the next row in the table in Fig. 5 can be populated with the determined values. This process is repeated until all the predefined statistics collecting queries have been executed for all known predicates P1, P2... PM. However, this solution does not exclude the possibility of omitting a statistics collecting query for any of the predicates P1, P2... PM, e.g. if that omitted predicate is of minor interest.
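  • The loop of actions 600 - 610 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the database is modelled as an in-memory list of (subject, predicate, object) tuples, and the names `run_statistics_query`, `collect_statistics`, `TRIPLES` and the sample data are hypothetical and do not come from the described server.

```python
def run_statistics_query(triples, predicate):
    """Emulate the query "?S - Pm - ?O": return all (subject, object)
    pairs stored with the given predicate (action 602)."""
    return [(s, o) for (s, p, o) in triples if p == predicate]


def collect_statistics(triples, predicates):
    """Build a statistics table {Pm: (Ns, No)} in the style of Fig. 5."""
    table = {}
    for pm in predicates:                       # action 610: next predicate
        results = run_statistics_query(triples, pm)
        n_s = len({s for s, _ in results})      # action 604: distinct subjects
        n_o = len({o for _, o in results})      # action 606: distinct objects
        table[pm] = (n_s, n_o)                  # action 608: update the table
    return table


# Hypothetical sample data, for illustration only.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("alice", "livesIn", "stockholm"),
    ("bob", "livesIn", "stockholm"),
]

STATS = collect_statistics(TRIPLES, ["knows", "livesIn"])
print(STATS)
```

  • Omitting a predicate of minor interest, as the text allows, corresponds here to leaving it out of the `predicates` argument.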
  • Above, it has been described in detail how the reduction rate RR can be calculated in the run-time phase for sub-queries of the second set of an incoming database query, using the expression (1). This expression contains the above-determined parameters Ns Pm and No Pm which are available from the table in Fig. 5 for each unique predicate Pm appearing in the sub-queries of the second set.
  • Expression (1) further contains the parameter MUB(VS), determined as the minimum value of the statistics collected for the subject variable VS, and the parameter MUB(VO), determined as the minimum value of the statistics collected for the object variable VO. In Fig. 7, an exemplary table of calculated RR values for a set of sub-queries is depicted, and an exemplary table of MUB values determined for different variables from the collected statistics is depicted in Fig. 8.
  • In this example, a second set of eight sub-queries in total has been defined when rewriting a received database query. The table of Fig. 7 comprises a first column with triplet identities T1 - T8, a second column with triplets of the actual sub-queries in the second set, a third column with Ns Pm values retrieved from the statistics table of Fig. 5 for each predicate Pm in the sub-queries, a fourth column with No Pm values likewise retrieved from the statistics table of Fig. 5 for each predicate Pm, and a fifth column with the calculated RR values for each sub-query in the table.
  • For example, for the sub-query of T3 "?A - P3 - ?C", the number Ns P3 of distinct subjects stored with predicate P3 is 2731 and the number No P3 of distinct objects stored with that predicate P3 is 3675, and so forth. Using this information in the table of Fig. 7, the MUB values can be determined for different variables appearing in the eight sub-queries. As mentioned above, the MUB value for a variable is determined as the minimum value of the statistics collected for that variable.
  • First considering the variable "A" in Fig. 7, it appears in four sub-queries T1, T2, T3 and T6 as either subject (T1-T3) or as object (T6). Of these four sub-queries, the minimum value of either Ns Pm or No Pm for variable "A" is in this example 1556 out of the values 1556, 2987, 2731 and 3051 for sub-queries T1, T2, T3 and T6, respectively, which is entered as the MUB value for variable "A" in the table of Fig. 8 as shown. In this way, corresponding MUB values can be determined for the remaining variables G, C, U,... appearing in the sub-queries specified in Fig. 7.
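  • The determination of MUB values can be illustrated with a short Python sketch. It is a hedged example, not the patented implementation: only the four values for variable "A" (1556, 2987, 2731, 3051) and the value 3675 for the objects of P3 are taken from the text; the remaining counts, the predicate name P6, and the variables "?G", "?U" and "?X" placed in the other positions are hypothetical placeholders, since Fig. 7 is not reproduced here.

```python
# Sub-queries as (subject, predicate, object); variables start with "?".
SUB_QUERIES = {
    "T1": ("?A", "P1", "?G"),   # object variable is a placeholder
    "T2": ("?A", "P2", "?U"),   # object variable is a placeholder
    "T3": ("?A", "P3", "?C"),   # as given in the text
    "T6": ("?X", "P6", "?A"),   # subject variable and P6 are placeholders
}

# Statistics table {Pm: (Ns, No)}; values not quoted in the text are made up.
STATS = {
    "P1": (1556, 4000),
    "P2": (2987, 5000),
    "P3": (2731, 3675),
    "P6": (6000, 3051),
}


def minimum_upper_bounds(sub_queries, stats):
    """For each variable, take the minimum of Ns (where it is a subject)
    and No (where it is an object) over all sub-queries it appears in."""
    mub = {}
    for subject, predicate, obj in sub_queries.values():
        n_s, n_o = stats[predicate]
        for term, count in ((subject, n_s), (obj, n_o)):
            if term.startswith("?"):            # constants carry no MUB
                mub[term] = min(mub.get(term, count), count)
    return mub


MUB = minimum_upper_bounds(SUB_QUERIES, STATS)
print(MUB["?A"])
```

  • With these inputs the value for "?A" is the minimum of 1556, 2987, 2731 and 3051, matching the entry described for Fig. 8.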
  • Then, these MUB values can be used in expression (1) to calculate the RR values for each sub-query in the table, which may be used to determine the optimal and most efficient order of executing the sub-queries in terms of search cost. As a result, there are in this example actually four sub-queries in the second set having the same RR value of 1, i.e. T1, T4, T5 and T8.
  • As described above, the sub-query having the greatest RR value of all should be executed first, the sub-query having the next greatest RR value should be executed next, and so forth. Consequently, the optimal order of execution in this example is T1, T4, T5, T8, T7, T3, T2 and finally T6. Therefore, according to this solution, the sub-queries of the second set can be provided basically in this order according to decreasing RR values of the sub-queries, as a modified query for execution in that order when searching the database. As said above, when sub-queries in the second set have the same RR value, the relative order of those sub-queries may be selected freely. In this case, the relative order of sub-queries T1, T4, T5 and T8, having the same RR value of 1, can be selected freely without reducing the efficiency of the total query.
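  • The calculation and ordering can be sketched in Python as follows. This is a minimal sketch under assumptions: expression (1) is taken to be the product of the two ratios MUB(VS)/Ns Pm and MUB(VO)/No Pm, every sub-query is assumed to have a variable as both subject and object, and all names and numbers below are hypothetical rather than the values of Fig. 7.

```python
# Hypothetical statistics table {Pm: (Ns, No)} and sub-queries.
STATS = {"P1": (100, 50), "P2": (400, 100), "P3": (200, 200)}
SUB_QUERIES = {
    "T1": ("?A", "P1", "?B"),
    "T2": ("?A", "P2", "?B"),
    "T3": ("?B", "P3", "?C"),
}


def minimum_upper_bounds(sub_queries, stats):
    """Minimum statistic seen for each variable over all sub-queries."""
    mub = {}
    for subject, predicate, obj in sub_queries.values():
        n_s, n_o = stats[predicate]
        for var, count in ((subject, n_s), (obj, n_o)):
            mub[var] = min(mub.get(var, count), count)
    return mub


def reduction_rates(sub_queries, stats):
    """RR(Pm) = MUB(Vs)/Ns * MUB(Vo)/No for each sub-query (assumed form)."""
    mub = minimum_upper_bounds(sub_queries, stats)
    rates = {}
    for name, (subject, predicate, obj) in sub_queries.items():
        n_s, n_o = stats[predicate]
        rates[name] = (mub[subject] / n_s) * (mub[obj] / n_o)
    return rates


RR = reduction_rates(SUB_QUERIES, STATS)
# Decreasing RR; Python's sort is stable, so ties keep their input order,
# which is one valid choice since tied sub-queries may be ordered freely.
ORDER = sorted(SUB_QUERIES, key=lambda name: RR[name], reverse=True)
print(RR, ORDER)
```

  • In this toy input, T1 reaches the RR value 1 because each of its variables takes its minimum statistic from T1 itself, so T1 is placed first.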
  • A more detailed exemplary procedure of performing improvement of a received database query will now be described with reference to the flow chart in Fig. 9, basically corresponding to actions 204 - 210 in Fig. 2. Accordingly, the procedure of Fig. 9 is performed by a server such as the one depicted in Fig. 3. It is also assumed that a configuring phase has been executed basically according to the flow chart of Fig. 6. In a first shown action 900, the database query is received from a querying party and the query is analysed in a next action 902 to identify a first set of sub-queries and their variables forming the received query.
  • A further action 904 illustrates that the identified first set of sub-queries and their variables are rewritten as a second set of sub-queries and their variables, based on the known ontology structure in the database, the collected statistics and the above defined semantic rules. In a next action 906, the above-described RR parameter is calculated for the sub-queries in the second set based on the collected statistics. Actions 904 and 906 basically correspond to actions 206 and 208 above, respectively, and are thus not necessary to describe here again.
  • Then, the server determines an execution order of the sub-queries in the second set according to decreasing RR values of the sub-queries to form a modified query, in a further action 908. An example of how actions 906 and 908 can be put into practice was described in connection with Figs. 7 and 8. Finally, the modified query is provided for execution on the database. The actual search may be executed by the server, e.g. when having a search module according to Fig. 3, or by an external search engine or the like.
  • While the invention has been described with reference to specific exemplary embodiments, the description is generally only intended to illustrate the inventive concept and should not be taken as limiting the scope of the invention. The invention is defined by the appended claims.

Claims (13)

  1. A method in a server (300) for handling a database query directed to a semantic database with stored information according to a preset ontology structure and annotated in a machine-readable format as element identifiers comprising triplets with a subject, a predicate, and an object, the method comprising:
    - collecting (200) statistics derived from said information in the database, wherein collecting said statistics comprises determining a number $N_s^{P}$ of distinct subjects S appearing in triplets with a predicate P and further determining a number $N_o^{P}$ of distinct objects O appearing in triplets with the predicate P, and wherein said determining of $N_s^{P}$ and $N_o^{P}$ is performed for individual unique predicates P1, P2,... PM appearing in the database,
    - defining (202) semantic rules relating to conclusions implied by the information in the database,
    - receiving (204) the database query forming a first set of sub-queries,
    - rewriting (206) the received database query as a second set of sub-queries in the triplet format based on at least the defined rules and the preset ontology structure,
    - calculating (208) a Reduction Rate, RR, value for the sub-queries of the second set based on the collected statistics, said Reduction Rate, RR, value relating to the number of distinct subjects S and objects O appearing in triplets in the database with predicates provided in the sub-queries of the second set, and
    - providing (210) the sub-queries of the second set as a modified query in an order according to decreasing Reduction Rate, RR, values of said sub-queries, for execution in that order when searching the database.
  2. The method according to claim 1, wherein the received database query has a format according to SPARQL.
  3. The method according to claim 1 or 2, wherein the sub-queries of the first and second sets are defined in terms of "SELECT - WHERE" clauses.
  4. The method according to any of claims 1-3, wherein $N_s^{P}$ and $N_o^{P}$ are determined by executing predefined queries with said unique predicates in the database and variable subjects (?S) and objects (?O).
  5. The method according to any of claims 1-4, wherein the Reduction Rate, RR, value is calculated for a sub-query of the second set with a predicate Pm as:
    $$RR(P_m) = \frac{MUB(V_s)}{N_s^{P_m}} \times \frac{MUB(V_o)}{N_o^{P_m}}$$
    where $MUB(V_s)$ is a Minimum Upper Bound for the subject variable $V_s$ in the sub-query, $MUB(V_o)$ is a Minimum Upper Bound for the object variable $V_o$ in the sub-query, $N_s^{P_m}$ is the number of distinct subjects S appearing in triplets with the predicate Pm, and $N_o^{P_m}$ is the number of distinct objects O appearing in triplets with the predicate Pm.
  6. The method according to claim 5, wherein said Minimum Upper Bound for the subject variable is determined as the minimum value of the statistics collected for the subject variable, wherein said Minimum Upper Bound for the object variable is determined as the minimum value of the statistics collected for the object variable, and wherein the Minimum Upper Bound is determined for basically all variables appearing in the sub-queries of the second set.
  7. The method according to any of claims 1-6, wherein said order of sub-queries of the second set dictates that the sub-query having the greatest RR value of all is executed first, the sub-query having the next greatest RR value is executed next, and so forth.
  8. A server (300) configured to handle a database query directed to a semantic database (302) with stored information according to a preset ontology structure and annotated in a machine-readable format as element identifiers comprising triplets with a subject, a predicate, and an object, wherein the server (300) comprises:
    - a data analyser (300a) adapted to collect statistics derived from said information in the database and to define semantic rules relating to conclusions implied by the information in the database, wherein the data analyser (300a) is adapted to collect said statistics by determining a number $N_s^{P}$ of distinct subjects S appearing in triplets with a predicate P and by determining a number $N_o^{P}$ of distinct objects O appearing in triplets with the predicate P, and wherein the data analyser (300a) is further adapted to perform said determining of $N_s^{P}$ and $N_o^{P}$ for individual unique predicates P1, P2,... PM appearing in the database,
    - a communication module (300b) adapted to receive the database query forming a first set of sub-queries, and
    - a query optimiser (300c) adapted to:
    rewrite the received database query as a second set of sub-queries in the triplet format based on at least the defined rules and the preset ontology structure,
    calculate a Reduction Rate, RR, value for the sub-queries in the second set based on the statistics collected in the data analyser, said Reduction Rate, RR, value relating to the number of distinct subjects S and objects O appearing in triplets in the database with predicates provided in the sub-queries in the second set, and
    provide the sub-queries of the second set as a modified query in an order according to decreasing Reduction Rate, RR, values of said sub-queries, for execution in that order when searching the database.
  9. The server (300) according to claim 8, further comprising a search module (300d) adapted to perform a search by executing the sub-queries of the second set in the database, wherein the communication module is further adapted to return a response with results from said search.
  10. The server (300) according to claim 8 or 9, wherein the data analyser (300a) is further adapted to perform said determining of $N_s^{P}$ and $N_o^{P}$ by executing predefined queries with said unique predicates in the database and variable subjects (?S) and objects (?O).
  11. The server (300) according to any of claims 8-10, wherein the query optimiser (300c) is further adapted to calculate the Reduction Rate, RR, value for a sub-query of the second set with a predicate Pm as:
    $$RR(P_m) = \frac{MUB(V_s)}{N_s^{P_m}} \times \frac{MUB(V_o)}{N_o^{P_m}}$$
    where $MUB(V_s)$ is a Minimum Upper Bound for the subject variable $V_s$ in the sub-query, $MUB(V_o)$ is a Minimum Upper Bound for the object variable $V_o$ in the sub-query, $N_s^{P_m}$ is the number of distinct subjects S appearing in triplets with the predicate Pm, and $N_o^{P_m}$ is the number of distinct objects O appearing in triplets with the predicate Pm.
  12. The server (300) according to claim 11, wherein the query optimiser (300c) is further adapted to determine said Minimum Upper Bound for the subject variable as the minimum value of the statistics collected for the subject variable, to determine said Minimum Upper Bound for the object variable as the minimum value of the statistics collected for the object variable, and to determine the Minimum Upper Bound for basically all variables appearing in the sub-queries of the second set.
  13. The server (300) according to any of claims 8-12, wherein said order of sub-queries of the second set dictates that the sub-query having the greatest RR value of all is executed first, the sub-query having the next greatest RR value is executed next, and so forth.
EP10853759.8A 2010-06-21 2010-06-21 Method and server for handling database queries Active EP2583195B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/SE2010/050702 WO2011162645A1 (en) 2010-06-21 2010-06-21 Method and server for handling database queries

Publications (3)

Publication Number Publication Date
EP2583195A1 EP2583195A1 (en) 2013-04-24
EP2583195A4 EP2583195A4 (en) 2017-08-16
EP2583195B1 EP2583195B1 (en) 2019-11-27

Family

ID=45371633

Family Applications (1)

Application Number Title Priority Date Filing Date
EP10853759.8A Active EP2583195B1 (en) 2010-06-21 2010-06-21 Method and server for handling database queries

Country Status (3)

Country Link
US (1) US8843473B2 (en)
EP (1) EP2583195B1 (en)
WO (1) WO2011162645A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8793208B2 (en) * 2009-12-17 2014-07-29 International Business Machines Corporation Identifying common data objects representing solutions to a problem in different disciplines
US8747115B2 (en) 2012-03-28 2014-06-10 International Business Machines Corporation Building an ontology by transforming complex triples
US8539001B1 (en) 2012-08-20 2013-09-17 International Business Machines Corporation Determining the value of an association between ontologies
US20140280293A1 (en) * 2013-03-12 2014-09-18 Mckesson Financial Holdings Method and apparatus for retrieving cached database search results
US9037568B1 (en) * 2013-03-15 2015-05-19 Google Inc. Factual query pattern learning
KR20140125488A (en) * 2013-04-19 2014-10-29 한국전자통신연구원 Method and apparatus for providing context awareness based network in smart ubiquitous networks
US9348895B2 (en) 2013-05-01 2016-05-24 International Business Machines Corporation Automatic suggestion for query-rewrite rules
EP3274867A1 (en) * 2015-03-27 2018-01-31 Entit Software LLC Optimize query based on unique attribute
US10262062B2 (en) * 2015-12-21 2019-04-16 Adobe Inc. Natural language system question classifier, semantic representations, and logical form templates
US20210103586A1 (en) 2019-10-07 2021-04-08 International Business Machines Corporation Ontology-based query routing for distributed knowledge bases
CN112632411A (en) * 2020-12-24 2021-04-09 武汉旷视金智科技有限公司 Target object data query method, device, equipment and storage medium

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5590319A (en) * 1993-12-15 1996-12-31 Information Builders, Inc. Query processor for parallel processing in homogenous and heterogenous databases
US20040243595A1 (en) * 2001-09-28 2004-12-02 Zhan Cui Database management system
US6947927B2 (en) 2002-07-09 2005-09-20 Microsoft Corporation Method and apparatus for exploiting statistics on query expressions for optimization
US7539667B2 (en) * 2004-11-05 2009-05-26 International Business Machines Corporation Method, system and program for executing a query having a union operator
CN101495953B (en) * 2005-01-28 2012-07-11 美国联合包裹服务公司 System and method of registration and maintenance of address data for each service point in a territory
US7877373B2 (en) * 2006-06-30 2011-01-25 Oracle International Corporation Executing alternative plans for a SQL statement
US20080040334A1 (en) * 2006-08-09 2008-02-14 Gad Haber Operation of Relational Database Optimizers by Inserting Redundant Sub-Queries in Complex Queries
US8335767B2 (en) * 2007-10-17 2012-12-18 Oracle International Corporation Maintaining and utilizing SQL execution plan histories
US20090119572A1 (en) * 2007-11-02 2009-05-07 Marja-Riitta Koivunen Systems and methods for finding information resources
US7818352B2 (en) * 2007-11-26 2010-10-19 Microsoft Corporation Converting SPARQL queries to SQL queries
US8386508B2 (en) * 2008-04-28 2013-02-26 Infosys Technologies Limited System and method for parallel query evaluation
US8862579B2 (en) * 2009-04-15 2014-10-14 Vcvc Iii Llc Search and search optimization using a pattern of a location identifier
US10628847B2 (en) * 2009-04-15 2020-04-21 Fiver Llc Search-enhanced semantic advertising
US20130166303A1 (en) * 2009-11-13 2013-06-27 Adobe Systems Incorporated Accessing media data using metadata repository
US9442930B2 (en) * 2011-09-07 2016-09-13 Venio Inc. System, method and computer program product for automatic topic identification using a hypertext corpus
US9442928B2 (en) * 2011-09-07 2016-09-13 Venio Inc. System, method and computer program product for automatic topic identification using a hypertext corpus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BIRTE GLIMM ET AL: "SPARQL 1.1 Entailment Regimes - W3C Working Draft", 1 June 2010 (2010-06-01), pages 1 - 23, XP055512154, Retrieved from the Internet <URL:https://www.w3.org/TR/2010/WD-sparql11-entailment-20100601/> [retrieved on 20181004] *

Also Published As

Publication number Publication date
EP2583195A4 (en) 2017-08-16
EP2583195A1 (en) 2013-04-24
US8843473B2 (en) 2014-09-23
US20130091119A1 (en) 2013-04-11
WO2011162645A1 (en) 2011-12-29

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20121210

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

DAX Request for extension of the european patent (deleted)
RA4 Supplementary search report drawn up and despatched (corrected)

Effective date: 20170719

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 17/30 20060101AFI20170713BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20181012

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 602010062197

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: G06F0017300000

Ipc: G06F0016245300

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 16/2453 20190101AFI20190619BHEP

INTG Intention to grant announced

Effective date: 20190709

RIN1 Information on inventor provided before grant (corrected)

Inventor name: MORITZ, SIMON

Inventor name: HUANG, VINCENT

Inventor name: SHEN, XIANWEI

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602010062197

Country of ref document: DE

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1207486

Country of ref document: AT

Kind code of ref document: T

Effective date: 20191215

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20191127

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200228

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200227

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200227

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200327

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200419

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602010062197

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1207486

Country of ref document: AT

Kind code of ref document: T

Effective date: 20191127

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20200828

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200621

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20200630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200621

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200630

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200630

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191127

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20220628

Year of fee payment: 13

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20220629

Year of fee payment: 13

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602010062197

Country of ref document: DE

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20230621

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20240103

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20230621