WO2023124729A1 - Data query method and apparatus, and device and storage medium - Google Patents

Data query method and apparatus, and device and storage medium

Info

Publication number
WO2023124729A1
WO2023124729A1 PCT/CN2022/135606 CN2022135606W WO2023124729A1 WO 2023124729 A1 WO2023124729 A1 WO 2023124729A1 CN 2022135606 W CN2022135606 W CN 2022135606W WO 2023124729 A1 WO2023124729 A1 WO 2023124729A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
query
data
tree
nodes
Prior art date
Application number
PCT/CN2022/135606
Other languages
French (fr)
Chinese (zh)
Inventor
邹磊
Original Assignee
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学
Publication of WO2023124729A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24539 Query rewriting; Transformation using cached or materialised query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Definitions

  • The present application relates to the technical field of graph databases, and in particular to a method, apparatus, device and storage medium for querying data.
  • RDF: Resource Description Framework
  • Each edge in the knowledge graph is expressed as an RDF triple of the form "subject, predicate, object", which represents a named relationship between a pair of entities or a named property value that an entity has.
  • SPARQL: SPARQL Protocol and RDF Query Language, a query language and data access protocol
  • UNION: union
  • OPTIONAL: optional matching
  • FILTER: filtering
  • When a computer device executes the query operation corresponding to a data query statement, it only sequentially executes the query processing corresponding to each query expression, so the efficiency of querying data is low.
  • Embodiments of the present application provide a data query method, apparatus, device, and storage medium, which can improve data query efficiency. The technical solution is as follows:
  • a method for querying data comprising:
  • the types of nodes in the first query tree include a merge node and a query node, wherein the merge node is used to represent the data query statement or a subquery statement in the data query statement, the query node is used to represent a query word in the data query statement or in a subquery statement of the data query statement, and the query node includes at least one of a BGP node, a UNION node, an OPTIONAL node, and a FILTER node.
  • the simplification of the first query tree based on the types of nodes in the first query tree to obtain a second query tree includes:
  • if the child nodes of the first merging node include multiple BGP nodes, performing merging processing on the multiple BGP nodes to obtain a merged first BGP node, deleting the first merging node, and adding the first BGP node at the position of the first merging node;
  • if the child nodes of the second merging node include at least one BGP node and at least one UNION node, merging the at least one BGP node to obtain a merged second BGP node, and merging the at least one UNION node to obtain a merged third UNION node; merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, deleting the second merging node, and adding the fourth UNION node at the position of the second merging node;
  • the method further includes:
  • performing the query operation corresponding to each node in the second query tree sequentially in the graph database includes:
  • according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations of the first sibling node of the first OPTIONAL node and of the second sibling node of the at least one second OPTIONAL node;
  • according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations of the child nodes of the first OPTIONAL node and of the child nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling nodes of that OPTIONAL node.
  • the simplification of the first query tree based on the types of nodes in the first query tree to obtain a second query tree includes:
  • the conversion condition is that the FILTER condition corresponding to the FILTER node is composed of variables, constants, and three operators: and, or, and equal.
  • performing the query operation corresponding to each node in the second query tree sequentially in the graph database includes:
  • the graph data is queried for data corresponding to triplet patterns other than the partial common triplet patterns in multiple BGP nodes.
  • a device for querying data comprising:
  • a receiving module configured to receive a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement
  • a processing module configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree
  • a query module configured to sequentially execute query operations corresponding to each node in the second query tree in the graph database based on a preset execution sequence, to obtain data query results
  • a returning module configured to return the data query result to the data query application program.
  • the types of nodes in the first query tree include a merge node and a query node, wherein the merge node is used to represent the data query statement or a subquery statement in the data query statement, the query node is used to represent a query word in the data query statement or in a subquery statement of the data query statement, and the query node includes at least one of a BGP node, a UNION node, an OPTIONAL node, and a FILTER node.
  • the query module is used for:
  • if the child nodes of the first merging node include multiple BGP nodes, performing merging processing on the multiple BGP nodes to obtain a merged first BGP node, deleting the first merging node, and adding the first BGP node at the position of the first merging node;
  • if the child nodes of the second merging node include at least one BGP node and at least one UNION node, merging the at least one BGP node to obtain a merged second BGP node, and merging the at least one UNION node to obtain a merged third UNION node; merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, deleting the second merging node, and adding the fourth UNION node at the position of the second merging node;
  • processing module is also used for:
  • the query module is used for:
  • according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations of the first sibling node of the first OPTIONAL node and of the second sibling node of the at least one second OPTIONAL node;
  • according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations of the child nodes of the first OPTIONAL node and of the child nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling nodes of that OPTIONAL node.
  • processing module is used for:
  • the conversion condition is that the FILTER condition corresponding to the FILTER node is composed of variables, constants, and three operators: and, or, and equal.
  • the query module is used for:
  • the graph data is queried for data corresponding to triplet patterns other than the partial common triplet patterns in multiple BGP nodes.
  • In a third aspect, a computer device is provided. The computer device includes a processor and a memory, at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the operations performed in the first aspect above.
  • a computer-readable storage medium is provided, at least one instruction is stored in the storage medium, and the at least one instruction is loaded and executed by a processor to implement the operations performed in the first aspect above.
  • In a fifth aspect, a computer program product is provided. The computer program product includes at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the operations performed in the first aspect above.
  • the query of the data query statement is converted into the first query tree, and the first query tree can then be simplified according to the query logic corresponding to each query node in it to obtain the second query tree.
  • the query operation of the data query statement can be simplified by simplifying the query tree, thereby improving the efficiency of the data query.
  • Fig. 1 is a flow chart of a method for querying data provided by an embodiment of the present application
  • Fig. 2 is a schematic diagram of a method for querying data provided by an embodiment of the present application
  • Fig. 3 is a flow chart of a method for querying data provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a method for querying data provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a method for querying data provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a method for querying data provided by an embodiment of the present application.
  • FIG. 7 is a flow chart of a method for querying data provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a method for querying data provided by an embodiment of the present application.
  • FIG. 9 is a flow chart of a method for querying data provided by an embodiment of the present application.
  • FIG. 10 is a flow chart of a method for querying data provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a device for querying data provided by an embodiment of the present application.
  • Fig. 12 is a schematic diagram of a computer device for querying data provided by an embodiment of the present application.
  • a method for querying data provided in the embodiment of the present application may be implemented by a computer device.
  • An application program for querying data (such as a data query application program) can run in the computer device.
  • the computer device includes at least a processor and a memory, wherein the memory can be used to store data related to the method for executing the query data, for example, it can include a graph database, program codes corresponding to the method for executing the query data, and the like.
  • the processor can execute the program code stored in the memory, and implement the data query method provided by the embodiment of the present application according to the data query request of the application program.
  • the computer device may be a terminal or a server.
  • the terminal may be a mobile phone, a tablet computer, a smart wearable device, a desktop computer, a notebook computer, and the like.
  • the server can establish communication with the terminal.
  • the server can be a single server or a server group. If it is a single server, the server can be responsible for all the processing in the following schemes; if it is a server group, different servers in the server group can be responsible for different processing in the following schemes.
  • the specific processing allocation can be set arbitrarily by technicians according to actual needs, and will not be repeated here.
  • RDF: Resource Description Framework
  • Each edge in the knowledge graph is expressed as an RDF triple of the form "subject, predicate, object", which represents a named relationship between a pair of entities or a named property value that an entity has.
  • SPARQL (SPARQL Protocol and RDF Query Language, query language and data acquisition protocol) is a standard query language for accessing RDF datasets.
  • RDF datasets consist of multiple RDF triples, which can store arbitrary graph data. For each edge in graph data there is a unique RDF triple.
  • RDF triple: let the pairwise disjoint infinite sets I, B and L denote Internationalized Resource Identifiers (IRIs), blank nodes and literals, respectively.
  • RDF dataset: an RDF dataset is a collection of RDF triples.
  • Triple pattern: let an infinite set V, disjoint from the sets I, B and L above, denote variables.
  • BGP (Basic Graph Pattern): includes at least one triple pattern. A triple pattern t is a BGP; if both P1 and P2 are BGPs, then P1 AND P2 is also a BGP.
  • BGP is the basic unit for finding data in graph data. Through the existing BGP matching algorithm, data matching each triple pattern in BGP can be found in the RDF data set.
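  • As an illustration of BGP matching only, the following minimal sketch models an RDF dataset as a Python set of (subject, predicate, object) tuples and matches a BGP by naive nested loops. The `ex:` names, the `?`-prefix convention for variables, and the functions `match_triple` and `match_bgp` are assumptions made for this sketch; they are not the matching algorithm referred to in the application.

```python
# Hypothetical toy dataset: each RDF triple is a (subject, predicate, object) tuple.
DATASET = {
    ("ex:TX", "ex:name", "Beijing TX Computer System Co., Ltd."),
    ("ex:A", "ex:Shareholding", "ex:TX"),
    ("ex:B", "ex:Shareholding", "ex:TX"),
}


def is_var(term):
    """Assumed convention: variables are strings starting with '?'."""
    return term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_bgp(bgp, dataset):
    """Match every triple pattern of the BGP, one pattern at a time."""
    bindings = [{}]
    for pattern in bgp:
        bindings = [
            extended
            for b in bindings
            for triple in dataset
            if (extended := match_triple(pattern, triple, b)) is not None
        ]
    return bindings
```

  • For instance, `match_bgp([("?x", "ex:Shareholding", "ex:TX")], DATASET)` binds `?x` to the two hypothetical shareholders in the toy dataset.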
  • P FILTER C is a graph pattern.
  • UNION, OPTIONAL and FILTER are commonly used query expressions in SPARQL data query statements, among which:
  • UNION refers to the combined search of multiple graph patterns. For example, P1 UNION P2 refers to searching the RDF dataset for triples that satisfy pattern P1 and for triples that satisfy pattern P2, and taking the union of the search results.
  • OPTIONAL refers to optional matching of graph patterns. For example, P1 OPTIONAL {P2} refers to retaining the results that satisfy graph pattern P1 in the RDF dataset and, on that basis, adding the compatible results that satisfy graph pattern P2.
  • FILTER is a conditional filter for search results.
  • P1 FILTER C refers to filtering, from the search results corresponding to pattern P1, the data that meets condition C.
  • Group graph pattern P: a group graph pattern P is recursively defined as follows:
  • P1 is a group graph pattern or a UNION graph pattern
  • P2 is a group graph pattern
  • P1 UNION P2 is a UNION graph pattern
  • A graph pattern P is called well-defined if and only if it satisfies the following conditions:
  • the principle of the invention is based on the semantics of select queries in SPARQL queries.
  • the form of the select query is "SELECT v1 v2 ... vk WHERE {...}", where the SELECT clause represents the query header and the WHERE clause represents the query body.
  • the SELECT clause determines the projection variable, that is, the variable that needs to appear in the query result; the WHERE clause gives the graph pattern that needs to be matched with the RDF dataset, that is, the WHERE clause gives the data query statement.
  • Each mapping μ is a function from a set of variables to a combination of results.
  • The set of variables appearing in the mapping μ is denoted dom(μ).
  • If there are two mappings μ1 and μ2, the operations that μ1 and μ2 can perform are as follows:
  • The mapping (denoted [[P]]_D) produced by matching a graph pattern P with an RDF dataset D is recursively defined as follows:
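  • The operations on mappings are not reproduced in this text. As a hedged sketch, the standard SPARQL-style operations can be written as below, with each mapping modelled as a Python dict and each result as a list of dicts. The function names and the duplicate-free treatment of union are assumptions of this sketch, not definitions taken from the application.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def join(o1, o2):
    """Natural join: merge every compatible pair (semantics of AND)."""
    return [{**a, **b} for a in o1 for b in o2 if compatible(a, b)]


def union(o1, o2):
    """Union of two results (semantics of UNION); duplicates dropped here."""
    return o1 + [m for m in o2 if m not in o1]


def left_join(o1, o2):
    """Left outer join (semantics of OPTIONAL): keep unmatched left mappings."""
    out = []
    for a in o1:
        matches = [{**a, **b} for b in o2 if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out


def filter_mappings(o, condition):
    """Keep only mappings that satisfy the condition (semantics of FILTER)."""
    return [m for m in o if condition(m)]
```

  • With these helpers, P1 AND P2 corresponds to `join`, P1 UNION P2 to `union`, P1 OPTIONAL {P2} to `left_join`, and P1 FILTER C to `filter_mappings`.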
  • the object of the present invention is to provide an optimization algorithm for generation of SPARQL query execution plans containing UNION, OPTIONAL and FILTER expressions in graph databases, so as to solve the problem of low efficiency of such queries in existing graph database systems.
  • FIG. 1 is a flow chart of a method for querying data provided by an embodiment of the present application. The method can be realized by the computer device mentioned above. Referring to Fig. 1, this embodiment includes:
  • Step 101 receiving a data query instruction sent by a data query application program.
  • the data query application program can be used to query the graph data stored in the storage database, and the graph data can be stored in the form of RDF data set.
  • For example, the graph data can be the equity relationships between different companies, where the nodes in the graph data can include the name, scale, and establishment time of a company, and the edges in the graph data can represent the relationships between companies, such as shareholding, share quota, etc.
  • For example, an RDF triple in the RDF dataset can be: <http://example.com/TX> <http://example.com/name> "Beijing TX Computer System Co., Ltd.". This RDF triple indicates that the name of the company TX is Beijing TX Computer System Co., Ltd.
  • the data query application program can run in the computer equipment, and the user can input data query statements in the data query application program according to business requirements, and can trigger a data query instruction after inputting the data query statement.
  • the processor in the computer device may perform the following processing according to the data query statement carried by the user in the data query instruction.
  • Step 102 Based on the structure of the data query statement, a first query tree corresponding to the data query statement is established.
  • a query tree (also referred to as a BE tree) corresponding to the data query statement may be established according to the structure of the data query statement and each query word.
  • the types of nodes in the query tree may include merge nodes and query nodes, and the merge nodes may represent data query statements or subquery statements in the data query statements. Among them, the data query statement and the subquery statement are both graph patterns.
  • the merging node represents a data query statement
  • the merging node is the root node of the first query tree.
  • Query nodes represent query words that appear in data query statements or sub-query statements, such as BGP, UNION, OPTIONAL, FILTER, etc.
  • A query node representing BGP can be called a BGP node, a query node representing UNION can be called a UNION node, a query node representing OPTIONAL can be called an OPTIONAL node, and a query node representing FILTER can be called a FILTER node.
  • The query tree established directly from the data query statement may be called the first query tree. It should be understood that establishing the first query tree means representing the data query statement and each sub-query statement by merge nodes and representing the query words in the data query statement by query nodes, so that the resulting tree has the same structure as the nesting of sub-query statements and query words in the data query statement.
  • For example, the structure of the data query statement is ((b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5)) FILTER c1, where b1-b5 and c1 are BGP, which can be expressed as BGP nodes.
  • (b2 UNION b3), (b1 AND (b2 UNION b3)), (b4 UNION b5) and (b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5) are different sub-query statements, which can be expressed by merging nodes.
  • the first query tree corresponding to the data query statement can be as shown in Figure 2.
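  • Since Figure 2 is not reproduced in this text, the following is only one plausible layout of the first query tree for the example statement. The `Node` class, the helper constructors, the choice to wrap each UNION branch in its own merge node, and the treatment of c1 as the condition of a FILTER node are all assumptions of this sketch.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                       # "MERGE", "BGP", "UNION", "OPTIONAL" or "FILTER"
    label: str = ""                 # e.g. "b1" for a BGP, "c1" for a FILTER condition
    children: list = field(default_factory=list)


def bgp(label):
    return Node("BGP", label)


def merge(*children):
    return Node("MERGE", children=list(children))


def union(*branches):
    return Node("UNION", children=list(branches))


def optional(child):
    return Node("OPTIONAL", children=[child])


# ((b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5)) FILTER c1
first_query_tree = merge(
    merge(bgp("b1"), union(merge(bgp("b2")), merge(bgp("b3")))),
    optional(merge(union(merge(bgp("b4")), merge(bgp("b5"))))),
    Node("FILTER", "c1"),
)


def count_kinds(node):
    """Count how many nodes of each kind the (sub)tree contains."""
    counts = Counter([node.kind])
    for child in node.children:
        counts += count_kinds(child)
    return counts
```

  • The root merge node stands for the whole data query statement; every nested merge node stands for one sub-query statement.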
  • Step 103 based on the types of each node in the first query tree, simplify the first query tree to obtain a second query tree.
  • After the first query tree corresponding to the query statement is obtained, the first query tree can be simplified, and the query operation corresponding to the data query statement can then be executed according to the simplified query tree, which simplifies the query operation and improves the efficiency of data query.
  • The simplification of the first query tree merges, converts, and deletes nodes according to the query logic corresponding to the nodes in the first query tree, and does not change the query results corresponding to the data query statement.
  • The simplification processing differs for different types of nodes, which will not be described in detail here.
  • a query tree obtained after simplifying the first query tree may be called a second query tree.
  • Step 104 based on the preset execution order, sequentially execute the query operations corresponding to the nodes in the second query tree in the graph database to obtain the data query results.
  • the query operations corresponding to each node can be executed sequentially according to the depth of each node in the second query tree, and finally the query result of the node with the largest depth (i.e., the root node) is obtained; this query result is the query result of the data query statement in the data query instruction.
  • the BGP node may be executed first, then the corresponding UNION node or OPTIONAL node, and finally the FILTER node may be executed.
  • Step 105 returning the data query result to the data query application program.
  • the data query result can be sent to the query application program, and the query application program can display the corresponding query result in the query result display interface for the user to view.
  • For example, if the data query statement includes the following triple pattern: ?x <http://example.com/Shareholding> "Beijing TX Computer System Co., Ltd.", then after the above processing, all individuals, companies and so on holding shares of Beijing TX Computer System Co., Ltd. can be displayed in the query result display interface.
  • the query of the data query statement is converted into the first query tree, and the first query tree can then be simplified according to the query logic corresponding to each query node in it to obtain the second query tree.
  • the query operation of the data query statement can be simplified by simplifying the query tree, thereby improving the efficiency of the data query.
  • Figure 3 shows a simplification processing method provided by the present application, which includes:
  • Step 301 Determine the first query tree as the third query tree to be simplified, and determine the depth of each node in the third query tree.
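  • The text does not define depth explicitly. The sketch below assumes that depth is counted upward from the leaves, so that a node without children has depth 0, a merge node whose children are all BGP nodes has depth 1, and the root has the largest depth, which is consistent with the later statement that the root is the node with the largest depth. The dict-based node layout and the function name `depth` are hypothetical.

```python
def depth(node):
    """Assumed definition: height of the node above its deepest leaf."""
    children = node.get("children", [])
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)


# Hypothetical tree: a merge node with one BGP child and one UNION child.
inner_union = {"kind": "UNION", "children": [{"kind": "BGP"}, {"kind": "BGP"}]}
tree = {"kind": "MERGE", "children": [{"kind": "BGP"}, inner_union]}
```

  • Under this assumption the UNION node above has depth 1 and the merge node has depth 2, which matches the shape of the nodes handled in step 303.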
  • Step 302, for the first merging node whose depth is 1 in the third query tree, if the child nodes of the first merging node include multiple BGP nodes, perform merging processing on the multiple BGP nodes to obtain a merged first BGP node, delete the first merging node, and add the first BGP node at the position of the first merging node.
  • If the mappings corresponding to the BGP nodes P1-Pn before merging are [[P1]]_D, ..., [[Pn]]_D respectively, where n is the number of BGP nodes before merging, then the mapping corresponding to the merged BGP node Pm is the natural join of these mappings: [[Pm]]_D = [[P1]]_D ⋈ ... ⋈ [[Pn]]_D.
  • Figure 4 is a schematic diagram of step 302. After the first merging node with a depth of 1 is replaced with the corresponding first BGP node, the depth of the nodes is reduced and the data volume of intermediate results is reduced, which improves the efficiency of querying data.
  • If the child nodes of the first merging node also include a FILTER node, the FILTER node can be made a sibling node of the first BGP node and placed together with it at the position of the first merging node, and the scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the first BGP node is recorded.
  • Before each FILTER node is executed, the nodes included in the scope of the current FILTER node can be determined based on the correspondence recorded for it, and the query operation corresponding to the FILTER node is executed on the basis of the query results of the nodes included in the scope.
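  • The merging in step 302 can be sketched as follows, assuming that a BGP node stores a list of triple patterns and that merging BGP nodes simply concatenates those lists. The dict layout and the function name `simplify_depth1_merge` are hypothetical, and the FILTER scope bookkeeping described above is deliberately left out of this sketch.

```python
def simplify_depth1_merge(node):
    """Replace a merge node whose children are all BGP nodes by one BGP node.

    Any other node is returned unchanged.
    """
    children = node.get("children", [])
    only_bgps = bool(children) and all(c["kind"] == "BGP" for c in children)
    if node["kind"] != "MERGE" or len(children) < 2 or not only_bgps:
        return node
    # Assumption: merging BGP nodes concatenates their triple patterns.
    patterns = [p for child in children for p in child["patterns"]]
    return {"kind": "BGP", "patterns": patterns}


merge_node = {
    "kind": "MERGE",
    "children": [
        {"kind": "BGP", "patterns": [("?x", "ex:Shareholding", "?y")]},
        {"kind": "BGP", "patterns": [("?y", "ex:name", "?n")]},
    ],
}
merged = simplify_depth1_merge(merge_node)
```

  • The returned BGP node takes the place of the merge node, so the two child patterns are matched together instead of being matched separately and joined afterwards.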
  • Step 303, for the second merging node whose depth is 2 in the third query tree, if the child nodes of the second merging node include at least one BGP node and at least one UNION node, merge the at least one BGP node to obtain a merged second BGP node, and merge the at least one UNION node to obtain a merged third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, delete the second merging node, and add the fourth UNION node at the position of the second merging node.
  • the BGP nodes can be merged to obtain the second BGP node; for this processing, refer to the processing of obtaining the first BGP node in step 302 above, which will not be repeated here.
  • each UNION node may be merged to obtain a third UNION node. For example, there are two UNION nodes u1 and u2 that need to be merged. Each UNION node has two child nodes (all BGP nodes), whose mappings are p1, p2 and q1, q2 respectively. The merged third UNION node then has four child nodes, whose mappings are p1 ⋈ q1, p1 ⋈ q2, p2 ⋈ q1 and p2 ⋈ q2.
  • More generally, if the child nodes of the second merging node include m UNION nodes, where the i-th UNION node ui has ni child nodes (BGP nodes), then the third UNION node obtained by the final merging has N = n1 × n2 × ... × nm child nodes.
  • N different mapping sets may be determined according to the BGP nodes of the m UNION nodes.
  • the mapping for each child node of the third UNION node is obtained by performing natural joins on the mappings in the corresponding mapping set.
  • the second BGP node may then be merged into the child nodes of the third UNION node u3. For example, if the mapping of the second BGP node is px and the third UNION node has four child nodes whose mappings are p1 ⋈ q1, p1 ⋈ q2, p2 ⋈ q1 and p2 ⋈ q2, then after the second BGP node is merged into the child nodes of the third UNION node, the obtained fourth UNION node u4 still has four child nodes, whose mappings are px ⋈ p1 ⋈ q1, px ⋈ p1 ⋈ q2, px ⋈ p2 ⋈ q1 and px ⋈ p2 ⋈ q2 respectively.
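  • The two merges of step 303 can be sketched with `itertools.product`, assuming that a UNION node is a list of branches, that each branch is a BGP given as a list of triple-pattern names, and that joining BGPs is modelled by concatenating those lists. The function names `merge_unions` and `push_bgp_into_union` and the placeholder names p1, p2, q1, q2, px are illustrative only.

```python
from itertools import product


def merge_unions(unions):
    """Merge several UNION nodes into one.

    Each branch of the result combines one branch from every input UNION,
    so m UNION nodes with n1, ..., nm branches yield n1 * ... * nm branches.
    """
    return [sum(combo, []) for combo in product(*unions)]


def push_bgp_into_union(bgp, union):
    """Merge a BGP into every branch of a UNION node."""
    return [bgp + branch for branch in union]


u1 = [["p1"], ["p2"]]          # UNION node u1 with two BGP branches
u2 = [["q1"], ["q2"]]          # UNION node u2 with two BGP branches
u3 = merge_unions([u1, u2])    # third UNION node
u4 = push_bgp_into_union(["px"], u3)  # fourth UNION node
```

  • The number of branches grows multiplicatively with the number of merged UNION nodes, so this rewrite trades tree depth for a wider UNION node.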
  • Figure 5 is a schematic diagram of step 303. After the second merging node with a depth of 2 is replaced with the corresponding fourth UNION node, the depth of the nodes is reduced and the amount of data in intermediate results is reduced, which improves the efficiency of querying data.
  • the UNION node can directly replace the original merging node with a depth of 2 to reduce the node depth. If the child nodes of the merging node with a depth of 2 include only one BGP node and one UNION node, the BGP node can be directly merged into the child nodes of the UNION node to reduce the node depth.
  • If the child nodes of the merging node with a depth of 2 include one BGP node and multiple UNION nodes, the multiple UNION nodes can be merged first, and the BGP node can then be merged into the child nodes of the merged UNION node to reduce the node depth. If the child nodes of the merging node with a depth of 2 include multiple BGP nodes and one UNION node, the multiple BGP nodes can be merged first, and the merged BGP node can then be merged into the child nodes of the UNION node to reduce the node depth.
  • If the child nodes of the second merging node also include a FILTER node, the FILTER node can be made a sibling node of the fourth UNION node and placed together with it at the position of the second merging node, and the scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the fourth UNION node is recorded.
  • Before each FILTER node is executed, the nodes included in the scope of the current FILTER node can be determined based on the correspondence recorded for it, and the query operation corresponding to the FILTER node is executed on the basis of the query results of the nodes included in the scope.
  • Step 304, for the fifth UNION node whose depth is 2 in the third query tree, add the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and delete the original grandchild nodes of the fifth UNION node and the parent nodes of those grandchild nodes, to obtain the simplified third query tree.
  • if the child nodes of the fifth UNION node also include a plurality of UNION nodes, the child nodes of those UNION nodes can be merged into the child nodes of the fifth UNION node, and the grandchild nodes of the fifth UNION node and the parent nodes of the grandchild nodes are deleted, to obtain the simplified query tree.
  • Fig. 6 is a schematic diagram of step 304, after deleting the grandchildren node of the fifth UNION node whose depth is 2, the node depth can be reduced, the data volume of intermediate results can be reduced, and the query data can be improved efficiency.
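  • Step 304 amounts to flattening nested UNION nodes. The sketch below assumes dict-based nodes and handles only UNION nodes nested directly inside a UNION node; the function name `flatten_union` is hypothetical, and other intermediate node types would need their own handling.

```python
def flatten_union(node):
    """Lift the children of nested UNION nodes into the outer UNION node."""
    if node["kind"] != "UNION":
        return node
    flat = []
    for child in node["children"]:
        if child["kind"] == "UNION":
            # The nested UNION node disappears; its branches move up one level.
            flat.extend(flatten_union(child)["children"])
        else:
            flat.append(child)
    return {"kind": "UNION", "children": flat}


nested = {
    "kind": "UNION",
    "children": [
        {"kind": "UNION", "children": [{"kind": "BGP", "label": "b1"},
                                       {"kind": "BGP", "label": "b2"}]},
        {"kind": "BGP", "label": "b3"},
    ],
}
flattened = flatten_union(nested)
```

  • Because UNION is associative, (b1 UNION b2) UNION b3 and the flattened three-branch UNION return the same results while the tree becomes one level shallower.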
  • Step 305 determine the depth of each node in the simplified query tree.
  • Step 306 if there are still query nodes in the simplified query tree that meet the requirements of step 302-step 304 for simplification, then the simplified query tree can be determined as the third query tree to be simplified, and jump to the corresponding Steps continue to simplify until there is no node that can be simplified in the simplified query tree.
  • since the processing of steps 302 to 304 above changes the depth corresponding to each node in the query tree, the depth corresponding to each node in the simplified query tree can be determined again. If there are still query nodes in the simplified query tree that can be simplified by steps 302 to 304, the corresponding simplification processing can continue to be performed on them, further reducing the depth of each node in the query tree, until it is determined that no node in the simplified query tree can be simplified.
  • the depth of the query tree is significantly reduced, the data volume of intermediate query results can be reduced, and the efficiency of querying data can be improved.
  • Figure 7 shows a simplification processing method provided by the present application, which includes:
  • Step 701: Determine a first OPTIONAL node in the first query tree, the first OPTIONAL node being an OPTIONAL node that has no OPTIONAL node among its ancestor nodes.
  • Step 702 Transform the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
  • step 701 may be executed first. If it is determined in step 701 that the first OPTIONAL node exists in the first query tree, the sub-query tree whose root node is the parent node of the first OPTIONAL node can be regarded as one BGP node, namely the third BGP node. That is, in the first query tree, the sub-query tree whose root node is the parent node of the first OPTIONAL node is deleted, and the third BGP node is added at the position of the parent node of the original first OPTIONAL node. As shown in Figure 8, the sub-query tree rooted at merge node 2 can be converted into a BGP3 node.
  • step 702 the processing in step 702 can be performed.
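Steps 701 and 702 can be illustrated with a short sketch. It assumes a hypothetical `Node` class in which a collapsed node keeps a reference (`wrapped`) to the subquery tree it replaces, so that the subquery tree can still be recovered at execution time; the function name `collapse_first_optionals` is likewise an assumption, and the root of the tree is assumed not to be an OPTIONAL node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str                                   # "MERGE", "BGP", "OPTIONAL", ...
    children: List["Node"] = field(default_factory=list)
    wrapped: Optional["Node"] = None            # original subtree behind a collapsed node


def collapse_first_optionals(node: Node) -> Node:
    """Walk the tree top-down. The first OPTIONAL node met on any path has no
    OPTIONAL ancestor, so its parent's whole subtree is replaced by one
    BGP-typed node that remembers the subtree it stands for."""
    if any(child.kind == "OPTIONAL" for child in node.children):
        return Node("BGP", wrapped=node)
    return Node(node.kind, [collapse_first_optionals(c) for c in node.children])


# Shape loosely modelled on the Figure 8 description: merge node 2 is the
# parent of the first OPTIONAL node and becomes a single BGP-typed node.
inner_optional = Node("OPTIONAL", [Node("MERGE", [Node("BGP")])])
first_optional = Node("OPTIONAL", [Node("MERGE", [Node("BGP"), inner_optional])])
merge2 = Node("MERGE", [Node("BGP"), first_optional])
root = Node("MERGE", [Node("BGP"), merge2])

collapsed = collapse_first_optionals(root)
```

Because the traversal stops at the collapsed node, nested OPTIONAL nodes inside the wrapped subtree are left untouched and are handled later, when the collapsed node is executed.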
  • Query operations can then be performed on each node of the query tree obtained after the processing in FIG. 3.
  • The query operation corresponding to the third BGP node can be handled in the following two ways:
  • The subquery tree corresponding to the third BGP node can be determined first; this is the subquery tree rooted at the parent node of the first OPTIONAL node in step 702 above.
  • The query operation corresponding to the sibling nodes of the first OPTIONAL node can be executed first to obtain a first query result, that is, a subgraph queried from the graph data. This subgraph is then used as the data query scope of the descendant nodes of the first OPTIONAL node, and the query operations corresponding to those descendant nodes are executed within it. This reduces the amount of data queried when executing the query operations of the descendant nodes of the first OPTIONAL node and improves the execution efficiency of the query operation.
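The execution order described above (sibling node first, OPTIONAL descendants second and only within the first result) can be sketched as follows. This is a minimal illustration, assuming a toy in-memory list of triples and modelling the data query scope as the variable bindings produced by the sibling node rather than as a subgraph; the helper names `match`, `evaluate` and `left_optional` are hypothetical.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.
    Terms starting with '?' are variables. Returns None on mismatch."""
    out = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if out.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return out


def evaluate(patterns, triples, seeds=({},)):
    """Evaluate a list of triple patterns, extending each seed binding."""
    results = list(seeds)
    for pattern in patterns:
        results = [extended
                   for binding in results
                   for triple in triples
                   if (extended := match(pattern, triple, binding)) is not None]
    return results


def left_optional(required, optional, triples):
    """Run the sibling of the OPTIONAL node first, then evaluate the OPTIONAL
    part only for bindings that survived the first step."""
    first_result = evaluate(required, triples)
    final = []
    for binding in first_result:
        extended = evaluate(optional, triples, seeds=[binding])
        # OPTIONAL semantics: keep the binding even if nothing extends it.
        final.extend(extended if extended else [binding])
    return final


triples = [
    ("alice", "knows", "bob"),
    ("bob", "email", "b@x.org"),
    ("carol", "knows", "dave"),
    ("erin", "email", "e@x.org"),
]
rows = left_optional([("?a", "knows", "?b")], [("?b", "email", "?m")], triples)
```

In the example, the e-mail triple of `erin` is never joined with anything, because `erin` does not occur in the first query result; this is the reduction in queried data that the text refers to.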
  • It may be determined whether the descendant nodes of the first OPTIONAL node also include a second OPTIONAL node.
  • If they include at least one second OPTIONAL node, the query operations corresponding to the sibling node of the first OPTIONAL node and the sibling node of each second OPTIONAL node may be executed in sequence according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node.
  • The data query scope of the sibling node of each second OPTIONAL node is the query result of the sibling node of the previous OPTIONAL node.
  • In this way, each of these query operations is performed on the basis of the query result of the sibling node of the previous OPTIONAL node, which reduces the amount of data queried by each query operation and thereby improves query efficiency.
  • The query operations corresponding to the child nodes of the first OPTIONAL node and of the at least one second OPTIONAL node can then be executed in sequence according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node.
  • The data query scope of the child nodes of any OPTIONAL node is the query result corresponding to the sibling nodes of that OPTIONAL node.
  • In this way, the query operation for the child nodes of each OPTIONAL node is performed on the basis of the query results of its sibling nodes, which reduces the amount of data queried by each query operation and thereby improves query efficiency.
  • Figure 9 shows a simplification processing method provided by the present application, which includes:
  • Step 901: for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, convert the FILTER condition into a disjunctive normal form.
  • The conversion condition is that the FILTER condition corresponding to the FILTER node is composed only of variables, constants, and the operators and, or, and equal.
  • Step 902: convert the FILTER node into a UNION node based on the disjunctive normal form.
  • The FILTER condition corresponding to the FILTER node may be converted into the disjunctive normal form $f_1 \lor f_2 \lor \dots \lor f_n$.
  • Each $f_i$ in the disjunctive normal form is a constraint on a variable, which can be regarded as a correspondence between the variable and its constraint value.
  • After the disjunctive normal form $f_1 \lor f_2 \lor \dots \lor f_n$ is obtained, the variables that appear in the query nodes of the first query tree constrained by the FILTER condition are assigned their constraint values in turn with BIND clauses, and the FILTER node is then converted into a UNION node.
  • The specific processing is as follows:
  • In this way, the FILTER node can be converted into a UNION node and take part in the simplification processing of Figure 3, which simplifies the query tree to a certain extent and thereby improves the efficiency of data query.
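The conversion of steps 901 and 902 can be sketched as follows. This is an illustration under assumptions rather than the procedure of the application: the FILTER condition is represented as nested tuples, every equality is assumed to compare a variable with a constant, and each UNION branch is represented simply as a dictionary of the assignments that its BIND clauses would make. The names `to_dnf` and `filter_to_union` are hypothetical.

```python
def to_dnf(expr):
    """Convert an and/or/eq expression into disjunctive normal form.

    expr is ("eq", variable, constant), ("and", left, right) or
    ("or", left, right). The result is a list of conjunctions, each a list
    of (variable, constant) pairs."""
    op = expr[0]
    if op == "eq":
        return [[(expr[1], expr[2])]]
    left, right = to_dnf(expr[1]), to_dnf(expr[2])
    if op == "or":
        return left + right
    if op == "and":
        # Distribute AND over OR: every left conjunct pairs with every right one.
        return [l + r for l in left for r in right]
    raise ValueError(f"operator not covered by the conversion condition: {op}")


def filter_to_union(expr):
    """Return one {variable: constant} assignment per UNION branch."""
    branches = []
    for conjunction in to_dnf(expr):
        assignment = {}
        consistent = True
        for variable, constant in conjunction:
            if assignment.setdefault(variable, constant) != constant:
                consistent = False      # e.g. ?x = a AND ?x = b can never hold
                break
        if consistent and assignment not in branches:
            branches.append(assignment)
    return branches


# FILTER((?x = a || ?x = b) && ?y = c)
condition = ("and",
             ("or", ("eq", "?x", "a"), ("eq", "?x", "b")),
             ("eq", "?y", "c"))
branches = filter_to_union(condition)
```

Each dictionary in `branches` corresponds to one child of the resulting UNION node; dropping contradictory conjunctions is a design choice of this sketch, since such a branch could never return results.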
  • Figure 10 shows a simplification processing method provided by the present application, which includes:
  • Step 1001: if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes.
  • The mapping between two triple patterns t1 and t2 is a bijection from Var(t1) to Var(t2), where Var(t1) denotes the set of variables appearing in t1.
  • A common subquery C is a common triple pattern shared by multiple BGP nodes.
  • Step 1002: based on a greedy algorithm, determine, among the common triple patterns, the subset of common triple patterns that corresponds to the lowest query cost.
  • Here $\mathrm{Cost}(B, C_S)$ is the matching cost of the BGP set $B$;
  • $\mathrm{Cost}(c_i)$ is the matching cost of a common subquery $c_i$ selected into the subset $C_S$;
  • and $\mathrm{Cost}(b_j \mid C_S)$ is the matching cost of querying the remaining triple patterns of $b_j$ once the results of the common subqueries in the subset $C_S$ are available.
  • The matching cost can be, for example, the computing resources consumed or the time taken to query the corresponding triples.
  • The matching cost can be determined from the variables in each triple pattern, and the matching cost corresponding to each variable can be preset by a technician.
  • $\mathrm{Cost}(c_i) = \min\{\mathrm{sel}(t) \mid \dots\}$
  • $\mathrm{Cost}(b_j \mid C_S) = \min\{\mathrm{sel}(t) \mid \dots\}$
  • Step 1003: query the graph data for the data corresponding to the selected subset of common triple patterns.
  • Step 1004: query the graph data for the data corresponding to the triple patterns of the multiple BGP nodes other than the selected subset of common triple patterns.
  • The result set can be calculated as follows:
  • Here, each of them is a common subquery contained in the triple patterns of $b_j$, and $b'_j$ is defined as the part of $b_j$ not covered by the common subquery subset $C_S$.
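The greedy selection of step 1002 can be sketched as follows. The cost formulas are only partly legible in the text above, so the cost model here is a stand-in and an assumption: the cost of a set of triple patterns is the sum of a per-pattern estimate `sel`, a selected common subquery is paid for once, and each BGP pays only for the patterns not covered by selected subqueries it contains. The names `total_cost` and `greedy_select` are hypothetical, and triple patterns are assumed to be already normalised so that equal patterns compare equal.

```python
def total_cost(bgps, chosen, sel):
    """Stand-in for Cost(B, C_S): shared subqueries once, plus what is left
    of every BGP after removing the chosen subqueries it contains."""
    shared = sum(sel[t] for c in chosen for t in c)
    remaining = 0.0
    for bgp in bgps:
        covered = set().union(*(c for c in chosen if c <= bgp))
        remaining += sum(sel[t] for t in bgp - covered)   # b'_j
    return shared + remaining


def greedy_select(bgps, candidates, sel):
    """Repeatedly add the candidate that lowers the total cost the most;
    stop as soon as no candidate lowers it further."""
    chosen = []
    best = total_cost(bgps, chosen, sel)
    while True:
        options = [(total_cost(bgps, chosen + [c], sel), i)
                   for i, c in enumerate(candidates) if c not in chosen]
        if not options:
            break
        cost, index = min(options)
        if cost >= best:
            break
        chosen.append(candidates[index])
        best = cost
    return chosen, best


sel = {"t1": 5.0, "t2": 3.0, "t3": 2.0, "t4": 4.0}
bgps = [frozenset({"t1", "t2", "t3"}),
        frozenset({"t1", "t2", "t4"}),
        frozenset({"t3", "t4"})]
candidates = [frozenset({"t1", "t2"}), frozenset({"t3"}),
              frozenset({"t4"}), frozenset({"t1"})]
chosen, best = greedy_select(bgps, candidates, sel)
```

With these toy numbers the candidate `{"t1"}` is never selected, because its patterns are already covered by `{"t1", "t2"}` and evaluating it separately would only add cost. Any other cost function with the same signature could be substituted for `total_cost`.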
  • Fig. 11 is a schematic structural diagram of a device for querying data provided by an embodiment of the present application.
  • The device may be the computer device in the above embodiments. Referring to Fig. 11, the device includes:
  • the receiving module 1110 is configured to receive a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;
  • Establishing module 1120 configured to establish a first query tree corresponding to the data query statement based on the structure of the data query statement;
  • a processing module 1130 configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree;
  • the query module 1140 is configured to sequentially execute the query operations corresponding to the nodes in the second query tree in the graph database based on a preset execution sequence, to obtain data query results;
  • Returning module 1150 configured to return the data query result to the data query application program.
  • the types of nodes in the first query tree include a merge node and a query node, wherein the merge node is used to represent the data query statement or a subquery statement in the data query statement, the query node is used to represent a query word in the data query statement or in a subquery statement of the data query statement, and the query node includes at least one of a BGP node, a UNION node, an OPTIONAL node, and a FILTER node.
  • the query module 1140 is configured to: determine the first query tree as a third query tree to be simplified, and determine the depth of each node in the third query tree; for a first merge node with a depth of 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merge the multiple BGP nodes to obtain a merged first BGP node, delete the first merge node, and add the first BGP node at the position of the first merge node; for a second merge node with a depth of 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merge the at least one BGP node to obtain a merged second BGP node, and merge the at least one UNION node to obtain a merged third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a
  • the processing module 1130 is further configured to: determine a first OPTIONAL node in the first query tree, none of whose ancestor nodes is an OPTIONAL node; and convert the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
  • the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, determine the subquery tree corresponding to the third BGP node; execute the query operation corresponding to the sibling nodes of the first OPTIONAL node in the subquery tree to obtain a first query result; determine the first query result as the data query scope of the descendant nodes of the first OPTIONAL node; and execute, within the data query scope, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
  • the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node: according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, execute in sequence the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of the at least one second OPTIONAL node, wherein the data query scope corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the second sibling node of the previous OPTIONAL node; and according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, execute in sequence the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, wherein the data query scope corresponding to the child nodes of any OPTIONAL
  • the processing module 1130 is configured to: for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, convert the FILTER condition into a disjunctive normal form; and, based on the disjunctive normal form, convert the FILTER node into a UNION node.
  • the conversion condition is that the FILTER condition corresponding to the FILTER node is composed of variables, constants, and three operators: and, or, and equal.
  • the query module 1140 is configured to: if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes; based on a greedy algorithm, determine, among the common triple patterns, the subset of common triple patterns corresponding to the lowest query cost; query the graph data for the data corresponding to the subset of common triple patterns; and query the graph data for the data corresponding to the triple patterns of the multiple BGP nodes other than the subset of common triple patterns.
  • When the device for querying data provided by the above embodiments queries data, the division into the functional modules described above is only an example.
  • In practice, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the computer device can be divided into different functional modules to complete all or part of the functions described above.
  • The device for querying data provided by the above embodiments and the method embodiments for querying data are based on the same concept; the specific implementation process is described in the method embodiments and is not repeated here.
  • Fig. 12 shows a structural block diagram of a computer device 1200 provided by an exemplary embodiment of the present application.
  • the computer device 1200 can be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or a desktop computer.
  • the computer device 1200 may also be called user equipment, portable terminal, laptop terminal, desktop terminal, or other names.
  • a computer device 1200 includes: a processor 1201 and a memory 1202 .
  • the processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 1201 can be implemented in at least one hardware form among DSP (digital signal processing), FPGA (field-programmable gate array) and PLA (programmable logic array).
  • the processor 1201 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also called a CPU (central processing unit), and the coprocessor is a low-power processor for processing data in the standby state.
  • the processor 1201 may be integrated with a GPU (graphics processing unit), and the GPU is used for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1201 may further include an AI (artificial intelligence) processor, where the AI processor is used to process computing operations related to machine learning.
  • Memory 1202 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one instruction, and the at least one instruction is executed by the processor 1201 to implement the method for querying data provided by the method embodiments of this application.
  • the computer device 1200 may optionally further include: a peripheral device interface 1203 and at least one peripheral device.
  • the processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected through buses or signal lines.
  • Each peripheral device can be connected to the peripheral device interface 1203 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 1204 , a display screen 1205 , a camera component 1206 , an audio circuit 1207 , a positioning component 1208 and a power supply 1209 .
  • the peripheral device interface 1203 may be used to connect at least one peripheral device related to I/O (input/output) to the processor 1201 and the memory 1202.
  • in some embodiments, the processor 1201, the memory 1202 and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 1204 is used to receive and transmit RF (radio frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 1204 communicates with the communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the radio frequency circuit 1204 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • the radio frequency circuit 1204 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: the World Wide Web, metropolitan area networks, intranets, the various generations of mobile communication networks (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (wireless fidelity) networks.
  • the radio frequency circuit 1204 may also include circuits related to NFC (near field communication), which is not limited in this application.
  • the display screen 1205 is used to display a UI (user interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • the display screen 1205 also has the ability to collect touch signals on or above the surface of the display screen 1205 .
  • the touch signal can be input to the processor 1201 as a control signal for processing.
  • the display screen 1205 can also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • the display screen 1205 may be a flexible display screen arranged on a curved surface or a folded surface of the computer device 1200. The display screen 1205 can even be set as a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 1205 can be made of materials such as LCD (liquid crystal display) and OLED (organic light-emitting diode).
  • the camera assembly 1206 is used to capture images or videos.
  • the camera component 1206 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blur function.
  • camera assembly 1206 may also include a flash.
  • the flash can be a single-color temperature flash or a dual-color temperature flash. Dual color temperature flash refers to the combination of warm light flash and cold light flash, which can be used for light compensation under different color temperatures.
  • Audio circuitry 1207 may include a microphone and speakers.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 1201 for processing, or input them to the radio frequency circuit 1204 to realize voice communication.
  • the microphone can also be an array microphone or an omnidirectional collection microphone.
  • the speaker is used to convert the electrical signal from the processor 1201 or the radio frequency circuit 1204 into sound waves.
  • the loudspeaker can be a conventional membrane loudspeaker or a piezoelectric ceramic loudspeaker.
  • audio circuitry 1207 may also include a headphone jack.
  • the positioning component 1208 is used to locate the current geographic location of the computer device 1200, so as to realize navigation or LBS (location-based service).
  • the positioning component 1208 may be a positioning component based on the GPS (global positioning system) of the United States, the Beidou system of China or the Galileo system of the European Union.
  • the power supply 1209 is used to supply power to various components in the computer device 1200 .
  • the power source 1209 can be alternating current, direct current, disposable batteries, or rechargeable batteries.
  • the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery.
  • a wired rechargeable battery is a battery charged through a wired line
  • a wireless rechargeable battery is a battery charged through a wireless coil.
  • the rechargeable battery can also be used to support fast charging technology.
  • the computing device 1200 also includes one or more sensors 1210 .
  • the one or more sensors 1210 include, but are not limited to: an acceleration sensor 1211 , a gyroscope sensor 1212 , a pressure sensor 1213 , a fingerprint sensor 1214 , an optical sensor 1215 and a proximity sensor 1216 .
  • the acceleration sensor 1211 can detect the acceleration on the three coordinate axes of the coordinate system established by the computer device 1200 .
  • the acceleration sensor 1211 can be used to detect the components of the acceleration of gravity on the three coordinate axes.
  • the processor 1201 may control the display screen 1205 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211 .
  • the acceleration sensor 1211 can also be used for collecting game or user's motion data.
  • the gyro sensor 1212 can detect the body direction and rotation angle of the computer device 1200 , and the gyro sensor 1212 can cooperate with the acceleration sensor 1211 to collect 3D actions of the user on the computer device 1200 .
  • the processor 1201 can realize the following functions: motion sensing (such as changing the UI according to the tilt operation of the user), image stabilization during shooting, game control and inertial navigation.
  • the pressure sensor 1213 may be disposed on the side frame of the computer device 1200 and/or the lower layer of the display screen 1205 .
  • the pressure sensor 1213 When the pressure sensor 1213 is arranged on the side frame of the computer device 1200 , it can detect the user's grip signal on the computer device 1200 , and the processor 1201 performs left and right hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 1213 .
  • the processor 1201 controls operable controls on the UI interface according to the user's pressure operation on the display screen 1205.
  • the operable controls include at least one of button controls, scroll bar controls, icon controls, and menu controls.
  • the fingerprint sensor 1214 is used to collect the user's fingerprint, and the processor 1201 recognizes the identity of the user according to the fingerprint collected by the fingerprint sensor 1214, or, the fingerprint sensor 1214 recognizes the user's identity according to the collected fingerprint.
  • the processor 1201 authorizes the user to perform related sensitive operations, such sensitive operations include unlocking the screen, viewing encrypted information, downloading software, making payment, and changing settings.
  • Fingerprint sensor 1214 may be disposed on the front, back or sides of computing device 1200 . When the computer device 1200 is provided with a physical button or a manufacturer's logo, the fingerprint sensor 1214 may be integrated with the physical button or the manufacturer's Logo.
  • the optical sensor 1215 is used to collect ambient light intensity.
  • the processor 1201 may control the display brightness of the display screen 1205 according to the ambient light intensity collected by the optical sensor 1215 . Specifically, when the ambient light intensity is high, the display brightness of the display screen 1205 is increased; when the ambient light intensity is low, the display brightness of the display screen 1205 is decreased.
  • the processor 1201 may also dynamically adjust shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215 .
  • a proximity sensor 1216 also called a distance sensor, is usually disposed on the front panel of the computer device 1200 .
  • the proximity sensor 1216 is used to capture the distance between the user and the front of the computer device 1200 .
  • when the proximity sensor 1216 detects that the distance between the user and the front of the computer device 1200 gradually decreases, the processor 1201 controls the display screen 1205 to switch from the on-screen state to the off-screen state; when the proximity sensor 1216 detects that the distance between the user and the front of the computer device 1200 gradually increases, the processor 1201 controls the display screen 1205 to switch from the off-screen state to the on-screen state.
  • The structure shown in FIG. 12 does not constitute a limitation on the computer device 1200, which may include more or fewer components than shown in the figure, combine some components, or adopt a different arrangement of components.
  • A computer-readable storage medium is also provided, such as a memory including instructions; the instructions can be executed by a processor in the terminal to complete the method for querying data in the above embodiments.
  • The computer-readable storage medium may be non-transitory.
  • The computer-readable storage medium may be a ROM (read-only memory), a RAM (random access memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
  • A computer program product is also provided, which includes at least one instruction; the at least one instruction is loaded and executed by a processor to implement the method for querying data in the above embodiments.
  • The program can be stored in a computer-readable storage medium.
  • The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
  • The terms "first" and "second" are used to distinguish identical or similar items that have basically the same role and function. It should be understood that there is no logical or temporal dependency between "first" and "second", and that the terms place no restriction on quantity or execution order. It should also be understood that although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another. In this application, the term "at least one" means one or more, and the term "multiple" means two or more.

Abstract

The present application belongs to the technical field of graph databases. Disclosed are a data query method and apparatus, and a device and a storage medium. The method comprises: receiving a data query instruction, which is sent by a data query application program, wherein the data query instruction carries a data query statement; on the basis of the structure of the data query statement, establishing a first query tree corresponding to the data query statement; performing simplification processing on the first query tree on the basis of the type of each node in the first query tree, so as to obtain a second query tree; on the basis of a preset execution sequence, sequentially executing, in a graph database, query operations corresponding to respective nodes in the second query tree, so as to obtain a data query result; and returning the data query result to the data query application program. By means of the present application, the data query efficiency in a graph database can be improved.

Description

Method, apparatus, device and storage medium for querying data
This application claims priority to Chinese patent application No. 202111673409.6, entitled "Method, apparatus, device and storage medium for querying data" and filed on December 31, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of graph databases, and in particular to a method, apparatus, device and storage medium for querying data.
Background
RDF (Resource Description Framework) is a factual data model for knowledge graphs. Each edge in a knowledge graph is represented as an RDF triple of the form "subject, predicate, object", which represents a named relationship between a pair of entities or a named attribute value of an entity.
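As a small illustration of the triple model, the following sketch stores two edges of a knowledge graph as triples. The `Triple` type, the `ex:` identifiers and the `outgoing` helper are made up for this example and are not taken from the application.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str          # "object", spelled obj to avoid shadowing the Python builtin


# One relationship between two entities and one attribute value of an entity.
graph: List[Triple] = [
    Triple("ex:Alice", "ex:knows", "ex:Bob"),
    Triple("ex:Alice", "ex:age", "30"),
]


def outgoing(graph: List[Triple], subject: str) -> List[Triple]:
    """Return every edge whose subject is the given entity."""
    return [t for t in graph if t.subject == subject]


alice_edges = outgoing(graph, "ex:Alice")
```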
SPARQL(SPARQL Protocol and RDF Query Language,查询语言和数据获取协议)是访问RDF数据集的标准查询语言,其中UNION(联合)、OPTIONAL(可选匹配)和FILTER(过滤)表达式是SPARQL的数据查询语句中常用的查询表达式。SPARQL (SPARQL Protocol and RDF Query Language, query language and data acquisition protocol) is a standard query language for accessing RDF datasets, in which UNION (union), OPTIONAL (optional matching) and FILTER (filtering) expressions are SPARQL data queries Commonly used query expressions in the statement.
目前计算机设备在执行数据查询语句对应的查询操作时,仅是依次执行各查询表达式对应的查询处理,查询数据的效率较低。At present, when a computer device executes a query operation corresponding to a data query statement, it only sequentially executes query processing corresponding to each query expression, and the efficiency of querying data is low.
发明内容Contents of the invention
本申请实施例提供了一种查询数据的方法、装置、设备及存储介质,能够提高查询数据的效率。所述技术方案如下:Embodiments of the present application provide a data query method, device, device, and storage medium, which can improve data query efficiency. Described technical scheme is as follows:
第一方面,提供了一种查询数据的方法,所述方法包括:In a first aspect, a method for querying data is provided, the method comprising:
接收数据查询应用程序发送的数据查询指令,所述数据查询指令中携带有数据查询语句;receiving a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;
基于所述数据查询语句的结构,建立所述数据查询语句对应的第一查询树;Establishing a first query tree corresponding to the data query statement based on the structure of the data query statement;
基于所述第一查询树中各结点的类型,对所述第一查询树进行简化处理,得到第二查询树;Based on the type of each node in the first query tree, simplifying the first query tree to obtain a second query tree;
基于预设的执行顺序,在图数据库中依次执行所述第二查询树中各结点对应的查询操作,得到数据查询结果;Based on the preset execution sequence, sequentially execute the query operations corresponding to the nodes in the second query tree in the graph database to obtain data query results;
将所述数据查询结果返回至所述数据查询应用程序。returning the data query result to the data query application program.
可选的,所述第一查询树中结点的类型包括合并结点和查询结点,其中,所述合并结点用于表示所述数据查询语句或所述数据查询语句中的子查询语句,所述查询结点用于表示所述数据查询语句或所述数据查询语句中的子查询语句中的查询词,所述查询结点包括BGP结点、UNION结点、OPTIONAL结点、FILTER结点中的至少一种。Optionally, the types of nodes in the first query tree include a merge node and a query node, wherein the merge node is used to represent the data query statement or a subquery statement in the data query statement , the query node is used to represent the query word in the data query statement or a subquery statement in the data query statement, and the query node includes a BGP node, a UNION node, an OPTIONAL node, a FILTER node at least one of the points.
Optionally, simplifying the first query tree based on the type of each node in the first query tree to obtain the second query tree includes:
determining the first query tree as a third query tree to be simplified, and determining the depth of each node in the third query tree;
for a first merge node of depth 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merging the multiple BGP nodes into a first BGP node, deleting the first merge node, and placing the first BGP node at the position of the first merge node;
for a second merge node of depth 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merging the at least one BGP node into a second BGP node and merging the at least one UNION node into a third UNION node; merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node; deleting the second merge node; and placing the fourth UNION node at the position of the second merge node; and
for a fifth UNION node of depth 2 in the third query tree, adding the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and deleting the grandchild nodes of the fifth UNION node and the parent nodes of those grandchild nodes, to obtain the simplified third query tree.
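The first simplification rule (collapsing a merge node whose children are BGP nodes) can be sketched as follows. This is only an illustration under stated assumptions: nodes are plain dictionaries, "depth 1" is read as "all children are leaves", and the rule is applied only when every child is a BGP node. The dictionary keys and the function name are hypothetical.

```python
def merge_bgp_children(node):
    """Replace a merge node whose children are all BGP leaves by a single BGP node.

    Any other node is returned unchanged.
    """
    if node["kind"] != "MERGE":
        return node
    kids = node["children"]
    if len(kids) >= 2 and all(k["kind"] == "BGP" for k in kids):
        # The merged BGP holds the triple patterns of all children, in order.
        patterns = [tp for k in kids for tp in k["patterns"]]
        return {"kind": "BGP", "patterns": patterns}
    return node

merge_node = {
    "kind": "MERGE",
    "children": [
        {"kind": "BGP", "patterns": [("?x", "p", "?y")]},
        {"kind": "BGP", "patterns": [("?y", "q", "?z")]},
    ],
}
simplified = merge_bgp_children(merge_node)
```

The returned node takes the place of the merge node in its parent, which corresponds to "deleting the first merge node and placing the first BGP node at its position".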
Optionally, before determining the first query tree as the third query tree to be simplified, the method further includes:
determining a first OPTIONAL node in the first query tree, the first OPTIONAL node being an OPTIONAL node none of whose ancestor nodes is an OPTIONAL node; and
converting the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
Optionally, executing, in the graph database, the query operations corresponding to the nodes of the second query tree includes:
when executing a first query operation corresponding to the third BGP node, determining the subquery tree corresponding to the third BGP node; and
executing the query operation corresponding to the sibling node of the first OPTIONAL node in the subquery tree to obtain a first query result; determining the first query result as the data query range of the descendant nodes of the first OPTIONAL node; and executing, within the data query range, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
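The idea of using the sibling's result as the data query range for the OPTIONAL part can be sketched as follows. This is a simplified illustration, assuming single triple patterns, an in-memory list of triples, and variables written with a leading `?`; the helper names `bind` and `match` and the sample data are hypothetical.

```python
def bind(pattern, triple, candidates):
    """Bind one triple pattern to one triple, or return None if they do not match."""
    binding = {}
    for term, value in zip(pattern, triple):
        if not term.startswith("?"):
            if term != value:
                return None          # constant does not match
            continue
        if term in candidates and value not in candidates[term]:
            return None              # value lies outside the data query range
        if binding.setdefault(term, value) != value:
            return None              # same variable bound to two different values
    return binding

def match(pattern, data, candidates=None):
    """Return all bindings of `pattern` in `data`, restricted by `candidates` if given."""
    candidates = candidates or {}
    return [b for t in data if (b := bind(pattern, t, candidates)) is not None]

data = [("a", "name", "A"), ("b", "name", "B"), ("a", "owns", "c"), ("d", "owns", "e")]

# 1. Evaluate the sibling of the OPTIONAL node first.
sibling = match(("?x", "name", "?n"), data)
# 2. Its bindings for ?x become the query range for the OPTIONAL part.
scope = {"?x": {m["?x"] for m in sibling}}
# 3. Evaluate the OPTIONAL part inside that range only.
optional_part = match(("?x", "owns", "?y"), data, scope)
```

Without the range, the OPTIONAL pattern would also match `("d", "owns", "e")`, a result that can never be combined with the sibling's results.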
Optionally, executing, in the graph database, the query operations corresponding to the nodes of the second query tree includes:
when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node:
executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of each second OPTIONAL node, where the data query range of the second sibling node of each second OPTIONAL node is the query result corresponding to the sibling node of the preceding OPTIONAL node; and
executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, where the data query range of the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.
Optionally, simplifying the first query tree based on the type of each node in the first query tree to obtain the second query tree includes:
for a FILTER node in the first query tree, if the FILTER condition of the FILTER node satisfies a preset conversion condition, converting the FILTER condition into disjunctive normal form; and
converting the FILTER node into a UNION node based on the disjunctive normal form.
Optionally, the conversion condition is that the FILTER condition of the FILTER node consists only of variables, constants, and the three operators AND, OR, and equality.
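The conversion of a FILTER condition into disjunctive normal form can be sketched as follows. The sketch assumes the condition is given as nested tuples `("and", a, b)`, `("or", a, b)`, and `("eq", variable, constant)`; this encoding and the function name `to_dnf` are illustrative. Each conjunction in the result would become one branch of the UNION node; how a branch is then evaluated is not specified in this passage and is left out.

```python
def to_dnf(cond):
    """Return a list of conjunctions; each conjunction is a list of ('eq', var, const) atoms."""
    op = cond[0]
    if op == "eq":
        return [[cond]]
    if op not in ("and", "or"):
        # Anything else violates the conversion condition (only AND, OR, equality allowed).
        raise ValueError(f"operator {op!r} does not satisfy the conversion condition")
    left, right = to_dnf(cond[1]), to_dnf(cond[2])
    if op == "or":
        return left + right                      # OR: concatenate the alternatives
    return [l + r for l in left for r in right]  # AND: distribute over the alternatives

# ?x = a AND (?y = b OR ?y = c)
condition = ("and", ("eq", "?x", "a"), ("or", ("eq", "?y", "b"), ("eq", "?y", "c")))
union_branches = to_dnf(condition)
```

For this condition the result has two branches, `?x = a AND ?y = b` and `?x = a AND ?y = c`.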
Optionally, executing, in the graph database, the query operations corresponding to the nodes of the second query tree includes:
if there are multiple BGP nodes that can be executed in parallel, determining the common triple patterns of the multiple BGP nodes;
determining, based on a greedy algorithm, a subset of the common triple patterns with the lowest query cost;
querying the graph data for data matching the subset of common triple patterns; and
querying the graph data for data matching the remaining triple patterns of the multiple BGP nodes, other than the subset of common triple patterns.
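The steps above can be sketched as follows. The text does not say how query cost is estimated or what the greedy criterion is, so both are assumptions here: `estimated_cost` is a hypothetical per-pattern cost table, and the greedy rule takes the cheapest common patterns while a hypothetical `budget` allows.

```python
def common_patterns(bgps):
    """Triple patterns that occur in every BGP."""
    shared = set(bgps[0])
    for bgp in bgps[1:]:
        shared &= set(bgp)
    return shared

def pick_cheap_subset(shared, estimated_cost, budget):
    """Greedy choice: take shared patterns in order of increasing cost while the budget allows."""
    chosen, spent = [], 0
    for tp in sorted(shared, key=lambda p: estimated_cost[p]):
        if spent + estimated_cost[tp] > budget:
            break
        chosen.append(tp)
        spent += estimated_cost[tp]
    return chosen

bgp1 = [("?x", "type", "Company"), ("?x", "name", "?n"), ("?x", "owns", "?y")]
bgp2 = [("?x", "type", "Company"), ("?x", "name", "?n"), ("?x", "founded", "?d")]
estimated_cost = {("?x", "type", "Company"): 10, ("?x", "name", "?n"): 50}

shared = common_patterns([bgp1, bgp2])
chosen = pick_cheap_subset(shared, estimated_cost, budget=20)
# Patterns still to be evaluated per BGP once the chosen subset has been evaluated once.
remaining = [[tp for tp in bgp if tp not in chosen] for bgp in (bgp1, bgp2)]
```

The chosen subset is evaluated once and its result is shared by both BGP nodes; only the remaining patterns are evaluated per node.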
In a second aspect, an apparatus for querying data is provided, the apparatus including:
a receiving module, configured to receive a data query instruction sent by a data query application, the data query instruction carrying a data query statement;
a building module, configured to establish, based on the structure of the data query statement, a first query tree corresponding to the data query statement;
a processing module, configured to simplify the first query tree based on the type of each node in the first query tree, to obtain a second query tree;
a query module, configured to execute, in a graph database and in a preset execution order, the query operations corresponding to the nodes of the second query tree, to obtain a data query result; and
a returning module, configured to return the data query result to the data query application.
Optionally, the node types in the first query tree include merge nodes and query nodes. A merge node represents the data query statement or a subquery statement within it; a query node represents a query keyword in the data query statement or in one of its subquery statements. The query nodes include at least one of a BGP node, a UNION node, an OPTIONAL node, and a FILTER node.
Optionally, the query module is configured to:
determine the first query tree as a third query tree to be simplified, and determine the depth of each node in the third query tree;
for a first merge node of depth 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merge the multiple BGP nodes into a first BGP node, delete the first merge node, and place the first BGP node at the position of the first merge node;
for a second merge node of depth 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merge the at least one BGP node into a second BGP node and merge the at least one UNION node into a third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node; delete the second merge node; and place the fourth UNION node at the position of the second merge node; and
for a fifth UNION node of depth 2 in the third query tree, add the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and delete the grandchild nodes of the fifth UNION node and the parent nodes of those grandchild nodes, to obtain the simplified third query tree.
Optionally, the processing module is further configured to:
determine a first OPTIONAL node in the first query tree, the first OPTIONAL node being an OPTIONAL node none of whose ancestor nodes is an OPTIONAL node; and
convert the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
Optionally, the query module is configured to:
when executing a first query operation corresponding to the third BGP node, determine the subquery tree corresponding to the third BGP node; and
execute the query operation corresponding to the sibling node of the first OPTIONAL node in the subquery tree to obtain a first query result; determine the first query result as the data query range of the descendant nodes of the first OPTIONAL node; and execute, within the data query range, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
Optionally, the query module is configured to:
when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node:
execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of each second OPTIONAL node, where the data query range of the second sibling node of each second OPTIONAL node is the query result corresponding to the sibling node of the preceding OPTIONAL node; and
execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, where the data query range of the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.
Optionally, the processing module is configured to:
for a FILTER node in the first query tree, if the FILTER condition of the FILTER node satisfies a preset conversion condition, convert the FILTER condition into disjunctive normal form; and
convert the FILTER node into a UNION node based on the disjunctive normal form.
Optionally, the conversion condition is that the FILTER condition of the FILTER node consists only of variables, constants, and the three operators AND, OR, and equality.
Optionally, the query module is configured to:
if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns of the multiple BGP nodes;
determine, based on a greedy algorithm, a subset of the common triple patterns with the lowest query cost;
query the graph data for data matching the subset of common triple patterns; and
query the graph data for data matching the remaining triple patterns of the multiple BGP nodes, other than the subset of common triple patterns.
In a third aspect, a computer device is provided. The computer device includes a processor and a memory; the memory stores at least one instruction, which is loaded and executed by the processor to implement the operations performed in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, which is loaded and executed by a processor to implement the operations performed in the first aspect.
In a fifth aspect, a computer program product is provided. The computer program product includes at least one instruction, which is loaded and executed by a processor to implement the operations performed in the first aspect.
The technical solutions provided by the embodiments of the present application bring the following beneficial effects:
In the embodiments of the present application, the data query statement is converted into a first query tree according to its structure, and the first query tree is then simplified, using the query logic of its query nodes, into a second query tree. Simplifying the query tree simplifies the query operations of the data query statement and thereby improves the efficiency of data querying.
Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for querying data provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a method for querying data provided by an embodiment of the present application;
Fig. 3 is a flowchart of a method for querying data provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a method for querying data provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of a method for querying data provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of a method for querying data provided by an embodiment of the present application;
Fig. 7 is a flowchart of a method for querying data provided by an embodiment of the present application;
Fig. 8 is a schematic diagram of a method for querying data provided by an embodiment of the present application;
Fig. 9 is a flowchart of a method for querying data provided by an embodiment of the present application;
Fig. 10 is a flowchart of a method for querying data provided by an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an apparatus for querying data provided by an embodiment of the present application;
Fig. 12 is a schematic diagram of a computer device for querying data provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the drawings.
The method for querying data provided by the embodiments of the present application may be implemented by a computer device. The computer device can run an application for querying data (for example, a graph data query application). The computer device includes at least a processor and a memory. The memory may store data involved in executing the method, such as a graph database and the program code corresponding to the method. The processor may execute the program code stored in the memory and, in response to a data query request from the application, carry out the method for querying data provided by the embodiments of the present application.
The computer device may be a terminal or a server. When it is a terminal, it may be a mobile phone, tablet computer, smart wearable device, desktop computer, notebook computer, or the like. When it is a server, the server may communicate with a terminal, and may be a single server or a server group. A single server may be responsible for all processing in the solution below; in a server group, different servers may be responsible for different parts of the processing, and the specific allocation may be set by a technician according to actual needs, which is not described further here.
Concepts related to the embodiments of the present application are introduced below:
RDF (Resource Description Framework) is a de facto data model for knowledge graphs. Each edge in a knowledge graph is represented as an RDF triple of the form (subject, predicate, object), which expresses a named relationship between a pair of entities or a named attribute value of one entity.
SPARQL (SPARQL Protocol and RDF Query Language) is the standard query language for accessing RDF datasets.
An RDF dataset consists of multiple RDF triples and can store arbitrary graph data. Each edge in the graph data corresponds to a unique RDF triple.
RDF triple: let the pairwise disjoint infinite sets I, B, and L denote Internationalized Resource Identifiers (IRIs), blank nodes, and literals, respectively. An RDF triple has the form t = <subject, predicate, object> ∈ (I ∪ B) × I × (I ∪ B ∪ L).
RDF dataset: an RDF dataset is a set of RDF triples.
Triple pattern: let V denote an infinite set of variables that is disjoint from the sets I, B, and L above. A triple pattern has the form t = <subject, predicate, object> ∈ (V ∪ I) × (V ∪ I) × (V ∪ I ∪ L). Because the subject, predicate, or object of a triple pattern may be a variable, one triple pattern can match multiple RDF triples in an RDF dataset. A triple pattern can therefore be used to query RDF triples in an RDF dataset and, because an RDF dataset can represent graph data, to look up data in graph data as well.
BGP (basic graph pattern): a BGP includes at least one triple pattern. A single triple pattern t is a BGP; if P1 and P2 are both BGPs, then P1 AND P2 is also a BGP.
A BGP is the basic unit for looking up data in graph data. An existing BGP matching algorithm can find the data in an RDF dataset that matches every triple pattern in the BGP.
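To make BGP matching concrete, the following is a minimal sketch of a naive matcher. It is a stand-in for illustration only, not the matching algorithm of any particular graph database: it assumes triples are Python tuples, variables are strings starting with `?`, and the dataset is a small in-memory list. The names `extend` and `match_bgp` and the sample data are hypothetical.

```python
def extend(binding, pattern, triple):
    """Try to extend `binding` so that `pattern` matches `triple`; return None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None      # variable already bound to a different value
        elif term != value:
            return None          # constant does not match
    return new

def match_bgp(bgp, dataset):
    """Start from the empty binding and extend it one triple pattern at a time."""
    bindings = [{}]
    for pattern in bgp:
        bindings = [b2 for b in bindings for t in dataset
                    if (b2 := extend(b, pattern, t)) is not None]
    return bindings

dataset = [
    ("TX", "name", "Beijing TX"),
    ("TX", "owns", "SubCo"),
    ("SubCo", "name", "Sub Co"),
]
# Which company owns something, and what is the name of the owned company?
bgp = [("?c", "owns", "?s"), ("?s", "name", "?n")]
result = match_bgp(bgp, dataset)
```

Each returned dictionary binds every variable of the BGP so that all of its triple patterns are present in the dataset.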
Graph pattern:
(1) if P is a BGP, then P is a graph pattern;
(2) if P1 and P2 are both graph patterns, then P1 AND P2 is also a graph pattern;
(3) if P1 and P2 are both graph patterns, then {P1} UNION {P2} and P1 OPTIONAL {P2} are also graph patterns, where {Pi} denotes a group graph pattern;
(4) if P is a graph pattern and C is a built-in condition, then P FILTER C is a graph pattern. A built-in condition is formed from elements of I ∪ L ∪ V and constants, and may contain logical operators (∧, ∨), comparison operators (<, ≤, >, ≥, =), and unary functions such as isBlank (tests whether a term is a blank node) and isIRI (tests whether a term is an IRI).
UNION, OPTIONAL, and FILTER are query expressions commonly used in SPARQL data query statements:
UNION combines the matches of multiple graph patterns. For example, P1 UNION P2 finds the triples in the RDF dataset that satisfy triple pattern P1 and those that satisfy triple pattern P2, and takes the union of the two results.
OPTIONAL matches a graph pattern optionally. For example, P1 OPTIONAL {P2} keeps the results in the RDF dataset that satisfy graph pattern P1 and extends them with compatible results that satisfy graph pattern P2.
FILTER applies a condition to the results. For example, P1 FILTER C keeps, from the results of triple pattern P1, only the data that satisfies condition C.
Group graph pattern: a group graph pattern is defined recursively as follows:
(1) if P is a graph pattern, then {P} is a group graph pattern;
(2) if P is a group graph pattern, then P is also a graph pattern.
UNION graph pattern:
(1) if P1 is a group graph pattern or a UNION graph pattern, and P2 is a group graph pattern, then P1 UNION P2 is a UNION graph pattern;
(2) if P1 UNION P2 is a UNION graph pattern, then it is also a graph pattern.
Well-defined SPARQL query: a graph pattern P is called well-defined if and only if it satisfies the following conditions:
(1) for every subpattern of P of the form P' FILTER C, all variables that appear in the built-in condition C also appear in the graph pattern P';
(2) for every subpattern of P of the form P' = P1 OPTIONAL {P2}, all variables that appear both in the graph pattern P2 and outside P' also appear in the graph pattern P1.
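Condition (1) can be checked mechanically. The sketch below reads it as requiring that every variable used by the FILTER condition also occurs in the filtered pattern P'; it assumes P' is given as a list of triple patterns with variables written with a leading `?`, and the function names are illustrative.

```python
def pattern_vars(triple_patterns):
    """All variables (terms starting with '?') that occur in the given triple patterns."""
    return {term for tp in triple_patterns for term in tp if term.startswith("?")}

def filter_is_well_defined(triple_patterns, condition_vars):
    """Condition (1): every variable of the FILTER condition occurs in the filtered pattern."""
    return set(condition_vars) <= pattern_vars(triple_patterns)

p_prime = [("?x", "name", "?n"), ("?x", "owns", "?y")]
ok = filter_is_well_defined(p_prime, ["?n"])       # FILTER(?n = ...) uses a bound variable
bad = filter_is_well_defined(p_prime, ["?z"])      # ?z never occurs in the pattern
```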
The principle of the present invention is based on the semantics of SELECT queries in SPARQL. A SELECT query has the form "SELECT v1 v2 … vk WHERE { … }", where the SELECT clause is the query head and the WHERE clause is the query body. The SELECT clause specifies the projection variables, that is, the variables that must appear in the query result. The WHERE clause gives the group graph pattern to be matched against the RDF dataset; in other words, the WHERE clause gives the data query statement.
Matching a graph pattern P against an RDF dataset D produces a collection of mappings [[P]]_D = {μ1, μ2, …, μn}. Duplicate elements are allowed, so the collection is a bag (multiset) rather than a set. Each mapping μ is a function from a set of variables to result values, and the set of variables that appear in μ is denoted dom(μ).
Two mappings μ1 and μ2 are compatible, written μ1 ~ μ2, if and only if μ1(v) = μ2(v) for every variable v ∈ dom(μ1) ∩ dom(μ2); in that case μ1 ∪ μ2 is also a mapping. If μ1 and μ2 are not compatible, this is written as shown in formula image PCTCN2022135606-appb-000001.
For two bags of mappings Ω1 and Ω2, the following operations are available:
(1) the operation shown in formula image PCTCN2022135606-appb-000002;
(2) Ω1 ∪_bag Ω2 = {μ1 | μ1 ∈ Ω1} ∪_bag {μ2 | μ2 ∈ Ω2};
(3) the operation shown in formula image PCTCN2022135606-appb-000003.
The mappings produced by matching a graph pattern P against an RDF dataset D, denoted [[P]]_D, are defined recursively as follows:
(1) if P is a triple pattern t, then [[P]]_D = {μ | var(t) = dom(μ) ∧ μ(t) ∈ D}, where var(t) denotes the set of variables that appear in t, and μ(t) denotes the RDF triple obtained by replacing every variable in t according to μ;
(2) if P = (P1 AND P2), then [[P]]_D is given by formula image PCTCN2022135606-appb-000004;
(3) if P = (P1 UNION P2), then [[P]]_D = [[P1]]_D ∪_bag [[P2]]_D;
(4) if P = (P1 OPTIONAL P2), then [[P]]_D is given by formula images PCTCN2022135606-appb-000005 and PCTCN2022135606-appb-000006;
(5) if P = (P1 FILTER C), then [[P]]_D = {μ | μ ∈ [[P1]]_D ∧ μ(C)}, that is, the mappings for which μ(C) is true, where μ(C) denotes C with every variable replaced according to μ.
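The operations on bags of mappings can be sketched in Python as follows. Several of the formulas above exist only as images that are not reproduced in this text, so the sketch assumes the usual SPARQL algebra definitions rather than the exact formulas of the application: join keeps the unions of compatible pairs, difference keeps the mappings of the first bag that are compatible with no mapping of the second, AND is evaluated as a join, and OPTIONAL as the bag union of the join and the difference. Mappings are modelled as dictionaries and bags as lists; all function names are illustrative.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def join(o1, o2):
    """Unions of all compatible pairs (assumed meaning of operation (1) and of AND)."""
    return [{**m1, **m2} for m1 in o1 for m2 in o2 if compatible(m1, m2)]

def bag_union(o1, o2):
    """Bag union keeps duplicates (operation (2) and UNION)."""
    return o1 + o2

def minus(o1, o2):
    """Mappings of o1 that are compatible with no mapping of o2 (assumed meaning of operation (3))."""
    return [m1 for m1 in o1 if not any(compatible(m1, m2) for m2 in o2)]

def optional(o1, o2):
    """Assumed meaning of OPTIONAL: joined results plus the unmatched left-hand mappings."""
    return bag_union(join(o1, o2), minus(o1, o2))

omega1 = [{"?x": "a"}, {"?x": "b"}]
omega2 = [{"?x": "a", "?y": "c"}]
result = optional(omega1, omega2)
```

Here the mapping for `a` is extended with `?y`, while the mapping for `b` has no compatible partner and is kept unchanged.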
The purpose of the present invention is to provide an optimization algorithm for generating execution plans for SPARQL queries in a graph database that contain UNION, OPTIONAL, and FILTER expressions, so as to address the low efficiency of such queries in existing graph database systems.
A method for querying data provided by the present application is described in detail below with reference to embodiments:
Fig. 1 is a flowchart of a method for querying data provided by an embodiment of the present application. The method may be implemented by the computer device described above. Referring to Fig. 1, this embodiment includes:
Step 101: receive a data query instruction sent by a data query application.
The data query application can be used to query graph data stored in a database, and the graph data may be stored as an RDF dataset. The graph data may describe, for example, equity relationships between companies: nodes in the graph data may include a company's name, size, and founding date, and edges may represent relationships between companies, such as shareholdings and share amounts. For example, an RDF triple in the RDF dataset may be <http://example.com/TX> <http://example.com/名称> "北京市TX计算机系统有限公司" (the predicate 名称 means "name"), which states that the name of company TX is Beijing TX Computer System Co., Ltd.
The data query application may run on the computer device. A user can enter a data query statement in the data query application according to business needs; entering the statement triggers a data query instruction. After receiving the data query instruction, the processor of the computer device performs the following processing according to the user-entered data query statement carried in the instruction.
Step 102: establish, based on the structure of the data query statement, a first query tree corresponding to the data query statement.
After the data query statement is received, a query tree (also called a BE tree) corresponding to the data query statement can be built according to the structure of the data query statement and its query keywords.
The node types in a query tree may include merge nodes and query nodes. A merge node may represent the data query statement or a subquery statement within it; both the data query statement and its subquery statements are graph patterns. A merge node that represents the data query statement is the root node of the first query tree. A query node represents a query keyword that appears in the data query statement or a subquery statement, such as BGP, UNION, OPTIONAL, or FILTER. A query node representing a BGP is called a BGP node, one representing UNION is called a UNION node, one representing OPTIONAL is called an OPTIONAL node, and one representing FILTER is called a FILTER node.
In this application, the query tree built directly from the data query statement is called the first query tree. Building the first query tree amounts to representing the data query statement and each subquery statement by a merge node, representing each query keyword by a query node, and then building a tree whose structure mirrors the nesting of subquery statements and query keywords in the data query statement.
For example, suppose the data query statement has the structure ((b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5)) FILTER c1, where b1 to b5 and c1 are BGPs and can be represented as BGP nodes. (b2 UNION b3), (b1 AND (b2 UNION b3)), (b4 UNION b5), and (b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5) are distinct subquery statements and can be represented by merge nodes. The first query tree corresponding to this data query statement may be as shown in Fig. 2.
Step 103. Based on the type of each node in the first query tree, simplify the first query tree to obtain a second query tree.
After the first query tree corresponding to the data query statement is obtained, it may be simplified, and the query operations corresponding to the statement may then be executed on the simplified tree. This simplifies the query operations and improves the efficiency of data querying.
It should be understood that simplifying the first query tree means merging, transforming, and deleting nodes according to the query logic that the nodes represent, and that it does not change the query result of the data query statement. The simplification differs with the types of the nodes in the first query tree and is described in detail later. The query tree obtained by simplifying the first query tree is called the second query tree.
Step 104. Following a preset execution order, execute the query operations corresponding to the nodes of the second query tree in the graph database in sequence to obtain a data query result.
After the second query tree is obtained, the query operations of its nodes may be executed in order of node depth. The query result obtained last, for the node with the largest depth (that is, the root node), is the query result of the data query statement in the data query instruction.
Note that among query nodes of the same depth, the BGP nodes may be executed first, then the corresponding UNION or OPTIONAL nodes, and finally the FILTER nodes.
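The ordering rule above can be read as a two-part sort key. The following is only an illustrative sketch: the tuple layout `(name, node_type, depth)`, the `TYPE_RANK` table, and the sample depths are assumptions made for the example, and merge nodes are left out because the text only orders query nodes.

```python
# Hypothetical ranks: BGP first, then UNION/OPTIONAL, then FILTER.
TYPE_RANK = {"BGP": 0, "UNION": 1, "OPTIONAL": 1, "FILTER": 2}

def execution_order(nodes):
    """Order (name, node_type, depth) tuples by increasing depth, then by type rank."""
    return sorted(nodes, key=lambda n: (n[2], TYPE_RANK[n[1]]))

# Sample query nodes with assumed depths (leaves have depth 0).
nodes = [
    ("f1", "FILTER", 0),
    ("u1", "UNION", 2),
    ("b1", "BGP", 0),
    ("o1", "OPTIONAL", 2),
    ("b2", "BGP", 0),
]
order = [name for name, _, _ in execution_order(nodes)]
print(order)
```

Because `sorted` is stable, nodes with the same depth and rank keep their input order.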
Step 105. Return the data query result to the data query application.
After the data query result is obtained, it may be sent to the data query application, which may display it in a query result display interface for the user to view.
For example, suppose the data query statement includes the following triple pattern: ?x <http://example.com/持股> "北京市TX计算机系统有限公司", where the predicate 持股 means "holds shares in". After the above processing, the query result display interface may show all individuals, companies, and other entities that hold shares in 北京市TX计算机系统有限公司 (Beijing TX Computer System Co., Ltd.).
In this embodiment of the application, the query expressed by the data query statement is converted into a first query tree according to the structure of the statement, and the first query tree is then simplified into a second query tree using the query logic of its query nodes. Simplifying the query tree in this way simplifies the query operations of the data query statement and thereby improves the efficiency of data querying.
The simplification of the first query tree in step 103 is described in detail below. The processing differs with the types of the nodes in the first query tree, as follows:
Figure 3 shows a simplification method provided by this application. The method includes:
Step 301. Take the first query tree as the third query tree to be simplified, and determine the depth of each node in the third query tree.
A leaf node $a$ of the third query tree has depth $d(a) = 0$. Any other node $b$ has depth $d(b) = \max\{d(b_i) \mid b_i \text{ is a child node of } b\} + 1$.
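The depth definition translates directly into a short recursive function. This is a minimal sketch, assuming a hypothetical `Node` class with a `kind` label and a `children` list; the sample tree is invented for illustration and is not the tree of Figure 2.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                   # e.g. "MERGE", "BGP", "UNION"
    children: list = field(default_factory=list)

def depth(node: Node) -> int:
    """d(leaf) = 0; otherwise 1 + the maximum depth among the children."""
    if not node.children:
        return 0
    return max(depth(child) for child in node.children) + 1

# Illustrative tree: a merge node over one BGP and a two-branch UNION.
tree = Node("MERGE", [
    Node("BGP"),
    Node("UNION", [
        Node("MERGE", [Node("BGP")]),
        Node("MERGE", [Node("BGP")]),
    ]),
])
print(depth(tree))
```

Under this definition the root always has the largest depth, which matches the statement that the root is executed last.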
Step 302. For a first merge node of depth 1 in the third query tree, if its child nodes include multiple BGP nodes, merge those BGP nodes into a first BGP node, delete the first merge node, and add the first BGP node at the position of the first merge node.
If the third query tree contains a first merge node of depth 1 whose child nodes are multiple BGP nodes, those BGP nodes may be merged into a single BGP node, which is the first BGP node. Let the BGP nodes before merging be $P_1, \dots, P_n$, with mappings $[[P_1]]_D, \dots, [[P_n]]_D$, where $n$ is the number of BGP nodes before merging. The mapping of the merged BGP node $P_m$ is then
$$[[P_m]]_D = [[P_1]]_D \bowtie [[P_2]]_D \bowtie \cdots \bowtie [[P_n]]_D$$
After the first BGP node is obtained, the original first merge node of depth 1 may be deleted and the first BGP node added at its position; that is, the first BGP node replaces the original first merge node of depth 1. Figure 4 is a schematic diagram of step 302. Replacing the first merge node of depth 1 with the corresponding first BGP node reduces node depth and the volume of intermediate results, which can improve query efficiency.
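A minimal sketch of the step 302 rewrite follows. It assumes a hypothetical `Node` class in which a BGP node carries a list of triple patterns, and it assumes that merging BGP nodes can be modelled by concatenating their triple patterns (evaluating the combined pattern list then corresponds to joining the individual mappings). FILTER children are not handled here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                      # "MERGE", "BGP", "UNION", ...
    triples: list = field(default_factory=list)    # triple patterns (BGP nodes only)
    children: list = field(default_factory=list)

def merge_bgp_children(parent: Node) -> None:
    """Replace each merge child that holds two or more BGP nodes (and nothing
    else) with a single BGP node containing all of their triple patterns."""
    for i, child in enumerate(parent.children):
        only_bgps = all(c.kind == "BGP" for c in child.children)
        if child.kind == "MERGE" and len(child.children) >= 2 and only_bgps:
            merged = [t for bgp in child.children for t in bgp.triples]
            parent.children[i] = Node("BGP", triples=merged)

# Illustrative tree: the first merge node qualifies, the second does not.
root = Node("UNION", children=[
    Node("MERGE", children=[
        Node("BGP", [("?x", "p", "?y")]),
        Node("BGP", [("?y", "q", "?z")]),
    ]),
    Node("MERGE", children=[Node("BGP", [("?x", "r", "?z")])]),
])
merge_bgp_children(root)
print(root.children[0])
```

A merge node whose children are all leaves necessarily has depth 1, so no explicit depth check is needed in this sketch.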
UNION nodes of depth 1 are left unprocessed. If the child nodes of the first merge node include a FILTER node in addition to the multiple BGP nodes, then after the first BGP node is obtained by merging, the FILTER node may replace the first merge node together with the first BGP node, as a sibling of the first BGP node. The scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the first BGP node. Before each FILTER node is executed, the nodes within its scope may be determined from the correspondence recorded for it, and the query operation of the FILTER node is then executed on the query results of the nodes within that scope.
Step 303. For a second merge node of depth 2 in the third query tree, if its child nodes include at least one BGP node and at least one UNION node, merge the at least one BGP node into a second BGP node and merge the at least one UNION node into a third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node; delete the second merge node; and add the fourth UNION node at the position of the second merge node.
If the third query tree contains a second merge node of depth 2 whose child nodes include multiple BGP nodes, those BGP nodes may be merged into a second BGP node. This processing is the same as that for obtaining the first BGP node in step 302 and is not repeated here.
If the child nodes of the second merge node also include multiple UNION nodes, those UNION nodes may be merged into a third UNION node. For example, suppose two UNION nodes $u_1$ and $u_2$ are to be merged, each with two child nodes (all BGP nodes), whose mappings are $p_1, p_2$ for $u_1$ and $q_1, q_2$ for $u_2$. The merged third UNION node then has four child nodes, whose mappings are $p_1 \bowtie q_1$, $p_1 \bowtie q_2$, $p_2 \bowtie q_1$, and $p_2 \bowtie q_2$.
If the child nodes of the second merge node include $m$ UNION nodes, where the $i$-th UNION node $u_i$ has $n_i$ child nodes (BGP nodes), then the third UNION node finally obtained has $N = \prod_{i=1}^{m} n_i$ child nodes in total. Specifically, $N$ different mapping sets may be determined from the BGP nodes of the $m$ UNION nodes. Each mapping set contains the mappings of $m$ BGP nodes, and the BGP nodes to which the mappings in one set belong are child nodes of different UNION nodes. The mapping of each child node of the third UNION node is obtained by taking the natural join of the mappings in one mapping set.
After the BGP nodes and the UNION nodes have each been merged, the second BGP node may be merged into the child nodes of the third UNION node $u_3$. For example, if the mapping of the second BGP node is $p_x$ and the third UNION node has four child nodes with mappings $p_1 \bowtie q_1$, $p_1 \bowtie q_2$, $p_2 \bowtie q_1$, and $p_2 \bowtie q_2$, then merging the second BGP node into the child nodes of the third UNION node yields a fourth UNION node $u_4$ that still has four child nodes, with mappings $p_x \bowtie p_1 \bowtie q_1$, $p_x \bowtie p_1 \bowtie q_2$, $p_x \bowtie p_2 \bowtie q_1$, and $p_x \bowtie p_2 \bowtie q_2$.
After the fourth UNION node is obtained, the second merge node of depth 2 may be deleted and the fourth UNION node added at its position; that is, the fourth UNION node replaces the original second merge node of depth 2. Figure 5 is a schematic diagram of step 303. Replacing the second merge node of depth 2 with the corresponding fourth UNION node reduces node depth and the volume of intermediate results, which can improve query efficiency.
Note the following special cases, each of which also reduces node depth. If the child nodes of a merge node of depth 2 include only a UNION node, that UNION node may directly replace the merge node. If they include only one BGP node and one UNION node, the BGP node may be merged directly into the child nodes of the UNION node. If they include one BGP node and multiple UNION nodes, the UNION nodes may be merged first and the BGP node then merged into the child nodes of the merged UNION node. If they include multiple BGP nodes and one UNION node, the BGP nodes may be merged first and the merged BGP node then merged into the child nodes of the UNION node.
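The merging in step 303 amounts to a Cartesian product over the branches of the UNION nodes, with the merged BGP prepended to every combination. The sketch below assumes a deliberately simple representation that is not taken from the application: a BGP is a list of triple-pattern labels, a UNION node is a list of such BGPs, and joining BGPs is modelled by concatenating their pattern lists.

```python
from itertools import product

def merge_unions_and_bgps(bgps, unions):
    """Return the children of the merged (fourth) UNION node.

    bgps:   list of BGPs, each a list of triple patterns
    unions: list of UNION nodes, each a list of child BGPs
    """
    # Second BGP node: all sibling BGPs merged into one.
    prefix = [t for bgp in bgps for t in bgp]
    children = []
    # One combination per choice of a single branch from every UNION node.
    for combo in product(*unions):
        children.append(prefix + [t for bgp in combo for t in bgp])
    return children

# The example from the text: px merged with (p1 UNION p2) and (q1 UNION q2).
u4 = merge_unions_and_bgps(
    bgps=[["px"]],
    unions=[[["p1"], ["p2"]], [["q1"], ["q2"]]],
)
print(u4)
```

The number of resulting children is the product of the branch counts, so repeated merging of wide UNION nodes can grow the tree quickly; the sketch does not attempt to bound that growth.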
In addition, if the child nodes of the second merge node also include a FILTER node, then after the fourth UNION node is obtained by merging, the FILTER node may replace the second merge node together with the fourth UNION node, as a sibling of the fourth UNION node. The scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the fourth UNION node. Before each FILTER node is executed, the nodes within its scope may be determined from the correspondence recorded for it, and the query operation of the FILTER node is then executed on the query results of the nodes within that scope.
Step 304. For a fifth UNION node of depth 2 in the third query tree, add the grandchild nodes of the fifth UNION node to its child nodes, and delete those grandchild nodes from their original positions together with their original parent nodes, to obtain the simplified third query tree.
If the third query tree contains a fifth UNION node of depth 2 whose child nodes likewise include UNION nodes, the child nodes of those nested UNION nodes may be merged into the child nodes of the fifth UNION node, and the grandchild nodes are deleted from their original positions together with their original parent nodes, yielding the simplified query tree.
Figure 6 is a schematic diagram of step 304. Promoting the grandchild nodes of the fifth UNION node of depth 2 in this way reduces node depth and the volume of intermediate results, which can improve query efficiency.
Note that steps 302 to 304 are described separately only to distinguish the different kinds of processing; they have no fixed execution order. After the processing of steps 302 to 304, a simplified query tree is obtained.
Step 305. Determine the depth of each node in the simplified query tree.
Step 306. If the simplified query tree still contains nodes to which the simplification of steps 302 to 304 applies, take the simplified query tree as the third query tree to be simplified and jump to the corresponding step to continue simplifying, until the simplified query tree contains no node that can be simplified.
Because the processing of steps 302 to 304 changes the depths of the nodes in the query tree, the depth of each node is determined again once the simplified query tree is obtained. If the simplified query tree still contains query nodes to which steps 302 to 304 apply, simplification continues on those nodes, further reducing node depths, until it is determined that the simplified query tree contains no node that can be simplified.
In this way, the repeated processing of the above steps significantly reduces the depth of the query tree, which reduces the volume of intermediate query results and improves query efficiency.
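The loop of steps 301 to 306 is a fixed-point iteration: rewrite rules are applied until a full pass changes nothing. The sketch below illustrates this with a single rule in the spirit of step 304 (promoting the children of nested UNION nodes). The `Node` class, the rule signature (a function that returns `True` when it changed the node), and the omission of the depth-2 condition are simplifying assumptions for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)

def flatten_nested_union(node: Node) -> bool:
    """Promote the children of UNION children into `node`; True if changed."""
    if node.kind != "UNION" or not any(c.kind == "UNION" for c in node.children):
        return False
    promoted = []
    for child in node.children:
        promoted.extend(child.children if child.kind == "UNION" else [child])
    node.children = promoted
    return True

def apply_once(node: Node, rules) -> bool:
    """Apply every rule to `node`, then to its (possibly new) children."""
    changed = any([rule(node) for rule in rules])
    for child in node.children:
        changed = apply_once(child, rules) or changed
    return changed

def simplify(root: Node, rules) -> int:
    """Repeat full passes until nothing changes; return the number of passes
    that modified the tree."""
    passes = 0
    while apply_once(root, rules):
        passes += 1
    return passes

# Three levels of nested UNION nodes over four BGP leaves.
tree = Node("UNION", [
    Node("UNION", [Node("UNION", [Node("BGP"), Node("BGP")]), Node("BGP")]),
    Node("BGP"),
])
passes = simplify(tree, [flatten_nested_union])
print(passes, [child.kind for child in tree.children])
```

Further rules, such as the BGP merging of step 302, could be added to the `rules` list with the same signature; the loop terminates as long as every rule strictly shrinks the tree.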
Figure 7 shows a simplification method provided by this application. The method includes:
Step 701. Determine a first OPTIONAL node in the first query tree, that is, an OPTIONAL node none of whose ancestor nodes is an OPTIONAL node.
Step 702. Convert the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
The processing of Figure 7 may be combined with that of Figure 3; that is, step 701 may be executed before step 301. If step 701 determines that the first query tree contains a first OPTIONAL node, the subquery tree rooted at the parent node of the first OPTIONAL node may be treated as a single BGP node: the subquery tree is deleted from the first query tree, and the third BGP node is added at the original position of the parent node of the first OPTIONAL node. As shown in Figure 8, the subquery tree rooted at merge node 2 may be converted into the node BGP3.
After step 702 is completed, the processing of Figure 3 may be performed, and query operations may then be executed on the nodes of the resulting query tree. When execution reaches the query operation of the third BGP node, either of the following two processing modes may be used:
Processing mode 1:
When the first query operation, corresponding to the third BGP node, is executed, the subquery tree corresponding to the third BGP node is determined first. This is the subquery tree of step 702, rooted at the parent node of the first OPTIONAL node.
When the query operations of the nodes of this subquery tree are executed, the query operations of the sibling nodes of the first OPTIONAL node may be executed first. The first query result so obtained, that is, the subgraph found in the graph data, serves as the data query scope for the descendant nodes of the first OPTIONAL node, whose query operations are then executed in turn. This reduces the amount of data queried when executing the descendant nodes of the first OPTIONAL node and improves execution efficiency.
Processing mode 2:
After the subquery tree corresponding to the third BGP node is determined, it may be determined whether the descendant nodes of the first OPTIONAL node include a second OPTIONAL node.
If the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node, the query operations of the first sibling nodes of the first OPTIONAL node and of the second sibling nodes of the at least one second OPTIONAL node may be executed in sequence, in order of the depths of the first OPTIONAL node and the second OPTIONAL nodes.
The data query scope for the sibling nodes of each second OPTIONAL node is the query result of the sibling nodes of the preceding OPTIONAL node.
Thus the sibling nodes of each OPTIONAL node are queried on the basis of the query result of the preceding OPTIONAL node's sibling nodes, which reduces the amount of data involved in each query operation and improves query efficiency.
After the query results of the sibling nodes of every OPTIONAL node are obtained, the query operations of the child nodes of the first OPTIONAL node and of the at least one second OPTIONAL node may be executed in sequence, again in order of the depths of these OPTIONAL nodes.
The data query scope for the child nodes of any OPTIONAL node is the query result of that OPTIONAL node's sibling nodes. Thus the child nodes of each OPTIONAL node are queried on the basis of the query results of its sibling nodes, which reduces the amount of data involved in each query operation and improves query efficiency.
Figure 9 shows a simplification method provided by this application. The method includes:
Step 901. For a FILTER node in the first query tree, if the FILTER condition of the node satisfies a preset conversion condition, convert the FILTER condition into disjunctive normal form.
Before step 301 is executed, if the first query tree is found to contain a FILTER node, it may first be determined whether the FILTER condition of that node satisfies the preset conversion condition. The conversion condition is that the FILTER condition consists only of variables, constants, and the three operators AND, OR, and equality.
Step 902. Based on the disjunctive normal form, convert the FILTER node into a UNION node.
If a FILTER condition satisfies the preset conversion condition, it may be converted into the disjunctive normal form $f_1 \,||\, f_2 \,||\, \dots \,||\, f_m$, where each $f_i$ is a constraint on one variable, which can be regarded as a correspondence between the variable and its constraint value. Once the disjunctive normal form is obtained, if every pair $f_i$, $f_j$ with $i \neq j$ is incompatible, that is, no mapping satisfies $f_i \wedge f_j$ for any such pair, the variables of the query nodes in the first query tree that are constrained by the FILTER condition are assigned the constraint values in turn using BIND clauses, thereby converting the FILTER node into a UNION node. The specific processing is as follows:
A UNION node may be added as a sibling of the FILTER node, and $m$ child nodes, all of them merge nodes, are added to this UNION node ($m$ being the number of constraints in the disjunctive normal form). For each constraint $f_i$, the variables in the query nodes constrained by the FILTER condition (that is, the sibling nodes of the FILTER node other than the newly added UNION node) are assigned the constraint value in $f_i$ using a BIND clause, and the resulting nodes are attached as child nodes under the $i$-th child node of the newly added UNION node. Finally, all nodes at this level other than the newly added UNION node are deleted. After step 902 is completed, processing may proceed to step 301.
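The conversion of step 902 can be sketched as follows. The sketch assumes that each constraint $f_i$ is a single equality between a variable and a constant, given as a `(variable, value)` pair, and that sibling query nodes are opaque values; the dictionary layout of the resulting nodes is invented for illustration. Under that assumption two constraints are incompatible exactly when they bind the same variable to different constants.

```python
def pairwise_incompatible(constraints):
    """True if no mapping can satisfy two of the equality constraints at once:
    all constraints bind the same variable, each to a distinct constant."""
    variables = {var for var, _ in constraints}
    values = [value for _, value in constraints]
    return len(variables) == 1 and len(set(values)) == len(values)

def filter_to_union(siblings, constraints):
    """Build a UNION node with one merge-node branch per constraint f_i.

    Each branch holds a BIND of the constrained variable followed by copies
    of the FILTER node's sibling query nodes. Returns None when the
    constraints are not pairwise incompatible, leaving the FILTER as is.
    """
    if not pairwise_incompatible(constraints):
        return None
    branches = [
        {
            "kind": "MERGE",
            "children": [{"kind": "BIND", "var": var, "value": value}] + list(siblings),
        }
        for var, value in constraints
    ]
    return {"kind": "UNION", "children": branches}

# FILTER(?x = "a" || ?x = "b") over a single sibling BGP.
union = filter_to_union(["bgp1"], [("?x", "a"), ("?x", "b")])
print(union)
```

After the conversion, the new UNION node can take part in the UNION-related simplifications of Figure 3.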
In this way, the FILTER node is converted into a UNION node that takes part in the simplification of Figure 3, which can simplify the query tree to some extent and thereby improve query efficiency.
Figure 10 shows a simplification method provided by this application. The method includes:
Step 1001. If there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes.
Two triple patterns $t_1 = \langle s_1, p_1, o_1 \rangle$ and $t_2 = \langle s_2, p_2, o_2 \rangle$ are said to be equivalent (the notation for this relation appears in formula images PCTCN2022135606-appb-000013 and PCTCN2022135606-appb-000014) if and only if the following conditions hold: 1. $s_1$ and $s_2$ are both variables; 2. $p_1$ and $p_2$ are the same predicate; 3. $o_1$ and $o_2$ are both variables, or are the same constant.
If $t_1$ and $t_2$ are equivalent, then $\mu(t_1, t_2)$ is a bijection from $\mathrm{Var}(t_1)$ to $\mathrm{Var}(t_2)$, where $\mathrm{Var}(t_1)$ denotes the set of variables appearing in $t_1$.
Given two BGPs $b_i$ and $b_j$ whose triple pattern sequences are $S_i = (t_{i1}, \dots, t_{ik})$ and $S_j = (t_{j1}, \dots, t_{jk'})$, $S_i$ and $S_j$ are said to be equivalent (the notation appears in formula image PCTCN2022135606-appb-000016) if and only if the following conditions hold: 1. the sequences $S_i$ and $S_j$ have the same length, that is, $k = k'$; 2. $t_{il}$ is equivalent to $t_{jl}$ for every $l = 1, \dots, k$; 3. $\mu_1(t_{i1}, t_{j1}) \cup \dots \cup \mu_k(t_{ik}, t_{jk'})$ is still a bijection from $\mathrm{Var}(S_i)$ to $\mathrm{Var}(S_j)$, where $\mathrm{Var}(S_i)$ denotes the set of variables appearing in $S_i$.
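The two equivalence definitions can be checked with a single pass that builds the variable mapping incrementally. The sketch below assumes that triple patterns are `(subject, predicate, object)` tuples of strings, that variables are the strings starting with `?`, and that predicates are constants; these conventions are illustrative and not prescribed by the application.

```python
def is_var(term: str) -> bool:
    return term.startswith("?")

def extend(mapping: dict, a: str, b: str) -> bool:
    """Add a -> b while keeping `mapping` a one-to-one function."""
    if mapping.get(a, b) != b:                       # a already maps elsewhere
        return False
    if b in mapping.values() and mapping.get(a) != b:  # b already has a preimage
        return False
    mapping[a] = b
    return True

def bgp_equivalent(seq_i, seq_j) -> bool:
    """Check the three conditions for equivalent triple pattern sequences."""
    if len(seq_i) != len(seq_j):                     # condition 1: same length
        return False
    mapping = {}
    for (s1, p1, o1), (s2, p2, o2) in zip(seq_i, seq_j):
        # Condition 2: position-wise equivalent triple patterns.
        if not (is_var(s1) and is_var(s2)) or p1 != p2:
            return False
        # Condition 3: the union of the per-triple mappings stays a bijection.
        if not extend(mapping, s1, s2):
            return False
        if is_var(o1) and is_var(o2):
            if not extend(mapping, o1, o2):
                return False
        elif o1 != o2 or is_var(o1) or is_var(o2):
            return False
    return True

a = [("?x", "knows", "?y"), ("?y", "worksAt", "Acme")]
b = [("?a", "knows", "?b"), ("?b", "worksAt", "Acme")]
c = [("?a", "knows", "?b"), ("?c", "worksAt", "Acme")]
print(bgp_equivalent(a, b), bgp_equivalent(a, c))
```

Sequence `c` fails because `?y` would have to map to both `?b` and `?c`, so the combined mapping is not a function.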
Based on the above definitions, when the query graph contains multiple BGPs that can be executed in parallel, a frequent subgraph mining algorithm may be used to find the common subqueries $C = \{c_1, \dots, c_n\}$ among them, where each $c_i$ is an equivalent triple pattern subsequence in these BGPs. The common subqueries $C$ are the common triple patterns corresponding to the multiple BGP nodes.
Step 1002. Using a greedy algorithm, determine among the common triple patterns a subset of common triple patterns with the lowest corresponding query cost.
Selecting highly selective common subqueries: given a BGP set $B = \{b_1, \dots, b_m\}$ and the set of common subqueries among them $C = \{c_1, \dots, c_n\}$, select a subset of common subqueries $C_S \subseteq C$ that minimizes the following cost:
$$\mathrm{Cost}(B, C_S) = \sum_{c_i \in C_S} \mathrm{Cost}(c_i) + \sum_{b_j \in B} \mathrm{Cost}(b_j \mid C_S)$$
Here $\mathrm{Cost}(B, C_S)$ is the matching cost of the BGP set, $\mathrm{Cost}(c_i)$ is the matching cost of a common subquery $c_i$ selected into the subset $C_S$, and $\mathrm{Cost}(b_j \mid C_S)$ is the matching cost of querying the remaining triple patterns of $b_j$, given the results of all common subqueries selected into $C_S$. The matching cost may be the computing resources consumed, or the time taken, to query the corresponding triple patterns. It may be determined from the variables in each triple pattern, and the matching cost associated with each variable may be preset by a technician.
Cost(ci) and Cost(bj|CS) are defined in terms of the selectivity sel(t) of a triple pattern t as follows:
Cost(ci)=min{sel(t)|t∈ci}×|ci|
Cost(bj|CS)=min{sel(t)|t∈b′j}×|b′j|
where
Figure PCTCN2022135606-appb-000020
that is, the part of the BGP set B that is not covered by the common subquery subset CS.
Minimizing the above objective is an NP-hard problem, so a greedy algorithm is used to choose CS so that the objective is as small as possible. CS is initialized to the empty set. In each step, the common subquery ci∈C that maximizes Δ=Cost(B,CS)-Cost(B,CS∪{ci}) is added to CS, and the iteration continues until no further common subquery can be added to CS. The resulting CS is the subset of common triple patterns with the lowest query cost.
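The greedy selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the objective is assumed to be the sum of Cost(ci) over the selected subqueries plus the sum of Cost(bj|CS) over all BGPs (the formula itself is only available as a figure), sel(t) is treated as an estimated match count, an empty remainder is assumed to cost 0, and the loop is assumed to stop once no candidate yields a positive Δ. Variable renaming between equivalent triple patterns is ignored, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Triple = Tuple[str, str, str]      # (subject, predicate, object)
Pattern = FrozenSet[Triple]        # a BGP or a common subquery


def pattern_cost(patterns: Pattern, sel: Dict[Triple, int]) -> int:
    """min{sel(t) | t in patterns} x |patterns|; an empty set is assumed to cost 0."""
    if not patterns:
        return 0
    return min(sel[t] for t in patterns) * len(patterns)


def residual(bgp: Pattern, chosen: List[Pattern]) -> Pattern:
    """b'_j: the part of a BGP not covered by the chosen common subqueries."""
    covered = set()
    for c in chosen:
        if c <= bgp:               # only subqueries contained in this BGP cover it
            covered |= c
    return frozenset(bgp - covered)


def total_cost(bgps: List[Pattern], chosen: List[Pattern], sel: Dict[Triple, int]) -> int:
    """Assumed form of Cost(B, CS): shared subquery cost plus residual cost."""
    shared = sum(pattern_cost(c, sel) for c in chosen)
    rest = sum(pattern_cost(residual(b, chosen), sel) for b in bgps)
    return shared + rest


def greedy_select(bgps: List[Pattern], candidates: List[Pattern],
                  sel: Dict[Triple, int]) -> List[Pattern]:
    """Repeatedly add the candidate with the largest cost reduction (delta)."""
    chosen: List[Pattern] = []
    remaining = list(candidates)
    current = total_cost(bgps, chosen, sel)
    while remaining:
        best = min(remaining, key=lambda c: total_cost(bgps, chosen + [c], sel))
        best_cost = total_cost(bgps, chosen + [best], sel)
        if current - best_cost <= 0:   # assumed stopping rule: no positive delta
            break
        chosen.append(best)
        remaining.remove(best)
        current = best_cost
    return chosen


if __name__ == "__main__":
    t1, t2 = ("?x", "type", "Person"), ("?x", "name", "?n")
    t3, t4 = ("?x", "age", "?a"), ("?x", "email", "?e")
    sel = {t1: 100, t2: 200, t3: 50, t4: 60}
    bgps = [frozenset({t1, t2, t3}), frozenset({t1, t2, t4})]
    candidates = [frozenset({t1, t2}), frozenset({t2})]
    print(greedy_select(bgps, candidates, sel))
```

Recomputing the full cost for every candidate keeps the sketch short; a practical version would more likely update the cost incrementally.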
Step 1003: query the graph data for the data corresponding to the selected subset of common triple patterns.
Step 1004: query the graph data for the data corresponding to the other triple patterns of the multiple BGP nodes, apart from the selected subset of common triple patterns.
When executing the query operations corresponding to multiple BGP nodes, matching can first be performed for every selected common subquery ci∈CS, and the intermediate results [[ci]] can be cached. When matching is then performed for each BGP node bj, its result set can be computed as follows:
Figure PCTCN2022135606-appb-000021
where each
Figure PCTCN2022135606-appb-000022
is a common subquery among the triple pattern subsequences of bj, and b′j is defined as above as the part of bj not covered by the common subquery subset CS.
In this way, when performing the query processing corresponding to multiple BGP nodes, the common triple patterns corresponding to the multiple BGP nodes can be queried first, and the query processing for the other triple patterns of the multiple BGP nodes, apart from the selected subset of common triple patterns, can be performed afterwards. This reduces the amount of data queried and improves query speed.
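As an illustration of reusing cached intermediate results, the sketch below assumes that the result set of a BGP node is obtained by joining the cached results [[ci]] of its common subqueries with the result of its uncovered part b′j; the exact formula is only available as a figure, so this form is an assumption. Solutions are modelled as dictionaries from variable names to values, variable renaming between equivalent subqueries is omitted, and the names `join` and `evaluate_bgp` are hypothetical.

```python
from typing import Dict, List

Solution = Dict[str, object]   # one variable binding, e.g. {"?x": "a"}


def join(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Natural join: keep pairs that agree on every shared variable."""
    out = []
    for l in left:
        for r in right:
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


def evaluate_bgp(common_ids: List[str],
                 cache: Dict[str, List[Solution]],
                 residual_result: List[Solution]) -> List[Solution]:
    """Combine cached common-subquery results with the residual result.

    Pass [{}] as residual_result when the BGP is fully covered, since the
    empty binding is the identity element of the join.
    """
    result: List[Solution] = [{}]
    for cid in common_ids:
        result = join(result, cache[cid])   # reuse [[c_i]] instead of re-matching
    return join(result, residual_result)


if __name__ == "__main__":
    cache = {"c1": [{"?x": "a", "?n": "Ann"}, {"?x": "b", "?n": "Bob"}]}
    residual_result = [{"?x": "a", "?a": 30}]
    print(evaluate_bgp(["c1"], cache, residual_result))
```

The cache is keyed by a subquery identifier so that each common subquery is matched once, however many BGP nodes contain it.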
All of the above optional technical solutions may be combined in any manner to form optional embodiments of the present disclosure, which are not described one by one here.
Fig. 11 is a schematic structural diagram of an apparatus for querying data provided by an embodiment of the present application. The apparatus may be the computer device in the above embodiments. Referring to Fig. 11, the apparatus includes:
a receiving module 1110, configured to receive a data query instruction sent by a data query application, the data query instruction carrying a data query statement;
an establishing module 1120, configured to establish, based on the structure of the data query statement, a first query tree corresponding to the data query statement;
a processing module 1130, configured to simplify the first query tree based on the type of each node in the first query tree, to obtain a second query tree;
a query module 1140, configured to execute, in a preset execution order, the query operations corresponding to the nodes of the second query tree in the graph database, to obtain a data query result;
a returning module 1150, configured to return the data query result to the data query application.
Optionally, the types of nodes in the first query tree include merge nodes and query nodes, where a merge node represents the data query statement or a subquery statement in the data query statement, a query node represents a query term in the data query statement or in a subquery statement of the data query statement, and the query nodes include at least one of a BGP node, a UNION node, an OPTIONAL node, and a FILTER node.
Optionally, the processing module 1130 is configured to: determine the first query tree as a third query tree to be simplified, and determine the depth of each node in the third query tree; for a first merge node of depth 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merge the multiple BGP nodes to obtain a merged first BGP node, delete the first merge node, and add the first BGP node at the position of the first merge node; for a second merge node of depth 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merge the at least one BGP node to obtain a merged second BGP node, and merge the at least one UNION node to obtain a merged third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, delete the second merge node, and add the fourth UNION node at the position of the second merge node; for a fifth UNION node of depth 2 in the third query tree, add the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and delete the grandchild nodes of the fifth UNION node together with the parent nodes of those grandchild nodes, to obtain the simplified third query tree.
Optionally, the processing module 1130 is further configured to: determine a first OPTIONAL node in the first query tree, none of whose ancestor nodes is an OPTIONAL node; and convert the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
Optionally, the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, determine the subquery tree corresponding to the third BGP node; execute the query operations corresponding to the sibling nodes of the first OPTIONAL node in the subquery tree, to obtain a first query result; determine the first query result as the data query scope of the descendant nodes of the first OPTIONAL node; and execute, based on the data query scope, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
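The evaluation order described above (siblings first, then the OPTIONAL part restricted to the siblings' results) can be sketched as follows. This is an illustrative sketch only: solutions are modelled as dictionaries, the data query scope is modelled as a set of candidate values per variable, and `match_optional` is a hypothetical callback standing in for whatever matching routine evaluates the descendants of the OPTIONAL node. The final combination uses a left outer join, which is the usual semantics of OPTIONAL in SPARQL.

```python
from typing import Callable, Dict, List, Set

Solution = Dict[str, object]
Scope = Dict[str, Set[object]]   # candidate values per variable


def left_join(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Left outer join: keep each left row even when nothing on the right matches."""
    out = []
    for l in left:
        matches = [{**l, **r} for r in right
                   if all(l[v] == r[v] for v in l.keys() & r.keys())]
        out.extend(matches if matches else [l])
    return out


def evaluate_optional(sibling_result: List[Solution],
                      match_optional: Callable[[Scope], List[Solution]]) -> List[Solution]:
    """Evaluate the OPTIONAL part only within the scope of the sibling result."""
    scope: Scope = {}
    for row in sibling_result:
        for var, val in row.items():
            scope.setdefault(var, set()).add(val)
    optional_result = match_optional(scope)   # restricted search space
    return left_join(sibling_result, optional_result)


if __name__ == "__main__":
    emails = [{"?x": "a", "?e": "a@example.org"}, {"?x": "c", "?e": "c@example.org"}]

    def match_optional(scope: Scope) -> List[Solution]:
        # Hypothetical matcher: only considers bindings already produced by the siblings.
        return [r for r in emails if r["?x"] in scope.get("?x", set())]

    print(evaluate_optional([{"?x": "a"}, {"?x": "b"}], match_optional))
```

Restricting the OPTIONAL part to values already bound by its siblings avoids matching optional patterns against data that could never appear in the final result.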
Optionally, the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node, execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling nodes of the at least one second OPTIONAL node, where the data query scope corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the second sibling node of the preceding OPTIONAL node; and execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, where the data query scope corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.
Optionally, the processing module 1130 is configured to: for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, convert the FILTER condition into disjunctive normal form; and convert the FILTER node into a UNION node based on the disjunctive normal form. Optionally, the conversion condition is that the FILTER condition corresponding to the FILTER node consists only of variables, constants, and the three operators AND, OR, and equality.
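The FILTER-to-UNION conversion can be illustrated with a small sketch. It assumes a FILTER condition encoded as nested tuples, `("eq", variable, constant)`, `("and", e1, e2)` and `("or", e1, e2)`, which matches the conversion condition above (variables, constants, AND, OR, equality). The encoding, the function names, and the dictionary shape of the resulting UNION node are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Expr = Tuple          # ("eq", var, const) | ("and", e1, e2) | ("or", e1, e2)
Conjunction = List[Expr]


def to_dnf(expr: Expr) -> List[Conjunction]:
    """Return the condition as a list of conjunctions of equality atoms."""
    op = expr[0]
    if op == "eq":
        return [[expr]]
    left, right = to_dnf(expr[1]), to_dnf(expr[2])
    if op == "or":
        return left + right                           # union of alternatives
    if op == "and":
        return [l + r for l in left for r in right]   # distribute AND over OR
    raise ValueError(f"operator not allowed by the conversion condition: {op}")


def filter_to_union(expr: Expr) -> Dict[str, object]:
    """One UNION branch per conjunction of the disjunctive normal form."""
    branches = [{"type": "GROUP", "equalities": conj} for conj in to_dnf(expr)]
    return {"type": "UNION", "children": branches}


if __name__ == "__main__":
    condition = ("and",
                 ("or", ("eq", "?x", "1"), ("eq", "?x", "2")),
                 ("eq", "?y", "3"))
    print(filter_to_union(condition))
```

Distributing AND over OR can grow the number of branches exponentially in the worst case, which is one reason a conversion condition that limits the allowed operators is useful.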
Optionally, the query module 1140 is configured to: if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes; determine, based on a greedy algorithm, the subset of the common triple patterns with the lowest corresponding query cost; query the graph data for the data corresponding to the selected subset of common triple patterns; and query the graph data for the data corresponding to the other triple patterns of the multiple BGP nodes, apart from the selected subset of common triple patterns.
It should be noted that the division into the above functional modules is only an example of how the apparatus for querying data provided by the above embodiments queries data. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the computer device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for querying data provided by the above embodiments and the method embodiments for querying data belong to the same concept; for its specific implementation, see the method embodiments, which are not repeated here.
Fig. 12 shows a structural block diagram of a computer device 1200 provided by an exemplary embodiment of the present application. The computer device 1200 may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (moving picture experts group audio layer III) player, an MP4 (moving picture experts group audio layer IV) player, a laptop computer, or a desktop computer. The computer device 1200 may also be referred to as user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
Generally, the computer device 1200 includes a processor 1201 and a memory 1202. The processor 1201 may include one or more processing cores, for example a 4-core or an 8-core processor. The processor 1201 may be implemented in at least one hardware form among DSP (digital signal processing), FPGA (field-programmable gate array), and PLA (programmable logic array). The processor 1201 may also include a main processor and a coprocessor. The main processor, also called a CPU (central processing unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1201 may integrate a GPU (graphics processing unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1201 may further include an AI (artificial intelligence) processor, which handles computing operations related to machine learning.
The memory 1202 may include one or more computer-readable storage media, which may be non-transitory. The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 stores at least one instruction, which is executed by the processor 1201 to implement the method for querying data provided by the method embodiments of the present application.
In some embodiments, the computer device 1200 optionally further includes a peripheral device interface 1203 and at least one peripheral device. The processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 1203 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1204, a display screen 1205, a camera assembly 1206, an audio circuit 1207, a positioning component 1208, and a power supply 1209.
The peripheral device interface 1203 may be used to connect at least one I/O (input/output) related peripheral device to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1204 receives and transmits RF (radio frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1204 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to, the World Wide Web, metropolitan area networks, intranets, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (wireless fidelity) networks. In some embodiments, the radio frequency circuit 1204 may also include circuits related to NFC (near field communication), which is not limited in the present application.
The display screen 1205 displays a UI (user interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, it can also collect touch signals on or above its surface. The touch signal may be input to the processor 1201 as a control signal for processing. In this case, the display screen 1205 may also provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1205, arranged on the front panel of the computer device 1200; in other embodiments, there may be at least two display screens 1205, arranged on different surfaces of the computer device 1200 or in a folding design; in still other embodiments, the display screen 1205 may be a flexible display screen arranged on a curved or folding surface of the computer device 1200. The display screen 1205 may even be given a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 1205 may be made of materials such as LCD (liquid crystal display) or OLED (organic light-emitting diode).
The camera assembly 1206 captures images or video. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Usually, the front camera is arranged on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to provide background blur, and the main camera and the wide-angle camera can be fused to provide panoramic shooting, VR (virtual reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 1206 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 1207 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment and converts them into electrical signals, which are input to the processor 1201 for processing or to the radio frequency circuit 1204 for voice communication. For stereo capture or noise reduction, there may be multiple microphones arranged at different parts of the computer device 1200. The microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 1207 may also include a headphone jack.
The positioning component 1208 determines the current geographic location of the computer device 1200 to support navigation or LBS (location based service). The positioning component 1208 may be based on the GPS (global positioning system) of the United States, the Beidou system of China, or the Galileo system of the European Union.
The power supply 1209 supplies power to the components of the computer device 1200. The power supply 1209 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 1209 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.
In some embodiments, the computer device 1200 further includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to, an acceleration sensor 1211, a gyroscope sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
The acceleration sensor 1211 can detect the magnitude of acceleration on the three axes of a coordinate system established with respect to the computer device 1200. For example, the acceleration sensor 1211 can detect the components of gravitational acceleration on the three axes. Based on the gravitational acceleration signal collected by the acceleration sensor 1211, the processor 1201 can control the display screen 1205 to display the user interface in landscape or portrait view. The acceleration sensor 1211 can also be used to collect motion data for games or for the user.
The gyroscope sensor 1212 can detect the body orientation and rotation angle of the computer device 1200, and can work with the acceleration sensor 1211 to capture the user's 3D actions on the computer device 1200. Based on the data collected by the gyroscope sensor 1212, the processor 1201 can implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 1213 may be arranged on the side frame of the computer device 1200 and/or beneath the display screen 1205. When arranged on the side frame, the pressure sensor 1213 can detect the user's grip signal on the computer device 1200, and the processor 1201 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 1213. When the pressure sensor 1213 is arranged beneath the display screen 1205, the processor 1201 controls the operable controls on the UI according to the user's pressure operation on the display screen 1205. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 1214 collects the user's fingerprint. Either the processor 1201 identifies the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 1201 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 1214 may be arranged on the front, back, or side of the computer device 1200. When the computer device 1200 has a physical button or a manufacturer logo, the fingerprint sensor 1214 may be integrated with the physical button or the manufacturer logo.
The optical sensor 1215 collects the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the display screen 1205 is increased; when the ambient light intensity is low, the display brightness is decreased. In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
The proximity sensor 1216, also called a distance sensor, is usually arranged on the front panel of the computer device 1200. The proximity sensor 1216 measures the distance between the user and the front of the computer device 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the computer device 1200 is gradually decreasing, the processor 1201 controls the display screen 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance is gradually increasing, the processor 1201 controls the display screen 1205 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 12 does not limit the computer device 1200, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions, where the instructions can be executed by a processor in a terminal to complete the method for querying data in the above embodiments. The computer-readable storage medium may be non-transitory. For example, the computer-readable storage medium may be a ROM (read-only memory), a RAM (random access memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program product is also provided. The computer program product includes at least one instruction, which is loaded and executed by a processor to implement the method for querying data in the above embodiments.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
In this application, terms such as "first" and "second" are used to distinguish between identical or similar items that have substantially the same role and function. It should be understood that "first" and "second" imply no logical or temporal dependency and place no limit on quantity or execution order. It should also be understood that although the following description uses the terms first, second, and so on to describe various elements, these elements should not be limited by the terms. The terms are used only to distinguish one element from another. In this application, the term "at least one" means one or more, and the term "a plurality of" means two or more.
The above description covers only specific implementations of this application, but the scope of protection of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be determined by the scope of protection of the claims.

Claims (12)

  1. A method for querying data, characterized in that the method comprises:
    receiving a data query instruction sent by a data query application, the data query instruction carrying a data query statement;
    building, based on the structure of the data query statement, a first query tree corresponding to the data query statement;
    simplifying the first query tree based on the type of each node in the first query tree to obtain a second query tree;
    executing, in a graph database and in a preset execution order, the query operations corresponding to the nodes of the second query tree in sequence to obtain a data query result; and
    returning the data query result to the data query application.
  2. The method according to claim 1, characterized in that the types of nodes in the first query tree include merge nodes and query nodes, wherein a merge node represents the data query statement or a sub-query statement in the data query statement, a query node represents a query term in the data query statement or in a sub-query statement of the data query statement, and the query nodes include at least one of a basic graph pattern (BGP) node, a UNION node, an OPTIONAL (optional match) node, and a FILTER node.
  3. The method according to claim 2, characterized in that simplifying the first query tree based on the type of each node in the first query tree to obtain the second query tree comprises:
    determining the first query tree as a third query tree to be simplified, and determining the depth of each node in the third query tree;
    for a first merge node of depth 1 in the third query tree, if the child nodes of the first merge node include a plurality of BGP nodes, merging the plurality of BGP nodes to obtain a merged first BGP node, deleting the first merge node, and adding the first BGP node at the position of the first merge node;
    for a second merge node of depth 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merging the at least one BGP node to obtain a merged second BGP node, and merging the at least one UNION node to obtain a merged third UNION node; merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, deleting the second merge node, and adding the fourth UNION node at the position of the second merge node;
    for a fifth UNION node of depth 2 in the third query tree, adding the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and deleting the grandchild nodes of the fifth UNION node and the parent nodes of the grandchild nodes, to obtain the simplified third query tree.
  4. The method according to claim 3, characterized in that, before determining the first query tree as the third query tree to be simplified, the method further comprises:
    determining, in the first query tree, a first OPTIONAL node whose ancestor nodes include no OPTIONAL node;
    converting the sub-query tree rooted at the parent node of the first OPTIONAL node into a third BGP node.
  5. The method according to claim 4, characterized in that executing, in the graph database, the query operations corresponding to the nodes of the second query tree in sequence comprises:
    when executing a first query operation corresponding to the third BGP node, determining the sub-query tree corresponding to the third BGP node;
    executing the query operation corresponding to the sibling node of the first OPTIONAL node in the sub-query tree to obtain a first query result; determining the first query result as the data query range of the descendant nodes of the first OPTIONAL node; and executing, based on the data query range, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
  6. The method according to claim 4, characterized in that executing, in the graph database, the query operations corresponding to the nodes of the second query tree in sequence comprises:
    when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node:
    executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the second sibling node of the preceding OPTIONAL node;
    executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.
  7. The method according to claim 2, characterized in that simplifying the first query tree based on the type of each node in the first query tree to obtain the second query tree comprises:
    for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, converting the FILTER condition into disjunctive normal form;
    converting the FILTER node into a UNION node based on the disjunctive normal form.
  8. The method according to claim 7, characterized in that the conversion condition is that the FILTER condition corresponding to the FILTER node consists of variables, constants, and the three operators AND, OR, and equality.
  9. The method according to claim 2, characterized in that executing, in the graph database, the query operations corresponding to the nodes of the second query tree in sequence comprises:
    if there are a plurality of BGP nodes that can be executed in parallel, determining the common triple patterns corresponding to the plurality of BGP nodes;
    determining, based on a greedy algorithm, the subset of the common triple patterns with the lowest corresponding query cost;
    querying the graph data for the data corresponding to the subset of common triple patterns;
    querying the graph data for the data corresponding to the triple patterns of the plurality of BGP nodes other than the subset of common triple patterns.
  10. An apparatus for querying data, characterized in that the apparatus comprises:
    a receiving module, configured to receive a data query instruction sent by a data query application, the data query instruction carrying a data query statement;
    a building module, configured to build, based on the structure of the data query statement, a first query tree corresponding to the data query statement;
    a processing module, configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree;
    a query module, configured to execute, in a graph database and in a preset execution order, the query operations corresponding to the nodes of the second query tree in sequence to obtain a data query result;
    a returning module, configured to return the data query result to the data query application.
  11. A computer device, characterized in that the computer device comprises a processor and a memory, the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the operations performed by the method for querying data according to any one of claims 1 to 9.
  12. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the operations performed by the method for querying data according to any one of claims 1 to 9.
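
Illustrative note (not part of the claims): the FILTER handling recited in claims 7 and 8 can be sketched informally as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. The class names `Eq`, `And`, `Or` and the functions `is_convertible`, `to_dnf`, and `filter_to_union` are hypothetical, and the dictionary used to represent a UNION node is invented for illustration. The claims do not say how each disjunct is attached to the surrounding graph patterns, so each UNION branch here simply carries its list of equality constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eq:
    """Equality between a variable and a constant (or another variable)."""
    left: str
    right: str


@dataclass(frozen=True)
class And:
    operands: tuple


@dataclass(frozen=True)
class Or:
    operands: tuple


def is_convertible(cond) -> bool:
    """Assumed reading of the conversion condition in claim 8:
    the condition uses only AND, OR, and equality."""
    if isinstance(cond, Eq):
        return True
    if isinstance(cond, (And, Or)):
        return all(is_convertible(op) for op in cond.operands)
    return False


def to_dnf(cond):
    """Return the condition in disjunctive normal form as a list of
    conjunctions, where each conjunction is a list of Eq terms."""
    if isinstance(cond, Eq):
        return [[cond]]
    if isinstance(cond, Or):
        # OR: concatenate the disjuncts of every operand.
        return [conj for op in cond.operands for conj in to_dnf(op)]
    if isinstance(cond, And):
        # AND: distribute over the disjuncts of every operand.
        result = [[]]
        for op in cond.operands:
            result = [left + right for left in result for right in to_dnf(op)]
        return result
    raise TypeError("condition does not satisfy the conversion condition")


def filter_to_union(cond):
    """Build a hypothetical UNION node with one branch per DNF disjunct."""
    return {
        "type": "UNION",
        "children": [{"type": "GROUP", "equalities": conj} for conj in to_dnf(cond)],
    }
```

For example, the condition `(?x = 1 OR ?x = 2) AND ?y = 3` would yield a UNION node with two branches, one constrained by `?x = 1, ?y = 3` and the other by `?x = 2, ?y = 3`.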
PCT/CN2022/135606 2021-12-31 2022-11-30 Data query method and apparatus, and device and storage medium WO2023124729A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111673409.6A CN114706846A (en) 2021-12-31 2021-12-31 Method, device and equipment for querying data and storage medium
CN202111673409.6 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023124729A1 true WO2023124729A1 (en) 2023-07-06

Family

ID=82166982

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135606 WO2023124729A1 (en) 2021-12-31 2022-11-30 Data query method and apparatus, and device and storage medium

Country Status (2)

Country Link
CN (1) CN114706846A (en)
WO (1) WO2023124729A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271576A (en) * 2023-10-19 2023-12-22 北京人大金仓信息技术股份有限公司 Query optimization method, storage medium and computer equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114706846A (en) * 2021-12-31 2022-07-05 北京大学 Method, device and equipment for querying data and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116625A (en) * 2013-01-31 2013-05-22 重庆大学 Hadoop-based distributed query processing method for massive RDF data
US20140067793A1 (en) * 2012-08-31 2014-03-06 Infotech Soft, Inc. Query Optimization for SPARQL
US20140304251A1 (en) * 2013-04-03 2014-10-09 International Business Machines Corporation Method and Apparatus for Optimizing the Evaluation of Semantic Web Queries
CN111241127A (en) * 2020-01-16 2020-06-05 华南师范大学 Predicate combination-based SPARQL query optimization method, system, storage medium and equipment
CN114706846A (en) * 2021-12-31 2022-07-05 北京大学 Method, device and equipment for querying data and storage medium

Also Published As

Publication number Publication date
CN114706846A (en) 2022-07-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913984

Country of ref document: EP

Kind code of ref document: A1