WO2023124729A1

WO2023124729A1 - Data query method and apparatus, and device and storage medium

Info

Publication number: WO2023124729A1
Application number: PCT/CN2022/135606
Authority: WO
Inventors: 邹磊
Original assignee: 北京大学
Priority date: 2021-12-31
Filing date: 2022-11-30
Publication date: 2023-07-06
Also published as: CN114706846A

Abstract

The present application belongs to the technical field of graph databases. Disclosed are a data query method and apparatus, and a device and a storage medium. The method comprises: receiving a data query instruction, which is sent by a data query application program, wherein the data query instruction carries a data query statement; on the basis of the structure of the data query statement, establishing a first query tree corresponding to the data query statement; performing simplification processing on the first query tree on the basis of the type of each node in the first query tree, so as to obtain a second query tree; on the basis of a preset execution sequence, sequentially executing, in a graph database, query operations corresponding to respective nodes in the second query tree, so as to obtain a data query result; and returning the data query result to the data query application program. By means of the present application, the data query efficiency in a graph database can be improved.

Description

Method, device, equipment and storage medium for querying data

This application claims priority to Chinese patent application No. 202111673409.6, entitled "Method, device, equipment and storage medium for querying data" and filed on December 31, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the technical field of graph databases, and in particular to a method, apparatus, device and storage medium for querying data.

Background

RDF (Resource Description Framework) is the de facto data model for knowledge graphs. Each edge in a knowledge graph is expressed as an RDF triple of the form <subject, predicate, object>, which represents either a named relationship between a pair of entities or a named property value of an entity.

SPARQL (SPARQL Protocol and RDF Query Language) is the standard query language for accessing RDF datasets. UNION (union), OPTIONAL (optional matching) and FILTER (filtering) are query expressions commonly used in SPARQL data query statements.

At present, when a computer device executes the query operation corresponding to a data query statement, it simply executes the query processing for each query expression in sequence, so data query efficiency is low.

Contents of the invention

Embodiments of the present application provide a method, apparatus, device and storage medium for querying data, which can improve data query efficiency. The technical solutions are as follows:

In a first aspect, a method for querying data is provided, the method comprising:

receiving a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;

establishing a first query tree corresponding to the data query statement based on the structure of the data query statement;

simplifying the first query tree based on the type of each node in the first query tree to obtain a second query tree;

sequentially executing, in a graph database and based on a preset execution order, the query operations corresponding to the nodes in the second query tree to obtain a data query result;

returning the data query result to the data query application program.

Optionally, the types of nodes in the first query tree include merge nodes and query nodes, wherein a merge node represents the data query statement or a subquery statement in the data query statement, a query node represents a query word in the data query statement or in a subquery statement of the data query statement, and the query nodes include at least one of a BGP node, a UNION node, an OPTIONAL node and a FILTER node.

Optionally, simplifying the first query tree based on the type of each node in the first query tree to obtain the second query tree includes:

determining the first query tree as a third query tree to be simplified, and determining the depth of each node in the third query tree;

for a first merge node with a depth of 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merging the multiple BGP nodes to obtain a merged first BGP node, deleting the first merge node, and adding the first BGP node at the position of the first merge node;

for a second merge node with a depth of 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merging the at least one BGP node to obtain a merged second BGP node, and merging the at least one UNION node to obtain a merged third UNION node; merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, deleting the second merge node, and adding the fourth UNION node at the position of the second merge node;

for a fifth UNION node with a depth of 2 in the third query tree, adding the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and deleting the original grandchild nodes of the fifth UNION node together with the parent nodes of those grandchild nodes, to obtain the simplified third query tree.

Optionally, before determining the first query tree as the third query tree to be simplified, the method further includes:

determining a first OPTIONAL node in the first query tree, the first OPTIONAL node being an OPTIONAL node that has no OPTIONAL node among its ancestor nodes;

converting the subquery tree whose root node is the parent node of the first OPTIONAL node into a third BGP node.

Optionally, performing the query operation corresponding to each node in the second query tree sequentially in the graph database includes:

when executing the first query operation corresponding to the third BGP node, determining the subquery tree corresponding to the third BGP node;

executing the query operation corresponding to the sibling nodes of the first OPTIONAL node in the subquery tree to obtain a first query result; determining the first query result as the data query range of the descendant nodes of the first OPTIONAL node; and executing, based on the data query range, the query operations corresponding to the descendant nodes of the first OPTIONAL node.

When executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node:

according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of each second OPTIONAL node, wherein the data query range corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the sibling node of the previous OPTIONAL node;

according to the depths of the first OPTIONAL node and the at least one second OPTIONAL node, sequentially executing the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling nodes of that OPTIONAL node.

For the FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, then convert the FILTER condition into a disjunctive normal form;

Based on the disjunctive normal form, convert the FILTER node into a UNION node.

Optionally, the conversion condition is that the FILTER condition corresponding to the FILTER node is composed of variables, constants, and three operators: and, or, and equal.
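As an illustration of the FILTER conversion described above, the following Python sketch rewrites a condition built only from and, or and equality into disjunctive normal form. The tuple encoding of conditions and the name `to_dnf` are assumptions made for this example and do not come from the application. Each inner list of the result is one conjunction, which would correspond to one child of the resulting UNION node.

```python
def to_dnf(cond):
    """Convert a condition into disjunctive normal form.

    cond is a nested tuple (hypothetical encoding):
      ("eq", variable, constant), ("and", left, right) or ("or", left, right).
    The result is a list of conjunctions; each conjunction is a list of "eq" atoms.
    """
    op = cond[0]
    if op == "eq":
        return [[cond]]
    left, right = to_dnf(cond[1]), to_dnf(cond[2])
    if op == "or":
        # A disjunction simply concatenates the alternatives of both sides.
        return left + right
    if op == "and":
        # A conjunction distributes over the alternatives of both sides.
        return [l + r for l in left for r in right]
    raise ValueError(f"unsupported operator: {op}")


condition = ("and",
             ("eq", "?x", "a"),
             ("or", ("eq", "?y", "b"), ("eq", "?y", "c")))
branches = to_dnf(condition)
```

Here `branches` holds two conjunctions, (?x = a and ?y = b) and (?x = a and ?y = c), so the FILTER node would become a UNION node with two children under this sketch.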

If there are multiple BGP nodes that can be executed in parallel, determining the common triple patterns shared by the multiple BGP nodes;

determining, based on a greedy algorithm, the subset of the common triple patterns that corresponds to the lowest query cost;

querying the graph data for the data corresponding to that subset of common triple patterns;

querying the graph data for the data corresponding to the triple patterns of the multiple BGP nodes other than that subset of common triple patterns.

In a second aspect, a device for querying data is provided, the device comprising:

A receiving module, configured to receive a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;

An establishing module, configured to establish the first query tree corresponding to the data query statement based on the structure of the data query statement;

A processing module, configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree;

A query module, configured to sequentially execute query operations corresponding to each node in the second query tree in the graph database based on a preset execution sequence, to obtain data query results;

A returning module, configured to return the data query result to the data query application program.

Optionally, the query module is used for:

Optionally, the processing module is also used for:

Optionally, the query module is used for:

Optionally, the processing module is used for:

Optionally, the query module is used for:

In a third aspect, a computer device is provided. The computer device includes a processor and a memory, at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the operations performed in the first aspect above.

In a fourth aspect, a computer-readable storage medium is provided, at least one instruction is stored in the storage medium, and the at least one instruction is loaded and executed by a processor to implement the operations performed in the first aspect above.

In a fifth aspect, a computer program product is provided, the computer program product includes at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the operations performed in the first aspect above.

The beneficial effects brought by the technical solutions provided by the embodiments of the present application are:

In the embodiments of the present application, the data query statement is converted into a first query tree according to its structure, and the first query tree is then simplified according to the query logic corresponding to each query node in it, to obtain a second query tree. In this way, the query operation of the data query statement is simplified by simplifying the query tree, thereby improving data query efficiency.

Description of drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that need to be used in the description of the embodiments will be briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application. For those skilled in the art, other drawings can also be obtained based on these drawings without creative effort.

Fig. 1 is a flow chart of a method for querying data provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of a method for querying data provided by an embodiment of the present application;

Fig. 3 is a flow chart of a method for querying data provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of a method for querying data provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a method for querying data provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of a method for querying data provided by an embodiment of the present application;

FIG. 7 is a flow chart of a method for querying data provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of a method for querying data provided by an embodiment of the present application;

FIG. 9 is a flow chart of a method for querying data provided by an embodiment of the present application;

FIG. 10 is a flow chart of a method for querying data provided by an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a device for querying data provided by an embodiment of the present application;

Fig. 12 is a schematic diagram of a computer device for querying data provided by an embodiment of the present application.

Detailed description

In order to make the purpose, technical solution and advantages of the present application clearer, the implementation manners of the present application will be further described in detail below in conjunction with the accompanying drawings.

The method for querying data provided in the embodiments of the present application may be implemented by a computer device. An application program for querying data (such as a data query application program) can run on the computer device. The computer device includes at least a processor and a memory. The memory can be used to store data related to executing the method for querying data; for example, it can hold a graph database and the program code corresponding to the method. The processor can execute the program code stored in the memory and implement the method for querying data provided by the embodiments of the present application according to the data query request of the application program.

The computer device may be a terminal or a server. When the computer device is a terminal, the terminal may be a mobile phone, a tablet computer, a smart wearable device, a desktop computer, a notebook computer, or the like. When the computer device is a server, the server can establish communication with a terminal. The server can be a single server or a server group. A single server can be responsible for all of the processing in the following solutions; in a server group, different servers can be responsible for different parts of the processing. The specific allocation of processing can be set by technicians according to actual needs and is not described further here.

The concepts related to the embodiments of this application are introduced below:

SPARQL (SPARQL Protocol and RDF Query Language, query language and data acquisition protocol) is a standard query language for accessing RDF datasets.

RDF datasets consist of multiple RDF triples, which can store arbitrary graph data. For each edge in graph data there is a unique RDF triple.

RDF triples: Let the pairwise disjoint infinite sets I, B and L denote Internationalized Resource Identifiers (IRIs), blank nodes and literals, respectively. An RDF triple has the form t = <subject, predicate, object> ∈ (I∪B)×I×(I∪B∪L).

RDF Dataset: An RDF Dataset is a collection of RDF triples.

Triple pattern: let V be an infinite set of variables that is disjoint from the sets I, B and L. A triple pattern has the form t = <subject, predicate, object> ∈ (V∪I)×(V∪I)×(V∪I∪L). Because the subject, predicate or object of a triple pattern may be a variable, one triple pattern can match multiple RDF triples in an RDF dataset. RDF triples can therefore be queried in an RDF dataset by means of triple patterns, and, because an RDF dataset can represent graph data, graph data can also be searched by means of triple patterns.
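The way one triple pattern matches several RDF triples can be sketched in Python as follows. This is a minimal illustration, assuming that terms are plain strings, that variables are strings starting with "?", and that the dataset is a small in-memory list; the prefix `ex:` and the individuals `ex:Alice` and `ex:Bob` are hypothetical and only loosely follow the company example used later in this description.

```python
def is_var(term):
    # Assumption for this sketch: variables are written with a leading "?".
    return term.startswith("?")


def match_triple_pattern(pattern, dataset):
    """Return one mapping (a dict from variable to value) per matching RDF triple."""
    results = []
    for triple in dataset:
        mapping = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A variable that occurs twice must bind to the same value.
                if mapping.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            results.append(mapping)
    return results


dataset = [
    ("ex:TX", "ex:name", "Beijing TX Computer System Co., Ltd."),
    ("ex:Alice", "ex:Shareholding", "ex:TX"),
    ("ex:Bob", "ex:Shareholding", "ex:TX"),
]
shareholders = match_triple_pattern(("?x", "ex:Shareholding", "ex:TX"), dataset)
```

In this sketch the single pattern with variable ?x matches two triples, so `shareholders` contains one mapping for each of them.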

BGP (basic graph pattern): a BGP includes at least one triple pattern. A triple pattern t is a BGP; if both P₁ and P₂ are BGPs, then P₁ AND P₂ is also a BGP.

BGP is the basic unit for finding data in graph data. Through the existing BGP matching algorithm, data matching each triple pattern in BGP can be found in the RDF data set.

Graph pattern:

(1) If P is a BGP, then P is a graph pattern;

(2) If both P₁ and P₂ are graph patterns, then P₁ AND P₂ is also a graph pattern;

(3) If P₁ and P₂ are both graph patterns, then {P₁} UNION {P₂} and P₁ OPTIONAL {P₂} are also graph patterns, where {P_i} denotes a group graph pattern;

(4) If P is a graph pattern and C is a built-in condition, then P FILTER C is a graph pattern. A built-in condition is built from elements of I∪L∪V and constants, and may contain logical operators (∧, ∨), comparison operators (<, ≤, >, ≥, =), unary functions such as isBlank (tests whether a term is a blank node) and isIRI (tests whether a term is an IRI), and other functions.

Among them, UNION, OPTIONAL and FILTER are commonly used query expressions in SPARQL data query statements, among which:

UNION combines the matches of multiple graph patterns. For example, P₁ UNION P₂ searches the RDF dataset for the results that satisfy graph pattern P₁ and for the results that satisfy graph pattern P₂, and returns the union of the two sets of results.

OPTIONAL performs optional matching of a graph pattern. For example, P₁ OPTIONAL {P₂} retains the results that satisfy graph pattern P₁ in the RDF dataset and extends them with any compatible results that satisfy graph pattern P₂.

FILTER filters search results by a condition. For example, P₁ FILTER C keeps only the data that satisfy condition C among the search results corresponding to graph pattern P₁.

Group graph pattern: a group graph pattern is defined recursively as follows:

(1) If P is a graph pattern, then {P} is a group graph pattern;

(2) If P is a group graph pattern, then P is also a graph pattern.

UNION graph pattern:

(1) If P₁ is a group graph pattern or a UNION graph pattern, and P₂ is a group graph pattern, then P₁ UNION P₂ is a UNION graph pattern;

(2) If P₁ UNION P₂ is a UNION graph pattern, then it is also a graph pattern.

Well-defined SPARQL query: a graph pattern P is called well-defined if and only if it satisfies the following conditions:

(1) For each subpattern of P of the form P' FILTER C, all variables appearing in the built-in condition C also appear in the graph pattern P';

(2) For each subpattern of P of the form P' = P₁ OPTIONAL {P₂}, all variables that appear both in the graph pattern P₂ and outside P' also appear in the graph pattern P₁.

The principle of the invention is based on the semantics of select queries in SPARQL queries. The form of the selection query is "SELECT v ₁ v ₂ ...v _k WHERE{...}", where the SELECT clause represents the query header and the WHERE clause represents the query body. The SELECT clause determines the projection variable, that is, the variable that needs to appear in the query result; the WHERE clause gives the graph pattern that needs to be matched with the RDF dataset, that is, the WHERE clause gives the data query statement.

Matching a graph pattern P against an RDF dataset D produces a series of mappings [[P]]_D = {μ₁, μ₂, …, μₙ}. Note that duplicate elements are allowed, that is, [[P]]_D is a bag (multiset) rather than a set. Each mapping μ is a function from a set of variables to the values bound to them in one result. The set of variables appearing in a mapping μ is denoted dom(μ).

Two mappings μ₁ and μ₂ are said to be compatible, denoted μ₁ ~ μ₂, if and only if every variable v ∈ dom(μ₁) ∩ dom(μ₂) satisfies μ₁(v) = μ₂(v); in that case μ₁ ∪ μ₂ is also a mapping. If the two mappings μ₁ and μ₂ are incompatible, write μ₁ ≁ μ₂.

For two bags of mappings Ω₁ and Ω₂, the following operations are defined:

(1) Ω₁ ⋈ Ω₂ = {μ₁ ∪ μ₂ | μ₁ ∈ Ω₁ ∧ μ₂ ∈ Ω₂ ∧ μ₁ ~ μ₂};

(2) Ω₁ ∪_bag Ω₂ = {μ₁ | μ₁ ∈ Ω₁} ∪_bag {μ₂ | μ₂ ∈ Ω₂};

(3) Ω₁ \ Ω₂ = {μ₁ | μ₁ ∈ Ω₁ ∧ ∀μ₂ ∈ Ω₂: μ₁ ≁ μ₂}.

The bag of mappings produced by matching a graph pattern P against an RDF dataset D (denoted [[P]]_D) is defined recursively as follows:

(1) If P is a triple pattern t, then [[P]]_D = {μ | var(t) = dom(μ) ∧ μ(t) ∈ D}, where var(t) denotes the set of all variables appearing in t, and μ(t) denotes the RDF triple obtained by replacing every variable in t according to μ;

(2) If P = (P₁ AND P₂), then [[P]]_D = [[P₁]]_D ⋈ [[P₂]]_D;

(3) If P = (P₁ UNION P₂), then [[P]]_D = [[P₁]]_D ∪_bag [[P₂]]_D;

(4) If P = (P₁ OPTIONAL P₂), then [[P]]_D = ([[P₁]]_D ⋈ [[P₂]]_D) ∪_bag ([[P₁]]_D \ [[P₂]]_D);

(5) If P = (P₁ FILTER C), then [[P]]_D = {μ | μ ∈ [[P₁]]_D ∧ μ(C)}, that is, the value of μ(C) is true, where μ(C) denotes C with all of its variables replaced according to μ.
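The operations on bags of mappings and the resulting AND, UNION, OPTIONAL and FILTER semantics can be sketched in Python as follows. This is a minimal illustration that assumes the standard SPARQL algebra, represents a mapping as a dict and a bag as a list (so duplicates are kept), and uses function names chosen only for this example.

```python
def compatible(m1, m2):
    # Two mappings are compatible if they agree on every shared variable.
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def join(omega1, omega2):
    # Natural join: merge every compatible pair of mappings (AND semantics).
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]


def bag_union(omega1, omega2):
    # Bag union keeps duplicates (UNION semantics).
    return omega1 + omega2


def difference(omega1, omega2):
    # Keep the mappings of omega1 that are compatible with nothing in omega2.
    return [m1 for m1 in omega1 if not any(compatible(m1, m2) for m2 in omega2)]


def optional(omega1, omega2):
    # OPTIONAL: joined results plus the left mappings that found no partner.
    return bag_union(join(omega1, omega2), difference(omega1, omega2))


def filter_mappings(omega, condition):
    # FILTER: keep the mappings for which the condition evaluates to true.
    return [m for m in omega if condition(m)]


left = [{"?x": "a"}, {"?x": "b"}]
right = [{"?x": "a", "?y": "1"}]
result = optional(left, right)
```

In this example the mapping for ?x = a is extended with ?y, while the mapping for ?x = b is retained unchanged, which is the behaviour described for OPTIONAL above.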

The object of the present invention is to provide an optimization algorithm for generation of SPARQL query execution plans containing UNION, OPTIONAL and FILTER expressions in graph databases, so as to solve the problem of low efficiency of such queries in existing graph database systems.

A method for querying data provided by the present application will be described in detail below in conjunction with an embodiment:

FIG. 1 is a flow chart of a method for querying data provided by an embodiment of the present application. The method can be realized by the computer device mentioned above. Referring to Fig. 1, this embodiment includes:

Step 101, receiving a data query instruction sent by a data query application program.

Here, the data query application program can be used to query graph data stored in a database, and the graph data can be stored in the form of an RDF dataset. The graph data can describe, for example, the equity relationships between different companies: the nodes in the graph data can include the name, scale and establishment time of a company, and the edges can represent relationships between companies, such as shareholding and share quota. For example, an RDF triple in the RDF dataset can be <http://example.com/TX> <http://example.com/name> "Beijing TX Computer System Co., Ltd.", which indicates that the name of the company TX is Beijing TX Computer System Co., Ltd.

The data query application program can run in the computer equipment, and the user can input data query statements in the data query application program according to business requirements, and can trigger a data query instruction after inputting the data query statement. After receiving the data query instruction, the processor in the computer device may perform the following processing according to the data query statement carried by the user in the data query instruction.

Step 102. Based on the structure of the data query statement, a first query tree corresponding to the data query statement is established.

After receiving the data query statement, a query tree (also referred to as a BE tree) corresponding to the data query statement may be established according to the structure of the data query statement and each query word.

The types of nodes in the query tree may include merge nodes and query nodes. A merge node may represent the data query statement or a subquery statement in the data query statement; both the data query statement and the subquery statements are graph patterns. When a merge node represents the data query statement, it is the root node of the first query tree. Query nodes represent the query words that appear in the data query statement or its subquery statements, such as BGP, UNION, OPTIONAL and FILTER. A query node that represents a BGP can be called a BGP node, a query node that represents UNION can be called a UNION node, a query node that represents OPTIONAL can be called an OPTIONAL node, and a query node that represents FILTER can be called a FILTER node.

In this application, the query tree established directly from the data query statement may be called the first query tree. It should be understood that establishing the first query tree amounts to representing the data query statement and each subquery statement by merge nodes and representing the query words in the data query statement by query nodes, thereby building a tree whose structure mirrors the relationships between the subquery statements and the query words in the data query statement.

For example, the structure of the data query statement is ((b₁ AND (b₂ UNION b₃)) OPTIONAL (b₄ UNION b₅)) FILTER c₁, where b₁ to b₅ and c₁ are BGPs, which can be represented as BGP nodes. (b₂ UNION b₃), (b₁ AND (b₂ UNION b₃)), (b₄ UNION b₅) and (b₁ AND (b₂ UNION b₃)) OPTIONAL (b₄ UNION b₅) are different subquery statements, which can be represented by merge nodes. The first query tree corresponding to the data query statement can be as shown in Figure 2.
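One possible in-memory representation of such a query tree is sketched below in Python. The `Node` class, the helper functions and the exact tree shape are assumptions made for illustration and are not taken from Figure 2; in particular, c1 is attached to a FILTER node here, and each operand of UNION or OPTIONAL is wrapped in its own merge node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                       # "MERGE", "BGP", "UNION", "OPTIONAL" or "FILTER"
    label: str = ""                 # e.g. "b1" for a BGP node, "c1" for a FILTER node
    children: list = field(default_factory=list)


def bgp(label):
    return Node("BGP", label)


def merge(*children):
    return Node("MERGE", children=list(children))


# ((b1 AND (b2 UNION b3)) OPTIONAL (b4 UNION b5)) FILTER c1
root = merge(
    bgp("b1"),
    Node("UNION", children=[merge(bgp("b2")), merge(bgp("b3"))]),
    Node("OPTIONAL", children=[
        merge(Node("UNION", children=[merge(bgp("b4")), merge(bgp("b5"))])),
    ]),
    Node("FILTER", "c1"),
)


def count(node, kind):
    """Count the nodes of the given kind in the subtree rooted at node."""
    return (node.kind == kind) + sum(count(child, kind) for child in node.children)
```

Under these assumptions the tree has one root merge node, five BGP nodes and two UNION nodes; `count(root, "BGP")` can be used to check the structure.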

Step 103, based on the types of each node in the first query tree, simplify the first query tree to obtain a second query tree.

After obtaining the first query tree corresponding to the query statement, the first query tree can be simplified, and then the query operation corresponding to the data query statement can be executed according to the simplified query tree, which can simplify the query operation corresponding to the data query statement, Improve the efficiency of data query.

It should be understood that simplifying the first query tree means merging, converting and deleting nodes according to the query logic corresponding to the nodes in the first query tree, without changing the query result corresponding to the data query statement. The simplification applied depends on the types of the nodes in the first query tree and is not described in detail here. The query tree obtained after simplifying the first query tree may be called the second query tree.

Step 104, based on the preset execution order, sequentially execute the query operations corresponding to the nodes in the second query tree in the graph database to obtain the data query results.

After the second query tree is obtained, the query operations corresponding to the nodes can be executed in sequence according to the depth of each node in the second query tree, until the query result of the node with the largest depth (i.e., the root node) is obtained. This query result is the query result of the data query statement carried in the data query instruction.

It should be noted that, for each query node with the same depth, the BGP node may be executed first, then the corresponding UNION node or OPTIONAL node, and finally the FILTER node may be executed.
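The ordering rule above can be sketched in Python as a sort key. This is a minimal illustration that assumes nodes are processed from the smallest depth to the largest, so that the root node is executed last; the tuple layout and the names `TYPE_RANK` and `execution_order` are hypothetical.

```python
# Rank of each query node type among nodes of the same depth:
# BGP first, then UNION or OPTIONAL, and FILTER last.
TYPE_RANK = {"BGP": 0, "UNION": 1, "OPTIONAL": 1, "FILTER": 2}


def execution_order(nodes):
    """Order (name, kind, depth) tuples by depth, then by node type rank."""
    return sorted(nodes, key=lambda node: (node[2], TYPE_RANK[node[1]]))


nodes = [
    ("f1", "FILTER", 1),
    ("u1", "UNION", 1),
    ("b2", "BGP", 1),
    ("b1", "BGP", 0),
]
ordered = [name for name, _, _ in execution_order(nodes)]
```

With this input the sketch yields the order b1, b2, u1, f1.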

Step 105, returning the data query result to the data query application program.

After the data query result is obtained, it can be sent to the data query application program, which can display it in a query result display interface for the user to view.

For example, if the data query statement includes the triple pattern ?x <http://example.com/Shareholding> "Beijing TX Computer System Co., Ltd.", then after the above processing, all individuals, companies and so on that hold shares of Beijing TX Computer System Co., Ltd. can be displayed in the query result display interface.

The simplification of the first query tree in step 103 is described in detail below. The processing differs according to the types of the nodes in the first query tree, as follows:

Figure 3 shows a simplification method provided by the present application, which includes:

Step 301. Determine the first query tree as the third query tree to be simplified, and determine the depth of each node in the third query tree.

The depth of a leaf node a in the third query tree is d(a) = 0. The depth of any other node b is d(b) = max{d(b_i) | b_i is a child node of b} + 1.
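The depth definition above translates directly into a short recursive function. The following Python sketch assumes that a node is a dict with an optional "children" list; this representation is chosen only for the example.

```python
def depth(node):
    """Depth as defined above: 0 for a leaf, otherwise 1 + the largest child depth."""
    children = node.get("children", [])
    if not children:
        return 0
    return max(depth(child) for child in children) + 1


tree = {
    "kind": "MERGE",
    "children": [
        {"kind": "BGP"},
        {"kind": "UNION", "children": [
            {"kind": "MERGE", "children": [{"kind": "BGP"}]},
            {"kind": "MERGE", "children": [{"kind": "BGP"}]},
        ]},
    ],
}
root_depth = depth(tree)
```

Note that depth is measured from the leaves upward, so in this example the BGP leaves have depth 0, the inner merge nodes depth 1, the UNION node depth 2 and the root depth 3.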

Step 302: for a first merge node with a depth of 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merge the multiple BGP nodes to obtain a merged first BGP node, delete the first merge node, and add the first BGP node at the position of the first merge node.

If the third query tree contains a first merge node with a depth of 1 whose child nodes are multiple BGP nodes, the multiple BGP nodes can be merged into one BGP node, which is the first BGP node. If the mappings corresponding to the BGP nodes P₁ to Pₙ before merging are [[P₁]]_D to [[Pₙ]]_D respectively, where n is the number of BGP nodes before merging, then the mapping corresponding to the merged BGP node P_m is [[P_m]]_D = [[P₁]]_D ⋈ [[P₂]]_D ⋈ … ⋈ [[Pₙ]]_D, where ⋈ denotes the natural join of compatible mappings.

After the first BGP node is obtained, the first merge node with a depth of 1 can be deleted and the first BGP node can be added at its position; that is, the first BGP node replaces the first merge node. Figure 4 is a schematic diagram of step 302. Replacing the first merge node with a depth of 1 by the corresponding first BGP node reduces the depth of the tree and the data volume of intermediate results, which improves data query efficiency.
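The replacement performed in step 302 can be sketched in Python as follows. This is a minimal illustration that assumes a node is a dict, that a BGP node stores its triple patterns in a "patterns" list, and that merging BGP nodes means concatenating their triple patterns; the function name is hypothetical, and FILTER children are not handled here.

```python
def simplify_depth1_merge(node):
    """Replace a merge node whose children are all BGP nodes by one merged BGP node."""
    children = node.get("children", [])
    if node["kind"] == "MERGE" and children and all(c["kind"] == "BGP" for c in children):
        # P1 AND P2 is again a BGP, so the triple patterns are simply combined.
        patterns = [tp for child in children for tp in child["patterns"]]
        return {"kind": "BGP", "patterns": patterns}
    # Any other node is left unchanged by this sketch.
    return node


merge_node = {
    "kind": "MERGE",
    "children": [
        {"kind": "BGP", "patterns": ["t1", "t2"]},
        {"kind": "BGP", "patterns": ["t3"]},
    ],
}
first_bgp = simplify_depth1_merge(merge_node)
```

The caller would put the returned node at the position of the original merge node, which lowers the depth of that subtree from 1 to 0.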

No processing is performed for UNION nodes with a depth of 1. If the child nodes of the first merge node include a FILTER node in addition to multiple BGP nodes, then after the first BGP node is obtained by merging, the FILTER node can be kept as a sibling node of the first BGP node when the first merge node is replaced, and the scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the first BGP node is recorded. Before each FILTER node is executed, the nodes included in its scope can be determined from the recorded correspondence, and the query operation corresponding to the FILTER node is executed on the query results of those nodes.

Step 303: for a second merge node with a depth of 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merge the at least one BGP node to obtain a merged second BGP node, and merge the at least one UNION node to obtain a merged third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, delete the second merge node, and add the fourth UNION node at the position of the second merge node.

If the third query tree contains a second merge node with a depth of 2 whose child nodes include multiple BGP nodes, the BGP nodes can be merged to obtain the second BGP node. For this processing, refer to the processing for obtaining the first BGP node in step 302 above, which is not repeated here.

If the child nodes of the second merge node also include multiple UNION nodes, the UNION nodes can be merged to obtain the third UNION node. For example, suppose two UNION nodes u1 and u2 need to be merged, each UNION node has two child nodes (all BGP nodes), and the mappings of these child nodes are p1 and p2 for u1 and q1 and q2 for u2. The merged third UNION node then has four child nodes, whose mappings are p1 ⋈ q1, p1 ⋈ q2, p2 ⋈ q1 and p2 ⋈ q2, where ⋈ denotes the natural join.

If the child nodes of the second merge node include m UNION nodes, where the i-th UNION node $u_i$ has $n_i$ child nodes (BGP nodes), then the third UNION node finally obtained by merging has

$$N = \prod_{i=1}^{m} n_i$$

child nodes. The N different mapping sets can be determined from the BGP nodes of the m UNION nodes: each mapping set contains m BGP node mappings, and the BGP nodes to which the mappings in one set belong are child nodes of different UNION nodes. The mapping of each child node of the third UNION node is obtained by performing a natural join on the mappings in one mapping set.

After the BGP nodes and the UNION nodes have been merged separately, the second BGP node can be merged into the child nodes of the third UNION node u3. For example, suppose the mapping of the second BGP node is px, and the third UNION node has four child nodes whose mappings are $p_1 \bowtie q_1$, $p_1 \bowtie q_2$, $p_2 \bowtie q_1$ and $p_2 \bowtie q_2$.

After the second BGP node is merged into the child nodes of the third UNION node, the resulting fourth UNION node u4 still has four child nodes, whose mappings are $p_x \bowtie p_1 \bowtie q_1$, $p_x \bowtie p_1 \bowtie q_2$, $p_x \bowtie p_2 \bowtie q_1$ and $p_x \bowtie p_2 \bowtie q_2$.

After the fourth UNION node is obtained, the second merge node with a depth of 2 can be deleted and the fourth UNION node added at its original position, that is, the fourth UNION node replaces the original second merge node with a depth of 2. Figure 5 is a schematic diagram of step 303. Replacing the second merge node with a depth of 2 by the corresponding fourth UNION node reduces the depth of the nodes and the amount of data in intermediate results, which can improve the efficiency of querying data.
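The merging in step 303 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a BGP is modelled as a list of triple-pattern strings, a UNION node as a list of BGPs, and "merging" BGPs is taken to mean concatenating their triple patterns, so that the mapping of the merged BGP is the natural join of the original mappings. The function names `merge_bgps`, `merge_unions` and `push_bgp_into_union` are hypothetical and do not come from the application.

```python
from itertools import product

def merge_bgps(bgps):
    """Merge several BGPs (lists of triple patterns) into one BGP."""
    merged = []
    for bgp in bgps:
        for tp in bgp:
            if tp not in merged:        # keep each triple pattern once
                merged.append(tp)
    return merged

def merge_unions(unions):
    """Merge m UNION nodes into one UNION node with n_1 * ... * n_m children."""
    # Each combination takes exactly one child from every UNION node.
    return [merge_bgps(combo) for combo in product(*unions)]

def push_bgp_into_union(bgp, union):
    """Merge a BGP into every child of a UNION node (child count is unchanged)."""
    return [merge_bgps([bgp, child]) for child in union]

# Two UNION nodes with two BGP children each (p1, p2 and q1, q2 in the text).
u1 = [["?x a :P1"], ["?x a :P2"]]
u2 = [["?x :q ?y1"], ["?x :q ?y2"]]
u3 = merge_unions([u1, u2])                      # third UNION node
u4 = push_bgp_into_union(["?x :name ?n"], u3)    # fourth UNION node
```

In this sketch `u3` and `u4` both have four children, matching the example in the text, and the merge node that used to hold the BGP and the UNION nodes is no longer needed.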

It should be noted that if the child nodes of the merge node with a depth of 2 include only UNION nodes, the UNION node can directly replace the original merge node with a depth of 2 to reduce the node depth. If the child nodes of the merge node with a depth of 2 include only one BGP node and one UNION node, the BGP node can be merged directly into the child nodes of the UNION node to reduce the node depth. If the child nodes of the merge node with a depth of 2 include one BGP node and multiple UNION nodes, the UNION nodes can be merged first, and the BGP node is then merged into the child nodes of the merged UNION node to reduce the node depth. If the child nodes of the merge node with a depth of 2 include multiple BGP nodes and one UNION node, the BGP nodes can be merged first, and the merged BGP node is then merged into the child nodes of the UNION node to reduce the node depth.

In addition, if the child nodes of the second merge node also include a FILTER node, then after the fourth UNION node is obtained by merging, the FILTER node can be kept as a sibling node of the fourth UNION node, the two together replacing the second merge node, and the scope of the FILTER node is recorded, that is, the correspondence between the FILTER node and the fourth UNION node is recorded. Before each FILTER node is executed, the nodes included in the scope of the current FILTER node can be determined from the correspondence recorded for it, and the query operation corresponding to the FILTER node is then executed on the query results of the nodes included in that scope.

Step 304, for a fifth UNION node with a depth of 2 in the third query tree, add the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and delete the original grandchild nodes of the fifth UNION node and the parent nodes of those grandchild nodes, to obtain the simplified third query tree.

If the third query tree contains a fifth UNION node with a depth of 2 whose child nodes themselves include UNION nodes, the child nodes of those UNION nodes can be merged into the child nodes of the fifth UNION node, and the original grandchild nodes of the fifth UNION node and their parent nodes are deleted, yielding the simplified query tree.

Fig. 6 is a schematic diagram of step 304. After the grandchild nodes of the fifth UNION node with a depth of 2 are lifted and their original parent nodes deleted, the node depth is reduced, the data volume of intermediate results is reduced, and the efficiency of querying data can be improved.
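Step 304 amounts to flattening a UNION nested directly under another UNION. The following Python sketch illustrates this on a toy tree; the `Node` class and the function `flatten_union` are hypothetical names introduced only for this example, and the actual node structure used by the application is not specified here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                  # e.g. "UNION" or "BGP"
    label: str = ""
    children: list = field(default_factory=list)

def flatten_union(union):
    """Lift the children of nested UNION nodes up into `union` itself."""
    new_children = []
    for child in union.children:
        if child.kind == "UNION":
            # Grandchildren become children; the intermediate UNION disappears.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    union.children = new_children
    return union

b1, b2, b3, b4 = (Node("BGP", f"b{i}") for i in range(1, 5))
fifth = Node("UNION", "u5", [Node("UNION", "ua", [b1, b2]),
                             Node("UNION", "ub", [b3]),
                             b4])
flatten_union(fifth)
labels = [c.label for c in fifth.children]
```

After the call, the fifth UNION node has the four BGP nodes as direct children, so its depth drops from 2 to 1.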

It should be noted that steps 302 to 304 above merely describe different kinds of processing and do not imply any particular execution order. After the processing of steps 302 to 304, a simplified query tree is obtained.

Step 305, determine the depth of each node in the simplified query tree.

Step 306, if the simplified query tree still contains query nodes that meet the simplification requirements of steps 302 to 304, determine the simplified query tree as the third query tree to be simplified, and jump to the corresponding step to continue the simplification, until the simplified query tree contains no node that can be simplified.

Since the processing of steps 302 to 304 above changes the depth of the nodes in the query tree, the depth of each node can be determined again after the simplified query tree is obtained. If the simplified query tree still contains query nodes that can be simplified by steps 302 to 304, the corresponding simplification processing can continue to be performed on those query nodes, further reducing the depth of the nodes in the query tree, until it is determined that the simplified query tree contains no node that can be simplified.

In this way, after the cyclic processing of the above-mentioned multiple steps, the depth of the query tree is significantly reduced, the data volume of intermediate query results can be reduced, and the efficiency of querying data can be improved.
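The loop formed by steps 305 and 306 can be sketched as a fixpoint iteration. The example below is a simplified illustration, not the application's implementation: it assumes that the depth of a node is measured upward from the leaves (a node whose children are all leaves has depth 1), and it applies only a UNION-lifting rule in the style of step 304. The names `Node`, `depth`, `lift_union_pass` and `simplify` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    label: str = ""
    children: list = field(default_factory=list)

def depth(node):
    """Depth measured from the leaves: a leaf has depth 0 (an assumption)."""
    return 0 if not node.children else 1 + max(depth(c) for c in node.children)

def nodes(root):
    yield root
    for child in root.children:
        yield from nodes(child)

def lift_union_pass(root):
    """One round: lift grandchildren of every depth-2 UNION; report any change."""
    # Depths are evaluated before the tree is modified in this round.
    targets = [n for n in nodes(root)
               if n.kind == "UNION" and depth(n) == 2
               and any(c.kind == "UNION" for c in n.children)]
    for n in targets:
        lifted = []
        for c in n.children:
            lifted.extend(c.children if c.kind == "UNION" else [c])
        n.children = lifted
    return bool(targets)

def simplify(root, rules):
    """Re-apply all rules until a full round changes nothing; return the round count."""
    rounds = 0
    while any([rule(root) for rule in rules]):   # the list runs every rule each round
        rounds += 1
    return rounds

b = [Node("BGP", f"b{i}") for i in range(1, 5)]
u2 = Node("UNION", "u2", [b[0], b[1]])
u1 = Node("UNION", "u1", [u2, b[2]])
root = Node("UNION", "u0", [u1, b[3]])
rounds = simplify(root, [lift_union_pass])
```

Here the root starts at depth 3, so it only qualifies for lifting after the first round has flattened `u1`; two rounds are needed before no further simplification applies.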

Figure 7 shows a simplification processing method provided by the present application, which includes:

Step 701. Determine, in the first query tree, a first OPTIONAL node, that is, an OPTIONAL node none of whose ancestor nodes is an OPTIONAL node.

Step 702: Transform the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.

The processing in FIG. 7 may be combined with the processing in FIG. 3 above, that is, step 701 may be executed before step 301. If it is determined in step 701 that a first OPTIONAL node exists in the first query tree, the subquery tree rooted at the parent node of the first OPTIONAL node can be treated as a BGP node, namely the third BGP node: in the first query tree, the subquery tree rooted at the parent node of the first OPTIONAL node is deleted, and the third BGP node is added at the original position of that parent node. As shown in Figure 8, the subquery tree rooted at merge node 2 can be converted into a BGP3 node.

After the processing in step 702 is completed, the processing in FIG. 3 can be performed. After the processing in FIG. 3 is completed, the query operations can be performed on the nodes of the resulting query tree. When the query operation involving the third BGP node is executed, either of the following two processing methods may be used:

Processing method one:

When the first query operation corresponding to the third BGP node is executed, the subquery tree corresponding to the third BGP node can be determined first, namely the subquery tree rooted at the parent node of the first OPTIONAL node in step 702 above.

When the query operations corresponding to the nodes of the subquery tree are executed, the query operation corresponding to the sibling node of the first OPTIONAL node can be executed first to obtain the first query result. The first query result, that is, the subgraph retrieved from the graph data, is then used as the data query scope of the descendant nodes of the first OPTIONAL node, and the query operations corresponding to those descendant nodes are executed within it. This reduces the amount of data queried when executing the query operations of the descendant nodes of the first OPTIONAL node and can improve the execution efficiency of the query operations.
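Processing method one can be illustrated with a toy evaluator. The sketch below is an assumption-laden simplification: the graph is a small in-memory list of triples, the sibling of the OPTIONAL node is a single BGP, and "using the first query result as the data query scope" is approximated by seeding the OPTIONAL patterns with each binding of the first result instead of matching them against the whole graph independently. The helper names `match` and `eval_with_optional` are hypothetical.

```python
def is_var(term):
    return term.startswith("?")

def match(patterns, triples, binding):
    """Yield every extension of `binding` that satisfies all triple patterns."""
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for p, v in zip(first, triple):
            if is_var(p):
                if new.setdefault(p, v) != v:   # variable already bound elsewhere
                    ok = False
                    break
            elif p != v:                        # constant mismatch
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)

def eval_with_optional(required, optional, triples):
    """Evaluate the sibling BGP first, then the OPTIONAL part inside its results."""
    results = []
    for base in match(required, triples, {}):            # first query result
        extended = list(match(optional, triples, base))  # scoped by `base`
        results.extend(extended if extended else [base]) # keep unmatched rows
    return results

triples = [("alice", "knows", "bob"), ("carol", "knows", "dave"),
           ("bob", "email", "b@x.org"), ("erin", "email", "e@x.org")]
results = eval_with_optional([("?a", "knows", "?b")],
                             [("?b", "email", "?e")], triples)
```

The e-mail triple for `erin` is never joined, because the OPTIONAL part is only evaluated for values of `?b` that the sibling BGP has already produced.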

Processing method two:

After determining the sub-query tree corresponding to the third BGP node, it may be determined whether the descendant nodes of the first OPTIONAL node also include the second OPTIONAL node.

If it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of each second OPTIONAL node may be executed sequentially, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node.

Wherein, the data query range of the sibling nodes corresponding to each second OPTIONAL node is the query result of the sibling nodes corresponding to the previous OPTIONAL node.

In this way, when the query operation of the sibling node corresponding to each OPTIONAL node is executed, it can be performed on the basis of the query result of the sibling node corresponding to the previous OPTIONAL node, which reduces the amount of data queried by each query operation and thereby improves query efficiency.

After the query results of the sibling nodes corresponding to each OPTIONAL node are obtained, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node can be executed sequentially, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node.

The data query scope of the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node. In this way, when the query operations of the child nodes of each OPTIONAL node are executed, they can be performed on the basis of the query results of its sibling node, which reduces the amount of data queried by each query operation and thereby improves query efficiency.

Figure 9 shows a simplification processing method provided by the present application, which includes:

Step 901, for the FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, then convert the FILTER condition into a disjunctive normal form.

Before step 301 above is performed, if it is determined that a FILTER node exists in the first query tree, it may first be determined whether the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition. The conversion condition is that the FILTER condition corresponding to the FILTER node consists only of variables, constants, and the three operators and, or, and equal.

Step 902: Convert the FILTER node into a UNION node based on the disjunctive normal form.

If it is determined that the FILTER condition corresponding to the FILTER node satisfies the preset conversion condition, the FILTER condition can be converted into the disjunctive normal form f1||f2||...||fm, where each fi is a constraint on variables and can be regarded as a correspondence between a variable and its constraint value. After the disjunctive normal form f1||f2||...||fm is obtained, if it is determined that any two fi and fj are incompatible, that is, no mapping satisfies fi∧fj for any pair, then for each fi in turn the variables of the query nodes in the first query tree that are constrained by the FILTER condition are assigned their constraint values with a BIND clause, and the FILTER node is thereby converted into a UNION node. The specific processing is as follows:

A UNION node can be added as a sibling of the FILTER node, and m child nodes, all of them merge nodes, are added to this UNION node (m is the number of constraints in the disjunctive normal form). For each constraint fi, the variables in the query nodes constrained by the FILTER condition (that is, the sibling nodes of the FILTER node other than the newly added UNION node) are assigned the constraint values in fi with a BIND clause, and the resulting nodes are attached as child nodes under the i-th child node of the newly added UNION node. Finally, all nodes in this layer other than the newly added UNION node are deleted. After step 902 is completed, the processing proceeds to step 301.

In this way, the FILTER node can be converted into a UNION node to participate in the simplified processing in Figure 3, which can simplify the query tree to a certain extent, thereby improving the efficiency of data query.
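The conversion of steps 901 and 902 can be sketched as follows. This is a minimal illustration under several assumptions: the FILTER condition is given as nested tuples using only `and`, `or` and `eq`; each equality compares a variable with a constant; and the BIND assignment is modelled by substituting the constant for the variable in the sibling triple patterns. The sketch does not check the requirement that the disjuncts be pairwise incompatible, and the names `to_dnf`, `conjunction_to_binding` and `filter_to_union` are hypothetical.

```python
def to_dnf(expr):
    """Return a list of conjunctions, each a list of ('eq', var, const) atoms."""
    op = expr[0]
    if op == "eq":
        return [[expr]]
    left, right = to_dnf(expr[1]), to_dnf(expr[2])
    if op == "or":
        return left + right
    if op == "and":                          # distribute AND over OR
        return [l + r for l in left for r in right]
    raise ValueError(f"unsupported operator: {op}")

def conjunction_to_binding(conj):
    """Turn a conjunction of equalities into {var: const}; None if contradictory."""
    binding = {}
    for _, var, const in conj:
        if binding.setdefault(var, const) != const:
            return None
    return binding

def filter_to_union(bgp, condition):
    """Return one bound copy of `bgp` per satisfiable disjunct (the UNION children)."""
    branches = []
    for conj in to_dnf(condition):
        binding = conjunction_to_binding(conj)
        if binding is None:                  # e.g. ?t = :A and ?t = :B
            continue
        branches.append([tuple(binding.get(term, term) for term in tp)
                         for tp in bgp])
    return branches

bgp = [("?x", ":type", "?t"), ("?x", ":name", "?n")]
cond = ("and",
        ("or", ("eq", "?t", ":A"), ("eq", "?t", ":B")),
        ("eq", "?n", '"Ann"'))
branches = filter_to_union(bgp, cond)
```

The condition above has the disjunctive normal form `(?t=:A and ?n="Ann") || (?t=:B and ?n="Ann")`, so the FILTER is replaced by a UNION with two children in which `?t` and `?n` are already bound.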

Figure 10 shows a simplification processing method provided by the present application, which includes:

Step 1001, if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes.

Two triple patterns t1=<s1, p1, o1> and t2=<s2, p2, o2> are said to be equivalent (denoted as $t_1 \cong t_2$) if and only if the following conditions hold: 1. s1 and s2 are variables; 2. p1 and p2 are the same predicate; 3. o1 and o2 are both variables, or are the same constant.

If $t_1 \cong t_2$, then μ(t1,t2) is a bijection from Var(t1) to Var(t2), where Var(t1) denotes the set of variables appearing in t1.

Given two BGPs bi and bj whose triple pattern sequences are Si=(ti1,...,tik) and Sj=(tj1,...,tjk'), Si and Sj are said to be equivalent (denoted as $S_i \cong S_j$) if and only if the following conditions hold: 1. the sequences Si and Sj have the same length, that is, k=k'; 2. $t_{i1} \cong t_{j1}, \ldots, t_{ik} \cong t_{jk'}$ hold simultaneously; 3. μ1(ti1,tj1)∪...∪μk(tik,tjk') is still a bijection from Var(Si) to Var(Sj), where Var(Si) denotes the set of variables appearing in Si.
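The equivalence test defined above can be checked mechanically. The following Python sketch follows the three conditions for triple patterns and the three conditions for sequences; it assumes that terms are strings, that variables start with `?`, and that predicates are compared by plain equality. The function names `triple_mapping`, `merge_bijection` and `bgp_mapping` are hypothetical.

```python
def is_var(term):
    return term.startswith("?")

def merge_bijection(mapping, pairs):
    """Add variable pairs to `mapping`; return None if it stops being one-to-one."""
    merged = dict(mapping)
    for a, b in pairs:
        if merged.setdefault(a, b) != b:          # a would map to two targets
            return None
    if len(set(merged.values())) != len(merged):  # two sources share a target
        return None
    return merged

def triple_mapping(t1, t2):
    """Return mu(t1, t2) if the triple patterns are equivalent, else None."""
    (s1, p1, o1), (s2, p2, o2) = t1, t2
    if not (is_var(s1) and is_var(s2)):           # condition 1
        return None
    if p1 != p2:                                  # condition 2
        return None
    if is_var(o1) and is_var(o2):                 # condition 3: both variables
        pairs = [(s1, s2), (o1, o2)]
    elif not is_var(o1) and o1 == o2:             # condition 3: same constant
        pairs = [(s1, s2)]
    else:
        return None
    return merge_bijection({}, pairs)

def bgp_mapping(si, sj):
    """Return the combined bijection if the two sequences are equivalent, else None."""
    if len(si) != len(sj):                        # sequence condition 1
        return None
    mapping = {}
    for t1, t2 in zip(si, sj):
        mu = triple_mapping(t1, t2)               # sequence condition 2
        if mu is None:
            return None
        mapping = merge_bijection(mapping, mu.items())   # sequence condition 3
        if mapping is None:
            return None
    return mapping

si = [("?x", ":knows", "?y"), ("?y", ":age", "30")]
sj = [("?a", ":knows", "?b"), ("?b", ":age", "30")]
sk = [("?a", ":knows", "?b"), ("?c", ":age", "30")]
```

With these inputs, `si` and `sj` are equivalent under the renaming `?x` to `?a` and `?y` to `?b`, whereas `si` and `sk` are not, because `?y` would have to map to both `?b` and `?c`.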

Based on the above definitions, if the query graph contains multiple BGPs that can be executed in parallel, a frequent subgraph mining algorithm can be used to find the common subqueries C={c1,...,cn} among the BGPs, where each ci is an equivalent triple pattern subsequence of the BGPs. The common subqueries C are the common triple patterns corresponding to the multiple BGP nodes.

Step 1002, based on a greedy algorithm, determine, among the common triple patterns, the subset of common triple patterns with the lowest query cost.

Selecting common subqueries with high selectivity: given a BGP set B={b1,...,bm} and the set of common subqueries C={c1,...,cn} among them, select a subset of common subqueries $C_S \subseteq C$ that minimizes the cost

$$\mathrm{Cost}(B, C_S) = \sum_{c_i \in C_S} \mathrm{Cost}(c_i) + \sum_{b_j \in B} \mathrm{Cost}(b_j \mid C_S)$$

where $\mathrm{Cost}(B, C_S)$ is the matching cost of the BGP set, $\mathrm{Cost}(c_i)$ is the matching cost of a common subquery $c_i$ selected into the subset $C_S$, and $\mathrm{Cost}(b_j \mid C_S)$ is the matching cost of querying the remaining RDF triples of $b_j$ once the results of the common subqueries selected into $C_S$ are available. The matching cost can be the computing resources consumed, the time taken to query the corresponding triples, or the like. It can be determined from the variables in each triple pattern, and the matching cost corresponding to each variable can be preset by a technician.

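$\mathrm{Cost}(c_i)$ and $\mathrm{Cost}(b_j \mid C_S)$ are defined based on the selectivity sel(t) of a triple pattern t as follows:

$$\mathrm{Cost}(c_i) = \min\{\mathrm{sel}(t) \mid t \in c_i\} \times |c_i|$$

$$\mathrm{Cost}(b_j \mid C_S) = \min\{\mathrm{sel}(t) \mid t \in b'_j\} \times |b'_j|$$

where

$$b'_j = b_j \setminus \bigcup_{c_i \in C_S} c_i$$

that is, the part of $b_j$ in the BGP set B that is not covered by the common subquery subset $C_S$.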

Minimizing the above objective is an NP-hard problem. Therefore, a greedy algorithm is used to select $C_S$ so that the objective is as small as possible. $C_S$ is initialized to the empty set. In each step, a common subquery $c_i \in C$ that maximizes $\Delta = \mathrm{Cost}(B, C_S) - \mathrm{Cost}(B, C_S \cup \{c_i\})$ is selected and added to $C_S$, and the iteration continues until no further common subquery is added to $C_S$. The $C_S$ finally obtained is the subset of common triple patterns with the lowest query cost.
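The greedy selection can be sketched as follows. Several points are assumptions made for the example rather than statements of the application: triple patterns are identified by plain names (variable renaming between BGPs is ignored), a common subquery covers a BGP only if it is a subset of that BGP, an empty remainder costs 0, and the iteration stops as soon as no candidate yields a positive Δ. The names `part_cost`, `total_cost` and `greedy_select` and the selectivity values are hypothetical.

```python
def part_cost(patterns, sel):
    """min{sel(t)} * |patterns|, or 0 for an empty set of patterns."""
    return min(sel[t] for t in patterns) * len(patterns) if patterns else 0

def total_cost(bgps, chosen, sel):
    """Cost(B, C_S): cost of the chosen subqueries plus the uncovered remainders."""
    cost = sum(part_cost(c, sel) for c in chosen)
    for b in bgps:
        covered = set()
        for c in chosen:
            if c <= b:                       # c is contained in this BGP
                covered |= c
        cost += part_cost(b - covered, sel)  # Cost(b_j | C_S) on b'_j
    return cost

def greedy_select(bgps, candidates, sel):
    chosen, remaining = [], list(candidates)
    current = total_cost(bgps, chosen, sel)
    while remaining:
        # The candidate with the lowest resulting cost has the largest delta.
        best = min(remaining, key=lambda c: total_cost(bgps, chosen + [c], sel))
        best_cost = total_cost(bgps, chosen + [best], sel)
        if best_cost >= current:             # no positive delta left: stop
            break
        chosen.append(best)
        remaining.remove(best)
        current = best_cost
    return chosen, current

sel = {"t1": 10, "t2": 10, "t3": 10, "t4": 12}
b1 = frozenset({"t1", "t2", "t3"})
b2 = frozenset({"t1", "t2", "t4"})
c1 = frozenset({"t1", "t2"})
c2 = frozenset({"t2"})
chosen, cost = greedy_select([b1, b2], [c1, c2], sel)
```

Without sharing, the cost is 30 + 30 = 60. Choosing `c1` gives 20 + 10 + 12 = 42, choosing `c2` alone gives 50, and adding `c2` after `c1` raises the cost to 52, so the greedy loop stops with only `c1` selected.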

Step 1003, query the graph data for the data corresponding to the selected subset of common triple patterns.

Step 1004, query the graph data for the data corresponding to the triple patterns of the multiple BGP nodes other than the selected subset of common triple patterns.

When the query operations corresponding to the multiple BGP nodes are executed, matching can first be performed on all selected common subqueries $c_i \in C_S$, and the intermediate results $[[c_i]]$ are cached. When matching is performed on each BGP node $b_j$, the result set can be calculated as follows:

$$[[b_j]] = [[c_{j_1}]] \bowtie \cdots \bowtie [[c_{j_l}]] \bowtie [[b'_j]]$$

where each $c_{j_x} \in C_S$ is a common subquery that is a triple pattern subsequence of $b_j$, and $b'_j$ is defined as the part of $b_j$ not covered by the common subquery subset $C_S$.

In this way, when the query processing corresponding to the multiple BGP nodes is performed, the selected common triple patterns of the multiple BGP nodes can be queried first, and the query processing corresponding to the remaining triple patterns of the multiple BGP nodes is then executed, which can reduce the amount of data queried and improve query speed.
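The reuse of cached results in steps 1003 and 1004 can be sketched with a small natural join over variable bindings. In this illustration a result set is a list of dictionaries from variable names to values, the cache maps a hypothetical subquery identifier to its result set, and the renaming of variables through the bijection between equivalent subsequences is left out for brevity. The names `natural_join` and `eval_bgp` are hypothetical.

```python
def natural_join(left, right):
    """Join two lists of bindings on the variables they have in common."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def eval_bgp(common_ids, rest_result, cache):
    """[[b_j]]: join the cached common-subquery results with [[b'_j]]."""
    result = rest_result
    for cid in common_ids:
        result = natural_join(cache[cid], result)
    return result

# [[c1]] is computed once and cached, then shared by both BGPs.
cache = {"c1": [{"?x": "a", "?y": "b"}, {"?x": "c", "?y": "d"}]}
rest_b1 = [{"?y": "b", "?z": "1"}]                          # [[b'_1]]
rest_b2 = [{"?y": "d", "?w": "2"}, {"?y": "z", "?w": "3"}]  # [[b'_2]]
res_b1 = eval_bgp(["c1"], rest_b1, cache)
res_b2 = eval_bgp(["c1"], rest_b2, cache)
```

If a BGP is covered entirely by common subqueries, `rest_result` can be passed as `[{}]`, the identity for this join.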

All the above optional technical solutions may be combined in any way to form optional embodiments of the present disclosure, which will not be repeated here.

Fig. 11 is a schematic structural diagram of a device for querying data provided by an embodiment of the present application. The device may be the computer device in the above embodiment. Referring to Fig. 11, the device includes:

The receiving module 1110 is configured to receive a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;

Establishing module 1120, configured to establish a first query tree corresponding to the data query statement based on the structure of the data query statement;

A processing module 1130, configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree;

The query module 1140 is configured to sequentially execute the query operations corresponding to the nodes in the second query tree in the graph database based on a preset execution sequence, to obtain data query results;

Returning module 1150, configured to return the data query result to the data query application program.

Optionally, the query module 1140 is configured to: determine the first query tree as a third query tree to be simplified, and determine the depth of each node in the third query tree; for a first merge node with a depth of 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merge the multiple BGP nodes to obtain a merged first BGP node, delete the first merge node, and add the first BGP node at the position of the first merge node; for a second merge node with a depth of 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merge the at least one BGP node to obtain a merged second BGP node, and merge the at least one UNION node to obtain a merged third UNION node; merge the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, delete the second merge node, and add the fourth UNION node at the position of the second merge node; for a fifth UNION node with a depth of 2 in the third query tree, add the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and delete the original grandchild nodes of the fifth UNION node and the parent nodes of those grandchild nodes, to obtain the simplified third query tree.

Optionally, the processing module 1130 is further configured to: determine, in the first query tree, a first OPTIONAL node, that is, an OPTIONAL node none of whose ancestor nodes is an OPTIONAL node; and transform the subquery tree rooted at the parent node of the first OPTIONAL node into a third BGP node.

Optionally, the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, determine the subquery tree corresponding to the third BGP node; execute the query operation corresponding to the sibling node of the first OPTIONAL node in the subquery tree to obtain a first query result; determine the first query result as the data query scope of the descendant nodes of the first OPTIONAL node; and execute, within the data query scope, the query operations corresponding to the descendant nodes of the first OPTIONAL node.

Optionally, the query module 1140 is configured to: when executing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node, sequentially execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the first sibling node of the first OPTIONAL node and the second sibling node of the at least one second OPTIONAL node, wherein the data query scope corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the sibling node of the previous OPTIONAL node; and sequentially execute, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, wherein the data query scope of the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.

Optionally, the processing module 1130 is configured to: for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, convert the FILTER condition into a disjunctive normal form; and convert the FILTER node into a UNION node based on the disjunctive normal form. Optionally, the conversion condition is that the FILTER condition corresponding to the FILTER node consists of variables, constants, and the three operators and, or, and equal.

Optionally, the query module 1140 is configured to: if there are multiple BGP nodes that can be executed in parallel, determine the common triple patterns corresponding to the multiple BGP nodes; determine, based on a greedy algorithm, the subset of the common triple patterns with the lowest query cost; query the graph data for the data corresponding to the selected subset of common triple patterns; and query the graph data for the data corresponding to the triple patterns of the multiple BGP nodes other than the selected subset of common triple patterns.

It should be noted that: when the device for querying data provided by the above-mentioned embodiments queries data, it only uses the division of the above-mentioned functional modules for illustration. In practical applications, the above-mentioned function allocation can be completed by different functional modules according to needs. That is, the internal structure of the computer device is divided into different functional modules to complete all or part of the functions described above. In addition, the device for querying data provided by the above embodiment and the method embodiment for querying data belong to the same idea, and its specific implementation process is detailed in the method embodiment, and will not be repeated here.

Fig. 12 shows a structural block diagram of a computer device 1200 provided by an exemplary embodiment of the present application. The computer device 1200 can be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (moving picture experts group audio layer III) player, an MP4 (moving picture experts group audio layer IV) player, a laptop computer or a desktop computer. The computer device 1200 may also be called user equipment, portable terminal, laptop terminal, desktop terminal, or other names.

Generally, a computer device 1200 includes a processor 1201 and a memory 1202. The processor 1201 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1201 can be implemented in at least one hardware form among DSP (digital signal processing), FPGA (field-programmable gate array) and PLA (programmable logic array). The processor 1201 may also include a main processor and a coprocessor: the main processor, also called a CPU (central processing unit), processes data in the wake-up state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1201 may be integrated with a GPU (graphics processing unit), which is used for rendering and drawing the content that needs to be displayed on the display screen. In some embodiments, the processor 1201 may further include an AI (artificial intelligence) processor, which is used to process computing operations related to machine learning.

Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one instruction, and the at least one instruction is executed by the processor 1201 to implement the method for querying data provided by the method embodiments of this application.

In some embodiments, the computer device 1200 may optionally further include: a peripheral device interface 1203 and at least one peripheral device. The processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected through buses or signal lines. Each peripheral device can be connected to the peripheral device interface 1203 through a bus, a signal line or a circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 1204 , a display screen 1205 , a camera component 1206 , an audio circuit 1207 , a positioning component 1208 and a power supply 1209 .

The peripheral device interface 1203 may be used to connect at least one peripheral device related to I/O (input/output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, memory 1202 and peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, memory 1202 and peripheral device interface 1203 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The radio frequency circuit 1204 is used to receive and transmit RF (radio frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with the communication network and other communication devices through electromagnetic signals. The radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 1204 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to: World Wide Web, metropolitan area network, intranet, the various generations of mobile communication networks (2G, 3G, 4G and 5G), wireless local area network and/or WiFi (wireless fidelity) network. In some embodiments, the radio frequency circuit 1204 may also include circuits related to NFC (near field communication), which is not limited in this application.

The display screen 1205 is used to display a UI (user interface). The UI can include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to collect touch signals on or above its surface. The touch signal can be input to the processor 1201 as a control signal for processing. In this case, the display screen 1205 can also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards. In some embodiments, there may be one display screen 1205, arranged on the front panel of the computer device 1200; in other embodiments, there may be at least two display screens 1205, arranged on different surfaces of the computer device 1200 or in a folded design; in some other embodiments, the display screen 1205 may be a flexible display screen arranged on a curved surface or a folded surface of the computer device 1200. The display screen 1205 can even be set as a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 1205 can be made of materials such as LCD (liquid crystal display) and OLED (organic light-emitting diode).

The camera assembly 1206 is used to capture images or videos. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Usually, the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blur function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (virtual reality) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 1206 may also include a flash. The flash can be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, which can be used for light compensation under different color temperatures.

Audio circuitry 1207 may include a microphone and speakers. The microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 1201 for processing, or input them to the radio frequency circuit 1204 to realize voice communication. For the purpose of stereo acquisition or noise reduction, there may be multiple microphones, which are respectively arranged at different parts of the computer device 1200 . The microphone can also be an array microphone or an omnidirectional collection microphone. The speaker is used to convert the electrical signal from the processor 1201 or the radio frequency circuit 1204 into sound waves. The loudspeaker can be a conventional membrane loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, it is possible not only to convert electrical signals into sound waves audible to humans, but also to convert electrical signals into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, audio circuitry 1207 may also include a headphone jack.

The positioning component 1208 is used to determine the current geographic location of the computer device 1200, so as to implement navigation or LBS (location-based service). The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.

The power supply 1209 is used to supply power to the various components in the computer device 1200. The power supply 1209 can be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 1209 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery can also support fast charging technology.

In some embodiments, the computer device 1200 also includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to: an acceleration sensor 1211, a gyroscope sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.

The acceleration sensor 1211 can detect the acceleration on the three coordinate axes of the coordinate system established by the computer device 1200. For example, the acceleration sensor 1211 can be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1201 may control the display screen 1205 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211. The acceleration sensor 1211 can also be used to collect motion data of a game or of the user.

The gyroscope sensor 1212 can detect the body direction and rotation angle of the computer device 1200, and the gyroscope sensor 1212 can cooperate with the acceleration sensor 1211 to collect 3D actions of the user on the computer device 1200. According to the data collected by the gyroscope sensor 1212, the processor 1201 can implement the following functions: motion sensing (such as changing the UI according to a tilt operation of the user), image stabilization during shooting, game control, and inertial navigation.

The pressure sensor 1213 may be disposed on the side frame of the computer device 1200 and/or on the lower layer of the display screen 1205. When the pressure sensor 1213 is disposed on the side frame of the computer device 1200, it can detect the user's grip signal on the computer device 1200, and the processor 1201 performs left-hand/right-hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 1213. When the pressure sensor 1213 is disposed on the lower layer of the display screen 1205, the processor 1201 controls operable controls on the UI according to the user's pressure operation on the display screen 1205. The operable controls include at least one of button controls, scroll bar controls, icon controls, and menu controls.

The fingerprint sensor 1214 is used to collect the user's fingerprint. Either the processor 1201 recognizes the identity of the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 itself recognizes the identity of the user according to the collected fingerprint. When the identity of the user is recognized as a trusted identity, the processor 1201 authorizes the user to perform related sensitive operations, which include unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 1214 may be disposed on the front, back, or side of the computer device 1200. When the computer device 1200 is provided with a physical button or a manufacturer's logo, the fingerprint sensor 1214 may be integrated with the physical button or the manufacturer's logo.

The optical sensor 1215 is used to collect ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the display screen 1205 is increased; when the ambient light intensity is low, the display brightness of the display screen 1205 is decreased. In another embodiment, the processor 1201 may also dynamically adjust shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.

The proximity sensor 1216, also called a distance sensor, is usually disposed on the front panel of the computer device 1200. The proximity sensor 1216 is used to measure the distance between the user and the front of the computer device 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the computer device 1200 gradually decreases, the processor 1201 controls the display screen 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance between the user and the front of the computer device 1200 gradually increases, the processor 1201 controls the display screen 1205 to switch from the screen-off state to the screen-on state.

Those skilled in the art can understand that the structure shown in FIG. 12 does not constitute a limitation on the computer device 1200, which may include more or fewer components than shown in the figure, combine some components, or adopt a different arrangement of components.

In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions, where the instructions can be executed by a processor in the terminal to perform the method for querying data in the above embodiments. The computer-readable storage medium may be non-transitory. For example, the computer-readable storage medium may be a ROM (read-only memory), a RAM (random access memory), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

In an exemplary embodiment, a computer program product is also provided, the computer program product includes at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the method for querying data in the above embodiments.

Those of ordinary skill in the art can understand that all or some of the steps of the above embodiments can be implemented by hardware, or by a program instructing related hardware. The program can be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

In this application, the terms "first" and "second" are used to distinguish between identical or similar items that have substantially the same purpose and function. It should be understood that there is no logical or temporal dependency between "first" and "second", and that these terms place no restriction on quantity or execution order. It should also be understood that although the description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another. The term "at least one" in this application means one or more, and the term "multiple" in this application means two or more.

The above description is only a specific implementation of the application, but the scope of protection of the application is not limited thereto. Any person familiar with the technical field can easily think of various equivalent modifications or replacements within the technical scope disclosed in the application, and these modifications or replacements shall be covered by the protection scope of this application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

A method for querying data, characterized in that the method comprises:

receiving a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;

establishing a first query tree corresponding to the data query statement based on the structure of the data query statement;

simplifying the first query tree based on the type of each node in the first query tree to obtain a second query tree;

sequentially executing, in a graph database and based on a preset execution sequence, the query operations corresponding to the nodes in the second query tree to obtain a data query result; and

returning the data query result to the data query application program.
The method according to claim 1, wherein the types of nodes in the first query tree include merge nodes and query nodes, wherein a merge node represents the data query statement or a subquery statement in the data query statement, a query node represents a query keyword in the data query statement or a subquery statement in the data query statement, and the query nodes include at least one of a basic graph pattern (BGP) node, a UNION node, an OPTIONAL (optional match) node, and a FILTER node.
The method according to claim 2, wherein simplifying the first query tree based on the type of each node in the first query tree to obtain a second query tree comprises:

determining the first query tree as a third query tree to be simplified, and determining the depth of each node in the third query tree;

for a first merge node with a depth of 1 in the third query tree, if the child nodes of the first merge node include multiple BGP nodes, merging the multiple BGP nodes to obtain a merged first BGP node, deleting the first merge node, and adding the first BGP node at the position of the first merge node;

for a second merge node with a depth of 2 in the third query tree, if the child nodes of the second merge node include at least one BGP node and at least one UNION node, merging the at least one BGP node to obtain a merged second BGP node, merging the at least one UNION node to obtain a merged third UNION node, merging the second BGP node into the child nodes of the third UNION node to obtain a fourth UNION node, deleting the second merge node, and adding the fourth UNION node at the position of the second merge node; and

for a fifth UNION node with a depth of 2 in the third query tree, adding the grandchild nodes of the fifth UNION node to the child nodes of the fifth UNION node, and deleting the grandchild nodes of the fifth UNION node and the parent nodes of the grandchild nodes, to obtain the simplified third query tree.
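As an illustration only, the first two simplification rules of this claim can be sketched in Python as follows. The `Node` class, the node kind names `GROUP`, `BGP`, and `UNION`, and the function names are hypothetical and are not taken from the application. The sketch ignores node depth, and it only handles a merge node whose children are BGP nodes plus at most one UNION node whose branches are themselves BGP nodes; how several UNION nodes are merged is not specified here and is left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Node:
    kind: str  # "GROUP" (merge node), "BGP", or "UNION" (hypothetical names)
    triples: List[Triple] = field(default_factory=list)   # used by BGP nodes
    children: List["Node"] = field(default_factory=list)  # used by GROUP/UNION


def merge_bgps(nodes: List[Node]) -> Node:
    """Concatenate the triple patterns of several BGP nodes into one BGP node."""
    return Node("BGP", triples=[t for n in nodes for t in n.triples])


def simplify_group(group: Node) -> Node:
    """Return the node that replaces a merge node, or the node itself."""
    bgps = [c for c in group.children if c.kind == "BGP"]
    unions = [c for c in group.children if c.kind == "UNION"]
    if len(bgps) + len(unions) != len(group.children):
        return group  # other node kinds are outside the scope of this sketch
    if len(bgps) > 1 and not unions:
        # Several BGP children collapse into one BGP that replaces the group.
        return merge_bgps(bgps)
    if bgps and len(unions) == 1 and all(
        b.kind == "BGP" for b in unions[0].children
    ):
        # Push the merged BGP into every branch of the UNION, so the UNION
        # node replaces the group: P . (A UNION B) == (P . A) UNION (P . B).
        shared = merge_bgps(bgps)
        branches = [merge_bgps([shared, b]) for b in unions[0].children]
        return Node("UNION", children=branches)
    return group
```

Under these assumptions, a group containing the pattern `?x type Person` and a UNION of two city patterns is rewritten into a UNION whose two branches each contain the `type` pattern followed by one city pattern.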
The method according to claim 3, wherein before the first query tree is determined as the third query tree to be simplified, the method further comprises:

determining, in the first query tree, a first OPTIONAL node whose ancestor nodes include no OPTIONAL node; and

converting the sub-query tree whose root node is the parent node of the first OPTIONAL node into a third BGP node.
The method according to claim 4, wherein the sequentially executing the query operation corresponding to each node in the second query tree in the graph database includes:

when executing the first query operation corresponding to the third BGP node, determining the sub-query tree corresponding to the third BGP node;

executing the query operation corresponding to the sibling node of the first OPTIONAL node in the sub-query tree to obtain a first query result; determining the first query result as the data query range of the descendant nodes of the first OPTIONAL node; and performing, based on the data query range, the query operations corresponding to the descendant nodes of the first OPTIONAL node.
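The execution order in this claim (sibling of the OPTIONAL node first, then the OPTIONAL part restricted to that result) can be sketched as follows. This is a minimal in-memory illustration, not the application's implementation: the triple list, the single-pattern queries, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def subjects_with(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Evaluate the required pattern (?s predicate obj)."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def objects_for(
    triples: List[Triple], predicate: str, candidates: Set[str]
) -> Dict[str, List[str]]:
    """Evaluate (?s predicate ?o), but only for subjects in the query range."""
    found: Dict[str, List[str]] = {}
    for s, p, o in triples:
        if p == predicate and s in candidates:
            found.setdefault(s, []).append(o)
    return found


def required_then_optional(
    triples: List[Triple], req_pred: str, req_obj: str, opt_pred: str
) -> List[Tuple[str, Optional[str]]]:
    # Step 1: run the sibling of the OPTIONAL node (the required pattern).
    required = subjects_with(triples, req_pred, req_obj)
    # Step 2: its result becomes the data query range of the OPTIONAL part.
    scope = set(required)
    optional = objects_for(triples, opt_pred, scope)
    # Step 3: left-join, keeping required rows that have no optional match.
    rows: List[Tuple[str, Optional[str]]] = []
    for s in required:
        values = optional.get(s)
        if values:
            rows.extend((s, v) for v in values)
        else:
            rows.append((s, None))
    return rows
```

In this sketch, triples whose subject is outside the range (for example an `email` triple of a company when the required pattern selects persons) are never collected for the OPTIONAL part, which is the saving the restricted query range is meant to provide.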
The method according to claim 4, wherein the sequentially executing the query operation corresponding to each node in the second query tree in the graph database includes:

when performing the first query operation corresponding to the third BGP node, if it is determined that the descendant nodes of the first OPTIONAL node include at least one second OPTIONAL node:

sequentially executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to a first sibling node of the first OPTIONAL node and a second sibling node of each of the at least one second OPTIONAL node, wherein the data query range corresponding to the second sibling node of each second OPTIONAL node is the query result corresponding to the sibling node of the previous OPTIONAL node; and

sequentially executing, in order of the depths of the first OPTIONAL node and the at least one second OPTIONAL node, the query operations corresponding to the child nodes of the first OPTIONAL node and the child nodes of the at least one second OPTIONAL node, wherein the data query range corresponding to the child nodes of any OPTIONAL node is the query result corresponding to the sibling node of that OPTIONAL node.
The method according to claim 2, wherein, based on the types of nodes in the first query tree, the first query tree is simplified to obtain a second query tree, including:

for a FILTER node in the first query tree, if the FILTER condition corresponding to the FILTER node satisfies a preset conversion condition, converting the FILTER condition into disjunctive normal form; and

converting the FILTER node into a UNION node based on the disjunctive normal form.
The method according to claim 7, wherein the conversion condition is that the FILTER condition corresponding to the FILTER node is composed only of variables, constants, and the AND, OR, and equality operators.
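The FILTER conversion described in the two preceding claims can be illustrated with a short Python sketch. The nested-tuple expression format (`("eq", variable, constant)`, `("and", a, b)`, `("or", a, b)`) and the function names are assumptions made for this example. Substituting the constant for the variable in each UNION branch is one plausible way to realize the conversion and is not confirmed by the application; equalities between two variables are not handled.

```python
from itertools import product
from typing import List, Tuple

Triple = Tuple[str, str, str]


def to_dnf(expr: tuple) -> List[List[tuple]]:
    """Return the condition as a list of conjunctions of equality atoms."""
    op = expr[0]
    if op == "eq":
        return [[expr]]
    left, right = to_dnf(expr[1]), to_dnf(expr[2])
    if op == "or":
        return left + right
    if op == "and":
        # (A1 or A2) and (B1 or B2) -> every pairing Ai and Bj
        return [l + r for l, r in product(left, right)]
    raise ValueError(f"unsupported operator: {op}")


def filter_to_union(triples: List[Triple], expr: tuple) -> List[List[Triple]]:
    """Build one UNION branch per satisfiable conjunction of the DNF."""
    branches = []
    for conjunction in to_dnf(expr):
        binding = {}
        consistent = True
        for _, var, const in conjunction:
            # ?x = 1 and ?x = 2 in the same conjunction can never hold.
            if binding.setdefault(var, const) != const:
                consistent = False
        if consistent:
            branches.append(
                [tuple(binding.get(term, term) for term in t) for t in triples]
            )
    return branches
```

For the patterns `?p age ?x` and `?p city ?y` with the condition `?x = 30 AND (?y = Paris OR ?y = Rome)`, this sketch yields two branches, one with `city Paris` and one with `city Rome`, each with `age 30`.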
The method according to claim 2, wherein the sequentially executing the query operation corresponding to each node in the second query tree in the graph database includes:

if there are multiple BGP nodes that can be executed in parallel, determining the common triple patterns shared by the multiple BGP nodes;

determining, based on a greedy algorithm, the subset of the common triple patterns that has the lowest query cost;

querying the graph database for data corresponding to the selected subset of common triple patterns; and

querying the graph database for data corresponding to the triple patterns in the multiple BGP nodes other than the selected subset of common triple patterns.
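A rough sketch of the shared-pattern step in this claim is given below. The claim does not define the cost model or the exact greedy criterion, so both are assumptions here: `cost` is a hypothetical per-pattern estimate (for example an estimated result count), and the greedy pass takes common patterns in ascending cost order, keeping a pattern only if it shares a variable with the patterns already selected.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def variables(pattern: Triple) -> Set[str]:
    """Variables are the terms that start with '?'."""
    return {term for term in pattern if term.startswith("?")}


def common_patterns(bgps: List[List[Triple]]) -> Set[Triple]:
    """Triple patterns that appear in every BGP."""
    common = set(bgps[0])
    for bgp in bgps[1:]:
        common &= set(bgp)
    return common


def greedy_shared_subset(
    common: Set[Triple], cost: Dict[Triple, int]
) -> List[Triple]:
    """Single greedy pass: cheapest first, keep only connected patterns."""
    selected: List[Triple] = []
    seen_vars: Set[str] = set()
    for pattern in sorted(common, key=lambda t: (cost[t], t)):
        if not selected or variables(pattern) & seen_vars:
            selected.append(pattern)
            seen_vars |= variables(pattern)
    return selected


def plan(
    bgps: List[List[Triple]], cost: Dict[Triple, int]
) -> Tuple[List[Triple], List[List[Triple]]]:
    """Split the work into a shared part and the per-BGP remainder."""
    shared = greedy_shared_subset(common_patterns(bgps), cost)
    rest = [[t for t in bgp if t not in shared] for bgp in bgps]
    return shared, rest
```

The shared part would be evaluated once and reused by every BGP, and each remainder would then be evaluated per BGP. A single pass can skip a pattern that only becomes connected later; a fuller version might repeat the pass until nothing changes.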
A device for querying data, characterized in that the device comprises:

A receiving module, configured to receive a data query instruction sent by a data query application program, wherein the data query instruction carries a data query statement;

A building module, configured to build a first query tree corresponding to the data query statement based on the structure of the data query statement;

A processing module, configured to simplify the first query tree based on the type of each node in the first query tree to obtain a second query tree;

A query module, configured to sequentially execute query operations corresponding to each node in the second query tree in the graph database based on a preset execution sequence, to obtain data query results;

A returning module, configured to return the data query result to the data query application program.
A computer device, characterized in that the computer device includes a processor and a memory, at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the operations performed by the method for querying data according to any one of claims 1 to 9.
A computer-readable storage medium, characterized in that at least one instruction is stored in the storage medium, and the at least one instruction is loaded and executed by a processor to implement the operations performed by the method for querying data according to any one of claims 1 to 9.