CN114372165A - Optimized path query method, device, equipment and storage medium for jump connection

Info

Publication number: CN114372165A
Application number: CN202210055214.3A
Authority: CN
Inventors: 李艳; 彭鹏; 李文杰
Original assignee: Beijing Tupu Technology Co ltd
Current assignee: Beijing Tupu Technology Co ltd
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2022-04-19

Abstract

The invention relates to the technical field of subgraph matching and discloses a jump-connection optimized path query method, device, equipment and storage medium. For a path query statement $P^{n}(a)$ with $n$ edges and edge label $a$, two corresponding sub-query statements $P^{k_1}(a)$ and $P^{k_2}(a)$ with $k_1 + k_2 + 1 = n$ are first obtained; the target data graph is then queried according to the two sub-query statements to obtain the corresponding sub-query matching results; finally, jump-connection edge expansion is performed on the edge matching results in the two sub-query matching results to obtain the final edge matching results corresponding to the query statement $P^{n}(a)$. This effectively reduces the pressure of generating and storing intermediate results, accelerates computation, improves query performance, ensures optimality under the fastest condition, and facilitates practical application and popularization.

Description

Optimized path query method, device, equipment and storage medium for jump connection

Technical Field

The invention belongs to the technical field of subgraph matching, and particularly relates to a jump-connection optimized path query method, device, equipment and storage medium.

Background

Currently, many graph systems have been proposed for efficiently storing data graphs and processing query graphs. They mainly include RDF (Resource Description Framework, a model for describing network resources that is commonly serialized in XML) graph systems such as Jena, Virtuoso, RDF4J and gStore, and property graph systems such as Neo4j, Graphflow and EmptyHeaded. The query languages of some of these systems support variable-length path queries, most commonly SPARQL and Cypher.

For a graph system, querying is an important basic operation, and all query operations can be generalized to subgraph matching, namely finding all embeddings in the data graph G that are isomorphic to the query graph q. Subgraph matching has been widely studied in academia and, because of its importance, various algorithms have been proposed. In the database field, subgraph matching algorithms can be divided into two types, one of which is the join type. Existing join strategies can be roughly divided into three kinds (join strategies outside these three kinds exist, but they are not described here because they are irrelevant to this application).

(1) The first join strategy is the pair-wise join (hereinafter abbreviated as PJ), a join strategy for two-column tables that is commonly used in databases (it is not described in more detail because it is irrelevant to the present application). It is used, for example, in the property graph system Neo4j.

(2) The second join strategy is the binary join, which computes subgraph matches by solving a series of binary joins: the original query graph is first decomposed into a set of join units, and the matches of these units are then joined according to a predefined join order to obtain the final result. Binary join algorithms differ only in their join units and join order; typical algorithms are StarJoin, TwinTwigJoin and CliqueJoin. StarJoin, as its name suggests, decomposes the query graph using stars as join units: it first finds a vertex cover of the query graph, each cover vertex together with its not-yet-used neighbors forms a star, and the resulting group of star join units is joined in a left-deep join order to obtain all results. However, StarJoin has a major disadvantage: enumerating k-stars at a vertex of degree d costs $O(d^{k})$, so a very large degree causes a "star expansion" that produces a very large table. TwinTwigJoin optimizes StarJoin. Its join unit is the "TwinTwig", a star with at most two edges, which constrains the star structure, but it follows the same left-deep join order as StarJoin. TwinTwigJoin limits the "star expansion" to some extent, but it still has disadvantages: execution time is long, and a left-deep join is a suboptimal join plan. CliqueJoin, proposed in 2016, addresses these problems well. First, it adopts a triangulation strategy for data partitioning, on the basis of which CliqueJoin can use both cliques and stars as join units, and the use of cliques can greatly shorten execution time. Second, CliqueJoin joins the decomposed join units according to a bushy join plan.

(3) The third join strategy is the worst-case optimal join (WCOJ), a recent technique for join operations in databases. Given a set of tables $\{R_1, R_2, \ldots, R_n\}$, the upper bound on the number of results of a multi-table join over them can be determined from the fractional edge cover of the hypergraph corresponding to the join. A hypergraph is a graph in which each edge may have multiple endpoints, whereas an ordinary graph has exactly two endpoints per edge. A hypergraph is therefore a generalization of the ordinary graph, so properties that hold on hypergraphs also hold on ordinary graphs. A fractional edge cover of a hypergraph $H$ is a function $f: E \to \mathbb{R}^{+}$ such that, for every $v \in V(H)$, $\sum_{e \in E:\, v \in e} f(e) \ge 1$. For any fractional edge cover $f$, $\sum_{e \in E} f(e)$ is called the weight of $f$, denoted $w$. The smallest weight over all fractional edge covers is called the fractional edge cover number, denoted $w^{*}$, and a fractional edge cover attaining it is denoted $f^{*}$. Any multi-table join over $\{R_1, R_2, \ldots, R_n\}$ corresponds to a hypergraph $H$: each table corresponds to an edge of $H$, and each attribute of a table corresponds to a vertex of $H$. The number of results $|OUT|$ of a multi-table join over $\{R_1, R_2, \ldots, R_n\}$ then satisfies:

$$|OUT| \le \prod_{e \in E} |R_e|^{f(e)}$$

where $R_e$ is the table corresponding to edge $e$ of the hypergraph $H$, $|R_e|$ is the size of $R_e$, and $f$ is a fractional edge cover. In analysis it is often assumed that every table has size $N$, so the number of results $|OUT|$ of a multi-table join over $\{R_1, R_2, \ldots, R_n\}$ has the upper bound

$$|OUT| \le N^{w^{*}}$$

where $w^{*}$ is the fractional edge cover number of the hypergraph $H$. A subgraph matching query can also be regarded as a special case of a multi-table join: each edge is a table with only two columns, and a subgraph with multiple edges corresponds to a join of these two-column tables. A typical example is the triangle query: assuming that each query edge matches $N$ edges of the graph, the number of results $|OUT|$ is bounded by $N^{1.5}$, because the fractional edge cover number of the triangle is 1.5 and the corresponding fractional edge cover $f^{*}$ assigns 0.5 to every edge. This conclusion is notable because a conventional multi-table join, which is translated into a combination of two-table joins, produces $N^{3}$ results in the process.
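The triangle bound can be checked directly from the definitions. The following worked derivation is an illustrative sketch; the vertex names $x, y, z$ and edge names $e_1, e_2, e_3$ are introduced here only for the example and do not appear in the original text.

```latex
% Triangle query on vertices x, y, z with edges
%   e_1 = \{x, y\},  e_2 = \{y, z\},  e_3 = \{x, z\}.
\begin{aligned}
&\text{cover constraints:} && f(e_1) + f(e_3) \ge 1 \;(x), \quad
                              f(e_1) + f(e_2) \ge 1 \;(y), \quad
                              f(e_2) + f(e_3) \ge 1 \;(z) \\
&\text{summing the three:} && 2\,\bigl(f(e_1) + f(e_2) + f(e_3)\bigr) \ge 3
                              \;\Longrightarrow\; w \ge \tfrac{3}{2} \\
&\text{attained by:}       && f^{*}(e_1) = f^{*}(e_2) = f^{*}(e_3) = \tfrac{1}{2}
                              \;\Longrightarrow\; w^{*} = \tfrac{3}{2} \\
&\text{bound:}             && |OUT| \le N^{1/2} \cdot N^{1/2} \cdot N^{1/2} = N^{3/2}
\end{aligned}
```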

However, current query processing suffers from too many intermediate results during the query, which causes excessive storage pressure and reduced processing performance. Consider the extreme case shown in fig. 1, where the left side is a data graph G and the right side is a path query $P^{3}(a)$, based on a query graph q, with 3 path edges and edge label a. The existing WCOJ algorithm extends the results of a connected sub-query through a query vertex adjacent to at least one vertex of that sub-query and takes intersections. That is, it first enumerates the 1000001 results of the path query $P^{2}(a)$ (2 path edges, edge label a), and then extends the last point to obtain the 1000 final results, as shown in fig. 2. In this process, the results of the path query $P^{2}(a)$ are largely not used by the path query $P^{3}(a)$, which incurs huge memory consumption and computation time.
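The blow-up of intermediate results can be reproduced with a small sketch. Fig. 1 is not available in this text, so the graph below is an assumption: it is built so that its counts match the figures quoted above (edges V1..Vm to V, V to V(m+1)..V(2m), and V(2m) to V', all with label a). The function names are hypothetical, and plain edge-at-a-time expansion stands in for the WCOJ-style extension step.

```python
from collections import defaultdict


def build_extreme_graph(m=1000):
    """Assumed graph: V1..Vm -> V, V -> V(m+1)..V(2m), V(2m) -> V' (label a)."""
    out = defaultdict(list)
    for i in range(1, m + 1):
        out[f"V{i}"].append("V")
    for j in range(m + 1, 2 * m + 1):
        out["V"].append(f"V{j}")
    out[f"V{2 * m}"].append("V'")
    return out


def expand(paths, out):
    """Extend every partial path by one outgoing edge from its last vertex."""
    return [p + (w,) for p in paths for w in out.get(p[-1], ())]


def left_to_right(out, n):
    """Enumerate the n-edge path query one edge at a time.

    Returns the final paths and the number of results after each step,
    i.e. the sizes of P^1(a), P^2(a), ..., P^n(a).
    """
    paths = [(u, v) for u in list(out) for v in out[u]]  # P^1(a)
    sizes = [len(paths)]
    for _ in range(n - 1):
        paths = expand(paths, out)
        sizes.append(len(paths))
    return paths, sizes
```

With `m=1000`, `left_to_right(build_extreme_graph(1000), 3)` yields step sizes `[2001, 1000001, 1000]`: about a million two-edge results are materialised in order to produce 1000 final results.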

Disclosure of Invention

In order to solve the problems that the storage pressure is too large and the processing performance is reduced due to too many intermediate results in the conventional path query mode in subgraph matching, the invention aims to provide a jump-connection optimized path query method, a jump-connection optimized path query device, computer equipment and a computer readable storage medium, which can effectively reduce the generation and storage pressure of the intermediate results, accelerate the calculation, improve the query performance, ensure the optimality under the fastest condition and facilitate the practical application and popularization.

In a first aspect, the present invention provides a method for querying an optimized path of a jump connection, including:

for a path query statement $P^{n}(a)$ whose number of edges is $n$ and whose edge label is $a$, obtaining a corresponding first sub-query statement $P^{k_1}(a)$ and a corresponding second sub-query statement $P^{k_2}(a)$, wherein $n$ is a positive integer greater than or equal to three, $k_1$ is a positive integer representing the number of path edges of the first sub-query statement $P^{k_1}(a)$, $k_2$ is a positive integer representing the number of path edges of the second sub-query statement $P^{k_2}(a)$, and $k_1 + k_2 + 1 = n$; the initial query point of each of the query statement $P^{n}(a)$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ has only one outgoing edge, the terminal query point of each has only one incoming edge, and each of the remaining query points has only one pair of incoming and outgoing edges;

querying the target data graph according to the first sub-query statement $P^{k_1}(a)$ to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement $P^{k_2}(a)$ to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of an edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges;

selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in the selected sub-query matching result, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing all of its corresponding extension points and searching for each of them in the first point set, and, when an extension point is found in the set, connecting by extension the edge matching result corresponding to the end point and the edge matching result corresponding to the extension point, to obtain an edge matching result corresponding to the query statement $P^{n}(a)$.
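The edge expansion step above can be sketched as a small join routine. This is a minimal illustration under stated assumptions, not the patented implementation: matching results are represented as tuples of vertices, `out_edges` is a hypothetical adjacency map for the single edge label, the function name `jump_join` is invented here, and vertex distinctness along a path is not enforced.

```python
from collections import defaultdict


def jump_join(prefix_results, suffix_results, out_edges):
    """Join two sets of path matches through one bridging edge.

    prefix_results: matches whose END points are expanded.
    suffix_results: the selected matches whose START points form the point set.
    out_edges: dict mapping a vertex to its out-neighbours (assumed structure).
    """
    # Point set: start points of the selected results, kept as a hash table.
    by_start = defaultdict(list)
    for s in suffix_results:
        by_start[s[0]].append(s)

    # Group the other results by end point so each end point is expanded once.
    by_end = defaultdict(list)
    for p in prefix_results:
        by_end[p[-1]].append(p)

    joined = []
    for end, prefixes in by_end.items():
        # Extension points: out-neighbours of the end point.
        for w in out_edges.get(end, ()):
            # Keep only extension points that are found in the point set.
            for s in by_start.get(w, ()):
                for p in prefixes:
                    joined.append(p + s)
    return joined
```

A result with $k_1$ edges joined to a result with $k_2$ edges through the bridging edge has $k_1 + k_2 + 1$ edges, which is where the constraint $k_1 + k_2 + 1 = n$ comes from.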

Based on the above, a new path query scheme is provided that can effectively reduce intermediate results in the subgraph matching process: for a path query statement $P^{n}(a)$ with $n$ edges and edge label $a$, two corresponding sub-query statements $P^{k_1}(a)$ and $P^{k_2}(a)$ with $k_1 + k_2 + 1 = n$ are first obtained; the corresponding sub-query matching results are then obtained by querying the target data graph according to the two sub-query statements; finally, jump-connection edge expansion is performed on the edge matching results in the two sub-query matching results to obtain the final edge matching results corresponding to the query statement $P^{n}(a)$.

In one possible design, when $k_1 = k_2$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ are the same sub-query statement, and the first sub-query matching result and the second sub-query matching result are the same sub-query matching result;

selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result comprises: arbitrarily selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result.

In one possible design, when $k_1 \ne k_2$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ are different sub-query statements, and the first sub-query matching result and the second sub-query matching result are different sub-query matching results;

selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result comprises: selecting, from the first sub-query matching result and the second sub-query matching result, the sub-query matching result with the larger total number of edge matching results.

In one possible design, for the first sub-query statement $P^{k_1}(a)$ or the second sub-query statement $P^{k_2}(a)$, if the corresponding number of path edges is a positive integer greater than or equal to three, obtaining the corresponding sub-query matching result by querying the target data graph according to the corresponding sub-query statement comprises:

for a sub-query statement $P^{k}(a)$, obtaining a corresponding first grandchild query statement $P^{k_{11}}(a)$ and a corresponding second grandchild query statement $P^{k_{22}}(a)$, wherein the sub-query statement $P^{k}(a)$ is the first sub-query statement $P^{k_1}(a)$ or the second sub-query statement $P^{k_2}(a)$, $k$ is $k_1$ or $k_2$, $k_{11}$ is a positive integer representing the number of path edges of the first grandchild query statement $P^{k_{11}}(a)$, $k_{22}$ is a positive integer representing the number of path edges of the second grandchild query statement $P^{k_{22}}(a)$, and $k_{11} + k_{22} + 1 = k$; the initial query point of each of the first grandchild query statement $P^{k_{11}}(a)$ and the second grandchild query statement $P^{k_{22}}(a)$ has only one outgoing edge, the terminal query point of each has only one incoming edge, and each of the remaining query points has only one pair of incoming and outgoing edges;

querying the target data graph according to the first grandchild query statement $P^{k_{11}}(a)$ to obtain a first grandchild query matching result, and querying the target data graph according to the second grandchild query statement $P^{k_{22}}(a)$ to obtain a second grandchild query matching result, wherein the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of an edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges;

selecting one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, constructing a second point set from the starting points of all edge matching results in the selected grandchild query matching result, and performing the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, traversing all of its corresponding extension points and searching for each of them in the second point set, and, when an extension point is found in the set, connecting by extension the edge matching result corresponding to the end point and the edge matching result corresponding to the extension point, to obtain an edge matching result corresponding to the sub-query statement $P^{k}(a)$.
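The nested decomposition can be sketched as a recursive evaluator. This is an illustration under assumptions rather than the method as claimed: the balanced split rule, the in-memory edge list, the direct handling of the 1-edge and 2-edge base cases, and the name `make_path_evaluator` are all choices made for the example, and paths may revisit vertices (distinctness is not enforced).

```python
from functools import lru_cache


def make_path_evaluator(edges):
    """Return paths(k): all k-edge paths over `edges` (list of (u, v) pairs)."""
    out = {}
    for u, v in edges:
        out.setdefault(u, []).append(v)

    def jump_join(prefix, suffix):
        # Point set over the start points of the suffix matches (hash table).
        by_start = {}
        for s in suffix:
            by_start.setdefault(s[0], []).append(s)
        # Expand each prefix end point and keep neighbours found in the set.
        return [p + s
                for p in prefix
                for w in out.get(p[-1], ())
                for s in by_start.get(w, ())]

    @lru_cache(maxsize=None)  # identical sub-queries are evaluated only once
    def paths(k):
        if k == 1:
            return tuple(edges)
        if k == 2:
            # Base case: two edges sharing a middle vertex.
            return tuple((u, v, w) for u, v in edges for w in out.get(v, ()))
        # Assumed split rule: as balanced as possible, with k1 + k2 + 1 = k.
        k1 = (k - 1) // 2
        k2 = k - 1 - k1
        return tuple(jump_join(paths(k1), paths(k2)))

    return paths
```

Because of the cache, a sub-query length that appears on both sides of a split (for example 3 when k is 7) is computed a single time and reused.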

In one possible design, when $|k_1 - k_2| = 1$, querying the target data graph according to the first sub-query statement $P^{k_1}(a)$ to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement $P^{k_2}(a)$ to obtain a second sub-query matching result, comprises:

querying the target data graph in parallel according to the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ to obtain the first sub-query matching result corresponding to the first sub-query statement $P^{k_1}(a)$ and the second sub-query matching result corresponding to the second sub-query statement $P^{k_2}(a)$, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of an edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges.
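One way to picture the parallel design is to submit the two sub-queries to a small worker pool. The sketch below is an assumption-laden illustration: `run_query` is a hypothetical callable that evaluates a path sub-query of a given length (for instance by sending it to a graph database), and a thread pool is used only for simplicity. Threads mainly help when the sub-queries wait on an external system.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subqueries_in_parallel(run_query, k1, k2):
    """Evaluate the two sub-queries and return (first_result, second_result).

    run_query: hypothetical callable taking the number of path edges and
    returning that sub-query's matching result.
    """
    if k1 == k2:
        # Same sub-query statement: evaluate once and share the result.
        result = run_query(k1)
        return result, result
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(run_query, k1)
        second = pool.submit(run_query, k2)
        # result() blocks until the corresponding sub-query has finished.
        return first.result(), second.result()
```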

In one possible design, when n is an odd number, $k_1 = k_2 = (n-1)/2$.

In one possible design, when n is an even number, the method includes

In a second aspect, the invention provides a jump-type connection optimized path query device, which comprises a query statement acquisition module, a query statement execution module and a jump-type connection module, wherein the query statement acquisition module, the query statement execution module and the jump-type connection module are sequentially in communication connection;

the query statement acquisition module is configured to, for a path query statement $P^{n}(a)$ whose number of edges is $n$ and whose edge label is $a$, obtain a corresponding first sub-query statement $P^{k_1}(a)$ and a corresponding second sub-query statement $P^{k_2}(a)$, wherein $n$ is a positive integer greater than or equal to three, $k_1$ is a positive integer representing the number of path edges of the first sub-query statement $P^{k_1}(a)$, $k_2$ is a positive integer representing the number of path edges of the second sub-query statement $P^{k_2}(a)$, and $k_1 + k_2 + 1 = n$; the initial query point of each of the query statement $P^{n}(a)$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ has only one outgoing edge, the terminal query point of each has only one incoming edge, and each of the remaining query points has only one pair of incoming and outgoing edges;

the query statement execution module is configured to query the target data graph according to the first sub-query statement $P^{k_1}(a)$ to obtain a first sub-query matching result, and to query the target data graph according to the second sub-query statement $P^{k_2}(a)$ to obtain a second sub-query matching result;

the jump connection module is configured to select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in the selected sub-query matching result, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing all of its corresponding extension points and searching for each of them in the first point set, and, when an extension point is found in the set, connecting by extension the edge matching result corresponding to the end point and the edge matching result corresponding to the extension point, to obtain an edge matching result corresponding to the query statement $P^{n}(a)$.

In a third aspect, the present invention provides a computer device, comprising a memory, a processor and a transceiver, which are sequentially connected in communication, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving information, and the processor is used for reading the computer program and executing the optimized path query method according to the first aspect or any possible design of the first aspect.

In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when executed on a computer, perform the optimized path query method as described in the first aspect or any of the possible designs of the first aspect.

In a fifth aspect, the present invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the optimized path query method as described in the first aspect or any possible design of the first aspect above.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is an example diagram of an extreme data graph G and a path query with a path edge number of 3.

Fig. 2 is a diagram illustrating an exemplary implementation of a path query for the extreme case shown in fig. 1 by using the WCOJ algorithm.

Fig. 3 is a schematic flow chart of the optimized path query method for jump connection according to the present invention.

Fig. 4 is a diagram illustrating an exemplary execution process of performing a query on the extreme case shown in fig. 1 by using an optimized path query method.

FIG. 5 is an exemplary diagram of parallel execution of optimized path queries provided by the present invention.

Fig. 6 is a schematic structural diagram of the optimized path query device for jump connection provided by the present invention.

Fig. 7 is a schematic structural diagram of a computer device provided by the present invention.

Detailed Description

The invention is further described with reference to the following figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Specific structural and functional details disclosed herein are merely representative of exemplary embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.

It will be understood that, although the terms first, second, etc. may be used herein to describe various objects, these objects should not be limited by these terms. These terms are only used to distinguish one object from another. For example, a first object may be referred to as a second object, and similarly, a second object may be referred to as a first object, without departing from the scope of example embodiments of the present invention.

It should be understood that the term "and/or", as may appear herein, merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, B exists alone, or A and B exist at the same time. The term "/and", as may appear herein, describes another association between objects and means that two relationships may exist; for example, A/and B may mean: A exists alone, or A and B exist at the same time. In addition, the character "/", as may appear herein, generally means that the associated objects before and after it are in an "or" relationship.

As shown in figs. 3 to 4, the jump-connection optimized path query method provided in the first aspect of this embodiment may be, but is not limited to being, executed by a computer device that has certain computing resources and performs subgraph matching, for example an electronic device such as a personal computer (PC, a multipurpose computer whose size, price and performance suit personal use; desktop computers, notebook computers, small notebook computers, tablet computers, ultrabooks and the like are all personal computers), a smart phone, a personal digital assistant (PDA) or a wearable device, so as to facilitate subgraph matching. As shown in fig. 3, the jump-connection optimized path query method may include, but is not limited to, the following steps S1 to S3.

S1, for a path query statement $P^{n}(a)$ whose number of edges is $n$ and whose edge label is $a$, obtaining a corresponding first sub-query statement $P^{k_1}(a)$ and a corresponding second sub-query statement $P^{k_2}(a)$, wherein $n$ is a positive integer greater than or equal to three, $k_1$ is a positive integer representing the number of path edges of the first sub-query statement $P^{k_1}(a)$, $k_2$ is a positive integer representing the number of path edges of the second sub-query statement $P^{k_2}(a)$, and $k_1 + k_2 + 1 = n$; the initial query point of each of the query statement $P^{n}(a)$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ has only one outgoing edge, the terminal query point of each has only one incoming edge, and each of the remaining query points has only one pair of incoming and outgoing edges.

In step S1, the query statement $P^{n}(a)$ is a given path query statement in subgraph matching. Because its initial query point has only one outgoing edge, its terminal query point has only one incoming edge, and each of its remaining query points has only one pair of incoming and outgoing edges, connecting all query points along the edges of the query yields a linear path (this is the constraint of the present optimized path query). The first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ are new path query statements derived from the query statement $P^{n}(a)$; each likewise yields a linear path when all of its query points are connected along the edges of the query, so the query result of the query statement $P^{n}(a)$ can be obtained from their query results. Furthermore, the query statement $P^{n}(a)$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ may be, but are not limited to, SPARQL queries or Cypher queries, and may be obtained in a conventional manner.
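For illustration, a linear path query with n edges that all carry the same label can be written as a SPARQL basic graph pattern. The sketch below builds such a query string; the function name, the variable naming scheme `?v0 ... ?vn` and the example IRI are assumptions made here, since the text does not give the concrete query form.

```python
def path_query_sparql(n, label_iri):
    """Build a SPARQL SELECT query for a linear path with n edges.

    label_iri: IRI of the edge label (a placeholder in the usage below).
    """
    if n < 1:
        raise ValueError("a path query needs at least one edge")
    variables = [f"?v{i}" for i in range(n + 1)]
    # One triple pattern per edge; consecutive patterns share a variable,
    # so every inner query point has one incoming and one outgoing edge.
    patterns = " . ".join(
        f"{variables[i]} <{label_iri}> {variables[i + 1]}" for i in range(n)
    )
    return f"SELECT {' '.join(variables)} WHERE {{ {patterns} }}"
```

For example, `path_query_sparql(3, "http://example.org/a")` produces a query with three chained triple patterns over `?v0` to `?v3`; the two sub-queries of step S1 would be generated the same way with 1 edge each.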

S2, querying the target data graph according to the first sub-query statement $P^{k_1}(a)$ to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement $P^{k_2}(a)$ to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of an edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges.

In step S2, when the target data graph is stored in a database (which may be created by, but is not limited to, running the gbuild command of gStore for the target data graph, given a database name and the path of the RDF data stored in NT format), conventional queries may be carried out, given the name of the database to be queried, according to the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ to obtain at least one corresponding edge matching result for each.

S3, selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in the selected sub-query matching result, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing all of its corresponding extension points and searching for each of them in the first point set, and, when an extension point is found in the set, connecting by extension the edge matching result corresponding to the end point and the edge matching result corresponding to the extension point, to obtain an edge matching result corresponding to the query statement $P^{n}(a)$.

In step S3, specifically, when $k_1 = k_2$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ are the same sub-query statement, so the first sub-query matching result and the second sub-query matching result are the same sub-query matching result (i.e., query isomorphism), and one sub-query matching result can be selected arbitrarily from the two. When $k_1 \ne k_2$, the first sub-query statement $P^{k_1}(a)$ and the second sub-query statement $P^{k_2}(a)$ are different sub-query statements, so the first sub-query matching result and the second sub-query matching result are different sub-query matching results (i.e., different query structures); in order to reduce the computing resources required in the subsequent extension of end points, the sub-query matching result with the larger total number of edge matching results is preferably selected. Further, the first point set may take, but is not limited to, the form of a hash table.

In step S3, taking the extreme case shown in fig. 1 as an example and as shown in fig. 4, for the given query statement $P^{3}(a)$, two identical sub-query statements $P^{1}(a)$ (query isomorphism) are first obtained, and the data graph G is queried to obtain the same sub-query matching result for both. One sub-query matching result is then selected arbitrarily, and a point set $\{V_1, V_2, \ldots, V_{1000}, V, V_{2000}\}$ is constructed from the starting points of all edge matching results in it. Meanwhile, the following edge expansion is performed on the end points $\{V, V_{1001}, V_{1002}, \ldots, V_{2000}, V'\}$ of the edge matching results in the other sub-query matching result: for an end point $V$, all extension points $\{V_{1001}, V_{1002}, \ldots, V_{2000}\}$ corresponding to $V$ are traversed and searched for in the point set $\{V_1, V_2, \ldots, V_{1000}, V, V_{2000}\}$, and the extension point $V_{2000}$ is found; the edge matching result $V_{100}V$ corresponding to the end point $V$ and the edge matching result $V_{2000}V'$ corresponding to the extension point $V_{2000}$ are then connected by extension to obtain the edge matching result $V_{100}VV_{2000}V'$ corresponding to the query statement $P^{3}(a)$. Such a jump connection (which may be named Jumping Join) not only reduces intermediate results but also accelerates computation. In addition, since the two sub-queries have the same matching result, all results can be obtained by traversing the data graph G only once. Accordingly, when n is an odd number, it is preferable that $k_1 = k_2 = (n-1)/2$, which further reduces intermediate results and speeds up computation.
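The worked example can be replayed in code. Since fig. 1 is not reproduced in this text, the edge list below is an assumption inferred from the point sets named in the example (V1..V1000 to V, V to V1001..V2000, and V2000 to V'); the function name and the probe counter are likewise introduced only for illustration.

```python
def jumping_join_p3(m=1000):
    """Answer the 3-edge path query on the assumed extreme graph.

    Returns (number of 1-edge matches, number of point-set probes, results).
    """
    # Assumed edge list: V1..Vm -> V, V -> V(m+1)..V(2m), V(2m) -> V'.
    edges = [(f"V{i}", "V") for i in range(1, m + 1)]
    edges += [("V", f"V{j}") for j in range(m + 1, 2 * m + 1)]
    edges.append((f"V{2 * m}", "V'"))

    out = {}
    for u, v in edges:
        out.setdefault(u, []).append(v)

    # Both sub-queries are the same 1-edge query, so one scan serves both.
    p1 = edges
    point_set = {}  # start point -> 1-edge matches starting there
    by_end = {}     # end point   -> 1-edge matches ending there
    for u, v in p1:
        point_set.setdefault(u, []).append((u, v))
        by_end.setdefault(v, []).append((u, v))

    probes = 0
    results = []
    for end, prefixes in by_end.items():
        for w in out.get(end, ()):  # extension points of this end point
            probes += 1
            for suffix in point_set.get(w, ()):
                results.extend(p + suffix for p in prefixes)
    return len(p1), probes, results
```

With `m=1000` this materialises 2001 one-edge matches and performs 1001 probes of the point set to produce the 1000 final results, instead of first enumerating the 1000001 two-edge results.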

As can be seen from the above example, the final result for a query statement P³(a) with a path edge number of 3 can be obtained from the query result of the sub-query statement P¹(a) with a path edge number of 1. By analogy, for a query statement P⁴(a) with a path edge number of 4, the final result can be obtained from the query results of a first sub-query statement P¹(a) with a path edge number of 1 and a second sub-query statement P²(a) with a path edge number of 2. For a query statement P⁵(a) with a path edge number of 5, the final result can be obtained either from the query results of a first sub-query statement P¹(a) with a path edge number of 1 and a second sub-query statement P³(a) with a path edge number of 3 (the latter of which can itself be obtained from the query result of the sub-query statement P¹(a) with a path edge number of 1), or from the query result of the sub-query statement P²(a) with a path edge number of 2; and so on. For any path edge number n greater than or equal to three, the final result of the corresponding query statement Pⁿ(a) can thus be obtained from the query results of a first sub-query statement P¹(a) with a path edge number of 1 and/or a second sub-query statement P²(a) with a path edge number of 2.
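
The decomposition pattern above can be illustrated with a short Python sketch. The helper names `split_edges` and `decompose` are hypothetical, and the balanced choice of k₁ and k₂ is only one admissible option assumed here for illustration (the text also allows, for example, splitting 5 into 1 and 3); the only constraint taken from the description is k₁ + k₂ + 1 = n.

```python
def split_edges(n):
    """Pick (k1, k2) with k1 + k2 + 1 == n, as balanced as possible."""
    if n < 3:
        raise ValueError("a split is only defined for n >= 3")
    k1 = (n - 1) // 2
    k2 = n - 1 - k1
    return k1, k2


def decompose(n):
    """Return the decomposition tree of an n-edge path query.

    Leaves are the integers 1 or 2 (sub-queries answered directly);
    inner nodes are tuples (n, left_subtree, right_subtree).
    """
    if n <= 2:
        return n
    k1, k2 = split_edges(n)
    return (n, decompose(k1), decompose(k2))


tree_for_5 = decompose(5)
tree_for_7 = decompose(7)
```

Every leaf of the resulting tree is a sub-query with one or two edges, which matches the observation that any n greater than or equal to three reduces to those two base cases.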

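Therefore, the optimized path query method of the jump connection described in steps S1 to S3 provides a new path query scheme that can effectively reduce the intermediate results in the subgraph matching process. That is, for a query statement Pⁿ(a) whose path has n edges with edge label a, two corresponding sub-query statements P^(k₁)(a) and P^(k₂)(a) with k₁ + k₂ + 1 = n are first obtained; the target data graph is then queried according to the two sub-query statements to obtain the corresponding sub-query matching results; and finally a jump-type connection edge expansion is performed on the edge matching results in the two sub-query matching results to obtain the final edge matching result corresponding to the query statement Pⁿ(a). This effectively reduces the pressure of generating and storing intermediate results, accelerates the computation, improves query performance, and facilitates practical application and popularization.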

On the basis of the technical solution of the first aspect, this embodiment further provides a possible first design that refines the sub-query process. That is, if the number of path edges of the first sub-query statement P^(k₁)(a) or the second sub-query statement P^(k₂)(a) is a positive integer greater than or equal to three, querying the target data graph according to that sub-query statement to obtain the corresponding sub-query matching result includes, but is not limited to, the following steps S100 to S300.

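S100. For a sub-query statement P^K(a), obtain a corresponding first grandchild query statement P^(k₁₁)(a) and second grandchild query statement P^(k₂₂)(a), wherein the sub-query statement P^K(a) is the first sub-query statement P^(k₁)(a) or the second sub-query statement P^(k₂)(a); K, k₁₁ and k₂₂ respectively denote the numbers of path edges of the sub-query statement, the first grandchild query statement and the second grandchild query statement, with k₁₁ + k₂₂ + 1 = K; the initial query points of the first grandchild query statement and the second grandchild query statement each have only one outgoing edge; their end query points each have only one incoming edge; and their other query points each have only one pair of incoming and outgoing edges.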

S200. Query the target data graph according to the first grandchild query statement and the second grandchild query statement to obtain a corresponding first grandchild query matching result and second grandchild query matching result, wherein the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges.

S300. Select one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, construct a second point set from the starting points of all edge matching results in it, and perform the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, look up all of its corresponding extension points in the second point set, and whenever an extension point is found, connect by expansion the edge matching result corresponding to that end point and the edge matching result corresponding to that extension point, so as to obtain the edge matching result corresponding to the sub-query statement.

The details of steps S100 to S300 can be derived by referring to steps S1 to S3 and are not repeated here. In addition, the query process for a grandchild query statement can be refined in the same way by referring to steps S100 to S300, recursing downward until the query results of a first sub-query statement P¹(a) with a path edge number of 1 and/or a second sub-query statement P²(a) with a path edge number of 2 are obtained.
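
The recursive refinement can be sketched as follows. This is an illustrative Python sketch under assumptions, not the method as claimed: the function name `match_path`, the balanced split, the per-call cache, and the ordinary two-hop join used for the two-edge base case are all choices made for the example, every edge is assumed to carry label a, and vertices may repeat within a path.

```python
from collections import defaultdict


def match_path(n, out_neighbors, cache=None):
    """Return all n-edge paths (vertex tuples) using recursive jump joins."""
    if cache is None:
        cache = {}
    if n in cache:
        return cache[n]

    if n == 1:
        # Base case: every edge is a match.
        result = [(s, t) for s, targets in out_neighbors.items() for t in targets]
    elif n == 2:
        # Base case: ordinary two-hop join.
        result = [(s, m, t)
                  for s, mids in out_neighbors.items()
                  for m in mids
                  for t in out_neighbors.get(m, ())]
    else:
        # Split so that k1 + k2 + 1 == n, then recurse on both halves.
        k1 = (n - 1) // 2
        k2 = n - 1 - k1
        left = match_path(k1, out_neighbors, cache)
        right = match_path(k2, out_neighbors, cache)

        # Point set built from the starting points of the right-hand paths.
        by_start = defaultdict(list)
        for path in right:
            by_start[path[0]].append(path)

        # Jump join: expand each end point through one connecting edge.
        result = [p + q
                  for p in left
                  for ext in out_neighbors.get(p[-1], ())
                  for q in by_start.get(ext, ())]

    cache[n] = result
    return result


chain = {i: [i + 1] for i in range(6)}   # 0 -> 1 -> ... -> 6
five_edge_paths = match_path(5, chain)
```

Because the cache is keyed by edge count, identical sub-queries (for example the two two-edge halves of a five-edge query) are evaluated only once, in line with the single-traversal observation made for isomorphic sub-queries.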

On the basis of the first aspect or the possible first design, this embodiment further provides a possible second design for speeding up the query process. That is, when |k₁ − k₂| = 1, querying the target data graph according to the first sub-query statement P^(k₁)(a) and the second sub-query statement P^(k₂)(a) to obtain the corresponding first sub-query matching result and second sub-query matching result includes: querying the target data graph in parallel according to the first sub-query statement and the second sub-query statement to obtain the first sub-query matching result corresponding to the first sub-query statement and the second sub-query matching result corresponding to the second sub-query statement. Since |k₁ − k₂| = 1 means that the two corresponding sub-query processes are independent of each other, querying in parallel quickly yields the edge matching results of the two sub-queries, so that the subsequent jump connection can be completed quickly and the final edge matching result obtained. In fig. 5, GJ (general join) denotes an ordinary join and Jump denotes the jump connection described herein. For each query statement shown there, the edge matching results corresponding to its first sub-query statement and second sub-query statement can be obtained in a first round of parallel queries; and so on. Thus, when n is even, it is preferable to choose k₁ and k₂ with |k₁ − k₂| = 1 so that the two sub-query statements can be executed in parallel.
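
The parallel evaluation of two independent sub-queries followed by a jump connection might look like the sketch below. It is only an illustration under assumptions: the helper `paths`, the function `parallel_jump_query`, and the use of Python's `concurrent.futures.ThreadPoolExecutor` are choices made for the example and are not taken from the source; every edge is assumed to carry label a.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def paths(k, out_neighbors):
    """All k-edge paths, built by plain edge-by-edge extension (hypothetical helper)."""
    result = [(s, t) for s, targets in out_neighbors.items() for t in targets]
    for _ in range(k - 1):
        result = [p + (t,) for p in result for t in out_neighbors.get(p[-1], ())]
    return result


def parallel_jump_query(k1, k2, out_neighbors):
    """Evaluate a (k1 + k2 + 1)-edge query: two sub-queries in parallel, then a jump join."""
    # The two sub-queries do not depend on each other, so submit both at once.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_left = pool.submit(paths, k1, out_neighbors)
        future_right = pool.submit(paths, k2, out_neighbors)
        left, right = future_left.result(), future_right.result()

    # Point set from the starting points of the right-hand matches.
    by_start = defaultdict(list)
    for q in right:
        by_start[q[0]].append(q)

    # Jump join across one connecting edge.
    return [p + q
            for p in left
            for ext in out_neighbors.get(p[-1], ())
            for q in by_start.get(ext, ())]


chain = {i: [i + 1] for i in range(5)}        # 0 -> 1 -> ... -> 5
four_edge_paths = parallel_jump_query(1, 2, chain)
```

In CPython, threads do not execute Python bytecode simultaneously, so this sketch shows the independence of the two sub-queries rather than a real speed-up; a process pool or a native query engine would be the natural place for actual parallelism.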

As shown in fig. 6, a second aspect of this embodiment provides a virtual device for implementing the optimized path query method of the first aspect or any possible design of the first aspect, comprising a query statement acquisition module, a query statement execution module and a jump connection module which are communicatively connected in sequence;

the query statement acquisition module is configured to obtain, for a query statement Pⁿ(a) whose path has n edges with edge label a, a corresponding first sub-query statement P^(k₁)(a) and second sub-query statement P^(k₂)(a), wherein n, k₁ and k₂ respectively denote the numbers of path edges of the query statement, the first sub-query statement and the second sub-query statement, k₁ + k₂ + 1 = n, the initial query points of the query statement, the first sub-query statement and the second sub-query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their other query points each have only one pair of incoming and outgoing edges;

the query statement execution module is configured to query the target data graph according to the first sub-query statement and the second sub-query statement to obtain a corresponding first sub-query matching result and second sub-query matching result;

the jump connection module is configured to select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in it, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, look up all of its corresponding extension points in the first point set, and whenever an extension point is found, connect by expansion the edge matching result corresponding to that end point and the edge matching result corresponding to that extension point, so as to obtain the edge matching result corresponding to the query statement Pⁿ(a).

For the working process, working details and technical effects of the foregoing device provided in the second aspect of this embodiment, reference may be made to the optimized path query method of the first aspect or any possible design of the first aspect; they are not repeated here.

As shown in fig. 7, a third aspect of this embodiment provides a computer device for executing the optimized path query method of the first aspect or any possible design of the first aspect. The computer device includes a memory, a processor and a transceiver which are communicatively connected in sequence, where the memory is used to store a computer program, the transceiver is used to transmit and receive information, and the processor is used to read the computer program and execute the optimized path query method of the first aspect or any possible design of the first aspect. For example, the memory may include, but is not limited to, a random-access memory (RAM), a read-only memory (ROM), a flash memory, a first-in first-out (FIFO) memory and/or a first-in last-out (FILO) memory; the processor may be, but is not limited to, a microprocessor of the STM32F105 series. In addition, the computer device may also include, but is not limited to, a power module, a display screen and other necessary components.

For the working process, working details and technical effects of the foregoing computer device provided in the third aspect of this embodiment, reference may be made to the optimized path query method of the first aspect or any possible design of the first aspect; they are not repeated here.

A fourth aspect of this embodiment provides a computer-readable storage medium storing instructions for the optimized path query method of the first aspect or any possible design of the first aspect; that is, the computer-readable storage medium has instructions stored thereon which, when run on a computer, cause the optimized path query method of the first aspect or any possible design of the first aspect to be executed. The computer-readable storage medium is a carrier for storing data and may include, but is not limited to, a floppy disk, an optical disk, a hard disk, a flash memory, a USB flash drive and/or a Memory Stick; the computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.

For a working process, working details, and technical effects of the foregoing computer-readable storage medium provided in the fourth aspect of this embodiment, reference may be made to the first aspect or any possible design of the optimized path query method in the first aspect, which is not described herein again.

A fifth aspect of the present embodiments provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the optimized path query method as set forth in the first aspect or any one of the possible designs of the first aspect. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable devices.

Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and anyone may derive products in various other forms in light of the present invention. The above detailed description should not be construed as limiting the scope of protection of the invention, which is defined by the claims; the description may be used to interpret the claims.

Claims

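1. A jump connection optimized path query method, characterized by comprising the following steps:

for a query statement Pⁿ(a) whose path has n edges with edge label a, obtaining a corresponding first sub-query statement P^(k₁)(a) and second sub-query statement P^(k₂)(a), wherein n, k₁ and k₂ respectively denote the numbers of path edges of the query statement, the first sub-query statement and the second sub-query statement, k₁ + k₂ + 1 = n, the initial query points of the query statement, the first sub-query statement and the second sub-query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their other query points each have only one pair of incoming and outgoing edges;

querying the target data graph according to the first sub-query statement P^(k₁)(a) and the second sub-query statement P^(k₂)(a) to obtain a corresponding first sub-query matching result and second sub-query matching result;

selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in it, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, looking up all of its corresponding extension points in the first point set, and whenever an extension point is found, connecting by expansion the edge matching result corresponding to that end point and the edge matching result corresponding to that extension point, so as to obtain the edge matching result corresponding to the query statement Pⁿ(a).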

2. The optimized path query method of claim 1, wherein when k₁ = k₂, the first sub-query statement

And the second sub-query statement

3. The optimized path query method of claim 1, wherein when k₁ ≠ k₂, the first sub-query statement

And the second sub-query statement

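4. The optimized path query method of claim 1, wherein if the number of path edges of the first sub-query statement P^(k₁)(a) or the second sub-query statement P^(k₂)(a) is a positive integer greater than or equal to three, querying the target data graph according to that sub-query statement to obtain the corresponding sub-query matching result comprises:

for a sub-query statement P^K(a), obtaining a corresponding first grandchild query statement P^(k₁₁)(a) and second grandchild query statement P^(k₂₂)(a), wherein the sub-query statement P^K(a) is the first sub-query statement P^(k₁)(a) or the second sub-query statement P^(k₂)(a); K, k₁₁ and k₂₂ respectively denote the numbers of path edges of the sub-query statement, the first grandchild query statement and the second grandchild query statement, with k₁₁ + k₂₂ + 1 = K; the initial query points of the first grandchild query statement and the second grandchild query statement each have only one outgoing edge; their end query points each have only one incoming edge; and their other query points each have only one pair of incoming and outgoing edges;

querying the target data graph according to the first grandchild query statement and the second grandchild query statement to obtain a corresponding first grandchild query matching result and second grandchild query matching result, wherein the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, its end point has only one incoming edge, and each of its other points has only one pair of incoming and outgoing edges;

selecting one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, constructing a second point set from the starting points of all edge matching results in it, and performing the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, looking up all of its corresponding extension points in the second point set, and whenever an extension point is found, connecting by expansion the edge matching result corresponding to that end point and the edge matching result corresponding to that extension point, so as to obtain the edge matching result corresponding to the sub-query statement P^K(a).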

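5. The optimized path query method of claim 1, wherein when |k₁ − k₂| = 1, querying the target data graph according to the first sub-query statement P^(k₁)(a) and the second sub-query statement P^(k₂)(a) to obtain the corresponding first sub-query matching result and second sub-query matching result comprises: querying the target data graph in parallel according to the first sub-query statement and the second sub-query statement to obtain the first sub-query matching result corresponding to the first sub-query statement and the second sub-query matching result corresponding to the second sub-query statement.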

6. The optimized path query method as claimed in claim 1, wherein when n is odd, k₁ = k₂ = (n − 1)/2.

7. The optimized path query method as claimed in claim 1, wherein when n is an even number, making n equal to n

8. A jump connection optimized path query device, characterized by comprising a query statement acquisition module, a query statement execution module and a jump connection module which are communicatively connected in sequence;

the query statement acquisition module is configured to obtain, for a query statement Pⁿ(a) whose path has n edges with edge label a, a corresponding first sub-query statement P^(k₁)(a) and second sub-query statement P^(k₂)(a), wherein n, k₁ and k₂ respectively denote the numbers of path edges of the query statement, the first sub-query statement and the second sub-query statement, k₁ + k₂ + 1 = n, the initial query points of the query statement, the first sub-query statement and the second sub-query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their other query points each have only one pair of incoming and outgoing edges;

the query statement execution module is configured to query the target data graph according to the first sub-query statement and the second sub-query statement to obtain a corresponding first sub-query matching result and second sub-query matching result;

the jump connection module is configured to select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in it, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, look up all of its corresponding extension points in the first point set, and whenever an extension point is found, connect by expansion the edge matching result corresponding to that end point and the edge matching result corresponding to that extension point, so as to obtain the edge matching result corresponding to the query statement Pⁿ(a).

9. A computer device comprising a memory, a processor and a transceiver, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving information, and the processor is used for reading the computer program and executing the optimized path query method according to any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon instructions for performing the optimized path query method of any one of claims 1-7 when the instructions are run on a computer.