CN114372165A - Optimized path query method, device, equipment and storage medium for jump connection - Google Patents

Publication number
CN114372165A
Authority
CN
China
Prior art keywords
query
sub
matching result
edge
query statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210055214.3A
Other languages
Chinese (zh)
Inventor
李艳
彭鹏
李文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tupu Technology Co ltd
Original Assignee
Beijing Tupu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tupu Technology Co ltd filed Critical Beijing Tupu Technology Co ltd
Priority to CN202210055214.3A priority Critical patent/CN114372165A/en
Publication of CN114372165A publication Critical patent/CN114372165A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/532 Query formulation, e.g. graphical querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of subgraph matching and discloses a jump-connection optimized path query method, device, equipment and storage medium. For a query statement [formula] of a path whose number of edges is n and whose edge label is [formula], two corresponding sub-query statements [formula] and [formula] are first obtained, with $k_1 + k_2 + 1 = n$. The target data graph is then queried according to the two sub-query statements to obtain the corresponding sub-query matching results. Finally, jump-connection edge expansion is performed on the edge matching results in the two sub-query matching results to obtain the final edge matching results corresponding to the query statement [formula]. This effectively reduces the pressure of generating and storing intermediate results, accelerates computation, improves query performance, ensures optimality under the fastest condition, and facilitates practical application and popularization.

Description

Optimized path query method, device, equipment and storage medium for jump connection
Technical Field
The invention belongs to the technical field of subgraph matching, and particularly relates to a jump-connection optimized path query method, device, equipment and storage medium.
Background
Currently, many graph systems have been proposed for efficiently storing data graphs and processing query graphs. They mainly include RDF (Resource Description Framework, a data model for describing web resources) graph systems, such as Jena, Virtuoso, RDF4J and gStore, and property graph systems, such as Neo4j, Graphflow and EmptyHeaded. Some of these systems' query languages support variable-length path queries, most commonly SPARQL and Cypher.
In a graph system, querying is an important basic operation, and all query operations can be generalized to subgraph matching, that is, finding all embeddings in the data graph G that are isomorphic to the query graph q. Subgraph matching has been widely studied in academia, and because of its importance various algorithms have been proposed. In the database field, subgraph matching algorithms can be divided into two types, one of which is the join type. Existing join strategies fall roughly into three categories (join strategies outside these three categories exist but are not described here because they are irrelevant to this application).
(1) The first join strategy is the pair-wise join (hereinafter abbreviated as PJ), a two-column table join strategy commonly used in databases (not described further because it is irrelevant to this application). It is used, for example, in the property graph system Neo4j.
(2) The second join strategy is the binary join, which computes subgraph matches by solving a series of binary joins. It first decomposes the original query graph into a set of join units, then joins the matches of these units according to a predefined join order to obtain the final result. Binary join algorithms differ only in their join units and join order; typical algorithms are StarJoin, TwinTwigJoin and CliqueJoin. StarJoin decomposes the query graph using a star as the join unit: it first locates a vertex cover of the query graph, each covered vertex together with its not-yet-used neighbors forms a star, and the resulting set of star join units is joined in a left-deep join order to obtain all results. StarJoin has a major disadvantage, however: enumerating k-stars at a vertex of degree d costs $O(d^k)$, so if the degree is very large a "star explosion" occurs and produces a very large table. TwinTwigJoin optimizes StarJoin. Its join unit is the "TwinTwig", a star with at most two edges, so it constrains the star structure, but it follows the same left-deep join order as StarJoin. TwinTwigJoin mitigates the "star explosion" to some extent, but it still has disadvantages: its execution time is long, and the left-deep join is a suboptimal join plan. CliqueJoin, proposed in 2016, addresses these problems. First, it adopts a triangle-based partitioning strategy for the data, on the basis of which CliqueJoin can use both "cliques" and "stars" as join units, and the use of cliques greatly shortens execution time. Second, CliqueJoin joins the decomposed join units with a bushy join plan.
(3) The third join strategy is the worst-case optimal join (WCOJ), a recent technique for join operations in databases. Given a set of tables $\{R_1, R_2, \ldots, R_n\}$, the upper bound on the number of results of a multi-table join over them can be determined from the fractional edge cover of the hypergraph corresponding to the join. A hypergraph is a graph in which each edge may have any number of endpoints, whereas each edge of an ordinary graph has exactly two endpoints. A hypergraph is therefore a generalization of the ordinary graph, so properties that hold on hypergraphs also hold on ordinary graphs. A fractional edge cover of a hypergraph $H$ is a function $f: E \to \mathbb{R}^{+}$ such that, for any $v \in V(H)$, $\sum_{e \in E:\, v \in e} f(e) \ge 1$. For any fractional edge cover $f$, the sum $\sum_{e \in E} f(e)$ is called the weight of $f$, denoted $w$. The smallest weight over all fractional edge covers is called the fractional edge cover number, denoted $w^{*}$, and the fractional edge cover that attains it is denoted $f^{*}$. Any multi-table join over $\{R_1, R_2, \ldots, R_n\}$ corresponds to a hypergraph $H$: each table corresponds to an edge of $H$, and each attribute of a table corresponds to a vertex of $H$. The number of results $|OUT|$ of a multi-table join over $\{R_1, R_2, \ldots, R_n\}$ then satisfies:
$$|OUT| \le \prod_{e \in E} |R_e|^{f(e)}$$
where $R_e$ is the table corresponding to edge $e$ of the hypergraph $H$, $|R_e|$ is the size of $R_e$, and $f$ is a fractional edge cover. In analysis it is often assumed that every table has size $N$, so the number of results $|OUT|$ of a multi-table join over $\{R_1, R_2, \ldots, R_n\}$ has the upper bound
$$|OUT| \le N^{w^{*}}$$
where $w^{*}$ is the fractional edge cover number of the hypergraph $H$. A subgraph matching query can also be regarded as a special case of a multi-table join: each edge is a table with only two columns, and a subgraph with multiple edges corresponds to a join of these two-column tables. A typical example is the triangle query. Assuming that each query edge matches $N$ edges of the data graph, the number of results $|OUT|$ is bounded by $N^{1.5}$, because the fractional edge cover number of a triangle is 1.5 and the corresponding fractional edge cover $f^{*}$ assigns 0.5 to every edge. This conclusion is significant because a conventional multi-table join, which decomposes the join into a combination of two-table joins, produces $N^{3}$ results in the process.
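The triangle example can be checked numerically. The following is a minimal sketch, assuming hypothetical table names `R1`, `R2`, `R3` over attributes `x`, `y`, `z` and an arbitrary table size `N`; none of these names come from the patent. It verifies that assigning 0.5 to every edge is a valid fractional edge cover and evaluates the bound $\prod_{e} |R_e|^{f(e)}$.

```python
from fractions import Fraction


def is_fractional_edge_cover(vertices, hyperedges, f):
    """Return True if every vertex is covered with total weight >= 1."""
    return all(
        sum(f[name] for name, attrs in hyperedges.items() if v in attrs) >= 1
        for v in vertices
    )


def size_bound(sizes, f):
    """Evaluate the product of |R_e| ** f(e) over all hyperedges e."""
    bound = 1.0
    for name, size in sizes.items():
        bound *= size ** float(f[name])
    return bound


# Triangle query: three two-column tables, one per query edge (hypothetical names).
vertices = {"x", "y", "z"}
hyperedges = {"R1": {"x", "y"}, "R2": {"y", "z"}, "R3": {"z", "x"}}

# Candidate cover from the text: weight 0.5 on every edge.
f_star = {name: Fraction(1, 2) for name in hyperedges}
weight = sum(f_star.values())

N = 10_000  # assumed common table size
sizes = {name: N for name in hyperedges}
bound = size_bound(sizes, f_star)

print(is_fractional_edge_cover(vertices, hyperedges, f_star), weight, bound)
```

The sketch only checks that this particular cover is valid and has weight 3/2; it does not prove that no lighter cover exists.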
However, current query methods suffer from excessive intermediate results during query processing, which causes heavy storage pressure and degraded processing performance. Fig. 1 shows an extreme case: the left side is a data graph G, and the right side is a path query $P_3(a)$, based on a query graph q, with 3 path edges and edge label a. The existing WCOJ algorithm extends the joined sub-query results through a query vertex adjacent to at least one vertex in those results and performs an intersection. That is, it first enumerates the 1000001 results of the path query $P_2(a)$ (2 path edges, edge label a) and then extends the last point to obtain the final 1000 results. As shown in Fig. 2, most of the results of $P_2(a)$ are not used by $P_3(a)$ in this process, which incurs huge memory consumption and computation time.
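The blow-up of intermediate results can be reproduced on a small scale. The sketch below uses a hypothetical graph (not necessarily the graph of Fig. 1): `m` vertices point to a hub `h`, `h` points to `m` vertices `w0 … w(m-1)`, and only `w0` has a further edge to `z`. All edges are assumed to carry the same label a, so labels are omitted. Expanding one edge at a time yields `m*m + 1` two-edge paths but only `m` three-edge paths; with `m = 1000` these are the counts 1000001 and 1000 cited above.

```python
from collections import defaultdict


def build_graph(m):
    """Hypothetical graph: u_i -> h -> w_j for all i, j < m, plus w0 -> z."""
    adj = defaultdict(set)
    for i in range(m):
        adj[f"u{i}"].add("h")
        adj["h"].add(f"w{i}")
    adj["w0"].add("z")
    return adj


def extend_by_one_edge(paths, adj):
    """Extend every path at its last vertex by one outgoing edge."""
    return [p + (v,) for p in paths for v in adj.get(p[-1], ())]


def path_query(adj, n):
    """Enumerate all n-edge paths by repeated one-edge expansion."""
    paths = [(u, v) for u in list(adj) for v in adj[u]]
    for _ in range(n - 1):
        paths = extend_by_one_edge(paths, adj)
    return paths


m = 100
adj = build_graph(m)
p2 = path_query(adj, 2)          # intermediate results
p3 = extend_by_one_edge(p2, adj)  # final results
print(len(p2), len(p3))
```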
Disclosure of Invention
In order to solve the problems that the storage pressure is too large and the processing performance is reduced due to too many intermediate results in the conventional path query mode in subgraph matching, the invention aims to provide a jump-connection optimized path query method, a jump-connection optimized path query device, computer equipment and a computer readable storage medium, which can effectively reduce the generation and storage pressure of the intermediate results, accelerate the calculation, improve the query performance, ensure the optimality under the fastest condition and facilitate the practical application and popularization.
In a first aspect, the present invention provides a method for querying an optimized path of a jump connection, including:
for a query statement [formula] of a path whose number of edges is n and whose edge label is [formula], obtaining a corresponding first sub-query statement [formula] and a second sub-query statement [formula], wherein n is a positive integer greater than or equal to three, $k_1$ is a positive integer representing the number of path edges of the first sub-query statement, $k_2$ is a positive integer representing the number of path edges of the second sub-query statement, $k_1 + k_2 + 1 = n$, the starting query points of the query statement, the first sub-query statement and the second sub-query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their remaining query points each have only one pair of incoming and outgoing edges;
querying the target data graph according to the first sub-query statement to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of the edge matching result has only one outgoing edge, the end point of the edge matching result has only one incoming edge, and every other point of the edge matching result has only one pair of incoming and outgoing edges;
selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in the selected sub-query matching result, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing the first point set to search for all corresponding extension points, and, when an extension point is found, connecting the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point by extension, to obtain an edge matching result corresponding to the query statement [formula].
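The third step can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes that the data graph is an adjacency dictionary `adj` in which every edge carries the same label, that a matching result is a tuple of vertices, and that vertices may repeat within a path. The names `path_query` and `jump_connect` are hypothetical. The larger of the two sub-results is indexed by starting point (the "first point set"), and each end point of the other sub-result is extended by one edge into that index.

```python
from collections import defaultdict


def path_query(adj, k):
    """Enumerate all k-edge paths by plain one-edge-at-a-time expansion."""
    paths = [(u, v) for u in adj for v in adj[u]]
    for _ in range(k - 1):
        paths = [p + (v,) for p in paths for v in adj.get(p[-1], ())]
    return paths


def jump_connect(first, second, adj):
    """Join two sub-query results across one connecting edge."""
    # Select the sub-result whose starting points form the point set.
    selected, other = (first, second) if len(first) >= len(second) else (second, first)
    by_start = defaultdict(list)
    for q in selected:
        by_start[q[0]].append(q)

    results = []
    for p in other:
        for x in adj.get(p[-1], ()):       # neighbors of the end point
            for q in by_start.get(x, ()):  # x is an extension point
                results.append(p + q)      # p, connecting edge, q
    return results


adj = {1: {2}, 2: {3}, 3: {4, 5}, 4: {5}}
p1 = path_query(adj, 1)                          # k1 = k2 = 1
p3 = jump_connect(p1, p1, adj)                   # n = 1 + 1 + 1 = 3
p4 = jump_connect(path_query(adj, 2), p1, adj)   # n = 2 + 1 + 1 = 4
print(sorted(p3), p4)
```

Because the one connecting edge is never materialized as its own intermediate table, only the two shorter sub-results need to be stored before the final results are produced.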
Based on the above, a new path query scheme is provided that effectively reduces intermediate results in the subgraph matching process. For a query statement [formula] of a path whose number of edges is n and whose edge label is [formula], two corresponding sub-query statements [formula] and [formula] are first obtained, with $k_1 + k_2 + 1 = n$. The target data graph is then queried according to the two sub-query statements to obtain the corresponding sub-query matching results. Finally, jump-connection edge expansion is performed on the edge matching results in the two sub-query matching results to obtain the final edge matching results corresponding to the query statement [formula]. This effectively reduces the pressure of generating and storing intermediate results, accelerates computation, improves query performance, ensures optimality under the fastest condition, and facilitates practical application and popularization.
In one possible design, when $k_1 = k_2$, the first sub-query statement [formula] and the second sub-query statement [formula] are the same sub-query statement, and the first sub-query matching result and the second sub-query matching result are the same sub-query matching result;
selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result then comprises: randomly selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result.
In one possible design, when $k_1 \neq k_2$, the first sub-query statement [formula] and the second sub-query statement [formula] are different sub-query statements, and the first sub-query matching result and the second sub-query matching result are different sub-query matching results;
selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result then comprises: selecting, from the first sub-query matching result and the second sub-query matching result, the sub-query matching result with the larger total number of edge matching results.
In one possible design, if the number of path edges of the first sub-query statement [formula] or the second sub-query statement [formula] is a positive integer greater than or equal to three, querying the target data graph according to the corresponding sub-query statement to obtain the corresponding sub-query matching result includes:
for the sub-query statement [formula], obtaining a corresponding first grandchild query statement [formula] and a second grandchild query statement [formula], wherein the sub-query statement [formula] is the first sub-query statement or the second sub-query statement, k is $k_1$ or $k_2$, $k_{11}$ is a positive integer representing the number of path edges of the first grandchild query statement, $k_{22}$ is a positive integer representing the number of path edges of the second grandchild query statement, $k_{11} + k_{22} + 1 = k$, the starting query points of the first grandchild query statement and the second grandchild query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their remaining query points each have only one pair of incoming and outgoing edges;
querying the target data graph according to the first grandchild query statement to obtain a first grandchild query matching result, and querying the target data graph according to the second grandchild query statement to obtain a second grandchild query matching result, wherein the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of the edge matching result has only one outgoing edge, the end point of the edge matching result has only one incoming edge, and every other point of the edge matching result has only one pair of incoming and outgoing edges;
selecting one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, constructing a second point set from the starting points of all edge matching results in the selected grandchild query matching result, and performing the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, traversing the second point set to search for all corresponding extension points, and, when an extension point is found, connecting the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point by extension, to obtain an edge matching result corresponding to the sub-query statement [formula].
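Applying the same decomposition to the sub-queries gives a recursive procedure. The sketch below is one possible reading under stated assumptions: a single edge label, an adjacency dictionary `adj`, paths in which vertices may repeat, and a balanced split in which the two parts plus the one connecting edge add up to `k`. The balanced split and the function names are illustrative choices, not taken from the patent. Queries with fewer than three edges cannot be split into two positive-length parts plus a connecting edge, so they fall back to plain expansion.

```python
from collections import defaultdict


def naive_path_query(adj, k):
    """Baseline: enumerate k-edge paths one edge at a time."""
    paths = [(u, v) for u in adj for v in adj[u]]
    for _ in range(k - 1):
        paths = [p + (v,) for p in paths for v in adj.get(p[-1], ())]
    return paths


def jump_join(left, right, adj):
    """Connect each path in `left` to each path in `right` via one edge."""
    by_start = defaultdict(list)
    for q in right:
        by_start[q[0]].append(q)
    return [
        p + q
        for p in left
        for x in adj.get(p[-1], ())
        for q in by_start.get(x, ())
    ]


def query_path(adj, k):
    """Recursive jump-connection query for a k-edge path."""
    if k < 3:
        return naive_path_query(adj, k)
    k11 = (k - 1) // 2       # assumed balanced split
    k22 = k - 1 - k11        # so that k11 + k22 + 1 == k
    left = query_path(adj, k11)
    # Equal lengths mean the same sub-query, so its result is reused.
    right = left if k11 == k22 else query_path(adj, k22)
    return jump_join(left, right, adj)


adj = {0: {1, 2}, 1: {2}, 2: {0, 3}, 3: {1}}
print(len(query_path(adj, 5)), len(naive_path_query(adj, 5)))
```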
In one possible design, when $|k_1 - k_2| = 1$, querying the target data graph according to the first sub-query statement [formula] to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement [formula] to obtain a second sub-query matching result, comprises:
querying the target data graph in parallel according to the first sub-query statement [formula] and the second sub-query statement [formula] to obtain the first sub-query matching result corresponding to the first sub-query statement and the second sub-query matching result corresponding to the second sub-query statement, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of the edge matching result has only one outgoing edge, the end point of the edge matching result has only one incoming edge, and every other point of the edge matching result has only one pair of incoming and outgoing edges.
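Because the two sub-queries are independent of each other, they can be submitted to a worker pool and collected before the jump connection. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor` purely to illustrate the control flow. The function names and the single-label adjacency dictionary are assumptions, and in CPython threads will not speed up this CPU-bound enumeration, so a real system would use its own parallel execution mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def path_query(adj, k):
    """Enumerate all k-edge paths by one-edge-at-a-time expansion."""
    paths = [(u, v) for u in adj for v in adj[u]]
    for _ in range(k - 1):
        paths = [p + (v,) for p in paths for v in adj.get(p[-1], ())]
    return paths


def run_subqueries_in_parallel(adj, k1, k2):
    """Evaluate the two sub-queries concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(path_query, adj, k1)
        second = pool.submit(path_query, adj, k2)
        return first.result(), second.result()


adj = {0: {1, 2}, 1: {2}, 2: {0, 3}, 3: {1}}
# n = 4 splits into k1 = 2 and k2 = 1, so the sub-queries differ by one edge.
r1, r2 = run_subqueries_in_parallel(adj, 2, 1)
print(len(r1), len(r2))
```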
In one possible design, when n is an odd number: [formula].
In one possible design, when n is an even number: [formula].
In a second aspect, the invention provides a jump-type connection optimized path query device, which comprises a query statement acquisition module, a query statement execution module and a jump-type connection module, wherein the query statement acquisition module, the query statement execution module and the jump-type connection module are sequentially in communication connection;
the query statement acquisition module is used for, for a query statement [formula] of a path whose number of edges is n and whose edge label is [formula], obtaining a corresponding first sub-query statement [formula] and a second sub-query statement [formula], wherein n is a positive integer greater than or equal to three, $k_1$ is a positive integer representing the number of path edges of the first sub-query statement, $k_2$ is a positive integer representing the number of path edges of the second sub-query statement, $k_1 + k_2 + 1 = n$, the starting query points of the query statement, the first sub-query statement and the second sub-query statement each have only one outgoing edge, their end query points each have only one incoming edge, and their remaining query points each have only one pair of incoming and outgoing edges;
the query statement execution module is used for querying the target data graph according to the first sub-query statement to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of the edge matching result has only one outgoing edge, the end point of the edge matching result has only one incoming edge, and every other point of the edge matching result has only one pair of incoming and outgoing edges;
the jump connection module is used for selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in the selected sub-query matching result, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing the first point set to search for all corresponding extension points, and, when an extension point is found, connecting the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point by extension, to obtain an edge matching result corresponding to the query statement [formula].
In a third aspect, the present invention provides a computer device, comprising a memory, a processor and a transceiver, which are sequentially connected in communication, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving information, and the processor is used for reading the computer program and executing the optimized path query method according to the first aspect or any possible design of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when executed on a computer, perform the optimized path query method as described in the first aspect or any of the possible designs of the first aspect.
In a fifth aspect, the present invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the optimized path query method as described in the first aspect or any possible design of the first aspect above.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is an example diagram of an extreme data graph G and a path query with a path edge number of 3.
Fig. 2 is a diagram illustrating an exemplary implementation of a path query for the extreme case shown in fig. 1 by using the WCOJ algorithm.
Fig. 3 is a schematic flow chart of the optimized path query method for jump connection according to the present invention.
Fig. 4 is a diagram illustrating an exemplary execution process of performing a query on the extreme case shown in fig. 1 by using an optimized path query method.
FIG. 5 is an exemplary diagram of parallel execution of optimized path queries provided by the present invention.
Fig. 6 is a schematic structural diagram of the optimized path query device for jump connection provided by the present invention.
Fig. 7 is a schematic structural diagram of a computer device provided by the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Specific structural and functional details disclosed herein are merely representative of exemplary embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various objects, these objects should not be limited by these terms. These terms are only used to distinguish one object from another. For example, a first object may be referred to as a second object, and similarly, a second object may be referred to as a first object, without departing from the scope of example embodiments of the present invention.
It should be understood that the term "and/or", where it appears herein, merely describes an association between associated objects and means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, B exists alone, or A and B exist at the same time. The term "/and", where it appears herein, describes another association and means that two relationships may exist; for example, "A/and B" may mean: A exists alone, or A and B exist at the same time. In addition, the character "/", where it appears herein, generally means that the associated objects before and after it are in an "or" relationship.
As shown in Figs. 3 to 4, the jump-connection optimized path query method provided in the first aspect of this embodiment may be executed by, but is not limited to, a computer device that has certain computing resources and performs subgraph matching, for example an electronic device such as a personal computer (PC, a multipurpose computer whose size, price and performance suit personal use; desktop computers, notebook computers, small notebook computers, tablet computers, ultrabooks and the like are all personal computers), a smartphone, a personal digital assistant (PDA), or a wearable device, so as to facilitate subgraph matching. As shown in Fig. 3, the jump-connection optimized path query method may include, but is not limited to, the following steps S1 to S3.
S1. For a path with n edges and edge labels
Figure BDA0003476205550000081
take the query statement
Figure BDA0003476205550000082
and obtain a corresponding first sub-query statement
Figure BDA0003476205550000083
and a corresponding second sub-query statement
Figure BDA0003476205550000084
where n is a positive integer greater than or equal to three, k1 is a positive integer denoting the number of path edges of the first sub-query statement
Figure BDA0003476205550000085
k2 is a positive integer denoting the number of path edges of the second sub-query statement
Figure BDA0003476205550000086
and k1 + k2 + 1 = n. The initial query points of the query statement
Figure BDA0003476205550000087
the first sub-query statement
Figure BDA0003476205550000088
and the second sub-query statement
Figure BDA0003476205550000089
each have only one outgoing edge; the final query points of the query statement
Figure BDA00034762055500000810
the first sub-query statement
Figure BDA00034762055500000811
and the second sub-query statement
Figure BDA00034762055500000812
each have only one incoming edge; and each of the remaining query points of the query statement
Figure BDA00034762055500000813
the first sub-query statement
Figure BDA00034762055500000814
and the second sub-query statement
Figure BDA00034762055500000815
has exactly one incoming edge and one outgoing edge.
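The decomposition in step S1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the formulas for the query and sub-query statements are figures that are not reproduced in this text, so the sketch assumes the query is given as an ordered list of edge labels, that the first sub-query takes the first k1 labels, that the second sub-query takes the last k2 labels, and that the single label left between them is the edge crossed later by the jump connection (which is why k1 + k2 + 1 = n). The function name `split_path_query` is hypothetical.

```python
from typing import List, Tuple


def split_path_query(labels: List[str], k1: int) -> Tuple[List[str], str, List[str]]:
    """Split a linear path query into (first sub-query, bridge label, second sub-query).

    Assumption: `labels` lists the n edge labels of the path in order.
    """
    n = len(labels)
    if n < 3:
        raise ValueError("the method requires n >= 3")
    k2 = n - 1 - k1  # from k1 + k2 + 1 = n
    if k1 < 1 or k2 < 1:
        raise ValueError("k1 and k2 must both be positive integers")
    first = labels[:k1]        # k1 edges
    bridge = labels[k1]        # the one edge skipped by the jump connection
    second = labels[k1 + 1:]   # k2 edges
    return first, bridge, second
```

For a three-edge query with a single label a, `split_path_query(["a", "a", "a"], 1)` yields two identical one-edge sub-queries, which matches the case described for fig. 4.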
In step S1, the query statement
Figure BDA00034762055500000816
is the path query statement given for subgraph matching. Because its initial query point has only one outgoing edge, its final query point has only one incoming edge, and each remaining query point has exactly one incoming edge and one outgoing edge, connecting all query points along the edges of the query yields a linear path (this is a constraint of the present optimized path query). The first sub-query statement
Figure BDA00034762055500000817
and the second sub-query statement
Figure BDA00034762055500000818
are new path query statements derived from the query statement
Figure BDA00034762055500000819
Connecting all of their query points along the edges of each sub-query likewise yields a linear path, so their query results can be used to obtain the query result of the query statement
Figure BDA00034762055500000820
Furthermore, the query statement
Figure BDA00034762055500000821
the first sub-query statement
Figure BDA00034762055500000822
and the second sub-query statement
Figure BDA00034762055500000823
may be, but are not limited to, SPARQL queries or Cypher queries, which can be obtained in a conventional manner.
S2. Query the target data graph according to the first sub-query statement
Figure BDA0003476205550000091
to obtain a first sub-query matching result, and query the target data graph according to the second sub-query statement
Figure BDA0003476205550000092
to obtain a second sub-query matching result, where the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge.
In step S2, once the target data graph has been stored in a database (the database may be created by, but is not limited to, running the gbuild command of gStore with a given database name and the path of the RDF data file, in NT format, that holds the target data graph), a conventional query can be run against the named database according to the first sub-query statement
Figure BDA0003476205550000093
and the second sub-query statement
Figure BDA0003476205550000094
to obtain the corresponding at least one edge matching result.
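As a stand-in for the "conventional query" of step S2, the sketch below evaluates a short linear path query over an in-memory edge list. It is only an illustration: the patent runs the sub-queries against a database such as gStore, whereas this code assumes the target data graph is a list of hypothetical `(source, label, target)` triples and represents each edge matching result as a tuple of vertices. The name `match_path` is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]   # (source vertex, edge label, target vertex)
Path = Tuple[str, ...]          # vertices of one edge matching result, in order


def match_path(triples: List[Triple], labels: List[str]) -> List[Path]:
    """Return every vertex sequence whose consecutive edges carry `labels` in order."""
    out: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for s, label, t in triples:
        out[(s, label)].append(t)
    # Seed with every edge that carries the first label.
    paths: List[Path] = [(s, t) for s, label, t in triples if label == labels[0]]
    # Extend one edge at a time for the remaining labels.
    for label in labels[1:]:
        paths = [p + (t,) for p in paths for t in out.get((p[-1], label), [])]
    return paths
```

Extending one edge at a time like this is exactly the strategy that produces large intermediate results on long paths, so in the scheme described here it is only meant for the short sub-queries.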
S3. Select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in the selected sub-query matching result, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traverse all of its corresponding extension points and look each one up in the first point set; when an extension point is found in the first point set, connect the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point, thereby obtaining, for the query statement
Figure BDA0003476205550000095
the corresponding edge matching result.
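The edge expansion of step S3 can be sketched as a hash-based join. The sketch makes several assumptions that the text does not spell out: edge matching results are vertex tuples, the "extension points" of an end point are its out-neighbours along the one edge label left between the two sub-queries, the point set is built from the second sub-query's results, and expansion runs over the first sub-query's results. All names (`jump_join`, `bridge_label`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Path = Tuple[str, ...]


def jump_join(first_paths: List[Path], second_paths: List[Path],
              triples: List[Triple], bridge_label: str) -> List[Path]:
    """Connect prefix matches to suffix matches across one bridging edge."""
    # Point set as a hash table: start vertex -> suffix matches starting there.
    by_start: Dict[str, List[Path]] = defaultdict(list)
    for q in second_paths:
        by_start[q[0]].append(q)
    # Extension points: out-neighbours along the bridging label.
    succ: Dict[str, List[str]] = defaultdict(list)
    for s, label, t in triples:
        if label == bridge_label:
            succ[s].append(t)
    results: List[Path] = []
    for p in first_paths:
        for x in succ.get(p[-1], []):        # traverse extension points of the end point
            for q in by_start.get(x, []):    # keep only those present in the point set
                results.append(p + q)
    return results
```

Because the lookup in `by_start` is a hash probe, no two-edge intermediate result is ever materialised; only prefix matches that actually reach a suffix match produce output.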
In step S3, specifically, when k1 = k2, because the first sub-query statement
Figure BDA0003476205550000096
and the second sub-query statement
Figure BDA0003476205550000097
are the same sub-query statement, the first sub-query matching result and the second sub-query matching result are the same sub-query matching result (i.e., the queries are isomorphic), so either one may be selected arbitrarily. When k1 ≠ k2, because the first sub-query statement
Figure BDA0003476205550000098
and the second sub-query statement
Figure BDA0003476205550000099
are different sub-query statements, the first sub-query matching result and the second sub-query matching result are different sub-query matching results (i.e., the queries are heterogeneous); to reduce the computing resources needed for the subsequent end-point expansion and connection, the sub-query matching result containing the larger total number of edge matching results is preferably selected. Further, the first point set may take, but is not limited to, the form of a hash table.
In step S3, take the extreme case shown in fig. 1 as an example. As shown in fig. 4, for a given query statement P3(a), two identical sub-query statements P1(a), which are isomorphic queries, are obtained first. The data graph G is then queried to obtain the identical sub-query matching results, one of which is selected arbitrarily, and a point set {V1, V2, …, V1000, V, V2000} is constructed from the starting points of all edge matching results in it. The end points {V, V1001, V1002, …, V2000, V'} of the edge matching results in the other sub-query matching result then undergo the following edge expansion: for an end point V, all extension points V1001, V1002, …, V2000 corresponding to V are traversed and looked up in the point set {V1, V2, …, V1000, V, V2000}; the extension point V2000 is found, so the edge matching result V100V corresponding to the end point V and the edge matching result V2000V' corresponding to the extension point V2000 are connected to obtain the edge matching result V100VV2000V' corresponding to the query statement P3(a). Such a jump connection (which may be named Jumping Join) both reduces intermediate results and accelerates the computation. In addition, because the two sub-query matching results are identical, all of them can be obtained with a single traversal of the data graph G; therefore, when n is odd, it is preferable to set
Figure BDA0003476205550000101
so as to further reduce intermediate results and speed up the computation.
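The saving in the extreme case can be made concrete with a small count. The sketch below rebuilds a graph of the shape implied by the point sets in the example (m vertices pointing to V, V pointing to m further vertices, and the last of those pointing to V'); since fig. 1 itself is not reproduced in this text, that shape is an assumption, as are all function names. It compares the two-edge intermediate result of an ordinary left-to-right expansion with the one-edge results that the jump connection works from.

```python
from collections import defaultdict
from typing import List, Tuple

Edge = Tuple[str, str]  # every edge carries the same label "a" in this example


def extreme_graph(m: int) -> List[Edge]:
    """Assumed shape of the extreme case: V1..Vm -> V -> V(m+1)..V(2m), V(2m) -> V'."""
    edges = [(f"V{i}", "V") for i in range(1, m + 1)]
    edges += [("V", f"V{j}") for j in range(m + 1, 2 * m + 1)]
    edges.append((f"V{2 * m}", "V'"))
    return edges


def count_results(m: int):
    edges = extreme_graph(m)
    succ = defaultdict(list)
    for s, t in edges:
        succ[s].append(t)
    p1 = list(edges)                                   # matches of P1(a)
    # Ordinary expansion: P1(a) -> P2(a) -> P3(a), one edge at a time.
    p2 = [p + (t,) for p in p1 for t in succ.get(p[-1], [])]
    p3_plain = [p + (t,) for p in p2 for t in succ.get(p[-1], [])]
    # Jump connection: join P1(a) with P1(a) across one skipped edge.
    by_start = defaultdict(list)
    for q in p1:
        by_start[q[0]].append(q)
    p3_jump = [p + q for p in p1
               for x in succ.get(p[-1], [])
               for q in by_start.get(x, [])]
    same = sorted(p3_plain) == sorted(p3_jump)
    return len(p1), len(p2), len(p3_plain), len(p3_jump), same
```

Under these assumptions the one-edge result has 2m + 1 entries and the two-edge intermediate result has m·m + 1 entries, while both strategies return the same m final matches; with m = 1000 that is 2001 entries against 1,000,001.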
As the above example shows, for a query statement with 3 path edges
Figure BDA0003476205550000102
the corresponding final result can be obtained from the query result of a sub-query statement with 1 path edge
Figure BDA0003476205550000103
By analogy, for a query statement with 4 path edges
Figure BDA0003476205550000104
the corresponding final result can be obtained from the query results of a first sub-query statement with 1 path edge
Figure BDA0003476205550000105
and a second sub-query statement with 2 path edges
Figure BDA0003476205550000106
For a query statement with 5 path edges
Figure BDA0003476205550000107
the corresponding final result can be obtained from the query results of a first sub-query statement with 1 path edge
Figure BDA0003476205550000108
and a second sub-query statement with 3 path edges
Figure BDA0003476205550000109
(the latter query result can itself be obtained from the query result of a sub-query statement with 1 path edge
Figure BDA00034762055500001010
), or alternatively from the query result of a sub-query statement with 2 path edges
Figure BDA00034762055500001011
And so on: for any path edge number n greater than or equal to three, the query results of a first sub-query statement with 1 path edge
Figure BDA00034762055500001012
and/or a second sub-query statement with 2 path edges
Figure BDA00034762055500001013
can be used to obtain, for the corresponding query statement
Figure BDA00034762055500001014
the final result.
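The splits listed above follow directly from the constraint k1 + k2 + 1 = n with k1 and k2 positive. A minimal sketch that enumerates them (the function name `valid_splits` is hypothetical, and pairs are listed with k1 ≤ k2 only, since swapping the two sub-queries gives the mirrored split):

```python
from typing import List, Tuple


def valid_splits(n: int) -> List[Tuple[int, int]]:
    """All (k1, k2) with k1 + k2 + 1 == n, k1 >= 1, k2 >= 1 and k1 <= k2."""
    if n < 3:
        raise ValueError("the method requires n >= 3")
    return [(k1, n - 1 - k1) for k1 in range(1, (n - 1) // 2 + 1)]
```

For n = 3, 4 and 5 this gives [(1, 1)], [(1, 2)] and [(1, 3), (2, 2)], which are the combinations walked through in the paragraph above.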
Therefore, the optimized path query method for jump connection described in steps S1 to S3 provides a new path query scheme that effectively reduces intermediate results during subgraph matching. That is, for a path with n edges and edge labels
Figure BDA00034762055500001015
and its query statement
Figure BDA00034762055500001016
two corresponding sub-query statements are first obtained:
Figure BDA00034762055500001017
and
Figure BDA00034762055500001018
with k1 + k2 + 1 = n; the corresponding sub-query matching results are then obtained from the target data graph according to the two sub-query statements; and finally the edge matching results in the two sub-query matching results are expanded and connected by jump connection to obtain, for the query statement
Figure BDA00034762055500001019
the corresponding final edge matching result. This effectively relieves the pressure of generating and storing intermediate results, accelerates computation, improves query performance, ensures optimality in the fastest case, and facilitates practical application and popularization.
On the basis of the technical solution of the first aspect, this embodiment further provides a first possible design that refines the sub-query process. That is, if the number of path edges of the first sub-query statement
Figure BDA0003476205550000111
or of the second sub-query statement
Figure BDA0003476205550000112
is a positive integer greater than or equal to three, obtaining the corresponding sub-query matching result from the target data graph according to that sub-query statement includes, but is not limited to, the following steps S100 to S300.
S100. For a sub-query statement
Figure BDA0003476205550000113
obtain a corresponding first grandchild query statement
Figure BDA0003476205550000114
and a corresponding second grandchild query statement
Figure BDA0003476205550000115
where the sub-query statement
Figure BDA0003476205550000116
is the first sub-query statement
Figure BDA0003476205550000117
or the second sub-query statement
Figure BDA0003476205550000118
k is k1 or k2, k11 is a positive integer denoting the number of path edges of the first grandchild query statement
Figure BDA0003476205550000119
k22 is a positive integer denoting the number of path edges of the second grandchild query statement
Figure BDA00034762055500001110
and k11 + k22 + 1 = k. The initial query points of the first grandchild query statement
Figure BDA00034762055500001111
and the second grandchild query statement
Figure BDA00034762055500001112
each have only one outgoing edge; the final query points of the first grandchild query statement
Figure BDA00034762055500001113
and the second grandchild query statement
Figure BDA00034762055500001114
each have only one incoming edge; and each of the remaining query points of the first grandchild query statement
Figure BDA00034762055500001115
and the second grandchild query statement
Figure BDA00034762055500001116
has exactly one incoming edge and one outgoing edge.
S200. Query the target data graph according to the first grandchild query statement
Figure BDA00034762055500001117
to obtain a first grandchild query matching result, and query the target data graph according to the second grandchild query statement
Figure BDA00034762055500001118
to obtain a second grandchild query matching result, where the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge.
S300. Select one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, construct a second point set from the starting points of all edge matching results in the selected grandchild query matching result, and perform the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, traverse all of its corresponding extension points and look each one up in the second point set; when an extension point is found in the second point set, connect the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point, thereby obtaining, for the sub-query statement
Figure BDA00034762055500001119
the corresponding edge matching result.
The details of steps S100 to S300 follow from steps S1 to S3 and are not repeated here. In addition, the query process for a grandchild query statement can be refined in the same way by referring to steps S100 to S300, descending level by level until the query results of a first sub-query statement with 1 path edge
Figure BDA00034762055500001120
and/or a second sub-query statement with 2 path edges
Figure BDA00034762055500001121
are reached.
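The level-by-level refinement can be sketched as a recursion that stops at one-edge and two-edge queries. The sketch is illustrative only and makes its own assumptions: the data graph is a list of hypothetical `(source, label, target)` triples, matches are vertex tuples, and the split point is chosen as k1 = (n - 1) // 2, which gives k1 = k2 for odd n and k2 = k1 + 1 for even n. The patent's own preferred split is given in figures that are not reproduced in this text.

```python
from collections import defaultdict
from typing import List, Tuple

Triple = Tuple[str, str, str]
Path = Tuple[str, ...]


def evaluate(triples: List[Triple], labels: List[str]) -> List[Path]:
    """Evaluate a linear path query by recursive splitting and jump connection."""
    out = defaultdict(list)
    for s, label, t in triples:
        out[(s, label)].append(t)
    n = len(labels)
    if n <= 2:
        # Base case: one-edge or two-edge queries are matched directly.
        paths = [(s, t) for s, label, t in triples if label == labels[0]]
        for label in labels[1:]:
            paths = [p + (t,) for p in paths for t in out.get((p[-1], label), [])]
        return paths
    k1 = (n - 1) // 2                               # assumed split policy
    first = evaluate(triples, labels[:k1])          # k1 edges
    second = evaluate(triples, labels[k1 + 1:])     # k2 = n - 1 - k1 edges
    bridge = labels[k1]                             # the edge skipped by the jump
    by_start = defaultdict(list)                    # point set as a hash table
    for q in second:
        by_start[q[0]].append(q)
    return [p + q for p in first
            for x in out.get((p[-1], bridge), [])
            for q in by_start.get(x, [])]
```

On a simple chain v0 → v1 → … → v7 with a single label, a seven-edge query first resolves two three-edge sub-queries, each of which in turn resolves two one-edge grandchild queries.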
On the basis of the technical solution of the first aspect or of the first possible design, this embodiment further provides a second possible design that speeds up the query process. That is, when |k1 - k2| = 1, querying the target data graph according to the first sub-query statement
Figure BDA0003476205550000121
to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement
Figure BDA0003476205550000122
to obtain a second sub-query matching result, comprises: querying the target data graph in parallel according to the first sub-query statement
Figure BDA0003476205550000123
and the second sub-query statement
Figure BDA0003476205550000124
to obtain the first sub-query matching result corresponding to the first sub-query statement
Figure BDA0003476205550000125
and the second sub-query matching result corresponding to the second sub-query statement
Figure BDA0003476205550000126
where the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge.
Because |k1 - k2| = 1 reflects that the two corresponding sub-query processes are independent of each other, querying in parallel quickly yields the edge matching results of both sub-queries, so the subsequent jump connection can be completed quickly and the final edge matching result obtained. As shown in fig. 5, GJ (general join) denotes an ordinary join and Jump denotes the jump connection described herein. For the query statement
Figure BDA0003476205550000127
the edge matching results corresponding to the first sub-query statement
Figure BDA0003476205550000128
and the second sub-query statement
Figure BDA0003476205550000129
can be obtained first by querying in parallel; for the query statement
Figure BDA00034762055500001210
the edge matching results corresponding to the first sub-query statement
Figure BDA00034762055500001211
and the second sub-query statement
Figure BDA00034762055500001212
can likewise be obtained first by querying in parallel; and so on. Thus, when n is even, it is preferable to execute the sub-query statements in parallel with
Figure BDA00034762055500001213
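Running the two independent sub-queries in parallel can be sketched with Python's standard `concurrent.futures` module. This is an illustration of the control flow only: the matcher below is a simple in-memory stand-in with hypothetical names, and with pure-Python work in threads the interpreter lock limits the actual speed-up; against a real database the two submitted tasks would be two independent query calls.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Triple = Tuple[str, str, str]
Path = Tuple[str, ...]


def match_path(triples: List[Triple], labels: List[str]) -> List[Path]:
    """In-memory stand-in for one sub-query against the target data graph."""
    out = defaultdict(list)
    for s, label, t in triples:
        out[(s, label)].append(t)
    paths = [(s, t) for s, label, t in triples if label == labels[0]]
    for label in labels[1:]:
        paths = [p + (t,) for p in paths for t in out.get((p[-1], label), [])]
    return paths


def run_subqueries_in_parallel(triples: List[Triple],
                               first_labels: List[str],
                               second_labels: List[str]):
    """Submit both sub-queries at once and wait for both matching results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(match_path, triples, first_labels)
        f2 = pool.submit(match_path, triples, second_labels)
        return f1.result(), f2.result()
```

Neither sub-query reads the other's output, which is what makes the two submissions safe to overlap; the jump connection only starts once both results are available.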
As shown in fig. 6, a second aspect of this embodiment provides a virtual device for implementing the optimized path query method of the first aspect or of any possible design of the first aspect, comprising a query statement acquisition module, a query statement execution module, and a jump connection module that are communicatively connected in sequence;
the query statement acquisition module is configured to, for a path with n edges and edge labels
Figure BDA00034762055500001214
take the query statement
Figure BDA00034762055500001215
and obtain a corresponding first sub-query statement
Figure BDA00034762055500001216
and a corresponding second sub-query statement
Figure BDA00034762055500001217
where n is a positive integer greater than or equal to three, k1 is a positive integer denoting the number of path edges of the first sub-query statement
Figure BDA00034762055500001218
k2 is a positive integer denoting the number of path edges of the second sub-query statement
Figure BDA00034762055500001219
and k1 + k2 + 1 = n; the initial query points of the query statement
Figure BDA00034762055500001220
the first sub-query statement
Figure BDA00034762055500001221
and the second sub-query statement
Figure BDA00034762055500001222
each have only one outgoing edge; the final query points of the query statement
Figure BDA00034762055500001223
the first sub-query statement
Figure BDA00034762055500001224
and the second sub-query statement
Figure BDA00034762055500001225
each have only one incoming edge; and each of the remaining query points of the query statement
Figure BDA0003476205550000131
the first sub-query statement
Figure BDA0003476205550000132
and the second sub-query statement
Figure BDA0003476205550000133
has exactly one incoming edge and one outgoing edge;
the query statement execution module is configured to query the target data graph according to the first sub-query statement
Figure BDA0003476205550000134
to obtain a first sub-query matching result, and to query the target data graph according to the second sub-query statement
Figure BDA0003476205550000135
to obtain a second sub-query matching result, where the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge;
the jump connection module is configured to select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in the selected sub-query matching result, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traverse all of its corresponding extension points and look each one up in the first point set; when an extension point is found in the first point set, connect the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point, thereby obtaining, for the query statement
Figure BDA0003476205550000136
the corresponding edge matching result.
For the working process, working details, and technical effects of the device provided in the second aspect of this embodiment, reference may be made to the optimized path query method of the first aspect or of any possible design of the first aspect; they are not repeated here.
As shown in fig. 7, a third aspect of this embodiment provides a computer device for executing the optimized path query method of the first aspect or of any possible design of the first aspect. The computer device comprises a memory, a processor, and a transceiver that are communicatively connected in sequence, where the memory is used to store a computer program, the transceiver is used to send and receive information, and the processor is used to read the computer program and execute the optimized path query method of the first aspect or of any possible design of the first aspect. For example, the memory may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), flash memory, first-in first-out (FIFO) memory, and/or first-in last-out (FILO) memory; the processor may be, but is not limited to, a microprocessor of the STM32F105 series. In addition, the computer device may also include, but is not limited to, a power module, a display screen, and other necessary components.
For the working process, working details, and technical effects of the computer device provided in the third aspect of this embodiment, reference may be made to the optimized path query method of the first aspect or of any possible design of the first aspect; they are not repeated here.
A fourth aspect of this embodiment provides a computer-readable storage medium storing instructions that include the optimized path query method of the first aspect or of any possible design of the first aspect; that is, instructions are stored on the computer-readable storage medium and, when run on a computer, cause the optimized path query method of the first aspect or of any possible design of the first aspect to be executed. The computer-readable storage medium is a carrier for storing data and may include, but is not limited to, a floppy disk, an optical disk, a hard disk, flash memory, a USB flash drive, and/or a Memory Stick; the computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
For the working process, working details, and technical effects of the computer-readable storage medium provided in the fourth aspect of this embodiment, reference may be made to the optimized path query method of the first aspect or of any possible design of the first aspect; they are not repeated here.
A fifth aspect of the present embodiments provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the optimized path query method as set forth in the first aspect or any one of the possible designs of the first aspect. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable devices.
Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and anyone may derive products in various other forms in light of the present invention. The above detailed description should not be taken as limiting the scope of protection of the invention, which is defined by the claims; the description may be used to interpret the claims.

Claims (10)

1. An optimized path query method for jump connection, characterized by comprising the following steps:
for a path with n edges and edge labels
Figure FDA0003476205540000011
taking the query statement
Figure FDA0003476205540000012
and obtaining a corresponding first sub-query statement
Figure FDA0003476205540000013
and a corresponding second sub-query statement
Figure FDA0003476205540000014
wherein n is a positive integer greater than or equal to three, k1 is a positive integer denoting the number of path edges of the first sub-query statement
Figure FDA0003476205540000015
k2 is a positive integer denoting the number of path edges of the second sub-query statement
Figure FDA0003476205540000016
and k1 + k2 + 1 = n; the initial query points of the query statement
Figure FDA0003476205540000017
the first sub-query statement
Figure FDA0003476205540000018
and the second sub-query statement
Figure FDA0003476205540000019
each have only one outgoing edge; the final query points of the query statement
Figure FDA00034762055400000110
the first sub-query statement
Figure FDA00034762055400000111
and the second sub-query statement
Figure FDA00034762055400000112
each have only one incoming edge; and each of the remaining query points of the query statement
Figure FDA00034762055400000113
the first sub-query statement
Figure FDA00034762055400000114
and the second sub-query statement
Figure FDA00034762055400000115
has exactly one incoming edge and one outgoing edge;
querying the target data graph according to the first sub-query statement
Figure FDA00034762055400000116
to obtain a first sub-query matching result, and querying the target data graph according to the second sub-query statement
Figure FDA00034762055400000117
to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge;
selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result, constructing a first point set from the starting points of all edge matching results in the selected sub-query matching result, and performing the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, traversing all of its corresponding extension points and looking each one up in the first point set; when an extension point is found in the first point set, connecting the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point, thereby obtaining, for the query statement
Figure FDA00034762055400000118
the corresponding edge matching result.
2. The optimized path query method of claim 1, wherein when k1 = k2, the first sub-query statement
Figure FDA00034762055400000119
and the second sub-query statement
Figure FDA00034762055400000120
are the same sub-query statement, and the first sub-query matching result and the second sub-query matching result are the same sub-query matching result;
and wherein selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result comprises: arbitrarily selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result.
3. The optimized path query method of claim 1, wherein when k1 ≠ k2, the first sub-query statement
Figure FDA0003476205540000021
and the second sub-query statement
Figure FDA0003476205540000022
are different sub-query statements, and the first sub-query matching result and the second sub-query matching result are different sub-query matching results;
and wherein selecting one sub-query matching result from the first sub-query matching result and the second sub-query matching result comprises: selecting, from the first sub-query matching result and the second sub-query matching result, the sub-query matching result containing the larger total number of edge matching results.
4. The optimized path query method of claim 1, wherein if the number of path edges of the first sub-query statement
Figure FDA0003476205540000023
or of the second sub-query statement
Figure FDA0003476205540000024
is a positive integer greater than or equal to three, obtaining the corresponding sub-query matching result from the target data graph according to that sub-query statement comprises:
for a sub-query statement
Figure FDA0003476205540000025
obtaining a corresponding first grandchild query statement
Figure FDA0003476205540000026
and a corresponding second grandchild query statement
Figure FDA0003476205540000027
wherein the sub-query statement
Figure FDA0003476205540000028
is the first sub-query statement
Figure FDA0003476205540000029
or the second sub-query statement
Figure FDA00034762055400000210
k is k1 or k2, k11 is a positive integer denoting the number of path edges of the first grandchild query statement
Figure FDA00034762055400000211
k22 is a positive integer denoting the number of path edges of the second grandchild query statement
Figure FDA00034762055400000212
and k11 + k22 + 1 = k; the initial query points of the first grandchild query statement
Figure FDA00034762055400000213
and the second grandchild query statement
Figure FDA00034762055400000214
each have only one outgoing edge; the final query points of the first grandchild query statement
Figure FDA00034762055400000215
and the second grandchild query statement
Figure FDA00034762055400000216
each have only one incoming edge; and each of the remaining query points of the first grandchild query statement
Figure FDA00034762055400000217
and the second grandchild query statement
Figure FDA00034762055400000218
has exactly one incoming edge and one outgoing edge;
querying the target data graph according to the first grandchild query statement
Figure FDA00034762055400000219
to obtain a first grandchild query matching result, and querying the target data graph according to the second grandchild query statement
Figure FDA00034762055400000220
to obtain a second grandchild query matching result, wherein the first grandchild query matching result and the second grandchild query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge;
selecting one grandchild query matching result from the first grandchild query matching result and the second grandchild query matching result, constructing a second point set from the starting points of all edge matching results in the selected grandchild query matching result, and performing the following edge expansion on the end point of each edge matching result in the other grandchild query matching result: for a given end point, traversing all of its corresponding extension points and looking each one up in the second point set; when an extension point is found in the second point set, connecting the edge matching result corresponding to that end point with the edge matching result corresponding to that extension point, thereby obtaining, for the sub-query statement
Figure FDA00034762055400000221
the corresponding edge matching result.
5. The optimized path query method of claim 1, wherein when | k1-k2When 1, according to the first sub-query statement
Figure FDA0003476205540000031
querying the target data graph to obtain a first sub-query matching result, and, according to the second sub-query statement
Figure FDA0003476205540000032
querying the target data graph to obtain a second sub-query matching result, wherein said querying comprises:
according to the first sub-query statement
Figure FDA0003476205540000033
and the second sub-query statement
Figure FDA0003476205540000034
querying the target data graph in parallel to obtain the first sub-query matching result corresponding to the first sub-query statement
Figure FDA0003476205540000035
and the second sub-query matching result corresponding to the second sub-query statement
Figure FDA0003476205540000036
wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge.
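Running the two sub-queries side by side can be sketched with a thread pool. This is only an illustration under stated assumptions: `match_path` and `match_both` are hypothetical names, the data graph is modelled as a dictionary from `(node, label)` to successor lists, and a sub-query is modelled as a list of edge labels. In CPython, threads show the structure but do not speed up pure-Python work; a real engine would use its own parallel executor.

```python
from concurrent.futures import ThreadPoolExecutor


def match_path(out_edges, labels):
    """Enumerate node tuples whose consecutive edges carry `labels` in order."""
    # Seed with every node that has an outgoing edge with the first label.
    paths = [(src,) for (src, label) in out_edges if label == labels[0]]
    for label in labels:
        # Extend every partial path by one edge with the current label.
        paths = [
            path + (nxt,)
            for path in paths
            for nxt in out_edges.get((path[-1], label), ())
        ]
    return paths


def match_both(out_edges, labels_1, labels_2):
    """Evaluate the two sub-queries concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(match_path, out_edges, labels_1)
        second = pool.submit(match_path, out_edges, labels_2)
        return first.result(), second.result()


graph = {("a", "x"): ["b"], ("b", "y"): ["c"], ("c", "z"): ["d"]}
first_result, second_result = match_both(graph, ["x"], ["z"])
```

When the two sub-paths have nearly equal length, the two tasks do a comparable amount of work, so neither worker sits idle for long while the other finishes.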
6. The optimized path query method as claimed in claim 1, wherein, when n is odd, the following is set:
Figure FDA0003476205540000037
7. The optimized path query method as claimed in claim 1, wherein, when n is even, the following is set:
Figure FDA0003476205540000038
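The formulas of claims 6 and 7 did not survive extraction, so the following is not a reconstruction of them. It is a sketch of one balanced split that satisfies the stated constraint k1 + k2 + 1 = n, with the two sub-path lengths differing by at most one. The function name `split_query` and the choice of which side receives the extra edge when n is even are assumptions.

```python
def split_query(labels):
    """Split an n-edge label sequence into (first part, connecting label, second part).

    Assumed balanced choice: k1 = n // 2 and k2 = n - 1 - k1, so that
    k1 + k2 + 1 == n and the two parts differ in length by at most one.
    """
    n = len(labels)
    if n < 3:
        raise ValueError("the query path needs at least three edges")
    k1 = n // 2        # edges in the first sub-query
    k2 = n - 1 - k1    # edges in the second sub-query
    first = labels[:k1]
    middle = labels[k1]          # the single edge skipped by both sub-queries
    second = labels[k1 + 1:]
    assert len(second) == k2
    return first, middle, second


odd_split = split_query(["a", "b", "c", "d", "e"])
even_split = split_query(["a", "b", "c", "d"])
```

For n = 5 this gives k1 = k2 = 2, and for n = 4 it gives k1 = 2 and k2 = 1. In both cases exactly one edge is left between the two sub-paths for the join step to bridge.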
8. An optimized path query device for jump connection, comprising a query statement acquisition module, a query statement execution module and a jump connection module that are communicatively connected in sequence;
the query statement acquisition module is configured to, for a query statement whose path has n edges with edge labels
Figure FDA0003476205540000039
the query statement being
Figure FDA00034762055400000310
obtain a corresponding first sub-query statement
Figure FDA00034762055400000311
and a corresponding second sub-query statement
Figure FDA00034762055400000312
wherein n is a positive integer greater than or equal to three, k1 is a positive integer representing the number of path edges of the first sub-query statement
Figure FDA00034762055400000313
k2 is a positive integer representing the number of path edges of the second sub-query statement
Figure FDA00034762055400000314
and k1 + k2 + 1 = n; the initial query points of the query statement
Figure FDA00034762055400000315
the first sub-query statement
Figure FDA00034762055400000316
and the second sub-query statement
Figure FDA00034762055400000317
each have only one outgoing edge; the terminating query points of the query statement
Figure FDA00034762055400000318
the first sub-query statement
Figure FDA00034762055400000319
and the second sub-query statement
Figure FDA00034762055400000320
each have only one incoming edge; and the remaining query points of the query statement
Figure FDA00034762055400000321
the first sub-query statement
Figure FDA00034762055400000322
and the second sub-query statement
Figure FDA00034762055400000323
each have exactly one incoming edge and one outgoing edge;
the query statement execution module is configured to, according to the first sub-query statement
Figure FDA00034762055400000324
query the target data graph to obtain a first sub-query matching result, and, according to the second sub-query statement
Figure FDA00034762055400000325
query the target data graph to obtain a second sub-query matching result, wherein the first sub-query matching result and the second sub-query matching result each comprise at least one edge matching result, the starting point of each edge matching result has only one outgoing edge, the end point of each edge matching result has only one incoming edge, and every other point of each edge matching result has exactly one incoming edge and one outgoing edge;
the jump connection module is configured to select one sub-query matching result from the first sub-query matching result and the second sub-query matching result, construct a first point set from the starting points of all edge matching results in the selected sub-query matching result, and perform the following edge expansion on the end point of each edge matching result in the other sub-query matching result: for a given end point, search the first point set for all corresponding extension points, and, whenever an extension point is found, connect the edge matching result of that end point with the edge matching result of that extension point by extension, to obtain, for the query statement
Figure FDA0003476205540000041
the corresponding edge matching result.
9. A computer device comprising a memory, a processor and a transceiver, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving information, and the processor is used for reading the computer program and executing the optimized path query method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon instructions for performing the optimized path query method of any one of claims 1-7 when the instructions are run on a computer.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210055214.3A CN114372165A (en) 2022-01-18 2022-01-18 Optimized path query method, device, equipment and storage medium for jump connection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210055214.3A CN114372165A (en) 2022-01-18 2022-01-18 Optimized path query method, device, equipment and storage medium for jump connection

Publications (1)

Publication Number Publication Date
CN114372165A true CN114372165A (en) 2022-04-19

Family

ID=81143890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210055214.3A Withdrawn CN114372165A (en) 2022-01-18 2022-01-18 Optimized path query method, device, equipment and storage medium for jump connection

Country Status (1)

Country Link
CN (1) CN114372165A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114526753A (en) * 2022-04-24 2022-05-24 深圳依时货拉拉科技有限公司 Cross-road intersection rule association method and device, computer equipment and readable storage medium
CN114943004A (en) * 2022-07-26 2022-08-26 浙江大华技术股份有限公司 Attribute graph query method, attribute graph query device, and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114526753A (en) * 2022-04-24 2022-05-24 深圳依时货拉拉科技有限公司 Cross-road intersection rule association method and device, computer equipment and readable storage medium
CN114943004A (en) * 2022-07-26 2022-08-26 浙江大华技术股份有限公司 Attribute graph query method, attribute graph query device, and storage medium
CN114943004B (en) * 2022-07-26 2022-10-28 浙江大华技术股份有限公司 Attribute graph query method, attribute graph query device, and storage medium

Similar Documents

Publication Publication Date Title
US11341419B2 (en) Method of and system for generating a prediction model and determining an accuracy of a prediction model
CN110990638B (en) Large-scale data query acceleration device and method based on FPGA-CPU heterogeneous environment
US8326825B2 (en) Automated partitioning in parallel database systems
US20170083573A1 (en) Multi-query optimization
US9547728B2 (en) Graph traversal operator and extensible framework inside a column store
US9934324B2 (en) Index structure to accelerate graph traversal
CN110134714B (en) Distributed computing framework cache index method suitable for big data iterative computation
CN114372165A (en) Optimized path query method, device, equipment and storage medium for jump connection
US9218394B2 (en) Reading rows from memory prior to reading rows from secondary storage
CN106528648B (en) In conjunction with the distributed RDF keyword proximity search method of Redis memory database
CN111666468A (en) Method for searching personalized influence community in social network based on cluster attributes
US10372736B2 (en) Generating and implementing local search engines over large databases
CN104933143A (en) Method and device for acquiring recommended object
US20200104425A1 (en) Techniques for lossless and lossy large-scale graph summarization
CN111414527B (en) Query method, device and storage medium for similar items
CN116383247A (en) Large-scale graph data efficient query method
Ding et al. A learned spatial textual index for efficient keyword queries
CN114817512A (en) Question-answer reasoning method and device
Du et al. A novel knn join algorithms based on hilbert r-tree in mapreduce
US20170031909A1 (en) Locality-sensitive hashing for algebraic expressions
CN113868138A (en) Method, system, equipment and storage medium for acquiring test data
CN113190718A (en) Data processing method and device for graph database, electronic equipment and storage medium
CN112148830A (en) Semantic data storage and retrieval method and device based on maximum area grid
Zhong et al. 3SEPIAS: A semi-structured search engine for personal information in dataspace system
Song et al. Discussions on subgraph ranking for keyworded search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220419