CN114116785A - Distributed SPARQL query optimization method based on minimum attribute cut - Google Patents
- Publication number
- CN114116785A (application CN202111451035.3A)
- Authority
- CN
- China
- Prior art keywords
- attribute
- graph
- partition
- distributed
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a distributed SPARQL query optimization method based on minimum attribute cut, which belongs to the field of distributed systems and comprises the following steps: (1) reading an original RDF data graph and storing its edge attributes in a set L; (2) calculating the weakly connected components and the corresponding cost of each edge attribute; (3) selecting as many internal attributes as possible to obtain a coarsened graph of the data graph; (4) partitioning the coarsened graph by vertex and uncoarsening the result to obtain the final partitions; (5) decomposing the SPARQL query into a set of independently executable sub-queries; and (6) executing the decomposed sub-queries in parallel in each partition to obtain the matching results. The invention expands the types of queries that can be executed independently in a distributed RDF system, reduces joins between partitions, reduces data communication time, and improves query efficiency.
Description
Technical Field
The present invention relates to the field of distributed systems, and more particularly to data partitioning and query processing for distributed RDF systems.
Background
RDF (Resource Description Framework) is a data model standardized by the W3C. It represents the attributes and relationships of web resources as triples of the form <subject, predicate, object>, and is currently applied in fields such as knowledge graphs and social network analysis. The RDF data model is flexible in representation: it can be represented as a table in a relational database or as a graph. When RDF is represented as a graph, each triple corresponds to a directed edge from the subject to the object; the subject and the object are the two vertices of the edge, and the predicate is the label on the edge. For querying RDF, the W3C also proposed the standard query language SPARQL (SPARQL Protocol and RDF Query Language). Like RDF data, a SPARQL query can be represented as a graph. Edges in the query graph are called triple patterns, and the subject, predicate, and object of a triple pattern can each be a variable or a constant. Because both SPARQL queries and RDF data can be represented as graphs, answering a SPARQL query can be transformed into a subgraph matching problem.
With the rapid development of the Internet, RDF data sets keep growing in scale, and traditional single-machine systems cannot process massive RDF data effectively, which has led to distributed RDF systems. In a distributed system, data partitioning is one of the most basic steps. Specifically, the RDF data graph G is divided into a group of subgraphs $\{F_1, F_2, \ldots, F_k\}$; each subgraph is called a partition, and the partitions are distributed across different machines. Currently, distributed RDF systems mostly partition data by vertex, that is, each vertex is assigned to one partition; hash partitioning is a common example. In this type of approach, some edges may be "cut" between partitions, i.e., the two vertices of an edge are assigned to different partitions. To preserve graph integrity, these cut edges are stored in both partitions, which is called one-hop replication. An edge is called an internal edge if its two vertices are in the same partition; otherwise it is called a crossing edge.
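The vertex-based partitioning described above can be illustrated with a minimal Python sketch. The function name `hash_partition` and the toy byte-sum hash are assumptions made for illustration only; the source does not specify a hash function. The sketch assigns each vertex to a partition, stores every edge in the partition of its subject, and replicates cut edges into the partition of the object (one-hop replication).

```python
from collections import defaultdict

def hash_partition(triples, k):
    """Toy vertex hash partitioning with one-hop replication (illustrative only)."""
    def part(vertex):
        # Hypothetical stable hash: sum of UTF-8 bytes modulo k.
        return sum(vertex.encode("utf-8")) % k

    partitions = defaultdict(list)  # partition id -> edges stored there
    crossing_edges = []
    for s, p, o in triples:
        ps, po = part(s), part(o)
        partitions[ps].append((s, p, o))
        if ps != po:
            # The edge is cut: replicate it so both partitions keep it.
            partitions[po].append((s, p, o))
            crossing_edges.append((s, p, o))
    return dict(partitions), crossing_edges
```

Every edge not listed in `crossing_edges` is an internal edge in the sense defined above.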
Like edges, the matches of a query can be divided into two types: internal matches, whose results are contained within a single partition, and crossing matches, whose results span multiple partitions. When a query has only internal matches, it only needs to be executed independently in each partition. For a query with crossing matches, most existing methods decompose the query into a set of star queries, execute the star queries independently in each partition, and finally perform inter-partition joins to obtain the final result. However, inter-partition joins involve data communication and extra computational overhead, and have a large impact on query performance. Moreover, under conventional vertex-based partitioning, only star queries can be executed independently, which is a severe limitation; processing a general query usually requires distributed joins, so query efficiency is low.
Disclosure of Invention
Existing distributed RDF systems judge whether a query can be executed independently only from the structure of the query graph, and consider a query independently executable only when its query graph is a star. By also considering the attributes of the edges in the graph data, the present invention extends the types of queries that can be executed independently beyond star queries. The first objective of the present invention is to provide a graph data partitioning method based on minimum attribute cut, which reduces the number of crossing attributes, thereby avoiding inter-partition joins and reducing data communication time. The second objective is to provide a query decomposition method that decomposes an original query that cannot be executed independently into a set of sub-queries that can, thereby making full use of the minimum attribute cut partitioning and improving query efficiency.
The invention provides a distributed SPARQL query optimization method based on minimum attribute cut, which comprises the following steps:
Step S1: reading an original RDF data graph G, and storing its edge attributes in a set L;
Step S2: calculating the weakly connected components and the corresponding cost of each edge attribute;
Step S3: selecting as many internal attributes as possible to obtain a coarsened graph of the data graph;
When static graph data is processed, the number of edge attributes is fixed, and each attribute is either an internal attribute or a crossing attribute. Therefore, a heuristic greedy algorithm is used to select as many internal attributes as possible, which minimizes the number of crossing attributes, i.e., achieves the minimum attribute cut. After the internal attributes are selected, each weakly connected component formed by the internal-attribute edges is treated as a super point, yielding the coarsened graph of the data graph.
Step S4: partitioning the coarsened graph by vertex, and uncoarsening the result to obtain the final partitions;
When the coarsened graph is partitioned by vertex, any vertex-based partitioning algorithm, such as hash partitioning or METIS, may be used. However, the number of vertices in each partition must not exceed $(1+\varepsilon)\times|V|/k$, so as to achieve load balance between partitions, where $\varepsilon$ is the user-defined maximum imbalance ratio and k is the number of partitions.
Step S5: decomposing the SPARQL query into a set of independently executable sub-queries;
The original SPARQL query is decomposed according to the crossing attributes obtained in step S3. The resulting sub-queries can be executed independently within the partitions, and their shape is not limited to star queries.
Step S6: executing the decomposed sub-queries in parallel in each partition to obtain the matching results.
By adopting the invention, the following technical effects can be achieved:
the invention provides a distributed SPARQL query optimization method based on minimum attribute segmentation. The present invention then decomposes queries that cannot be executed independently into a set of sub-queries that can be executed independently. Different from the traditional method, the sub-queries which can be independently executed are not limited to star queries, so that the number of invalid intermediate matching results can be further reduced, and the filtering effect is improved.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic diagram illustrating a process of coarsening a data graph according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a query decomposition process according to an embodiment of the present invention.
Detailed Description
The following further description of embodiments of the present invention is provided in conjunction with the accompanying drawings so that those skilled in the art can more easily understand the present invention. It should be noted that the embodiment described below is only one embodiment of the present invention, and not all embodiments. Other embodiments, which can be derived by those skilled in the art from the embodiments of the present invention without making any creative effort, are within the protection scope of the present invention.
For convenience of description and understanding, the symbols and concepts related to the embodiments of the invention are explained as follows:
G: the RDF data graph.
L: the set of edge attributes in the RDF data graph.
q(v): the sub-query graph to which query vertex v belongs.
G[L′]: the subgraph induced by the attribute set L′, i.e., the subgraph formed by the edges whose attributes are in L′.
DS(L′): the disjoint set (union-find) structure corresponding to the attribute set L′.
Internal edge: an edge is an internal edge if both of its vertices are within the same partition.
Crossing edge: an edge is a crossing edge if its two vertices are in two different partitions.
Internal attribute: an attribute is an internal attribute if none of its edges is a crossing edge.
Crossing attribute: an attribute is a crossing attribute if at least one of its edges is a crossing edge.
Independently executable query: a SPARQL query Q is independently executable over a partitioning $F = \{F_1, F_2, \ldots, F_k\}$ of the RDF graph G if matching Q requires no inter-partition joins.
Minimum attribute cut partitioning: given an RDF data graph G and a positive integer k, a minimum attribute cut partitioning of G is a partitioning $F = \{F_1, F_2, \ldots, F_k\}$ that satisfies: (1) the number of crossing attributes $|L_{cross}|$ is minimized; (2) the number of vertices in each partition does not exceed $(1+\varepsilon)\times|V|/k$, where $\varepsilon$ is the user-defined maximum imbalance ratio and k is the number of partitions.
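To make the definitions above concrete, the following Python sketch classifies attributes as internal or crossing for a given vertex-to-partition assignment and checks the balance condition. The names `classify_attributes` and `is_balanced`, and the representation of the assignment as a dictionary, are illustrative assumptions rather than part of the described method.

```python
from collections import Counter

def classify_attributes(triples, assignment):
    """Return (internal, crossing) attribute sets for a vertex -> partition map."""
    # An attribute is crossing as soon as one of its edges is a crossing edge.
    crossing = {p for s, p, o in triples if assignment[s] != assignment[o]}
    internal = {p for _, p, _ in triples} - crossing
    return internal, crossing

def is_balanced(assignment, k, eps):
    """Check that no partition holds more than (1 + eps) * |V| / k vertices."""
    limit = (1 + eps) * len(assignment) / k
    return all(n <= limit for n in Counter(assignment.values()).values())
```

With this view, a minimum attribute cut partitioning is a balanced assignment for which the `crossing` set is as small as possible.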
The invention provides a distributed SPARQL query optimization method based on minimum attribute segmentation, the flow of which is shown in figure 1 and comprises the following steps:
S1: reading an original RDF data graph G, and storing its edge attributes in a set L;
S2: traversing the set L, and calculating, for each edge attribute p, the weakly connected components $WCC(G[\{p\}])$ and the corresponding cost $Cost(\{p\})$;
Different methods may be used to calculate the weakly connected components. In this embodiment, a disjoint set (union-find) data structure is used to speed up the calculation. Step S2 specifically includes:
S2.1: traversing each attribute p in the set L, and executing steps S2.2-S2.4 for p;
S2.2: initializing a disjoint set $DS(\{p\})$ for attribute p. In the disjoint set, each vertex u belongs to a tree and carries three values: $u(\{p\}).parent$, $u(\{p\}).rank$, and $u(\{p\}).size$. Here $u(\{p\}).parent$ is the parent of u in $DS(\{p\})$, initially u itself; $u(\{p\}).rank$ is the height value of u used when merging trees, initially 0; and $u(\{p\}).size$ is the number of vertices in the tree rooted at u, initially 1;
S2.3: for each edge from u to u′ in the RDF graph whose attribute is p, merging the trees containing u and u′ in $DS(\{p\})$, i.e., merging the weakly connected components containing u and u′. During merging, the root of the tree with the smaller rank is made to point to the root of the tree with the larger rank. After all edges with attribute p have been processed, two vertices that are in the same weakly connected component of the induced subgraph $G[\{p\}]$ are also in the same tree in $DS(\{p\})$;
S2.4: calculating the cost of attribute p as an internal attribute;
Because the method requires that the number of vertices in each partition not exceed $(1+\varepsilon)\times|V|/k$ to ensure load balance between partitions, this embodiment defines the cost based on the size of the weakly connected components. In particular, for an attribute set $L' \subseteq L$, the cost of L′ as a set of internal attributes is defined as follows:
where c is a weakly connected component in $WCC(G[L'])$ and $|c|$ is the number of vertices in c. Based on this cost function, the cost corresponding to each attribute can be calculated.
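Steps S2.1-S2.4 can be sketched in Python as follows. The class and function names are hypothetical. The cost formula itself is not preserved in this text, so the sketch assumes that the cost of an attribute is the size of its largest weakly connected component, which is consistent with the cost being compared against the partition size limit; treat that as an assumption, not as the patent's exact definition.

```python
class DisjointSet:
    """Union-find with the parent, rank and size fields described in S2.2."""

    def __init__(self):
        self.parent, self.rank, self.size = {}, {}, {}

    def add(self, u):
        if u not in self.parent:
            self.parent[u], self.rank[u], self.size[u] = u, 0, 1

    def find(self, u):
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, u, v):
        self.add(u)
        self.add(v)
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru  # smaller-rank root points to larger-rank root
        self.size[ru] += self.size[rv]
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1

    def max_component(self):
        roots = (u for u in self.parent if self.parent[u] == u)
        return max((self.size[r] for r in roots), default=0)

def build_per_attribute(triples):
    """S2.1-S2.3: one disjoint set per edge attribute p."""
    sets = {}
    for s, p, o in triples:
        sets.setdefault(p, DisjointSet()).union(s, o)
    return sets

def cost(ds):
    # ASSUMPTION: cost = size of the largest weakly connected component.
    return ds.max_component()
```

Edge direction is ignored on purpose, since weak connectivity treats edges as undirected.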
S3: selecting as many internal attributes $L_{in}$ as possible from the attribute set L so as to minimize the number of crossing attributes; each weakly connected component formed in the data graph by the edges of the internal attributes $L_{in}$ is treated as a super point, yielding the coarsened graph of the data graph. In the coarsened graph, super points may be connected by crossing edges;
Consider computing a minimum attribute cut partitioning of a data graph G in which every edge has been assigned a unique attribute. On the resulting data graph, computing the minimum attribute cut amounts to computing the minimum edge cut of G. Because the minimum edge cut problem is NP-complete, the minimum attribute cut problem is also NP-complete. For this reason, this embodiment uses a heuristic greedy algorithm to select the internal attributes, with the following steps:
S3.1: initializing the internal attribute set $L_{in}$ to empty;
S3.2: checking whether the attribute set L is empty; if it is, ending the iteration; otherwise, executing steps S3.3-S3.8 and continuing with the next iteration;
S3.3: setting the minimum cost mincost to infinity and the optimal attribute $p_{opt}$ to null;
S3.4: traversing the attribute set L, and executing steps S3.5 and S3.6 for each attribute p;
S3.5: calculating $WCC(G[L_{in} \cup \{p\}])$;
In this embodiment, the disjoint set data structure is used to speed up the calculation of the weakly connected components. Initially, $DS(L_{in} \cup \{p\})$ is set to $DS(L_{in})$. For each vertex u in $DS(\{p\})$, the root uRoot of the tree containing u in $DS(\{p\})$ is obtained recursively. Then the roots of u and of uRoot in $DS(L_{in} \cup \{p\})$ are obtained, and if they differ, the corresponding trees are merged.
S3.6: if $Cost(L_{in} \cup \{p\})$ is less than $(1+\varepsilon)\times|V|/k$ and also less than mincost, assigning $Cost(L_{in} \cup \{p\})$ to mincost and p to $p_{opt}$, and then returning to step S3.4; otherwise, leaving mincost and $p_{opt}$ unchanged and returning directly to step S3.4;
S3.7: if the optimal attribute $p_{opt}$ is still null after steps S3.4-S3.6, ending step S3 and proceeding to step S4; otherwise, proceeding to step S3.8;
S3.8: deleting $p_{opt}$ from the attribute set L, adding $p_{opt}$ to the internal attribute set $L_{in}$, and then returning to step S3.2 to continue selecting internal attributes;
Taking FIG. 2 as an example, the original data graph has 12 vertices and 6 edge attributes, and after step S3 the selected internal attributes are $L_{in}$ = { starring, residual, producer, spout, found date }. The edges of the internal attributes are the thickened edges in FIG. 2, which form two weakly connected components. In the coarsened graph, each of the two weakly connected components forms a super point, and the super points are connected by an edge of the crossing attribute birthPlace.
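The greedy selection in steps S3.1-S3.8 can be sketched as below. This is a simplified illustration under stated assumptions: the cost is taken to be the size of the largest weakly connected component (the exact formula is not preserved in this text), and the components of each candidate set are recomputed from scratch instead of incrementally merging disjoint sets as the embodiment describes. The function names and the alphabetical tie-breaking are hypothetical.

```python
def wcc_cost(triples, attrs):
    """Size of the largest weakly connected component of G[attrs] (assumed cost)."""
    parent, size = {}, {}

    def find(u):
        parent.setdefault(u, u)
        size.setdefault(u, 1)
        while parent[u] != u:
            u = parent[u]
        return u

    for s, p, o in triples:
        if p in attrs:
            rs, ro = find(s), find(o)
            if rs != ro:
                parent[ro] = rs
                size[rs] += size[ro]
    return max((size[u] for u in parent if parent[u] == u), default=0)

def select_internal(triples, k, eps):
    """Greedy S3.1-S3.8: returns (internal attributes, crossing attributes)."""
    vertices = {v for s, _, o in triples for v in (s, o)}
    limit = (1 + eps) * len(vertices) / k
    remaining = {p for _, p, _ in triples}   # attribute set L
    internal = set()                         # S3.1
    while remaining:                         # S3.2
        min_cost, p_opt = float("inf"), None # S3.3
        for p in sorted(remaining):          # S3.4, sorted for determinism
            c = wcc_cost(triples, internal | {p})  # S3.5
            if c < limit and c < min_cost:   # S3.6
                min_cost, p_opt = c, p
        if p_opt is None:                    # S3.7
            break
        remaining.remove(p_opt)              # S3.8
        internal.add(p_opt)
    return internal, remaining
```

Attributes left in `remaining` when the loop stops are the crossing attributes used later for query decomposition.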
S4: partitioning the super points of the coarsened graph with a vertex-based partitioning algorithm, while ensuring that the number of vertices in each partition does not exceed $(1+\varepsilon)\times|V|/k$, where $\varepsilon$ is the user-defined maximum imbalance ratio and k is the number of partitions;
Because the coarsened graph has far fewer vertices than the original data graph, any vertex-based partitioning algorithm, such as hash partitioning or METIS, can be applied to it without excessive running time. In this embodiment, S4 specifically includes:
S4.1: taking the number of original vertices inside each super point as the weight of that super point, applying weighted hash partitioning to the coarsened graph, and ensuring that the number of vertices in each final data partition does not exceed $(1+\varepsilon)\times|V|/k$;
S4.2: uncoarsening the set of super points assigned to the same partition in step S4.1 into a final partition, i.e., assigning the original vertices contained in those super points to that partition of the original data graph;
Taking FIG. 2 as an example, if the number of partitions is 2, each of the two super points in FIG. 2 becomes a partition; that is, the original data graph is divided into two partitions along the dashed line in FIG. 2, which gives the final minimum attribute cut partitioning.
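Steps S4.1 and S4.2 might look roughly like the following Python sketch. The source does not spell out how "weighted hash partitioning" works, so the scheme here is an assumption: each super point is first sent to a partition chosen by a toy hash of its representative vertex, and if that would exceed the size limit it falls back to the currently lightest partition. The fallback does not guarantee balance when no feasible assignment exists. Vertex identifiers are assumed to be strings.

```python
def partition_supernodes(triples, internal, k, eps):
    """Coarsen by internal attributes, place super points, then uncoarsen."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            u = parent[u]
        return u

    # Coarsening: merge endpoints of every internal-attribute edge.
    for s, p, o in triples:
        rs, ro = find(s), find(o)
        if p in internal and rs != ro:
            parent[ro] = rs

    groups = {}  # super point (root vertex) -> original vertices it contains
    for v in list(parent):
        groups.setdefault(find(v), []).append(v)

    limit = (1 + eps) * len(parent) / k
    load = [0] * k
    assignment = {}
    # Heaviest super points first; ties broken by name for determinism.
    for root in sorted(groups, key=lambda r: (-len(groups[r]), r)):
        members = groups[root]
        target = sum(root.encode("utf-8")) % k  # hypothetical hash
        if load[target] + len(members) > limit:
            target = min(range(k), key=load.__getitem__)
        load[target] += len(members)
        for v in members:  # S4.2: uncoarsen to original vertices
            assignment[v] = target
    return assignment
```

Because whole super points are placed at once, every internal-attribute edge ends up inside a single partition, which is the property the later query decomposition relies on.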
S5: decomposing the SPARQL query to be processed into a group of sub-queries which can be executed independently;
In real SPARQL query workloads, a query often cannot be executed independently. To make full use of the minimum attribute cut partitioning and reduce inter-partition joins, the original query needs to be decomposed into a group of sub-queries that can be executed independently. In this embodiment, step S5 specifically includes:
S5.2: deleting the edges of the SPARQL query whose edge attribute is a variable or a crossing attribute, which yields a group of weakly connected components WCCs = $\{q'_1, q'_2, \ldots, q'_x\}$;
S5.3: traversing the edges of the SPARQL query whose edge attribute is a variable or a crossing attribute, and executing steps S5.4-S5.5 for each such edge between vertices $v_1$ and $v_2$;
S5.4: if $v_1$ and $v_2$ belong to the same sub-query, adding the edge to that sub-query and returning to step S5.3 for the next iteration; otherwise, proceeding to step S5.5;
S5.5: if $|q(v_1)| \le |q(v_2)|$, adding the edge to $q(v_2)$; otherwise, adding it to $q(v_1)$. That is, the edge is added to whichever of the sub-queries of $v_1$ and $v_2$ has more vertices. Then returning to step S5.3 for the next iteration;
S5.6: traversing the sub-queries $q'_i$ in WCCs, and adding $q'_i$ to the set of decomposed sub-queries if it has more than one vertex. Sub-queries with a single vertex are not considered because such a query contains only one query vertex, its matching results are numerous and uninformative, and that query vertex is already contained in other sub-queries;
Taking FIG. 3 as an example, step S5.2 yields three sub-queries $q'_1$, $q'_2$, and $q'_3$. Because $q'_1$ has more vertices than $q'_2$, the corresponding crossing-attribute edge is added to $q'_1$. Because $q'_2$ and $q'_3$ have the same number of vertices, the corresponding edge may be added to either one. Assuming that edge is added to $q'_2$, the final decomposed sub-queries are $q_1$ and $q_2$ in FIG. 3.
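The decomposition in steps S5.2-S5.6 can be sketched in Python as follows. The representation is an assumption: a query is a list of (subject, predicate, object) triple patterns, variables start with "?", and the function name `decompose` is hypothetical. The sketch also assumes that |q(v)| is the current vertex count of the sub-query at the moment an edge is processed, which the text does not state explicitly.

```python
def decompose(patterns, crossing):
    """Split a query into sub-queries that avoid cutting internal attributes."""
    def is_cut(p):
        return p.startswith("?") or p in crossing

    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    # S5.2: weakly connected components after removing cut edges.
    for s, p, o in patterns:
        rs, ro = find(s), find(o)
        if not is_cut(p) and rs != ro:
            parent[ro] = rs

    subs = {}  # component root -> sub-query
    for v in list(parent):
        subs.setdefault(find(v), {"vertices": set(), "edges": []})["vertices"].add(v)
    for s, p, o in patterns:
        if not is_cut(p):
            subs[find(s)]["edges"].append((s, p, o))

    # S5.3-S5.5: give each cut edge to the sub-query with more vertices.
    for s, p, o in patterns:
        if is_cut(p):
            q1, q2 = subs[find(s)], subs[find(o)]
            target = q2 if len(q1["vertices"]) <= len(q2["vertices"]) else q1
            target["vertices"].update((s, o))
            target["edges"].append((s, p, o))

    # S5.6: drop sub-queries that still contain a single vertex.
    return [q for q in subs.values() if len(q["vertices"]) > 1]
```

When both endpoints already belong to the same sub-query, `q1` and `q2` are the same object, so the edge is simply added to that sub-query as in S5.4.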
S6: and executing the decomposed sub-queries in parallel in each partition to obtain a matching result. In this embodiment, step S6 specifically includes:
S6.1: the master node of the distributed RDF system broadcasts the decomposed sub-queries to all slave nodes; after receiving the sub-queries, the slave nodes execute subgraph matching in parallel inside their partitions to obtain intermediate matching results;
S6.2: the intermediate matching results of the nodes are joined across partitions to obtain the final matching result, which is collected at the master node.
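As a rough single-machine illustration of S6.1 and S6.2, the sketch below uses a thread pool in place of slave nodes, a naive backtracking matcher in place of a real subgraph matching engine, and a nested-loop join on shared variables. All function names are hypothetical, and at least one sub-query is assumed.

```python
from concurrent.futures import ThreadPoolExecutor

def bind(pattern, triple, binding):
    """Extend a binding so that pattern matches triple, or return None."""
    b = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if b.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return b

def match(patterns, triples, binding=None):
    """Naive backtracking match of a sub-query inside one partition."""
    binding = binding or {}
    if not patterns:
        return [binding]
    results = []
    for triple in triples:
        b = bind(patterns[0], triple, binding)
        if b is not None:
            results.extend(match(patterns[1:], triples, b))
    return results

def join(left, right):
    """Join two binding lists on their shared variables."""
    return [{**a, **b} for a in left for b in right
            if all(a[v] == b[v] for v in a.keys() & b.keys())]

def run_distributed(subqueries, partitions):
    per_subquery = []
    with ThreadPoolExecutor() as pool:  # threads stand in for slave nodes
        for sq in subqueries:
            parts = pool.map(lambda triples, sq=sq: match(sq, triples), partitions)
            # Replicated crossing edges can yield the same match twice.
            unique = {tuple(sorted(r.items())) for part in parts for r in part}
            per_subquery.append([dict(t) for t in sorted(unique)])
    result = per_subquery[0]
    for rows in per_subquery[1:]:  # S6.2: join the intermediate results
        result = join(result, rows)
    return result
```

A production system would ship sub-queries over the network and use a dedicated join operator; the point here is only the order of operations: match per partition, then join.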
In summary, the invention provides a distributed SPARQL query optimization method based on minimum attribute cut that takes the edge attributes of the RDF data graph into account, so that the types of queries that can be executed independently are expanded, inter-partition joins are reduced, data communication time is reduced, and query efficiency is improved.
The above embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made thereto should be included within the scope of the present invention.
Claims (5)
1. A distributed SPARQL query optimization method based on minimum attribute cut is characterized by comprising the following steps:
(1) reading an original RDF data graph, and storing its edge attributes in a set L;
(2) calculating the weakly connected components and the corresponding cost of each edge attribute;
(3) selecting as many internal attributes as possible to obtain a coarsened graph of the data graph;
(4) partitioning the coarsened graph by vertex, and uncoarsening the result to obtain the final partitions;
(5) decomposing the SPARQL query into a set of independently executable sub-queries;
(6) executing the decomposed sub-queries in parallel in each partition to obtain the matching results.
2. The distributed SPARQL query optimization method based on minimum attribute cut as claimed in claim 1, wherein in step (2), when the weakly connected components are calculated, the size of the weakly connected components is used as the cost of an attribute, so that attributes can be compared when the internal attributes are selected.
3. The distributed SPARQL query optimization method based on minimum attribute cut as claimed in claim 1, wherein in step (3), when static graph data is processed, the number of edge attributes is fixed, and each attribute is either an internal attribute or a crossing attribute; a heuristic greedy algorithm is used to select as many internal attributes as possible, which minimizes the number of crossing attributes, i.e., achieves the minimum attribute cut; and after the internal attributes are selected, each weakly connected component formed by the internal-attribute edges is treated as a super point to obtain the coarsened graph of the data graph.
4. The distributed SPARQL query optimization method based on minimum attribute cut as claimed in claim 1, wherein in step (4), when the coarsened graph is partitioned by vertex, any vertex-based partitioning algorithm, such as hash partitioning or METIS, can be used, provided that the number of vertices in each partition does not exceed $(1+\varepsilon)\times|V|/k$, so as to achieve load balance between partitions; wherein $\varepsilon$ is the user-defined maximum imbalance ratio and k is the number of partitions.
5. The method of claim 1, wherein in step (5) the original SPARQL query is decomposed according to the crossing attributes obtained in step (3), the decomposed sub-queries can be executed independently within the partitions, and the shape of the sub-queries is not limited to star queries.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111451035.3A CN114116785B (en) | 2021-12-01 | 2021-12-01 | Distributed SPARQL query optimization method based on minimum attribute cut |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111451035.3A CN114116785B (en) | 2021-12-01 | 2021-12-01 | Distributed SPARQL query optimization method based on minimum attribute cut |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114116785A (en) | 2022-03-01 |
CN114116785B (en) | 2024-09-24 |
Family
ID=80369112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111451035.3A Active CN114116785B (en) | 2021-12-01 | 2021-12-01 | Distributed SPARQL query optimization method based on minimum attribute cut |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114116785B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114356977A (en) * | 2022-03-16 | 2022-04-15 | 湖南大学 | Distributed RDF graph query method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101854284B1 (en) * | 2016-12-26 | 2018-05-03 | 충북대학교 산학협력단 | Distributed RDF query processing system for reducing join cost and communication cost |
CN108520035A (en) * | 2018-03-29 | 2018-09-11 | 天津大学 | SPARQL parent map pattern query processing methods based on star decomposition |
CN109710638A (en) * | 2019-01-01 | 2019-05-03 | 湖南大学 | A kind of multi-query optimization method on federation type distribution RDF data library |
CN112835920A (en) * | 2021-01-22 | 2021-05-25 | 河海大学 | Distributed SPARQL query optimization method based on hybrid storage mode |
CN112883063A (en) * | 2021-02-15 | 2021-06-01 | 湖南大学 | SPARQL query processing method on partition-based distributed RDF system |
Non-Patent Citations (1)
Title |
---|
杨程: "分布式环境下大规模资源描述框架数据划分方法综述", 《计算机应用》, vol. 40, no. 11, 22 July 2020 (2020-07-22) * |
Also Published As
Publication number | Publication date |
---|---|
CN114116785B (en) | 2024-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108959613B (en) | RDF knowledge graph-oriented semantic approximate query method | |
Kim et al. | Taming subgraph isomorphism for RDF query processing | |
Sun et al. | Efficient subgraph matching on billion node graphs | |
US9292570B2 (en) | System and method for optimizing pattern query searches on a graph database | |
Meimaris et al. | Extended characteristic sets: graph indexing for SPARQL query optimization | |
CN110134714B (en) | Distributed computing framework cache index method suitable for big data iterative computation | |
CA2973356A1 (en) | Distributed storage and distributed processing query statement reconstruction in accordance with a policy | |
Huang et al. | Query optimization of distributed pattern matching | |
CN104298598B (en) | The adjustment method of RDFS bodies under distributed environment | |
US20130275410A1 (en) | Live topological query | |
CN109325029A (en) | RDF data storage and querying method based on sparse matrix | |
CN108520035A (en) | SPARQL parent map pattern query processing methods based on star decomposition | |
US20070078816A1 (en) | Common sub-expression elimination for inverse query evaluation | |
CN105550332A (en) | Dual-layer index structure based origin graph query method | |
CN114116785A (en) | Distributed SPARQL query optimization method based on minimum attribute cut | |
CN110032676B (en) | SPARQL query optimization method and system based on predicate association | |
CN110245271B (en) | Large-scale associated data partitioning method and system based on attribute graph | |
CN116383247A (en) | Large-scale graph data efficient query method | |
Muhammad et al. | Multi query optimization algorithm using semantic and heuristic approaches | |
Wang et al. | RDF partitioning for scalable SPARQL query processing | |
Curé et al. | HAQWA: a Hash-based and Query Workload Aware Distributed RDF Store. | |
CN109063048A (en) | A kind of matched data cleaning method of knowledge based library figure and device | |
Zheng et al. | Research on partitioning algorithm based on RDF graph | |
CN114297260A (en) | Distributed RDF data query method and device and computer equipment | |
CN110162574B (en) | Method and device for determining data redistribution mode, server and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||