CN105550332B - A provenance graph query method based on a two-layer index structure - Google Patents

A provenance graph query method based on a two-layer index structure

Info

Publication number
CN105550332B
Authority
CN
China
Prior art keywords
index
provenance graph
data
inquiry
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510969332.5A
Other languages
Chinese (zh)
Other versions
CN105550332A (en)
Inventor
许国艳
罗章璇
宋健
平萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN201510969332.5A
Publication of CN105550332A
Application granted
Publication of CN105550332B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a provenance graph query method based on a two-layer index structure, comprising the following steps. First, a two-layer index structure is proposed for provenance graph queries. Second, a global index based on a dictionary table is designed; the table records the matching relationship between provenance data and data, together with the provenance graph ID. Third, a bitmap-based local index is proposed: following the RDF query patterns used on provenance graphs, it provides indexes that serve Triple Pattern queries and three kinds of join queries, and corresponding query algorithms are designed on top of these indexes. Finally, experiments demonstrate the feasibility and effectiveness of the provenance graph query method based on the two-layer index structure.

Description

A provenance graph query method based on a two-layer index structure
Technical field
The invention relates to provenance data management in the field of big data management, and focuses on the design and implementation of a query scheme for data provenance graphs. Based on the characteristics of data provenance graphs, the invention provides a provenance graph query method built on a two-layer index structure. The method is designed at two levels, global and local. On the one hand, a dictionary table matches data to its provenance data, and a global index algorithm based on this dictionary table is proposed. On the other hand, once the provenance graph ID has been used to quickly locate the cloud computing server node on which the provenance is stored, a bitmap-based local index structure is proposed, comprising 6 selection indexes and 3 join indexes, and corresponding query algorithms are designed.
Background
Data provenance is information about the entire processing history of data, including the source of the data and all subsequent processes applied to it. With the continued growth of big data, querying provenance information efficiently in a cloud computing environment has become especially important and is a problem in urgent need of a solution.
To address data provenance queries in a cloud computing environment, the invention introduces a two-layer index structure, analyzed from the two aspects of global index and local index, designs a provenance graph query method, and verifies the feasibility and effectiveness of the method.
Summary of the invention
Objective: in view of the problems in the prior art, the invention provides a provenance graph query method based on a two-layer index structure.
Technical solution: a provenance graph query method based on a two-layer index structure. First, a two-layer index structure is proposed for provenance graph queries. Second, a global index based on a dictionary table is designed; the table records the matching relationship between provenance data and data together with the provenance graph ID, so that provenance can be associated with data and the cloud server node storing the provenance can be located quickly, which reduces user query response time. Third, a bitmap-based local index is proposed: following the RDF query patterns used on provenance graphs, it provides indexes that serve the eight kinds of Triple Pattern queries and three kinds of join queries, and corresponding query algorithms are designed on top of the indexes.
Two-layer index structure for provenance graph queries
In earlier distributed environments, provenance data was stored and queried by relying solely on the master node to distribute lookup tasks, which usually required traversing the entire cluster and consumed a great deal of time and resources. Existing distributed provenance storage systems mostly support fast lookup by primary key only; they lack an efficient index structure and cannot serve multi-dimensional queries or join queries. An efficient index structure can substantially improve query efficiency and shorten the response time of user queries.
To improve query efficiency, a two-layer index structure is proposed that takes the characteristics of provenance graphs into account. It consists of a global index based on a dictionary table and a local index based on bitmaps. The global index finds the server node on which a provenance graph is stored; the local index then refines the query on that server node to retrieve the required provenance data. The global index is distributed to every node in the cloud environment, so when a user request arrives, the location of the node holding the queried provenance graph can be obtained from the global index structure on the local server alone. Each local index covers only the provenance data stored on its own server, and the local indexes on different nodes do not depend on one another.
Global index and global query algorithm based on the dictionary table
The dictionary table structure is given first, and the query flow based on the global index is built on it.
1. Dictionary table structure
The dictionary table HCPTable is designed from two aspects, according to the characteristics of data provenance. First, it stores each provenance graph name with its corresponding data items. A data item is the data that the provenance describes; all data in one workflow run correspond to a single provenance graph, which gives a coarse-grained description of the relationship between provenance and data. Second, it stores each provenance graph name with its corresponding ID. Every execution of a workflow generates one data provenance graph, and the provenance ID is generated during storage by a Hash(key) mapping. In the global index, the provenance graph ID is the input to the consistent hashing index algorithm, so the server node storing a provenance graph can be computed quickly from its ID.
2. Query flow based on the global index
The provenance graph ID is looked up in HCPTable; starting from the root node, the tree is traversed down to a leaf node, and the leaf node gives the server that stores the provenance graph. The global index query flow is as follows:
(1) Look up the dictionary table to obtain the provenance graph ID.
(2) Find the child node that satisfies the query requirement.
(3) Compute and output the child node number.
Local index and local query algorithm based on bitmaps
To make provenance graph queries more efficient and to account for the variety of user query statements, the shortcomings of the selection indexes Is, Ip and Io for single Triple Pattern queries are remedied as follows: indexes Isp and Ips are designed for triples with known subject and predicate, indexes Ipo and Iop for triples with known predicate and object, and indexes Iso and Ios for triples with known subject and object. Together these form a complete local bitmap index structure, comprising the selection indexes Is, Ip, Io, Isp, Ipo and Iso and the join indexes Is', Io' and Iso'.
The local index supports refined queries over the provenance graph data on a single cloud storage server node. A provenance graph query has two parts: single Triple Pattern queries and join queries.
(1) Single Triple Pattern query
The selection indexes Is, Ip, Io, Isp, Ipo and Iso serve single Triple Pattern queries in which the subject, the predicate, the object, the subject and predicate, the predicate and object, or the subject and object are known, respectively.
(2) Join query
The join indexes Is', Io' and Iso' serve join queries in which the shared variable is a subject in both patterns, an object in both patterns, or a subject in one pattern and an object in the other.
Brief description of the drawings
Fig. 1 shows the two-layer index structure;
Fig. 2 is the flow chart of the provenance query based on the global index;
Fig. 3 shows the consistent binary tree distribution model;
Fig. 4 shows the RDF triple join types;
Fig. 5 is the index space usage analysis chart;
Fig. 6 is the query performance analysis chart.
Detailed description of the embodiments
The invention is further explained below with reference to specific embodiments. These embodiments serve only to illustrate the invention and do not limit its scope; after reading the invention, modifications of various equivalent forms made by those skilled in the art fall within the scope defined by the claims appended to this application.
In the provenance graph query method based on a two-layer index structure, a two-layer index structure is first proposed for provenance graph queries. Second, a global index based on a dictionary table is designed; the table records the matching relationship between provenance data and data together with the provenance graph ID, so that provenance can be associated with data and the cloud server node storing the provenance can be located quickly, which reduces user query response time. Third, a bitmap-based local index is proposed: following the RDF query patterns used on provenance graphs, it provides indexes that serve the eight kinds of Triple Pattern queries and three kinds of join queries, and corresponding query algorithms are designed on top of the indexes.
Two-layer index structure for provenance graph queries
In earlier distributed environments, provenance data was stored and queried by relying solely on the master node to distribute lookup tasks, which usually required traversing the entire cluster and consumed a great deal of time and resources. Existing distributed provenance storage systems mostly support fast lookup by primary key only; they lack an efficient index structure and cannot serve multi-dimensional queries or join queries. An efficient index structure can substantially improve query efficiency and shorten the response time of user queries.
The index structure consists of a global index based on a dictionary table and a local index based on bitmaps. The global index finds the server node on which a provenance graph is stored; the local index then refines the query on that server node to retrieve the required provenance data. The global index is distributed to every node in the cloud environment, so when a user request arrives, the location of the node holding the queried provenance graph can be obtained from the global index structure on the local server alone. Each local index covers only the provenance data stored on its own server, and the local indexes on different nodes do not depend on one another. The two-layer index structure as designed is shown in Fig. 1.
Global index and global query algorithm based on the dictionary table
The dictionary table structure is given first, and the query flow based on the global index is built on it.
1. Dictionary table structure
The dictionary table HCPTable is designed from two aspects, according to the characteristics of data provenance. First, it stores each provenance graph name with its corresponding data items. A data item is the data that the provenance describes; all data in one workflow run correspond to a single provenance graph, which gives a coarse-grained description of the relationship between provenance and data. Second, it stores each provenance graph name with its corresponding ID. Every execution of a workflow generates one data provenance graph, and the provenance ID is generated during storage by a Hash(key) mapping. In the global index, the provenance graph ID is the input to the consistent hashing index algorithm, so the server node storing a provenance graph can be computed quickly from its ID. An example of the storage structure of the dictionary table HCPTable is shown in Table 1.
Table 1: Storage structure of the dictionary table HCPTable
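As a rough illustration of the two roles HCPTable plays, the sketch below keeps, for each provenance graph name, its data items and an ID derived from a hash of the name. The class and method names, the use of SHA-1 as the Hash(key) mapping, and the sample names in the usage note are assumptions made for this example; the source does not specify them.

```python
import hashlib


class HCPTable:
    """Toy dictionary table: provenance graph name -> (graph ID, data items)."""

    def __init__(self):
        self._rows = {}

    @staticmethod
    def graph_id(name: str) -> int:
        # Assumption: a stable hash of the graph name stands in for Hash(key).
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, name: str, data_items) -> None:
        # Store both aspects in one row: the ID and the described data items.
        self._rows[name] = (self.graph_id(name), set(data_items))

    def id_of(self, name: str) -> int:
        # Aspect 2: provenance graph name -> provenance graph ID.
        return self._rows[name][0]

    def graphs_for(self, item: str):
        # Aspect 1: data item -> names of the provenance graphs describing it.
        return sorted(n for n, (_, items) in self._rows.items() if item in items)
```

With two hypothetical workflow runs registered as `table.add("wf_run_1", ["d1", "d2"])` and `table.add("wf_run_2", ["d2", "d3"])`, calling `table.graphs_for("d2")` lists both graph names, and `table.id_of(...)` supplies the ID that the global index would take as input.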
2. Query flow for locating the provenance graph storage node based on the global index
The query flow for locating the provenance graph storage node based on the global index is shown in Fig. 2.
(1) Look up the dictionary table to obtain the provenance graph ID.
(2) Starting from the root node of the tree, use the provenance graph ID to find the server node in the tree on which the provenance is stored; the child node to select is computed by formula 1, where ID is the provenance graph ID and root.Number is the number of the root node.
Nodenum = ID % root.Number    (1)
Select the child node according to the result, and check its node.Isleaf attribute to determine whether it is a leaf node. If it is a leaf node, go to step (4); otherwise go to step (3).
(3) Take this node as the new root node and continue from step (2).
(4) Compute and output the number of this node.
First, each run of the flow selects one leaf node of the tree, proceeding from the root node down to a leaf node. The query method is similar to binary search, so its time complexity is O(log n). Second, the invention uses consistent binary tree distributed storage; this binary tree storage layout is also considerably more efficient than other multiway trees.
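Steps (1) to (4) above can be sketched as a loop that descends from the root to a leaf. This is only an illustration under stated assumptions: `root.Number` is interpreted here as the number of children of the current node, and, as an addition that is not in the source, the ID is integer-divided by that number before descending so that successive levels use different digits of the ID. The `Node` class, `match_node` and `build_demo_tree` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    children: List["Node"] = field(default_factory=list)
    server: Optional[str] = None  # set only on leaf nodes

    @property
    def is_leaf(self) -> bool:
        return not self.children


def match_node(root: Node, graph_id: int) -> Optional[str]:
    """Walk from the root to a leaf and return that leaf's server number."""
    node, remaining = root, graph_id
    while not node.is_leaf:              # step (3): child becomes the new root
        number = len(node.children)      # assumed meaning of root.Number
        node_num = remaining % number    # formula (1)
        remaining //= number             # assumption: consume the digit just used
        node = node.children[node_num]
    return node.server                   # step (4): output the leaf's number


def build_demo_tree() -> Node:
    """Hypothetical balanced binary tree with four server leaves."""
    leaves = [Node(server=f"server-{i}") for i in range(4)]
    return Node(children=[Node(children=leaves[0:2]), Node(children=leaves[2:4])])
```

On the four-leaf demo tree the loop performs two iterations per lookup, which matches the logarithmic cost described above for a balanced tree.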
3. Consistent binary tree distributed storage
The idea of consistent binary tree distributed storage is to organize servers into hierarchical groups and, combined with a consistent hashing algorithm, spread data evenly across the cloud servers. Each leaf node of the consistent binary tree represents one server node in the cloud and is used to store provenance graph data.
The consistent binary tree distribution model is based on a binary tree structure: at each level the nodes are divided into multiple mutually disjoint finite sets, each of which is itself a tree, so that all storage nodes are assigned to different groups at different levels. The corresponding server number is stored in each leaf node.
Definition 1 (consistent binary distribution tree): a consistent binary tree is a binary tree formed by a finite set T of n nodes, T = {V, E}, where V is the set of nodes and E is the set of edges.
Each leaf in the finite set T represents the location of a cloud computing server. Every node can be identified by a unique digit sequence whose digits, read from left to right, give the numbers of the branches on the path taken to reach that node, where subtrees are numbered from left to right as 0, 1, 00, .... For example, node D in Fig. 3 has number 11, which uniquely determines its position in the consistent distribution tree: node D is reached by following branch 1 and then branch 1. Once the numbers of all leaf nodes in the tree are fixed, the logical structure of the tree is determined as well; the two are related by a one-to-one mapping. In use, a consistent binary tree can be abstracted as a two-dimensional array, so the tree structure can be maintained with a two-dimensional array.
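The path numbering and the two-dimensional array view can be illustrated with a small sketch. The nested-tuple tree, the leaf names A to D and the `[code, server]` row layout are assumptions for this example and are not taken from Fig. 3; only the idea that the digit sequence 11 means "branch 1, then branch 1" comes from the text.

```python
def resolve(tree, code: str):
    """Follow a digit sequence such as "11" from the root to a node."""
    node = tree
    for digit in code:
        node = node[int(digit)]  # 0 = left subtree, 1 = right subtree
    return node


def leaf_table(tree, prefix: str = ""):
    """Flatten the tree into rows of [path code, server], a 2-D array view."""
    if isinstance(tree, str):    # a leaf holds its server name
        return [[prefix, tree]]
    rows = []
    for i, child in enumerate(tree):
        rows.extend(leaf_table(child, prefix + str(i)))
    return rows


# Hypothetical four-leaf tree: inner nodes are tuples, leaves are server names.
DEMO_TREE = (("A", "B"), ("C", "D"))
```

Because every leaf code is unique, the rows returned by `leaf_table(DEMO_TREE)` are enough to rebuild the tree, which is the sense in which the numbering and the logical structure determine each other.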
Algorithm implementation:
The purpose of the global query is to locate the server node that holds a provenance graph. The invention designs consistent distributed storage in which different provenance graphs are stored evenly across the different leaf nodes of the tree. When a provenance graph is queried, its ID is used to find the server node in the tree on which the provenance is stored.
The provenance node query algorithm Match_Node is as follows:
Local index and local query algorithm based on bitmaps
In the invention, provenance graph data is stored sequentially in units of triples, numbered from 1 to n. $pos(t_i)$ denotes the storage position $i$ of triple $t_i$ in the graph, and $pos^{-1}(i)$ returns the triple $t_i$ stored at position $i$, where $t_i \in G$. G is the set of triples of one RDF graph and D is the set of RDF graphs: $G = \{t_1, t_2, \dots, t_n\}$, $G_i \in D$, $D = \{G_1, G_2, \dots, G_n\}$.
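S, P and O denote the sets of subjects, predicates and objects in the RDF dataset, respectively, as shown in formula 2, where $G_i \in D$ and $D = \{G_1, G_2, \dots, G_n\}$:
$$
\begin{aligned}
S &= S_1 \cup S_2 \cup \dots \cup S_n, & S_i &= \{\, s \mid (s, p, o) \in G_i \,\} \\
P &= P_1 \cup P_2 \cup \dots \cup P_n, & P_i &= \{\, p \mid (s, p, o) \in G_i \,\} \\
O &= O_1 \cup O_2 \cup \dots \cup O_n, & O_i &= \{\, o \mid (s, p, o) \in G_i \,\}
\end{aligned}
\qquad (2)
$$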
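Formula 2 amounts to taking the union of the subject, predicate and object terms over every graph in the dataset. A minimal sketch, assuming each RDF graph is represented as a list of `(s, p, o)` string tuples and the dataset D as a list of such graphs (the function name `term_sets` is hypothetical):

```python
def term_sets(dataset):
    """Return (S, P, O): all subjects, predicates and objects across D."""
    subjects, predicates, objects = set(), set(), set()
    for graph in dataset:           # each G_i in D
        for s, p, o in graph:       # each triple (s, p, o) in G_i
            subjects.add(s)         # contributes to S_i, hence to S
            predicates.add(p)       # contributes to P_i, hence to P
            objects.add(o)          # contributes to O_i, hence to O
    return subjects, predicates, objects
```

A term that appears as a subject in one graph and as an object in another ends up in both S and O, since the three sets are built independently.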
1. Queries on a single Triple Pattern
Because the subject, predicate and object of a single Triple Pattern may each be a variable, queries on a single triple require multi-dimensional indexes. The main idea of a multi-dimensional index is to use the non-variable terms of the triple as the entry point of the query. The expressions for single Triple Pattern clauses in a SPARQL query and the semantics they represent are shown in Table 2.
Table 2: Triple Pattern expressions and their meanings
No.  Clause expression  Meaning
1    (s, p, o)          If the triple exists, return it; otherwise return a null value
2    (?s, p, o)         Given the predicate and object, return the set of matching subjects
3    (s, ?p, o)         Given the subject and object, return the set of matching predicates
4    (s, p, ?o)         Given the subject and predicate, return the set of matching objects
5    (?s, ?p, o)        Given the object, return the set of matching subjects and predicates
6    (s, ?p, ?o)        Given the subject, return the set of matching predicates and objects
7    (?s, p, ?o)        Given the predicate, return the set of matching subjects and objects
8    (?s, ?p, ?o)       Return all triples
Definition 2 (bitmap index Is): the index Is is the set over all triple subjects in the RDF graph G, $\{(s_1, v_1), (s_2, v_2), \dots, (s_n, v_n)\}$, where $s_i \in S$. Each $v_i$ is a bit vector over G whose $k$-th position is 1 if and only if there is a triple $t_k = pos^{-1}(k)$, $t_k \in G$, with $t_k.s = s_i$.
The Is index is designed so that a query statement that specifies the subject can quickly find the corresponding triples in the RDF graph. The size of Is is fixed and equals the number of RDF triples contained in the provenance graph it belongs to.
Indexes Ip and Io are built in the same way, so triples can be found quickly using a predicate or an object as the key. If the subject, predicate and object of a query statement are all known, the three indexes Is, Ip and Io can be combined to answer the query: Is ∧ Ip ∧ Io.
Definition 3 (bitmap index Isp): the index Isp is the set over all subject-predicate pairs of triples in the RDF graph G, $\{(s_1 p_1, v_1), (s_2 p_2, v_2), \dots, (s_n p_n, v_n)\}$, where $s_i \in S$ and $p_i \in P$. Each $v_i$ is a bit vector over G whose $k$-th position is 1 if and only if there is a triple $t_k = pos^{-1}(k)$, $t_k \in G$, with $t_k.sp = s_i p_i$.
Indexes Ips, Ipo, Iop, Iso and Ios are built in the same way, so triples can be found quickly using subject-predicate, predicate-object, object-predicate, subject-object or object-subject pairs as the key.
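Definitions 2 and 3 can be made concrete with a short sketch that builds the six selection indexes retained by the method. It assumes a graph is a list of `(s, p, o)` tuples stored in order and uses Python integers as bit vectors, with position k mapped to bit k-1; the function names and this representation are choices made for the example, not the storage format of the invention.

```python
from collections import defaultdict


def build_selection_indexes(graph):
    """Build Is, Ip, Io, Isp, Ipo and Iso as {key: bit vector} mappings."""
    names = ("Is", "Ip", "Io", "Isp", "Ipo", "Iso")
    idx = {name: defaultdict(int) for name in names}
    for k, (s, p, o) in enumerate(graph, start=1):
        bit = 1 << (k - 1)          # position k is 1 iff triple t_k has the key
        idx["Is"][s] |= bit
        idx["Ip"][p] |= bit
        idx["Io"][o] |= bit
        idx["Isp"][(s, p)] |= bit
        idx["Ipo"][(p, o)] |= bit
        idx["Iso"][(s, o)] |= bit
    return idx


def positions(vector: int):
    """List the 1-based storage positions whose bit is set in the vector."""
    return [k for k in range(1, vector.bit_length() + 1) if (vector >> (k - 1)) & 1]
```

When subject, predicate and object are all known, the combination Is ∧ Ip ∧ Io becomes a bitwise AND, for example `idx["Is"][s] & idx["Ip"][p] & idx["Io"][o]`, and `positions` turns the result back into triple positions.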
2. Queries containing joins
Whether two triples are related is determined by whether they share an unbound variable with the same name. Depending on where the shared variable occurs, the relationship falls into three kinds: Subject-Subject link, Object-Object link and Object-Subject link. The RDF triple join types are shown in Fig. 4.
Definition (bitmap index Is'): the index Is' is the set over all triples in the RDF graph G that share the same subject, $\{(1, v_1), (2, v_2), \dots, (n, v_n)\}$, where $n = |G|$ and $1, 2, \dots, n$ are the consecutive position identifiers in the graph. Each $v_i$ is a bit vector over G whose $k$-th position is 1 if and only if there are triples $t_k = pos^{-1}(k)$, $t_k \in G$, and $t_i = pos^{-1}(i)$ with $t_k.s = t_i.s$.
The index Io' is built in the same way as Is'. No index Ipp is built for predicates, because a graph contains relatively few distinct predicates and finding triples with the same predicate is of little use for subjects and objects.
Definition 4 (bitmap index Iso'): the index Iso' is the set over all triples in the RDF graph G in which a subject and an object are the same, $\{(1, v_1), (2, v_2), \dots, (n, v_n)\}$, where $n = |G|$ and $1, 2, \dots, n$ are the consecutive position identifiers in the graph. Each $v_i$ is a bit vector over G whose $k$-th position is 1 if and only if there are triples $t_k = pos^{-1}(k)$, $t_k \in G$, and $t_i = pos^{-1}(i)$, $t_i \in G$, with $t_k.o = t_i.s$. The index Ios' is the transpose of Iso': $Ios' = (Iso')^{T}$.
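The join indexes can be sketched in the same style as the selection indexes: one bit vector per triple position i, with bit k set when triple k is related to triple i in the way the definition states. The quadratic construction below is chosen for clarity only, and the function names and the integer bit-vector representation are assumptions of this example.

```python
def build_join_indexes(graph):
    """Return (Is', Io', Iso') as {position i: bit vector over positions k}."""
    n = len(graph)
    is_j = {i: 0 for i in range(1, n + 1)}
    io_j = {i: 0 for i in range(1, n + 1)}
    iso_j = {i: 0 for i in range(1, n + 1)}
    for i, (s_i, _, o_i) in enumerate(graph, start=1):
        for k, (s_k, _, o_k) in enumerate(graph, start=1):
            bit = 1 << (k - 1)
            if s_k == s_i:      # Is': t_k.s = t_i.s
                is_j[i] |= bit
            if o_k == o_i:      # Io': t_k.o = t_i.o
                io_j[i] |= bit
            if o_k == s_i:      # Iso': t_k.o = t_i.s
                iso_j[i] |= bit
    return is_j, io_j, iso_j


def transpose(index, n: int):
    """Transpose an n x n bit matrix stored as {row: bit vector}."""
    out = {i: 0 for i in range(1, n + 1)}
    for i, vec in index.items():
        for k in range(1, n + 1):
            if (vec >> (k - 1)) & 1:
                out[k] |= 1 << (i - 1)
    return out
```

Applying `transpose` to Iso' yields Ios', whose row i marks the triples whose subject equals the object of triple i.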
Indexes Isp, Ips, Ipo, Iop, Iso and Ios are selection indexes that handle query requests with known subject and predicate, predicate and object, or subject and object. Within each pair (Isp and Ips, Ipo and Iop, Iso and Ios), the two indexes describe triples stored at the same positions in the actual graph, so only Isp, Ipo and Iso are needed.
In summary, the invention uses the indexes Is, Ip, Io, Isp, Ipo and Iso to query triples whose subject, predicate, object, subject and predicate, predicate and object, or subject and object are known. The indexes Is', Io', Iso' and Ios' handle join query requests with a shared subject variable, a shared object variable, or a variable shared between a subject and an object. The bitmap index storage schema T_D is shown in Table 3.
Table 3: Bitmap index storage schema T_D
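The summary above implies a simple dispatch rule: the set of non-variable positions in a triple pattern decides which selection index answers it. A hedged sketch of that rule follows; the function names, the convention that variables start with `?`, and the "full scan" fallback for a pattern with no bound term are assumptions of this example.

```python
def is_var(term: str) -> bool:
    """Assumed convention: variables are written with a leading '?'."""
    return term.startswith("?")


# (subject bound, predicate bound, object bound) -> selection index to use
_INDEX_FOR = {
    (True, False, False): "Is",
    (False, True, False): "Ip",
    (False, False, True): "Io",
    (True, True, False): "Isp",
    (False, True, True): "Ipo",
    (True, False, True): "Iso",
}


def choose_index(pattern) -> str:
    """Pick the index for one triple pattern from its bound positions."""
    bound = tuple(not is_var(term) for term in pattern)
    if all(bound):
        return "Is AND Ip AND Io"   # all terms known: intersect three indexes
    if not any(bound):
        return "full scan"          # pattern 8 of Table 2: return all triples
    return _INDEX_FOR[bound]
```

For example, `choose_index(("?s", "used", "b"))` selects Ipo, since the predicate and object are the known terms.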
3. Algorithm implementation
The query algorithm ASI_TP for a single Triple Pattern finds the unknown terms of a triple from its known terms, as follows:
The algorithm AJI_TP for join queries can quickly match two triples whether their respective subjects, objects, subject and predicate, or predicate and object are the same, or their subject and predicate differ, as follows:
The algorithm Match_BGP for BGP queries is as follows:
ASI_TP and AJI_TP are the sub-algorithms called when a BGP is queried. The Match_BGP algorithm first preprocesses, that is, reorders, all triple patterns in the BGP. The specific steps are as follows:
1. Order the triple patterns so that those with higher selectivity come first. From highest to lowest selectivity, the order of triple patterns is:
(1) Subject is not a variable
(2) Subject, Predicate and Object are all non-variables and the predicate is not rdf:type
(3) Subject is a variable, Predicate and Object are non-variables and the predicate is rdf:type
(4) Subject and Predicate are variables, Object is not a variable
(5) Subject and Object are variables, Predicate is not a variable and is not rdf:type
(6) Subject and Object are variables, Predicate is rdf:type
(7) Subject, Object and Predicate are all variables.
2. Call the ASI_TP algorithm to query the first triple pattern in the sorted set of RDF triple patterns bgp; the returned variables are stored in vseva, and the result set S is obtained from vseva. If S is empty, return the empty set directly.
3. Take the next triple pattern. Before querying, first determine whether it shares a variable with the preceding triple patterns; if it does, call the algorithm AJI_TP, record the result set Stpi produced by processing the current triple pattern, and merge the current result set with the result set obtained from vseva.
4. Repeat step 3 until every triple pattern has been queried.
5. Replace the bit vectors in the result set S with the specific RDF triples they identify, and return the query result set.
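The five steps above can be sketched end to end. Because the listings of ASI_TP and AJI_TP are not reproduced in the text, this example substitutes a naive scan-and-bind matcher for both; it illustrates the control flow of Match_BGP (reorder, evaluate the first pattern, extend through shared variables, stop early on an empty result) rather than the bitmap-based algorithms themselves. All function names are hypothetical. In `selectivity_rank`, class (1) also covers class (2) because it is tested first, and a pattern with a variable subject, bound object and a predicate other than rdf:type, which the list does not mention, is placed with class (3) as an assumption.

```python
RDF_TYPE = "rdf:type"


def is_var(term: str) -> bool:
    return term.startswith("?")


def selectivity_rank(pattern) -> int:
    """Smaller rank means higher selectivity, following the list in step 1."""
    s, p, o = pattern
    if not is_var(s):
        return 1                        # classes (1) and (2)
    if not is_var(p) and not is_var(o):
        return 3                        # class (3); assumed for non-type predicates too
    if is_var(p) and not is_var(o):
        return 4                        # class (4)
    if not is_var(p):                   # the object is a variable here
        return 6 if p == RDF_TYPE else 5
    return 7                            # class (7): everything is a variable


def match_pattern(graph, pattern, binding):
    """Naive stand-in for ASI_TP/AJI_TP: extend one binding by one pattern."""
    out = []
    for triple in graph:
        new = dict(binding)
        for want, got in zip(pattern, triple):
            if is_var(want):
                if new.setdefault(want, got) != got:
                    break               # conflicts with an earlier binding
            elif want != got:
                break                   # constant term does not match
        else:
            out.append(new)
    return out


def match_bgp(graph, bgp):
    """Evaluate a basic graph pattern; returns a list of variable bindings."""
    ordered = sorted(bgp, key=selectivity_rank)     # step 1
    results = [{}]
    for pattern in ordered:                         # steps 2 to 4
        results = [b for binding in results
                   for b in match_pattern(graph, pattern, binding)]
        if not results:
            return []                               # empty result: stop early
    return results                                  # step 5
```

Evaluating the most selective pattern first keeps the intermediate binding list small, which is the purpose of the reordering in step 1.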
Experimental verification
1. Space usage analysis
The local index technique described here adds three new indexes to speed up queries, so storage grows by the space those three indexes occupy.
Among the 400 RDF triples generated by one workflow run, triples with the same subject, predicate or object occur many times, so the indexes Is, Ip and Io do not need a separate index entry for every triple. Triples that contain the same element are marked by bit positions in the bitmap vector. For example, for a repeated subject it is enough to set to 1 the corresponding position in the bitmap vector created when that subject was first indexed. The position represents the logical position at which the triple is stored in the database.
Likewise, triples with the same subject-predicate, predicate-object or subject-object pair are repeated very often in the provenance data recorded by a workflow. For repeated entries, only the bits at the different positions in the vector created the first time need to be set, and only one bitmap index needs to be stored. Consequently, adding three index entries to the original 6 indexes increases the number of indexes by 50%, while index storage space grows by only about 25%, as shown in Fig. 5.
2. Query performance analysis
Using the University of Texas provenance benchmark data set, the invention tests 11 UTPB query statements to evaluate the query performance of the designed index structure.
The experiment ran the 11 query statements on each of five data sets, D1, D2, D3, D4 and D5, in a Hadoop cluster environment. Each query was run 5 times on each of the five data sets and the average time was taken. The query performance analysis is shown in Fig. 6.
Analysis of the example results demonstrates the feasibility of the invention. It also shows that, when handling the storage of massive provenance data, the proposed two-layer index structure maintains good storage and query performance as the data volume grows and responds to user query requests promptly. Performance remains good in the face of complex query requests and massive data.

Claims (2)

1. A provenance graph query method based on a two-layer index structure, characterized by comprising the following steps: first, a two-layer index structure is proposed for provenance graph queries; second, a global index based on a dictionary table is designed, the table recording the matching relationship between provenance data and data together with the provenance graph ID; then, a bitmap-based local index is proposed which, following the RDF query patterns used on provenance graphs, provides indexes that serve Triple Pattern queries and three kinds of join queries, with corresponding query algorithms designed on top of the indexes;
The two-layer index structure for provenance graph queries comprises a global index based on a dictionary table and a local index based on bitmaps; the global index finds the server node on which a provenance graph is stored, and the local index refines the query on the server node found by the global index to retrieve the required provenance data; the global index is distributed to every node in the cloud environment, so that when a user request arrives, the location of the node holding the queried provenance graph can be obtained from the global index structure on the local server alone; each local index covers only the provenance data stored on its own server, and the local indexes on different nodes do not depend on one another;
The global index and global query algorithm based on the dictionary table are as follows:
The dictionary table structure is given first, and the query flow based on the global index is built on it;
1) Dictionary table structure
The dictionary table HCPTable is designed from two aspects, according to the characteristics of data provenance; first, it stores each provenance graph name with its corresponding data items; a data item is the data that the provenance describes, and all data in one workflow run correspond to a single provenance graph, which gives a coarse-grained description of the relationship between provenance and data; second, it stores each provenance graph name with its corresponding ID; every execution of a workflow generates one data provenance graph, and the provenance ID is generated during storage by a Hash(key) mapping; in the global index, the provenance graph ID is the input to the consistent hashing index algorithm, so the server node storing a provenance graph can be computed quickly from its ID;
2) Query flow based on the global index
The provenance graph ID is looked up in HCPTable; starting from the root node, the tree is traversed down to a leaf node, and the leaf node gives the server that stores the provenance graph; the global index query flow is as follows:
(1) look up the dictionary table to obtain the provenance graph ID
(2) find the child node that satisfies the requirement
(3) compute and output the number of this node.
2. The provenance graph query method based on a two-layer index structure according to claim 1, characterized in that the bitmap-based local index and local query algorithm are as follows:
A provenance graph query has two parts: single Triple Pattern queries and join queries;
(1) Single Triple Pattern query
The selection indexes Is, Ip, Io, Isp, Ipo and Iso serve single Triple Pattern queries in which the subject, the predicate, the object, the subject and predicate, the predicate and object, or the subject and object are known;
(2) Join query
The join indexes Is', Io' and Iso' serve join queries with a shared subject variable, a shared object variable, or a variable shared between a subject and an object.
CN201510969332.5A 2015-12-21 2015-12-21 A provenance graph query method based on a two-layer index structure Expired - Fee Related CN105550332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510969332.5A CN105550332B (en) 2015-12-21 2015-12-21 A provenance graph query method based on a two-layer index structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510969332.5A CN105550332B (en) 2015-12-21 2015-12-21 A kind of provenance graph querying method based on the double-deck index structure

Publications (2)

Publication Number Publication Date
CN105550332A CN105550332A (en) 2016-05-04
CN105550332B true CN105550332B (en) 2019-03-29

Family

ID=55829521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510969332.5A Expired - Fee Related CN105550332B (en) 2015-12-21 2015-12-21 A kind of provenance graph querying method based on the double-deck index structure

Country Status (1)

Country Link
CN (1) CN105550332B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709000B (en) * 2016-12-22 2020-07-14 河海大学 Key view discovery method based on PageRank and origin graph abstraction
CN107016065A (en) * 2017-03-16 2017-08-04 陕西科技大学 It is customizable to rely on semantic effective origin filter method
CN108733681B (en) * 2017-04-14 2021-10-22 华为技术有限公司 Information processing method and device
US10949467B2 (en) * 2018-03-01 2021-03-16 Huawei Technologies Canada Co., Ltd. Random draw forest index structure for searching large scale unstructured data
CN109857743A (en) * 2019-02-12 2019-06-07 浙江水利水电学院 The construction method and device querying method and system of symmetrical canonical multi-dimensional indexing platform
CN112817538B (en) * 2021-02-22 2022-08-30 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831225A (en) * 2012-08-27 2012-12-19 南京邮电大学 Multi-dimensional index structure under cloud environment, construction method thereof and similarity query method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Matrix "Bit" loaded: a scalable lightweight join query processor for RDF data";Medha Atre et al.;《WWW '10: Proceedings of the 19th International Conference on World Wide Web》;20101231;pp. 41-50 *
"Storing, Indexing and Querying Large Provenance Data Sets as RDF Graphs in Apache HBase";Artem Chebotko et al.;《2013 IEEE Ninth World Congress on Services》;20131107;pp. 1-8 *
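"Research on Consistent Hashing Algorithms in Distributed Storage Systems";Yang Yujian et al.;《Computer Knowledge and Technology》;20110831;Vol. 7 (No. 22);pp. 5295-5296 *
"Partitioned Bitmap Index: A Secondary Index Mechanism Suitable for Cloud Data Management";Meng Biping et al.;《Chinese Journal of Computers》;20121130;Vol. 35 (No. 11);pp. 2306-2316 *
"Distributed Data Storage Method Based on Consistent Tree Distribution";Guo Dong et al.;《Journal of Computer Applications》;20131201;Vol. 33 (No. 12);pp. 3432-3436 *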

Also Published As

Publication number Publication date
CN105550332A (en) 2016-05-04

Similar Documents

Publication Publication Date Title
CN105550332B (en) A kind of provenance graph querying method based on the double-deck index structure
US11120022B2 (en) Processing a database query using a shared metadata store
Özsu A survey of RDF data management systems
US9449115B2 (en) Method, controller, program and data storage system for performing reconciliation processing
US9507875B2 (en) Symbolic hyper-graph database
Junghanns et al. Gradoop: Scalable graph data management and analytics with hadoop
Abraham et al. Distributed storage and querying techniques for a semantic web of scientific workflow provenance
Chen et al. SparkRDF: elastic discreted RDF graph processing engine with distributed memory
Huang et al. Query optimization of distributed pattern matching
US20140324882A1 (en) Method and system for navigating complex data sets
Liagouris et al. An effective encoding scheme for spatial RDF data
Madkour et al. WORQ: workload-driven RDF query processing
Azhir et al. Query optimization mechanisms in the cloud environments: A systematic study
Alaoui A categorization of RDF triplestores
CN108241709A (en) A kind of data integrating method, device and system
Pawar et al. Keyword search in information retrieval and relational database system: Two class view
Svoboda et al. Linked data indexing methods: A survey
Schroeder et al. A data distribution model for RDF
Pandat et al. Load balanced semantic aware distributed RDF graph
JP5464017B2 (en) Distributed memory database system, database server, data processing method and program thereof
Bugiotti et al. SPARQL Query Processing in the Cloud.
Kondylakis et al. Enabling joins over cassandra NoSQL databases
Troullinou et al. DIAERESIS: RDF data partitioning and query processing on SPARK
Jose et al. Semantic Web Query Join Optimization Using Modified Grey Wolf Optimization Algorithm.
Valenta et al. Distributed evaluation of XPath axes queries over large XML documents stored in MapReduce clusters

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190329