CN106980901B - Streaming RDF data parallel reasoning algorithm - Google Patents

Publication number
CN106980901B
CN106980901B CN201710246309.2A CN201710246309A CN106980901B CN 106980901 B CN106980901 B CN 106980901B CN 201710246309 A CN201710246309 A CN 201710246309A CN 106980901 B CN106980901 B CN 106980901B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710246309.2A
Other languages
Chinese (zh)
Other versions
CN106980901A (en)
Inventor
汪璟玢
叶怡新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Application filed by Fuzhou University
Priority to CN201710246309.2A
Publication of CN106980901A
Publication of CN106980901B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • G06N5/045: Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a streaming RDF data parallel reasoning algorithm. A pseudo bidirectional network of the rules is constructed, and an intermediate node is created wherever a rule node contains a link variable between classes. At regular intervals, the new batch of data in the streaming data flow and the data generated by the previous round of reasoning are taken as input; the input data is classified into existing nodes, or new nodes are created for it, and it is stored in the corresponding Redis cluster. Using the pseudo bidirectional network, the algorithm checks for each input triple whether all antecedents monitored by the corresponding intermediate node or rule node are satisfied, and if so applies the rule and generates inferred data. Duplicate inferred data is removed in real time, and all data generated in the current round is saved to the Redis cluster as input for the next round, which efficiently realizes parallel streaming reasoning over RDF data under the OWL Horst rules.

Description

Streaming RDF data parallel reasoning algorithm
Technical field
The invention belongs to the field of semantic web technology and specifically relates to a streaming RDF data parallel reasoning algorithm.
Background art
In recent years, researchers have increasingly recognized the importance of parallel reasoning algorithms for real-time streaming data, but few algorithms have been proposed for this field, and further study is needed. Reasoning has also been studied extensively in intelligent technology, for example in knowledge discovery and case-based reasoning. Using distributed parallel computing to solve problems involving large-scale streaming RDF data has become a consensus in academia and industry.
Streaming parallel reasoning over RDFS/OWL is a relatively new field. Barbieri D F et al. proposed an incremental reasoning algorithm based on streams and rich background knowledge. The algorithm attaches an expiration time to each RDF triple; when new streaming data arrives, it reasons over the new data and removes expired facts and invalid triples. The IDRM algorithm performs RDFS reasoning over incremental data efficiently and scalably, but because it models the RDFS rules specifically, it is inefficient for OWL Horst rules. Chevalier J et al. proposed an efficient incremental reasoner (Slider), which exploits intrinsic characteristics of semantic data streams to provide a scalable batch-processing reasoner for streaming data. Because Slider is designed only for RDFS rules, however, it is not suited to the more complex OWL Horst rule reasoning.
The challenges in reasoning over large-scale RDF files today include the following: appropriate triples are hard to obtain from data distributed across the network; the growing data volume demands computing capacity that scales to large datasets; and existing reasoning methods target static ontologies, whereas real-world data changes constantly. Existing distributed reasoning methods focus mainly on static data, and parallel reasoning over streaming RDF data is a relatively new field.
Technical problems to be solved:
1. How to combine the RDF data ontology with the OWL Horst rules to build a pseudo bidirectional network of the rules, containing the class nodes corresponding to schema triples and the rule nodes, so that all OWL Horst rules can be reasoned over efficiently on large-scale streaming data.
2. How to design a parallel reasoning scheme that matches the proposed streaming scheme, to meet the demands of distributed parallel reasoning over large-scale streaming data.
Summary of the invention
To solve the above problems, the present invention provides a streaming RDF data parallel reasoning algorithm. For the OWL Horst rules, and drawing on the strengths of the HAL algorithm, it proposes the PRAS algorithm (Parallel Reasoning Algorithm for Streaming RDF Data). The algorithm efficiently builds and maintains the pseudo bidirectional network over large-scale streaming data and performs reasoning correctly and completely.
To achieve the above object, the invention adopts the following technical scheme: a streaming RDF data parallel reasoning algorithm, characterized by comprising the following steps:
S1: load the rule nodes and the schema triples P_j_RDD and O_k_RDD and save them to the Redis cluster; construct the intermediate nodes midnode for the link variables in the rules; go to S2;
S2: at regular intervals, read the batch of new data new_data from the data stream and the data itr_data generated by the previous round of reasoning; if a triple is a schema triple (S_i, P_i, O_i), go to S3; if it is an instance triple (s_i, p_i, o_i), go to S5; if both new_data and itr_data are empty, the algorithm ends;
S3: if the corresponding class node P_j_RDD or O_k_RDD exists, assign the triple to that class node; otherwise create the class node and save it to the Redis cluster; if the predicate belongs to SymmetricProperty, go to S4; otherwise go to S6. SymmetricProperty is the set that identifies the predicates of schema triples that have a symmetric relation. The symmetric-property triple set SymTriples is defined as follows: [formula not reproduced], where P_j_RDD is the set of schema triples;
S4: classify the input data and reason over it;
S5: store and deduplicate the triples generated by reasoning.
Compared with the prior art, the invention has the following advantages:
1. It combines the OWL Horst rules with the RDF ontology file to build the pseudo bidirectional network structure, which improves the efficiency of streaming reasoning.
2. It uses a storage strategy designed around the Redis cluster to deduplicate triples and store the iteration data, which reduces the storage space and reasoning time spent on duplicate triples and thus improves reasoning efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall framework of the invention.
Fig. 2 shows the construction of the pseudo bidirectional network.
Fig. 3 shows the loading of the rules and ontology data and the construction of the pseudo bidirectional network.
Fig. 4 is the OWL Horst rule relation diagram.
Detailed description of the embodiments
The invention is further explained below with reference to the drawings and specific embodiments.
The streaming parallel reasoning proposed by the invention consists of three main parts: building the pseudo bidirectional network, classifying the streaming data, and reasoning under the OWL Horst rules. Based on the characteristics of Spark Streaming and Redis, and combining the HAL algorithm with the OWL Horst rules and the RDF data ontology, the pseudo bidirectional network of the rules is built. It contains the class nodes corresponding to the schema triples and the rule nodes, and an intermediate node is created wherever a rule node contains a link variable between classes. Then, at regular intervals, the new batch of data in the streaming data flow and the data generated by the previous round of reasoning are taken as input; the input data is classified into existing nodes, or new nodes are created for it, and it is stored in the corresponding Redis cluster. Next, using the pseudo bidirectional network, the algorithm checks for each input triple whether all antecedents monitored by the corresponding intermediate node or rule node are satisfied, and if so applies the rule and generates inferred data. Finally, duplicate inferred data is removed in real time and all data generated in the current round is saved to the Redis cluster as input for the next round, which efficiently realizes parallel streaming reasoning over RDF data under the OWL Horst rules.
The overall framework is shown in Fig. 1.
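The round-based loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: a Python set stands in for the Redis cluster, a list of lists stands in for the Spark Streaming micro-batches, and run_batches, infer_inverse and INVERSE are hypothetical names, with a single inverse-property rule standing in for the full OWL Horst rule set.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def run_batches(batches: Iterable[List[Triple]],
                infer: Callable[[Triple, Set[Triple]], Set[Triple]]) -> Set[Triple]:
    """Process micro-batches until neither new nor inferred data remains."""
    store: Set[Triple] = set()       # stands in for the Redis cluster
    itr_data: Set[Triple] = set()    # triples inferred in the previous round
    pending = iter(batches)
    while True:
        new_data = next(pending, [])     # empty once the stream is exhausted
        inputs = set(new_data) | itr_data
        if not inputs:                   # new_data and itr_data both empty: stop
            return store
        store |= inputs                  # classify and store the inputs
        inferred: Set[Triple] = set()
        for triple in inputs:
            inferred |= infer(triple, store)
        itr_data = inferred - store      # drop triples that are already known


# Hypothetical schema knowledge used only by this demonstration rule.
INVERSE = {"memberOf": "member", "member": "memberOf"}


def infer_inverse(triple: Triple, store: Set[Triple]) -> Set[Triple]:
    """Toy rule: (s, p, o) with p inverseOf q yields (o, q, s)."""
    s, p, o = triple
    return {(o, INVERSE[p], s)} if p in INVERSE else set()


result = run_batches(
    [[("GraduateStudent0", "memberOf", "University0_Department0")]],
    infer_inverse,
)
```

Because inferred triples are fed back as itr_data and filtered against what is already stored, the loop stops once a round produces nothing new.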
A streaming RDF data parallel reasoning algorithm comprises the following steps:
S1: load the rule nodes and the schema triples P_j_RDD and O_k_RDD and save them to the Redis cluster; construct the intermediate nodes midnode for the link variables in the rules; go to S2;
S2: at regular intervals, read the batch of new data new_data from the data stream and the data itr_data generated by the previous round of reasoning; if a triple is a schema triple (S_i, P_i, O_i), go to S3; if it is an instance triple (s_i, p_i, o_i), go to S5; if both new_data and itr_data are empty, the algorithm ends;
S3: if the corresponding class node P_j_RDD or O_k_RDD exists, assign the triple to that class node; otherwise create the class node and save it to the Redis cluster; if the predicate belongs to SymmetricProperty, go to S4; otherwise go to S6. SymmetricProperty is the set that identifies the predicates of schema triples that have a symmetric relation. The symmetric-property triple set SymTriples is defined as follows: [formula not reproduced],
where P_j_RDD is the set of schema triples. For example, under the OWL Horst rules, SymTriples = {sameAs, inverseOf, equivalentClass, equivalentProperty};
S4: classify the input data and reason over it;
S5: store and deduplicate the triples generated by reasoning.
S4 comprises the following steps: S41: if the input triple is a schema triple (S_i, P_i, O_i), store it twice, once with P_i+"_"+S_i as key and O_i as value and once with P_i+"_"+O_i as key and S_i as value, which builds the bidirectional relation between S_i and O_i in the triple; save both to the Redis cluster and go to S43;
S42: if the input triple is an instance triple (s_i, p_i, o_i), build the three key-value pairs <s_i, (p_i, o_i)>, <p_i, (s_i, o_i)> and <o_i, (s_i, p_i)> from it, store them in the Redis cluster, and go to S43;
S43: check the pseudo bidirectional network corresponding to new_data or itr_data, and determine whether new_data or itr_data involves a Rule_m_link_RDD monitored by a rule node or an intermediate node; if it is a Rule_m_link_RDD monitored by an intermediate node, go to S44; if it is a Rule_m_link_RDD monitored by a rule node, go to S45; otherwise go to S2. In the pseudo bidirectional network, a rule node Rule_i_node is created for each rule Rule_i, a class node Class_i_node is built for each class involved in the rule, and an intermediate node mid_i_node is created if the rule's antecedents contain a link variable. A link variable of rule Rule_i is a schema triple term in Rule_i that connects two antecedents. The link variable information of each rule is stored in Rule_m_link_RDD in <key, value> form, where key holds all schema triple terms of the rule used to connect antecedents and value holds the schema triple terms of the rule's conclusion. The construction of the pseudo bidirectional network is shown in Fig. 2.
S45: determine whether the monitored Rule_m_link_RDD is fully satisfied; if so, go to S46; otherwise go to S2;
S46: determine whether all antecedents of the rule node are satisfied; if so, execute the rule's reasoning, generate the triples, and go to S5; otherwise go to S2.
S5 comprises the following specific steps: store the triples generated by reasoning in the set named itr_data in the Redis cluster, remove duplicate triples, and then use the itr_data set as part of the input data for the next round of reasoning; if no stop command has been issued, go to S2.
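The key layout of S41 and S42 can be illustrated with a small in-memory model. This is a sketch under assumptions: SetStore is a hypothetical stand-in that offers only the sadd and smembers operations of Redis sets, and store_symmetric_schema and store_instance are illustrative names that do not appear in the patent.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set


class SetStore:
    """In-memory stand-in for Redis sets (sadd and smembers only)."""

    def __init__(self) -> None:
        self._sets: Dict[str, Set[Hashable]] = defaultdict(set)

    def sadd(self, key: str, member: Hashable) -> bool:
        """Add member to the set at key; return True if it was not yet present."""
        before = len(self._sets[key])
        self._sets[key].add(member)
        return len(self._sets[key]) > before

    def smembers(self, key: str) -> Set[Hashable]:
        """Return a copy of the set at key (empty if the key is unknown)."""
        return set(self._sets.get(key, set()))


def store_symmetric_schema(store: SetStore, s: str, p: str, o: str) -> None:
    """S41: save both directions of a schema triple with a symmetric predicate."""
    store.sadd(f"{p}_{s}", o)
    store.sadd(f"{p}_{o}", s)


def store_instance(store: SetStore, s: str, p: str, o: str) -> None:
    """S42: index an instance triple by subject, by predicate and by object."""
    store.sadd(s, (p, o))
    store.sadd(p, (s, o))
    store.sadd(o, (s, p))


store = SetStore()
store_symmetric_schema(store, "memberOf", "inverseOf", "member")
store_instance(store, "GraduateStudent0", "memberOf", "University0_Department0")
```

Indexing every instance triple three ways lets a rule fetch its candidate triples with a single set lookup on whichever term it already knows, at the cost of storing each triple three times.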
The PRAS algorithm of the invention builds on the characteristics of Spark RDDs and the Redis cluster, and combines the HAL algorithm with the OWL Horst rules and the RDF ontology data to construct the pseudo bidirectional network of the rules. First, for each schema triple (S_i, P_j, O_k), the corresponding class node O_k_RDD or P_j_RDD is built and saved to the Redis cluster; if P belongs to the symmetric properties, the bidirectional relation between S and O in the triple is also built and saved to the Redis cluster. To decide quickly whether all antecedents of a rule are satisfied, a rule node is created for every rule. If a rule contains a link variable link_var, an intermediate node midnode is created; the test condition information is kept in the intermediate node, and two-way communication is set up between the intermediate node and the rule node. If there is no link variable, the class node is connected directly to the rule node and the test condition is stored in the class node. Taking rule 8a in Fig. 2 as an example, the schematic diagram is shown in Fig. 3. By building heuristic information between nodes and the symmetric-property relations, and by exploiting the efficient access of the Redis cluster, the required triples are read from the Redis cluster by query, which reduces the reading and transmission of irrelevant triples and thus improves overall reasoning efficiency.
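The check that all antecedents of a rule are satisfied can be pictured with a much simplified node model. This is only a sketch under assumptions: RuleNode, its fields and the antecedent labels are hypothetical, and intermediate nodes are not modelled separately; each antecedent is represented by a plain string label.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class RuleNode:
    """Tracks which antecedents of one rule have been matched so far."""
    name: str
    antecedents: FrozenSet[str]                      # one hypothetical label per antecedent
    satisfied: Set[str] = field(default_factory=set)

    def notify(self, antecedent: str) -> bool:
        """Record a matched antecedent; return True once every antecedent is met."""
        if antecedent in self.antecedents:
            self.satisfied.add(antecedent)
        return self.satisfied == self.antecedents


# The four antecedents of the someValuesFrom rule, written as labels.
labels = ("v someValuesFrom w", "v onProperty p", "u p x", "x type w")
rule15 = RuleNode("rule15", frozenset(labels))
fired = [rule15.notify(label) for label in labels]
```

In this model the rule fires only on the notification that completes the set; earlier notifications merely update the node's state, which mirrors the "partly satisfied" case in which the condition state is modified and no triple is produced.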
The Map phase mainly performs data classification and reasoning. If the batch of streaming data new_data obtained at regular intervals from the streaming data flow, or the data itr_data generated by the previous round of reasoning, is ontology data, it is assigned to the corresponding class node and the value of that node in the Redis cluster is updated; if its property is a symmetric property, the bidirectional relation between S and O in the triple is built with "symm_"+S and "symm_"+O as keys and stored in the Redis cluster. If new_data or itr_data is instance data, then for each instance triple (s_i, p_i, o_i) the three key-value pairs <s_i, (p_i, o_i)>, <p_i, (s_i, o_i)> and <o_i, (s_i, p_i)> are built and stored in the Redis cluster. The pseudo bidirectional network corresponding to new_data or itr_data is then checked to determine whether the link variables monitored by the corresponding intermediate node, or all antecedents of the corresponding rule node (which may involve several intermediate nodes), are satisfied. If they are, the rule's reasoning is executed and the generated triples are output to the Reduce phase; if they are only partly satisfied, the state of the corresponding conditions is updated. The specific steps of the data classification and reasoning algorithm proposed here are as follows:
Map phase algorithm
Input: streaming triple data and the triples generated by the previous round of reasoning
Output: <"new", inferred triple>
Step 1: for an input triple (S_i, P_j, O_k) ∈ SchemaTriple, assign it to the corresponding class node and update the Redis cluster. If P_j is a symmetric property, build the bidirectional relation between S_i and O_k in the triple, with P_j+"_"+S_i as key and O_k as value and with P_j+"_"+O_k as key and S_i as value, and store it in the Redis cluster. Go to Step 3.
Step 2: for an input triple (s_i, p_j, o_k) ∈ InstanceTriple, build the three key-value pairs <s_i, (p_j, o_k)>, <p_j, (s_i, o_k)> and <o_k, (s_i, p_j)> and store them in the Redis cluster. Go to Step 3.
Step 3: check the pseudo bidirectional network corresponding to (s_i, p_j, o_k), read the required data from the Redis cluster, and determine whether the link variables monitored by the intermediate node corresponding to (s_i, p_j, o_k), or all antecedents of the corresponding rule node (which may involve several intermediate nodes), are satisfied. If all are satisfied, go to Step 4. If some are not satisfied, update the monitoring information of the intermediate node or class node according to (S_i, P_j, O_k).
Step 4: obtain the triple generated by reasoning according to the conclusion of the current rule, and output <"new", inferred triple>.
Taking rules 8a and 8b (inverseOf) in Fig. 4 as an example, the pseudo-code is as follows:
Input: (S1, P1, O1)
Output: <"new", inferred triple>
Begin
  If (S1, P1, O1) ∈ SchemaTriple {   // schema triple: classify and save
    If P1 equals "type"
      sadd O1 S1
    else {
      sadd P1 (S1, O1)
      If P1 ∈ SymmetricProperty {   /* symmetric predicate: build and save the symmetric relation between S1 and O1 */
        sadd P1+"_"+S1 O1
        sadd P1+"_"+O1 S1
      }
    }
  }
  Else {   /* instance triple: build and save three key-value pairs */
    sadd S1 (P1, O1)
    sadd P1 (S1, O1)
    sadd O1 (S1, P1)
  }
  /* read the sets inverseOf_S1 and inverseOf_O1 from the Redis cluster into inverseOf */
  inverseOf ← smembers("inverseOf_"+S1) ∪ smembers("inverseOf_"+O1)
  If (inverseOf != null) {
    yield ("new", (O1, P1, S1))
    For (inverse in inverseOf.value) {
      yield ("new", (O1, inverse, S1))
    }
  }
End
Suppose the current input batch contains the schema triple T (memberOf, owl:inverseOf, member) and the instance triple t (GraduateStudent0, memberOf, University0_Department0). First, for the schema triple T, check whether inverseOf_RDD exists; if it does not, create inverseOf_RDD and save (memberOf, member) into it; if it does, save directly into inverseOf_RDD. Then, because inverseOf is a symmetric property, build the bidirectional relation between memberOf and member, with inverseOf_memberOf as key and member as value and with inverseOf_member as key and memberOf as value, and store it in the Redis cluster. For the instance triple t, build <GraduateStudent0, (memberOf, University0_Department0)>, <memberOf, (GraduateStudent0, University0_Department0)> and <University0_Department0, (GraduateStudent0, memberOf)> and save them to the Redis cluster. Finally, read the sets inverseOf_memberOf and inverseOf_member from the Redis cluster into inverseOf, traverse inverseOf, and output (University0_Department0, member, GraduateStudent0).
For rules that involve symmetric properties, such as rules 8a and 8b, building and storing the bidirectional relation of the symmetric property makes it possible to find the relevant triples in the Redis cluster quickly, which improves reasoning efficiency.
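A runnable sketch of the inverseOf case, using the memberOf example, is given below. It makes several assumptions that are not in the patent text: a dictionary of Python sets (kv) stands in for the Redis cluster, process is a hypothetical function name, the inverse lookup for an instance triple uses the key "inverseOf_" + predicate, and schema triples are assumed to be processed before the instance triples that depend on them. The inferred triple follows the (O1, inverse, S1) order of the pseudo-code.

```python
from collections import defaultdict

kv = defaultdict(set)   # in-memory stand-in for Redis: key -> set of members


def sadd(key, member):
    """Add member to the set stored at key."""
    kv[key].add(member)


def smembers(key):
    """Return a copy of the set stored at key (empty if the key is unknown)."""
    return set(kv.get(key, ()))


def process(s, p, o):
    """Store one triple and return the triples derived by the inverseOf rules."""
    if p == "owl:inverseOf":
        # Schema triple: save both directions of the symmetric relation.
        sadd("inverseOf_" + s, o)
        sadd("inverseOf_" + o, s)
        return set()
    # Instance triple: build the three key-value pairs.
    sadd(s, (p, o))
    sadd(p, (s, o))
    sadd(o, (s, p))
    # Assumed lookup key: the inverses recorded for this triple's predicate.
    return {(o, inverse, s) for inverse in smembers("inverseOf_" + p)}


process("memberOf", "owl:inverseOf", "member")
inferred = process("GraduateStudent0", "memberOf", "University0_Department0")
```

Because both directions of the schema triple are stored, a later triple that uses member as its predicate would be answered by the inverseOf_member set in the same way.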
Taking rule 15 (someValuesFrom) in Fig. 4 as an example, the pseudo-code is as follows:
Input: (S1, P1, O1)
Output: <"new", inferred triple>
Begin
  If (S1, P1, O1) ∈ SchemaTriple {   // schema triple: classify and save
    If P1 equals "type"
      sadd O1 S1
    else {
      sadd P1 (S1, O1)
      If P1 ∈ SymmetricProperty {   // symmetric predicate: build and save the symmetric relation between S1 and O1
        sadd P1+"_"+S1 O1
        sadd P1+"_"+O1 S1
      }
    }
  }
  Else {   // instance triple: build and save three key-value pairs
    sadd S1 (P1, O1)
    sadd P1 (S1, O1)
    sadd O1 (S1, P1)
  }
  someValuesFrom_set ← smembers("someValuesFrom")   /* read the someValuesFrom set from the Redis cluster */
  onProperty_set ← smembers("onProperty")
  For (svf in someValuesFrom_set) {
    For (op in onProperty_set) {
      If (svf.v equals op.v) {
        temp_w ← smembers(svf.w)
        x_type_w = temp_w.filter(x => x.p == "type")   /* keep the triples in temp_w whose p is type */
        u_p_x ← smembers(op.p)
        result = u_p_x.filter(t => t.x == x_type_w.x)
        For (t in result) {
          yield ("new", (t.u, type, svf.v))   /* join the rule's antecedents and emit the inferred result */
        }
      }
    }
  }
End
Suppose the current input batch contains the schema triples T1 (Chair, owl:someValuesFrom, Department) and T2 (Chair, owl:onProperty, headOf) and the instance triples t1 (FullProfessor7, headOf, University0_Department0) and t2 (University0_Department0, rdf:type, Department). First, for the schema triples T1 and T2, check whether someValuesFrom_RDD and onProperty_RDD exist; if they do not, create them and save (Chair, Department) into someValuesFrom_RDD and (Chair, headOf) into onProperty_RDD; if they do, save directly into someValuesFrom_RDD or onProperty_RDD. For the instance triple t1, build <FullProfessor7, (headOf, University0_Department0)>, <headOf, (FullProfessor7, University0_Department0)> and <University0_Department0, (FullProfessor7, headOf)> and save them to the Redis cluster; t2 is handled in the same way. Next, read the someValuesFrom and onProperty sets from the Redis cluster into someValuesFrom_set and onProperty_set and traverse them. Here the Chair in someValuesFrom_set matches the Chair in onProperty_set, so the two sets keyed by Department and by headOf are fetched from the Redis cluster. Finally, FullProfessor7 is joined with Chair and (FullProfessor7, rdf:type, Chair) is output.
For rules with several link variables, such as rule 15, schema triples can be fetched quickly from the Redis cluster by the key of their class node; for the associated instance triples, the storage strategy for instance triples in Redis allows the relevant instance triples to be found by the value of the link variable, which improves reasoning efficiency.
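The join for rule 15 can be traced with the Chair example in a short runnable sketch. The assumptions are as follows: a dictionary of Python sets (kv) stands in for the Redis cluster, save and rule15 are hypothetical function names, schema triples are kept under the keys "someValuesFrom" and "onProperty", and the toy key space is assumed to be free of collisions between subject, predicate and object keys.

```python
from collections import defaultdict

kv = defaultdict(set)   # in-memory stand-in for Redis: key -> set of members


def sadd(key, member):
    """Add member to the set stored at key."""
    kv[key].add(member)


def smembers(key):
    """Return a copy of the set stored at key (empty if the key is unknown)."""
    return set(kv.get(key, ()))


def save(s, p, o):
    """Classify and store one triple."""
    if p in ("owl:someValuesFrom", "owl:onProperty"):
        sadd(p.split(":")[1], (s, o))   # schema triple: class node keyed by predicate
    else:
        sadd(s, (p, o))                 # instance triple: three key-value pairs
        sadd(p, (s, o))
        sadd(o, (s, p))


def rule15():
    """(v someValuesFrom w), (v onProperty p), (u p x), (x type w) => (u type v)."""
    inferred = set()
    for v, w in smembers("someValuesFrom"):
        for v2, p in smembers("onProperty"):
            if v != v2:
                continue
            # Subjects x with (x, rdf:type, w), read from the object index of w.
            xs = {x for x, pred in smembers(w) if pred == "rdf:type"}
            # Predicate index of p holds (u, x) pairs; join on x.
            for u, x in smembers(p):
                if x in xs:
                    inferred.add((u, "rdf:type", v))
    return inferred


save("Chair", "owl:someValuesFrom", "Department")
save("Chair", "owl:onProperty", "headOf")
save("FullProfessor7", "headOf", "University0_Department0")
save("University0_Department0", "rdf:type", "Department")
inferred = rule15()
```

The two inner lookups, by w and by p, are the reads "keyed by Department and by headOf" in the worked example; only those two sets are fetched rather than the whole triple store.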
The Reduce phase mainly saves the data generated by reasoning. The triples generated by reasoning are stored in the set named "itr_data" in the Redis cluster, duplicate triples are removed, and the "itr_data" set is then used as part of the input data for the next round of reasoning. The specific steps of the data deduplication and storage algorithm proposed here are as follows:
Reduce algorithm
Input: <"new", Iterator<String> values>
Output: null
Step 1: store the input SchemaTriple and InstanceTriple data in the Redis cluster under the set name itr_data, to be read by the next round of reasoning.
To make the deduplication and storage of the input data in the Reduce phase more concrete, the pseudo-code is as follows:
Input: <"new", Iterator<String> values>
Output: null
Begin:
  del itr_data
  For each itr in values
    sadd itr_data itr.value   /* traverse values and add each value to the itr_data set in the Redis cluster */
End
As the pseudo-code above shows, in the Reduce phase the input triples are deduplicated and stored through a Redis set, which prepares the data for the next round of reasoning.
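The same deduplication can be sketched in a few lines. This sketch assumes a Python set in place of the Redis set itr_data, and reduce_inferred is a hypothetical function name; the set semantics match the Redis SADD command, which ignores members that are already present.

```python
def reduce_inferred(values, itr_data):
    """Replace the contents of itr_data with the deduplicated inferred triples."""
    itr_data.clear()            # mirrors `del itr_data`
    for value in values:
        itr_data.add(value)     # mirrors `sadd itr_data value`; duplicates are dropped
    return itr_data


itr_data = {"stale triple from the previous round"}
reduce_inferred(
    ["s1 p1 o1", "s1 p1 o1", "s2 p2 o2"],   # the Map phase may emit the same triple twice
    itr_data,
)
```

Clearing the set first means itr_data holds only the triples inferred in the current round, so the next round does not reprocess older results.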
Complexity analysis is an important indicator of an algorithm's efficiency, and the complexity analysis of the PRAS algorithm differs from that of a centralized algorithm: it can be decomposed into the Map and Reduce phases, which are analyzed separately. Suppose the experimental dataset contains N triples, the time to read data from Redis is t, the parallelism of the Map tasks in the MapReduce process is k, the number of instance triples received in the Reduce phase is m, and the parallelism of the Reduce tasks is x. In the Map phase, the PRAS algorithm scans each input triple once against the class nodes or intermediate nodes to determine whether the triple can take part in the reasoning of some rule; if it can, the result is obtained by reading the antecedent data from Redis and reasoning over it. The time complexity of the Map phase is therefore O(t*N/k). The Reduce phase classifies each input triple, so its time complexity is O(m/x).
The above are preferred embodiments of the invention. Any change made according to the technical scheme of the invention whose resulting function does not go beyond the scope of that technical scheme falls within the scope of protection of the invention.

Claims (2)

1. A streaming RDF data parallel reasoning algorithm, characterized by comprising the following steps:
S1: load the rule nodes and the schema triples P_j_RDD and O_k_RDD and save them to the Redis cluster; construct the intermediate nodes midnode for the link variables in the rules; go to S2;
S2: at regular intervals, read the batch of new data new_data from the data stream and the data itr_data generated by the previous round of reasoning; if a triple is a schema triple (S_i, P_i, O_i), go to S3; if it is an instance triple (s_i, p_i, o_i), go to S5; if both new_data and itr_data are empty, the algorithm ends;
S3: if the corresponding class node P_j_RDD or O_k_RDD exists, assign the triple to that class node; otherwise create the class node and save it to the Redis cluster; if the predicate belongs to SymmetricProperty, go to S4; otherwise go to S6; SymmetricProperty is the set that identifies the predicates of schema triples that have a symmetric relation; the symmetric-property triple set SymTriples is defined as follows: [formula not reproduced],
where P_j_RDD is the set of schema triples;
S4: classify the input data and reason over it;
S5: store and deduplicate the triples generated by reasoning;
S4 comprises the following steps: S41: if the input triple is a schema triple (S_i, P_i, O_i), store it twice, once with P_i+"_"+S_i as key and O_i as value and once with P_i+"_"+O_i as key and S_i as value, which builds the bidirectional relation between S_i and O_i in the triple; save both to the Redis cluster and go to S43;
S42: if the input triple is an instance triple (s_i, p_i, o_i), build the three key-value pairs <s_i, (p_i, o_i)>, <p_i, (s_i, o_i)> and <o_i, (s_i, p_i)> from it, store them in the Redis cluster, and go to S43;
S43: check the pseudo bidirectional network corresponding to new_data or itr_data, and determine whether new_data or itr_data involves a Rule_m_link_RDD monitored by a rule node or an intermediate node; if it is a Rule_m_link_RDD monitored by an intermediate node, go to S44; if it is a Rule_m_link_RDD monitored by a rule node, go to S45; otherwise go to S2; in the pseudo bidirectional network, a rule node Rule_i_node is created for each rule Rule_i, a class node Class_i_node is built for each class involved in the rule, and an intermediate node mid_i_node is created if the rule's antecedents contain a link variable; a link variable of rule Rule_i is a schema triple term in Rule_i that connects two antecedents; the link variable information of each rule is stored in Rule_m_link_RDD in <key, value> form, where key holds all schema triple terms of the rule used to connect antecedents and value holds the schema triple terms of the rule's conclusion;
S45: determine whether the monitored Rule_m_link_RDD is fully satisfied; if so, go to S46; otherwise go to S2;
S46: determine whether all antecedents of the rule node are satisfied; if so, execute the rule's reasoning, generate the triples, and go to S5; otherwise go to S2.
2. The streaming RDF data parallel reasoning algorithm according to claim 1, characterized in that S5 comprises the following specific steps: store the triples generated by reasoning in the set named itr_data in the Redis cluster, remove duplicate triples, and then use the itr_data set as part of the input data for the next round of reasoning; if no stop command has been issued, go to S2.
CN201710246309.2A 2017-04-15 2017-04-15 Streaming RDF data parallel reasoning algorithm Active CN106980901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710246309.2A CN106980901B (en) 2017-04-15 2017-04-15 Streaming RDF data parallel reasoning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710246309.2A CN106980901B (en) 2017-04-15 2017-04-15 Streaming RDF data parallel reasoning algorithm

Publications (2)

Publication Number Publication Date
CN106980901A CN106980901A (en) 2017-07-25
CN106980901B true CN106980901B (en) 2019-09-13

Family

ID=59346065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710246309.2A Active CN106980901B (en) 2017-04-15 2017-04-15 Streaming RDF data parallel reasoning algorithm

Country Status (1)

Country Link
CN (1) CN106980901B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763451B (en) * 2018-05-28 2022-03-11 福州大学 Streaming RDF data parallel reasoning algorithm based on Spark Streaming
CN108875953A (en) * 2018-06-08 2018-11-23 福州大学 A kind of complex rule reasoning design method extending DL operator

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105468702A (en) * 2015-11-18 2016-04-06 中国科学院计算机网络信息中心 Large-scale RDF data association path discovery method
CN105808853A (en) * 2016-03-09 2016-07-27 哈尔滨工程大学 Engineering application oriented body establishment management and body data automatic obtaining method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Benchmarking Streaming Computation Engines:Storm, Flink and Spark Streaming;Sanket Chintapalli etc;《2016 IEEE International Parallel and Distributed Processing Symposium Workshops》;20161231;第1789-1792页 *
HSST+基于分布式内存数据库的;董书暕等;《计算机科学》;20160331;第43卷(第3期);第220-224页 *
一种面向结构化数据源的语义标注和挖掘方法;李璞等;《南阳师范学院学报》;20160630;第15卷(第6期);第22-26页 *
结合RATe的RDF数据分布式并行推理算法;汪璟玢等;《模式识别与人工智能》;20160531;第29卷(第5期);第417-425页 *

Also Published As

Publication number Publication date
CN106980901A (en) 2017-07-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant