CN107273418A - A cross-ontology attribute chain reasoning method based on a cloud platform - Google Patents

A cross-ontology attribute chain reasoning method based on a cloud platform

Info

Publication number
CN107273418A
Authority
CN
China
Prior art keywords
attribute
entity
chain
reasoning
inference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710331029.1A
Other languages
Chinese (zh)
Inventor
陈华钧
陈曦
张宁豫
吴朝晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201710331029.1A
Publication of CN107273418A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The present invention discloses a cross-ontology attribute chain reasoning method based on a cloud platform, comprising: (1) obtaining a knowledge graph H in a unified format using a semantic linking method; (2) deriving an attribute chain inference network C from the knowledge graph; (3) replacing objects and attributes with their corresponding ids to form knowledge graph H' and attribute chain inference network C'; (4) using the MapReduce framework to perform parallel attribute chain reasoning over knowledge graph H' according to C', and modifying and updating C'; (5) saving the reasoning results to HDFS and adding them to knowledge graph H'; (6) repeating steps (4) and (5) until no new inferred facts are produced; and (7) merging the reasoning results generated on HDFS over the iterations. The method supports parallel reasoning over massive cross-ontology knowledge, scales well, and has practical value for large-scale semantic data reasoning applications.

Description

A cross-ontology attribute chain reasoning method based on a cloud platform
Technical field
The present invention relates to computer semantic reasoning technology, and in particular to a cross-ontology attribute chain reasoning method based on a cloud platform.
Background
With the continued development of the Semantic Web, the Web Ontology Language OWL, built on the Resource Description Framework, has been widely applied to ontology modeling and reasoning in many fields, including life sciences, media information, semantic spatio-temporal data, and social networks, and the semantic data in each field has grown explosively as a result. Take the Linked Open Data project as an example: it introduced the concept of linked data, with the aim of encouraging people to publish existing data as semantically linked data so that different data sources can be interconnected. To date it contains more than 295 data sources and 31 billion triple records.
These massive semantic datasets contain many implicit, complex associations. Reasoning over the existing semantic information can uncover the latent semantic information within it, and these hidden semantic relations are of great practical significance. For example, biomedical researchers can use semantic reasoning to derive drug associations that assist new drug development, and website data analysts can use reasoning to link user information together.
However, existing semantic reasoners often lack good scalability and can only handle small ontologies. As the volume of OWL ontology data keeps growing, reasoning engines that run in a stand-alone environment must load large amounts of ontology data into memory; when they perform OWL reasoning over large-scale cross-ontology data, they suffer from memory overflow and from insufficient computing performance and scalability. Traditional semantic reasoners therefore struggle to process semantic information at this scale. Meanwhile, the parallel reasoning techniques proposed so far cannot effectively solve large-scale complex semantic reasoning. The semantic research field urgently needs a high-performance reasoning engine that can handle complex semantic associations.
Summary of the invention
In view of this, the present invention provides a cross-ontology attribute chain reasoning method based on a cloud platform. Compared with other methods, the invention uses attribute chain reasoning to efficiently discover the complex semantic associations present in massive semantic data, and it is highly scalable.
A cross-ontology attribute chain reasoning method based on a cloud platform comprises the following steps:
(1) ontology data from multiple domains is merged using a semantic linking method to obtain a knowledge graph H in a unified format;
(2) a series of inference rules A expressing the relations between entities within a single ontology is obtained from the knowledge graph, and the complex relations between entities across ontologies are modeled using the property chain construct of the OWL 2 ontology description language, yielding multiple attribute chain inference rules B that express the relations between entities across ontologies; A and B together form the attribute chain inference network C;
(3) each attribute object in the knowledge graph and in attribute chain inference network C is assigned a corresponding attribute id, and each entity object is assigned a corresponding entity id, forming knowledge graph H' and attribute chain inference network C';
(4) using the MapReduce framework, parallel attribute chain reasoning is performed over knowledge graph H' according to attribute chain inference network C', and C' is modified and updated;
(5) the results obtained by the reasoning in step (4) are saved to HDFS and added to knowledge graph H';
(6) whether the results of this round of reasoning are consistent with those of the previous round is determined; if so, step (7) is performed; if not, execution returns to step (4);
(7) reasoning terminates; the results generated on HDFS over the iterations are merged and duplicate triples are removed; the results are then restored to the corresponding text triples according to the attribute mapping table and the entity mapping table, and returned as the final reasoning result.
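The control flow of steps (4) to (7) can be sketched on a single machine as a fixed-point loop. This is only an illustration under stated assumptions: `reason_to_fixpoint` and `transitive_round` are hypothetical names, a Python list stands in for the per-round result files on HDFS, and the toy reasoning round (joining `partOf` triples end to end) is not a rule taken from the patent.

```python
def transitive_round(facts, attribute="partOf"):
    """Toy reasoning round (assumption): join `attribute` triples end to end."""
    edges = [(s, o) for s, p, o in facts if p == attribute]
    return {(s1, attribute, o2) for s1, o1 in edges for s2, o2 in edges if o1 == s2}


def reason_to_fixpoint(graph, infer_round, max_rounds=100):
    """Repeat reasoning rounds until a round reproduces the previous round's result."""
    facts = set(graph)        # working copy of the knowledge graph H'
    saved_rounds = []         # stands in for per-round result files on HDFS
    previous = None
    for _ in range(max_rounds):
        result = infer_round(facts)      # step (4): one reasoning round
        saved_rounds.append(result)      # step (5): save the round's results ...
        facts |= result                  # ... and add them to H'
        if result == previous:           # step (6): same as last round, so stop
            break
        previous = result
    # step (7): merge every round's output; a set drops duplicate triples
    return set().union(*saved_rounds)


graph = {("a", "partOf", "b"), ("b", "partOf", "c"), ("c", "partOf", "d")}
inferred = reason_to_fixpoint(graph, transitive_round)
print(sorted(inferred))
```

The `max_rounds` cap is a safety guard added for the sketch; the source describes only the consistency check between consecutive rounds.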
Within a single ontology, the relation between two entities can be expressed by an attribute. In cross-ontology reasoning, such a simple relation may evolve into a very complex chain of relations, and inference rules can effectively characterize these reasoning relations.
A semantic gap exists between heterogeneous cross-domain ontologies. The semantic linking method associates the entities and relations that denote the same object in different ontologies, so that the next reasoning step can be carried out.
When the semantic linking method is used to semantically fuse cross-domain ontologies, several similarity feature functions are designed to compute the distance between entities, and entity linking and fusion are performed on that basis. The similarity feature function is:
Similarity(X, Y) = Jac(X, Y) + Cos(X, Y),
where X and Y are the description information of the two entities, Jac(X, Y) is their Jaccard similarity, and Cos(X, Y) is their cosine similarity. Entity linking is performed when Similarity(X, Y) is greater than 0.8.
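A minimal sketch of this similarity function follows. The source does not say how the description information is turned into sets or vectors, so whitespace tokenization with a bag-of-words count vector is an assumption here, and the function names are hypothetical. Only the sum of the two measures and the 0.8 threshold come from the text.

```python
import math
from collections import Counter

LINK_THRESHOLD = 0.8  # threshold stated in the source


def tokenize(text):
    """Assumption: lowercase whitespace tokens represent an entity description."""
    return text.lower().split()


def jaccard(x, y):
    """Jaccard similarity of the two token sets."""
    a, b = set(tokenize(x)), set(tokenize(y))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(x, y):
    """Cosine similarity of the two token-count vectors."""
    a, b = Counter(tokenize(x)), Counter(tokenize(y))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity(x, y):
    """Similarity(X, Y) = Jac(X, Y) + Cos(X, Y)."""
    return jaccard(x, y) + cosine(x, y)


def should_link(x, y):
    """Link two entities when their similarity exceeds the threshold."""
    return similarity(x, y) > LINK_THRESHOLD


print(should_link("aspirin relieves pain", "aspirin relieves pain"))
print(should_link("aspirin relieves pain", "gene expression marker"))
```

Because each term lies between 0 and 1, the sum ranges from 0 to 2, so the 0.8 threshold is applied to that combined scale.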
The attribute chain inference rules B effectively express derivation relations and provide the rule input for the subsequent reasoning method. The attribute chain inference network C effectively characterizes the possible relations between entities across ontologies. Simplifying the complex reasoning process with attribute chains and the attribute chain network not only reduces the complexity of reasoning but also improves the parallelism of the reasoning process, providing the basis for a distributed reasoning algorithm.
Replacing the corresponding text objects and attribute objects with ids greatly improves the efficiency of reasoning.
The specific process of step (3) is:
An attribute mapping table is built, assigning an attribute id to each attribute object;
An entity mapping table is built, assigning an entity id to each entity object;
The attribute chain objects in attribute chain inference network C are replaced with attribute ids, forming attribute chain inference network C';
Each triple in knowledge graph H is traversed, and its head node, tail node, and relation node are replaced with the corresponding entity ids and attribute id, forming knowledge graph H'.
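The id replacement in step (3) amounts to dictionary encoding, which might look like the sketch below. The rule representation (a tuple of chained attribute names plus the implied attribute), the function names, and the herb/compound/gene sample data are all assumptions made for illustration; the source specifies only that two mapping tables are built and applied to C and H.

```python
def build_id_table(items):
    """Assign consecutive integer ids in first-seen order."""
    table = {}
    for item in items:
        if item not in table:
            table[item] = len(table)
    return table


def encode(triples, rules):
    """Return (H', C', entity table, attribute table) for text triples and chain rules."""
    entity_ids = build_id_table(e for s, _, o in triples for e in (s, o))
    # Attributes may occur in triples or only in rules, so both feed the table.
    attribute_names = [p for _, p, _ in triples]
    attribute_names += [a for chain, head in rules for a in (*chain, head)]
    attribute_ids = build_id_table(attribute_names)
    h_prime = [(entity_ids[s], attribute_ids[p], entity_ids[o]) for s, p, o in triples]
    c_prime = [(tuple(attribute_ids[a] for a in chain), attribute_ids[head])
               for chain, head in rules]
    return h_prime, c_prime, entity_ids, attribute_ids


# Hypothetical sample data: (subject, attribute, object) text triples and one chain rule.
triples = [("herbA", "hasIngredient", "compoundB"),
           ("compoundB", "targets", "geneC")]
rules = [(("hasIngredient", "targets"), "associatedWith")]

h_prime, c_prime, entity_ids, attribute_ids = encode(triples, rules)
print(h_prime)
print(c_prime)
```

Keeping the two tables makes it possible to restore text triples at the end of reasoning by inverting them.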
In step (4), reasoning with the MapReduce framework greatly improves the feasibility and scalability of the reasoning. The specific implementation is:
(4-1) Map phase: (line number, triple) key-value pairs are taken as input, and (linked attribute id key, triple) key-value pairs are output;
(4-2) Reduce phase: the (linked attribute id key, triple) key-value pairs output by the Map phase are taken as the input of this phase, triples with the same id key are merged, and (_, new triple or pending triple) pairs are output;
(4-3) adjacent attribute chain objects in attribute chain inference network C' are merged to update it, and each merged attribute chain object is assigned a new id;
(4-4) whether any new triples were output is checked; if so, execution returns to step (4-1); if not, the reasoning results are output.
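The Map and Reduce phases above can be simulated in plain Python for a single attribute chain. The source does not spell out the layout of the "linked attribute id key", so this sketch assumes the key is the adjacent attribute pair plus the shared join entity, and that a chain is collapsed two attributes at a time from the left. The function names and the fresh-id counter are likewise hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict
from itertools import count


def map_phase(triples, pair):
    """Emit (join key, tagged triple) for triples matching the adjacent attribute pair."""
    left, right = pair
    for s, p, o in triples:
        if p == left:
            yield (pair, o), ("L", (s, p, o))   # joins on its object
        if p == right:
            yield (pair, s), ("R", (s, p, o))   # joins on its subject


def reduce_phase(grouped, merged_attr):
    """Merge triples sharing a key into triples labeled with the merged attribute."""
    for values in grouped.values():
        lefts = [t for tag, t in values if tag == "L"]
        rights = [t for tag, t in values if tag == "R"]
        for s, _, _ in lefts:
            for _, _, o in rights:
                yield (s, merged_attr, o)


def run_round(triples, pair, merged_attr):
    grouped = defaultdict(list)                 # stands in for the shuffle step
    for key, value in map_phase(triples, pair):
        grouped[key].append(value)
    return set(reduce_phase(grouped, merged_attr))


def infer_chain(triples, chain, head, fresh_ids):
    """Collapse `chain` pairwise; intermediate ids yield pending triples, `head` yields new ones."""
    facts = set(triples)
    chain = list(chain)
    while len(chain) > 1:
        pair = (chain[0], chain[1])
        merged = head if len(chain) == 2 else next(fresh_ids)   # (4-3): new id for merged pair
        facts |= run_round(facts, pair, merged)
        chain[:2] = [merged]                                     # update the chain in C'
    return {t for t in facts - set(triples) if t[1] == head}


h_prime = {(0, 0, 1), (1, 1, 2), (2, 2, 3)}          # id-encoded triples
new_triples = infer_chain(h_prime, chain=(0, 1, 2), head=3, fresh_ids=count(1000))
print(new_triples)
```

In this reading, triples labeled with an intermediate id correspond to the "pending" triples of step (4-2), and triples labeled with the rule's implied attribute correspond to the "new" triples.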
Traditional semantic reasoning methods are all single-machine based and have obvious shortcomings when faced with massive cross-ontology semantic data. The cloud-platform-based cross-ontology attribute chain reasoning method of the present invention exploits the scalability of the cloud platform and can process large-scale semantic data. Its specific advantages are as follows:
(1) The invention uses attribute chains and the attribute chain network to simplify the modeling of the complex cross-ontology reasoning process, overcoming the limitation that traditional reasoners can only perform semantic reasoning with a fixed set of inference rules, and thereby improving the flexibility and usability of reasoning.
(2) As a large-scale data computing framework, MapReduce is well suited to parallel algorithms, and HDFS provides the storage foundation for large-scale knowledge graphs. The invention stores large-scale semantic data in HDFS and performs parallel cross-ontology attribute chain reasoning with the MapReduce parallel framework, which greatly increases processing speed.
Brief description of the drawings
Fig. 1 is a flow chart of the cloud-platform-based cross-ontology attribute chain reasoning method of the present invention;
Fig. 2 shows the massive cross-ontology biological knowledge graph of Embodiment 1.
Detailed description of the embodiments
To describe the present invention more specifically, the technical solution of the invention is described in detail below with reference to the accompanying drawings and embodiments.
Referring to Fig. 1, the cloud-platform-based cross-ontology attribute chain reasoning method of the present invention comprises:
S01: ontology data from multiple domains is merged using a semantic linking method to obtain a knowledge graph H in a unified format.
S02: a series of inference rules A expressing the relations between entities within a single ontology is obtained from the knowledge graph, and the complex relations between entities across ontologies are modeled using the property chain construct of the OWL 2 ontology description language, yielding multiple attribute chain inference rules B that express the relations between entities across ontologies; A and B together form the attribute chain inference network C.
S03: each attribute object in the knowledge graph and in attribute chain inference network C is assigned a corresponding attribute id, and each entity object is assigned a corresponding entity id, forming knowledge graph H' and attribute chain inference network C'.
Specifically, this step is:
An attribute mapping table is built, assigning an attribute id to each attribute object;
An entity mapping table is built, assigning an entity id to each entity object;
The attribute chain objects in attribute chain inference network C are replaced with attribute ids, forming attribute chain inference network C';
Each triple in knowledge graph H is traversed, and its head node, tail node, and relation node are replaced with the corresponding entity ids and attribute id, forming knowledge graph H'.
S04: using the MapReduce framework, parallel attribute chain reasoning is performed over knowledge graph H' according to attribute chain inference network C', and C' is modified and updated.
Specifically, this step is:
(4-1) Map phase: (line number, triple) key-value pairs are taken as input, and (linked attribute id key, triple) key-value pairs are output;
(4-2) Reduce phase: the (linked attribute id key, triple) key-value pairs output by the Map phase are taken as the input of this phase, triples with the same id key are merged, and (_, new triple or pending triple) pairs are output;
(4-3) adjacent attribute chain objects in attribute chain inference network C' are merged to update it, and each merged attribute chain object is assigned a new id;
(4-4) whether any new triples were output is checked; if so, execution returns to step (4-1); if not, the reasoning results are output.
S05: the results obtained by the reasoning in S04 are saved to HDFS and added to knowledge graph H'.
S06: whether the results of this round of reasoning are consistent with those of the previous round is determined; if so, S07 is performed; if not, execution returns to S04.
S07: reasoning terminates; the results generated on HDFS over the iterations are merged and duplicate triples are removed; the results are then restored to the corresponding text triples according to the attribute mapping table and the entity mapping table, and returned as the final reasoning result.
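The final merge, de-duplication, and restoration to text triples in S07 could be sketched as follows. This assumes each iteration's output is available as a list of id triples and that the two mapping tables are plain dictionaries from text to id; the function name and the sample data are hypothetical.

```python
def decode_results(iteration_outputs, entity_ids, attribute_ids):
    """Merge per-iteration id triples, drop duplicates, and restore text triples."""
    id_to_entity = {i: name for name, i in entity_ids.items()}
    id_to_attribute = {i: name for name, i in attribute_ids.items()}
    unique = set()
    for output in iteration_outputs:      # merge the results of every iteration
        unique.update(output)             # the set removes repeated triples
    return sorted((id_to_entity[s], id_to_attribute[p], id_to_entity[o])
                  for s, p, o in unique)


# Hypothetical mapping tables and two rounds of id-encoded results.
entity_ids = {"herbA": 0, "compoundB": 1, "geneC": 2}
attribute_ids = {"hasIngredient": 0, "targets": 1, "associatedWith": 2}
iteration_outputs = [[(0, 2, 2)], [(0, 2, 2), (1, 2, 2)]]

final_result = decode_results(iteration_outputs, entity_ids, attribute_ids)
print(final_result)
```

Sorting is only there to make the output order stable for inspection; the source does not require any particular order.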
Embodiment 1
This example first semantically fuses multiple cross-ontology knowledge bases using semantic linking and fusion. A massive cross-ontology biomedical knowledge graph is used as the example here. As shown in Fig. 2, the graph integrates 20 different knowledge bases and contains close to 5 billion triples. The knowledge graph is stored in the HDFS file system in the form of triples (see Table 1) so that it can be processed in parallel.
Table 1
Subject Relation Object
Subject_text1 Relation_text1 Object_text1
Subject_text2 Relation_text2 Object_text2
Subject_text3 Relation_text3 Object_text3
Subject_textN Relation_textN Object_textN
After the above knowledge graph is built, a series of inference rules expressing the relations between entities within a single ontology can be obtained from it. Entity relations across ontologies are expressed by multi-link inference rules, so that an attribute chain inference network can be built to effectively characterize the possible relations between entities across ontologies.
The inference network and the knowledge graph are then rewritten, and the MapReduce framework is used to perform parallel iterative attribute chain reasoning over the knowledge graph according to the inference network, with the inference network modified accordingly for the next round of reasoning. The method of the invention was used to reason about and discover associations between herbs (Herb) and genes (Gene) across biomedical ontologies. The actual reasoning results show that the derived associated entities have high accuracy and that the computation runs with high efficiency.
The above embodiment describes the technical solution and beneficial effects of the present invention in detail. It should be understood that the foregoing is only the most preferred embodiment of the invention and is not intended to limit it; any modification, supplement, or equivalent substitution made within the scope of the principles of the invention shall fall within its scope of protection.

Claims (3)

1. A cross-ontology attribute chain reasoning method based on a cloud platform, comprising the following steps:
(1) ontology data from multiple domains is merged using a semantic linking method to obtain a knowledge graph H in a unified format;
(2) a series of inference rules A expressing the relations between entities within a single ontology is obtained from the knowledge graph, and the complex relations between entities across ontologies are modeled using the property chain construct of the OWL 2 semantic ontology description language, yielding multiple attribute chain inference rules B that express the relations between entities across ontologies; A and B together form the attribute chain inference network C;
(3) each attribute object in the knowledge graph and in attribute chain inference network C is assigned a corresponding attribute id, and each entity object is assigned a corresponding entity id, forming knowledge graph H' and attribute chain inference network C';
(4) using the MapReduce framework, parallel attribute chain reasoning is performed over knowledge graph H' according to attribute chain inference network C', and C' is modified and updated;
(5) the results obtained by the reasoning in step (4) are saved to HDFS and added to knowledge graph H';
(6) whether the results of this round of reasoning are consistent with those of the previous round is determined; if so, step (7) is performed; if not, execution returns to step (4);
(7) reasoning terminates; the results generated on HDFS over the iterations are merged and duplicate triples are removed; the results are then restored to the corresponding text triples according to the attribute mapping table and the entity mapping table, and returned as the final reasoning result.
2. The cloud-platform-based cross-ontology attribute chain reasoning method according to claim 1, characterized in that the specific process of step (3) is:
An attribute mapping table is built, assigning an attribute id to each attribute object;
An entity mapping table is built, assigning an entity id to each entity object;
The attribute chain objects in attribute chain inference network C are replaced with attribute ids, forming attribute chain inference network C';
Each triple in knowledge graph H is traversed, and its head node, tail node, and relation node are replaced with the corresponding entity ids and attribute id, forming knowledge graph H'.
3. The cloud-platform-based cross-ontology attribute chain reasoning method according to claim 1, characterized in that the specific steps of step (4) are:
(4-1) Map phase: (line number, triple) key-value pairs are taken as input, and (linked attribute id key, triple) key-value pairs are output;
(4-2) Reduce phase: the (linked attribute id key, triple) key-value pairs output by the Map phase are taken as the input of this phase, triples with the same id key are merged, and (_, new triple or pending triple) pairs are output;
(4-3) adjacent attribute chain objects in attribute chain inference network C' are merged to update it, and each merged attribute chain object is assigned a new id;
(4-4) whether any new triples were output is checked; if so, execution returns to step (4-1); if not, the reasoning results are output.
CN201710331029.1A 2017-05-11 2017-05-11 A cross-ontology attribute chain reasoning method based on a cloud platform Pending CN107273418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710331029.1A CN107273418A (en) 2017-05-11 2017-05-11 A cross-ontology attribute chain reasoning method based on a cloud platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710331029.1A CN107273418A (en) 2017-05-11 2017-05-11 A cross-ontology attribute chain reasoning method based on a cloud platform

Publications (1)

Publication Number Publication Date
CN107273418A (en) 2017-10-20

Family

ID=60074124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710331029.1A Pending CN107273418A (en) 2017-05-11 2017-05-11 A cross-ontology attribute chain reasoning method based on a cloud platform

Country Status (1)

Country Link
CN (1) CN107273418A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109840284A (en) * 2018-12-21 2019-06-04 中科曙光南京研究院有限公司 Family's affiliation knowledge mapping construction method and system
CN110119814A (en) * 2019-04-29 2019-08-13 武汉开目信息技术股份有限公司 Knowledge rule modeling and inference method based on object relationship chain
CN113190689A (en) * 2021-05-25 2021-07-30 广东电网有限责任公司广州供电局 Construction method, device, equipment and medium of electric power safety knowledge graph

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550190A (en) * 2015-06-26 2016-05-04 许昌学院 Knowledge graph-oriented cross-media retrieval system
CN106445913A (en) * 2016-09-06 2017-02-22 中南大学 MapReduce-based semantic inference method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XI CHEN等: "BioTCM-SE: A Semantic Search Engine for the Information Retrieval of Modern Biology and Traditional Chinese Medicine", 《COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE》 *
陈曦等: "一种基于Hadoop的语义大数据分布式推理框架", 《计算机研究与发展》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109840284A (en) * 2018-12-21 2019-06-04 中科曙光南京研究院有限公司 Family's affiliation knowledge mapping construction method and system
CN110119814A (en) * 2019-04-29 2019-08-13 武汉开目信息技术股份有限公司 Knowledge rule modeling and inference method based on object relationship chain
CN110119814B (en) * 2019-04-29 2022-04-29 武汉开目信息技术股份有限公司 Knowledge rule modeling and reasoning method based on object relation chain
CN113190689A (en) * 2021-05-25 2021-07-30 广东电网有限责任公司广州供电局 Construction method, device, equipment and medium of electric power safety knowledge graph
CN113190689B (en) * 2021-05-25 2023-04-18 广东电网有限责任公司广州供电局 Construction method, device, equipment and medium of electric power safety knowledge graph

Similar Documents

Publication Publication Date Title
Choi et al. SPIDER: a system for scalable, parallel/distributed evaluation of large-scale RDF data
CN106570081A (en) Semantic net based large scale offline data analysis framework
CN106021541A (en) Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes
CN106611037A (en) Method and device for distributed diagram calculation
Patil et al. A survey on graph database management techniques for huge unstructured data
EP3387525B1 (en) Learning from input patterns in programing-by-example
JP5544118B2 (en) Data processing apparatus and processing method
CN107273418A (en) A cross-ontology attribute chain reasoning method based on a cloud platform
Podlesny et al. Cok: A survey of privacy challenges in relation to data meshes
James et al. Hybrid database system for big data storage and management
Gombos et al. Spar (k) ql: SPARQL evaluation method on Spark GraphX
Narkhede et al. Analyzing web application log files to find hit count through the utilization of Hadoop MapReduce in cloud computing environment
Chakravorty et al. A scalable k-anonymization solution for preserving privacy in an aging-in-place welfare intercloud
Schildgen et al. Marimba: A framework for making mapreduce jobs incremental
Lin et al. [Retracted] A Two‐Phase Method for Optimization of the SPARQL Query
Raj et al. Scalable two-phase top-down specification for big data anonymization using apache pig
Prakash et al. Issues and challenges in the era of big data mining
CN105243063B (en) The method and apparatus of information recommendation
Bhatnagar Data mining-based big data analytics: parameters and layered framework
Tang et al. Design of a data processing method for the farmland environmental monitoring based on improved Spark components
Tzacheva et al. MR-Apriori count distribution algorithm for parallel Action Rules discovery
Cuzzocrea BigMDHealth: Supporting Multidimensional Big Data Management and Analytics over Big Healthcare Data via Effective and Efficient Multidimensional Aggregate Queries over Key-Value Stores
Kalna et al. MDA transformation process of a PIM logical decision-making from NoSQL database to big data NoSQL PSM
Duan et al. Linking design-time and run-time: a graph-based uniform workflow provenance model
Sneha et al. Big Data Analysis and Machine Learning for Green Computing: Concepts and Applications

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171020