CN110147376A - Oil and gas big data query and storage method based on domain ontology - Google Patents

Oil and gas big data query and storage method based on domain ontology

Info

Publication number
CN110147376A
Authority
CN
China
Prior art keywords
node
data
neo4j
storage
rdf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910454493.9A
Other languages
Chinese (zh)
Inventor
宫法明
马玉辉
唐昱润
袁向兵
李昕
李传涛
李翛然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Abstract

The invention discloses an oil and gas big data query and storage method based on a domain ontology. The method comprises: building the ontology by giving a formal, unified representation of the concepts in the domain ontology and the relationships between them; formally describing knowledge and concepts from multiple domains with triple and five-tuple data structures, and storing them as an unstructured RDF directed graph; mapping the RDF directed graph to the Neo4j data structure through a series of R2G structured mapping rules, then building a Key-Neo4j distributed storage model on top of the Neo4j data structure to achieve distributed storage in the Neo4j database; and forming a double-layer index retrieval method suited to the petroleum domain ontology, finally outputting the storage space occupied and the data query time, so that massive resources can be stored normally and expanded dynamically in a big data environment. The method effectively resolves the impedance mismatch problem in domain ontology storage and greatly relieves the load on storage space.

Description

Oil and gas big data query and storage method based on domain ontology
Technical field
The invention belongs to the field of computing for oil and gas big data and relates to an oil and gas big data query and storage method based on a domain ontology.
Background art
The petroleum domain ontology describes the knowledge concepts of more than 20 disciplines in petroleum exploration and development and expresses the interrelations between concepts and attributes. In exploration and development work, the petroleum ontology aggregates multidisciplinary knowledge and integrates information across the field, states the relationships between terms and the axioms of the domain, and describes them formally. The domain ontology can therefore serve as a way to fuse knowledge across the many specialties of the petroleum domain. It addresses the problems caused by the multidisciplinary nature of petroleum exploration and development, such as inconsistent conventions between concepts and concepts that are isolated from one another, by forming the knowledge concepts of the individual disciplines into a unified whole, which accelerates the digitalization of the petroleum domain.
Existing approaches include the triple table, horizontal partitioning and vertical partitioning. The triple table method stores all RDF data in a single three-column table in which each row holds the subject resource, the predicate and the object resource of one RDF statement. It performs very well on small data sets, but as the data grows it generates a large number of self-joins and processing speed drops sharply. Horizontal partitioning stores all RDF data in one table that assigns a dedicated column to each predicate value and supports multi-valued attributes; because attributes are sparse, the table contains many null cells, so the method is unsuitable for mass data storage. Vertical partitioning rewrites the triple table as n two-column tables, where n is the number of distinct attributes in the data; queries that specify a predicate run efficiently, but retrieval time grows exponentially as the data grows. In the past, the volume of petroleum domain ontology data was small and a traditional relational database was adequate as the storage medium. With the arrival of the big data era, however, data is growing explosively; choosing a relational database leads to repeated storage that consumes a great deal of storage resources, and the redundant storage pattern also makes information retrieval very difficult. How to use the domain ontology to query and store data accurately and efficiently among diverse data and extremely complex relationships has therefore become an urgent problem.
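The three relational layouts described above can be compared with a small Python sketch. The sample triples and all function names are illustrative assumptions and do not come from the patent; the point is only to show where the null cells and the per-predicate tables come from.

```python
from collections import defaultdict

# Illustrative triples (subject, predicate, object); the data is made up.
TRIPLES = [
    ("well_1", "type", "Drilling"),
    ("well_1", "locatedIn", "TarimBasin"),
    ("well_2", "type", "Drilling"),
    ("TarimBasin", "type", "Basin"),
]

def triple_table(triples):
    """Triple-table layout: one three-column table, one row per triple."""
    return list(triples)

def horizontal_table(triples):
    """Horizontal layout: one row per subject, one column per predicate.
    Missing values stay None, which is the sparsity problem noted above.
    (This toy version keeps a single value per cell.)"""
    predicates = sorted({p for _, p, _ in triples})
    rows = defaultdict(lambda: {p: None for p in predicates})
    for s, p, o in triples:
        rows[s][p] = o
    return dict(rows)

def vertical_tables(triples):
    """Vertical layout: one two-column (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)

def count_nulls(table):
    """Count the empty cells of a horizontal table."""
    return sum(v is None for row in table.values() for v in row.values())
```

With this sample data the horizontal table already contains empty cells for subjects that lack a `locatedIn` value, while the vertical layout produces one table per predicate.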
Summary of the invention
To overcome the above drawbacks, the invention proposes an oil and gas big data query and storage method based on a domain ontology. The specific steps are as follows:
S1, build the ontology by giving a formal, unified representation of the concepts in the domain ontology and the relationships between them;
S2, describe the professional knowledge and concepts of multiple domains clearly and formally with the triple data structure t = <s, p, o>;
S3, increase the number of tuple elements by combining multiple RDF triples into the five-tuple data model O = {C, R, At, Rel, Ao};
S4, represent the concepts and attributes in the ontology data as nodes in a graph structure and the corresponding relationships in the ontology as edges between two nodes, and store the RDF directed graph in unstructured form;
S5, map the RDF directed graph to the Neo4j data structure by establishing a series of R2G structured mapping rules;
S6, build the Key-Neo4j distributed storage model on top of the Neo4j data structure to achieve distributed storage in the Neo4j database;
S7, form a double-layer index retrieval method suited to the petroleum domain ontology by creating an object index and a triple index mechanism;
S8, output the storage space occupied and the data query time.
The features and improvements of the technical solution are as follows:
For step S3, the invention uses the five-tuple data model O = {C, R, At, Rel, Ao}. By increasing the number of tuple elements, the model expresses the concepts and terms in the ontology in a clearer, layered way, and its data structure fully describes concepts, terms and the relationships between them. It comprises five elements: classes (C), relations (R), attributes (At), axioms (Rel) and instances (Ao). A class, beyond the general meaning of a concept, can also be a subject or object resource in an RDF triple that is expressed by a name such as a task, action or event; for example, "oil and gas exploration and development" is a class, written in triple form as (oil and gas exploration and development, rdf:type, owl:Class). A relation defines a mapping between concepts and attributes in the ontology and mainly refers to the constraint between the two; its domain consists of concepts in the concept set, and its range may consist of concepts and of data types such as numeric values. The main relations in the domain ontology are the subclass relation (subClassOf) and the relation between an instance and an ontology term (rdf:type). Attributes describe the basic properties of concepts in the domain ontology and are of two kinds: data properties, which associate an object with a data-typed value, and object properties, which associate objects with each other. An axiom is a statement that is always true under any circumstances. An instance is a concrete member of a class; for example, (Tazhong Well 1, rdf:type, owl:Drilling) states that Tazhong Well 1 is an instance of the drilling-type oil well class.
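As a rough illustration of the five-tuple model, the following Python sketch holds the five element sets in one container and flattens part of it back into RDF-style triples. The class name, field names and sample identifiers are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    """Five-tuple O = {C, R, At, Rel, Ao}; all field names are illustrative."""
    classes: set = field(default_factory=set)      # C: class names
    relations: set = field(default_factory=set)    # R: (child, predicate, parent)
    attributes: set = field(default_factory=set)   # At: (class, attribute name)
    axioms: set = field(default_factory=set)       # Rel: statements that always hold
    instances: set = field(default_factory=set)    # Ao: (individual, class)

    def to_triples(self):
        """Flatten classes, relations and instances into RDF-style triples."""
        triples = [(c, "rdf:type", "owl:Class") for c in sorted(self.classes)]
        triples += sorted(self.relations)
        triples += [(i, "rdf:type", c) for i, c in sorted(self.instances)]
        return triples

# Hypothetical sample content, loosely following the examples in the text.
onto = Ontology()
onto.classes.update({"OilGasExploration", "Drilling"})
onto.relations.add(("Drilling", "rdfs:subClassOf", "OilGasExploration"))
onto.instances.add(("TazhongWell1", "Drilling"))
```

Calling `onto.to_triples()` yields one `rdf:type owl:Class` triple per class, followed by the subclass relation and the instance assertion, which is the sense in which the five-tuple is built from multiple RDF triples.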
For step S4, the graph structure used by the invention is the Neo4j graph, also called a property graph (PG), whose essential building blocks are nodes and relationships. Nodes in the Neo4j graph represent the concepts and attributes in the ontology data, and the edge between two nodes represents the corresponding relationship in the ontology. In the dependency mapping, a start node is linked to the next node so that successive relationships form a chain. Nodes are connected to each other through relationships, each node and each relationship can carry its own properties, and each node can be given multiple labels.
For step S5, the invention maps the RDF directed graph to the Neo4j data structure by establishing a series of R2G mapping rules. The RDF directed graph is expressed by subject resources, predicates and object resources. A subject resource can express classes and concepts; an object resource can express classes and concepts and can also express the definition and attributes of a class; the predicate mainly describes the relationship between the subject resource and the object resource. The Neo4j data model consists of nodes, relationships, and the properties of nodes and relationships;
Preferably, the mapping from the RDF directed graph to the Neo4j database comprises the following steps:
(S510) Traverse every attribute value in the RDF directed graph; in the Neo4j database, each attribute value of the RDF directed graph is produced by a corresponding graph node. Each node can have multiple relationships with multiple nodes, and a single node can have multiple properties. Let V = {v1, v2, v3, v4} be the set of nodes to which the RDF directed graph is mapped in the Neo4j database;
(S520) For each blank node (bnode) v(b) in the node set V, obtain its property set, which indicates that the node has no properties other than its type label;
(S530) For each resource identifier node (iri) v(u) in the node set V, obtain the property set consisting of the node type and the IRI label, as follows:
In formula (1), φ(v(u1)), φ(v(u2)) and φ(v(u3)) correspond to resource identifier nodes, kind denotes the node type, and the description following IRI makes up the property set;
(S540) For each literal node (literal) v(l) in the node set V, obtain its property set:
In formula (2), the node type and the corresponding value, data type and language properties are obtained; the language property may be null, as in the following example:
φ(v(u3)) = {<"kind", "literal">, <"literal", 3582>, <"datatype", int>}    (3)
In formula (3), the node v(u3) is described in detail by several properties: kind gives the node type (literal), the value of the literal is 3582, and datatype gives the data type of that value, here int;
(S550) Each edge in the Neo4j database represents a distinct RDF triple. Let E = {e1, e2, e3} be the edge set of the RDF directed graph in the Neo4j database;
(S560) For each triple t = <s, p, o>, the edge label is lbl(p) and the start and end nodes are src(p) and tgt(p). The specific relationships are as follows:
Formula (4) gives three mapping rules from the RDF directed graph to the Neo4j database, where e1, e2 and e3 are edges of the directed graph and v1, v2, v3 and v4 are its nodes. Describing each subject resource, predicate and object resource of the RDF directed graph as one mapping rule converts the unstructured RDF directed graph into Neo4j structured data for storage.
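Steps S510 to S560 can be sketched in Python as a mapping from RDF terms to property-graph nodes and from triples to labelled edges. This is a minimal in-memory sketch under stated assumptions, not the patented implementation: the classes `Iri`, `BNode` and `Literal` and both functions are hypothetical names, no Neo4j driver is involved, and the property keys follow formula (3), which stores the literal value under the key "literal".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Iri:
    value: str

@dataclass(frozen=True)
class BNode:
    label: str

@dataclass(frozen=True)
class Literal:
    value: object
    datatype: str
    language: str = None   # the language property may be null

def node_properties(term):
    """Property set phi(v) of a term, following steps S520 to S540."""
    if isinstance(term, BNode):
        return {"kind": "bnode"}                    # nothing beyond the type label
    if isinstance(term, Iri):
        return {"kind": "iri", "IRI": term.value}
    if isinstance(term, Literal):
        return {"kind": "literal", "literal": term.value,
                "datatype": term.datatype, "language": term.language}
    raise TypeError(f"unsupported term: {term!r}")

def rdf_to_property_graph(triples):
    """Map each distinct term to one node and each triple to one labelled edge."""
    nodes, edges = {}, []
    for s, p, o in triples:
        for term in (s, o):
            nodes.setdefault(term, node_properties(term))
        # Edge label from the predicate; source and target from subject and object.
        edges.append({"label": p.value, "src": s, "tgt": o})
    return nodes, edges
```

Because each distinct term becomes a single node, a subject or object that appears in many triples is stored once and merely gains more edges, which is the storage saving the text attributes to the graph representation.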
For step S6, the invention uses distributed storage of the Neo4j database and builds the Key-Neo4j distributed storage model on top of the Neo4j data structure. The storage system supports mass storage and dynamic expansion: network technology and clustering integrate resources distributed across different hardware devices, and storage nodes can be added or removed according to the scale of the stored resources, so that dynamic expansion is achieved while massive resources continue to be stored normally. In addition, the high availability (High Availability) of the Neo4j database lays the foundation for the proposed distributed storage framework for the domain ontology. Different regions and institutions only need to create services over the network according to a unified standard, join the distributed storage network, and execute and manage the services as required; collaborative work and shared use of resources can then be realized across the whole virtual storage network. Through network technology, distributed storage builds a virtualized storage architecture that integrates and accesses the resource data spread over different storage nodes. Each group of resources is placed in its own logical partition, which keeps the partitions isolated from one another while still allowing them to communicate, and also meets the requirements for remote backup and real-time migration of data resources.
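One way to picture a key-value layer in front of several storage nodes is the toy router below. It is a sketch under explicit assumptions: the source does not say how keys are placed, so hashing the subject is purely an illustrative choice, the class `KeyNeo4jRouter` is a hypothetical name, and plain Python lists stand in for the Neo4j instances.

```python
import hashlib

class KeyNeo4jRouter:
    """Toy key-value layer recording which storage node holds each subject key.
    Hash-based placement is an assumption; the source does not specify a policy."""

    def __init__(self, node_names):
        self.node_names = list(node_names)          # stand-ins for Neo4j instances
        self.placement = {}                         # subject key -> storage node
        self.stores = {name: [] for name in self.node_names}

    def _choose_node(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.node_names[int(digest, 16) % len(self.node_names)]

    def put(self, triple):
        subject = triple[0]
        # Keep every triple of a subject on the node first chosen for it.
        node = self.placement.setdefault(subject, self._choose_node(subject))
        self.stores[node].append(triple)
        return node

    def get(self, subject):
        node = self.placement.get(subject)
        if node is None:
            return []
        return [t for t in self.stores[node] if t[0] == subject]

    def add_node(self, name):
        """Dynamic expansion: new keys may land on the new node; old keys stay put."""
        self.node_names.append(name)
        self.stores[name] = []
```

Recording the placement in the key-value map, rather than recomputing it, means that adding a storage node does not move existing data; whether the patented system rebalances on expansion is not stated in the source.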
For step S7, the invention creates an object index and a triple index mechanism. The index mechanism assigns every object an id number and builds an index table so that objects can be looked up quickly, and different search expressions are constructed according to the retrieval requirement to carry out each retrieval task. First, an object matching query retrieves from the object index by exact or fuzzy matching and ranks the result set by relevance. For example, the search scope for the name "Tarim Basin" is as follows:
Query = (label:"TarimBasin") or (altLabel:"TarimBasin")    (5)
In formula (5), Query denotes the query result set, label is the label of the query term, and altLabel is its alternative (sub) label. Second, the triple index supports searching and querying by relationship matching. Given a triple <s, p, o>, to retrieve the relationship between s and o, for example for the object "exploration technology", the search engine looks up "exploration technology" in the id index table, obtains the corresponding id "9672", and searches the structural relationships as follows:
Query = (sID:'9672') or (oID:'9672')    (6)
In formula (6), if an object and its specified relationship are known, the other corresponding object can be retrieved from the known s and p, or the known o and p, of the triple <s, p, o>; that is, the object o or s is used to retrieve the other term. For example, to retrieve the "ingredient" of "petroleum", combined with the mutual rule system, the search structure is given as follows:
In formula (7), sLabel and pLabel are the labels of s and p in the triple <s, p, o>. The query is a compound conditional query, and its results must satisfy all of the query conditions.
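The two index layers and the three query shapes in formulas (5) to (7) can be approximated in memory as follows. This is a hedged sketch rather than the Solr-based mechanism of the source: the class `DoubleLayerIndex` and its methods are hypothetical names, only exact matching is shown, and the composite query is an assumption about the shape of formula (7), which is not reproduced in the text.

```python
class DoubleLayerIndex:
    """Object index (id -> labels) plus triple index (sID, pLabel, oID)."""

    def __init__(self):
        self._next_id = 1
        self.objects = {}    # object id -> {"label": ..., "altLabel": ...}
        self.triples = []    # list of (sID, pLabel, oID)

    def add_object(self, label, alt_label=None):
        object_id = str(self._next_id)
        self._next_id += 1
        self.objects[object_id] = {"label": label, "altLabel": alt_label}
        return object_id

    def add_triple(self, s_id, p_label, o_id):
        self.triples.append((s_id, p_label, o_id))

    def match_object(self, text):
        """Like query (5): label matches OR altLabel matches (exact match only)."""
        return [oid for oid, rec in self.objects.items()
                if rec["label"] == text or rec["altLabel"] == text]

    def match_relations(self, object_id):
        """Like query (6): triples in which the id is the subject OR the object."""
        return [t for t in self.triples if object_id in (t[0], t[2])]

    def match_composite(self, s_label, p_label):
        """Composite query: subject label AND predicate label must both match."""
        subject_ids = set(self.match_object(s_label))
        return [self.objects[o]["label"] for s, p, o in self.triples
                if s in subject_ids and p == p_label]
```

For example, after indexing "petroleum" and one of its ingredients, `match_composite("petroleum", "ingredient")` returns the labels of the objects linked by that predicate, and returns an empty list when either condition fails.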
The oil and gas big data query and storage method based on a domain ontology solves the prior-art problem that, in a big data environment, knowledge from the many specialties of the petroleum domain is difficult to represent and fuse, and has the following advantages:
(1) The method effectively resolves the impedance mismatch problem in domain ontology storage; no repeated storage is needed, which greatly relieves the load on storage space and achieves higher storage efficiency;
(2) The method can add or remove storage nodes according to the scale of the stored resources, achieving dynamic expansion while massive resources continue to be stored normally;
(3) The method can be combined with an extension mechanism based on semantic mapping: by discovering conceptual and semantic associations between two domain ontologies, it achieves fusion between different ontologies.
Description of the drawings
Fig. 1 is the flow chart of the oil and gas big data query and storage method based on a domain ontology according to the invention.
Fig. 2 is a schematic diagram of the Neo4j database storage process in the invention.
Fig. 3 is a schematic diagram of the Neo4j HA mode structure in the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments:
Fig. 1 shows the flow chart of the oil and gas big data query and storage method based on a domain ontology according to the invention. The method comprises:
S1, ontology construction. Building an ontology is the process of constructing a conceptual model, by some method, from the terms, concepts and attributes of a specific domain and the relationships between them. Construction must follow unified standards and specifications so that the resulting ontology is shareable and reusable in use, which avoids producing multiple ontologies.
S2, triple representation. RDF conveniently describes the relationships between objects and their attributes, so that programs can exchange data freely over the network and network resources can be processed automatically. Every basic structure in RDF data consists of a subject resource, a predicate and an object resource, so RDF data is represented with the triple data structure t = <s, p, o>. Here t denotes an RDF triple whose components are drawn from I, B and L, where I denotes uniform resource identifiers (URIs), B denotes blank nodes and L denotes literal nodes. Multiple associated RDF triples form an RDF directed graph, which consists of labelled nodes and edges and describes subjects, objects and the relationships between their attributes. The RDF directed graph contains the subject resources of all triples; each edge represents the relationship between a subject and an object and always points from the subject to the object. In general, one RDF directed graph represents one set of RDF data; the two correspond to each other and differ only in form.
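The correspondence between a set of triples and its directed graph can be shown with a short Python sketch. The function names and the sample triples are illustrative assumptions; the graph is kept as a plain adjacency mapping rather than in any particular RDF library.

```python
from collections import defaultdict

def build_rdf_digraph(triples):
    """Return (nodes, adjacency), where adjacency maps a subject to a list of
    (predicate, object) pairs, so every edge points from subject to object."""
    adjacency = defaultdict(list)
    nodes = set()
    for s, p, o in triples:
        adjacency[s].append((p, o))
        nodes.update((s, o))
    return nodes, dict(adjacency)

def digraph_to_triples(adjacency):
    """Recover the triple set from the adjacency mapping."""
    return {(s, p, o) for s, out_edges in adjacency.items() for p, o in out_edges}

# Hypothetical sample data.
SAMPLE = [
    ("well_1", "rdf:type", "Drilling"),
    ("well_1", "locatedIn", "TarimBasin"),
    ("Drilling", "rdfs:subClassOf", "OilGasExploration"),
]
```

Converting `SAMPLE` to a graph and back returns the same set of triples, which illustrates the statement that the graph and the triple set are two forms of the same data.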
S3, five-tuple construction. Given the industry characteristics of petroleum exploration and development, the basic structure of the petroleum domain ontology is the five-tuple data model O = {C, R, At, Rel, Ao} composed of multiple RDF triples. By increasing the number of tuple elements, the model expresses the concepts and terms in the ontology in a clearer, layered way and is easy to understand at a glance. Its data structure also fully describes concepts, terms and the relationships between them. A complete ontology comprises five elements: classes (C), relations (R), attributes (At), axioms (Rel) and instances (Ao). In the RDF directed labelled graph of the petroleum domain ontology, solid ellipses represent concepts such as the classes, axioms and instances of the ontology, dashed ellipses represent the attributes of those concepts, and directed arrows represent the relationships between semantic concepts.
S4, unstructured storage of the RDF directed graph. The petroleum domain ontology consists of massive RDF data whose elements are interconnected, forming a knowledge network of the petroleum domain. Representing the concepts and attributes in the ontology data as nodes in a graph structure, and the corresponding relationships in the ontology as edges between two nodes, gives good storage performance and query efficiency. However, the subject and object resources in each RDF statement may be referenced repeatedly. If this knowledge network is stored in a relational database, which is table-based, multiple tables must be created to represent the ontology jointly and RDF data is stored in every table, which wastes a great deal of storage space and hinders data operation and maintenance. To address this problem, the invention follows the storage principle of Neo4j, a NoSQL database, and proposes mapping rules from RDF data to the Neo4j database, which effectively solves the problem of wasted storage space.
S5, R2G structured mapping. The RDF directed graph is expressed by subject resources, predicates and object resources. A subject resource can express classes and concepts. An object resource can express classes and concepts and can also express the definition and attributes of a class. The predicate mainly describes the relationship between the subject resource and the object resource. The Neo4j data model consists of nodes, relationships, and the properties of nodes and relationships, so it is structurally similar to the RDF directed graph of the petroleum domain ontology. The invention maps the RDF directed graph to the Neo4j data structure by establishing a series of mapping rules; the Neo4j database storage process is shown in Fig. 2. At the lowest level, Neo4j stores nodes and relationships as a graph, in which two nodes are connected by the relationship between them; this storage scheme allows nodes to be found and located quickly. Nodes, properties and the relationships between nodes are the main components of the basic Neo4j structure. Nodes are divided into start nodes and end nodes, the two are connected by a relationship, and properties supplement the individual nodes. Each Neo4j node has a label, which is one of iri, literal or bnode. An iri node has two properties, kind and IRI; a bnode has only one property; a literal node has four properties, kind, value, datatype and language. Connected nodes and their properties are linked and stored in the form of linked lists.
S6, distributed storage of the Neo4j database. In Neo4j HA mode, data can be written on the master node and is synchronized to the slave nodes in time; data can also be written on a slave node and is still synchronized to the master node in time. However, the slave nodes are not interconnected, so when data is written on one slave the other slaves cannot synchronize it directly and must rely on scheduling by the master to do so. The Neo4j HA mode structure is shown in Fig. 3. The distributed storage framework consists of multiple independent storage nodes, each running a Neo4j graph database and accessed through the API that Neo4j provides. Neo4j HA is designed to make moving from a single Neo4j instance to a multi-instance Neo4j cluster as simple as possible, without repeated changes on different machines; a Neo4j HA cluster consists of one master and zero or more slaves. The logical layer consists of a key-value database and three main functional modules. The data management module decomposes the RDF data set and determines which storage node each piece of RDF data belongs to; the data update module stores the placed data on the designated Neo4j storage node through the API; and the data query module uses the double-layer index mechanism to locate the storage node holding the queried data, enabling efficient queries.
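The write paths described for HA mode can be modelled with a toy Python class. This is only a conceptual sketch of the behaviour stated above (slaves do not talk to each other, the master relays every write); the class `HaCluster` and its methods are hypothetical and do not correspond to any Neo4j API.

```python
class HaCluster:
    """Toy model of the master/slave write paths described above."""

    def __init__(self, slave_names):
        self.master = []                                   # master's record log
        self.slaves = {name: [] for name in slave_names}   # one log per slave

    def write_to_master(self, record):
        self.master.append(record)
        self._push_from_master()

    def write_to_slave(self, name, record):
        # The write is applied on the slave and synchronized to the master...
        self.slaves[name].append(record)
        self.master.append(record)
        # ...and only the master brings the other slaves up to date.
        self._push_from_master()

    def _push_from_master(self):
        for log in self.slaves.values():
            for record in self.master:
                if record not in log:
                    log.append(record)
```

After a write on one slave, every other slave receives the record through the master rather than from the slave that accepted it, which mirrors the scheduling role the text assigns to the master node.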
S7, double-layer index retrieval. The graph-based storage structure of Neo4j differs greatly from that of a relational database. To suit the graph structure of the Neo4j database and optimize retrieval over the domain ontology, the invention combines the Cypher query language that accompanies Neo4j and the Apache Solr indexing technology with the multidisciplinary character of the petroleum domain, and proposes a double-layer index retrieval algorithm suited to the petroleum domain ontology. An object index and a triple index mechanism are created; the index mechanism assigns every object an id number and builds an index table so that objects can be looked up quickly, and different search expressions are constructed according to the retrieval requirement to carry out each retrieval task, including object matching retrieval, relationship matching retrieval and relationship degree retrieval.
S8, output the storage space occupied and the data query time.
In summary, the oil and gas big data query and storage method based on a domain ontology solves the problem that, in a big data environment, knowledge from the many specialties of the petroleum domain is difficult to represent and fuse. It greatly relieves the load on storage space, achieves higher storage efficiency, supports mass storage and dynamic expansion, and is applicable to multiple domains.
Although the invention has been described in detail through the preferred embodiments above, the description should not be regarded as limiting the invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the above. The scope of protection of the invention should therefore be defined by the appended claims.

Claims (5)

1. An oil and gas big data query and storage method based on a domain ontology, characterized by the following specific steps:
S1, build the ontology by giving a formal, unified representation of the concepts in the domain ontology and the relationships between them;
S2, describe the professional knowledge and concepts of multiple domains clearly and formally with the triple data structure t = <s, p, o>;
S3, increase the number of tuple elements by combining multiple RDF triples into the five-tuple data model O = {C, R, At, Rel, Ao};
S4, represent the concepts and attributes in the ontology data as nodes in a graph structure and the corresponding relationships in the ontology as edges between two nodes, and store the RDF directed graph in unstructured form;
S5, map the RDF directed graph to the Neo4j data structure by establishing a series of R2G structured mapping rules;
S6, build the Key-Neo4j distributed storage model on top of the Neo4j data structure to achieve distributed storage in the Neo4j database;
S7, form a double-layer index retrieval method suited to the petroleum domain ontology by creating an object index and a triple index mechanism;
S8, output the storage space occupied and the data query time.
2. The oil and gas big data query and storage method based on a domain ontology according to claim 1, characterized in that, for step S3, the invention uses the five-tuple data model O = {C, R, At, Rel, Ao}. By increasing the number of tuple elements, the model expresses the concepts and terms in the ontology in a clearer, layered way, and its data structure fully describes concepts, terms and the relationships between them. It comprises five elements: classes (C), relations (R), attributes (At), axioms (Rel) and instances (Ao). A class, beyond the general meaning of a concept, can also be a subject or object resource in an RDF triple that is expressed by a name such as a task, action or event; for example, "oil and gas exploration and development" is a class, written in triple form as (oil and gas exploration and development, rdf:type, owl:Class). A relation defines a mapping between concepts and attributes in the ontology and mainly refers to the constraint between the two; its domain consists of concepts in the concept set, and its range may consist of concepts and of data types such as numeric values. The main relations in the domain ontology are the subclass relation (subClassOf) and the relation between an instance and an ontology term (rdf:type). Attributes describe the basic properties of concepts in the domain ontology and are of two kinds: data properties, which associate an object with a data-typed value, and object properties, which associate objects with each other. An axiom is a statement that is always true under any circumstances. An instance is a concrete member of a class; for example, (Tazhong Well 1, rdf:type, owl:Drilling) states that Tazhong Well 1 is an instance of the drilling-type oil well class.
3. The oil and gas big data query and storage method based on a domain ontology according to claim 1, characterized in that, for step S4, the graph structure used by the invention is the Neo4j graph, also called a property graph (PG), whose essential building blocks are nodes and relationships. Nodes in the Neo4j graph represent the concepts and attributes in the ontology data, and the edge between two nodes represents the corresponding relationship in the ontology. In the dependency mapping, a start node is linked to the next node so that successive relationships form a chain. Nodes are connected to each other through relationships, each node and each relationship can carry its own properties, and each node can be given multiple labels.
4. The oil and gas big data query and storage method based on a domain ontology according to claim 1, characterized in that, for step S5, the invention establishes a series of R2G mapping rules to map the RDF directed graph onto the Neo4j data structure. The RDF directed graph is represented by subject resources, corresponding relationships and object resources: a subject resource can express classes and concepts; an object resource can express classes and concepts and can also express the definitions and attributes of classes; a corresponding relationship mainly describes the relationship between a subject resource and an object resource. The Neo4j data model consists of nodes, relationships, and the attributes of nodes and relationships. The mapping from the RDF directed graph to the Neo4j database comprises the following steps:
(S510) traverse each attribute value in the RDF directed graph; each attribute value in the Neo4j database is generated from the corresponding graph node in the RDF directed graph; each node can establish multiple relationships with multiple nodes, and a single node can have multiple attributes; let V = {v1, v2, v3, v4} be the set of nodes that the RDF directed graph maps to in the Neo4j database;
(S520) for each blank node (bnode) v(b) in the node set V, obtain its property set, which indicates that the node has no additional attributes other than its type label;
(S530) for each resource identifier node (iri) v(u) in the node set V, obtain the property set consisting of the node type and the IRI label, as follows:
In formula (1), φ(v(u1)), φ(v(u2)) and φ(v(u3)) are resource identifier nodes, kind denotes the type of the node, and the property set is the descriptive text that follows the IRI;
(S540) for each literal node (literal) v(l) in the node set V, obtain its property set:
In formula (2), the node type and the corresponding value, data type and language attribute are obtained respectively, where the language attribute can be null, as follows:
$$\varphi(v(u_3)) = \{\langle \text{"kind"}, \text{"literal"} \rangle,\ \langle \text{"literal"}, 3582 \rangle,\ \langle \text{"datatype"}, \text{int} \rangle\} \tag{3}$$
In formula (3), the resource identifier node v(u3) is described in detail by further attributes: the value of the node's literal property is 3582, and datatype indicates the data type of that value, here int;
(S550) each edge in the Neo4j database represents a different RDF triple; let E = {e1, e2, e3} be the edge set of the RDF directed graph in the Neo4j database;
(S560) for each triple t = <s, p, o>, the label of the edge is lbl(p) and its start and end nodes are src(p) and tgt(p); the specific relationship is as follows:
Formula (4) describes three mapping rules from the RDF directed graph to the Neo4j database, where e1, e2 and e3 denote edges in the directed graph and v1, v2, v3 and v4 denote nodes in the directed graph; each combination of subject resource, corresponding relationship and object resource of the RDF directed graph is described as one mapping rule, so that the unstructured storage of the RDF directed graph is converted and mapped into Neo4j structured data for storage.
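Steps S510 to S560 can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration rather than the claimed R2G rules: it works on in-memory triples instead of a Neo4j database, the term classes `IRI`, `BNode` and `Literal` and the function names are hypothetical, and the `ex:` identifiers are made up. Identical terms share one node here, which is a simplification chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: object
    datatype: str
    language: Optional[str] = None  # the language attribute may be absent


Term = Union[IRI, BNode, Literal]


def node_properties(term: Term) -> dict:
    """Property set of a graph node, by kind of RDF term (cf. S520 to S540)."""
    if isinstance(term, BNode):
        return {"kind": "bnode"}  # nothing beyond the type label
    if isinstance(term, IRI):
        return {"kind": "iri", "iri": term.value}
    props = {"kind": "literal", "literal": term.value, "datatype": term.datatype}
    if term.language is not None:
        props["language"] = term.language
    return props


def rdf_to_property_graph(triples):
    """Map (s, p, o) triples to a node list and labelled edges (cf. S550, S560)."""
    node_ids: dict = {}
    nodes: list = []
    edges: list = []

    def node_for(term: Term) -> int:
        # Create the node on first sight, reuse it afterwards.
        if term not in node_ids:
            node_ids[term] = len(nodes)
            nodes.append(node_properties(term))
        return node_ids[term]

    for s, p, o in triples:
        # One edge per triple: label from the predicate, ends from subject and object.
        edges.append((node_for(s), p.value, node_for(o)))
    return nodes, edges


triples = [
    (IRI("ex:well1"), IRI("ex:depth"), Literal(3582, "int")),
    (IRI("ex:well1"), IRI("rdf:type"), IRI("ex:drilling")),
]
nodes, edges = rdf_to_property_graph(triples)
```

With this input, the literal 3582 becomes a node whose property set matches formula (3), that is `kind`, `literal` and `datatype`, and each of the two triples becomes one labelled edge starting at the node for `ex:well1`.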
5. The oil and gas big data query and storage method based on a domain ontology according to claim 1, characterized in that, for step S6, the invention uses the distributed storage of the Neo4j database to build a Key-Neo4j distributed storage model on top of the Neo4j data structure. This distributed storage mode supports mass storage and dynamic expansion: network technology and clustering integrate resources distributed across different hardware devices, and, depending on the scale of the stored resources, capacity can be controlled by adding or removing storage nodes, so that dynamic expansion is achieved while ensuring that massive resources are stored normally. In addition, the high availability (High Availability) of the Neo4j database lays the foundation for the proposed domain ontology distributed storage framework: different regions and different institutions need only create services over the network according to a unified standard, join the distributed storage network, and execute and manage the services as required, in order to achieve collaborative work and shared use of resources across the whole virtual storage network. Using a virtualized storage architecture built with network technology, the distributed storage technology integrates and calls resource data distributed across different storage nodes and divides the resources of each part into different logical partitions, which ensures that the partitions are isolated from one another yet able to communicate with one another, and also meets the requirements of remote backup and real-time migration of data resources.
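The claim does not say how keys are assigned to storage nodes in the Key-Neo4j model. Purely as an illustration of "adding or removing storage nodes" without reshuffling everything, the sketch below uses rendezvous (highest-random-weight) hashing, a technique swapped in for the example and not taken from the patent. The class `KeyGraphCluster`, the node names `neo4j-a` to `neo4j-d` and the key format are all hypothetical, and no Neo4j API is involved.

```python
import hashlib


class KeyGraphCluster:
    """Hypothetical key-to-storage-node placement using rendezvous hashing."""

    def __init__(self, storage_nodes):
        self.storage_nodes = set(storage_nodes)

    @staticmethod
    def _score(key: str, node: str) -> int:
        # Deterministic pseudo-random weight for a (node, key) pair.
        digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def locate(self, key: str) -> str:
        """Return the storage node with the highest weight for this key."""
        if not self.storage_nodes:
            raise RuntimeError("cluster has no storage nodes")
        return max(sorted(self.storage_nodes), key=lambda n: self._score(key, n))

    def add_node(self, node: str) -> None:
        self.storage_nodes.add(node)

    def remove_node(self, node: str) -> None:
        self.storage_nodes.discard(node)


keys = [f"ontology-fragment-{i}" for i in range(200)]
cluster = KeyGraphCluster(["neo4j-a", "neo4j-b", "neo4j-c"])
before = {k: cluster.locate(k) for k in keys}

cluster.add_node("neo4j-d")  # scale out
after = {k: cluster.locate(k) for k in keys}

cluster.remove_node("neo4j-d")  # scale back in
restored = {k: cluster.locate(k) for k in keys}
```

With this placement rule, adding a storage node can only move a key onto the new node, and removing it again restores the earlier placement, which is one way to keep data movement small when the cluster grows or shrinks.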
CN201910454493.9A 2019-05-29 2019-05-29 A kind of inquiry of oil gas big data and storage method based on domain body Pending CN110147376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910454493.9A CN110147376A (en) 2019-05-29 2019-05-29 A kind of inquiry of oil gas big data and storage method based on domain body

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910454493.9A CN110147376A (en) 2019-05-29 2019-05-29 A kind of inquiry of oil gas big data and storage method based on domain body

Publications (1)

Publication Number Publication Date
CN110147376A true CN110147376A (en) 2019-08-20

Family

ID=67592012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910454493.9A Pending CN110147376A (en) 2019-05-29 2019-05-29 A kind of inquiry of oil gas big data and storage method based on domain body

Country Status (1)

Country Link
CN (1) CN110147376A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750678A (en) * 2019-09-26 2020-02-04 华南师范大学 Method and system for monitoring video data association description and storage management
CN111046241A (en) * 2019-11-27 2020-04-21 中国人民解放军国防科技大学 Graph storage method and device for stream graph processing
CN111046241B (en) * 2019-11-27 2023-09-26 中国人民解放军国防科技大学 Graph storage method and device for flow graph processing
CN112256927A (en) * 2020-10-21 2021-01-22 网易(杭州)网络有限公司 Method and device for processing knowledge graph data based on attribute graph
CN113609175A (en) * 2021-08-02 2021-11-05 北京值得买科技股份有限公司 E-commerce commodity attribute data processing method and device based on graph database
CN114817262A (en) * 2022-04-27 2022-07-29 电子科技大学 Graph traversal algorithm based on distributed graph database

Similar Documents

Publication Publication Date Title
CN110147376A (en) A kind of inquiry of oil gas big data and storage method based on domain body
CN101436192B (en) Method and apparatus for optimizing inquiry aiming at vertical storage type database
CN105653691B (en) Management of information resources method and managing device
Hor et al. A semantic graph database for BIM-GIS integrated information model for an intelligent urban mobility web application
CN108446368A (en) A kind of construction method and equipment of Packaging Industry big data knowledge mapping
CN111523003A (en) Data application method and platform with time sequence dynamic map as core
CN103744846A (en) Multidimensional dynamic local knowledge map and constructing method thereof
US11755284B2 (en) Methods and systems for improved data retrieval and sorting
CN103390015A (en) Mass data united storage method based on unified indexing and search method
Banane et al. Storing RDF data into big data NoSQL databases
US11599576B2 (en) Index machine
CN109783484A (en) The construction method and system of the data service platform of knowledge based map
US20240095227A1 (en) Chart engine
CN107870949A (en) Data analysis job dependence relation generation method and system
JP2024041902A (en) Multi-source interoperability and/or information retrieval optimization
Paul et al. A Review on Graph Database and its representation
Álvarez-García et al. Compact and efficient representation of general graph databases
CN110134688B (en) Hot event data storage management method and system in online social network
Colace et al. Pervasive systems architecture and the main related technologies
CN111949649A (en) Dynamic body storage system, storage method and data query method
Ning et al. Dominance-partitioned subgraph matching on large RDF graph
CN102597969A (en) Database management device using key-value store with attributes, and key-value-store structure caching-device therefor
Liu et al. OPSDS: a semantic data integration and service system based on domain ontology
Ferilli et al. LPG-based Ontologies as Schemas for Graph DBs.
Zhang et al. Storing fuzzy description logic ontology knowledge bases in fuzzy relational databases

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190820