CN110110034A - Graph-based RDF data management method, device and storage medium - Google Patents

Graph-based RDF data management method, device and storage medium

Info

Publication number: CN110110034A
Application number: CN201910389293.XA
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 陈仁海, 燕国骅, 关启明, 冯志勇
Current Assignee: Shenzhen Research Institute Of Tianjin University
Original Assignee: Shenzhen Research Institute Of Tianjin University
Prior art keywords: node, storage address, physical storage, triple, stored

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices
    • G06F16/18: File system types
    • G06F16/1847: File system types specifically adapted to static storage, e.g. adapted to flash memory or SSD
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a graph-based RDF data management method, device and storage medium. An RDF graph is created from the RDF data to be stored; the nodes on the RDF graph that correspond to the elements of each triple are stored in different storage units on an SSD; in the storage unit that holds a superior node, a physical storage address list containing the physical storage addresses of all subordinate nodes of that superior node is saved; and the correspondence between each node and the physical storage address of the storage unit that holds it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.

Description

Graph-based RDF data management method, device and storage medium
Technical field
The present invention relates to the field of data storage, and in particular to a graph-based RDF data management method, device and storage medium.
Background
In the big-data era, information is highly unstructured and richly interlinked. The datasets of many knowledge bases, such as Weibo and Facebook, are usually stored in the form of the Resource Description Framework (RDF). RDF data consists of a series of triples, each made up of three elements: resource, attribute and attribute value, also called subject, predicate and object.
With the spread of RDF in recent years, the volume of RDF data has grown enormously, and many RDF datasets (such as Wikipedia) contain billions of triples. Managing such large volumes of RDF data effectively has therefore become a major challenge. RDF data is currently often stored on solid state drives (SSD, Solid State Drive). However, the related art does not take the internal spatial characteristics of the SSD into account during storage, such as its channels, dies and planes, and instead writes the RDF data to arbitrary free storage locations on the SSD. As a result, the performance of the SSD is not fully exploited and data management performance on the SSD is low.
Summary of the invention
The main purpose of the embodiments of the present invention is to provide a graph-based RDF data management method, device and storage medium that can at least solve the problem in the related art that RDF data is stored at arbitrary free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low.
To achieve the above object, a first aspect of the embodiments of the present invention provides a graph-based RDF data management method, the method comprising:
creating an RDF graph based on the RDF data to be stored, where each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph;
storing the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on the SSD;
in the storage unit that stores a superior node, saving a physical storage address list that contains the physical storage addresses of all subordinate nodes of the superior node, and saving the correspondence between each node and the physical storage address of the storage unit that stores it to a node address index table; where the object of the triple is the subordinate node of the predicate, and the predicate is the subordinate node of the subject.
To achieve the above object, a second aspect of the embodiments of the present invention provides a graph-based RDF data management device, the device comprising:
a creation module, configured to create an RDF graph based on the RDF data to be stored, where each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph;
a storage module, configured to store the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on the SSD;
a saving module, configured to save, in the storage unit that stores a superior node, a physical storage address list that contains the physical storage addresses of all subordinate nodes of the superior node, and to save the correspondence between each node and the physical storage address of the storage unit that stores it to a node address index table; where the object of the triple is the subordinate node of the predicate, and the predicate is the subordinate node of the subject.
To achieve the above object, a third aspect of the embodiments of the present invention provides an electronic device comprising a processor, a memory and a communication bus;
the communication bus is used to implement connection and communication between the processor and the memory;
the processor is used to execute one or more programs stored in the memory to implement the steps of any of the graph-based RDF data management methods described above.
To achieve the above object, a fourth aspect of the embodiments of the present invention provides a computer-readable storage medium that stores one or more programs, the one or more programs being executable by one or more processors to implement the steps of any of the graph-based RDF data management methods described above.
According to the graph-based RDF data management method, device and storage medium provided by the embodiments of the present invention, an RDF graph is created based on the RDF data to be stored, with each element of every triple corresponding to one node on the RDF graph; the nodes corresponding to the elements of a triple are stored in different storage units on the SSD; in the storage unit that stores a superior node, a physical storage address list containing the physical storage addresses of all subordinate nodes of the superior node is saved; and the correspondence between each node and the physical storage address of the storage unit that stores it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
Other features of the invention and their corresponding effects are described in later parts of the specification, and it should be understood that at least some of these effects become apparent from the description.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the RDF data management method provided by the first embodiment of the present invention;
Fig. 2 is the RDF graph provided by the first embodiment of the present invention;
Fig. 3 is a schematic flowchart of another RDF data management method provided by the first embodiment of the present invention;
Fig. 4 is a schematic diagram of an RDF data query provided by the first embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the RDF data management device provided by the second embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the electronic device provided by the third embodiment of the present invention.
Detailed description of the embodiments
To make the objects, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
First embodiment:
To solve the technical problem in the related art that RDF data is stored at arbitrary free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low, this embodiment proposes a graph-based RDF data management method applied to an SSD with multiple storage units. Fig. 1 is a schematic flowchart of the graph-based RDF data management method provided by this embodiment, which includes the following steps:
Step 101: create an RDF graph based on the RDF data to be stored; each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph.
Specifically, RDF data contains multiple resource descriptions, and a resource description consists of multiple statements, each of which is a triple made up of a resource, an attribute and an attribute value. A statement in a resource description can be mapped to a sentence in natural language: the resource corresponds to the subject, the attribute to the predicate, and the attribute value to the object. In RDF terminology a triple can therefore be written as (subject, predicate, object), or (s, p, o). An RDF dataset can be described as an RDF graph. Fig. 2 shows the RDF graph provided by this embodiment, which is a directed labeled graph: the subject and the object are described as two adjacent vertices in the RDF graph, the predicate is described as the directed edge between those two vertices, and both vertices and directed edges are regarded as nodes on the RDF graph.
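The graph construction in step 101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes nodes are identified by their label, so a predicate label shared by several subjects maps to a single node, and the names `build_rdf_graph` and `children` as well as the sample triples are hypothetical.

```python
from collections import defaultdict

def build_rdf_graph(triples):
    """Turn (s, p, o) triples into nodes plus superior -> subordinate links.

    Every element of a triple becomes a node. The predicate is the
    subordinate node of the subject, and the object is the subordinate
    node of the predicate. Nodes are keyed by label (an assumption).
    """
    nodes = set()
    children = defaultdict(list)  # node label -> ordered subordinate labels
    for s, p, o in triples:
        nodes.update((s, p, o))
        for parent, child in ((s, p), (p, o)):
            if child not in children[parent]:
                children[parent].append(child)
    return nodes, dict(children)

# Hypothetical sample data
triples = [("Alice", "knows", "Bob"), ("Alice", "worksAt", "TJU")]
nodes, children = build_rdf_graph(triples)
```

Keying nodes by label keeps the sketch short; a real system would need to decide how to tell apart occurrences of the same predicate under different subjects, which the text does not specify.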
Step 102: store the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on the SSD.
Specifically, in this embodiment the nodes corresponding to the subject, predicate and object of each triple are stored in different storage units. In addition, to make better use of the parallelism of the SSD while maintaining storage space utilization, the entire RDF graph may be divided into multiple subgraphs of a fixed data size, and the equally sized subgraphs are then stored in different storage units.
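One way to picture step 102 is an allocator that gives every node its own physical address and spreads consecutive nodes over different channels. The sketch below is only an assumption about how such placement could work: the (channel, flash, page) address format is borrowed from the query example in this embodiment, while the striping policy and the names `PhysAddr`, `StripedAllocator` and `place_nodes` are hypothetical.

```python
from itertools import count
from typing import NamedTuple

class PhysAddr(NamedTuple):
    channel: int
    flash: int
    page: int

class StripedAllocator:
    """Hands out unique addresses, striping consecutive nodes over channels.

    The striping policy is an assumption made for illustration only.
    """
    def __init__(self, channels, flashes_per_channel):
        self.channels = channels
        self.flashes = flashes_per_channel
        self._next = count()

    def allocate(self):
        n = next(self._next)
        channel = n % self.channels
        flash = (n // self.channels) % self.flashes
        page = n // (self.channels * self.flashes)
        return PhysAddr(channel, flash, page)

def place_nodes(triples, allocator):
    """Give each distinct node label its own storage unit address."""
    placement = {}
    for triple in triples:
        for element in triple:
            if element not in placement:
                placement[element] = allocator.allocate()
    return placement

placement = place_nodes(
    [("Alice", "knows", "Bob")],
    StripedAllocator(channels=4, flashes_per_channel=2),
)
```

With four channels, the subject, predicate and object of the sample triple land on three different channels, so they could in principle be written or read concurrently.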
Step 103: in the storage unit that stores a superior node, save a physical storage address list that contains the physical storage addresses of all subordinate nodes of the superior node, and save the correspondence between each node and the physical storage address of the storage unit that stores it to a node address index table; the object of the triple is the subordinate node of the predicate, and the predicate is the subordinate node of the subject.
Specifically, in this embodiment the storage unit that stores each node maintains a physical storage address list for the subordinate nodes of that node, and the list contains the physical storage addresses of all subordinate nodes of the node. Note that the node levels of the nodes corresponding to the subject, predicate and object decrease in that order. To make each node locatable, this embodiment also generates a node address index table from the correspondence between each node and its physical storage address. In a preferred implementation, to make full use of the parallel processing capability of the SSD, multiple nodes can be stored in different storage units in parallel, which significantly improves data write performance.
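The two structures maintained in step 103 can be sketched as below: a per-node list of subordinate addresses and a global node address index table. This is an illustrative sketch under stated assumptions; a plain dictionary stands in for the SSD, addresses are given in advance through the hypothetical `addr_of` mapping, and the labels `type` and `Variable` are invented sample data.

```python
from dataclasses import dataclass, field

@dataclass
class StorageUnit:
    label: str
    # Physical storage address list: addresses of all subordinate nodes
    child_addrs: list = field(default_factory=list)

def store_triples(triples, addr_of):
    """Store nodes and record both bookkeeping structures.

    addr_of maps a node label to its (channel, flash, page) address and
    is assumed to have been decided beforehand.
    """
    ssd = {}         # address -> StorageUnit; stands in for the SSD
    node_index = {}  # label -> address; the node address index table
    for s, p, o in triples:
        for label in (s, p, o):
            if label not in node_index:
                node_index[label] = addr_of[label]
                ssd[addr_of[label]] = StorageUnit(label)
        # predicate is subordinate to subject, object to predicate
        for parent, child in ((s, p), (p, o)):
            unit = ssd[node_index[parent]]
            if node_index[child] not in unit.child_addrs:
                unit.child_addrs.append(node_index[child])
    return ssd, node_index

addr_of = {"Var": (0, 0, 0), "type": (1, 0, 0), "Variable": (2, 0, 0)}
ssd, node_index = store_triples([("Var", "type", "Variable")], addr_of)
```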
Optionally, after the RDF data has been stored, it can be queried. Fig. 3 is a schematic flowchart of another RDF data management method provided by this embodiment, which includes the following steps:
Step 301: when an RDF data query request is received, obtain at least one triple to be queried; the known elements in the triple to be queried are the query conditions, the unknown elements are the query results, and the known elements include at least the subject, which has the highest node level in the triple to be queried;
Step 302: look up the physical storage address corresponding to the known element in the node address index table;
Step 303: from the physical storage address list stored at the physical storage address of the known element, obtain the physical storage addresses of the subordinate nodes of the known element, look up those subordinate nodes by their physical storage addresses, and obtain the query result from the subordinate nodes.
Specifically, in line with the RDF data storage strategy described above, this embodiment queries subordinate nodes through their superior nodes. When the known element of the triple to be queried is the subject, the predicate corresponding to its subordinate node can be obtained from the subject, and the object corresponding to the next-level node can then be obtained from the predicate. If the known elements of the triple to be queried include both the subject and the predicate, the object corresponding to the subordinate node is obtained directly from the predicate. Fig. 4 is a schematic diagram of an RDF data query provided by this embodiment. Taking a lookup of the subordinate nodes of the superior node Var as an example, the physical storage address of the known element Var, (Channel#0, Flash#0, Page#0), is first found in the node address index table. The physical storage address of the subordinate nodes of Var, (Channel#1, Flash#0, Page#0), is then obtained from the physical storage address list stored at (Channel#0, Flash#0, Page#0), and all subordinate nodes at (Channel#1, Flash#0, Page#0) are loaded to obtain the query result. In this embodiment, therefore, all associated node data can be obtained from a single known node, which gives good query performance.
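Steps 301 to 303 can be sketched as a lookup in the index table followed by address-list traversal. The code below is a minimal sketch, assuming a dictionary in place of the SSD and label-keyed nodes; the function name `query` and the labels `type` and `Variable` are hypothetical, and only the node `Var` and the address format come from the example above.

```python
def query(ssd, node_index, subject, predicate=None):
    """Resolve (subject, ?, ?) or (subject, predicate, ?).

    Step 302: find the subject's address in the node address index table.
    Step 303: follow the stored address lists down to the objects.
    """
    s_unit = ssd[node_index[subject]]
    results = []
    for p_addr in s_unit["children"]:
        p_unit = ssd[p_addr]
        if predicate is not None and p_unit["label"] != predicate:
            continue  # predicate is known and this one does not match
        for o_addr in p_unit["children"]:
            results.append((subject, p_unit["label"], ssd[o_addr]["label"]))
    return results

# Hypothetical contents, addresses written as (channel, flash, page)
ssd = {
    (0, 0, 0): {"label": "Var", "children": [(1, 0, 0)]},
    (1, 0, 0): {"label": "type", "children": [(2, 0, 0)]},
    (2, 0, 0): {"label": "Variable", "children": []},
}
node_index = {"Var": (0, 0, 0), "type": (1, 0, 0), "Variable": (2, 0, 0)}

matches = query(ssd, node_index, "Var")
```

Because nodes are keyed by label in this sketch, a predicate node shared by several subjects would return the objects of all of them; how the patented scheme separates such cases is not stated in the text.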
Optionally, when there are multiple triples to be queried, looking up the physical storage address corresponding to the known element in the node address index table includes: looking up, in the node address index table, the physical storage address corresponding to the known element of each triple to be queried. Correspondingly, obtaining the physical storage addresses of the subordinate nodes of the known element from the physical storage address list stored at the physical storage address of the known element, looking up those subordinate nodes by their physical storage addresses, and obtaining the query result from the subordinate nodes includes: obtaining, from the physical storage address list stored at the physical storage address of each known element, the physical storage addresses of the subordinate nodes of each known element; then loading in parallel the data stored at the physical storage addresses of the subordinate nodes, looking up the subordinate nodes of each known element, and obtaining multiple query results from the subordinate nodes.
Specifically, when the volume of data to be queried is large, that is, when multiple triples need to be queried, the queries can be processed in parallel: the physical storage addresses of the known superior nodes in the triples to be queried are looked up in parallel, the physical storage addresses of the subordinate nodes of each superior node are then looked up in parallel in the physical storage address lists stored at those addresses, and finally the data is loaded in parallel from the physical storage addresses of the subordinate nodes to obtain multiple query results. This makes better use of the parallelism of the SSD and improves query efficiency.
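The parallel query path can be imitated in software with a thread pool, one task per known element. This is only a rough analogy and an assumption on my part: real parallelism would come from the SSD's channels, not from Python threads, and the names `child_labels` and `parallel_query` and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def child_labels(ssd, node_index, known_label):
    """Look up one known element and load its subordinate nodes."""
    unit = ssd[node_index[known_label]]
    return [ssd[addr]["label"] for addr in unit["children"]]

def parallel_query(ssd, node_index, known_labels, workers=4):
    """Resolve several known elements concurrently, one task each."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda label: child_labels(ssd, node_index, label), known_labels
        )
        # map() yields results in input order, so zip pairs them correctly
        return dict(zip(known_labels, results))

# Hypothetical contents; object nodes are omitted for brevity
ssd = {
    (0, 0, 0): {"label": "Alice", "children": [(1, 0, 0)]},
    (1, 0, 0): {"label": "knows", "children": []},
    (2, 0, 0): {"label": "Bob", "children": [(3, 0, 0)]},
    (3, 0, 0): {"label": "likes", "children": []},
}
node_index = {"Alice": (0, 0, 0), "knows": (1, 0, 0),
              "Bob": (2, 0, 0), "likes": (3, 0, 0)}

answers = parallel_query(ssd, node_index, ["Alice", "Bob"])
```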
Optionally, after the RDF data has been stored, adding RDF data specifically includes: when an RDF data addition request is received, checking in the node address index table whether the nodes corresponding to the elements of the triple to be added already exist; if so, adding the physical storage address of the subordinate node in the triple to be added to the physical storage address list of the superior node in the triple to be added; if not, storing the nodes of the triple to be added in different storage units on the SSD, saving, in the storage unit that stores the superior node of the triple to be added, a physical storage address list containing the physical storage addresses of the subordinate nodes of the triple to be added, and saving the correspondence between each node of the triple to be added and the physical storage address of the storage unit that stores it to the node address index table.
Specifically, in this embodiment, when data needs to be added to the current RDF graph and the node data of the new triple already exists on the original RDF graph, the physical storage address of the subordinate node is added directly to the physical storage address list maintained by the superior node of the new data. If the nodes in the new data are new nodes, each node is first stored in a different storage unit, the physical storage address of each node is then added to the node address index table, and the physical storage address list maintained by each superior node records the physical storage addresses of its subordinate nodes.
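The addition logic can be sketched as follows. The sketch decides per node rather than per triple whether a node already exists, which is an assumption that also covers triples mixing old and new nodes; `add_triple`, `allocate` and the four-channel address layout are hypothetical.

```python
from itertools import count

def add_triple(ssd, node_index, triple, allocate):
    """Add one (s, p, o) triple to the stored graph."""
    for label in triple:
        if label not in node_index:
            # New node: give it its own storage unit and index it
            addr = allocate()
            node_index[label] = addr
            ssd[addr] = {"label": label, "children": []}
    s, p, o = triple
    for parent, child in ((s, p), (p, o)):
        # Record the subordinate's address in the superior's address list
        children = ssd[node_index[parent]]["children"]
        if node_index[child] not in children:
            children.append(node_index[child])

_pages = count()

def allocate():
    """Hypothetical allocator returning (channel, flash, page)."""
    n = next(_pages)
    return (n % 4, 0, n // 4)

ssd, node_index = {}, {}
add_triple(ssd, node_index, ("Alice", "knows", "Bob"), allocate)
add_triple(ssd, node_index, ("Alice", "knows", "Carol"), allocate)
```

The second call reuses the existing `Alice` and `knows` nodes and only appends the address of the new `Carol` node to the address list of `knows`.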
Optionally, after the RDF data has been stored, updating RDF data specifically includes: when an RDF data update request is received, determining the original triple stored on the SSD that corresponds to the triple to be updated; updating the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated; and updating the physical storage address list and the node address index table based on the updated elements.
Specifically, although it is reasonable to assume that most RDF stores are query-intensive, if not read-only (for example, large reference repositories in the life sciences), in some cases data still needs to be updated, that is, existing data needs to be modified. When data is updated in this embodiment, it is only necessary to find the physical storage address of the node corresponding to the element to be modified, modify that node with reference to the element in the triple to be updated, and then adjust the physical storage addresses of the updated node wherever they appear in the physical storage address lists and in the node address index table.
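An update can be sketched as rewriting one node and then repairing every reference to it. The sketch assumes an out-of-place write, where the updated node lands at a new address, because that makes the adjustment of address lists and index table visible; the text does not say whether updates are in place. The function name `update_node` and the sample data are hypothetical, and the new label is assumed not to exist yet.

```python
def update_node(ssd, node_index, old_label, new_label, new_addr):
    """Replace a node's content and repoint every reference to it."""
    old_addr = node_index.pop(old_label)
    unit = ssd.pop(old_addr)
    unit["label"] = new_label
    ssd[new_addr] = unit
    node_index[new_label] = new_addr  # adjust the node address index table
    # Adjust every physical storage address list that referenced the node
    for other in ssd.values():
        other["children"] = [
            new_addr if addr == old_addr else addr for addr in other["children"]
        ]

# Hypothetical stored triple ("Alice", "knows", "Bob")
ssd = {
    (0, 0, 0): {"label": "Alice", "children": [(1, 0, 0)]},
    (1, 0, 0): {"label": "knows", "children": [(2, 0, 0)]},
    (2, 0, 0): {"label": "Bob", "children": []},
}
node_index = {"Alice": (0, 0, 0), "knows": (1, 0, 0), "Bob": (2, 0, 0)}

# Update the object of the triple from "Bob" to "Robert"
update_node(ssd, node_index, "Bob", "Robert", (3, 0, 0))
```

Scanning all storage units to find the referencing lists is the simplest option for a sketch; a real system would more likely keep a reverse mapping.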
Optionally, after the RDF data has been stored, deleting RDF data specifically includes: when an RDF data deletion request is received, determining whether the nodes corresponding to the elements of the triple to be deleted are also associated with other triples; if so, deleting the physical storage address of the subordinate node to be deleted from the physical storage address list of the superior node that is also associated with other triples; if not, deleting the nodes of the triple to be deleted, and deleting the physical storage addresses of those nodes recorded in the physical storage address lists and in the node address index table.
Specifically, when data is deleted in this embodiment, it must be determined whether each node to be deleted, apart from its association with the other nodes inside the triple to be deleted, is also associated with nodes in other triples. If it is, the node data corresponding to the elements of the triple to be deleted is retained, and only the physical storage address of the subordinate node to be deleted is removed from the physical storage address list maintained by each superior node of the triple to be deleted. If it is not, the node data corresponding to the elements of the triple to be deleted is deleted directly, and the physical storage addresses of the deleted nodes are removed from the physical storage address lists and from the node address index table.
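The deletion rule can be sketched as below. To decide whether a node or link is still associated with other triples, the sketch keeps a separate list of stored triples, which is an assumption made for illustration; the text does not say how this association is determined. The name `delete_triple` and the sample data are hypothetical.

```python
def delete_triple(ssd, node_index, triples, target):
    """Delete one triple, keeping nodes and links other triples still use."""
    remaining = [t for t in triples if t != target]
    live_links = {link for s, p, o in remaining for link in ((s, p), (p, o))}
    live_nodes = {label for t in remaining for label in t}
    s, p, o = target
    for parent, child in ((s, p), (p, o)):
        if (parent, child) not in live_links:
            # Drop the subordinate's address from the superior's list
            ssd[node_index[parent]]["children"].remove(node_index[child])
    for label in set(target) - live_nodes:
        # Node is used by no other triple: remove unit and index entry
        del ssd[node_index.pop(label)]
    return remaining

# Hypothetical stored triples
triples = [("Alice", "knows", "Bob"), ("Alice", "knows", "Carol")]
ssd = {
    (0, 0, 0): {"label": "Alice", "children": [(1, 0, 0)]},
    (1, 0, 0): {"label": "knows", "children": [(2, 0, 0), (3, 0, 0)]},
    (2, 0, 0): {"label": "Bob", "children": []},
    (3, 0, 0): {"label": "Carol", "children": []},
}
node_index = {"Alice": (0, 0, 0), "knows": (1, 0, 0),
              "Bob": (2, 0, 0), "Carol": (3, 0, 0)}

triples = delete_triple(ssd, node_index, triples, ("Alice", "knows", "Bob"))
```

Here `Alice` and `knows` survive because the remaining triple still uses them, while `Bob` and its address are removed everywhere.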
According to the graph-based RDF data management method provided by this embodiment of the present invention, an RDF graph is created based on the RDF data to be stored, with each element of every triple corresponding to one node on the RDF graph; the nodes corresponding to the elements of a triple are stored in different storage units on the SSD; in the storage unit that stores a superior node, a physical storage address list containing the physical storage addresses of all subordinate nodes of the superior node is saved; and the correspondence between each node and the physical storage address of the storage unit that stores it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
Second embodiment:
To solve the technical problem in the related art that RDF data is stored at arbitrary free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low, this embodiment presents a graph-based RDF data management device applied to an SSD with multiple storage units. Referring to Fig. 5, the RDF data management device of this embodiment includes:
a creation module 501, configured to create an RDF graph based on the RDF data to be stored, where each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph;
a storage module 502, configured to store the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on the SSD;
a saving module 503, configured to save, in the storage unit that stores a superior node, a physical storage address list that contains the physical storage addresses of all subordinate nodes of the superior node, and to save the correspondence between each node and the physical storage address of the storage unit that stores it to a node address index table; the object of the triple is the subordinate node of the predicate, and the predicate is the subordinate node of the subject.
Specifically, in this embodiment an RDF dataset can be described as an RDF graph, which is a directed labeled graph: the subject and the object of a triple in the RDF data are described as two adjacent vertices in the RDF graph, the predicate is described as the directed edge between those two vertices, and both vertices and directed edges are regarded as nodes on the RDF graph. In this embodiment the nodes corresponding to the subject, predicate and object of each triple are stored in different storage units. The storage unit that stores each node then maintains a physical storage address list for the subordinate nodes of that node, which contains the physical storage addresses of all subordinate nodes of the node; the node levels of the nodes corresponding to the subject, predicate and object decrease in that order. In addition, to make each node locatable, this embodiment also generates a node address index table from the correspondence between each node and its physical storage address.
In some implementations of this embodiment, the RDF data management device further includes a query module, configured to: when an RDF data query request is received, obtain at least one triple to be queried, where the known elements in the triple to be queried are the query conditions, the unknown elements are the query results, and the known elements include at least the subject, which has the highest node level in the triple to be queried; look up the physical storage address corresponding to the known element in the node address index table; and, from the physical storage address list stored at the physical storage address of the known element, obtain the physical storage addresses of the subordinate nodes of the known element, look up those subordinate nodes by their physical storage addresses, and obtain the query result from the subordinate nodes.
Further, in some implementations of this embodiment, when there are multiple triples to be queried, the query module is specifically configured to: look up, in the node address index table, the physical storage address corresponding to the known element of each triple to be queried; obtain, from the physical storage address list stored at the physical storage address of each known element, the physical storage addresses of the subordinate nodes of each known element; then load in parallel the data stored at the physical storage addresses of the subordinate nodes, look up the subordinate nodes of each known element, and obtain multiple query results from the subordinate nodes.
In other implementations of this embodiment, the RDF data management device further includes an addition module, configured to: when an RDF data addition request is received, check in the node address index table whether the nodes corresponding to the elements of the triple to be added already exist; if so, add the physical storage address of the subordinate node in the triple to be added to the physical storage address list of the superior node in the triple to be added; if not, store the nodes of the triple to be added in different storage units on the SSD, save, in the storage unit that stores the superior node of the triple to be added, a physical storage address list containing the physical storage addresses of the subordinate nodes of the triple to be added, and save the correspondence between each node of the triple to be added and the physical storage address of the storage unit that stores it to the node address index table.
In some implementations of this embodiment, the RDF data management device further includes an update module, configured to: when an RDF data update request is received, determine the original triple stored on the SSD that corresponds to the triple to be updated; update the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated; and update the physical storage address list and the node address index table based on the updated elements.
In some implementations of this embodiment, the RDF data management device further includes a deletion module, configured to: when an RDF data deletion request is received, determine whether the nodes corresponding to the elements of the triple to be deleted are also associated with other triples; if so, delete the physical storage address of the subordinate node to be deleted from the physical storage address list of the superior node that is also associated with other triples; if not, delete the nodes of the triple to be deleted, and delete the physical storage addresses of those nodes recorded in the physical storage address lists and in the node address index table.
It should be noted that the graph-based RDF data management method of the foregoing embodiment can be implemented by the graph-based RDF data management device provided in this embodiment. Those of ordinary skill in the art will clearly understand that, for convenience and brevity of description, the specific working process of the graph-based RDF data management device described in this embodiment may refer to the corresponding process in the foregoing method embodiment, and is not repeated here.
With the graph-based RDF data management device provided in this embodiment, an RDF graph is created based on the RDF data to be stored, with each element of every triple in that data corresponding to a node on the RDF graph; the nodes corresponding to the elements of a triple are stored in different storage units on the SSD; in the storage unit that holds a superior node, a physical storage address list containing the physical storage addresses of all subordinate nodes of that superior node is saved; and the correspondence between each node and the physical storage address of the storage unit holding it is saved to the node address index table. By implementing the present invention, RDF data is converted into graph data for management, which better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, facilitates comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
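The storage layout summarized above, and the way a lookup follows address lists from a subject, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than the implementation of this embodiment: the SSD is simulated by an in-memory dictionary keyed by physical address, the names `StorageUnit`, `GraphStore`, `put`, `objects` and `load_parallel` are hypothetical, predicate nodes are keyed per subject, and a thread pool merely stands in for parallel SSD reads.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class StorageUnit:
    """One simulated SSD storage unit: a node and its subordinate address list."""
    value: str
    child_addrs: list = field(default_factory=list)


class GraphStore:
    def __init__(self):
        self.units = {}   # physical address -> StorageUnit (stands in for the SSD)
        self.index = {}   # node key -> physical address (node address index table)

    def _node(self, key, value):
        """Return the address of the node for `key`, allocating a unit if needed."""
        if key not in self.index:
            addr = len(self.units)   # next free address; this sketch never frees units
            self.units[addr] = StorageUnit(value)
            self.index[key] = addr
        return self.index[key]

    def put(self, s, p, o):
        """Store one triple as subject -> predicate -> object."""
        s_addr = self._node(("node", s), s)
        p_addr = self._node(("pred", s, p), p)   # assumption: predicate node per subject
        o_addr = self._node(("node", o), o)
        for parent, child in ((s_addr, p_addr), (p_addr, o_addr)):
            if child not in self.units[parent].child_addrs:
                self.units[parent].child_addrs.append(child)

    def load_parallel(self, addrs):
        """Load several units at once; the thread pool imitates parallel SSD reads."""
        with ThreadPoolExecutor(max_workers=4) as pool:
            return list(pool.map(self.units.__getitem__, addrs))

    def objects(self, s, p):
        """Answer (s, p, ?o) by following address lists from the subject."""
        s_addr = self.index.get(("node", s))
        if s_addr is None:
            return []
        result = []
        for pred in self.load_parallel(self.units[s_addr].child_addrs):
            if pred.value == p:
                result += [u.value for u in self.load_parallel(pred.child_addrs)]
        return result


graph = GraphStore()
graph.put("alice", "knows", "bob")
graph.put("alice", "knows", "carol")
graph.put("bob", "knows", "dave")
alice_knows = graph.objects("alice", "knows")
```

Because "bob" is both an object of one triple and the subject of another, it occupies a single storage unit, so a traversal that reaches it as an object can continue from its own address list without another index lookup.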
Third embodiment:
This embodiment provides an electronic device. As shown in Figure 6, it includes a processor 601, a memory 602 and a communication bus 603, where the communication bus 603 implements connection and communication between the processor 601 and the memory 602, and the processor 601 executes one or more computer programs stored in the memory 602 to implement at least one step of the graph-based RDF data management method of the first embodiment.
This embodiment also provides a computer-readable storage medium, which includes volatile or non-volatile, removable or non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, computer program modules or other data). Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technologies, CD-ROM (Compact Disc Read-Only Memory), digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
The computer-readable storage medium of this embodiment can be used to store one or more computer programs, and the stored computer programs can be executed by a processor to implement at least one step of the method of the first embodiment.
This embodiment also provides a computer program, which can be distributed on a computer-readable medium and executed by a computing device to implement at least one step of the method of the first embodiment; in some cases, at least one of the steps shown or described may be performed in an order different from that described in the above embodiment.
This embodiment also provides a computer program product, including a computer-readable device on which the computer program described above is stored. In this embodiment, the computer-readable device may include the computer-readable storage medium described above.
As can be seen, those skilled in the art should understand that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, may be implemented as software (which may be realized as computer program code executable by a computing device), firmware, hardware, or an appropriate combination thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
In addition, as is known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, computer program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. Therefore, the present invention is not limited to any specific combination of hardware and software.
The above is a further detailed description of the embodiments of the present invention in combination with specific implementations, and the specific implementation of the present invention shall not be deemed limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the present invention, and all of them shall be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A graph-based RDF data management method, characterized by comprising:
creating an RDF graph based on RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph;
storing the nodes corresponding to the elements of the triple respectively in storage units of an SSD that have different physical storage addresses;
saving, in the storage unit in which a superior node is stored, a physical storage address list comprising the physical storage addresses of all subordinate nodes corresponding to the superior node, and saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to a node address index table; wherein the object of the triple is a subordinate node of the predicate, and the predicate is a subordinate node of the subject.
2. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
obtaining at least one triple to be queried upon receiving an RDF data query request, wherein the known elements of the triple to be queried are the query conditions, the unknown element of the triple to be queried is the query result, and the known elements include at least the subject, which has the highest node level in the triple to be queried;
looking up the physical storage address corresponding to the known element based on the node address index table;
obtaining, from the physical storage address list stored at the physical storage address of the known element, the physical storage address of the subordinate node corresponding to the known element, looking up the subordinate node corresponding to the known element based on the physical storage address of the subordinate node, and obtaining the query result based on the subordinate node.
3. The graph-based RDF data management method according to claim 2, characterized in that, when there are multiple triples to be queried, looking up the physical storage address corresponding to the known element based on the node address index table comprises:
looking up, based on the node address index table, the physical storage address corresponding to the known element of each triple to be queried;
and obtaining, from the physical storage address list stored at the physical storage address of the known element, the physical storage address of the subordinate node corresponding to the known element, looking up the subordinate node corresponding to the known element based on the physical storage address of the subordinate node, and obtaining the query result based on the subordinate node comprises:
obtaining, from the physical storage address list stored at the physical storage address of each known element, the physical storage address of the subordinate node corresponding to each known element, then loading in parallel the data stored at the physical storage addresses of the subordinate nodes, looking up the subordinate node corresponding to each known element, and obtaining multiple query results based on the subordinate nodes.
4. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
looking up, upon receiving an RDF data addition request, whether the node corresponding to each element of the triple to be added exists in the node address index table;
if so, adding the physical storage address of the subordinate node in the triple to be added to the physical storage address list corresponding to the superior node in the triple to be added;
if not, storing each node of the triple to be added in a different storage unit on the SSD, saving, in the storage unit in which the superior node of the triple to be added is stored, a physical storage address list comprising the physical storage address of the subordinate node of the triple to be added, and saving the correspondence between each node of the triple to be added and the physical storage address of the storage unit in which it is stored to the node address index table.
5. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
determining, upon receiving an RDF data update request, the original triple stored on the SSD that corresponds to the triple to be updated;
updating the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated, and updating the physical storage address list and the node address index table based on the updated elements.
6. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
determining, upon receiving an RDF data deletion request, whether the node corresponding to each element of the triple to be deleted is also associated with other triples;
if so, deleting the physical storage address of the subordinate node from the physical storage address list corresponding to the superior node, in the triple to be deleted, that is also associated with other triples;
if not, deleting the nodes of the triple to be deleted that are not associated with other triples, and deleting the physical storage addresses of those nodes recorded in the physical storage address list and the node address index table.
7. A graph-based RDF data management device, characterized by comprising:
a creation module, configured to create an RDF graph based on RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph;
a storage module, configured to store the nodes corresponding to the elements of the triple respectively in storage units of an SSD that have different physical storage addresses;
a saving module, configured to save, in the storage unit in which a superior node is stored, a physical storage address list comprising the physical storage addresses of all subordinate nodes corresponding to the superior node, and to save the correspondence between each node and the physical storage address of the storage unit in which the node is stored to a node address index table; wherein the object of the triple is a subordinate node of the predicate, and the predicate is a subordinate node of the subject.
8. The graph-based RDF data management device according to claim 7, characterized by further comprising a query module configured to: obtain at least one triple to be queried upon receiving an RDF data query request, wherein the known elements of the triple to be queried are the query conditions, the unknown element of the triple to be queried is the query result, and the known elements include at least the subject, which has the highest node level in the triple to be queried; look up the physical storage address corresponding to the known element based on the node address index table; obtain, from the physical storage address list stored at the physical storage address of the known element, the physical storage address of the subordinate node corresponding to the known element; look up the subordinate node corresponding to the known element based on the physical storage address of the subordinate node; and obtain the query result based on the subordinate node.
9. An electronic device, characterized by comprising a processor, a memory and a communication bus, wherein:
the communication bus is configured to implement connection and communication between the processor and the memory;
the processor is configured to execute one or more programs stored in the memory to implement the steps of the graph-based RDF data management method according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the graph-based RDF data management method according to any one of claims 1 to 6.
CN201910389293.XA 2019-05-10 2019-05-10 A kind of RDF data management method, device and storage medium based on figure Pending CN110110034A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910389293.XA CN110110034A (en) 2019-05-10 2019-05-10 A kind of RDF data management method, device and storage medium based on figure

Publications (1)

Publication Number Publication Date
CN110110034A true CN110110034A (en) 2019-08-09

Family

ID=67489353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910389293.XA Pending CN110110034A (en) 2019-05-10 2019-05-10 A kind of RDF data management method, device and storage medium based on figure

Country Status (1)

Country Link
CN (1) CN110110034A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256886A (en) * 2020-10-23 2021-01-22 平安科技(深圳)有限公司 Probability calculation method and device in map, computer equipment and storage medium
CN113253926A (en) * 2021-05-06 2021-08-13 天津大学深圳研究院 Memory internal index construction method for improving query and memory performance of novel memory

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294710A (en) * 2012-02-28 2013-09-11 北京新媒传信科技有限公司 Data access method and device
CN103617276A (en) * 2013-12-09 2014-03-05 南京大学 Method for storing distributed hierarchical RDF data
CN103778251A (en) * 2014-02-19 2014-05-07 天津大学 SPARQL parallel query method facing large-scale RDF graph data
CN104679764A (en) * 2013-11-28 2015-06-03 方正信息产业控股有限公司 Method and device for searching graph data
CN106156319A (en) * 2016-07-05 2016-11-23 北京航空航天大学 Telescopic distributed resource description framework data storage method and device
CN107038234A (en) * 2017-04-17 2017-08-11 天津大学 A kind of path query framework based on RDF graph data and relation data
CN107291807A (en) * 2017-05-16 2017-10-24 中国科学院计算机网络信息中心 A kind of SPARQL enquiring and optimizing methods based on figure traversal
US20170316110A1 (en) * 2013-01-29 2017-11-02 Oracle International Corporation Publishing rdf quads as relational views
CN109344259A (en) * 2018-07-20 2019-02-15 西安交通大学 A kind of RDF distributed storage method dividing frame based on multilayer
KR101972127B1 (en) * 2018-11-30 2019-04-25 주식회사 피씨엔 Intelligent search system based on resource description framework triple data and intelligent search method using the same

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
冯志杰: "面向大规模RDF数据的混合分布式存储方案研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
姜龙翔等: "一种大规模RDF语义数据的分布式存储方案", 《计算机应用与软件》 *
崔义童等: "基于图聚类算法的大规模RDF数据查询方法研究", 《小型微型计算机系统》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256886A (en) * 2020-10-23 2021-01-22 平安科技(深圳)有限公司 Probability calculation method and device in map, computer equipment and storage medium
CN112256886B (en) * 2020-10-23 2023-06-27 平安科技(深圳)有限公司 Probability calculation method and device in atlas, computer equipment and storage medium
CN113253926A (en) * 2021-05-06 2021-08-13 天津大学深圳研究院 Memory internal index construction method for improving query and memory performance of novel memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190809
