CN110110034A - Graph-based RDF data management method, device and storage medium - Google Patents
- Publication number
- CN110110034A (application number CN201910389293.XA)
- Authority
- CN
- China
- Prior art keywords
- node
- storage address
- physical storage
- triple
- stored
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1847—File system types specifically adapted to static storage, e.g. adapted to flash memory or SSD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the invention disclose a graph-based RDF data management method, device and storage medium. An RDF graph is created from the RDF data to be stored. The nodes on the RDF graph that correspond to the elements of each triple are stored in different storage units on an SSD. In the storage unit that holds an upper-level node, a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node is saved, and the correspondence between each node and the physical storage address of the storage unit holding it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
Description
Technical field
The present invention relates to the field of data storage, and in particular to a graph-based RDF data management method, device and storage medium.
Background
In the big data era, information is highly unstructured and richly interconnected, and the datasets of many knowledge bases, such as Weibo and Facebook, are usually stored in the Resource Description Framework (RDF) format. RDF data consists of a series of triples, each made up of three elements: a resource, an attribute and an attribute value, also called the subject, the predicate and the object.
With the spread of RDF in recent years, the volume of RDF data has grown enormously, and many RDF datasets (for example Wikipedia) contain billions of triples. Managing such large volumes of RDF data effectively has therefore become a major challenge. At present, RDF data is usually stored on solid state drives (SSD, Solid State Drive). However, in the related art the storage process does not take into account the internal spatial characteristics of the SSD, such as its channels, dies and planes; instead, RDF data is stored at random in free storage locations on the SSD. As a result, the performance of the SSD is not fully exploited and data management performance on the SSD is low.
Summary of the invention
The main purpose of the embodiments of the present invention is to provide a graph-based RDF data management method, device and storage medium that at least solve the problem in the related art that RDF data is stored at random in free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low.
To achieve the above object, a first aspect of the embodiments of the present invention provides a graph-based RDF data management method, the method comprising:
creating an RDF graph based on the RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph;
storing the nodes corresponding to the elements of each triple in storage units on the SSD that have different physical storage addresses;
in the storage unit that holds an upper-level node, saving a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node, and saving the correspondence between each node and the physical storage address of the storage unit holding it to a node address index table, wherein the object of a triple is the lower-level node of the predicate and the predicate is the lower-level node of the subject.
To achieve the above object, a second aspect of the embodiments of the present invention provides a graph-based RDF data management device, the device comprising:
a creation module, configured to create an RDF graph based on the RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph;
a storage module, configured to store the nodes corresponding to the elements of each triple in storage units on the SSD that have different physical storage addresses;
a saving module, configured to save, in the storage unit that holds an upper-level node, a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node, and to save the correspondence between each node and the physical storage address of the storage unit holding it to a node address index table, wherein the object of a triple is the lower-level node of the predicate and the predicate is the lower-level node of the subject.
To achieve the above object, a third aspect of the embodiments of the present invention provides an electronic device, comprising a processor, a memory and a communication bus;
the communication bus is used to implement connection and communication between the processor and the memory;
the processor is configured to execute one or more programs stored in the memory to implement the steps of any of the graph-based RDF data management methods described above.
To achieve the above object, a fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to implement the steps of any of the graph-based RDF data management methods described above.
According to the graph-based RDF data management method, device and storage medium provided by the embodiments of the present invention, an RDF graph is created based on the RDF data to be stored, and each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph. The nodes corresponding to the elements of each triple are stored in different storage units on the SSD. In the storage unit that holds an upper-level node, a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node is saved, and the correspondence between each node and the physical storage address of the storage unit holding it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
Other features of the invention and their corresponding effects are described later in the specification, and it should be understood that at least some of these effects will be apparent from the description.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a basic flow diagram of the RDF data management method provided by the first embodiment of the invention;
Fig. 2 is the RDF graph provided by the first embodiment of the invention;
Fig. 3 is a basic flow diagram of another RDF data management method provided by the first embodiment of the invention;
Fig. 4 is a schematic diagram of an RDF data query provided by the first embodiment of the invention;
Fig. 5 is a structural diagram of the RDF data management device provided by the second embodiment of the invention;
Fig. 6 is a structural diagram of the electronic device provided by the third embodiment of the invention.
Detailed description
To make the objects, features and advantages of the invention clearer and easier to understand, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person skilled in the art based on these embodiments without creative effort fall within the protection scope of the invention.
First embodiment:
To solve the technical problem in the related art that RDF data is stored at random in free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low, this embodiment proposes a graph-based RDF data management method applied to an SSD with multiple storage units. Fig. 1 shows the basic flow of the graph-based RDF data management method provided by this embodiment, which includes the following steps:
Step 101: create an RDF graph based on the RDF data to be stored; each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph.
Specifically, RDF data contains multiple resource descriptions, and a resource description consists of multiple statements, each of which is a triple made up of a resource, an attribute and an attribute value. A statement in a resource description corresponds to a sentence in natural language: the resource corresponds to the subject, the attribute to the predicate and the attribute value to the object. In RDF terms, a triple can be written as (subject, predicate, object), or (s, p, o). An RDF dataset can be described as an RDF graph. Fig. 2 shows the RDF graph provided by this embodiment, which is a directed labeled graph: the subject and the object are two adjacent vertices in the RDF graph, and the predicate is the directed edge between them. Both vertices and directed edges are treated as nodes on the RDF graph.
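The graph construction in Step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_rdf_graph` is hypothetical, and nodes are identified by their label alone, which is an assumption (the source does not say how a predicate shared by several subjects is kept apart).

```python
def build_rdf_graph(triples):
    """Return {node: [lower-level nodes]} for an iterable of (s, p, o) triples.

    Every element of a triple becomes a node; the predicate is recorded as a
    lower-level node of the subject, and the object as a lower-level node of
    the predicate. Nodes are keyed by label only (an assumption of this sketch).
    """
    children = {}
    for s, p, o in triples:
        for node in (s, p, o):
            children.setdefault(node, [])
        if p not in children[s]:
            children[s].append(p)
        if o not in children[p]:
            children[p].append(o)
    return children
```

For example, the triples ("alice", "knows", "bob") and ("alice", "knows", "carol") yield a graph in which "alice" has the single lower-level node "knows", which in turn has the lower-level nodes "bob" and "carol".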
Step 102: store the nodes corresponding to the elements of each triple in storage units on the SSD that have different physical storage addresses.
Specifically, in this embodiment the nodes corresponding to the subject, predicate and object of each triple are stored in different storage units. In addition, to make better use of the parallelism of the SSD and to ensure good utilization of storage space, the entire RDF graph may be divided into multiple subgraphs of a fixed data size, and these equally sized subgraphs are then stored in different storage units.
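One way to give every node its own storage unit is a round-robin allocation over channels, sketched below. The address layout (channel, flash, page) follows the example of Fig. 4; the function name `allocate_addresses`, the default geometry and the round-robin policy itself are assumptions of this sketch, since the source does not specify an allocation policy.

```python
def allocate_addresses(nodes, channels=4, flashes_per_channel=2):
    """Assign each node a distinct (channel, flash, page) address.

    Consecutive nodes land on different channels, so the subject, predicate
    and object of one triple can be written independently of one another.
    """
    addresses = {}
    for i, node in enumerate(nodes):
        channel = i % channels
        flash = (i // channels) % flashes_per_channel
        page = i // (channels * flashes_per_channel)
        addresses[node] = (channel, flash, page)
    return addresses
```

With the default geometry, the three nodes of one triple receive the addresses (0, 0, 0), (1, 0, 0) and (2, 0, 0), that is, one node per channel.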
Step 103: in the storage unit that holds an upper-level node, save a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node, and save the correspondence between each node and the physical storage address of the storage unit holding it to a node address index table; the object of a triple is the lower-level node of the predicate, and the predicate is the lower-level node of the subject.
Specifically, in this embodiment the storage unit that holds each node also maintains the physical storage address list of that node's lower-level nodes, which contains the physical storage addresses of all of them. Note that the node levels of the nodes corresponding to the subject, the predicate and the object decrease in that order. In addition, to locate each node, this embodiment generates a node address index table from the correspondence between each node and its physical storage address. In a preferred embodiment, to make full use of the parallel processing capability of the SSD, multiple nodes can be written to different storage units in parallel, which significantly improves data write performance.
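The two structures maintained in Step 103 can be sketched as plain dictionaries. This is an in-memory stand-in for the SSD: the function name `store_graph` and the dictionary layout of a storage unit are assumptions of this sketch, not structures defined by the source.

```python
def store_graph(children, addresses):
    """Build the storage units and the node address index table.

    children:  {node: [lower-level nodes]}
    addresses: {node: (channel, flash, page)}, one distinct address per node
    """
    units = {}
    for node, addr in addresses.items():
        units[addr] = {
            "node": node,
            # physical storage address list kept with the upper-level node
            "children": [addresses[c] for c in children.get(node, [])],
        }
    index = dict(addresses)  # node address index table: node -> address
    return units, index
```

Keeping the address list next to the upper-level node means that one read of that node's storage unit is enough to learn where all of its lower-level nodes are stored.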
Optionally, after the RDF data has been stored, it can be queried. Fig. 3 shows the basic flow of another RDF data management method provided by this embodiment, which specifically includes the following steps:
Step 301: when an RDF data query request is received, obtain at least one triple to be queried; the known elements in the triple to be queried are the query conditions, the unknown elements are the query results, and the known elements include at least the subject, which has the highest node level in the triple to be queried;
Step 302: look up the physical storage address of the known element based on the node address index table;
Step 303: from the physical storage address list stored at the physical storage address of the known element, obtain the physical storage address of the lower-level node of the known element, look up that lower-level node based on its physical storage address, and obtain the query result based on the lower-level node.
Specifically, in line with the RDF data storage strategy described above, this embodiment queries lower-level nodes through their upper-level nodes. When the known element of the triple to be queried is the subject, the predicate corresponding to its lower-level node can be obtained from the subject, and the query then continues from the predicate to obtain the object corresponding to the node one level further down. If the known elements of the triple to be queried include both the subject and the predicate, the object corresponding to the predicate's lower-level node is obtained directly from the predicate. Fig. 4 is a schematic diagram of an RDF data query provided by this embodiment. Taking the lookup of the lower-level nodes of upper-level node Var as an example, the physical storage address of the known element Var, (Channel#0, Flash#0, Page#0), is first found through the node address index table. The physical storage address of the lower-level nodes of Var, (Channel#1, Flash#0, Page#0), is then obtained from the physical storage address list at (Channel#0, Flash#0, Page#0), and all lower-level nodes at (Channel#1, Flash#0, Page#0) are loaded to obtain the query result. In this embodiment, therefore, all associated node data can be obtained from a single known node, which gives good query performance.
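The lookup described in Steps 301 to 303 can be sketched as a walk from the subject down to the objects. The storage units are modelled as a dictionary keyed by (channel, flash, page); this layout and the function name `query_triples` are assumptions of this sketch.

```python
def query_triples(units, index, subject, predicate=None):
    """Return the (s, p, o) triples reachable from a known subject.

    units: {address: {"node": label, "children": [addresses]}}
    index: node address index table, {label: address}
    If predicate is given, only that predicate's objects are returned.
    """
    results = []
    subject_unit = units[index[subject]]          # Step 302: index lookup
    for p_addr in subject_unit["children"]:       # Step 303: follow address list
        p_unit = units[p_addr]
        if predicate is not None and p_unit["node"] != predicate:
            continue
        for o_addr in p_unit["children"]:
            results.append((subject, p_unit["node"], units[o_addr]["node"]))
    return results
```

Only the subject needs to be resolved through the index table; every further hop uses the address list stored with the node that was just read.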
Optionally, when there are multiple triples to be queried, looking up the physical storage address of the known element based on the node address index table includes: looking up, based on the node address index table, the physical storage address of the known element in each triple to be queried. Correspondingly, obtaining the physical storage address of the lower-level node of the known element from the physical storage address list stored at the known element's physical storage address, looking up that lower-level node based on its physical storage address, and obtaining the query result based on the lower-level node includes: obtaining, from the physical storage address list stored at the physical storage address of each known element, the physical storage address of the lower-level node of each known element; then loading in parallel the data stored at the physical storage addresses of the lower-level nodes to find the lower-level node of each known element; and obtaining multiple query results based on the lower-level nodes.
Specifically, when the amount of data to be queried is large, that is, when multiple triples must be queried, the queries can be processed in parallel: the physical storage addresses of the known upper-level nodes in the triples to be queried are looked up in parallel, the physical storage addresses of the lower-level nodes of each upper-level node are then looked up in parallel in the physical storage address lists stored at those addresses, and finally the data is loaded in parallel from the physical storage addresses of the lower-level nodes to obtain multiple query results. This makes better use of the parallel performance of the SSD and improves query efficiency.
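The parallel variant can be illustrated with a thread pool. This is only a software analogy: threads stand in for the independent channels of the SSD, and the function names `load_children` and `parallel_query`, as well as the dictionary layout of the storage units, are assumptions of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def load_children(units, index, known_node):
    """Resolve one known node and load the labels of its lower-level nodes."""
    unit = units[index[known_node]]
    return [units[addr]["node"] for addr in unit["children"]]


def parallel_query(units, index, known_nodes, workers=4):
    """Run one lookup per known node concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of known_nodes
        results = pool.map(lambda n: load_children(units, index, n), known_nodes)
        return dict(zip(known_nodes, results))
```

Because the lower-level nodes of different upper-level nodes were placed in different storage units, the individual loads do not depend on each other and can be issued at the same time.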
Optionally, after the RDF data has been stored, adding RDF data specifically includes: when an RDF data addition request is received, checking in the node address index table whether the nodes corresponding to the elements of the triple to be added already exist; if so, adding the physical storage address of the lower-level node in the triple to be added to the physical storage address list of the corresponding upper-level node in the triple to be added; if not, storing each node of the triple to be added in a different storage unit on the SSD, saving, in the storage unit that holds the upper-level node of the triple to be added, a physical storage address list containing the physical storage address of the lower-level node of the triple to be added, and saving the correspondence between each node of the triple to be added and the physical storage address of the storage unit holding it to the node address index table.
Specifically, in this embodiment, when new data is to be added to the current RDF graph and the node data of the new triple already exists on the original RDF graph, the physical storage address of the lower-level node is simply added to the physical storage address list maintained at the upper-level node of the new data. If the nodes in the new data are new nodes, each node is first stored in a different storage unit, the physical storage address of each node is then added to the node address index table, and the physical storage address list maintained at each upper-level node records the physical storage addresses of its lower-level nodes.
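The addition logic can be sketched per node: a node that is already in the index table is reused, a new node gets its own storage unit, and the upper-level node's address list is extended in either case. The names `make_allocator` and `add_triple`, the round-robin allocator and the label-keyed nodes are assumptions of this sketch.

```python
from itertools import count


def make_allocator(channels=4):
    """Hypothetical allocator: hands out distinct (channel, flash, page) addresses."""
    counter = count()

    def allocate():
        i = next(counter)
        return (i % channels, 0, i // channels)

    return allocate


def add_triple(units, index, triple, allocate):
    """Add one (s, p, o) triple to the stored graph."""
    s, p, o = triple
    for node in (s, p, o):
        if node not in index:              # new node: store it in its own unit
            addr = allocate()
            units[addr] = {"node": node, "children": []}
            index[node] = addr
    for parent, child in ((s, p), (p, o)):
        child_addr = index[child]
        address_list = units[index[parent]]["children"]
        if child_addr not in address_list:  # extend the upper-level node's list
            address_list.append(child_addr)
```

Adding ("alice", "knows", "carol") after ("alice", "knows", "bob") therefore allocates only one new storage unit, for "carol", and appends its address to the list kept with "knows".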
Optionally, after the RDF data has been stored, updating RDF data specifically includes: when an RDF data update request is received, determining the original triple stored on the SSD that corresponds to the triple to be updated; updating the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated; and updating the physical storage address list and the node address index table based on the updated elements.
Specifically, although it is reasonable to assume that most RDF stores, where they are not read-only, are query-intensive (for example, large reference repositories in the life sciences), in some cases update operations are still required, that is, operations that modify the existing data. When data is updated in this embodiment, it is only necessary to find the physical storage address of the node corresponding to the element to be modified, modify that node with reference to the element in the triple to be updated, and then adjust the physical storage addresses of the updated node recorded in the physical storage address list and in the node address index table accordingly.
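An update can be sketched as rewriting the affected node and then patching every structure that refers to it. The sketch below writes the updated node to a new address, which is an assumption (the source only says that the addresses in the list and the index table are adapted); the function name `update_node` and the full scan over all units to find the upper-level nodes are likewise simplifications.

```python
def update_node(units, index, old_label, new_label, new_addr):
    """Replace the node old_label by new_label, stored at new_addr."""
    old_addr = index.pop(old_label)
    old_unit = units.pop(old_addr)
    # write the updated node, keeping its own lower-level address list
    units[new_addr] = {"node": new_label, "children": old_unit["children"]}
    index[new_label] = new_addr            # adapt the node address index table
    for unit in units.values():            # adapt the physical storage address lists
        unit["children"] = [new_addr if a == old_addr else a for a in unit["children"]]
```

After the call, any upper-level node that pointed at the old address points at the new one, so queries that start from the subject reach the updated element without further changes.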
Optionally, after the RDF data has been stored, deleting RDF data specifically includes: when an RDF data deletion request is received, determining whether the nodes corresponding to the elements of the triple to be deleted are also associated with other triples; if so, deleting the physical storage address of the lower-level node from the physical storage address list of the upper-level node in the triple to be deleted that is also associated with other triples; if not, deleting the nodes of the triple to be deleted that are not associated with other triples, and deleting the physical storage addresses of those nodes recorded in the physical storage address list and the node address index table.
Specifically, when data is deleted in this embodiment, it must be determined whether each node to be deleted, apart from its association with the other nodes inside the triple to be deleted, is also associated with nodes in other triples. If so, the node data corresponding to the elements of the triple to be deleted is retained, and only the physical storage address of the lower-level node to be deleted is removed from the physical storage address list maintained at each upper-level node in the triple to be deleted. If not, the node data corresponding to the elements of the triple to be deleted is deleted directly, and the physical storage addresses of the deleted nodes are removed from the physical storage address list and the node address index table.
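The two deletion branches can be sketched as follows. To decide whether a node is still associated with other triples, the sketch receives the triples that remain after the deletion; this `remaining` argument, the function name `delete_triple` and the label-keyed nodes are assumptions, since the source does not describe how the association is tracked.

```python
def delete_triple(units, index, triple, remaining):
    """Delete one (s, p, o) triple; `remaining` lists the triples that stay."""
    s, p, o = triple
    # remove the predicate -> object link unless another triple still needs it
    if not any(rp == p and ro == o for _, rp, ro in remaining):
        units[index[p]]["children"].remove(index[o])
    # remove the subject -> predicate link unless another triple still needs it
    if not any(rs == s and rp == p for rs, rp, _ in remaining):
        units[index[s]]["children"].remove(index[p])
    still_used = {element for t in remaining for element in t}
    for node in (s, p, o):
        if node not in still_used and node in index:
            # node belongs to no other triple: drop its unit and index entry
            del units[index.pop(node)]
```

A node that still appears in another triple keeps its storage unit and its index entry; only the address entries that belonged to the deleted triple are removed.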
According to the graph-based RDF data management method provided by this embodiment of the invention, an RDF graph is created based on the RDF data to be stored, and each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph. The nodes corresponding to the elements of each triple are stored in different storage units on the SSD. In the storage unit that holds an upper-level node, a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node is saved, and the correspondence between each node and the physical storage address of the storage unit holding it is saved to a node address index table. By managing RDF data as graph data, the invention better preserves the structural features of the RDF data, makes it convenient to explore the data starting from any node, supports comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
Second embodiment:
To solve the technical problem in the related art that RDF data is stored at random in free storage locations on the SSD, so that the performance of the SSD is not fully exploited and SSD-based data management performance is low, this embodiment presents a graph-based RDF data management device applied to an SSD with multiple storage units. Referring to Fig. 5, the RDF data management device of this embodiment includes:
a creation module 501, configured to create an RDF graph based on the RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to a node on the RDF graph;
a storage module 502, configured to store the nodes corresponding to the elements of each triple in storage units on the SSD that have different physical storage addresses;
a saving module 503, configured to save, in the storage unit that holds an upper-level node, a physical storage address list containing the physical storage addresses of all lower-level nodes of that upper-level node, and to save the correspondence between each node and the physical storage address of the storage unit holding it to a node address index table, wherein the object of a triple is the lower-level node of the predicate and the predicate is the lower-level node of the subject.
Specifically, in this embodiment an RDF dataset can be described as an RDF graph, which is a directed labeled graph: the subject and the object of a triple in the RDF data are two adjacent vertices in the RDF graph, and the predicate is the directed edge between them. Both vertices and directed edges are treated as nodes on the RDF graph. In this embodiment the nodes corresponding to the subject, predicate and object of each triple are stored in different storage units. The storage unit that holds each node then maintains the physical storage address list of that node's lower-level nodes, which contains the physical storage addresses of all of them; the node levels of the nodes corresponding to the subject, the predicate and the object decrease in that order. In addition, to locate each node, this embodiment generates a node address index table from the correspondence between each node and its physical storage address.
In some implementations of this embodiment, the RDF data management device further includes a query module, configured to: obtain at least one triple to be queried when an RDF data query request is received, wherein the known elements in the triple to be queried are the query conditions, the unknown elements are the query results, and the known elements include at least the subject, which has the highest node level in the triple to be queried; look up the physical storage address of the known element based on the node address index table; obtain the physical storage address of the lower-level node of the known element from the physical storage address list stored at the physical storage address of the known element; look up that lower-level node based on its physical storage address; and obtain the query result based on the lower-level node.
Further, in some implementations of this embodiment, when there are multiple triples to be queried, the query module is specifically configured to: look up, based on the node address index table, the physical storage address of the known element in each triple to be queried; obtain, from the physical storage address list stored at the physical storage address of each known element, the physical storage address of the lower-level node of each known element; then load in parallel the data stored at the physical storage addresses of the lower-level nodes to find the lower-level node of each known element; and obtain multiple query results based on the lower-level nodes.
In other implementations of this embodiment, the RDF data management device further includes an adding module, configured to: when an RDF data addition request is received, check in the node address index table whether the nodes corresponding to the elements of the triple to be added already exist; if so, add the physical storage address of the lower-level node in the triple to be added to the physical storage address list of the corresponding upper-level node in the triple to be added; if not, store each node of the triple to be added in a different storage unit on the SSD, save, in the storage unit that holds the upper-level node of the triple to be added, a physical storage address list containing the physical storage address of the lower-level node of the triple to be added, and save the correspondence between each node of the triple to be added and the physical storage address of the storage unit holding it to the node address index table.
In some implementations of this embodiment, the RDF data management device further includes an update module, configured to: when an RDF data update request is received, determine the original triple stored on the SSD that corresponds to the triple to be updated; update the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated; and update the physical storage address list and the node address index table based on the updated elements.
In some implementations of this embodiment, the RDF data management device further includes a deletion module, configured to: when an RDF data deletion request is received, determine whether the nodes corresponding to the elements of the triple to be deleted are also associated with other triples; if so, delete the physical storage address of the lower-level node from the physical storage address list of the upper-level node in the triple to be deleted that is also associated with other triples; if not, delete the nodes of the triple to be deleted that are not associated with other triples, and delete the physical storage addresses of those nodes recorded in the physical storage address list and the node address index table.
It should be noted that the graph-based RDF data management method of the preceding embodiment can be implemented by the graph-based RDF data management device provided by this embodiment. A person of ordinary skill in the art will clearly understand that, for convenience and brevity of description, the specific working process of the graph-based RDF data management device described in this embodiment can be found in the corresponding process of the preceding method embodiment and is not repeated here.
With the graph-based RDF data management device provided in this embodiment, an RDF graph is created from the RDF data to be stored, and each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph. The nodes corresponding to the elements of a triple are stored in different storage units on the SSD. In the storage unit that holds a parent node, a physical storage address list containing the physical storage addresses of all child nodes of that parent node is saved, and the correspondence between each node and the physical storage address of the storage unit that holds it is saved to the node address index table. By implementing the invention, RDF data is converted into graph data for management, which better preserves the structural features of RDF data and makes it convenient to explore the data starting from any node. This favors comprehensive and scalable RDF data management, makes full use of the high parallelism of the SSD, and greatly improves data management performance on the SSD.
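The storage layout summarized above can be sketched in a few lines. This is a minimal in-memory model under stated assumptions, not the device itself: a dict stands in for the SSD, a counter stands in for physical storage addresses, and the names `StorageUnit`, `GraphStore` and `store_triple` are hypothetical. Nodes are keyed by their label, so a predicate shared by several subjects shares one node here.

```python
from dataclasses import dataclass, field


@dataclass
class StorageUnit:
    """One simulated storage unit: a node plus its physical storage address list."""
    element: str
    child_addresses: list[int] = field(default_factory=list)


class GraphStore:
    def __init__(self) -> None:
        self.units: dict[int, StorageUnit] = {}  # physical address -> storage unit
        self.index: dict[str, int] = {}          # node address index table
        self._next_address = 0

    def _store_node(self, element: str) -> int:
        """Store the node in its own unit (once) and record it in the index table."""
        if element not in self.index:
            address = self._next_address
            self._next_address += 1
            self.units[address] = StorageUnit(element)
            self.index[element] = address
        return self.index[element]

    def _link(self, parent: int, child: int) -> None:
        """Record the child's address in the parent's physical storage address list."""
        children = self.units[parent].child_addresses
        if child not in children:
            children.append(child)

    def store_triple(self, subject: str, predicate: str, obj: str) -> None:
        s, p, o = (self._store_node(e) for e in (subject, predicate, obj))
        self._link(s, p)  # the predicate is a child node of the subject
        self._link(p, o)  # the object is a child node of the predicate


store = GraphStore()
store.store_triple("alice", "knows", "bob")
store.store_triple("alice", "livesIn", "paris")
```

Starting from `store.index["alice"]`, a reader can follow the address lists to every predicate and object without scanning unrelated storage units.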
3rd embodiment:
This embodiment provides an electronic device. As shown in Figure 6, it includes a processor 601, a memory 602, and a communication bus 603, where the communication bus 603 implements connection and communication between the processor 601 and the memory 602, and the processor 601 executes one or more computer programs stored in the memory 602 to implement at least one step of the graph-based RDF data management method of the first embodiment.
This embodiment also provides a computer-readable storage medium, which includes volatile or non-volatile, removable or non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, computer program modules, or other data). Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technologies, CD-ROM (Compact Disc Read-Only Memory), digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
The computer-readable storage medium in this embodiment can be used to store one or more computer programs, and the stored computer programs can be executed by a processor to implement at least one step of the method of the first embodiment.
This embodiment also provides a computer program, which can be distributed on a computer-readable medium and executed by a computing device to implement at least one step of the method of the first embodiment; in some cases, at least one of the steps shown or described may be performed in an order different from that described in the above embodiment.
This embodiment also provides a computer program product, including a computer-readable device on which the computer program described above is stored. In this embodiment, the computer-readable device may include the computer-readable storage medium described above.
It can be seen that those skilled in the art should understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software (which may be implemented as computer program code executable by a computing device), firmware, hardware, or an appropriate combination thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, computer program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. Therefore, the invention is not limited to any specific combination of hardware and software.
The above is a further detailed description of the embodiments of the invention in connection with specific implementations, and the specific implementation of the invention shall not be regarded as limited to these descriptions. For those of ordinary skill in the art to which the invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the invention, and all of them shall be regarded as falling within the protection scope of the invention.
Claims (10)
1. A graph-based RDF data management method, characterized by comprising:
creating an RDF graph based on RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph;
storing the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on an SSD, respectively;
saving, in the storage unit in which a parent node is stored, a physical storage address list that includes the physical storage addresses of all child nodes corresponding to the parent node, and saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to a node address index table; wherein the object of the triple is a child node of the predicate, and the predicate is a child node of the subject.
2. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
upon receiving an RDF data query request, obtaining at least one triple to be queried, wherein the known elements of the triple to be queried are the query conditions, the unknown elements of the triple to be queried are the query result, and the known elements include at least the subject, which has the highest node level in the triple to be queried;
looking up the physical storage address corresponding to the known element based on the node address index table;
obtaining, from the physical storage address list stored at the physical storage address of the known element, the physical storage address of the child node corresponding to the known element, looking up the child node corresponding to the known element based on the physical storage address of the child node, and obtaining the query result based on the child node.
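As an illustration only, and not part of the claim, the lookup for a pattern with a known subject and predicate and an unknown object might look like the sketch below. The dict-based `units` store, the integer addresses, and the function names `children_of` and `query_objects` are hypothetical.

```python
# Simulated SSD: physical storage address -> (element, physical storage address list).
units: dict[int, tuple[str, list[int]]] = {
    0: ("alice", [1, 3]),
    1: ("knows", [2]),
    2: ("bob", []),
    3: ("livesIn", [4]),
    4: ("paris", []),
}
# Node address index table: element -> physical storage address.
index: dict[str, int] = {element: address for address, (element, _) in units.items()}


def children_of(address: int) -> list[int]:
    """Read the physical storage address list kept in a node's storage unit."""
    return units[address][1]


def query_objects(subject: str, predicate: str) -> list[str]:
    """Answer the pattern (subject, predicate, ?object)."""
    subject_address = index.get(subject)       # step 1: index table lookup
    if subject_address is None:
        return []
    for predicate_address in children_of(subject_address):  # step 2: child addresses
        element, object_addresses = units[predicate_address]
        if element == predicate:
            # step 3: load the child nodes that hold the unknown element
            return [units[address][0] for address in object_addresses]
    return []
```

Only the index table and the storage units on the path from the subject are touched; no other triples are scanned.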
3. The graph-based RDF data management method according to claim 2, characterized in that, when there are multiple triples to be queried, the looking up of the physical storage address corresponding to the known element based on the node address index table comprises:
looking up, based on the node address index table, the physical storage address corresponding to the known element of each triple to be queried, respectively;
and the obtaining, from the physical storage address list stored at the physical storage address of the known element, of the physical storage address of the child node corresponding to the known element, the looking up of the child node corresponding to the known element based on the physical storage address of the child node, and the obtaining of the query result based on the child node comprise:
obtaining, from the physical storage address list stored at the physical storage address of each known element, the physical storage address of the child node corresponding to each known element, respectively; then loading in parallel the data stored at the physical storage addresses of the child nodes, looking up the child node corresponding to each known element, and obtaining multiple query results based on the child nodes.
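A rough sketch of the parallel loading step, given for illustration only: a thread pool stands in for the SSD's internal parallelism, `read_unit` is a hypothetical stand-in for one physical read, and the store is the same kind of simulated dict used for exposition. The sketch collects the child addresses of several known subjects first and then loads all child storage units in one parallel batch.

```python
from concurrent.futures import ThreadPoolExecutor

# Simulated SSD: physical storage address -> (element, physical storage address list).
units: dict[int, tuple[str, list[int]]] = {
    0: ("alice", [1]),
    1: ("knows", [2]),
    2: ("bob", [3]),
    3: ("worksAt", [4]),
    4: ("acme", []),
}
# Node address index table: element -> physical storage address.
index: dict[str, int] = {element: address for address, (element, _) in units.items()}


def read_unit(address: int) -> tuple[str, list[int]]:
    """Stand-in for one read of a storage unit from the SSD."""
    return units[address]


def query_children(subjects: list[str]) -> dict[str, list[str]]:
    """For each known subject, return the elements stored in its child nodes."""
    # Look up every known element, then read its physical storage address list.
    child_lists = {s: read_unit(index[s])[1] for s in subjects}
    all_addresses = [a for addresses in child_lists.values() for a in addresses]
    # Load all child storage units in parallel; map() preserves input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        loaded = dict(zip(all_addresses, pool.map(read_unit, all_addresses)))
    return {s: [loaded[a][0] for a in addresses] for s, addresses in child_lists.items()}


results = query_children(["alice", "bob"])
```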
4. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
upon receiving an RDF data addition request, checking whether the nodes corresponding to the elements of the triple to be added exist in the node address index table;
if so, adding the physical storage address of the child node of the triple to be added to the physical storage address list corresponding to the parent node of the triple to be added;
if not, storing the nodes of the triple to be added in different storage units on the SSD, respectively, saving, in the storage unit in which the parent node of the triple to be added is stored, a physical storage address list that includes the physical storage address of the child node of the triple to be added, and saving the correspondence between each node of the triple to be added and the physical storage address of the storage unit in which it is stored to the node address index table.
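For illustration only, and not as part of the claim, the addition flow could be sketched as below. The sketch makes the existence check per node rather than per triple, which is an assumption; the names `ensure_node` and `add_triple` are hypothetical, and `len(units)` is used as the next free address, which only works because this toy store never deletes.

```python
# Simulated SSD: physical storage address -> (element, physical storage address list).
units: dict[int, tuple[str, list[int]]] = {}
# Node address index table: element -> physical storage address.
index: dict[str, int] = {}


def ensure_node(element: str) -> int:
    """Return the node's address, allocating a new storage unit if it is not indexed."""
    if element not in index:
        address = len(units)  # next free address in this append-only toy store
        units[address] = (element, [])
        index[element] = address
    return index[element]


def add_triple(subject: str, predicate: str, obj: str) -> None:
    s = ensure_node(subject)
    p = ensure_node(predicate)
    o = ensure_node(obj)
    # Append each child's address to its parent's physical storage address list.
    for parent, child in ((s, p), (p, o)):
        child_addresses = units[parent][1]
        if child not in child_addresses:
            child_addresses.append(child)


add_triple("alice", "knows", "bob")
add_triple("alice", "knows", "carol")  # reuses the existing "alice" and "knows" nodes
```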
5. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
upon receiving an RDF data update request, determining the original triple stored on the SSD that corresponds to the triple to be updated;
updating the nodes corresponding to the elements of the original triple based on the elements of the triple to be updated, and, based on the updated elements, updating the physical storage address list and the node address index table.
6. The graph-based RDF data management method according to claim 1, characterized in that, after saving the correspondence between each node and the physical storage address of the storage unit in which the node is stored to the node address index table, the method further comprises:
upon receiving an RDF data deletion request, determining whether the node corresponding to each element of the triple to be deleted is also associated with other triples;
if so, deleting the physical storage address of the child node from the physical storage address list corresponding to the parent node of the triple to be deleted that is also associated with other triples;
if not, deleting the nodes of the triple to be deleted that are not associated with any other triple, and deleting the recorded physical storage addresses of those nodes from the physical storage address list and the node address index table.
7. A graph-based RDF data management device, characterized by comprising:
a creation module, configured to create an RDF graph based on RDF data to be stored, wherein each element of every triple in the RDF data to be stored corresponds to one node on the RDF graph;
a storage module, configured to store the nodes corresponding to the elements of the triple in storage units with different physical storage addresses on an SSD, respectively;
a saving module, configured to save, in the storage unit in which a parent node is stored, a physical storage address list that includes the physical storage addresses of all child nodes corresponding to the parent node, and to save the correspondence between each node and the physical storage address of the storage unit in which the node is stored to a node address index table; wherein the object of the triple is a child node of the predicate, and the predicate is a child node of the subject.
8. The graph-based RDF data management device according to claim 7, characterized by further comprising a query module configured to: upon receiving an RDF data query request, obtain at least one triple to be queried, wherein the known elements of the triple to be queried are the query conditions, the unknown elements of the triple to be queried are the query result, and the known elements include at least the subject, which has the highest node level in the triple to be queried; look up the physical storage address corresponding to the known element based on the node address index table; obtain, from the physical storage address list stored at the physical storage address of the known element, the physical storage address of the child node corresponding to the known element; look up the child node corresponding to the known element based on the physical storage address of the child node; and obtain the query result based on the child node.
9. An electronic device, characterized by comprising a processor, a memory, and a communication bus, wherein:
the communication bus is configured to implement connection and communication between the processor and the memory;
the processor is configured to execute one or more programs stored in the memory to implement the steps of the graph-based RDF data management method according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the graph-based RDF data management method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910389293.XA CN110110034A (en) | 2019-05-10 | 2019-05-10 | A kind of RDF data management method, device and storage medium based on figure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910389293.XA CN110110034A (en) | 2019-05-10 | 2019-05-10 | A kind of RDF data management method, device and storage medium based on figure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110110034A true CN110110034A (en) | 2019-08-09 |
Family
ID=67489353
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910389293.XA Pending CN110110034A (en) | 2019-05-10 | 2019-05-10 | A kind of RDF data management method, device and storage medium based on figure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110110034A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103294710A (en) * | 2012-02-28 | 2013-09-11 | 北京新媒传信科技有限公司 | Data access method and device |
CN103617276A (en) * | 2013-12-09 | 2014-03-05 | 南京大学 | Method for storing distributed hierarchical RDF data |
CN103778251A (en) * | 2014-02-19 | 2014-05-07 | 天津大学 | SPARQL parallel query method facing large-scale RDF graph data |
CN104679764A (en) * | 2013-11-28 | 2015-06-03 | 方正信息产业控股有限公司 | Method and device for searching graph data |
CN106156319A (en) * | 2016-07-05 | 2016-11-23 | 北京航空航天大学 | Telescopic distributed resource description framework data storage method and device |
CN107038234A (en) * | 2017-04-17 | 2017-08-11 | 天津大学 | A kind of path query framework based on RDF graph data and relation data |
CN107291807A (en) * | 2017-05-16 | 2017-10-24 | 中国科学院计算机网络信息中心 | A kind of SPARQL enquiring and optimizing methods based on figure traversal |
US20170316110A1 (en) * | 2013-01-29 | 2017-11-02 | Oracle International Corporation | Publishing rdf quads as relational views |
CN109344259A (en) * | 2018-07-20 | 2019-02-15 | 西安交通大学 | A kind of RDF distributed storage method dividing frame based on multilayer |
KR101972127B1 (en) * | 2018-11-30 | 2019-04-25 | 주식회사 피씨엔 | Intelligent search system based on resource description framework triple data and intelligent search method using the same |
Non-Patent Citations (3)
Title |
---|
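Feng Zhijie (冯志杰): "面向大规模RDF数据的混合分布式存储方案研究" [Research on a hybrid distributed storage scheme for large-scale RDF data], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] * |
Jiang Longxiang et al. (姜龙翔等): "一种大规模RDF语义数据的分布式存储方案" [A distributed storage scheme for large-scale RDF semantic data], 《计算机应用与软件》 [Computer Applications and Software] * |
Cui Yitong et al. (崔义童等): "基于图聚类算法的大规模RDF数据查询方法研究" [Research on a query method for large-scale RDF data based on a graph clustering algorithm], 《小型微型计算机系统》 [Journal of Chinese Computer Systems] * |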
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112256886A (en) * | 2020-10-23 | 2021-01-22 | 平安科技(深圳)有限公司 | Probability calculation method and device in map, computer equipment and storage medium |
CN112256886B (en) * | 2020-10-23 | 2023-06-27 | 平安科技(深圳)有限公司 | Probability calculation method and device in atlas, computer equipment and storage medium |
CN113253926A (en) * | 2021-05-06 | 2021-08-13 | 天津大学深圳研究院 | Memory internal index construction method for improving query and memory performance of novel memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190809 |