WO2016180186A1

WO2016180186A1 - Semantic data storage method and apparatus

Info

Publication number: WO2016180186A1
Application number: PCT/CN2016/079672
Authority: WO
Inventors: 曲文武; 王志坤
Original assignee: 中兴通讯股份有限公司
Priority date: 2015-07-01
Filing date: 2016-04-19
Publication date: 2016-11-17
Also published as: CN106326295B; CN106326295A

Abstract

A semantic data storage method and apparatus. The method comprises: selecting a theme attribute and a master key attribute in semantic data, wherein the theme attribute is an attribute, the query frequency of which exceeds a predetermined threshold value, in the semantic data, and the master key attribute is an attribute of data logic description in the semantic data; calculating a master key attribute value set of the master key attribute which corresponds to each theme attribute value of the theme attribute; storing attributes which belong to the same master key attribute value set in the semantic data on the same node; and establishing an attribute table for each attribute which is stored in the node on the node and storing the attribute table according to a key value storage mode. The technical solution gives consideration to a storage space and query efficiency in the semantic data storage method, thereby saving the storage space and improving the query efficiency.

Description

Semantic data storage method and device

Technical field

This document relates to, but is not limited to, the field of communications, and in particular, to a method and apparatus for storing semantic data.

Background technique

Semantic data is a kind of data described by the Resource Description Framework (RDF), also known as RDF data. The format of semantic data is generally expressed as <subject, predicate, object>, such as <Bob, monthly salary, 5800>, <Bob, department, personnel>, where predicate is also called attribute.

As the amount of data increases, distributed file systems have become the mainstream storage method for semantic data. In this case, a significant portion of the query time for semantic data is spent reading the data. The distributed file system has three semantic data storage methods. The first is triple storage. Table 1 is a triple storage table of semantic data in the related art: a three-column table that stores subjects, predicates, and objects, respectively. This method is the easiest to implement, but its query performance is poor and it often needs auxiliary optimization methods. For example, the query "Who are Bob's colleagues in his department?" needs to traverse all the data. If "personnel" is indexed, the target data can be found through the index, but this introduces the problem of building and maintaining complex and very large indexes.

Table 1

Subject	Predicate	Object
Bob	company	CorpA
Bob	department	personnel
Bob	monthly salary	5800
Bob	gender	male
Bob	spouse	Jerry
Bob	hobby	basketball
Jerry	company	CorpB
Jerry	department	sales
Jerry	gender	female
Jerry	hobby	shopping
Jerry	mailbox	Jerry@CorpB.com
Tom	company	CorpA
Tom	department	personnel
CorpA	address	Beijing
CorpB	number of employees	100
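
As a rough illustration of why triple storage forces a full traversal, the following Python sketch holds Table 1 as a list of triples and answers the department-colleague query by scanning it. The function name and the in-memory list are assumptions made for this sketch only; the source does not prescribe any implementation.

```python
# Table 1 as a list of (subject, predicate, object) triples.
TRIPLES = [
    ("Bob", "company", "CorpA"),
    ("Bob", "department", "personnel"),
    ("Bob", "monthly salary", 5800),
    ("Bob", "gender", "male"),
    ("Bob", "spouse", "Jerry"),
    ("Bob", "hobby", "basketball"),
    ("Jerry", "company", "CorpB"),
    ("Jerry", "department", "sales"),
    ("Jerry", "gender", "female"),
    ("Jerry", "hobby", "shopping"),
    ("Jerry", "mailbox", "Jerry@CorpB.com"),
    ("Tom", "company", "CorpA"),
    ("Tom", "department", "personnel"),
    ("CorpA", "address", "Beijing"),
    ("CorpB", "number of employees", 100),
]


def department_colleagues(triples, person):
    """Find people in the same department as `person` by scanning every triple.

    Hypothetical helper: returns the colleagues and the number of triples read.
    """
    scanned = 0
    department = None
    # First full pass: look up the person's own department.
    for s, p, o in triples:
        scanned += 1
        if s == person and p == "department":
            department = o
    # Second full pass: collect other subjects with the same department value.
    colleagues = []
    for s, p, o in triples:
        scanned += 1
        if p == "department" and o == department and s != person:
            colleagues.append(s)
    return colleagues, scanned


colleagues, scanned = department_colleagues(TRIPLES, "Bob")
print(colleagues, scanned)
```

Without an index, every triple is read in both passes even though only three of them carry the "department" predicate.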

The second is column storage, a <key, value> storage method in which the subject is the primary key, the predicate is the attribute, and the object is the attribute value. Table 2 is a column-storage department table of semantic data in the related art: the primary key is the name of a person, and the attribute value is the department to which that person belongs. The advantage of this approach is that it makes full use of the storage space and stores all data with the same attribute in one table, which benefits queries on that attribute. For example, to answer "Who are Bob's colleagues in his department?", only the records in the department table whose attribute value is "personnel" need to be queried. The drawback is that the different attribute values (objects) of one subject are scattered across different tables, so a query involving multiple attributes must join multiple tables, which reduces query efficiency. For example, suppose the same query is issued against semantic data that stores employee information for multiple companies. The query then involves two attributes, "company" and "department": the employees in the same company as Bob are queried in the "company" table, the employees in the same department as Bob are queried in the "department" table, and the two partial results are then joined on the employee.

Table 2

Primary key	department
Bob	personnel
Jerry	sales
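
The join cost of column storage can be sketched in Python as follows. Each attribute table is modeled as a dictionary from primary key to value; the tables hold the "company" and "department" values from Table 1, and the helper name is an assumption made for illustration.

```python
# One <key, value> attribute table per predicate (hypothetical in-memory layout).
COMPANY = {"Bob": "CorpA", "Jerry": "CorpB", "Tom": "CorpA"}
DEPARTMENT = {"Bob": "personnel", "Jerry": "sales", "Tom": "personnel"}


def same_value(table, person):
    """Return the keys whose value in `table` equals the value stored for `person`."""
    target = table[person]
    return {key for key, value in table.items() if value == target and key != person}


# Single-attribute query: only the department table is read.
same_department = same_value(DEPARTMENT, "Bob")

# Two-attribute query: each table is queried separately, then the partial
# results are joined (here, a set intersection on the primary key).
same_company = same_value(COMPANY, "Bob")
colleagues = same_company & same_department
print(sorted(colleagues))
```

The intersection stands in for the table join; in a distributed setting the two attribute tables may sit on different nodes, which is where the join becomes expensive.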

The third is row storage. In the extreme case there is a single super table in which every predicate is a column (attribute) of the table, so that all data can be stored in this one table. Table 3 is a super table of semantic data. The problem is that a super table can be very sparse in many cases, wasting a lot of storage space. In actual implementations, only closely related attributes are stored together in a table, as shown in Table 4a (the employee attribute table of semantic data) and Table 4b (the company attribute table of semantic data). This reduces the sparseness of the tables and also avoids some join operations between tables, but finding these closely related attributes is itself a difficult problem.

Table 3

Table 4a

Table 4b
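
To make the sparseness of a super table concrete, the following Python sketch pivots the triples of Table 1 into one row per subject and one column per predicate and counts the filled cells. The pivot function is a hypothetical helper written for this illustration; it is not taken from the source.

```python
TRIPLES = [
    ("Bob", "company", "CorpA"), ("Bob", "department", "personnel"),
    ("Bob", "monthly salary", 5800), ("Bob", "gender", "male"),
    ("Bob", "spouse", "Jerry"), ("Bob", "hobby", "basketball"),
    ("Jerry", "company", "CorpB"), ("Jerry", "department", "sales"),
    ("Jerry", "gender", "female"), ("Jerry", "hobby", "shopping"),
    ("Jerry", "mailbox", "Jerry@CorpB.com"),
    ("Tom", "company", "CorpA"), ("Tom", "department", "personnel"),
    ("CorpA", "address", "Beijing"),
    ("CorpB", "number of employees", 100),
]


def build_super_table(triples):
    """Pivot triples into one row per subject and one column per predicate."""
    columns = sorted({p for _, p, _ in triples})
    rows = {}
    for s, p, o in triples:
        # Every row gets all columns; missing values stay None.
        rows.setdefault(s, dict.fromkeys(columns))[p] = o
    return columns, rows


columns, rows = build_super_table(TRIPLES)
total_cells = len(rows) * len(columns)
filled_cells = sum(v is not None for row in rows.values() for v in row.values())
print(len(rows), len(columns), filled_cells, total_cells)
```

For this small data set only 15 of the 45 cells hold a value, so two thirds of the super table is empty.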

However, semantic data implies relationships between data, and these relationships combine the data into a single graph. Figure 1 is a schematic diagram of the formation of semantic data in the related art. A query on semantic data is equivalent to searching for a subgraph in the graph. When the amount of semantic data is large, the data needs to be stored on different nodes, and the search for a subgraph may involve several nodes. The three storage methods above make different tradeoffs among data management, storage space, and query efficiency, and each has its own advantages and disadvantages.

In the related art, no effective solution has yet been proposed for the problem that storage space and query efficiency cannot both be taken into account in semantic data storage methods.

Summary of the invention

The following is an overview of the topics detailed in this document. This Summary is not intended to limit the scope of the claims.

The embodiment of the invention provides a method and a device for storing semantic data, so as to achieve a balance between storage space and query efficiency in the semantic data storage method.

An embodiment of the present invention provides a method for storing semantic data, including: selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency exceeds a predetermined threshold in the semantic data, and the primary key attribute is an attribute used for the logical description of the data in the semantic data; calculating a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute; storing the attributes belonging to the same primary key attribute value set in the semantic data on the same node; and establishing, on the node, an attribute table for each attribute stored in the node and storing the attribute table in a key-value storage manner.

In the embodiment of the present invention, storing the attributes belonging to the same primary key attribute value set in the semantic data on the same node includes: establishing a super table for the semantic data according to the primary key attribute; and storing the records in the super table that belong to the same primary key attribute value set on the same node.

In the embodiment of the present invention, after storing the attribute table in the key-value storage manner, the method further includes: establishing, on the node, an access index in a specified format for a predetermined attribute and the topic attribute stored in the node.

In the embodiment of the present invention, calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute includes one of the following: calculating the primary key attribute value set formed by the values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or calculating the primary key attribute value set formed by the values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.

In the embodiment of the present invention, before establishing, on the node, an attribute table for each attribute stored in the node, the method further includes: in the case where a primary key attribute value belongs to multiple primary key attribute value sets at the same time, storing the records in the super table corresponding to that primary key attribute value on multiple nodes.

The embodiment of the invention further provides a storage device for semantic data, comprising: a selection module, configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency exceeds a predetermined threshold in the semantic data, and the primary key attribute is an attribute used for the logical description of the data in the semantic data; a calculation module, configured to calculate a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute; a first storage module, configured to store the attributes belonging to the same primary key attribute value set in the semantic data on the same node; and a second storage module, configured to establish, on the node, an attribute table for each attribute stored in the node and to store the attribute table in a key-value storage manner.

In the embodiment of the present invention, the first storage module includes: an establishing unit, configured to establish a super table for the semantic data according to the primary key attribute; and a storage unit, configured to store the records in the super table that belong to the same primary key attribute value set on the same node.

In the embodiment of the present invention, the apparatus further includes: an indexing module, configured to, after the attribute table is stored in the key-value storage manner, establish, on the node, an access index in a specified format for a predetermined attribute and the topic attribute stored in the node.

In the embodiment of the present invention, the calculation module includes one of the following: a first calculation unit, configured to calculate the primary key attribute value set formed by the values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or a second calculation unit, configured to calculate the primary key attribute value set formed by the values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.

In an embodiment of the present invention, the apparatus further includes: a third storage module, configured to, before an attribute table is established on the node for each attribute stored in the node, and in the case where a primary key attribute value belongs to multiple primary key attribute value sets at the same time, store the records in the super table corresponding to that primary key attribute value on multiple nodes.

Through the embodiments of the present invention, the attributes belonging to the same primary key attribute value set in the semantic data are stored on the same node by means of the topic attribute and the primary key attribute, an attribute table is then established for each attribute stored in the node, and the attribute table is stored in a key-value storage manner. That is, the semantic data is first partitioned by row, and then, within each partition, data with high query relevance is stored together in a column-storage manner. The semantic data storage method thus takes both storage space and query efficiency into account, saving storage space and improving query efficiency.

Other aspects will be apparent upon reading and understanding the drawings and detailed description.

Brief description of the drawings

FIG. 1 is a schematic diagram of semantic data formation in the related art;

FIG. 2 is flowchart 1 of storing semantic data according to an embodiment of the present invention;

FIG. 3 is flowchart 2 of storing semantic data according to an embodiment of the present invention;

FIG. 4 is structural block diagram 1 of a storage device for semantic data according to an embodiment of the present invention;

FIG. 5 is structural block diagram 2 of a storage device for semantic data according to an embodiment of the present invention;

FIG. 6 is structural block diagram 3 of a storage device for semantic data according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of semantic data storage results according to an alternative embodiment of the present invention.

Embodiments of the invention

The invention will be described in detail below with reference to the drawings in conjunction with the embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.

It is to be understood that the terms "first", "second" and the like in the specification and claims of the present invention are used to distinguish similar objects, and are not necessarily used to describe a particular order or order.

In this embodiment, a method for storing semantic data is provided. The method is applicable to a cluster system, where the cluster system includes a distributed cluster. FIG. 2 is flowchart 1 of storing semantic data according to an embodiment of the present invention. As shown in FIG. 2, the process includes the following steps:

Step S202, selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency exceeds a predetermined threshold in the semantic data, and the primary key attribute is an attribute used for the logical description of the data in the semantic data;

Step S204, calculating a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute;

Step S206, storing attributes belonging to the same primary key attribute value set in the semantic data on the same node;

Step S208, establishing, on the node, an attribute table for each attribute stored in the node, and storing the attribute table in a key-value storage manner.

Through the above steps, the attributes belonging to the same primary key attribute value set in the semantic data are stored on the same node by means of the topic attribute and the primary key attribute, an attribute table is then established for each attribute stored in the node, and the attribute table is stored in a key-value storage manner. That is, the semantic data is first partitioned by row, and then, within each partition, data with high query relevance is stored together in a key-value manner. The semantic data storage method thus takes both storage space and query efficiency into account, saving storage space and improving query efficiency.

It should be noted that the above key value storage method may be stored in a format of <key, value>, but is not limited thereto.
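
One possible reading of the selection in step S202 is sketched below in Python. The source does not define how query frequency is measured; this sketch assumes it is the fraction of logged queries that reference an attribute. The query log, the function name, and the threshold value are all hypothetical.

```python
from collections import Counter


def select_topic_attributes(query_log, threshold):
    """Return the attributes whose query frequency exceeds `threshold`.

    `query_log` is a hypothetical list in which each entry lists the
    attributes (predicates) referenced by one query. Frequency is assumed
    to be the share of queries that mention the attribute.
    """
    if not query_log:
        return []
    # Count each attribute at most once per query.
    counts = Counter(attr for query in query_log for attr in set(query))
    total = len(query_log)
    return sorted(attr for attr, n in counts.items() if n / total > threshold)


QUERY_LOG = [
    ["company", "department"],
    ["company", "monthly salary"],
    ["company"],
    ["hobby"],
]
topics = select_topic_attributes(QUERY_LOG, threshold=0.5)
print(topics)
```

With this made-up log, "company" appears in three of four queries and is the only attribute above the 0.5 threshold, which is consistent with the choice of "company" as the topic attribute in the worked example later in the description.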

In the embodiment of the present invention, storing the attributes belonging to the same primary key attribute value set in the semantic data on the same node may be implemented in multiple manners. Optionally, it may be implemented as follows: establishing a super table for the semantic data according to the primary key attribute; and storing the records in the super table that belong to the same primary key attribute value set on the same node.

FIG. 3 is flowchart 2 of storing semantic data according to an embodiment of the present invention. As shown in FIG. 3, after step S208, the method further includes step S302: on the node, establishing an access index in a specified format for a predetermined attribute and the topic attribute stored in the node. The predetermined attribute is a manually set semantic data attribute determined according to the application. It should be noted that different predetermined attributes may correspond to different specified formats, and the same predetermined attribute may also adopt different specified formats, which may be set according to actual conditions.

With the access index established, when the user needs to read data, the access index is queried first to check whether the query condition can be satisfied. If it can, the data is read; if it cannot, the data need not be read. This effectively reduces join operations between nodes and accesses to irrelevant data, and greatly improves query efficiency.

In the embodiment of the present invention, calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute may be implemented by one of the following, but is not limited thereto: calculating the primary key attribute value set formed by the values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or calculating the primary key attribute value set formed by the values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.

In the embodiment of the present invention, before establishing, on the node, an attribute table for each attribute stored in the node, the method further includes: in the case where a primary key attribute value belongs to multiple primary key attribute value sets, storing the records in the super table corresponding to that primary key attribute value on multiple nodes. This makes queries more convenient for users.

The embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the above method.

Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the related art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM). The software product includes a number of instructions for causing a terminal device (which may be a cell phone, computer, server, or network device, etc.) to perform the methods described in the various embodiments of the present invention.

In this embodiment, a storage device for semantic data is also provided. The device is used to implement the foregoing embodiments and optional implementations; what has already been described is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

FIG. 4 is structural block diagram 1 of a storage device for semantic data according to an embodiment of the present invention. As shown in FIG. 4, the device includes:

The selection module 42 is configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency exceeds a predetermined threshold in the semantic data, and the primary key attribute is an attribute used for the logical description of the data in the semantic data;

The calculating module 44 is configured to calculate a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute;

The first storage module 46 is configured to store the attributes of the semantic data belonging to the same primary key attribute value set on the same node;

The second storage module 48 is configured to establish an attribute table on the node for each attribute stored in the node, and store the attribute table in a key value storage manner.

Through the above device, the attributes belonging to the same primary key attribute value set in the semantic data are stored on the same node by means of the topic attribute and the primary key attribute, an attribute table is then established for each attribute stored in the node, and the attribute table is stored in a key-value storage manner. That is, the semantic data is first partitioned by row, and then, within each partition, data with high query relevance is stored together in a column-storage manner. The semantic data storage method thus takes both storage space and query efficiency into account, saving storage space and improving query efficiency.

FIG. 5 is structural block diagram 2 of a storage device for semantic data according to an embodiment of the present invention. As shown in FIG. 5, the first storage module 46 includes: an establishing unit 52, configured to establish a super table for the semantic data according to the primary key attribute; and a storage unit 54, configured to store the records in the super table that belong to the same primary key attribute value set on the same node.

FIG. 6 is structural block diagram 3 of a storage device for semantic data according to an embodiment of the present invention. As shown in FIG. 6, the device further includes: an indexing module 62, configured to, after the attribute table is stored in the key-value storage manner, establish, on the node, an access index in a specified format for the predetermined attribute and the topic attribute stored in the node.

It should be noted that different predetermined attributes correspond to different specified formats, and the same predetermined attribute may also adopt different specified formats, which may be set according to actual conditions.

In the embodiment of the present invention, the calculating module 44 may include one of the following: a first calculating unit, configured to calculate the primary key attribute value set formed by the values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or a second calculating unit, configured to calculate the primary key attribute value set formed by the values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.

It should be noted that each of the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following manner: the foregoing modules are all located in the same processor, or the modules are respectively located in multiple processors.

The present invention is further explained below in conjunction with specific embodiments:

The embodiment of the invention provides an optional method for storing semantic data: a hybrid row-and-column method based on a topic and a primary key, which achieves good results in both storage space and query efficiency. The topic is an attribute and views the semantic data from the perspective of the graph. Since most queries are subgraph queries, the attributes that are queried most frequently are defined as topics, so that data related in the graph by a query is stored together, which effectively increases the efficiency of data access. The primary key views the semantic data from the perspective of data logic. As in the three storage methods described above, the semantic data is described logically through attributes, so storage based on the primary key can describe the data logically, thereby avoiding join operations and improving query efficiency. This alternative embodiment includes the following process:

(1) Data storage process:

In this alternative embodiment, the hybrid row-and-column storage method based on the topic and the primary key proceeds as follows:

Step 1. Select the appropriate topic attribute (TopicAttr) and primary key attribute (KeyAttr).

Step 2: For each topic attribute value (topici) of the topic attribute, treat that value as the subject (or object) and calculate the primary key attribute value set (keySeti) formed by the values in the corresponding objects (or subjects) that belong to the primary key attribute in the semantic data.

Step 3: Construct a logical super table on the semantic data with the primary key attribute. The row of the super table is called a record.

Step 4: Records belonging to the same primary key attribute value set (keySeti) are logically stored on the same node. If a primary key attribute value belongs to multiple primary key attribute value sets, the records corresponding to that primary key attribute value are also stored on multiple nodes.

Step 5: On each node, construct an attribute table for each attribute and store it in the format of <key, value>.

Step 6. Generate an access index according to the specified format for the pre-specified attribute and the topic attribute.
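
Steps 2 and 4 above can be sketched in Python as follows. This is one interpretation under stated assumptions: the key set of a topic value is taken to be the subjects that reference it through the topic attribute, plus the topic value itself when it has its own record. The function names, the integer node identifiers, and the extra subject "Ann" are hypothetical and do not come from the source.

```python
from collections import defaultdict


def compute_key_sets(triples, topic_attr):
    """Step 2: map each topic attribute value to its set of primary key values."""
    subjects = {s for s, _, _ in triples}
    key_sets = defaultdict(set)
    for s, p, o in triples:
        if p == topic_attr:
            key_sets[o].add(s)      # subject that references the topic value
            if o in subjects:
                key_sets[o].add(o)  # the topic value's own record, if it has one
    return dict(key_sets)


def place_records(key_sets):
    """Step 4: one node per key set; a key in several sets is replicated."""
    placement = defaultdict(set)
    for node_id, topic in enumerate(sorted(key_sets)):
        for key in key_sets[topic]:
            placement[key].add(node_id)
    return dict(placement)


# Triples from Table 1 that matter for the topic attribute "company".
TRIPLES = [
    ("Bob", "company", "CorpA"),
    ("Jerry", "company", "CorpB"),
    ("Tom", "company", "CorpA"),
    ("CorpA", "address", "Beijing"),
    ("CorpB", "number of employees", 100),
]
key_sets = compute_key_sets(TRIPLES, "company")
placement = place_records(key_sets)

# Hypothetical subject "Ann" (not in Table 1) who belongs to both companies:
# her record ends up on both nodes.
extra = [("Ann", "company", "CorpA"), ("Ann", "company", "CorpB")]
shared = place_records(compute_key_sets(TRIPLES + extra, "company"))
print(key_sets, placement, shared["Ann"])
```

Replicating a shared record trades some storage for locality: any query scoped to one topic value can be answered on a single node.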

(2) The process of reading semantic data

When the user needs to read the data, first query the access index to see if the query condition is met. If it is satisfied, the data is read; if it is not satisfied, the data need not be read.

In this alternative embodiment, the above storage in the format of <key, value> is equivalent to the key value storage mode in the above embodiment.

The embodiment of the present invention further provides another optional method for storing semantic data. The optional embodiment includes the following process:

(1) Data storage process:

The initial semantic data is shown in Table 1.

Step 1. Select the topic attribute as the company (TopicAttr={CorpA, CorpB}) and the primary key attribute as the user name (KeyAttr={Bob, Jerry, Tom})

Step 2: Calculate the set of primary key attribute values corresponding to each topic attribute value. Taking CorpA as an example: treating CorpA as the subject (or object), the values in the corresponding objects (or subjects) that belong to the primary key attribute in the semantic data form the primary key attribute value set keySetCorpA={Bob, Tom, CorpA}. Similarly, keySetCorpB={Jerry, CorpB}.

Step 3: Construct a logical super table for the semantic data with the primary key attribute, as shown in Table 3.

The results of steps 4 to 6 are shown in FIG. 7, Tables 5a to 5g, and Tables 6a to 6f, where Tables 5a to 5g show the attribute tables on node a and Tables 6a to 6f show the attribute tables on node b.

Step 4: Store the records corresponding to CorpA and CorpB on consecutive nodes, that is, the records corresponding to {Bob, Tom, CorpA} are stored on node a, and the records corresponding to {Jerry, CorpB} are stored on node b.

Step 5. On each node, build an attribute table for each attribute and store it in the format of <key, value>. On node a, <key, value> attribute tables are constructed for the attributes "company", "department", "monthly salary", "spouse", "gender", "hobby", and "address". On node b, <key, value> attribute tables are constructed for the attributes "company", "department", "mailbox", "gender", "hobby", and "number of employees".

Step 6. On node a and node b, for the attribute "monthly salary", generate an access index of the minimum and maximum values. The minimum value in the "monthly salary" attribute table of node a is 5800, and the maximum value is also 5800. There is no "monthly salary" attribute table on node b, so both the minimum and maximum values are represented as null values (NULL).
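
Steps 5 and 6 of this example can be sketched in Python as follows, starting from the records that step 4 placed on node a and node b. The dictionary layout and function names are assumptions for illustration; the (min, max) tuple stands in for the "specified format" of the access index, and None plays the role of NULL.

```python
# Records placed on each node by step 4 (values taken from Table 1).
NODE_RECORDS = {
    "a": {
        "Bob": {"company": "CorpA", "department": "personnel",
                "monthly salary": 5800, "gender": "male",
                "spouse": "Jerry", "hobby": "basketball"},
        "Tom": {"company": "CorpA", "department": "personnel"},
        "CorpA": {"address": "Beijing"},
    },
    "b": {
        "Jerry": {"company": "CorpB", "department": "sales", "gender": "female",
                  "hobby": "shopping", "mailbox": "Jerry@CorpB.com"},
        "CorpB": {"number of employees": 100},
    },
}


def build_attribute_tables(records):
    """Step 5: one <key, value> table per attribute present on the node."""
    tables = {}
    for key, row in records.items():
        for attr, value in row.items():
            tables.setdefault(attr, {})[key] = value
    return tables


def min_max_index(tables, attr):
    """Step 6: (min, max) of an attribute table, or (None, None) if absent."""
    values = list(tables.get(attr, {}).values())
    if not values:
        return (None, None)
    return (min(values), max(values))


tables = {node: build_attribute_tables(recs) for node, recs in NODE_RECORDS.items()}
index = {node: min_max_index(t, "monthly salary") for node, t in tables.items()}
print(sorted(tables["a"]), sorted(tables["b"]), index)
```

Node a ends up with seven attribute tables and node b with six, matching the attribute lists in step 5, and the index holds (5800, 5800) for node a and (None, None) for node b.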

(2) Data query process

When the user queries "Who are Bob's colleagues in his department?", the query asks for the list of people whose company value is CorpA and whose department value is personnel. Only the department attribute table on node a needs to be queried, which yields "Tom is Bob's department colleague."

When the user queries "How many people in CorpA have a monthly salary greater than 8000?", the access index of the "monthly salary" attribute table on node a shows that the highest monthly salary is only 5800, that is, nobody on the node has a monthly salary greater than 8000, so the query does not need to read the data of the "monthly salary" attribute table.
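
The two queries above can be sketched in Python against node a as follows. The node layout, the function names, and the boolean "table was read" flag are assumptions introduced for this illustration; only the attribute tables needed by the two queries are included.

```python
# Node a after steps 4 to 6, reduced to the tables used by the two queries.
NODE_A = {
    "tables": {
        "company": {"Bob": "CorpA", "Tom": "CorpA"},
        "department": {"Bob": "personnel", "Tom": "personnel"},
        "monthly salary": {"Bob": 5800},
    },
    "index": {"monthly salary": (5800, 5800)},
}


def department_colleagues(node, person):
    """Read only the department attribute table of the node holding `person`."""
    dept = node["tables"]["department"]
    return sorted(k for k, v in dept.items() if v == dept[person] and k != person)


def count_salary_above(node, limit):
    """Count keys with monthly salary > limit; also report whether the table was read."""
    _low, high = node["index"].get("monthly salary", (None, None))
    if high is None or high <= limit:
        return 0, False  # the index proves there is no match; skip the table
    table = node["tables"]["monthly salary"]
    return sum(v > limit for v in table.values()), True


print(department_colleagues(NODE_A, "Bob"))
print(count_salary_above(NODE_A, 8000))
print(count_salary_above(NODE_A, 5000))
```

Because everyone in CorpA is already on node a, the colleague query needs no cross-node join, and the salary query with limit 8000 is answered from the index alone.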

Embodiments of the present invention also provide a storage medium. Optionally, in the embodiment, the foregoing storage medium may be configured to store program code for performing the following steps:

S1, selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency exceeds a predetermined threshold in the semantic data, and the primary key attribute is an attribute used for the logical description of the data in the semantic data;

S2. Calculate a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute;

S3, storing attributes belonging to the same primary key attribute value set in the semantic data on the same node;

S4, an attribute table is established on the node for each attribute stored in the node, and the attribute table is stored according to the key value storage manner.

Optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.

For example, the specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the optional embodiments, and details are not described herein again.

It will be apparent to those skilled in the art that the various modules or steps of the present invention described above can be implemented by a general-purpose computing device, and that they can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by the computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein, or they may be separately fabricated into individual integrated circuit modules, or a plurality of the modules or steps may be fabricated into a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.

The above description is only an alternative embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes can be made to the present invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and scope of the present invention are intended to be included within the scope of the present invention.

Industrial applicability

The above technical solution takes into account the storage space and the query efficiency in the semantic data storage method, thereby saving storage space and improving query efficiency.

Claims

A method for storing semantic data, comprising:

selecting a topic attribute and a primary key attribute in the semantic data, wherein the topic attribute is an attribute in the semantic data whose query frequency exceeds a predetermined threshold, and the primary key attribute is an attribute logically described by the data in the semantic data;

calculating, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute;

storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and

establishing, on the node, an attribute table for each attribute stored on the node, and storing the attribute table in a key-value storage manner.
The method of claim 1, wherein storing the attributes of the semantic data that belong to the same set of primary key attribute values on the same node comprises:

establishing a super table for the semantic data according to the primary key attribute; and

storing, on the same node, the records in the super table that belong to the same primary key attribute value set.
The method of claim 1, further comprising:

after storing the attribute table in the key-value storage manner, establishing, on the node, an access index in a specified format for a predetermined attribute stored on the node and for the topic attribute.
The method of claim 1, wherein calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute comprises one of the following:

calculating the primary key attribute value set formed by the attribute values that belong to the primary key attribute in the objects corresponding to the topic attribute value;

calculating the primary key attribute value set formed by the attribute values that belong to the primary key attribute in the subjects corresponding to the topic attribute value.
The method of claim 2, the method further comprising:

before the attribute table is established on the node for each attribute stored on the node, when a primary key attribute value belongs to a plurality of the primary key attribute value sets at the same time, storing the record corresponding to that primary key attribute value in the super table on a plurality of nodes.
A storage device for semantic data, comprising:

a selection module, configured to select a topic attribute and a primary key attribute in the semantic data, wherein the topic attribute is an attribute in the semantic data whose query frequency exceeds a predetermined threshold, and the primary key attribute is an attribute logically described by the data in the semantic data;

a calculation module, configured to calculate a primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute;

a first storage module, configured to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and

a second storage module, configured to establish, on the node, an attribute table for each attribute stored on the node, and to store the attribute table in a key-value storage manner.
The apparatus of claim 6, wherein the first storage module comprises:

an establishing unit, configured to establish a super table for the semantic data according to the primary key attribute; and

a storage unit, configured to store, on the same node, the records in the super table that belong to the same primary key attribute value set.
The apparatus of claim 6, further comprising:

an indexing module, configured to establish, on the node, an access index in a specified format for a predetermined attribute stored on the node and for the topic attribute, after the attribute table is stored in the key-value storage manner.
The apparatus of claim 6, wherein the calculation module comprises one of the following:

a first calculating unit, configured to calculate the primary key attribute value set formed by the attribute values that belong to the primary key attribute in the objects corresponding to the topic attribute value;

a second calculating unit, configured to calculate the primary key attribute value set formed by the attribute values that belong to the primary key attribute in the subjects corresponding to the topic attribute value.
The apparatus of claim 7, further comprising:

a third storage module, configured to, before the attribute table is established on the node for each attribute stored on the node, and in a case where a primary key attribute value belongs to a plurality of the primary key attribute value sets at the same time, store the record corresponding to that primary key attribute value in the super table on a plurality of nodes.