WO2016180186A1 - Semantic data storage method and apparatus - Google Patents

Semantic data storage method and apparatus

Info

Publication number
WO2016180186A1
Authority
WO
WIPO (PCT)
Prior art keywords
attribute
primary key
node
semantic data
value
Prior art date
Application number
PCT/CN2016/079672
Other languages
French (fr)
Chinese (zh)
Inventor
曲文武
王志坤
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016180186A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Definitions

  • This document relates to, but is not limited to, the field of communications, and in particular, to a method and apparatus for storing semantic data.
  • Semantic data is a kind of data described by the Resource Description Framework (RDF), also known as RDF data.
  • The format of semantic data is generally expressed as <subject, predicate, object>, such as <Bob, monthly salary, 5800> and <Bob, department, personnel>, where the predicate is also called the attribute.
  • A distributed file system offers three ways to store semantic data. The first is triple storage. Table 1 is a triple storage table of semantic data in the related art: a three-column table whose columns store subjects, predicates, and objects respectively. This method is the easiest to implement, but its query performance is poor and it often needs additional optimizations. For example, the query "Who are Bob's department colleagues?" has to traverse all the data. If an index is built on "personnel", the target data can be found through the index alone, but this brings complex and bulky indexing problems of its own.
  • The second is column storage, a <key, value> storage method in which the subject is the primary key (key), the predicate is the attribute, and the object is the attribute value (value).
  • Table 2 is a department table of column storage of semantic data in the related art. It shows the column storage of a department table: the primary key is a person's name, and the attribute value is the department to which that person belongs. The advantage of this approach is that it makes full use of storage space and stores all data with the same attribute in one table, which favors queries on that attribute. For example, the query "Who are Bob's department colleagues?" only needs to find the records in the department table whose attribute value is "personnel".
  • The drawback of this method is that the different attribute values (objects) of a subject are scattered across different tables. When a query involves multiple attributes, multiple tables must be joined, which hurts query efficiency. For example, suppose the semantic data targeted by the query "Who are Bob's department colleagues?" stores employee information for several companies. The query then involves two attributes, "company" and "department": it has to find the employees with the same company as Bob in the "company" table, find the employees with the same department as Bob in the "department" table, and then join the two partial results on the employee.
  • The third is row storage. In the extreme case there is a single super table in which every predicate is an attribute of the table, so that all data can be stored in that one table. Table 3 is such a super table of semantic data. The problem is that a super table is very sparse in many cases and wastes a lot of storage space.
  • In practical implementations, some closely related attributes are stored together in one table, as shown in Table 4a (the employee attribute table of semantic data) and Table 4b (the company attribute table of semantic data). This reduces the sparseness of the tables and avoids some table joins, but finding these closely related attributes is itself a hard problem.
  • However, semantic data carries relationships between data items, and these relationships combine the data into a graph. FIG. 1 is a schematic diagram of the graph formed by semantic data in the related art.
  • A query on semantic data is equivalent to searching for a subgraph in that graph.
  • When the volume of semantic data is large, the data must be stored on different nodes, and the search for a subgraph may then involve several nodes.
  • The three storage methods above make different trade-offs among data management, storage space, and query efficiency, and each has its own advantages and disadvantages.
  • the embodiment of the invention provides a method and a device for storing semantic data, so as to achieve a balance between storage space and query efficiency in the semantic data storage method.
  • An embodiment of the present invention provides a method for storing semantic data, including: selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data; calculating, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute; storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and establishing, on the node, an attribute table for each attribute stored in the node, and storing the attribute table in a key-value storage manner.
  • Storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node includes: establishing a super table over the semantic data according to the primary key attribute; and storing the records in the super table that belong to the same primary key attribute value set on the same node.
  • The method further includes: establishing, on the node, an access index in a specified format for a predetermined attribute and the topic attribute stored in the node.
  • Calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute includes one of the following: calculating the set of primary key attribute values formed by the attribute values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or calculating the set of primary key attribute values formed by the attribute values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.
  • Before establishing, on the node, an attribute table for each attribute stored in the node, the method further includes: when a primary key attribute value belongs to multiple primary key attribute value sets at the same time, storing the records in the super table that correspond to that primary key attribute value on multiple nodes.
  • An embodiment of the invention further provides a storage device for semantic data, including: a selection module, configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data; a calculation module, configured to calculate, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute; a first storage module, configured to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and a second storage module, configured to establish, on the node, an attribute table for each attribute stored in the node and to store the attribute table in a key-value storage manner.
  • The first storage module includes: an establishing unit, configured to establish a super table over the semantic data according to the primary key attribute; and a storage unit, configured to store the records in the super table that belong to the same primary key attribute value set on the same node.
  • The apparatus further includes: an indexing module, configured to establish on the node, after the attribute table has been stored in the key-value storage manner, an access index in a specified format for a predetermined attribute and the topic attribute stored in the node.
  • The calculation module includes one of the following: a first calculation unit, configured to calculate the set of primary key attribute values formed by the attribute values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or a second calculation unit, configured to calculate the set of primary key attribute values formed by the attribute values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.
  • The apparatus further includes: a third storage module, configured to store, before an attribute table is established on the node for each attribute stored in the node, the records in the super table that correspond to a primary key attribute value on multiple nodes when that primary key attribute value belongs to multiple primary key attribute value sets at the same time.
  • With the embodiments of the present invention, the attributes in the semantic data that belong to the same primary key attribute value set are stored on the same node by means of the topic attribute and the primary key attribute, and an attribute table is then established for each attribute stored in the node and stored in the key-value storage manner. That is, the semantic data is first partitioned by row, and within each partition the data that is highly relevant to queries is stored together by column. The semantic data storage method thus takes both storage space and query efficiency into account, saving storage space and improving query efficiency.
  • FIG. 2 is a flow chart 1 of storage of semantic data in accordance with an embodiment of the present invention.
  • FIG. 3 is a flow chart 2 of storing semantic data in accordance with an embodiment of the present invention.
  • FIG. 4 is a structural block diagram 1 of a storage device for semantic data according to an embodiment of the present invention.
  • FIG. 5 is a structural block diagram 2 of a storage device for semantic data according to an embodiment of the present invention.
  • FIG. 6 is a structural block diagram 3 of a storage device for semantic data according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of semantic data storage results in accordance with an alternate embodiment of the present invention.
  • FIG. 2 is a flowchart of storing semantic data according to an embodiment of the present invention.
  • the process includes the following steps:
  • Step S202: selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data;
  • Step S204: calculating, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute;
  • Step S206: storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node;
  • Step S208: establishing, on the node, an attribute table for each attribute stored in the node, and storing the attribute table in the key-value storage manner.
  • Through the above steps, the attributes in the semantic data that belong to the same primary key attribute value set are stored on the same node by means of the topic attribute and the primary key attribute, and an attribute table is then established for each attribute stored in the node and stored in the key-value storage manner. That is, the semantic data is first partitioned by row, and within each partition the data that is highly relevant to queries is stored together in the key-value storage manner. The semantic data storage method thus takes both storage space and query efficiency into account, saving storage space and improving query efficiency.
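The selection rule in Step S202 can be illustrated with a minimal sketch. It assumes that "query frequency" is measured as the raw number of times an attribute appears in a query log and that "exceeds" means strictly greater than the threshold; the log contents, the function name `select_topic_attributes`, and the threshold value are hypothetical and do not come from the patent text.

```python
from collections import Counter


def select_topic_attributes(queried_attributes, threshold):
    """Return the attributes whose query count exceeds the threshold.

    `queried_attributes` is a hypothetical query log: one entry per
    attribute mentioned by a query. Counting raw occurrences is an
    assumption; the source only says "query frequency".
    """
    counts = Counter(queried_attributes)
    return sorted(attr for attr, n in counts.items() if n > threshold)


# Hypothetical log in which "company" is queried three times.
query_log = ["company", "department", "company", "hobby", "company", "department"]
topic_attributes = select_topic_attributes(query_log, threshold=2)
print(topic_attributes)
```

With this log only "company" is queried more than twice, so it is the only attribute selected as a topic attribute.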
  • The key-value storage manner may be storage in the <key, value> format, but is not limited thereto.
  • Storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node may be implemented in multiple ways.
  • For example, it may be implemented as follows: establishing a super table over the semantic data according to the primary key attribute; and storing the records in the super table that belong to the same primary key attribute value set on the same node.
  • Step S302: on the node, establishing an access index in a specified format for a predetermined attribute and the topic attribute stored in the node.
  • The predetermined attribute is a manually set semantic data attribute determined according to the application. It should be noted that different predetermined attributes may correspond to different specified formats, and the same predetermined attribute may also adopt different specified formats, which may be set according to actual conditions.
  • Calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute may be implemented in one of the following ways, but is not limited thereto: calculating the set of primary key attribute values formed by the attribute values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or calculating the set of primary key attribute values formed by the attribute values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.
  • Before establishing, on the node, an attribute table for each attribute stored in the node, the method further includes: when a primary key attribute value belongs to multiple primary key attribute value sets, storing the records in the super table that correspond to that primary key attribute value on multiple nodes. This makes queries easier for users.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the above method.
  • The method according to the above embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present invention, in essence or in the part that contributes to the related art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM), which includes a number of instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods described in the embodiments of the present invention.
  • A storage device for semantic data is also provided. The device is used to implement the foregoing embodiments and optional implementations, and what has already been described is not repeated here.
  • As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function.
  • Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a structural block diagram 1 of a storage device for semantic data according to an embodiment of the present invention. As shown in FIG. 4, the device includes:
  • the selection module 42 is configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data;
  • the calculating module 44 is configured to calculate, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute;
  • the first storage module 46 is configured to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node;
  • the second storage module 48 is configured to establish, on the node, an attribute table for each attribute stored in the node, and to store the attribute table in a key-value storage manner.
  • With the above device, the attributes in the semantic data that belong to the same primary key attribute value set are stored on the same node by means of the topic attribute and the primary key attribute, and an attribute table is then established for each attribute stored in the node and stored in the key-value storage manner. That is, the semantic data is first partitioned by row, and within each partition the data that is highly relevant to queries is stored together by column, so that the semantic data storage method takes both storage space and query efficiency into account.
  • The first storage module 46 includes: an establishing unit 52, configured to establish a super table according to the primary key attribute; and a storage unit 54, configured to store the records in the super table that belong to the same primary key attribute value set on the same node.
  • FIG. 6 is a structural block diagram 3 of a storage device for semantic data according to an embodiment of the present invention.
  • The device further includes: an indexing module 62, configured to establish on the node, after the attribute table has been stored in the key-value storage manner, an access index in the specified format for the predetermined attribute and the topic attribute stored in the node.
  • Different predetermined attributes may correspond to different specified formats, and the same predetermined attribute may also adopt different specified formats, which may be set according to actual conditions.
  • The calculating module 44 may include one of the following: a first calculating unit, configured to calculate the set of primary key attribute values formed by the attribute values, in the objects corresponding to the topic attribute value, that belong to the primary key attribute; or a second calculating unit, configured to calculate the set of primary key attribute values formed by the attribute values, in the subjects corresponding to the topic attribute value, that belong to the primary key attribute.
  • The apparatus further includes: a third storage module, configured to store, before an attribute table is established on the node for each attribute stored in the node, the records in the super table that correspond to a primary key attribute value on multiple nodes when that primary key attribute value belongs to multiple primary key attribute value sets at the same time.
  • Each of the above modules may be implemented by software or hardware.
  • This may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in multiple processors.
  • An embodiment of the invention provides an optional method for storing semantic data: a hybrid row-and-column method based on a topic and a primary key, which can achieve good results in both storage space and query efficiency.
  • The topic is an attribute chosen by viewing the semantic data as a graph. Since most queries look for subgraphs, attributes with a higher query frequency are defined as topics, so that graph data related to a query is stored together, which effectively improves data access efficiency.
  • The primary key views the semantic data from the perspective of data logic. As described for the three storage methods above, semantic data is logically a description of attributes, so storage based on the primary key describes the data logically, which avoids join operations and improves query efficiency.
  • This alternative embodiment includes the following process:
  • Step 1: Select an appropriate topic attribute (TopicAttr) and primary key attribute (KeyAttr).
  • Step 2: For each topic attribute value (topic_i) of the topic attribute, calculate the primary key attribute value set (keySet_i): the set of attribute values that belong to the primary key attribute among the objects (or subjects) that correspond to topic_i taken as the subject (or object) in the semantic data.
  • Step 3: Construct a logical super table over the semantic data using the primary key attribute.
  • A row of the super table is called a record.
  • Step 4: Records belonging to the same primary key attribute value set (keySet_i) are logically stored on the same node. If a primary key attribute value belongs to multiple primary key attribute value sets, the records corresponding to that primary key attribute value are likewise stored on multiple nodes.
  • Step 5: On each node, construct an attribute table for each attribute and store it in the <key, value> format.
  • Step 6: Generate an access index in the specified format for the pre-specified attribute and the topic attribute.
  • When a user needs to read data, the access index is queried first to see whether the query condition can be met. If it can, the data is read; if it cannot, the data need not be read.
  • The above storage in the <key, value> format is equivalent to the key-value storage manner in the above embodiment.
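Steps 2 to 5 can be sketched roughly as follows, using the triples of Table 1 with "company" as the topic attribute and the subject as the primary key. The key-set rule used here (all subjects that carry a topic value, plus the topic value itself when it also appears as a subject) is an assumption inferred from the key sets given in the document, not a rule the text states explicitly. The function names, and the use of the topic value as the node label, are hypothetical.

```python
from collections import defaultdict

# Triples of Table 1: (subject, predicate, object).
TRIPLES = [
    ("Bob", "company", "CorpA"), ("Bob", "department", "personnel"),
    ("Bob", "monthly salary", 5800), ("Bob", "gender", "male"),
    ("Bob", "spouse", "Jerry"), ("Bob", "hobby", "basketball"),
    ("Jerry", "company", "CorpB"), ("Jerry", "department", "sales"),
    ("Jerry", "gender", "female"), ("Jerry", "hobby", "shopping"),
    ("Jerry", "mailbox", "Jerry@CorpB.com"),
    ("Tom", "company", "CorpA"), ("Tom", "department", "personnel"),
    ("CorpA", "address", "Beijing"), ("CorpB", "number of employees", 100),
]


def key_sets(triples, topic_attr):
    """Step 2 (assumed rule): one primary key value set per topic value."""
    subjects = {s for s, _, _ in triples}
    sets = defaultdict(set)
    for s, p, o in triples:
        if p == topic_attr:
            sets[o].add(s)        # subject that carries this topic value
            if o in subjects:
                sets[o].add(o)    # the topic value is itself a record key
    return dict(sets)


def attribute_tables_per_node(triples, sets):
    """Steps 3 to 5: place records by key set, then split them by attribute.

    Each node (labelled by its topic value here) holds one <key, value>
    table per attribute. A key that belongs to several key sets is
    copied to several nodes.
    """
    nodes = {}
    for topic_value, keys in sets.items():
        tables = defaultdict(dict)
        for s, p, o in triples:
            if s in keys:
                tables[p][s] = o  # assumes one value per (key, attribute)
        nodes[topic_value] = dict(tables)
    return nodes


sets = key_sets(TRIPLES, "company")
nodes = attribute_tables_per_node(TRIPLES, sets)
print(sets)
print(sorted(nodes["CorpA"]))
```

Under these assumptions the CorpA partition holds seven attribute tables and the CorpB partition holds six.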
  • the embodiment of the present invention further provides another optional method for storing semantic data.
  • the optional embodiment includes the following process:
  • Step 2: Calculate the primary key attribute value set corresponding to each topic attribute value.
  • keySetCorpA = {Bob, Tom, CorpA}.
  • keySetCorpB = {Jerry, CorpB}.
  • Step 3 Construct a logical super table for the semantic data with the primary key attribute, as shown in Table 3.
  • Steps 4 to 6 are shown in FIG. 7, Tables 5a to 5g, and Tables 6a to 6f, where Tables 5a to 5g show the attribute tables on node a and Tables 6a to 6f show the attribute tables on node b.
  • Step 4: Store the records corresponding to CorpA and CorpB on consecutive nodes; that is, the records corresponding to {Bob, Tom, CorpA} are stored on node a, and the records corresponding to {Jerry, CorpB} are stored on node b.
  • Step 5: On each node, build an attribute table for each attribute and store it in the <key, value> format.
  • On node a, <key, value> attribute tables are built for the attributes "company", "department", "monthly salary", "spouse", "gender", "hobby", and "address".
  • On node b, <key, value> attribute tables are built for the attributes "company", "department", "mailbox", "gender", "hobby", and "number of employees".
  • Step 6: On node a and node b, generate a minimum/maximum access index for the attribute "monthly salary".
  • The minimum value in the "monthly salary" attribute table of node a is 5800, and the maximum value is also 5800.
  • For a query that looks for people whose monthly salary is greater than 8000, the access index of the "monthly salary" attribute table shows that the highest monthly salary on the node is only 5800, that is, nobody on the node has a monthly salary greater than 8000, so the query does not need to read the data of the "monthly salary" attribute table.
  • The above storage in the <key, value> format is equivalent to the key-value storage manner in the above embodiment.
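The minimum/maximum access index of Step 6 can be sketched as below. The sketch covers only the "monthly salary" attribute table of node a, which per Table 1 holds the single entry Bob: 5800. The function names and the shape of the index (a `(min, max)` tuple) are illustrative assumptions, since the text leaves the "specified format" open.

```python
def build_min_max_index(attribute_table):
    """Return (min, max) over the table's values, or None if it is empty."""
    values = list(attribute_table.values())
    return (min(values), max(values)) if values else None


def may_contain_greater_than(index, threshold):
    """Check the index only: can any value in the table exceed the threshold?"""
    return index is not None and index[1] > threshold


def keys_greater_than(attribute_table, index, threshold):
    """Read the attribute table only when the index says a match is possible."""
    if not may_contain_greater_than(index, threshold):
        return []  # the table is skipped without being read
    return sorted(k for k, v in attribute_table.items() if v > threshold)


# "monthly salary" attribute table on node a, per Table 1.
salary_on_node_a = {"Bob": 5800}
salary_index = build_min_max_index(salary_on_node_a)

print(salary_index)
print(keys_greater_than(salary_on_node_a, salary_index, 8000))
```

For the "greater than 8000" query the index maximum of 5800 rules out any match, so the attribute table itself is never scanned. A lower threshold such as 5000 would pass the index check and fall through to the scan.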
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be configured to store program code for performing the following steps:
  • The foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a mobile hard disk, and a magnetic memory.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one herein, or they may be separately fabricated into individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • the above technical solution takes into account the storage space and the query efficiency in the semantic data storage method, thereby saving storage space and improving query efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A semantic data storage method and apparatus. The method comprises: selecting a theme attribute and a master key attribute in semantic data, wherein the theme attribute is an attribute, the query frequency of which exceeds a predetermined threshold value, in the semantic data, and the master key attribute is an attribute of data logic description in the semantic data; calculating a master key attribute value set of the master key attribute which corresponds to each theme attribute value of the theme attribute; storing attributes which belong to the same master key attribute value set in the semantic data on the same node; and establishing an attribute table for each attribute which is stored in the node on the node and storing the attribute table according to a key value storage mode. The technical solution gives consideration to a storage space and query efficiency in the semantic data storage method, thereby saving the storage space and improving the query efficiency.

Description

Semantic data storage method and device
Technical field
This document relates to, but is not limited to, the field of communications, and in particular to a method and apparatus for storing semantic data.
Background
Semantic data is data described using the Resource Description Framework (RDF), and is also known as RDF data. The format of semantic data is generally expressed as <subject, predicate, object>, such as <Bob, monthly salary, 5800> and <Bob, department, personnel>, where the predicate is also called the attribute.
As data volumes grow, distributed file systems have become the mainstream way to store semantic data. In this setting, a considerable part of the query cost for semantic data is spent on reading the data. A distributed file system offers three ways to store semantic data. The first is triple storage. Table 1 is a triple storage table of semantic data in the related art: a three-column table whose columns store subjects, predicates, and objects respectively. This method is the easiest to implement, but its query performance is poor and it often needs additional optimizations. For example, the query "Who are Bob's department colleagues?" has to traverse all the data. If an index is built on "personnel", the target data can be found through the index alone, but this brings complex and bulky indexing problems of its own.
Table 1
Subject  Predicate            Object
Bob      company              CorpA
Bob      department           personnel
Bob      monthly salary       5800
Bob      gender               male
Bob      spouse               Jerry
Bob      hobby                basketball
Jerry    company              CorpB
Jerry    department           sales
Jerry    gender               female
Jerry    hobby                shopping
Jerry    mailbox              Jerry@CorpB.com
Tom      company              CorpA
Tom      department           personnel
CorpA    address              Beijing
CorpB    number of employees  100
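To make the cost of triple storage concrete, the following minimal sketch stores Table 1 as a list of tuples and answers "Who are Bob's department colleagues?" without any index. The function name `department_colleagues` is hypothetical, and a plain in-memory list stands in for the three-column table.

```python
# Table 1 as (subject, predicate, object) triples.
TRIPLES = [
    ("Bob", "company", "CorpA"), ("Bob", "department", "personnel"),
    ("Bob", "monthly salary", 5800), ("Bob", "gender", "male"),
    ("Bob", "spouse", "Jerry"), ("Bob", "hobby", "basketball"),
    ("Jerry", "company", "CorpB"), ("Jerry", "department", "sales"),
    ("Jerry", "gender", "female"), ("Jerry", "hobby", "shopping"),
    ("Jerry", "mailbox", "Jerry@CorpB.com"),
    ("Tom", "company", "CorpA"), ("Tom", "department", "personnel"),
    ("CorpA", "address", "Beijing"), ("CorpB", "number of employees", 100),
]


def department_colleagues(triples, person):
    """Find everyone in the same department as `person` by scanning all triples."""
    # First scan: look up the person's own department.
    dept = next(o for s, p, o in triples if s == person and p == "department")
    # Second scan: every other subject whose department matches.
    return sorted(
        s for s, p, o in triples
        if p == "department" and o == dept and s != person
    )


colleagues = department_colleagues(TRIPLES, "Bob")
print(colleagues)
```

Both passes touch every triple, including unrelated ones such as hobbies and addresses, which is the traversal cost the paragraph above describes.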
The second is column storage, a <key, value> storage method in which the subject is the primary key (key), the predicate is the attribute, and the object is the attribute value (value). Table 2 is a department table of column storage of semantic data in the related art. It shows the column storage of a department table: the primary key is a person's name, and the attribute value is the department to which that person belongs. The advantage of this approach is that it makes full use of storage space and stores all data with the same attribute in one table, which favors queries on that attribute. For example, the query "Who are Bob's department colleagues?" only needs to find the records in the department table whose attribute value is "personnel". The drawback is that the different attribute values (objects) of a subject are scattered across different tables; when a query involves multiple attributes, multiple tables must be joined, which hurts query efficiency. For example, suppose the semantic data targeted by the query "Who are Bob's department colleagues?" stores employee information for several companies. The query then involves two attributes, "company" and "department": it has to find the employees with the same company as Bob in the "company" table, find the employees with the same department as Bob in the "department" table, and then join the two partial results on the employee.
Table 2
Primary key    Department
Bob            Personnel
Jerry          Sales
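As an illustration only, the multi-attribute join described above can be sketched in Python. The department values for Bob and Jerry come from Table 2; the "company" table and the entries for Tom are assumptions chosen to match the example used later in this document, and the function names are hypothetical.

```python
# Column storage: one <key, value> table per attribute (predicate).
# Bob and Jerry's departments are taken from Table 2; the company table
# and Tom's rows are assumptions made for this sketch.
department = {"Bob": "Personnel", "Jerry": "Sales", "Tom": "Personnel"}
company = {"Bob": "CorpA", "Jerry": "CorpB", "Tom": "CorpA"}


def same_value(table, person):
    """Return every other key that shares `person`'s value in one attribute table."""
    target = table[person]
    return {key for key, value in table.items() if value == target and key != person}


def department_colleagues(person):
    """Colleagues must match on both attributes, so two tables are joined."""
    return same_value(company, person) & same_value(department, person)


print(department_colleagues("Bob"))  # {'Tom'}
```

The set intersection plays the role of the join on employee: each attribute table is scanned separately, and the two partial results are then combined.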
The third is row storage. In the extreme case there is a single super table in which every predicate is an attribute of the table, so that all data can be stored in this one table. Table 3 is a super table of semantic data. As shown in Table 3, the problem is that a super table is in many cases very sparse and wastes a large amount of storage space. In practice, closely related attributes are stored together in one table, as shown in Table 4a and Table 4b, where Table 4a is an employee attribute table of semantic data and Table 4b is a company attribute table of semantic data. This reduces the sparsity of the tables and avoids some join operations, but finding these closely related attributes is a difficult problem.
Table 3
Figure PCTCN2016079672-appb-000001
Table 4a
Figure PCTCN2016079672-appb-000002
Table 4b
Figure PCTCN2016079672-appb-000003
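The sparsity of a super table can be made concrete with a small sketch. The code below is not the content of Table 3 (which is only available as a figure); it builds a super table from a handful of triples, of which the "company" triples are assumptions, and measures the fraction of empty cells.

```python
def build_super_table(triples):
    """Build a super table: one row per subject, one column per predicate."""
    columns = sorted({predicate for _, predicate, _ in triples})
    rows = {}
    for subject, predicate, obj in triples:
        # Every row carries every column; missing values stay None.
        rows.setdefault(subject, dict.fromkeys(columns))[predicate] = obj
    return columns, rows


def sparsity(columns, rows):
    """Fraction of cells in the super table that are empty (None)."""
    total = len(columns) * len(rows)
    empty = sum(value is None for row in rows.values() for value in row.values())
    return empty / total if total else 0.0


triples = [
    ("Bob", "department", "Personnel"),      # from Table 2
    ("Jerry", "department", "Sales"),        # from Table 2
    ("CorpB", "number of employees", 100),   # from Table 1
    ("Bob", "company", "CorpA"),             # assumption for this sketch
    ("Jerry", "company", "CorpB"),           # assumption for this sketch
]
columns, rows = build_super_table(triples)
print(sparsity(columns, rows))
```

With three rows and three columns, four of the nine cells are empty, because person attributes do not apply to the company row and the company attribute does not apply to people.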
However, semantic data carries relationships between data items, and these relationships combine the data into a graph. Figure 1 is a schematic diagram of a graph formed by semantic data in the related art. As shown in Figure 1, a query on semantic data is equivalent to searching for a subgraph in the graph. When the volume of semantic data is large, the data has to be stored on different nodes, and a subgraph search may then involve several nodes. The three storage schemes above make different trade-offs among data management, storage space, and query efficiency, and each has its own advantages and disadvantages.
No effective solution has yet been proposed for the problem in the related art that semantic data storage methods cannot balance storage space against query efficiency.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a method and an apparatus for storing semantic data, so as to balance storage space and query efficiency in semantic data storage.
An embodiment of the present invention provides a method for storing semantic data, including: selecting a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is the attribute that logically describes the data in the semantic data; calculating, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute; storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and, on the node, establishing an attribute table for each attribute stored on the node and storing the attribute table in a key-value storage manner.
In an embodiment of the present invention, storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node includes: building a super table over the semantic data according to the primary key attribute; and storing the records of the super table that belong to the same primary key attribute value set on the same node.
In an embodiment of the present invention, after the attribute table is stored in the key-value storage manner, the method further includes: on the node, establishing an access index in a specified format for a predetermined attribute and the topic attribute stored on the node.
In an embodiment of the present invention, calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute includes one of the following: taking the topic attribute value as the subject, calculating the primary key attribute value set formed by the corresponding objects whose values belong to the primary key attribute; or taking the topic attribute value as the object, calculating the primary key attribute value set formed by the corresponding subjects whose values belong to the primary key attribute.
In an embodiment of the present invention, before the attribute table is established on the node for each attribute stored on the node, the method further includes: when a primary key attribute value belongs to several primary key attribute value sets at the same time, storing the record corresponding to that primary key attribute value in the super table on several nodes.
An embodiment of the present invention further provides an apparatus for storing semantic data, including: a selection module, configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is the attribute that logically describes the data in the semantic data; a calculation module, configured to calculate, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute; a first storage module, configured to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node; and a second storage module, configured to establish, on the node, an attribute table for each attribute stored on the node and to store the attribute table in a key-value storage manner.
In an embodiment of the present invention, the first storage module includes: an establishing unit, configured to build a super table over the semantic data according to the primary key attribute; and a storage unit, configured to store the records of the super table that belong to the same primary key attribute value set on the same node.
In an embodiment of the present invention, the apparatus further includes: an index module, configured to establish, on the node and after the attribute table is stored in the key-value storage manner, an access index in a specified format for a predetermined attribute and the topic attribute stored on the node.
In an embodiment of the present invention, the calculation module includes one of the following: a first calculation unit, configured to take the topic attribute value as the subject and calculate the primary key attribute value set formed by the corresponding objects whose values belong to the primary key attribute; or a second calculation unit, configured to take the topic attribute value as the object and calculate the primary key attribute value set formed by the corresponding subjects whose values belong to the primary key attribute.
In an embodiment of the present invention, the apparatus further includes: a third storage module, configured to store, before the attribute table is established on the node for each attribute stored on the node and when a primary key attribute value belongs to several primary key attribute value sets at the same time, the record corresponding to that primary key attribute value in the super table on several nodes.
According to the embodiments of the present invention, the topic attribute and the primary key attribute are used to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node, an attribute table is then established for each attribute stored on that node, and the attribute table is stored in a key-value storage manner. In other words, the semantic data is first partitioned by row and then stored by column within each partition, so that data with high query correlation is kept together. This balances storage space and query efficiency in semantic data storage, which saves storage space and improves query efficiency.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
Figure 1 is a schematic diagram of a graph formed by semantic data in the related art;
Figure 2 is a first flowchart of a method for storing semantic data according to an embodiment of the present invention;
Figure 3 is a second flowchart of a method for storing semantic data according to an embodiment of the present invention;
Figure 4 is a first structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention;
Figure 5 is a second structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention;
Figure 6 is a third structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention;
Figure 7 is a schematic diagram of a semantic data storage result according to an optional embodiment of the present invention.
Embodiments of the Invention
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that the embodiments in this application and the features in the embodiments may be combined with each other where no conflict arises.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence.
This embodiment provides a method for storing semantic data. The method is applicable to a cluster system, and the cluster system includes a distributed cluster. Figure 2 is a first flowchart of a method for storing semantic data according to an embodiment of the present invention. As shown in Figure 2, the flow includes the following steps:
Step S202: select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is the attribute that logically describes the data in the semantic data;
Step S204: calculate, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute;
Step S206: store the attributes in the semantic data that belong to the same primary key attribute value set on the same node;
Step S208: on the node, establish an attribute table for each attribute stored on the node, and store the attribute table in a key-value storage manner.
Through the above steps, the topic attribute and the primary key attribute are used to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node, an attribute table is then established for each attribute stored on that node, and the attribute table is stored in a key-value storage manner. In other words, the semantic data is first partitioned by row and then stored by column, as key-value pairs, within each partition, so that data with high query correlation is kept together. This balances storage space and query efficiency in semantic data storage, which saves storage space and improves query efficiency.
It should be noted that the key-value storage manner may store data in the <key, value> format, but is not limited thereto.
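The selection in step S202 can be sketched as follows. The embodiment only states that a topic attribute is one whose query frequency exceeds a predetermined threshold; the query log format, the use of a relative frequency, and the function name below are assumptions made for illustration.

```python
from collections import Counter


def select_topic_attributes(query_log, threshold):
    """Return attributes whose share of queries exceeds `threshold`.

    `query_log` is a list of queries, each given as the collection of
    attributes it touches. Both the log format and the use of a relative
    frequency are assumptions made for this sketch.
    """
    counts = Counter(attr for query in query_log for attr in set(query))
    total = len(query_log)
    return {attr for attr, n in counts.items() if total and n / total > threshold}


# Hypothetical query log: three of four queries touch "company".
query_log = [
    {"company", "department"},
    {"company", "monthly salary"},
    {"company"},
    {"hobby"},
]
print(select_topic_attributes(query_log, 0.5))  # {'company'}
```

An absolute count could be used in place of the relative frequency; the embodiment leaves the form of the threshold open.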
In an embodiment of the present invention, storing the attributes in the semantic data that belong to the same primary key attribute value set on the same node may be implemented in several ways. Optionally, it may be implemented as follows: build a super table over the semantic data according to the primary key attribute; and store the records of the super table that belong to the same primary key attribute value set on the same node.
Figure 3 is a second flowchart of a method for storing semantic data according to an embodiment of the present invention. As shown in Figure 3, after step S208 the method further includes step S302: on the node, establish an access index in a specified format for a predetermined attribute and the topic attribute stored on the node. The predetermined attribute is a semantic data attribute that is set manually according to the application. It should be noted that different predetermined attributes correspond to different specified formats, and the same predetermined attribute may also use different specified formats, which may be set according to actual conditions.
With the access index in place, when a user needs to read data, the access index is queried first to check whether the query condition can be satisfied. If it can, the data is read; if it cannot, that part of the data does not need to be read. This effectively reduces join operations between nodes and accesses to irrelevant data, and thus greatly improves query efficiency.
In an embodiment of the present invention, calculating the primary key attribute value set of the primary key attribute corresponding to each topic attribute value of the topic attribute may be implemented by, but is not limited to, one of the following: taking the topic attribute value as the subject, calculating the primary key attribute value set formed by the corresponding objects whose values belong to the primary key attribute; or taking the topic attribute value as the object, calculating the primary key attribute value set formed by the corresponding subjects whose values belong to the primary key attribute.
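The two alternatives above differ only in which side of a triple carries the topic attribute value. A minimal sketch, assuming the semantic data is available as (subject, predicate, object) triples and using hypothetical function and parameter names, might look like this. The "employs" predicate in the second call is invented for illustration.

```python
def key_set(triples, topic_attr, topic_value, primary_keys, topic_as="object"):
    """Collect the primary key values linked to one topic attribute value.

    topic_as="object": the topic value is the object of the triple, so the
    matching subjects are collected. topic_as="subject": the topic value is
    the subject, so the matching objects are collected. Only values that
    belong to the primary key attribute (`primary_keys`) are kept.
    """
    keys = set()
    for subject, predicate, obj in triples:
        if predicate != topic_attr:
            continue
        if topic_as == "object" and obj == topic_value:
            candidate = subject
        elif topic_as == "subject" and subject == topic_value:
            candidate = obj
        else:
            continue
        if candidate in primary_keys:
            keys.add(candidate)
    return keys


names = {"Bob", "Jerry", "Tom"}
by_object = [("Bob", "company", "CorpA"), ("Tom", "company", "CorpA"),
             ("Jerry", "company", "CorpB")]
by_subject = [("CorpA", "employs", "Bob")]  # hypothetical predicate

print(key_set(by_object, "company", "CorpA", names) == {"Bob", "Tom"})
print(key_set(by_subject, "employs", "CorpA", names, topic_as="subject") == {"Bob"})
```

The worked example later in this document also places the topic value itself (for example CorpA) in its key set, so that the record describing the company lands on the same node; this sketch leaves that step out.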
In an embodiment of the present invention, before the attribute table is established on the node for each attribute stored on the node, the method further includes: when a primary key attribute value belongs to several primary key attribute value sets at the same time, storing the record corresponding to that primary key attribute value in the super table on several nodes. This makes queries more convenient for users.
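The replication rule above can be illustrated with a short sketch. The mapping from topic values to node names, and the case in which Bob belongs to two key sets, are hypothetical and are introduced only to show a record being placed on more than one node.

```python
def assign_records(key_sets, node_of):
    """Map each primary key value to the set of nodes that store its record.

    `key_sets` maps a topic value to its primary key attribute value set;
    `node_of` maps a topic value to a node name. A key that appears in
    several key sets is placed on several nodes.
    """
    placement = {}
    for topic_value, keys in key_sets.items():
        for key in keys:
            placement.setdefault(key, set()).add(node_of[topic_value])
    return placement


# Hypothetical case: Bob appears in the key sets of both companies.
key_sets = {"CorpA": {"Bob", "Tom"}, "CorpB": {"Jerry", "Bob"}}
node_of = {"CorpA": "node_a", "CorpB": "node_b"}
placement = assign_records(key_sets, node_of)
print(sorted(placement["Bob"]))  # ['node_a', 'node_b']
```

Replicating the record trades some storage for the ability to answer a query about either topic value on a single node.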
An embodiment of the present invention further provides a computer storage medium. The computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to perform the above method.
Through the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
This embodiment further provides an apparatus for storing semantic data. The apparatus is used to implement the above embodiments and optional implementations, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Figure 4 is a first structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention. As shown in Figure 4, the apparatus includes:
a selection module 42, configured to select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute whose query frequency in the semantic data exceeds a predetermined threshold, and the primary key attribute is the attribute that logically describes the data in the semantic data;
a calculation module 44, configured to calculate, for each topic attribute value of the topic attribute, the corresponding primary key attribute value set of the primary key attribute;
a first storage module 46, configured to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node;
a second storage module 48, configured to establish, on the node, an attribute table for each attribute stored on the node, and to store the attribute table in a key-value storage manner.
With the above apparatus, the topic attribute and the primary key attribute are used to store the attributes in the semantic data that belong to the same primary key attribute value set on the same node, an attribute table is then established for each attribute stored on that node, and the attribute table is stored in a key-value storage manner. In other words, the semantic data is first partitioned by row and then stored by column within each partition, so that data with high query correlation is kept together. This balances storage space and query efficiency in semantic data storage, which saves storage space and improves query efficiency.
Figure 5 is a second structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention. As shown in Figure 5, the first storage module 46 includes: an establishing unit 52, configured to build a super table over the semantic data according to the primary key attribute; and a storage unit 54, configured to store the records of the super table that belong to the same primary key attribute value set on the same node.
Figure 6 is a third structural block diagram of an apparatus for storing semantic data according to an embodiment of the present invention. As shown in Figure 6, the apparatus further includes: an index module 62, configured to establish, on the node and after the attribute table is stored in the key-value storage manner, an access index in a specified format for a predetermined attribute and the topic attribute stored on the node.
It should be noted that different predetermined attributes correspond to different specified formats, and the same predetermined attribute may also use different specified formats, which may be set according to actual conditions.
With the access index in place, when a user needs to read data, the access index is queried first to check whether the query condition can be satisfied. If it can, the data is read; if it cannot, that part of the data does not need to be read. This effectively reduces join operations between nodes and accesses to irrelevant data, and thus greatly improves query efficiency.
In an embodiment of the present invention, the calculation module 44 may include one of the following: a first calculation unit, configured to take the topic attribute value as the subject and calculate the primary key attribute value set formed by the corresponding objects whose values belong to the primary key attribute; or a second calculation unit, configured to take the topic attribute value as the object and calculate the primary key attribute value set formed by the corresponding subjects whose values belong to the primary key attribute.
In an embodiment of the present invention, the apparatus further includes: a third storage module, configured to store, before the attribute table is established on the node for each attribute stored on the node and when a primary key attribute value belongs to several primary key attribute value sets at the same time, the record corresponding to that primary key attribute value in the super table on several nodes.
It should be noted that each of the above modules may be implemented by software or hardware. In the latter case, this may be done in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in several processors respectively.
The present invention is further explained below with reference to specific embodiments:
An embodiment of the present invention provides an optional method for storing semantic data. The method is a hybrid row-and-column storage method based on a topic and a primary key, and it achieves better results in both storage space and query efficiency. The topic is an attribute that views the semantic data from the perspective of the graph. Because most queries search for subgraphs, attributes with a high query frequency are defined as topics, so that the data in the graph that is relevant to a query is stored together, which effectively improves data access efficiency. The primary key views the semantic data from the perspective of data logic. As described for the three storage schemes above, the logical description of semantic data is also an attribute, so storage based on the primary key makes it convenient to describe the data logically, thereby avoiding join operations and improving query efficiency. This optional embodiment includes the following processes:
(1) Data storage process:
In this optional embodiment, the hybrid row-and-column storage method based on a topic and a primary key proceeds as follows:
Step 1: Select a suitable topic attribute (TopicAttr) and primary key attribute (KeyAttr).
Step 2: For each topic attribute value (topic_i) of the topic attribute, calculate the primary key attribute value set (keySet_i) formed by the objects (or subjects) that correspond to this value as the subject (or object) in the semantic data and whose values belong to the primary key attribute.
Step 3: Build a logical super table over the semantic data with the primary key attribute; one row of the super table is called a record.
Step 4: Records that belong to the same primary key attribute value set (keySet_i) are logically stored on the same node. If a primary key attribute value belongs to several primary key attribute value sets at the same time, the record corresponding to that primary key attribute value is also stored on several nodes.
Step 5: On each node, build one attribute table for each attribute and store it in the <key, value> format.
Step 6: For the pre-specified attributes and the topic attribute, generate an access index in the specified format.
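Steps 3 and 5 above can be sketched together: the records held by one node are rows of the logical super table, and each attribute is then split out into its own <key, value> table. The record layout (a dict per row that lists only the attributes actually present) and the function name are assumptions made for this sketch.

```python
def attribute_tables(records):
    """Split super-table records into one <key, value> table per attribute.

    `records` maps a primary key value to its row, given as a dict of
    attribute -> value that only lists the attributes actually present.
    """
    tables = {}
    for key, row in records.items():
        for attr, value in row.items():
            tables.setdefault(attr, {})[key] = value
    return tables


# Records assumed to have been placed on one node by step 4.
node_records = {
    "Bob": {"company": "CorpA", "department": "Personnel"},
    "Tom": {"company": "CorpA", "department": "Personnel"},
}
tables = attribute_tables(node_records)
print(tables["department"])  # {'Bob': 'Personnel', 'Tom': 'Personnel'}
```

Because only present values are written, the empty cells of the logical super table never occupy storage, while all rows relevant to one topic value remain on the same node.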
(2) Semantic data reading process:
When a user needs to read data, the access index is queried first to check whether the query condition can be satisfied. If it can, the data is read; if it cannot, that part of the data does not need to be read.
In this optional embodiment, storing in the <key, value> format corresponds to the key-value storage manner in the above embodiment.
An embodiment of the present invention further provides another optional method for storing semantic data. This optional embodiment includes the following processes:
(1) Data storage process:
The initial semantic data is shown in Table 1.
Step 1: Select the company as the topic attribute (TopicAttr={CorpA,CorpB}) and the user name as the primary key attribute (KeyAttr={Bob,Jerry,Tom}).
Step 2: For each topic attribute value, calculate its corresponding primary key attribute value set. Taking CorpA as an example, calculate the primary key attribute value set formed by the objects (or subjects) that correspond to CorpA as the subject (or object) in the semantic data and whose values belong to the primary key attribute: keySet_CorpA={Bob,Tom,CorpA}. Similarly, keySet_CorpB={Jerry,CorpB}.
Step 3: Build a logical super table over the semantic data with the primary key attribute, as shown in Table 3.
The results of steps 4 to 6 are shown in Figure 7, Tables 5a to 5g, and Tables 6a to 6f, where Tables 5a to 5g show the attribute tables on node a and Tables 6a to 6f show the attribute tables on node b.
Step 4: Store the records corresponding to CorpA and CorpB on two nodes respectively, that is, the records corresponding to {Bob,Tom,CorpA} are stored on node a and the records corresponding to {Jerry,CorpB} are stored on node b.
Step 5: On each node, build one attribute table for each attribute and store it in the <key, value> format. On node a, <key, value> attribute tables are built for the attributes "company", "department", "monthly salary", "spouse", "gender", "hobby", and "address". On node b, <key, value> attribute tables are built for the attributes "company", "department", "mailbox", "gender", "hobby", and "number of employees".
Step 6: On node a and node b, generate an access index of the minimum and maximum values for the attribute "monthly salary". In the "monthly salary" attribute table on node a, the minimum value is 5800 and the maximum value is also 5800. Node b has no "monthly salary" attribute table, so both the minimum and the maximum are recorded as null (NULL).
Figure PCTCN2016079672-appb-000004
Figure PCTCN2016079672-appb-000005
(2) Data query process:
When a user queries "Who are Bob's department colleagues?", that is, queries "the list of people whose company value is CorpA and whose department value is Personnel", only the department attribute table on node a needs to be queried, and the result is "Tom is Bob's department colleague".
When a user queries "How many people in CorpA have a monthly salary greater than 8000?", the access index of the "monthly salary" attribute table on node a shows that the largest monthly salary is only 5800, that is, no one on the node has a monthly salary greater than 8000, so the query does not need to read the data of the "monthly salary" attribute table.
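The minimum/maximum access index and the pruning decision in the salary query can be sketched as follows. The index values (5800 on node a, NULL on node b) follow the example above; attributing the 5800 salary to Bob, and the function names, are assumptions made for this sketch.

```python
def build_min_max_index(attribute_table):
    """Return (min, max) over an attribute table, or (None, None) if it is empty or absent."""
    if not attribute_table:
        return (None, None)
    values = attribute_table.values()
    return (min(values), max(values))


def may_contain_greater_than(index, bound):
    """False means the node can be skipped without reading the attribute table."""
    _, largest = index
    return largest is not None and largest > bound


# Node a holds one salary of 5800 (its owner is an assumption); node b has no salary table.
salary_node_a = {"Bob": 5800}
index_a = build_min_max_index(salary_node_a)
index_b = build_min_max_index(None)

print(index_a, index_b)                          # (5800, 5800) (None, None)
print(may_contain_greater_than(index_a, 8000))   # False
```

A False result only proves that no matching value exists on the node; a True result still requires the attribute table to be read, since the index records the range and not the individual values.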
In this optional embodiment, storing in <key, value> format corresponds to the key-value storage manner of the foregoing embodiments.
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1: select a topic attribute and a primary key attribute in the semantic data, where the topic attribute is an attribute of the semantic data whose query frequency exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data;
S2: calculate, for each topic attribute value of the topic attribute, the corresponding set of primary key attribute values of the primary key attribute;
S3: store the attributes of the semantic data that belong to the same set of primary key attribute values on the same node;
S4: establish, on the node, an attribute table for each attribute stored on the node, and store the attribute table in a key-value storage manner.
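Steps S1 to S4 can be sketched end to end under several assumptions that the description does not specify. The semantic data is assumed to be already arranged as a table keyed by primary key value, query frequencies are assumed to be available as counts, and key sets are assigned to nodes round-robin. All function names, the sample data, and the placement policy are hypothetical.

```python
from collections import defaultdict


def select_topic_attributes(query_counts, threshold):
    """S1: attributes whose query frequency exceeds the threshold."""
    return [attr for attr, count in query_counts.items() if count > threshold]


def group_keys_by_topic(super_table, topic_attr):
    """S2: map each topic attribute value to the set of primary key values having it."""
    groups = defaultdict(set)
    for key, attrs in super_table.items():
        if topic_attr in attrs:
            groups[attrs[topic_attr]].add(key)
    return dict(groups)


def place_on_nodes(super_table, groups, node_names):
    """S3: keep all records of one key set on one node (round-robin is an assumption)."""
    nodes = {name: {} for name in node_names}
    for i, topic_value in enumerate(sorted(groups)):
        target = nodes[node_names[i % len(node_names)]]
        for key in groups[topic_value]:
            target[key] = super_table[key]
    return nodes


def build_attribute_tables(node_records):
    """S4: one key-value table per attribute stored on the node."""
    tables = defaultdict(dict)
    for key, attrs in node_records.items():
        for attr, value in attrs.items():
            tables[attr][key] = value
    return dict(tables)


# Hypothetical sample data.
super_table = {
    "Bob": {"company": "CorpA", "gender": "male"},
    "Tom": {"company": "CorpA"},
    "Ann": {"company": "CorpB"},
}
topic = select_topic_attributes({"company": 120, "gender": 3}, 50)
groups = group_keys_by_topic(super_table, topic[0])
nodes = place_on_nodes(super_table, groups, ["a", "b"])
tables = {name: build_attribute_tables(records) for name, records in nodes.items()}
```

A real deployment would likely balance placement by data volume or query load. Round-robin is used here only to keep the sketch short.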
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium capable of storing program code.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations; details are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described here, or the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above descriptions are merely optional embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Industrial Applicability
The above technical solution balances storage space and query efficiency in the semantic data storage method, thereby saving storage space and improving query efficiency.

Claims (10)

  1. A method for storing semantic data, comprising:
    selecting a topic attribute and a primary key attribute in semantic data, wherein the topic attribute is an attribute of the semantic data whose query frequency exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data;
    calculating, for each topic attribute value of the topic attribute, a corresponding set of primary key attribute values of the primary key attribute;
    storing attributes of the semantic data that belong to the same set of primary key attribute values on the same node; and
    establishing, on the node, an attribute table for each attribute stored on the node, and storing the attribute table in a key-value storage manner.
  2. The method according to claim 1, wherein storing the attributes of the semantic data that belong to the same set of primary key attribute values on the same node comprises:
    establishing a super table for the semantic data according to the primary key attribute; and
    storing records of the super table that belong to the same set of primary key attribute values on the same node.
  3. The method according to claim 1, further comprising:
    after storing the attribute table in the key-value storage manner, establishing on the node, in a specified format, an access index for a predetermined attribute stored on the node and for the topic attribute.
  4. The method according to claim 1, wherein calculating the set of primary key attribute values of the primary key attribute corresponding to each attribute value of the topic attribute comprises one of the following:
    calculating the set of primary key attribute values formed by those attribute values that belong to the primary key attribute among the objects corresponding to the topic attribute value taken as subject;
    calculating the set of primary key attribute values formed by those attribute values that belong to the primary key attribute among the subjects corresponding to the topic attribute value taken as object.
  5. The method according to claim 2, further comprising:
    before establishing, on the node, an attribute table for each attribute stored on the node, in a case where a primary key attribute value belongs to multiple sets of primary key attribute values at the same time, storing the record corresponding to that primary key attribute value in the super table on multiple nodes.
  6. An apparatus for storing semantic data, comprising:
    a selection module, configured to select a topic attribute and a primary key attribute in semantic data, wherein the topic attribute is an attribute of the semantic data whose query frequency exceeds a predetermined threshold, and the primary key attribute is an attribute that logically describes the data in the semantic data;
    a calculation module, configured to calculate, for each topic attribute value of the topic attribute, a corresponding set of primary key attribute values of the primary key attribute;
    a first storage module, configured to store attributes of the semantic data that belong to the same set of primary key attribute values on the same node; and
    a second storage module, configured to establish, on the node, an attribute table for each attribute stored on the node, and to store the attribute table in a key-value storage manner.
  7. The apparatus according to claim 6, wherein the first storage module comprises:
    an establishing unit, configured to establish a super table for the semantic data according to the primary key attribute; and
    a storage unit, configured to store records of the super table that belong to the same set of primary key attribute values on the same node.
  8. The apparatus according to claim 6, further comprising:
    an index module, configured to establish on the node, in a specified format, an access index for a predetermined attribute stored on the node and for the topic attribute, after the attribute table has been stored in the key-value storage manner.
  9. The apparatus according to claim 6, wherein the calculation module comprises one of the following:
    a first calculation unit, configured to calculate the set of primary key attribute values formed by those attribute values that belong to the primary key attribute among the objects corresponding to the topic attribute value taken as subject;
    a second calculation unit, configured to calculate the set of primary key attribute values formed by those attribute values that belong to the primary key attribute among the subjects corresponding to the topic attribute value taken as object.
  10. The apparatus according to claim 7, further comprising:
    a third storage module, configured to, before an attribute table is established on the node for each attribute stored on the node, in a case where a primary key attribute value belongs to multiple sets of primary key attribute values at the same time, store the record corresponding to that primary key attribute value in the super table on multiple nodes.
PCT/CN2016/079672 2015-07-01 2016-04-19 Semantic data storage method and apparatus WO2016180186A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510379367.3A CN106326295B (en) 2015-07-01 2015-07-01 Semantic data storage method and device
CN201510379367.3 2015-07-01

Publications (1)

Publication Number Publication Date
WO2016180186A1 (en) 2016-11-17

Family

ID=57247767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/079672 WO2016180186A1 (en) 2015-07-01 2016-04-19 Semantic data storage method and apparatus

Country Status (2)

Country Link
CN (1) CN106326295B (en)
WO (1) WO2016180186A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489643A (en) * 2020-10-27 2021-03-12 广东美的白色家电技术创新中心有限公司 Conversion method, conversion table generation device and computer storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399919A (en) * 2017-02-06 2018-08-14 中兴通讯股份有限公司 A kind of method for recognizing semantics and device
CN110489417B (en) * 2019-07-25 2023-03-28 深圳壹账通智能科技有限公司 Data processing method and related equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101282313A (en) * 2008-05-22 2008-10-08 北京航空航天大学 Electronic mail system for electric conference accessory system
CN102033956A (en) * 2010-12-27 2011-04-27 陆嘉恒 Graphical XML content and structure query system with intelligent prompt function
CN102184239A (en) * 2011-05-16 2011-09-14 复旦大学 Access probability based document fragmenting method in XML (Extensive Makeup Language) radio data broadcast mode
US20130080476A1 (en) * 2011-09-22 2013-03-28 Fuji Xerox Co., Ltd. Search apparatus, search method, and computer readable medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7664742B2 (en) * 2005-11-14 2010-02-16 Pettovello Primo M Index data structure for a peer-to-peer network
EP2631817A1 (en) * 2012-02-23 2013-08-28 Fujitsu Limited Database, apparatus, and method for storing encoded triples
US9378263B2 (en) * 2012-06-19 2016-06-28 Salesforce.Com, Inc. Method and system for creating indices and loading key-value pairs for NoSQL databases
CN103412883B (en) * 2013-07-17 2016-09-28 中国人民解放军国防科学技术大学 Semantic intelligent information distribution subscription method based on P2P technology
CN103577538A (en) * 2013-09-29 2014-02-12 柳州市宏亿科技有限公司 Key value data query method based on internet

Also Published As

Publication number Publication date
CN106326295B (en) 2021-12-14
CN106326295A (en) 2017-01-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16792036

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16792036

Country of ref document: EP

Kind code of ref document: A1