CN107818117A - Method for establishing data tables, online query method, and related apparatus - Google Patents

Info

Publication number
CN107818117A
Authority
CN
China
Prior art keywords
data
node
tables
relation
physical cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610826949.6A
Other languages
Chinese (zh)
Other versions
CN107818117B (en
Inventor
王平
孙权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610826949.6A
Publication of CN107818117A
Application granted
Publication of CN107818117B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a method for establishing data tables, an online query method, and a related apparatus. The establishing method includes: establishing a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node; and establishing a second data table on the same physical cluster, the second data table being used for online storage of attribute information of the second node; wherein the first node and the second node belong to different node types. In the embodiments of the present application, node types are not distinguished when data tables are established. Therefore, even if the first node and the second node have different node types, that is, the first data table and the second data table correspond to different node types, both tables are still established on the same physical cluster. As a result, an online query does not need to access multiple physical clusters, which improves online query speed.

Description

Method for establishing data tables, online query method, and related apparatus
Technical field
The present application relates to the technical field of online storage, and in particular to a method for establishing data tables, an online query method, and a related apparatus.
Background
With the continuous development of Internet technology, relational network data is growing both in variety and in order of magnitude. For example, the relational network data shown in Fig. 1 includes two types of nodes: user nodes (user i and user j) and commodity nodes (commodity 1 and commodity 2), where user j is a friend of user i, and user j has purchased commodity 1 and commodity 2. How to store such relational network data using online storage technology has become a matter of increasing concern.
At present, such data is usually stored online by establishing data tables. For example, Facebook uses the Unicorn architecture to establish data tables. In the Unicorn architecture, each node type corresponds to a separate physical cluster; that is, the data table of each node type is established in its corresponding physical cluster. For example, as shown in Fig. 2, the data table of the user node type is established in the user physical cluster and stores the attribute information, friend lists, commodity purchase lists, and so on of user nodes, while the data table of the commodity node type is established in the commodity physical cluster and stores the attribute information and so on of commodity nodes.
It can be seen that when data tables are established using the Unicorn architecture, a single online query that involves data of different node types needs to access multiple physical clusters. For example, to query online the commodities purchased by user i's friends, the user cluster must be accessed first to find that user i's friend is user j and that the list of commodities purchased by user j is {i, j}; the commodity cluster is then accessed to look up the attribute information of commodities i and j as the final online query result. Clearly, this online query process needs to access at least two physical clusters, which makes the online query slow.
Summary of the invention
The technical problem addressed by the present application is to provide a method for establishing data tables, an online query method, and a related apparatus that establish data tables under a more reasonable architecture, so that an online query does not need to access multiple physical clusters and online query speed is improved.
To this end, the technical solutions by which the present application solves the technical problem are as follows:
The present application provides a method for establishing data tables, including:
Establishing a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node;
Establishing a second data table on the physical cluster, the second data table being used for online storage of attribute information of the second node;
Wherein the first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry, the first index entry being used for online storage of the identifier of the first node, and the second index entry being used for online storage of the identifier of the second node corresponding to the first node;
The second data table includes a third index entry and a first attribute item, the third index entry being used for online storage of the identifier of the second node, and the first attribute item being used for online storage of the attribute information of the second node.
Optionally, the first data table further includes a second attribute item, the second attribute item being used for online storage of attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, wherein the first index entry is the primary key, the second index entry is the secondary key, and the second attribute item is the value;
The second data table has a key-value structure, wherein the third index entry is the key and the first attribute item is the value.
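As a rough illustration of the two layouts just described, the sketch below models the first data table as a key-key-value mapping and the second data table as a key-value mapping. The Python dictionaries, the identifiers (`user_j`, `commodity_1`), the attribute fields, and the helper names are hypothetical stand-ins chosen for this example; they are not the storage format of the application.

```python
# Illustrative in-memory model only; all names and sample values are assumptions.

# First data table (key-key-value):
#   primary key   = identifier of the first node (first index entry)
#   secondary key = identifier of the second node (second index entry)
#   value         = attributes of the relation (second attribute item)
first_table = {
    "user_j": {
        "commodity_1": {"purchase_time": "t1"},
        "commodity_2": {"purchase_time": "t2"},
    },
}

# Second data table (key-value):
#   key   = identifier of the second node (third index entry)
#   value = attributes of the second node (first attribute item)
second_table = {
    "commodity_1": {"price": 10.0, "category": "books"},
    "commodity_2": {"price": 25.5, "category": "toys"},
}


def related_ids(first_node_id):
    """Return the second-node identifiers stored under a first-node identifier."""
    return sorted(first_table.get(first_node_id, {}))


def relation_attrs(first_node_id, second_node_id):
    """Return the relation attributes for one (first node, second node) pair."""
    return first_table[first_node_id][second_node_id]
```

Keeping the second-node identifier as a secondary key means the relation attributes of a single pair can be read without scanning every relation of the first node.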
Optionally, the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing method further includes:
Determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the third index entry;
Establishing the first data table on a physical cluster includes: establishing the first data table on the first partition of the physical cluster;
Establishing the second data table on the physical cluster includes: establishing the second data table on the second partition of the physical cluster.
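The partition selection above might be sketched as follows. The text only states that a partition is determined from an index entry; the hash-then-modulo rule, the value of `N_PARTITIONS`, and the function names below are assumptions made for illustration.

```python
import hashlib

N_PARTITIONS = 4  # hypothetical N; the text only requires N >= 2


def partition_for(index_value, n=N_PARTITIONS):
    """Map an index value to a partition number in [0, n).

    Hash-then-modulo is an assumed rule, not one stated in the text.
    """
    digest = hashlib.sha256(index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


# One physical cluster with N partitions; each partition can hold rows of
# both data tables.
cluster = [{"first_table": {}, "second_table": {}} for _ in range(N_PARTITIONS)]


def put_relation(first_id, second_id, attrs):
    # First partition: chosen from the first index entry (first-node identifier).
    p = partition_for(first_id)
    cluster[p]["first_table"].setdefault(first_id, {})[second_id] = attrs
    return p


def put_node_attrs(second_id, attrs):
    # Second partition: chosen from the third index entry (second-node identifier).
    p = partition_for(second_id)
    cluster[p]["second_table"][second_id] = attrs
    return p
```

Because the partition depends only on the index value, a later reader can recompute it from the same identifier and go straight to the right partition.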
Optionally, the first partition and the second partition each include M backup regions, M being greater than or equal to 2;
Establishing the first data table on the first partition of the physical cluster includes: establishing the first data table on each of the M backup regions of the first partition of the physical cluster;
Establishing the second data table on the second partition of the physical cluster includes: establishing the second data table on each of the M backup regions of the second partition of the physical cluster.
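A minimal sketch of establishing each table on all M backup regions of a partition is given below. The value of `M_REPLICAS`, the data layout, and the choice of which backup region serves a read are assumptions; the text does not specify how the backup regions are used at query time.

```python
M_REPLICAS = 2  # hypothetical M; the text only requires M >= 2


def make_partition(m=M_REPLICAS):
    """Model a partition as M independent backup regions."""
    return [{"first_table": {}, "second_table": {}} for _ in range(m)]


def write_relation(partition, first_id, second_id, attrs):
    # Establish the same first-table row on every backup region.
    for region in partition:
        region["first_table"].setdefault(first_id, {})[second_id] = dict(attrs)


def write_node_attrs(partition, second_id, attrs):
    # Establish the same second-table row on every backup region.
    for region in partition:
        region["second_table"][second_id] = dict(attrs)


def read_relation(partition, first_id, region_index=0):
    # Any backup region holds the same rows; picking one by index is an assumption.
    return partition[region_index]["first_table"].get(first_id, {})
```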
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
Or, the method further includes: establishing a third data table on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node.
The present application provides an online query method. A first data table and a second data table are established on a physical cluster; the first data table is used for online storage of relation information between a first node and a second node; the second data table is used for online storage of attribute information of the second node; the first node and the second node belong to different node types. The method includes:
Receiving an online query request, the online query request being used to indicate an online query for the relation data of the first node;
Accessing the first data table and the second data table in the physical cluster online, and querying the relation data of the first node;
Wherein querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
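The two-step lookup described above can be sketched as follows, assuming both tables are plain dictionaries held on the same cluster. The function name and the sample data are hypothetical.

```python
def query_relation_data(first_table, second_table, first_node_id):
    """Return {second_node_id: attributes} for one first node.

    Step 1: find the second nodes related to the first node (first data table).
    Step 2: fetch each second node's attributes (second data table).
    """
    second_ids = first_table.get(first_node_id, {})
    return {sid: second_table.get(sid, {}) for sid in second_ids}


# Hypothetical sample data: user j purchased commodity 1 and commodity 2.
first_table = {"user_j": {"commodity_1": {}, "commodity_2": {}}}
second_table = {"commodity_1": {"price": 10.0}, "commodity_2": {"price": 25.5}}

result = query_relation_data(first_table, second_table, "user_j")
```

Both steps read tables on one physical cluster, so no second cluster is involved in answering the request.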
Optionally, the first data table includes a first index entry and a second index entry, the first index entry being used for online storage of the identifier of the first node, and the second index entry being used for online storage of the identifier of the second node corresponding to the first node;
The second data table includes a third index entry and a first attribute item, the third index entry being used for online storage of the identifier of the second node, and the first attribute item being used for online storage of the attribute information of the second node.
Optionally, the physical cluster includes N storage partitions, N being greater than or equal to 2; the method further includes:
Determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the second index entry;
Accessing the first data table and the second data table in the physical cluster online includes: accessing online the first data table on the first partition of the physical cluster and the second data table on the second partition.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query for the relation data of the third node;
The method further includes:
Accessing the first data table and the second data table in the physical cluster online, and querying the relation data of the third node;
Wherein querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node, and querying, from the second data table, the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator; the method further includes:
Parsing a first instruction, a second instruction, and a third instruction from the first composite operator, the first instruction being used to indicate an online query for the relation data of the first node, the second instruction being used to indicate an online query for the relation data of the third node, and the third instruction being used to indicate combined processing of the relation data;
Querying the relation data of the first node includes: querying the relation data of the first node based on the first instruction;
Querying the relation data of the third node includes: querying the relation data of the third node based on the second instruction;
The method further includes: performing, based on the third instruction, combined processing on the relation data of the first node and the relation data of the third node.
Optionally, the combined processing includes any one of the following: a union operation, an intersection operation, and a difference operation.
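One way to picture the first composite operator is as a small tuple that is split into three instructions. The tuple format, the operation names in `COMBINE`, and the sample data below are assumptions; the text names the three kinds of combined processing but gives no concrete syntax.

```python
# Hypothetical operation names for the three kinds of combined processing.
COMBINE = {
    "union": lambda a, b: a | b,
    "intersection": lambda a, b: a & b,
    "difference": lambda a, b: a - b,
}


def parse_composite(operator):
    """Split a composite operator into three instructions.

    The (kind, first_node_id, third_node_id) tuple format is an assumption.
    """
    kind, first_id, third_id = operator
    return ("query", first_id), ("query", third_id), ("combine", kind)


def run_composite(first_table, operator):
    instr1, instr2, instr3 = parse_composite(operator)
    related_to_first = set(first_table.get(instr1[1], {}))  # first instruction
    related_to_third = set(first_table.get(instr2[1], {}))  # second instruction
    return COMBINE[instr3[1]](related_to_first, related_to_third)  # third instruction


# Hypothetical sample data: two nodes of the same type and what each purchased.
first_table = {
    "user_i": {"commodity_1": {}, "commodity_3": {}},
    "user_j": {"commodity_1": {}, "commodity_2": {}},
}
```

For example, `run_composite(first_table, ("intersection", "user_i", "user_j"))` yields the commodities that both users purchased.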
Optionally, a third data table is further established on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node; the online query request is used to indicate an online query for the relation data of the fifth node;
Accessing the first data table and the second data table in the physical cluster online and querying the relation data of the first node includes: accessing the first data table, the second data table, and the third data table in the physical cluster online, and querying the relation data of the fifth node;
Querying the relation data of the fifth node includes: querying, from the third data table, the first node corresponding to the fifth node, and querying the relation data of the first node.
Optionally, the online query request includes a second composite operator; the method further includes:
Parsing a fourth instruction and a fifth instruction from the first composite operator, the fourth instruction being used to indicate an online query for the node corresponding to the fifth node, and the fifth instruction being used to indicate an online query for the relation data of the node corresponding to the fifth node;
Querying, from the third data table, the first node corresponding to the fifth node includes: querying, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
Querying the relation data of the first node includes: querying the relation data of the first node based on the fifth instruction.
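The chained lookup above (fifth node, then the corresponding first node, then that node's relation data) can be sketched as a nested loop over three dictionaries. The function name, the sample data, and the `closeness` field are hypothetical; the friend and purchase relations follow the example of Fig. 1.

```python
def chained_query(third_table, first_table, second_table, fifth_node_id):
    """Return the relation data of the first nodes that correspond to a fifth node."""
    result = {}
    # Fourth instruction: fifth node -> corresponding first nodes (third data table).
    for first_id in third_table.get(fifth_node_id, {}):
        # Fifth instruction: relation data of each such first node
        # (first data table for identifiers, second data table for attributes).
        for second_id in first_table.get(first_id, {}):
            result[second_id] = second_table.get(second_id, {})
    return result


third_table = {"user_i": {"user_j": {"closeness": 0.8}}}          # friend relation
first_table = {"user_j": {"commodity_1": {}, "commodity_2": {}}}  # purchase relation
second_table = {"commodity_1": {"price": 10.0}, "commodity_2": {"price": 25.5}}

friends_purchases = chained_query(third_table, first_table, second_table, "user_i")
```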
The present application provides an apparatus for establishing data tables, including:
A first establishing unit, configured to establish a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node;
A second establishing unit, configured to establish a second data table on the physical cluster, the second data table being used for online storage of attribute information of the second node;
Wherein the first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry, the first index entry being used for online storage of the identifier of the first node, and the second index entry being used for online storage of the identifier of the second node corresponding to the first node;
The second data table includes a third index entry and a first attribute item, the third index entry being used for online storage of the identifier of the second node, and the first attribute item being used for online storage of the attribute information of the second node.
Optionally, the first data table further includes a second attribute item, the second attribute item being used for online storage of attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, wherein the first index entry is the primary key, the second index entry is the secondary key, and the second attribute item is the value;
The second data table has a key-value structure, wherein the third index entry is the key and the first attribute item is the value.
Optionally, the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing apparatus further includes:
A determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry;
When establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster;
When establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
Optionally, the first partition and the second partition each include M backup regions, M being greater than or equal to 2;
When establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M backup regions of the first partition of the physical cluster;
When establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M backup regions of the second partition of the physical cluster.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
Or, the establishing apparatus further includes: a third establishing unit, configured to establish a third data table on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node.
The present application provides an online query apparatus. A first data table and a second data table are established on a physical cluster; the first data table is used for online storage of relation information between a first node and a second node; the second data table is used for online storage of attribute information of the second node; the first node and the second node belong to different node types. The apparatus includes:
A receiving unit, configured to receive an online query request, the online query request being used to indicate an online query for the relation data of the first node;
A query unit, configured to access the first data table and the second data table in the physical cluster online and to query the relation data of the first node;
Wherein querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
Optionally, the first data table includes a first index entry and a second index entry, the first index entry being used for online storage of the identifier of the first node, and the second index entry being used for online storage of the identifier of the second node corresponding to the first node;
The second data table includes a third index entry and a first attribute item, the third index entry being used for online storage of the identifier of the second node, and the first attribute item being used for online storage of the attribute information of the second node.
Optionally, the physical cluster includes N storage partitions, N being greater than or equal to 2; the apparatus further includes:
A determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry;
When accessing the first data table and the second data table in the physical cluster online, the query unit is specifically configured to access online the first data table on the first partition of the physical cluster and the second data table on the second partition.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query for the relation data of the third node;
The query unit is further configured to access the first data table and the second data table in the physical cluster online and to query the relation data of the third node;
Wherein querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node, and querying, from the second data table, the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator; the apparatus further includes:
A parsing unit, configured to parse a first instruction, a second instruction, and a third instruction from the first composite operator, the first instruction being used to indicate an online query for the relation data of the first node, the second instruction being used to indicate an online query for the relation data of the third node, and the third instruction being used to indicate combined processing of the relation data;
When querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the first instruction;
When querying the relation data of the third node, the query unit is specifically configured to query the relation data of the third node based on the second instruction;
The apparatus further includes: a processing unit, configured to perform, based on the third instruction, combined processing on the relation data of the first node and the relation data of the third node.
Optionally, the combined processing includes any one of the following: a union operation, an intersection operation, and a difference operation.
Optionally, a third data table is further established on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node; the online query request is used to indicate an online query for the relation data of the fifth node;
When accessing the first data table and the second data table in the physical cluster online and querying the relation data of the first node, the query unit is specifically configured to access the first data table, the second data table, and the third data table in the physical cluster online and to query the relation data of the fifth node;
When querying the relation data of the fifth node, the query unit is specifically configured to query, from the third data table, the first node corresponding to the fifth node, and to query the relation data of the first node.
Optionally, the online query request includes a second composite operator; the apparatus further includes:
A parsing unit, configured to parse a fourth instruction and a fifth instruction from the first composite operator, the fourth instruction being used to indicate an online query for the node corresponding to the fifth node, and the fifth instruction being used to indicate an online query for the relation data of the node corresponding to the fifth node;
When querying, from the third data table, the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
When querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the fifth instruction.
As can be seen from the above technical solutions, in the embodiments of the present application, node types are not distinguished when data tables are established. Therefore, even if the first node and the second node have different node types, that is, the first data table and the second data table correspond to different node types, the first data table and the second data table are still established on the same physical cluster. As a result, an online query does not need to access multiple physical clusters, which improves online query speed.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them.
Fig. 1 is a schematic structural diagram of an attribute graph;
Fig. 2 is a schematic diagram of the Unicorn architecture;
Fig. 3 is a schematic flowchart of a method embodiment provided by the present application;
Fig. 4 is a schematic structural diagram of an aggregated architecture provided by the present application;
Fig. 5 is a schematic flowchart of another method embodiment provided by the present application;
Fig. 6 is a schematic structural diagram of an apparatus embodiment provided by the present application;
Fig. 7 is a schematic structural diagram of another apparatus embodiment provided by the present application.
Detailed description
Relational network data refers to data, among the data generated on the Internet, in which correspondences exist. For example, data representing purchase relations between users and commodities and data representing friend relations between users can both constitute relational network data.
Relational network data can be described by an attribute graph. For example, the attribute graph shown in Fig. 1 includes two types of nodes: user nodes (user i and user j) and commodity nodes (commodity 1 and commodity 2), where user j is a friend of user i, and user j has purchased commodity 1 and commodity 2. In addition, the nodes in an attribute graph, and the relations between nodes, generally carry attribute information. For example, the attribute information of user i includes user information such as user i's name, age, and sex; the attribute information of commodity 1 includes commodity information such as the description, price, and category of commodity 1; there is friend attribute information between user i and user j, such as a friend relation value; and there is purchase attribute information between user j and commodities 1 and 2, such as the purchase time.
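The attribute graph of Fig. 1 might be modelled as in the sketch below. The `Node` and `Edge` classes, the field names, and every sample attribute value are hypothetical; only the node types and the friend and purchase relations come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    node_type: str                 # "user" or "commodity"
    attrs: dict = field(default_factory=dict)


@dataclass
class Edge:
    src: str
    dst: str
    relation: str                  # "friend" or "purchase"
    attrs: dict = field(default_factory=dict)


# Sample attribute values below are made up for illustration.
nodes = {
    "user_i": Node("user_i", "user", {"name": "I", "age": 30}),
    "user_j": Node("user_j", "user", {"name": "J", "age": 28}),
    "commodity_1": Node("commodity_1", "commodity", {"price": 10.0}),
    "commodity_2": Node("commodity_2", "commodity", {"price": 25.5}),
}

edges = [
    Edge("user_i", "user_j", "friend", {"friend_relation_value": 0.8}),
    Edge("user_j", "commodity_1", "purchase", {"purchase_time": "t1"}),
    Edge("user_j", "commodity_2", "purchase", {"purchase_time": "t2"}),
]


def neighbors(node_id, relation):
    """Return the targets of all edges of one relation leaving a node."""
    return [e.dst for e in edges if e.src == node_id and e.relation == relation]
```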
In one online storage architecture, the Unicorn architecture, each node type corresponds to a separate physical cluster. For example, as shown in Fig. 2, data table 1 and data table 2 of the user node type are established in the user physical cluster: data table 1 stores the friend lists of user i and user j, and data table 2 stores the commodity purchase lists of user i and user j. Data table 3 of the commodity node type is established in the commodity physical cluster and stores the attribute information and so on of commodities 1 and 2.
To query online the commodities purchased by user i's friends, the user cluster must be accessed first: data table 1 is queried to find that user i's friend is user j, and data table 2 is queried to find that the list of commodities purchased by user j is {i, j}. The commodity cluster is then accessed, and the attribute information of commodities i and j is queried from data table 3 as the final online query result. Clearly, this online query process needs to access at least two physical clusters, the user cluster and the commodity cluster. Moreover, because parameters such as configuration attributes differ between physical clusters, access is slow, which in turn makes the online query slow.
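To make the cost of the per-type layout concrete, the sketch below replays the query just described against two separate dictionaries standing in for the user cluster and the commodity cluster, and records which clusters are touched. It is a toy model under assumed names and sample data, not a description of how Unicorn itself is implemented.

```python
def query_per_type_clusters(user_cluster, commodity_cluster, user_id):
    """Find the commodities bought by a user's friends when each node type
    has its own cluster. Returns (result, list of clusters accessed)."""
    accessed = []

    # Step 1: user cluster, data table 1 (friend lists) and data table 2
    # (commodity purchase lists).
    accessed.append("user_cluster")
    friends = user_cluster["friend_lists"].get(user_id, [])
    commodity_ids = []
    for friend in friends:
        commodity_ids.extend(user_cluster["purchase_lists"].get(friend, []))

    # Step 2: commodity cluster, data table 3 (commodity attributes).
    accessed.append("commodity_cluster")
    attrs = commodity_cluster["commodity_attrs"]
    result = {cid: attrs.get(cid, {}) for cid in commodity_ids}
    return result, accessed


# Hypothetical sample data.
user_cluster = {
    "friend_lists": {"user_i": ["user_j"]},
    "purchase_lists": {"user_j": ["commodity_1", "commodity_2"]},
}
commodity_cluster = {
    "commodity_attrs": {"commodity_1": {"price": 10.0}, "commodity_2": {"price": 25.5}},
}

result, accessed = query_per_type_clusters(user_cluster, commodity_cluster, "user_i")
```

In this model a single request always touches both clusters, which is the overhead the single-cluster layout is meant to avoid.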
It can be seen that unicorn frameworks are establishing tables of data time zone merogenesis vertex type, therefore the tables of data of different node types Need to establish on different physical clusters.Obviously, it is more to be not suitable for node type for this mode, or frequently increases node newly Under the scene of type.For example, when node type is more, it is necessary to more physical cluster;And when frequent newly-increased node type, Often increase a node type newly, it is necessary to which an independent physical cluster, this obviously can cause higher cost.
The embodiments of the present application provide a method for establishing data tables, an online query method and related apparatuses. Node types are no longer distinguished when data tables are established, and data tables of different node types are established on the same physical cluster. Data tables are thus established under a more reasonable architecture, so that multiple physical clusters need not be accessed during an online query and the online query speed is improved.
To enable those skilled in the art to better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 3, an embodiment of the present application provides a method embodiment of the method for establishing data tables. The method of this embodiment includes:
S301: Establish a first data table on a physical cluster, the first data table being used to store online the relation information of a first node and a second node.
In the embodiments of the present application, the first node and the second node belong to different node types. For example, the first node may be a user node and the second node may be a commodity node, in which case the first data table stores the relation information of user nodes and commodity nodes, indicating which commodities each user has purchased.
S302: Establish a second data table on the physical cluster, the second data table being used to store online the attribute information of the second node.
For example, if the second node is a commodity node, the second data table stores the attribute information of commodity nodes.
As can be seen from the above technical solution, in the embodiments of the present invention node types are no longer distinguished when data tables are established. Therefore, even if the first node and the second node have different node types, that is, even if the first data table and the second data table correspond to different node types, the first data table and the second data table are still stored on the same physical cluster. Multiple physical clusters therefore need not be accessed during an online query, which improves the online query speed. For example, if the first node is a user node and the second node is a commodity node, then to query online the commodities purchased by user i, only the one physical cluster needs to be accessed: commodity 1 corresponding to user i is queried from the first data table, and the attribute information of commodity 1 is queried from the second data table. Likewise, commodity 2 corresponding to user i can be queried from the first data table, and the attribute information of commodity 2 from the second data table.
In addition, a large number of node types does not require a large number of physical clusters, and adding a node type does not require deploying a new physical cluster. The data table of the new node type can instead be established on the same physical cluster, which saves cost.
The first data table and the second data table may each be represented by index entries and attribute items, as described below.
The first data table may include a first index entry and a second index entry. The first index entry is used to store online the identifier of the first node, and the second index entry is used to store online the identifier of the second node corresponding to the first node. The first data table may also include a second attribute item, which is used to store online the attribute information of the correspondence between the first node and the second node.
For example, as shown in Table 1, the first data table may have a key-key-value structure, in which the first index entry is the primary key and stores the identifier of user j, the second index entry is the secondary key and stores the identifiers of the commodities purchased by user j, and the second attribute item is the value and stores the time at which user j purchased each commodity.
Table 1
Primary key | Secondary key | value
User j | Commodity 1 | 2016.5.3
User j | Commodity 2 | 2016.7.9
When the data in Table 1 is queried online, the secondary key and/or the value can be queried by primary key, or the value can be queried by primary key and secondary key. For example, querying the list of commodities purchased by user j returns commodity 1 and commodity 2.
The second data table may include a third index entry and a first attribute item. The third index entry is used to store online the identifier of the second node, and the first attribute item is used to store online the attribute information of the second node.
For example, as shown in Table 2, the second data table may have a key-value structure, in which the third index entry is the key and stores the identifier of a commodity, and the first attribute item is the value and stores the attribute information of that commodity, such as the description, price and category of commodity 1.
Table 2
key | value
Commodity 1 | Attribute information of commodity 1
Commodity 2 | Attribute information of commodity 2
When the data in Table 2 is queried online, the value can be queried by key. For example, the attribute information of commodity 1 can be queried.
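The two table structures can be illustrated with a minimal Python sketch. It only mirrors the contents of Table 1 and Table 2 with in-memory dictionaries; the identifiers (`purchase_table`, `commodity_table`, `secondary_keys`, `value_of`) and the lowercase node identifiers are assumptions made for illustration and do not come from the application.

```python
# Hypothetical in-memory stand-ins for Table 1 and Table 2 (illustration only).

# Table 1, key-key-value: primary key -> secondary key -> value
purchase_table = {
    "user_j": {"commodity_1": "2016.5.3", "commodity_2": "2016.7.9"},
}

# Table 2, key-value: key -> value
commodity_table = {
    "commodity_1": "attribute information of commodity 1",
    "commodity_2": "attribute information of commodity 2",
}


def secondary_keys(table, primary_key):
    """Query the secondary keys stored under a primary key."""
    return sorted(table.get(primary_key, {}))


def value_of(table, primary_key, secondary_key):
    """Query the value stored under a (primary key, secondary key) pair."""
    return table.get(primary_key, {}).get(secondary_key)


purchased = secondary_keys(purchase_table, "user_j")
purchase_time = value_of(purchase_table, "user_j", "commodity_1")
attributes = commodity_table.get("commodity_1")
```

Here `purchased` holds the commodities bought by user j, `purchase_time` the value stored for one (user, commodity) pair, and `attributes` the value stored under a Table 2 key.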
In the embodiments of the present application, the physical cluster may store further data in addition to the data above, as described below.
In addition to the correspondence between the first node and the second node, the first data table may also store online the correspondence between a third node and a fourth node, where the third node belongs to the same node type as the first node and the fourth node belongs to the same node type as the second node. That is, the first data table can be used to store online the correspondences between nodes of a first type and nodes of a second type. For example, as shown in Table 3, the first data table also stores the identifier of the commodity purchased by user i and the time at which user i purchased it.
Table 3
Primary key | Secondary key | value
User j | Commodity 1 | 2016.5.3
User j | Commodity 2 | 2016.7.9
User i | Commodity 3 | 2016.1.11
In addition to the attribute information of the second node, the second data table may also store online the attribute information of other nodes, such as the fourth node. That is, the second data table can be used to store online the attribute information of nodes of the second type. For example, as shown in Table 4, the second data table also stores the attribute information of commodity 3.
Table 4
key | value
Commodity 1 | Attribute information of commodity 1
Commodity 2 | Attribute information of commodity 2
Commodity 3 | Attribute information of commodity 3
In addition to the first data table and the second data table, other data tables, such as a third data table, may also be established on the physical cluster. The third data table is used to store online the relation information of a fifth node and the first node. The fifth node may belong to the same node type as the first node; for example, as shown in Table 5, the fifth node and the first node are both user nodes, and the third data table stores the friend list of user i, namely user j, together with the friend relation value between user i and user j. The fifth node may also belong to a different node type from the first node.
Table 5
Primary key | Secondary key | value
User i | User j | 79
It should be noted that the embodiments of the present application place no limitation on the structure of the data tables established on the physical cluster or on the node type corresponding to each data table.
In the embodiments of the present application, the physical cluster may, as shown in Fig. 4, be provided with N partitions (English: Partition), where N is greater than or equal to 2. When a data table is established, the corresponding partition is determined according to the data of the index entry of the data table, and the data table is established in that partition, as described in detail below.
The method may further include: determining a first partition from the N storage partitions according to the first index entry, for example determining partition 1 by applying a hash algorithm to the value of the primary key in Table 1. S301 then includes: establishing the first data table on the first partition of the physical cluster.
The method may further include: determining a second partition from the N storage partitions according to the third index entry, for example determining partition 2 by applying a hash algorithm to the value of the key in Table 2. S302 then includes: establishing the second data table on the second partition of the physical cluster. In the embodiments of the present invention, the same data table may also be established across multiple partitions; for example, a partition is determined from the value of the primary key of each data item in the data table, and that data item is stored in the partition so determined.
In the distributed physical cluster shown in Fig. 4, each partition includes M replicas (English: Replica), where M is greater than or equal to 2. The replicas in the same partition store identical data, which both supports simultaneous online storage and online query by multiple devices and provides data backup. Specifically, the first partition and the second partition each include M replicas. Establishing the first data table on the first partition of the physical cluster includes: establishing the first data table on each of the M replicas of the first partition of the physical cluster. Establishing the second data table on the second partition of the physical cluster includes: establishing the second data table on each of the M replicas of the second partition of the physical cluster.
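The partition and replica scheme can be sketched as follows. The application only states that a hash algorithm is applied to the index value; the use of MD5, the values chosen for N and M, and all function and variable names below are assumptions for illustration.

```python
import hashlib

N_PARTITIONS = 4  # N >= 2 (assumed value)
M_REPLICAS = 2    # M >= 2 (assumed value)


def partition_of(index_value, n=N_PARTITIONS):
    """Map an index value to a partition number with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


# cluster[partition][replica] holds that replica's copy of both tables.
cluster = [
    [{"table_1": {}, "table_2": {}} for _ in range(M_REPLICAS)]
    for _ in range(N_PARTITIONS)
]


def put_relation(primary_key, secondary_key, value):
    """Write one Table 1 item to every replica of the partition chosen by its primary key."""
    p = partition_of(primary_key)
    for replica in cluster[p]:
        replica["table_1"].setdefault(primary_key, {})[secondary_key] = value
    return p


def put_attributes(key, value):
    """Write one Table 2 item to every replica of the partition chosen by its key."""
    p = partition_of(key)
    for replica in cluster[p]:
        replica["table_2"][key] = value
    return p


p1 = put_relation("user_j", "commodity_1", "2016.5.3")
p2 = put_attributes("commodity_1", "attribute information of commodity 1")
```

Because every replica of a partition receives the same write, a later read may be served by any one of them. A stable hash is used instead of Python's built-in `hash`, whose output for strings varies between processes.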
Corresponding to the above embodiments of the method for establishing data tables, the present application also provides method embodiments for querying the data tables online, as described in detail below.
Referring to Fig. 5, the present application provides a method embodiment of an online query method.
In this embodiment, a first data table and a second data table are established on a physical cluster. The first data table is used to store online the relation information of a first node and a second node; the second data table is used to store online the attribute information of the second node; and the first node and the second node belong to different node types. It should be noted that this embodiment can be used to query online the first data table and the second data table established by any of the above method embodiments.
The method of this embodiment includes:
S501: Receive an online query request, the online query request being used to indicate an online query of the relation data of the first node.
For example, when the commodities purchased by a user are to be queried online, the first node may be a user node, and the relation data of the first node may be the attribute information of the commodity nodes corresponding to that user node.
S502: Access online the first data table and the second data table in the physical cluster, and query the relation data of the first node.
Querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
For example, if the first node is a user node and the second node is a commodity node, then to query online the commodities purchased by user i, commodity 1 corresponding to user i is queried from the first data table, and the attribute information of commodity 1 is queried from the second data table.
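Steps S501 and S502 can be sketched as a two-step lookup. The request format (a dictionary with a `first_node` field) and the function name `handle_online_query` are hypothetical; the data follows the user i example in the text.

```python
# Illustrative data for the example above: user i purchased commodity 1.
first_table = {"user_i": ["commodity_1"]}
second_table = {"commodity_1": "attribute information of commodity 1"}


def handle_online_query(request, first_table, second_table):
    """Return the relation data of the first node named in the request."""
    # S501: the request indicates which first node to query (hypothetical format).
    first_node = request["first_node"]
    # S502, step 1: query the second nodes corresponding to the first node.
    second_nodes = first_table.get(first_node, [])
    # S502, step 2: query the attribute information of each second node.
    return {node: second_table.get(node) for node in second_nodes}


relation_data = handle_online_query({"first_node": "user_i"}, first_table, second_table)
```

Both lookups run against tables held by the same cluster, which is the point of the scheme: no second physical cluster is contacted between the two steps.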
The first data table and the second data table may each be represented by index entries and attribute items, as described below.
The first data table may include a first index entry and a second index entry. The first index entry is used to store online the identifier of the first node, and the second index entry is used to store online the identifier of the second node corresponding to the first node. The first data table may also include a second attribute item, which is used to store online the attribute information of the correspondence between the first node and the second node. For example, as shown in Table 1, the first data table may have a key-key-value structure. When the data in Table 1 is queried online, the secondary key and/or the value can be queried by primary key, or the value can be queried by primary key and secondary key. For example, Table 1 can be queried to find that the commodities purchased by user j are commodity 1 and commodity 2.
The second data table may include a third index entry and a first attribute item. The third index entry is used to store online the identifier of the second node, and the first attribute item is used to store online the attribute information of the second node. For example, as shown in Table 2, the second data table may have a key-value structure. When the data in Table 2 is queried online, the value can be queried by key. For example, the attribute information of commodity 1 can be queried from Table 2.
In the embodiments of the present application, the physical cluster may, as shown in Fig. 4, be provided with N partitions (English: Partition), where N is greater than or equal to 2. When a data table is queried, the query is therefore performed in the partition corresponding to the index entry of the data table, as described in detail below.
The method may further include: determining a first partition from the N storage partitions according to the first index entry, for example determining partition 1 by applying a hash algorithm to the values of the primary key in Table 1; and determining a second partition from the N storage partitions according to the second index entry, for example determining partition 2 by applying a hash algorithm to the values of the key in Table 2. Accessing online the first data table and the second data table in the physical cluster in S502 then includes: accessing online the first data table on the first partition of the physical cluster and the second data table on the second partition.
In the distributed physical cluster shown in Fig. 4, each partition includes M replicas, where M is greater than or equal to 2. The replicas in the same partition store identical data, which both supports simultaneous online storage and online query by multiple devices and provides data backup. Therefore, accessing the first data table online may mean accessing the first data table on one or more replicas of the first partition; similarly, accessing the second data table online may mean accessing the second data table on one or more replicas of the second partition.
In the embodiments of the present application, the physical cluster may store further data in addition to the data above, and this data can also be queried online, as described below.
In addition to the correspondence between the first node and the second node, the first data table may also store online the correspondence between a third node and a fourth node, where the third node belongs to the same node type as the first node and the fourth node belongs to the same node type as the second node. For example, as shown in Table 3, the first data table also stores the identifier of the commodity purchased by user i and the time at which user i purchased it. In addition to the attribute information of the second node, the second data table may also store online the attribute information of other nodes, such as the fourth node. For example, as shown in Table 4, the second data table also stores the attribute information of commodity 3.
The online query request is also used to indicate an online query of the relation data of the third node. The method may further include: accessing online the first data table and the second data table in the physical cluster, and querying the relation data of the third node. Querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node; and querying, from the second data table, the attribute information of the fourth node. For example, commodity 3 purchased by user i is queried from Table 3, and the attribute information of commodity 3 is queried from Table 4.
In addition to the first data table and the second data table, other data tables, such as a third data table, may also be established on the physical cluster. The third data table is used to store online the relation information of a fifth node and the first node, and may be as shown in Table 5.
The online query request may be used to indicate an online query of the relation data of the fifth node. In that case, accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node includes: accessing online the first data table, the second data table and the third data table in the physical cluster, and querying the relation data of the fifth node. Querying the relation data of the fifth node includes: querying, from the third data table, the first node corresponding to the fifth node, and querying the relation data of the first node. For example, to query the commodities purchased by the friends of user i, the friend of user i is found from Table 5 to be user j, commodity 1 purchased by user j is queried from Table 1, and the attribute information of commodity 1 is queried from Table 2.
In the embodiments of the present application, three different query operators are defined to implement the online query function. Fast queries can be achieved by using these three query operators alone or in combination, as described below.
The first operator: the simple query operator (also called the AtomicSearch operator), used to query, from a data table, the index entries or attribute values corresponding to a given index entry. For example, given the key user j, commodity 1 and commodity 2 corresponding to user j can be queried from Table 1; as another example, given the key commodity 1, the attribute information of commodity 1 can be queried from Table 2.
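A simple query operator of this kind reduces to a single lookup by index entry. The sketch below is one possible reading; the function name `atomic_search` and the dictionary representation of Table 1 and Table 2 are assumptions, not the operator's actual interface.

```python
# Hypothetical dictionary stand-ins for Table 1 and Table 2.
table_1 = {"user_j": {"commodity_1": "2016.5.3", "commodity_2": "2016.7.9"}}
table_2 = {
    "commodity_1": "attribute information of commodity 1",
    "commodity_2": "attribute information of commodity 2",
}


def atomic_search(table, key):
    """Return whatever the table stores under the given index entry, or None if absent."""
    return table.get(key)


# Given the key "user_j", Table 1 yields the corresponding secondary keys and values.
purchases = atomic_search(table_1, "user_j")
commodities_of_user_j = list(purchases)

# Given the key "commodity_1", Table 2 yields the attribute information.
attributes_of_commodity_1 = atomic_search(table_2, "commodity_1")
```

The same function serves both table shapes: for a key-key-value table it returns the secondary keys with their values, and for a key-value table it returns the value directly.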
The second operator: the compound operation operator, used to query, from data tables, the index entries or attribute values corresponding to given index entries, and to perform an operation on the queried data.
The first data table is also used to store online the relation information of the third node and the fourth node; for example, Table 3 also stores the identifier of the commodity purchased by user i, commodity 3. The second data table is also used to store online the attribute information of the fourth node; for example, Table 4 also stores the attribute information of commodity 3.
The online query request includes a first compound operator, and the method further includes:
Parsing a first instruction, a second instruction and a third instruction from the first compound operator, where the first instruction indicates an online query of the relation data of the first node, the second instruction indicates an online query of the relation data of the third node, and the third instruction indicates combined processing of the relation data.
Querying the relation data of the first node includes: querying the relation data of the first node based on the first instruction; for example, based on the first instruction, the commodities purchased by user i are queried: commodity 1 and commodity 2.
Querying the relation data of the third node includes: querying the relation data of the third node based on the second instruction; for example, based on the second instruction, the commodity purchased by user j is queried: commodity 3.
The method further includes: based on the third instruction, performing combined processing on the relation data of the first node and the relation data of the third node. The combined processing may be any one of the following: merging, intersection and set difference. For example, the commodities purchased by user i and the commodities purchased by user j can be merged, giving the final result commodity 1, commodity 2 and commodity 3. As another example, the intersection of the commodities purchased by user i and those purchased by user j can be taken, giving an empty final result. As a further example, the set difference of the commodities purchased by user i and those purchased by user j can be taken, that is, the commodities that user i has purchased but user j has not, giving the final result commodity 1 and commodity 2.
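The three kinds of combined processing map directly onto set operations. The sketch below uses the data of the example just given (user i: commodities 1 and 2; user j: commodity 3); the operator format with `first`, `second` and `combine` fields and the function name are hypothetical stand-ins for the three parsed instructions.

```python
# Relation data as used in the example above (illustration only).
relation_table = {
    "user_i": {"commodity_1", "commodity_2"},
    "user_j": {"commodity_3"},
}

# Third instruction: how the two query results are combined.
COMBINE = {
    "merge": set.union,
    "intersect": set.intersection,
    "difference": set.difference,
}


def run_compound_operation(operator, relation_table):
    """Run two simple queries and combine their results (hypothetical operator format)."""
    first_result = set(relation_table.get(operator["first"], ()))    # first instruction
    second_result = set(relation_table.get(operator["second"], ()))  # second instruction
    return COMBINE[operator["combine"]](first_result, second_result)


merged = run_compound_operation(
    {"first": "user_i", "second": "user_j", "combine": "merge"}, relation_table)
common = run_compound_operation(
    {"first": "user_i", "second": "user_j", "combine": "intersect"}, relation_table)
only_first = run_compound_operation(
    {"first": "user_i", "second": "user_j", "combine": "difference"}, relation_table)
```

With this data, `merged` contains all three commodities, `common` is empty, and `only_first` contains commodity 1 and commodity 2, matching the results stated in the text.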
The third operator: the compound transfer operator, used to take the result queried from one data table as the index entry for a query in another data table.
A third data table is also established on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node. For example, as shown in Table 5, the fifth node and the first node are both user nodes, and the third data table stores the friend list of user i, namely user j, together with the friend relation value between user i and user j.
The online query request includes a second compound operator, and the method further includes:
Parsing a fourth instruction and a fifth instruction from the second compound operator, where the fourth instruction indicates an online query of the node corresponding to the fifth node, and the fifth instruction indicates an online query of the relation data of the node corresponding to the fifth node.
Querying, from the third data table, the first node corresponding to the fifth node includes: based on the fourth instruction, querying from the third data table the first node corresponding to the fifth node. Querying the relation data of the first node includes: based on the fifth instruction, querying the relation data of the first node. For example, based on the fourth instruction, the friend of user i is found from Table 5 to be user j; user j is then used as the index entry for the query in Table 1, commodity 1 purchased by user j is queried from Table 1, and the attribute information of commodity 1 is queried from Table 2.
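The chaining performed by the compound transfer operator can be sketched as follows, using the contents of Table 5, Table 1 and Table 2. The helper name `transfer` and the dictionary layout are assumptions for illustration.

```python
# Hypothetical dictionary stand-ins for Table 5, Table 1 and Table 2.
table_5 = {"user_i": {"user_j": 79}}
table_1 = {"user_j": {"commodity_1": "2016.5.3", "commodity_2": "2016.7.9"}}
table_2 = {
    "commodity_1": "attribute information of commodity 1",
    "commodity_2": "attribute information of commodity 2",
}


def transfer(keys, table):
    """Use each result of a previous query as an index entry in the next table."""
    found = []
    for key in keys:
        for secondary_key in table.get(key, {}):
            if secondary_key not in found:  # keep each result once, in first-seen order
                found.append(secondary_key)
    return found


# Fourth instruction: find the nodes corresponding to the fifth node (friends of user i).
friends = transfer(["user_i"], table_5)
# Fifth instruction: query the relation data of those nodes.
commodities = transfer(friends, table_1)
attributes = {c: table_2[c] for c in commodities if c in table_2}
```

Each stage feeds its output keys into the next table, so a friends-then-purchases-then-attributes query is expressed as three chained lookups on the same cluster.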
In the embodiments of the present application, the storage architecture may consist of two parts, an offline cluster and an online cluster. The offline cluster can be divided into the HDFS (Hadoop Distributed File System) Index Build Cluster (index build cluster) and the Real-Time Stream Process Cluster (real-time stream processing cluster). The HDFS Index Build Cluster is mainly used to convert the attribute graph, in an efficient batch manner, into data tables of key-value and key-key-value structure and to synchronize the data tables to the online cluster. The Real-Time Stream Process Cluster is mainly used to process real-time update messages and send them to the online cluster; it can process messages with second-level latency.
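The conversion from an attribute graph into key-value and key-key-value tables can be illustrated with a small sketch. It is not the HDFS batch job itself; the function name `build_tables`, the edge tuple layout and the relation labels `friend` and `purchase` are assumptions, and the data follows Fig. 1 and Tables 1, 2 and 5.

```python
from collections import defaultdict

# Node attributes (placeholders) and attributed edges of a small attribute graph.
nodes = {
    "commodity_1": "attribute information of commodity 1",
    "commodity_2": "attribute information of commodity 2",
}
edges = [
    # (source node, target node, relation label, edge attribute)
    ("user_i", "user_j", "friend", 79),
    ("user_j", "commodity_1", "purchase", "2016.5.3"),
    ("user_j", "commodity_2", "purchase", "2016.7.9"),
]


def build_tables(nodes, edges):
    """Build one key-value table for nodes and one key-key-value table per relation."""
    key_value = dict(nodes)
    key_key_value = defaultdict(lambda: defaultdict(dict))
    for source, target, relation, attribute in edges:
        key_key_value[relation][source][target] = attribute
    # Convert nested defaultdicts into plain dicts before handing the tables over.
    plain = {rel: {k: dict(v) for k, v in tbl.items()} for rel, tbl in key_key_value.items()}
    return key_value, plain


attribute_table, relation_tables = build_tables(nodes, edges)
```

Node attributes become key-value items and each attributed edge becomes one key-key-value item, so tables such as Table 1 and Table 5 fall out of the same pass over the edges.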
The online cluster may be the physical cluster of any of the above embodiments, and includes a Proxy (agent) sub-cluster and a Search (query) sub-cluster.
The Proxy sub-cluster is mainly responsible for receiving the online query request input by the user, executing the query, and returning the final query result to the user. While executing a query, the Proxy sub-cluster sends requests to the Search sub-cluster to obtain data from the data tables of key-value and key-key-value structure. The compound operation operator and the compound transfer operator described above are both executed in the Proxy sub-cluster. The Search sub-cluster is mainly responsible for loading the data tables of key-value and key-key-value structure and for updating the contents of the tables according to the update messages sent by the Real-Time Stream Process Cluster. Another important function of the Search sub-cluster is to receive the online query requests sent by the Proxy sub-cluster, perform the data query operations according to those requests, and return the query results to the Proxy sub-cluster.
The Proxy sub-cluster includes at least three layers:
Service access layer. This layer is mainly used to receive the online query request, convert it into a form that the execution core layer can recognize, and send it to the execution core layer. It also converts the result returned by the execution core layer into the return form expected by the user.
Execution core layer. This layer implements the online query: it parses and verifies the online query request, generates a query plan and sends it to the data acquisition layer, and finally returns the query result to the service access layer.
Data acquisition layer. This layer is mainly responsible for communicating with the Search sub-cluster. It forwards the online query request sent by the execution core layer to the Search sub-cluster and returns the result from the Search sub-cluster to the execution core layer.
The Search sub-cluster includes at least two layers:
Storage management layer. This layer is mainly responsible for loading and updating the data tables of key-value and key-key-value structure.
Query layer. This layer is mainly responsible for receiving the online query requests issued by the Proxy sub-cluster, performing the queries according to those requests, and returning the query results to the Proxy sub-cluster. This layer can also perform operations such as filtering and sorting on the query results.
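The division of work between the two sub-clusters and their layers can be sketched with two small classes. This is a single-process illustration only: the class names, method names, table names and the request format (`purchases_of`) are all hypothetical, and no networking or real cluster behaviour is modelled.

```python
class SearchSubCluster:
    """Stand-in for the Search sub-cluster (hypothetical class)."""

    def __init__(self, tables):
        # Storage management layer: load the data tables.
        self.tables = tables

    def apply_update(self, table, key, value):
        # Storage management layer: apply a real-time update message.
        self.tables[table][key] = value

    def query(self, table, key):
        # Query layer: look up one index entry.
        return self.tables[table].get(key)


class ProxySubCluster:
    """Stand-in for the Proxy sub-cluster (hypothetical class)."""

    def __init__(self, search):
        self.search = search

    def handle(self, request):
        # Service access layer: convert the external request to an internal form.
        user = request.get("purchases_of")
        result = self._execute(user)
        # Service access layer: convert the result to the returned form.
        return {"user": user, "commodities": result}

    def _execute(self, user):
        # Execution core layer: verify the request, then run a two-step query plan.
        if not isinstance(user, str) or not user:
            raise ValueError("invalid online query request")
        purchased = self._fetch("table_1", user) or {}
        return {commodity: self._fetch("table_2", commodity) for commodity in purchased}

    def _fetch(self, table, key):
        # Data acquisition layer: communicate with the Search sub-cluster.
        return self.search.query(table, key)


search = SearchSubCluster({
    "table_1": {"user_j": {"commodity_1": "2016.5.3", "commodity_2": "2016.7.9"}},
    "table_2": {
        "commodity_1": "attribute information of commodity 1",
        "commodity_2": "attribute information of commodity 2",
    },
})
proxy = ProxySubCluster(search)
response = proxy.handle({"purchases_of": "user_j"})
search.apply_update("table_2", "commodity_3", "attribute information of commodity 3")
```

Keeping the query plan in the proxy and the table lookups in the search component mirrors the statement that compound operators run in the Proxy sub-cluster while the Search sub-cluster only serves and updates table data.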
Corresponding to the above method embodiments, the present application also provides corresponding apparatus embodiments, as described in detail below.
Referring to Fig. 6, the present application provides an embodiment of an apparatus for establishing data tables. The apparatus of this embodiment includes a first establishing unit 601 and a second establishing unit 602.
The first establishing unit 601 is configured to establish a first data table on a physical cluster, the first data table being used to store online the relation information of a first node and a second node.
The second establishing unit 602 is configured to establish a second data table on the physical cluster, the second data table being used to store online the attribute information of the second node.
The first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry, the first index entry being used to store online the identifier of the first node, and the second index entry being used to store online the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item, the third index entry being used to store online the identifier of the second node, and the first attribute item being used to store online the attribute information of the second node.
Optionally, the first data table also includes a second attribute item, the second attribute item being used to store online the attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, in which the first index entry is the primary key, the second index entry is the secondary key, and the second attribute item is the value;
the second data table has a key-value structure, in which the third index entry is the key and the first attribute item is the value.
Optionally, the physical cluster includes N storage partitions, N being greater than or equal to 2, and the establishing apparatus further includes:
a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry;
when establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster;
when establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
Optionally, the first partition and the second partition each include M replicas, M being greater than or equal to 2;
when establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M replicas of the first partition of the physical cluster;
when establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M replicas of the second partition of the physical cluster.
Optionally, the first data table is also used to store online the relation information of a third node and a fourth node, and the second data table is also used to store online the attribute information of the fourth node, where the third node belongs to the same node type as the first node and the fourth node belongs to the same node type as the second node;
or, the establishing apparatus further includes: a third establishing unit, configured to establish a third data table on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node.
Referring to Fig. 7, the present application provides an apparatus embodiment of an online query apparatus. In this embodiment, a first data table and a second data table are established on a physical cluster. The first data table is used to store online the relation information of a first node and a second node; the second data table is used to store online the attribute information of the second node; and the first node and the second node belong to different node types.
The apparatus of this embodiment includes a receiving unit 701 and a query unit 702.
The receiving unit 701 is configured to receive an online query request, the online query request being used to indicate an online query of the relation data of the first node.
The query unit 702 is configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the first node.
Querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
Optionally, first tables of data includes the first index entry and the second index entry, and first index entry is used for Line stores the mark of the first node, and second index entry is used for on-line storage corresponding with the first node described the The mark of two nodes;
Second tables of data includes the 3rd index entry and the first attribute item, and the 3rd index entry is used for on-line storage institute The mark of section point is stated, first attribute item is used for the attribute information of section point described in on-line storage.
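The two-table lookup performed by the query unit can be sketched as follows. This is only an illustrative Python model: the first data table is represented as a nested dict (first node identifier, then second node identifier, then the attributes of the relation), the second data table as a flat dict keyed by second node identifier, and all node identifiers and attribute names are made up.

```python
# First data table: first index entry -> second index entry -> relation attributes.
first_data_table = {
    "user_1": {
        "item_9": {"action": "purchase"},
        "item_7": {"action": "view"},
    },
}

# Second data table: third index entry (second node id) -> first attribute entry.
second_data_table = {
    "item_9": {"category": "books"},
    "item_7": {"category": "toys"},
}


def query_relation_data(first_node_id):
    """Return the relation data of a first node.

    Step 1: look up the second nodes of the first node in the first table.
    Step 2: look up each second node's attributes in the second table.
    """
    result = {}
    for second_id, relation in first_data_table.get(first_node_id, {}).items():
        result[second_id] = {
            "relation": relation,
            "attributes": second_data_table.get(second_id),
        }
    return result


relation_data = query_relation_data("user_1")
```

Because attributes live only in the second data table, an attribute change for a second node touches one row there rather than every relation row that mentions the node.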
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2; the device further includes:
A determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry;
When accessing online the first data table and the second data table in the physical cluster, the query unit is specifically configured to: access online the first data table on the first partition in the physical cluster, and the second data table on the second partition.
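One way to picture partition selection is a stable hash of the index value taken modulo N. The patent only states that the partition is determined according to the index entry, so the hashing scheme below, the function names and the in-memory partition layout are all assumptions for illustration.

```python
import hashlib

N = 4  # number of storage partitions, N >= 2 (illustrative value)


def partition_for(index_value, n=N):
    """Map an index value to a partition id with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


# Each partition holds its own slice of both tables.
partitions = [{"first_table": {}, "second_table": {}} for _ in range(N)]


def put_relation(first_id, second_id):
    """Store a relation row on the partition chosen by the first node id."""
    table = partitions[partition_for(first_id)]["first_table"]
    table.setdefault(first_id, set()).add(second_id)


def put_attributes(second_id, attrs):
    """Store an attribute row on the partition chosen by the second node id."""
    partitions[partition_for(second_id)]["second_table"][second_id] = attrs


def query(first_id):
    """Read the first table on the first partition, then each second node's
    attributes on the partition selected by that second node's id."""
    first_partition = partitions[partition_for(first_id)]
    second_ids = first_partition["first_table"].get(first_id, set())
    return {
        sid: partitions[partition_for(sid)]["second_table"].get(sid)
        for sid in second_ids
    }


put_relation("user_1", "item_9")
put_attributes("item_9", {"category": "books"})
```

With this routing, a query only touches the partitions that actually hold the requested rows rather than scanning the whole cluster.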
Optionally, the first data table is further used to store online the relation information of a third node and a fourth node; the second data table is further used to store online the attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query of the relation data of the third node;
The query unit is further configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the third node;
Wherein querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node; and querying, from the second data table, the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator; the device further includes:
A parsing unit, configured to parse a first instruction, a second instruction and a third instruction from the first composite operator, the first instruction being used to indicate an online query of the relation data of the first node, the second instruction being used to indicate an online query of the relation data of the third node, and the third instruction being used to indicate that combined processing is to be performed on the relation data;
When querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the first instruction;
When querying the relation data of the third node, the query unit is specifically configured to query the relation data of the third node based on the second instruction;
The device further includes a processing unit, configured to perform, based on the third instruction, combined processing on the relation data of the first node and the relation data of the third node.
Optionally, the combined processing includes any one of the following: a merge (union) operation, an intersection operation and a set difference operation.
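The composite operator can be pictured as one request that carries two lookups plus a set operation. In the Python sketch below the dict encoding of the operator, the key names and the reduction of relation data to a set of related node identifiers are simplifying assumptions, not the format used by the patent.

```python
def parse_composite_operator(operator):
    """Split a composite operator into its three instructions.

    The dict layout is a hypothetical encoding chosen for this example.
    """
    first_instruction = operator["query_first_node"]   # id of the first node
    second_instruction = operator["query_third_node"]  # id of the third node
    third_instruction = operator["combine"]            # merge / intersection / difference
    return first_instruction, second_instruction, third_instruction


def combine(first_result, third_result, mode):
    """Apply the combined processing named by the third instruction."""
    if mode == "merge":
        return first_result | third_result
    if mode == "intersection":
        return first_result & third_result
    if mode == "difference":
        return first_result - third_result
    raise ValueError(f"unknown combine mode: {mode}")


def run_composite_query(operator, first_table):
    """Run both lookups, then combine their results."""
    first_id, third_id, mode = parse_composite_operator(operator)
    first_result = set(first_table.get(first_id, ()))
    third_result = set(first_table.get(third_id, ()))
    return combine(first_result, third_result, mode)


# Relation data simplified to the set of related node ids.
first_table = {
    "user_1": {"item_7", "item_9"},
    "user_2": {"item_9", "item_3"},
}

common = run_composite_query(
    {"query_first_node": "user_1", "query_third_node": "user_2", "combine": "intersection"},
    first_table,
)
```

Sending the three instructions in one operator lets the combination happen next to the data, so the client receives a single combined result instead of two result sets to post-process.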
Optionally, a third data table is also established on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node; the online query request is used to indicate an online query of the relation data of the fifth node;
When accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node, the query unit is specifically configured to access online the first data table, the second data table and the third data table in the physical cluster, and to query the relation data of the fifth node;
When querying the relation data of the fifth node, the query unit is specifically configured to query, from the third data table, the first node corresponding to the fifth node, and to query the relation data of the first node.
Optionally, the online query request includes a second composite operator; the device further includes:
A parsing unit, configured to parse a fourth instruction and a fifth instruction from the first composite operator, the fourth instruction being used to indicate an online query of the node corresponding to the fifth node, and the fifth instruction being used to indicate an online query of the relation data of the node corresponding to the fifth node;
When querying, from the third data table, the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
When querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the fifth instruction.
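The query that starts from a fifth node amounts to a two-hop traversal: the third data table yields the first nodes, and each first node is then resolved through the first and second data tables. The following Python sketch uses made-up identifiers and in-memory dicts purely for illustration.

```python
# Third data table: fifth node id -> ids of the corresponding first nodes.
third_data_table = {"shop_5": ["user_1", "user_2"]}

# First data table: first node id -> ids of the corresponding second nodes.
first_data_table = {"user_1": ["item_9"], "user_2": ["item_3"]}

# Second data table: second node id -> attribute information.
second_data_table = {
    "item_9": {"category": "books"},
    "item_3": {"category": "music"},
}


def query_first_node(first_id):
    """Relation data of a first node: its second nodes and their attributes."""
    return {
        second_id: second_data_table.get(second_id)
        for second_id in first_data_table.get(first_id, [])
    }


def query_fifth_node(fifth_id):
    """Two-hop query starting from a fifth node."""
    # Fourth instruction: find the first nodes corresponding to the fifth node.
    first_ids = third_data_table.get(fifth_id, [])
    # Fifth instruction: query the relation data of each of those first nodes.
    return {first_id: query_first_node(first_id) for first_id in first_ids}


fifth_node_relations = query_fifth_node("shop_5")
```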
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other ways of dividing, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
As stated above, the foregoing embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (30)

  1. A method for establishing a data table, characterized by comprising:
    establishing a first data table on a physical cluster, the first data table being used to store online the relation information of a first node and a second node;
    establishing a second data table on the physical cluster, the second data table being used to store online the attribute information of the second node;
    wherein the first node and the second node belong to different node types.
  2. The establishing method according to claim 1, characterized in that the first data table includes a first index entry and a second index entry, the first index entry being used to store online the identifier of the first node, and the second index entry being used to store online the identifier of the second node corresponding to the first node;
    the second data table includes a third index entry and a first attribute entry, the third index entry being used to store online the identifier of the second node, and the first attribute entry being used to store online the attribute information of the second node.
  3. The establishing method according to claim 2, characterized in that the first data table further includes a second attribute entry, the second attribute entry being used to store online the attribute information of the correspondence between the first node and the second node.
  4. The establishing method according to claim 3, characterized in that the first data table has a key-key-value structure, wherein the first index entry is the primary key information, the second index entry is the secondary key information, and the second attribute entry is the value information;
    the second data table has a key-value structure, wherein the third index entry is the key information and the first attribute entry is the value information.
  5. The establishing method according to any one of claims 2 to 4, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing method further comprises:
    determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the third index entry;
    said establishing a first data table on a physical cluster comprises: establishing the first data table on the first partition of the physical cluster;
    said establishing a second data table on the physical cluster comprises: establishing the second data table on the second partition of the physical cluster.
  6. The establishing method according to claim 5, characterized in that the first partition and the second partition each include M backup regions, M being greater than or equal to 2;
    establishing the first data table on the first partition of the physical cluster comprises: establishing the first data table on each of the M backup regions of the first partition of the physical cluster;
    establishing the second data table on the second partition of the physical cluster comprises: establishing the second data table on each of the M backup regions of the second partition of the physical cluster.
  7. The establishing method according to claim 1, characterized in that the first data table is further used to store online the relation information of a third node and a fourth node; the second data table is further used to store online the attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
    or, the method further comprises: establishing a third data table on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node.
  8. An online query method, characterized in that a first data table and a second data table are established on a physical cluster, the first data table being used to store online the relation information of a first node and a second node; the second data table being used to store online the attribute information of the second node; wherein the first node and the second node belong to different node types; the method comprises:
    receiving an online query request, the online query request being used to indicate an online query of the relation data of the first node;
    accessing online the first data table and the second data table in the physical cluster, and querying the relation data of the first node;
    wherein querying the relation data of the first node comprises: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
  9. The online query method according to claim 8, characterized in that the first data table includes a first index entry and a second index entry, the first index entry being used to store online the identifier of the first node, and the second index entry being used to store online the identifier of the second node corresponding to the first node;
    the second data table includes a third index entry and a first attribute entry, the third index entry being used to store online the identifier of the second node, and the first attribute entry being used to store online the attribute information of the second node.
  10. The online query method according to claim 9, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the method further comprises:
    determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the second index entry;
    accessing online the first data table and the second data table in the physical cluster comprises: accessing online the first data table on the first partition in the physical cluster, and the second data table on the second partition.
  11. The online query method according to claim 9, characterized in that the first data table is further used to store online the relation information of a third node and a fourth node; the second data table is further used to store online the attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query of the relation data of the third node;
    the method further comprises:
    accessing online the first data table and the second data table in the physical cluster, and querying the relation data of the third node;
    wherein querying the relation data of the third node comprises: querying, from the first data table, the fourth node corresponding to the third node; and querying, from the second data table, the attribute information of the fourth node.
  12. The online query method according to claim 11, characterized in that the online query request includes a first composite operator; the method further comprises:
    parsing a first instruction, a second instruction and a third instruction from the first composite operator, the first instruction being used to indicate an online query of the relation data of the first node, the second instruction being used to indicate an online query of the relation data of the third node, and the third instruction being used to indicate that combined processing is to be performed on the relation data;
    querying the relation data of the first node comprises: querying the relation data of the first node based on the first instruction;
    querying the relation data of the third node comprises: querying the relation data of the third node based on the second instruction;
    the method further comprises: performing, based on the third instruction, combined processing on the relation data of the first node and the relation data of the third node.
  13. The online query method according to claim 12, characterized in that the combined processing includes any one of the following: a merge (union) operation, an intersection operation and a set difference operation.
  14. The online query method according to claim 9, characterized in that a third data table is also established on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node; the online query request is used to indicate an online query of the relation data of the fifth node;
    accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node comprises: accessing online the first data table, the second data table and the third data table in the physical cluster, and querying the relation data of the fifth node;
    querying the relation data of the fifth node comprises: querying, from the third data table, the first node corresponding to the fifth node, and querying the relation data of the first node.
  15. The online query method according to claim 14, characterized in that the online query request includes a second composite operator; the method further comprises:
    parsing a fourth instruction and a fifth instruction from the first composite operator, the fourth instruction being used to indicate an online query of the node corresponding to the fifth node, and the fifth instruction being used to indicate an online query of the relation data of the node corresponding to the fifth node;
    querying, from the third data table, the first node corresponding to the fifth node comprises: querying, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
    querying the relation data of the first node comprises: querying the relation data of the first node based on the fifth instruction.
  16. A device for establishing a data table, characterized by comprising:
    a first establishing unit, configured to establish a first data table on a physical cluster, the first data table being used to store online the relation information of a first node and a second node;
    a second establishing unit, configured to establish a second data table on the physical cluster, the second data table being used to store online the attribute information of the second node;
    wherein the first node and the second node belong to different node types.
  17. The establishing device according to claim 16, characterized in that the first data table includes a first index entry and a second index entry, the first index entry being used to store online the identifier of the first node, and the second index entry being used to store online the identifier of the second node corresponding to the first node;
    the second data table includes a third index entry and a first attribute entry, the third index entry being used to store online the identifier of the second node, and the first attribute entry being used to store online the attribute information of the second node.
  18. The establishing device according to claim 17, characterized in that the first data table further includes a second attribute entry, the second attribute entry being used to store online the attribute information of the correspondence between the first node and the second node.
  19. The establishing device according to claim 18, characterized in that the first data table has a key-key-value structure, wherein the first index entry is the primary key information, the second index entry is the secondary key information, and the second attribute entry is the value information;
    the second data table has a key-value structure, wherein the third index entry is the key information and the first attribute entry is the value information.
  20. The establishing device according to any one of claims 17 to 19, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing device further comprises:
    a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry;
    when establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster;
    when establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
  21. The establishing device according to claim 20, characterized in that the first partition and the second partition each include M backup regions, M being greater than or equal to 2;
    when establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M backup regions of the first partition of the physical cluster;
    when establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M backup regions of the second partition of the physical cluster.
  22. The establishing device according to claim 16, characterized in that the first data table is further used to store online the relation information of a third node and a fourth node; the second data table is further used to store online the attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
    or, the establishing device further comprises: a third establishing unit, configured to establish a third data table on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node.
  23. An online query device, characterized in that a first data table and a second data table are established on a physical cluster, the first data table being used to store online the relation information of a first node and a second node; the second data table being used to store online the attribute information of the second node; wherein the first node and the second node belong to different node types; the device comprises:
    a receiving unit, configured to receive an online query request, the online query request being used to indicate an online query of the relation data of the first node;
    a query unit, configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the first node;
    wherein querying the relation data of the first node comprises: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
  24. The device according to claim 23, characterized in that the first data table includes a first index entry and a second index entry, the first index entry being used to store online the identifier of the first node, and the second index entry being used to store online the identifier of the second node corresponding to the first node;
    the second data table includes a third index entry and a first attribute entry, the third index entry being used to store online the identifier of the second node, and the first attribute entry being used to store online the attribute information of the second node.
  25. The device according to claim 24, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the device further comprises:
    a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry;
    when accessing online the first data table and the second data table in the physical cluster, the query unit is specifically configured to: access online the first data table on the first partition in the physical cluster, and the second data table on the second partition.
  26. The device according to claim 24, characterized in that the first data table is further used to store online the relation information of a third node and a fourth node; the second data table is further used to store online the attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query of the relation data of the third node;
    the query unit is further configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the third node;
    wherein querying the relation data of the third node comprises: querying, from the first data table, the fourth node corresponding to the third node; and querying, from the second data table, the attribute information of the fourth node.
  27. The device according to claim 26, characterized in that the online query request includes a first composite operator; the device further comprises:
    a parsing unit, configured to parse a first instruction, a second instruction and a third instruction from the first composite operator, the first instruction being used to indicate an online query of the relation data of the first node, the second instruction being used to indicate an online query of the relation data of the third node, and the third instruction being used to indicate that combined processing is to be performed on the relation data;
    when querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the first instruction;
    when querying the relation data of the third node, the query unit is specifically configured to query the relation data of the third node based on the second instruction;
    the device further comprises: a processing unit, configured to perform, based on the third instruction, combined processing on the relation data of the first node and the relation data of the third node.
  28. The device according to claim 27, characterized in that the combined processing includes any one of the following: a merge (union) operation, an intersection operation and a set difference operation.
  29. The device according to claim 24, characterized in that a third data table is also established on the physical cluster, the third data table being used to store online the relation information of a fifth node and the first node; the online query request is used to indicate an online query of the relation data of the fifth node;
    when accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node, the query unit is specifically configured to access online the first data table, the second data table and the third data table in the physical cluster, and to query the relation data of the fifth node;
    when querying the relation data of the fifth node, the query unit is specifically configured to query, from the third data table, the first node corresponding to the fifth node, and to query the relation data of the first node.
  30. The device according to claim 29, characterized in that the online query request includes a second composite operator; the device further comprises:
    a parsing unit, configured to parse a fourth instruction and a fifth instruction from the first composite operator, the fourth instruction being used to indicate an online query of the node corresponding to the fifth node, and the fifth instruction being used to indicate an online query of the relation data of the node corresponding to the fifth node;
    when querying, from the third data table, the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
    when querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the fifth instruction.
CN201610826949.6A 2016-09-14 2016-09-14 Data table establishing method, online query method and related device Active CN107818117B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610826949.6A CN107818117B (en) 2016-09-14 2016-09-14 Data table establishing method, online query method and related device

Publications (2)

Publication Number Publication Date
CN107818117A true CN107818117A (en) 2018-03-20
CN107818117B CN107818117B (en) 2022-02-15

Family

ID=61601282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610826949.6A Active CN107818117B (en) 2016-09-14 2016-09-14 Data table establishing method, online query method and related device

Country Status (1)

Country Link
CN (1) CN107818117B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6728713B1 (en) * 1999-03-30 2004-04-27 Tivo, Inc. Distributed database management system
CN101547092B (en) * 2008-03-27 2011-06-08 天津德智科技有限公司 Method and device for data synchronization of multi-application systems for unifying user authentication
CN102395962A (en) * 2009-03-11 2012-03-28 甲骨文国际公司 Composite hash and list partitioning of database tables
CN103218404A (en) * 2013-03-20 2013-07-24 华中科技大学 Multi-dimensional metadata management method and system based on association characteristics
CN103631924A (en) * 2013-12-03 2014-03-12 Tcl集团股份有限公司 Application method and system for distributive database platform
CN103995879A (en) * 2014-05-27 2014-08-20 华为技术有限公司 Data query method, device and system based on OLAP system
CN104063487A (en) * 2014-07-03 2014-09-24 浙江大学 File data management method based on relational database and K-D tree indexes
CN104809129A (en) * 2014-01-26 2015-07-29 华为技术有限公司 Method, device and system for storing distributed data
US20150220617A1 (en) * 2013-12-23 2015-08-06 Teradata Us, Inc. Techniques for query processing using high dimension histograms
CN105045871A (en) * 2015-07-15 2015-11-11 国家超级计算深圳中心(深圳云计算中心) Data aggregation query method and apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
I GUSTI BAGUS ADY SUTRISNA et al.: "Implementation of GRAC algorithm (Graph Algorithm Clustering) in graph database compression", 2015 3rd International Conference on Information and Communication Technology (ICoICT) *
李国庆: "《ASP.NET程序设计项目教程》", 31 January 2010, 北京理工大学出版社 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563697A (en) * 2018-03-22 2018-09-21 腾讯科技(深圳)有限公司 A kind of data processing method, device and storage medium
CN110543585A (en) * 2019-08-14 2019-12-06 天津大学 RDF graph and attribute graph unified storage method based on relational model
CN110543585B (en) * 2019-08-14 2021-08-31 天津大学 RDF graph and attribute graph unified storage method based on relational model
CN111125156A (en) * 2019-12-17 2020-05-08 网银在线(北京)科技有限公司 Data query method and device and electronic equipment
CN111125156B (en) * 2019-12-17 2023-09-26 网银在线(北京)科技有限公司 Data query method and device and electronic equipment

Also Published As

Publication number Publication date
CN107818117B (en) 2022-02-15

Similar Documents

Publication Publication Date Title
CN109033101B (en) Label recommendation method and device
CN104394118A (en) User identity identification method and system
CN106570008A (en) Recommendation method and device
CN107291779B (en) Cache data management method and device
US20140115010A1 (en) Propagating information through networks
CN107967284A (en) Method and apparatus for storing, inquiring about sequence information
WO2019242343A1 (en) Marketing information release platform construction method and apparatus
CN105119956B (en) Network application system and dispositions method
CN107818117A (en) A kind of method for building up of tables of data, online query method and relevant apparatus
WO2018036219A1 (en) Multi-level rebating method and multi-level rebating platform
CN111639253A (en) Data duplication judging method, device, equipment and storage medium
CN106933891A (en) Access the method for distributed data base and the device of Distributed database service
US20190362016A1 (en) Frequent pattern analysis for distributed systems
CN105894310A (en) Personalized recommendation method
CN104468751A (en) Self-defining method for business process nodes in cloud sea operating system
US20200098030A1 (en) Inventory-assisted artificial intelligence recommendation engine
CN113761350A (en) Data recommendation method, related device and data recommendation system
CN106446943A (en) Commodity correlation big data sparse network quick clustering method
US9830377B1 (en) Methods and systems for hierarchical blocking
CN106780062A (en) Based on groups of users update method and system that social networks and big data are analyzed
Li et al. An empirical study of alternating least squares collaborative filtering recommendation for Movielens on Apache Hadoop and Spark
US11294917B2 (en) Data attribution using frequent pattern analysis
CN107679096B (en) Method and device for sharing indexes among data marts
US20200097485A1 (en) Selective synchronization of linked records
CN106202503B (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant