CN107818117A - Method for establishing data tables, online query method, and related apparatus - Google Patents
- Publication number: CN107818117A
- Application number: CN201610826949.6A
- Authority: CN (China)
- Prior art keywords: data, node, tables, relation, physical cluster
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application provides a method for establishing data tables, an online query method, and related apparatus. The establishing method includes: establishing a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node; and establishing a second data table on the same physical cluster, the second data table being used for online storage of attribute information of the second node; wherein the first node and the second node belong to different node types. In the embodiments of this application, node types are not distinguished when data tables are established. Therefore, even if the first node and the second node are of different node types, that is, the first data table and the second data table correspond to different node types, both tables are still established on the same physical cluster. As a result, an online query does not need to access multiple physical clusters, which improves online query speed.
Description
Technical field
This application relates to the technical field of online storage, and in particular to a method for establishing data tables, an online query method, and related apparatus.
Background
With the continuous development of Internet technology, the relational network data being generated is growing both in variety and in order of magnitude. For example, the relational network data shown in Fig. 1 includes two types of node. One type is the user node: user i and user j. The other type is the commodity node: commodity 1 and commodity 2. User j is a friend of user i, and user j has purchased commodity 1 and commodity 2. How to store such relational network data using online storage technology has become a problem of increasing concern.
At present, such data is generally stored online by establishing data tables. For example, Facebook uses the Unicorn architecture to establish data tables. In the Unicorn architecture, each node type corresponds to a separate physical cluster; that is, the data tables of each node type are established in the physical cluster corresponding to that type. For example, as shown in Fig. 2, the data tables of the user node type are established in a user physical cluster and store the attribute information, friend lists, commodity purchase lists, and so on of user nodes, while the data table of the commodity node type is established in a commodity physical cluster and stores the attribute information and so on of commodity nodes.
It can be seen that when data tables are established using the Unicorn architecture, a single online query that involves data of different node types must access multiple physical clusters. For example, to query online the commodities purchased by the friends of user i, the user cluster must be accessed first to find that the friend of user i is user j and that the list of commodities purchased by user j is {commodity 1, commodity 2}; the commodity cluster is then accessed to query the attribute information of commodity 1 and commodity 2 as the final online query result. Clearly, this online query process has to access at least two physical clusters, which makes the online query slow.
Summary of the invention
The technical problem solved by this application is to provide a method for establishing data tables, an online query method, and related apparatus that establish data tables using a more reasonable architecture, so that an online query does not need to access multiple physical clusters and online query speed is improved.
To this end, the technical solution by which this application solves the technical problem is as follows:
This application provides a method for establishing data tables, including:
establishing a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node;
establishing a second data table on the physical cluster, the second data table being used for online storage of attribute information of the second node;
wherein the first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry; the first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item; the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the first data table further includes a second attribute item, which is used for online storage of attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, in which the first index entry is the primary key, the second index entry is the secondary key, and the second attribute item is the value;
the second data table has a key-value structure, in which the third index entry is the key and the first attribute item is the value.
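The two table structures described above can be pictured with a minimal in-memory sketch. It assumes plain Python dictionaries stand in for the online tables; the function names, node identifiers, and attribute values are all hypothetical and do not come from the application.

```python
from collections import defaultdict

# First data table (key-key-value), hypothetical layout:
#   primary key   = identifier of the first node (first index entry)
#   secondary key = identifier of the second node (second index entry)
#   value         = attributes of the relation (second attribute item)
first_table = defaultdict(dict)

# Second data table (key-value), hypothetical layout:
#   key   = identifier of the second node (third index entry)
#   value = attributes of the second node (first attribute item)
second_table = {}


def put_relation(first_id, second_id, relation_attrs):
    """Store one relation between a first node and a second node."""
    first_table[first_id][second_id] = relation_attrs


def put_node(second_id, node_attrs):
    """Store the attribute information of a second node."""
    second_table[second_id] = node_attrs


# Invented sample data: user j purchased commodity 1 and commodity 2.
put_relation("user_j", "item_1", {"purchase_time": "2016-09-01"})
put_relation("user_j", "item_2", {"purchase_time": "2016-09-02"})
put_node("item_1", {"price": 10.0, "category": "book"})
put_node("item_2", {"price": 25.0, "category": "toy"})

print(sorted(first_table["user_j"]))
```

Keeping the second-node identifier as a secondary key lets all relations of one first node be read with a single primary-key lookup.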
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2, and the establishing method further includes:
determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the third index entry;
establishing the first data table on a physical cluster includes: establishing the first data table on the first partition of the physical cluster;
establishing the second data table on the physical cluster includes: establishing the second data table on the second partition of the physical cluster.
Optionally, the first partition and the second partition each include M backup areas, where M is greater than or equal to 2;
establishing the first data table on the first partition of the physical cluster includes: establishing the first data table on each of the M backup areas of the first partition of the physical cluster;
establishing the second data table on the second partition of the physical cluster includes: establishing the second data table on each of the M backup areas of the second partition of the physical cluster.
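As an illustration of the partitioning and backup scheme, the sketch below picks a partition by hashing the index entry and writes each record to every backup area of that partition. The application only states that the partition is determined "according to" the index entry; the use of an MD5-based hash, the constants, and all names here are assumptions made for the example.

```python
import hashlib

N_PARTITIONS = 4  # N >= 2 (hypothetical value)
M_BACKUPS = 2     # M >= 2 (hypothetical value)


def partition_for(index_entry, n=N_PARTITIONS):
    """Map an index entry to a partition number (assumed rule: stable hash mod N)."""
    digest = hashlib.md5(index_entry.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


# cluster[p][b] holds the tables stored in backup area b of partition p.
cluster = [
    [{"first": {}, "second": {}} for _ in range(M_BACKUPS)]
    for _ in range(N_PARTITIONS)
]


def write_relation(first_id, second_id, attrs):
    """Write a first-table record to all backup areas of the first partition."""
    p = partition_for(first_id)  # chosen from the first index entry
    for area in cluster[p]:
        area["first"].setdefault(first_id, {})[second_id] = dict(attrs)
    return p


def write_node(second_id, attrs):
    """Write a second-table record to all backup areas of the second partition."""
    p = partition_for(second_id)  # chosen from the third index entry
    for area in cluster[p]:
        area["second"][second_id] = dict(attrs)
    return p


p1 = write_relation("user_j", "item_1", {"purchase_time": "2016-09-01"})
p2 = write_node("item_1", {"price": 10.0})
print(p1, p2)
```

Both tables live in the same cluster object, so a reader only ever chooses a partition, never a different cluster.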
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
or, the method further includes: establishing a third data table on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node.
This application provides an online query method. A first data table and a second data table are established on a physical cluster; the first data table is used for online storage of relation information between a first node and a second node; the second data table is used for online storage of attribute information of the second node; the first node and the second node belong to different node types. The method includes:
receiving an online query request, the online query request being used to indicate an online query for the relation data of the first node;
accessing online the first data table and the second data table in the physical cluster, and querying the relation data of the first node;
wherein querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
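The two-step lookup in this query method can be sketched as follows, assuming both tables are dictionaries held in one process that stands in for the single physical cluster. The function name `query_relation_data` and the sample records are hypothetical.

```python
# Both tables sit in the same (simulated) physical cluster; data is invented.
first_table = {
    "user_j": {
        "item_1": {"purchase_time": "2016-09-01"},
        "item_2": {"purchase_time": "2016-09-02"},
    }
}
second_table = {
    "item_1": {"price": 10.0, "category": "book"},
    "item_2": {"price": 25.0, "category": "toy"},
}


def query_relation_data(first_id):
    """Return {second-node id: attribute info} for one first node.

    Step 1: read the second nodes related to the first node from the first table.
    Step 2: read each second node's attribute information from the second table.
    """
    second_ids = first_table.get(first_id, {})
    return {sid: second_table.get(sid) for sid in second_ids}


print(query_relation_data("user_j"))
```

A first node with no stored relations simply yields an empty result rather than an error.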
Optionally, the first data table includes a first index entry and a second index entry; the first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item; the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2, and the method further includes:
determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the second index entry;
accessing online the first data table and the second data table in the physical cluster includes: accessing online the first data table on the first partition of the physical cluster and the second data table on the second partition.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query for the relation data of the third node;
the method further includes:
accessing online the first data table and the second data table in the physical cluster, and querying the relation data of the third node;
wherein querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node, and querying, from the second data table, the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator, and the method further includes:
parsing a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction is used to indicate an online query for the relation data of the first node, the second instruction is used to indicate an online query for the relation data of the third node, and the third instruction is used to indicate combination processing of relation data;
querying the relation data of the first node includes: querying the relation data of the first node based on the first instruction;
querying the relation data of the third node includes: querying the relation data of the third node based on the second instruction;
the method further includes: performing, based on the third instruction, combination processing on the relation data of the first node and the relation data of the third node.
Optionally, the combination processing includes any one of the following: a union operation, an intersection operation, and a set difference operation.
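A rough sketch of how a first composite operator might be split into three instructions and executed is shown below. The tuple encoding of the operator, the `parse` and `execute` helpers, and the sample data are assumptions for illustration only; the application does not specify an operator syntax. For brevity the sketch combines second-node identifiers and omits the attribute lookup in the second data table.

```python
# Invented first-table contents: which commodities each user purchased.
first_table = {
    "user_i": {"item_1": {}, "item_2": {}},
    "user_j": {"item_2": {}, "item_3": {}},
}

# The three kinds of combination processing named in the text.
COMBINERS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}


def parse(composite_operator):
    """Split a (kind, first_node, third_node) operator into three instructions."""
    kind, first_node, third_node = composite_operator
    first_instruction = ("query", first_node)
    second_instruction = ("query", third_node)
    third_instruction = ("combine", kind)
    return first_instruction, second_instruction, third_instruction


def execute(composite_operator):
    """Run both queries, then combine their results as the third instruction says."""
    (_, first_node), (_, third_node), (_, kind) = parse(composite_operator)
    left = set(first_table.get(first_node, {}))
    right = set(first_table.get(third_node, {}))
    return COMBINERS[kind](left, right)


print(sorted(execute(("intersection", "user_i", "user_j"))))
```

With this data, the intersection yields the commodities both users purchased, and the difference yields those purchased only by the first user.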
Optionally, a third data table is also established on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node; the online query request is used to indicate an online query for the relation data of the fifth node;
accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node includes: accessing online the first data table, the second data table, and the third data table in the physical cluster, and querying the relation data of the fifth node;
querying the relation data of the fifth node includes: querying, from the third data table, the first node corresponding to the fifth node, and querying the relation data of the first node.
Optionally, the online query request includes a second composite operator, and the method further includes:
parsing a fourth instruction and a fifth instruction from the second composite operator, where the fourth instruction is used to indicate an online query for the node corresponding to the fifth node, and the fifth instruction is used to indicate an online query for the relation data of the node corresponding to the fifth node;
querying, from the third data table, the first node corresponding to the fifth node includes: querying, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
querying the relation data of the first node includes: querying the relation data of the first node based on the fifth instruction.
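The chained query described here (fifth node, then first node, then the first node's relation data) might look like the sketch below. It assumes the third data table stores friend relations, uses dictionaries for all three tables, and invents every name and value; the two helper functions loosely play the roles of the fourth and fifth instructions.

```python
# Invented data: user i is a friend of user j; user j purchased two commodities.
third_table = {"user_i": {"user_j": {"closeness": 0.8}}}
first_table = {"user_j": {"item_1": {}, "item_2": {}}}
second_table = {"item_1": {"price": 10.0}, "item_2": {"price": 25.0}}


def corresponding_nodes(fifth_id):
    """Fourth instruction: find the first nodes related to the fifth node."""
    return list(third_table.get(fifth_id, {}))


def relation_data(first_id):
    """Fifth instruction: fetch the relation data of one first node."""
    return {sid: second_table.get(sid) for sid in first_table.get(first_id, {})}


def chained_query(fifth_id):
    """Run the fourth instruction, then the fifth on each node it returns."""
    result = {}
    for first_id in corresponding_nodes(fifth_id):
        result.update(relation_data(first_id))
    return result


print(chained_query("user_i"))
```

All three lookups read from tables in the same simulated cluster, which is the point of the single-cluster design.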
This application provides an apparatus for establishing data tables, including:
a first establishing unit, configured to establish a first data table on a physical cluster, the first data table being used for online storage of relation information between a first node and a second node;
a second establishing unit, configured to establish a second data table on the physical cluster, the second data table being used for online storage of attribute information of the second node;
wherein the first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry; the first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item; the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the first data table further includes a second attribute item, which is used for online storage of attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, in which the first index entry is the primary key, the second index entry is the secondary key, and the second attribute item is the value;
the second data table has a key-value structure, in which the third index entry is the key and the first attribute item is the value.
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2, and the establishing apparatus further includes:
a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry;
when establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster;
when establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
Optionally, the first partition and the second partition each include M backup areas, where M is greater than or equal to 2;
when establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M backup areas of the first partition of the physical cluster;
when establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M backup areas of the second partition of the physical cluster.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
or, the establishing apparatus further includes: a third establishing unit, configured to establish a third data table on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node.
This application provides an online query apparatus. A first data table and a second data table are established on a physical cluster; the first data table is used for online storage of relation information between a first node and a second node; the second data table is used for online storage of attribute information of the second node; the first node and the second node belong to different node types. The apparatus includes:
a receiving unit, configured to receive an online query request, the online query request being used to indicate an online query for the relation data of the first node;
a query unit, configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the first node;
wherein querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
Optionally, the first data table includes a first index entry and a second index entry; the first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item; the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2, and the apparatus further includes:
a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry;
when accessing online the first data table and the second data table in the physical cluster, the query unit is specifically configured to access online the first data table on the first partition of the physical cluster and the second data table on the second partition.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node; wherein the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request is further used to indicate an online query for the relation data of the third node;
the query unit is further configured to access online the first data table and the second data table in the physical cluster and to query the relation data of the third node;
wherein querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node, and querying, from the second data table, the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator, and the apparatus further includes:
a parsing unit, configured to parse a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction is used to indicate an online query for the relation data of the first node, the second instruction is used to indicate an online query for the relation data of the third node, and the third instruction is used to indicate combination processing of relation data;
when querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the first instruction;
when querying the relation data of the third node, the query unit is specifically configured to query the relation data of the third node based on the second instruction;
the apparatus further includes: a processing unit, configured to perform, based on the third instruction, combination processing on the relation data of the first node and the relation data of the third node.
Optionally, the combination processing includes any one of the following: a union operation, an intersection operation, and a set difference operation.
Optionally, a third data table is also established on the physical cluster, the third data table being used for online storage of relation information between a fifth node and the first node; the online query request is used to indicate an online query for the relation data of the fifth node;
when accessing online the first data table and the second data table in the physical cluster and querying the relation data of the first node, the query unit is specifically configured to access online the first data table, the second data table, and the third data table in the physical cluster, and to query the relation data of the fifth node;
when querying the relation data of the fifth node, the query unit is specifically configured to query, from the third data table, the first node corresponding to the fifth node, and to query the relation data of the first node.
Optionally, the online query request includes a second composite operator, and the apparatus further includes:
a parsing unit, configured to parse a fourth instruction and a fifth instruction from the second composite operator, where the fourth instruction is used to indicate an online query for the node corresponding to the fifth node, and the fifth instruction is used to indicate an online query for the relation data of the node corresponding to the fifth node;
when querying, from the third data table, the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the first node corresponding to the fifth node from the third data table;
when querying the relation data of the first node, the query unit is specifically configured to query the relation data of the first node based on the fifth instruction.
As can be seen from the above technical solution, in the embodiments of this application node types are not distinguished when data tables are established. Therefore, even if the first node and the second node are of different node types, that is, the first data table and the second data table correspond to different node types, both tables are still established on the same physical cluster. As a result, an online query does not need to access multiple physical clusters, which improves online query speed.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a schematic structural diagram of an attribute graph;
Fig. 2 is a schematic diagram of the Unicorn architecture;
Fig. 3 is a schematic flowchart of a method embodiment provided by this application;
Fig. 4 is a schematic structural diagram of an aggregated architecture provided by this application;
Fig. 5 is a schematic flowchart of another method embodiment provided by this application;
Fig. 6 is a schematic structural diagram of an apparatus embodiment provided by this application;
Fig. 7 is a schematic structural diagram of another apparatus embodiment provided by this application.
Detailed description
Relational network data refers to data generated on the Internet in which correspondences exist. For example, data representing purchase relations between users and commodities, and data representing friend relations between users, can both constitute relational network data.
Relational network data can be represented by an attribute graph. For example, the attribute graph shown in Fig. 1 includes two types of node. One type is the user node: user i and user j. The other type is the commodity node: commodity 1 and commodity 2. User j is a friend of user i, and user j has purchased commodity 1 and commodity 2. In addition, the nodes in an attribute graph, and the relations between nodes, generally carry attribute information. For example, the attribute information of user i includes user information such as the name, age, and sex of user i; the attribute information of commodity 1 includes commodity information such as the description, price, and category of commodity 1; the relation between user i and user j carries friend attribute information, such as a friend relation value; and the relations between user j and commodities 1 and 2 carry purchase attribute information, such as the purchase time.
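For readers who prefer a concrete picture, the attribute graph of Fig. 1 could be written down as the following small data structure. This is only an illustrative encoding: the tuple-based node keys, the relation names, and every attribute value are invented for the example.

```python
# Hypothetical encoding of the Fig. 1 attribute graph; all values are invented.
nodes = {
    ("user", "i"): {"name": "Alice", "age": 30},
    ("user", "j"): {"name": "Bob", "age": 28},
    ("commodity", "1"): {"price": 10.0, "category": "book"},
    ("commodity", "2"): {"price": 25.0, "category": "toy"},
}

# Each edge: (source node, relation type, target node, relation attributes).
edges = [
    (("user", "i"), "friend", ("user", "j"), {"friend_relation_value": 0.8}),
    (("user", "j"), "purchased", ("commodity", "1"), {"purchase_time": "2016-09-01"}),
    (("user", "j"), "purchased", ("commodity", "2"), {"purchase_time": "2016-09-02"}),
]


def targets(source, relation):
    """Return the nodes reachable from `source` over edges of type `relation`."""
    return [dst for src, rel, dst, _ in edges if src == source and rel == relation]


print(targets(("user", "j"), "purchased"))
```

Node attributes and relation attributes are kept separately, mirroring the distinction the text draws between attribute information of nodes and of relations.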
In one online storage architecture, the Unicorn architecture, each node type corresponds to a separate physical cluster. For example, as shown in Fig. 2, data table 1 and data table 2 of the user node type are established in the user physical cluster; data table 1 stores the friend lists of user i and user j, and data table 2 stores the commodity purchase lists of user i and user j. Data table 3 of the commodity node type is established in the commodity physical cluster and stores the attribute information and so on of commodities 1 and 2.
To query online the commodities purchased by the friends of user i, the user cluster must be accessed first: data table 1 is queried to find that the friend of user i is user j, and data table 2 is queried to find that the list of commodities purchased by user j is {commodity 1, commodity 2}. The commodity cluster is then accessed, and the attribute information of commodity 1 and commodity 2 is queried from data table 3 as the final online query result. Clearly, this online query process has to access at least two physical clusters, the user cluster and the commodity cluster. Moreover, because parameters such as configuration attributes differ between physical clusters, access is slower, which in turn makes the online query slower.
It can be seen that the Unicorn architecture distinguishes node types when establishing data tables, so the data tables of different node types must be established on different physical clusters. This approach is clearly unsuitable for scenarios with many node types or with frequently added node types. For example, when there are many node types, many physical clusters are needed; and when node types are added frequently, every newly added node type requires a separate physical cluster, which clearly leads to high cost.
The embodiments of this application provide a method for establishing data tables, an online query method, and related apparatus. Node types are not distinguished when data tables are established, and the data tables of different node types are established on the same physical cluster. Data tables are thus established under a more reasonable architecture, so that an online query does not need to access multiple physical clusters and online query speed is improved.
To help a person skilled in the art better understand the technical solutions in this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort shall fall within the scope of protection of this application.
Referring to Fig. 3, an embodiment of the present application provides a method embodiment of the method for establishing data tables. The method of this embodiment includes:
S301: Establish a first data table on a physical cluster, where the first data table is used for online storage of relation information between a first node and a second node.
In the embodiments of the present application, the first node and the second node belong to different node types. For example, the first node may be a user node and the second node may be a commodity node, in which case the first data table stores the relation information between user nodes and commodity nodes, indicating which commodities each user has purchased.
S302: Establish a second data table on the physical cluster, where the second data table is used for online storage of attribute information of the second node.
For example, if the second node is a commodity node, the second data table stores the attribute information of commodity nodes.
As can be seen from the above technical solution, in the embodiments of the present invention node types are no longer distinguished when data tables are established. Therefore, even if the first node and the second node have different node types, that is, even if the first data table and the second data table correspond to different node types, the first data table and the second data table are still stored on the same physical cluster. As a result, multiple physical clusters need not be accessed during an online query, which improves online query speed. For example, if the first node is a user node and the second node is a commodity node, then when the commodities purchased by user i are queried online, only the one physical cluster needs to be accessed: commodity 1 corresponding to user i is queried from the first data table, and the attribute information of commodity 1 is queried from the second data table. In addition, commodity 2 corresponding to user i can also be queried from the first data table, and the attribute information of commodity 2 from the second data table.
In addition, a large number of node types does not require a large number of physical clusters, and adding a node type does not require deploying another physical cluster. Instead, the data table of the newly added node type can be established on the same physical cluster, which saves cost.
The first data table and the second data table may each be represented by index entries and attribute items. Each is described below.
The first data table may include a first index entry and a second index entry. The first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node. The first data table may further include a second attribute item, which is used for online storage of attribute information of the correspondence between the first node and the second node.
For example, as shown in Table 1, the first data table may have a key-key-value structure, in which the first index entry is the primary key information and stores the identifier of user j, the second index entry is the secondary key information and stores the identifiers of the commodities purchased by user j, and the second attribute item is the value information and stores the time at which user j purchased each commodity.
Table 1
Primary key | Secondary key | value |
User j | Commodity 1 | 2016.5.3 |
User j | Commodity 2 | 2016.7.9 |
When the data in Table 1 is queried online, the secondary key and/or the value can be queried by the primary key, or the value can be queried by the primary key together with the secondary key. For example, a query for the list of commodities purchased by user j returns commodity 1 and commodity 2.
The second data table may include a third index entry and a first attribute item. The third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
For example, as shown in Table 2, the second data table may have a key-value structure, in which the third index entry is the key information and stores the identifier of a commodity, and the first attribute item is the value information and stores the attribute information of the commodity, such as the description, price, and category of commodity 1.
Table 2
key | value |
Commodity 1 | The attribute information of commodity 1 |
Commodity 2 | The attribute information of commodity 2 |
When the data in Table 2 is queried online, the value can be queried by the key. For example, the attribute information of commodity 1 can be queried.
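As a rough illustration of the two structures, Table 1 and Table 2 can be modelled with in-memory dictionaries. This is only a sketch: the names `table1`, `table2`, `secondary_keys`, and `lookup_value` are hypothetical, and a real physical cluster would use a distributed store rather than Python dictionaries.

```python
# Table 1 (key-key-value): primary key -> secondary key -> value.
table1 = {
    "user j": {"commodity 1": "2016.5.3", "commodity 2": "2016.7.9"},
}

# Table 2 (key-value): key -> value.
table2 = {
    "commodity 1": "attribute information of commodity 1",
    "commodity 2": "attribute information of commodity 2",
}


def secondary_keys(table, primary_key):
    """Query the secondary keys stored under a primary key."""
    return list(table.get(primary_key, {}))


def lookup_value(table, primary_key, secondary_key):
    """Query the value stored under a (primary key, secondary key) pair."""
    return table.get(primary_key, {}).get(secondary_key)


purchased = secondary_keys(table1, "user j")              # commodities of user j
purchase_time = lookup_value(table1, "user j", "commodity 1")
attributes = table2.get("commodity 1")                    # key -> value lookup
```

The nested dictionary mirrors the primary key / secondary key / value columns of Table 1, so both query styles described above reduce to one or two dictionary lookups.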
In the embodiments of the present application, the physical cluster may also store further data in addition to the data described above. This is described below.
In addition to the correspondence between the first node and the second node, the first data table may also store online the correspondence between a third node and a fourth node, where the third node and the first node belong to the same node type and the fourth node and the second node belong to the same node type. That is, the first data table can be used for online storage of the correspondence between nodes of a first type and nodes of a second type. For example, as shown in Table 3, the first data table also stores the identifier of the commodity purchased by user i and the time at which user i purchased it.
Table 3
Primary key | Secondary key | value |
User j | Commodity 1 | 2016.5.3 |
User j | Commodity 2 | 2016.7.9 |
User i | Commodity 3 | 2016.1.11 |
In addition to the attribute information of the second node, the second data table may also store online the attribute information of other nodes, such as the attribute information of the fourth node. That is, the second data table can be used for online storage of the attribute information of nodes of the second type. For example, as shown in Table 4, the second data table also stores the attribute information of commodity 3.
Table 4
key | value |
Commodity 1 | The attribute information of commodity 1 |
Commodity 2 | The attribute information of commodity 2 |
Commodity 3 | The attribute information of commodity 3 |
In addition to the first data table and the second data table, other data tables, such as a third data table, may also be established on the physical cluster. The third data table is used for online storage of relation information between a fifth node and the first node. The fifth node may belong to the same node type as the first node. For example, as shown in Table 5, the fifth node and the first node are both user nodes, and the third data table stores the friend list of user i, namely user j, together with the friend relation value between user i and user j. The fifth node may also belong to a different node type from the first node.
Table 5
Primary key | Secondary key | value |
User i | User j | 79 |
It should be noted that the embodiments of the present application place no limitation on the structure of the data tables established on the physical cluster or on the node type corresponding to each data table.
In the embodiments of the present application, the physical cluster may, as shown in Fig. 4, be provided with N partitions, where N is greater than or equal to 2. When the data table of a node is established, the corresponding partition is determined according to the data of the index entry of the data table, and the data table is established in that partition. This is described in detail below.
The method may further include: determining a first partition from the N storage partitions according to the first index entry, for example determining partition 1 by a hash algorithm according to the value of the primary key in Table 1. S301 then includes: establishing the first data table on the first partition of the physical cluster.
The method may further include: determining a second partition from the N storage partitions according to the third index entry, for example determining partition 2 by a hash algorithm according to the value of the key in Table 2. S302 then includes: establishing the second data table on the second partition of the physical cluster. In the embodiments of the present invention, the same data table may also be established across multiple partitions. For example, a partition is determined for each data item in the data table according to the value of its primary key, and that data item of the data table is stored in the determined partition.
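The per-item partitioning described above can be sketched as follows. The source only states that "a hash algorithm" is used, so the choice of SHA-256, the value `N = 4`, and the names `partition_for` and `put_relation` are all assumptions made for illustration.

```python
import hashlib

N = 4  # number of partitions (assumed; the text only requires N >= 2)


def partition_for(index_value, n=N):
    """Map an index entry value to a partition number in [0, n)."""
    digest = hashlib.sha256(index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


# Each partition holds its own slice of the key-key-value table.
partitions = [dict() for _ in range(N)]


def put_relation(primary_key, secondary_key, value):
    """Store one key-key-value item in the partition chosen by its primary key."""
    part = partitions[partition_for(primary_key)]
    part.setdefault(primary_key, {})[secondary_key] = value


# Rows of Table 3.
put_relation("user j", "commodity 1", "2016.5.3")
put_relation("user j", "commodity 2", "2016.7.9")
put_relation("user i", "commodity 3", "2016.1.11")
```

Because the partition depends only on the primary key, all items of one user land in the same partition, and a later query for that user can compute the same hash to find them.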
In the distributed physical cluster shown in Fig. 4, each partition includes M backup regions (replicas), where M is greater than or equal to 2. The data stored in the backup regions of the same partition is consistent, which both supports simultaneous online storage and online query by multiple devices and provides data backup. Specifically, the first partition and the second partition each include M backup regions. Establishing the first data table on the first partition of the physical cluster includes: establishing the first data table on each of the M backup regions of the first partition of the physical cluster. Establishing the second data table on the second partition of the physical cluster includes: establishing the second data table on each of the M backup regions of the second partition of the physical cluster.
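A minimal sketch of one partition with M backup regions is given below. The source does not specify a replication protocol, so writing synchronously to every backup region and reading from a randomly chosen one are assumptions, as are the class name `Partition` and the value `M = 3`.

```python
import random

M = 3  # backup regions per partition (assumed; the text only requires M >= 2)


class Partition:
    """One partition whose M backup regions hold identical data."""

    def __init__(self, m=M):
        self.replicas = [dict() for _ in range(m)]

    def put(self, key, value):
        # Write to every backup region so that all copies stay consistent.
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        # Any backup region can serve a read because the copies are identical.
        return random.choice(self.replicas).get(key)


second_partition = Partition()
second_partition.put("commodity 1", "attribute information of commodity 1")
value = second_partition.get("commodity 1")
```

Spreading reads over the backup regions is one way several devices could query the same partition at the same time, while the extra copies double as a backup.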
Corresponding to the above embodiments of the method for establishing data tables, the present application also provides a method embodiment for performing online queries on the data tables. This is described in detail below.
Referring to Fig. 5, the present application provides a method embodiment of an online query method.
In this embodiment, a first data table and a second data table are established on a physical cluster. The first data table is used for online storage of relation information between a first node and a second node, and the second data table is used for online storage of attribute information of the second node, where the first node and the second node belong to different node types. It should be noted that this embodiment can be used to perform online queries on the first data table and the second data table established by any of the above method embodiments.
The method of this embodiment includes:
S501: Receive an online query request, where the online query request is used to indicate an online query of relation data of the first node.
For example, when the commodities purchased by a user need to be queried online, the first node may be a user node, and the relation data of the first node may be the attribute information of the commodity nodes corresponding to the user node.
S502: Access the first data table and the second data table in the physical cluster online, and query the relation data of the first node.
Querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
For example, if the first node is a user node and the second node is a commodity node, then when the commodities purchased by user i are queried online, commodity 1 corresponding to user i is queried from the first data table, and the attribute information of commodity 1 is queried from the second data table.
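The two lookups of S502 can be sketched as a single function over the two tables. The function name `query_relation_data` and the dictionary representation are hypothetical; the sample data is taken from the rows of Table 1 and Table 2.

```python
def query_relation_data(first_node, relation_table, attribute_table):
    """Return {second node: attribute information} for one first node."""
    result = {}
    # Step 1: the first data table gives the second nodes related to the first node.
    for second_node in relation_table.get(first_node, {}):
        # Step 2: the second data table gives each second node's attribute information.
        result[second_node] = attribute_table.get(second_node)
    return result


# First data table (Table 1) and second data table (Table 2) on the same cluster.
relation_table = {
    "user j": {"commodity 1": "2016.5.3", "commodity 2": "2016.7.9"},
}
attribute_table = {
    "commodity 1": "attribute information of commodity 1",
    "commodity 2": "attribute information of commodity 2",
}

answer = query_relation_data("user j", relation_table, attribute_table)
```

Both steps read from tables held by the same physical cluster, which is the point of the scheme: no second cluster has to be contacted between step 1 and step 2.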
The first data table and the second data table may each be represented by index entries and attribute items. Each is described below.
The first data table may include a first index entry and a second index entry. The first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node. The first data table may further include a second attribute item, which is used for online storage of attribute information of the correspondence between the first node and the second node. For example, as shown in Table 1, the first data table may have a key-key-value structure. When the data in Table 1 is queried online, the secondary key and/or the value can be queried by the primary key, or the value can be queried by the primary key together with the secondary key. For example, a query on Table 1 for the list of commodities purchased by user j returns commodity 1 and commodity 2.
The second data table may include a third index entry and a first attribute item. The third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node. For example, as shown in Table 2, the second data table may have a key-value structure. When the data in Table 2 is queried online, the value can be queried by the key. For example, the attribute information of commodity 1 is queried from Table 2.
In the embodiments of the present application, the physical cluster may, as shown in Fig. 4, be provided with N partitions, where N is greater than or equal to 2. Therefore, when a data table is queried, the query is performed in the partition that corresponds to the index entry of the data table. This is described in detail below.
The method may further include: determining a first partition from the N storage partitions according to the first index entry, for example determining partition 1 by a hash algorithm according to the values of the primary key in Table 1; and determining a second partition from the N storage partitions according to the second index entry, for example determining partition 2 by a hash algorithm according to the values of the key in Table 2. In S502, accessing the first data table and the second data table in the physical cluster online includes: accessing online the first data table on the first partition and the second data table on the second partition in the physical cluster.
In the distributed physical cluster shown in Fig. 4, each partition includes M backup regions, where M is greater than or equal to 2. The data stored in the backup regions of the same partition is consistent, which both supports simultaneous online storage and online query by multiple devices and provides data backup. Therefore, accessing the first data table online may mean accessing the first data table on one or more backup regions of the first partition. Similarly, accessing the second data table online may mean accessing the second data table on one or more backup regions of the second partition.
In the embodiments of the present application, the physical cluster may also store further data in addition to the data described above, and this data can also be queried online. This is described below.
In addition to the correspondence between the first node and the second node, the first data table may also store online the correspondence between a third node and a fourth node, where the third node and the first node belong to the same node type and the fourth node and the second node belong to the same node type. For example, as shown in Table 3, the first data table also stores the identifier of the commodity purchased by user i and the time at which user i purchased it. In addition to the attribute information of the second node, the second data table may also store online the attribute information of other nodes, such as the attribute information of the fourth node. For example, as shown in Table 4, the second data table also stores the attribute information of commodity 3.
The online query request is further used to indicate an online query of relation data of the third node. The method may further include: accessing the first data table and the second data table in the physical cluster online, and querying the relation data of the third node. Querying the relation data of the third node includes: querying, from the first data table, the fourth node corresponding to the third node, and querying, from the second data table, the attribute information of the fourth node. For example, commodity 3 purchased by user i is queried from Table 3, and the attribute information of commodity 3 is queried from Table 4.
In addition to the first data table and the second data table, other data tables, such as a third data table, may also be established on the physical cluster. The third data table is used for online storage of relation information between a fifth node and the first node. The third data table may be as shown in Table 5.
When the online query request is used to indicate an online query of relation data of the fifth node, accessing the first data table and the second data table in the physical cluster online and querying the relation data of the first node includes: accessing the first data table, the second data table, and the third data table in the physical cluster online, and querying the relation data of the fifth node. Querying the relation data of the fifth node includes: querying, from the third data table, the first node corresponding to the fifth node, and querying the relation data of the first node. For example, when the commodities purchased by the friends of user i need to be queried, the friend of user i is found from Table 5 to be user j, commodity 1 purchased by user j is queried from Table 1, and the attribute information of commodity 1 is queried from Table 2.
In the embodiments of the present application, three different query operators are defined to implement the online query function. Using these three query operators alone or in combination enables fast queries. Each is described below.
The first operator is the simple query operator (which may be called the AtomicSearch operator). It is used to query, from a data table, the index entry or attribute value corresponding to a given index entry. For example, given the key "user j", commodity 1 and commodity 2 corresponding to user j can be queried from Table 1. As another example, given the key "commodity 1", the attribute information of commodity 1 can be queried from Table 2.
The second operator is the composite operation operator. It is used to query, from a data table, the index entry or attribute value corresponding to a given index entry, and to perform an operation on the queried data.
The first data table is further used for online storage of relation information between the third node and the fourth node. For example, Table 3 also stores the identifier of the commodity purchased by user i, namely commodity 3. The second data table is further used for online storage of attribute information of the fourth node. For example, Table 4 also stores the attribute information of commodity 3.
The online query request includes a first composite operator, and the method further includes:
parsing a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction is used to indicate an online query of the relation data of the first node, the second instruction is used to indicate an online query of the relation data of the third node, and the third instruction is used to indicate that the relation data is to be combined.
Querying the relation data of the first node includes: querying the relation data of the first node based on the first instruction. For example, based on the first instruction, the commodities purchased by user i are queried: commodity 1 and commodity 2.
Querying the relation data of the third node includes: querying the relation data of the third node based on the second instruction. For example, based on the second instruction, the commodity purchased by user j is queried: commodity 3.
The method further includes: combining the relation data of the first node and the relation data of the third node based on the third instruction. The combining may include any one of the following: a merge (union) operation, an intersection operation, and a set difference operation. For example, the commodities purchased by user i and the commodities purchased by user j can be merged, giving the final result commodity 1, commodity 2, and commodity 3. As another example, the intersection of the commodities purchased by user i and the commodities purchased by user j can be taken, giving an empty final result. As yet another example, the set difference of the commodities purchased by user i and the commodities purchased by user j can be taken, that is, the commodities that user i has purchased but user j has not, giving the final result commodity 1 and commodity 2.
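The three combining operations map directly onto set operations. The sketch below uses the purchases stated in the example above (user i: commodities 1 and 2; user j: commodity 3); the function name `composite_operation` and the operation names `"merge"`, `"intersect"`, and `"difference"` are hypothetical and do not come from the source.

```python
# Purchases as stated in the example above.
purchases = {
    "user i": {"commodity 1", "commodity 2"},
    "user j": {"commodity 3"},
}


def composite_operation(table, first_key, second_key, op):
    """Run two simple queries, then combine their results with a set operation."""
    left = table.get(first_key, set())    # first instruction
    right = table.get(second_key, set())  # second instruction
    # Third instruction: how to combine the two result sets.
    if op == "merge":
        return left | right
    if op == "intersect":
        return left & right
    if op == "difference":
        return left - right
    raise ValueError(f"unknown operation: {op}")


merged = composite_operation(purchases, "user i", "user j", "merge")
common = composite_operation(purchases, "user i", "user j", "intersect")
only_i = composite_operation(purchases, "user i", "user j", "difference")
```

Here `merged` holds all three commodities, `common` is empty, and `only_i` holds commodity 1 and commodity 2, matching the three results given in the text.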
The third operator is the composite transfer operator. It is used to take the result queried from one data table and use it as the index entry for a query in another data table.
A third data table is also established on the physical cluster, and the third data table is used for online storage of relation information between a fifth node and the first node. For example, as shown in Table 5, the fifth node and the first node are both user nodes, and the third data table stores the friend list of user i, namely user j, together with the friend relation value between user i and user j.
The online query request includes a second composite operator, and the method further includes:
parsing a fourth instruction and a fifth instruction from the second composite operator, where the fourth instruction is used to indicate an online query of the node corresponding to the fifth node, and the fifth instruction is used to indicate an online query of the relation data of the node corresponding to the fifth node.
Querying, from the third data table, the first node corresponding to the fifth node includes: querying, from the third data table and based on the fourth instruction, the first node corresponding to the fifth node. Querying the relation data of the first node includes: querying the relation data of the first node based on the fifth instruction. For example, based on the fourth instruction, the friend of user i is found from Table 5 to be user j. User j is then used as the index entry for the query in Table 1, commodity 1 purchased by user j is queried from Table 1, and the attribute information of commodity 1 is queried from Table 2.
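The chaining behaviour of the composite transfer operator can be sketched as a loop that feeds the results of one lookup into the next table. The function name `composite_transfer` and the dictionary representation are hypothetical; the data is taken from Tables 5, 1, and 2.

```python
table5 = {"user i": {"user j": 79}}  # friend relations (Table 5)
table1 = {"user j": {"commodity 1": "2016.5.3", "commodity 2": "2016.7.9"}}
table2 = {
    "commodity 1": "attribute information of commodity 1",
    "commodity 2": "attribute information of commodity 2",
}


def composite_transfer(start_key, *tables):
    """Chain lookups: the results from one table are the keys for the next."""
    keys = [start_key]
    for table in tables:
        next_keys = []
        for key in keys:
            entry = table.get(key)
            if isinstance(entry, dict):
                next_keys.extend(entry)   # key-key-value: pass on the secondary keys
            elif entry is not None:
                next_keys.append(entry)   # key-value: pass on the value
        keys = next_keys
    return keys


friends_goods = composite_transfer("user i", table5, table1)
friends_goods_info = composite_transfer("user i", table5, table1, table2)
```

With the full rows of Table 1, the chain returns both commodities purchased by user j; the text's example follows only commodity 1 through the same path.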
In the embodiments of the present application, the storage architecture may consist of two parts: an offline cluster and an online cluster. The offline cluster can be divided into an HDFS (Hadoop Distributed File System) Index Build Cluster (index cluster) and a Real-Time Stream Process Cluster (real-time stream processing cluster). The HDFS Index Build Cluster is mainly used to convert the attribute graph, in an efficient batch manner, into data tables with key-value and key-key-value structures, and to synchronize the data tables to the online cluster. The Real-Time Stream Process Cluster is mainly used to process real-time update messages and send them to the online cluster; this cluster can process messages with latency on the order of seconds.
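The conversion from an attribute graph to the two table structures can be sketched in memory as follows. The source does not describe how the HDFS Index Build Cluster performs this conversion, so the input format (node and edge tuples) and the function name `build_tables` are assumptions; a real batch job would run over distributed storage rather than Python lists.

```python
def build_tables(nodes, edges):
    """Convert an attribute graph into a key-value and a key-key-value table.

    nodes: iterable of (node_id, attributes)
    edges: iterable of (source_id, target_id, edge_attributes)
    """
    kv_table = {}   # node id -> node attributes
    kkv_table = {}  # source id -> target id -> edge attributes
    for node_id, attributes in nodes:
        kv_table[node_id] = attributes
    for source_id, target_id, edge_attributes in edges:
        kkv_table.setdefault(source_id, {})[target_id] = edge_attributes
    return kv_table, kkv_table


# Commodity nodes (Table 4) and purchase edges (Table 3).
nodes = [
    ("commodity 1", "attribute information of commodity 1"),
    ("commodity 2", "attribute information of commodity 2"),
    ("commodity 3", "attribute information of commodity 3"),
]
edges = [
    ("user j", "commodity 1", "2016.5.3"),
    ("user j", "commodity 2", "2016.7.9"),
    ("user i", "commodity 3", "2016.1.11"),
]

kv_table, kkv_table = build_tables(nodes, edges)
```

In this reading, node attributes become key-value rows and edges become key-key-value rows, which reproduces Table 4 and Table 3 respectively.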
The online cluster may be the physical cluster of any of the above embodiments. The physical cluster includes a Proxy sub-cluster and a Search (query) sub-cluster.
The Proxy sub-cluster is mainly responsible for receiving the online query requests input by users, executing the queries, and returning the final query results to the users. While executing a query, the Proxy sub-cluster sends requests to the Search sub-cluster to obtain data from the data tables with key-value and key-key-value structures. The composite operation operator and the composite transfer operator described above are both executed in the Proxy sub-cluster. The Search sub-cluster is mainly responsible for loading the data tables with key-value and key-key-value structures and for updating the table contents according to the update messages sent by the Real-Time Stream Process Cluster. Another important function of the Search sub-cluster is to receive the online query requests sent by the Proxy sub-cluster, perform the data query operations according to those requests, and return the query results to the Proxy sub-cluster.
The Proxy sub-cluster includes at least three layers:
Service access layer: this layer mainly receives online query requests, converts them into a form that the execution core layer can recognize, and sends them to the execution core layer. It also converts the results returned by the execution core layer into the return format expected by the user.
Execution core layer: this layer implements the online query. Specifically, it parses and validates the online query request, generates a query plan and sends it to the data acquisition layer, and finally returns the query result to the service access layer.
Data acquisition layer: this layer is mainly responsible for communicating with the Search sub-cluster. It forwards the online query requests sent by the execution core layer to the Search sub-cluster and returns the results from the Search sub-cluster to the execution core layer.
The Search sub-cluster includes at least two layers:
Storage management layer: this layer is mainly responsible for loading and updating the data tables with key-value and key-key-value structures.
Query layer: this layer is mainly responsible for receiving the online query requests issued by the Proxy sub-cluster, performing the queries according to those requests, and returning the query results to the Proxy sub-cluster. This layer can also perform operations such as filtering and sorting on the query results.
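The division of work between the two sub-clusters can be sketched with two small classes, one method per layer. All class, method, and table names below are hypothetical, the request format is assumed, and both "sub-clusters" run in a single process purely for illustration.

```python
class SearchSubCluster:
    """Loads the data tables and answers simple lookups."""

    def __init__(self):
        self.tables = {}

    def load(self, name, table):
        # Storage management layer: load (or replace) a data table.
        self.tables[name] = table

    def query(self, name, key):
        # Query layer: look up one key in one table.
        return self.tables[name].get(key)


class ProxySubCluster:
    """Receives requests, plans the lookups, and returns the final result."""

    def __init__(self, search):
        self.search = search

    def fetch(self, name, key):
        # Data acquisition layer: forward a lookup to the Search sub-cluster.
        return self.search.query(name, key)

    def execute(self, user):
        # Execution core layer: relation lookup followed by attribute lookups.
        goods = self.fetch("relations", user) or {}
        return {g: self.fetch("attributes", g) for g in goods}

    def handle(self, request):
        # Service access layer: unpack the request and shape the response.
        return {"user": request["user"], "goods": self.execute(request["user"])}


search = SearchSubCluster()
search.load("relations", {"user j": {"commodity 1": "2016.5.3"}})
search.load("attributes", {"commodity 1": "attribute information of commodity 1"})
response = ProxySubCluster(search).handle({"user": "user j"})
```

Keeping the multi-step logic in the Proxy class and only single-key lookups in the Search class mirrors the statement that the composite operators are executed in the Proxy sub-cluster.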
Corresponding to the above method embodiments, the present application also provides corresponding apparatus embodiments, which are described in detail below.
Referring to Fig. 6, the present application provides an embodiment of an apparatus for establishing data tables. The apparatus of this embodiment includes a first establishing unit 601 and a second establishing unit 602.
The first establishing unit 601 is configured to establish a first data table on a physical cluster, where the first data table is used for online storage of relation information between a first node and a second node.
The second establishing unit 602 is configured to establish a second data table on the physical cluster, where the second data table is used for online storage of attribute information of the second node.
The first node and the second node belong to different node types.
Optionally, the first data table includes a first index entry and a second index entry, where the first index entry is used for online storage of the identifier of the first node, and the second index entry is used for online storage of the identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item, where the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the first data table further includes a second attribute item, which is used for online storage of attribute information of the correspondence between the first node and the second node.
Optionally, the first data table has a key-key-value structure, where the first index entry is primary key information, the second index entry is secondary key information, and the second attribute item is value information;
the second data table has a key-value structure, where the third index entry is key information and the first attribute item is value information.
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2, and the establishing apparatus further includes:
a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry;
when establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster;
when establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
Optionally, the first partition and the second partition each include M backup regions, where M is greater than or equal to 2;
when establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M backup regions of the first partition of the physical cluster;
when establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M backup regions of the second partition of the physical cluster.
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type;
or, the establishing apparatus further includes a third establishing unit, configured to establish a third data table on the physical cluster, where the third data table is used for online storage of relation information between a fifth node and the first node.
Referring to Fig. 7, the present application provides an apparatus embodiment of an online query apparatus. In this embodiment, a first data table and a second data table are established on a physical cluster. The first data table is used for online storage of relation information between a first node and a second node, and the second data table is used for online storage of attribute information of the second node, where the first node and the second node belong to different node types.
The apparatus of this embodiment includes a receiving unit 701 and a query unit 702.
The receiving unit 701 is configured to receive an online query request, where the online query request is used to indicate an online query of relation data of the first node.
The query unit 702 is configured to access the first data table and the second data table in the physical cluster online, and to query the relation data of the first node.
Querying the relation data of the first node includes: querying, from the first data table, the second node corresponding to the first node, and querying, from the second data table, the attribute information of the second node.
Optionally, the first data table includes a first index entry and a second index entry, where the first index entry is used for online storage of an identifier of the first node, and the second index entry is used for online storage of an identifier of the second node corresponding to the first node;
the second data table includes a third index entry and a first attribute item, where the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
Optionally, the physical cluster includes N storage partitions, where N is greater than or equal to 2. The apparatus further includes:
a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry.
When accessing, online, the first data table and the second data table in the physical cluster, the query unit is specifically configured to access, online, the first data table on the first partition of the physical cluster and the second data table on the second partition.
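The text does not state how a partition is derived from an index entry. One common possibility, shown here purely as an assumption, is to hash the identifier stored in the index entry and take the result modulo N. The function name `determine_partition` and the use of CRC-32 are illustrative choices, not part of the application.

```python
import zlib


def determine_partition(index_value: str, n: int) -> int:
    """Map the identifier held in an index entry to one of N storage partitions.

    Assumption: a stable hash (CRC-32) modulo N. The application only says the
    partition is determined "according to" the index entry.
    """
    if n < 2:
        raise ValueError("N must be greater than or equal to 2")
    return zlib.crc32(index_value.encode("utf-8")) % n


N = 4
# First partition: derived from the first-node identifier (first index entry).
first_partition = determine_partition("user:1", N)
# Second partition: derived from the second-node identifier (second index entry).
second_partition = determine_partition("item:10", N)
```

Under this assumed scheme the same second-node identifier always maps to the same partition, whether it is read from the second index entry at query time or from the third index entry when the table is established, because both entries hold the identifier of the second node.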
Optionally, the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type. The online query request further indicates that relation data of the third node is to be queried online.
The query unit is further configured to access, online, the first data table and the second data table in the physical cluster and to obtain the relation data of the third node.
Obtaining the relation data of the third node includes: querying the first data table for the fourth node corresponding to the third node, and querying the second data table for the attribute information of the fourth node.
Optionally, the online query request includes a first composite operator. The apparatus further includes:
a parsing unit, configured to parse a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction indicates online querying of the relation data of the first node, the second instruction indicates online querying of the relation data of the third node, and the third instruction indicates that comprehensive processing is to be performed on the relation data.
When obtaining the relation data of the first node, the query unit is specifically configured to obtain the relation data of the first node based on the first instruction.
When obtaining the relation data of the third node, the query unit is specifically configured to obtain the relation data of the third node based on the second instruction.
The apparatus further includes a processing unit, configured to perform, based on the third instruction, comprehensive processing on the relation data of the first node and the relation data of the third node.
Optionally, the comprehensive processing includes any one of the following: merge (union) processing, an intersection operation, and a set-difference operation.
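The way a composite operator decomposes into two query instructions and one combining instruction can be sketched as follows. The application does not define a concrete operator syntax, so the tuple form `(kind, first_node_id, third_node_id)`, the instruction tuples, and all names below are hypothetical. For brevity the relation data is reduced to sets of related node identifiers.

```python
# Hypothetical relation data: node ID -> set of related node IDs.
RELATIONS = {
    "user:1": {"item:10", "item:11"},
    "user:2": {"item:11", "item:12"},
}

# The three kinds of comprehensive processing named in the text.
COMBINERS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}


def parse_composite_operator(operator: tuple) -> tuple:
    """Split an assumed (kind, first_id, third_id) operator into three instructions."""
    kind, first_id, third_id = operator
    if kind not in COMBINERS:
        raise ValueError(f"unknown comprehensive processing: {kind}")
    first_instruction = ("query", first_id)   # query the first node
    second_instruction = ("query", third_id)  # query the third node
    third_instruction = ("combine", kind)     # combine the two results
    return first_instruction, second_instruction, third_instruction


def execute(operator: tuple) -> set:
    """Run both queries, then apply the comprehensive processing."""
    first, second, third = parse_composite_operator(operator)
    left = RELATIONS.get(first[1], set())
    right = RELATIONS.get(second[1], set())
    return COMBINERS[third[1]](left, right)


common = execute(("intersection", "user:1", "user:2"))
```

Here `common` holds the nodes related to both `user:1` and `user:2`; swapping the kind for `"union"` or `"difference"` selects the other two operations.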
Optionally, a third data table is also established on the physical cluster, where the third data table is used for online storage of relation information between a fifth node and the first node. The online query request indicates that relation data of the fifth node is to be queried online.
When accessing, online, the first data table and the second data table in the physical cluster and obtaining the relation data of the first node, the query unit is specifically configured to access, online, the first data table, the second data table, and the third data table in the physical cluster and to obtain the relation data of the fifth node.
When obtaining the relation data of the fifth node, the query unit is specifically configured to query the third data table for the first node corresponding to the fifth node and to obtain the relation data of that first node.
Optionally, the online query request includes a second composite operator. The apparatus further includes:
a parsing unit, configured to parse a fourth instruction and a fifth instruction from the first composite operator, where the fourth instruction indicates online querying of the node corresponding to the fifth node, and the fifth instruction indicates online querying of the relation data of the node corresponding to the fifth node.
When querying the third data table for the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the third data table for the first node corresponding to the fifth node.
When obtaining the relation data of the first node, the query unit is specifically configured to obtain the relation data of the first node based on the fifth instruction.
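The chained query driven by the fourth and fifth instructions can be illustrated with a small Python sketch. It assumes in-memory dictionaries in place of the three data tables; the node identifiers (`group:7`, `user:1`, `item:10`) and the function names are hypothetical and chosen only to mirror the instruction names in the text.

```python
# Third data table: fifth-node ID -> first-node IDs related to it.
third_data_table = {"group:7": ["user:1", "user:2"]}
# First data table: first-node ID -> second-node IDs related to it.
first_data_table = {"user:1": ["item:10"], "user:2": ["item:11"]}
# Second data table: second-node ID -> attribute information.
second_data_table = {
    "item:10": {"title": "keyboard"},
    "item:11": {"title": "mouse"},
}


def fourth_instruction(fifth_node_id: str) -> list:
    """Find the first nodes corresponding to the fifth node (third data table)."""
    return third_data_table.get(fifth_node_id, [])


def fifth_instruction(first_node_ids: list) -> dict:
    """Obtain the relation data of each first node (first and second data tables)."""
    return {
        first_id: {
            second_id: second_data_table.get(second_id, {})
            for second_id in first_data_table.get(first_id, [])
        }
        for first_id in first_node_ids
    }


def query_fifth_node(fifth_node_id: str) -> dict:
    """Chain the two instructions: fifth node -> first nodes -> relation data."""
    return fifth_instruction(fourth_instruction(fifth_node_id))


relation_data = query_fifth_node("group:7")
```

The output of the fourth instruction feeds directly into the fifth, so a single request traverses all three tables.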
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other ways of dividing, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
As described above, the foregoing embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.
Claims (30)
- 1. A method for establishing data tables, characterized by comprising: establishing a first data table on a physical cluster, where the first data table is used for online storage of relation information between a first node and a second node; and establishing a second data table on the physical cluster, where the second data table is used for online storage of attribute information of the second node; where the first node and the second node belong to different node types.
- 2. The establishing method according to claim 1, characterized in that the first data table includes a first index entry and a second index entry, the first index entry is used for online storage of an identifier of the first node, and the second index entry is used for online storage of an identifier of the second node corresponding to the first node; and the second data table includes a third index entry and a first attribute item, the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
- 3. The establishing method according to claim 2, characterized in that the first data table further includes a second attribute item, and the second attribute item is used for online storage of attribute information of the correspondence between the first node and the second node.
- 4. The establishing method according to claim 3, characterized in that the first data table has a key-key-value structure, in which the first index entry is primary key information, the second index entry is secondary key information, and the second attribute item is value information; and the second data table has a key-value structure, in which the third index entry is key information and the first attribute item is value information.
- 5. The establishing method according to any one of claims 2 to 4, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing method further includes: determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the third index entry; establishing the first data table on a physical cluster includes: establishing the first data table on the first partition of the physical cluster; and establishing the second data table on the physical cluster includes: establishing the second data table on the second partition of the physical cluster.
- 6. The establishing method according to claim 5, characterized in that the first partition and the second partition each include M backup regions, M being greater than or equal to 2; establishing the first data table on the first partition of the physical cluster includes: establishing the first data table on each of the M backup regions of the first partition of the physical cluster; and establishing the second data table on the second partition of the physical cluster includes: establishing the second data table on each of the M backup regions of the second partition of the physical cluster.
- 7. The establishing method according to claim 1, characterized in that the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; or the method further includes: establishing a third data table on the physical cluster, where the third data table is used for online storage of relation information between a fifth node and the first node.
- 8. An online query method, characterized in that a first data table and a second data table are established on a physical cluster, the first data table is used for online storage of relation information between a first node and a second node, the second data table is used for online storage of attribute information of the second node, and the first node and the second node belong to different node types; the method includes: receiving an online query request, where the online query request indicates online querying of relation data of the first node; and accessing, online, the first data table and the second data table in the physical cluster, and obtaining the relation data of the first node; where obtaining the relation data of the first node includes: querying the first data table for the second node corresponding to the first node, and querying the second data table for the attribute information of the second node.
- 9. The online query method according to claim 8, characterized in that the first data table includes a first index entry and a second index entry, the first index entry is used for online storage of an identifier of the first node, and the second index entry is used for online storage of an identifier of the second node corresponding to the first node; and the second data table includes a third index entry and a first attribute item, the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
- 10. The online query method according to claim 9, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the method further includes: determining a first partition from the N storage partitions according to the first index entry, and determining a second partition from the N storage partitions according to the second index entry; and accessing, online, the first data table and the second data table in the physical cluster includes: accessing, online, the first data table on the first partition of the physical cluster and the second data table on the second partition.
- 11. The online query method according to claim 9, characterized in that the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request further indicates online querying of relation data of the third node; the method further includes: accessing, online, the first data table and the second data table in the physical cluster, and obtaining the relation data of the third node; where obtaining the relation data of the third node includes: querying the first data table for the fourth node corresponding to the third node, and querying the second data table for the attribute information of the fourth node.
- 12. The online query method according to claim 11, characterized in that the online query request includes a first composite operator; the method further includes: parsing a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction indicates online querying of the relation data of the first node, the second instruction indicates online querying of the relation data of the third node, and the third instruction indicates that comprehensive processing is to be performed on the relation data; obtaining the relation data of the first node includes: obtaining the relation data of the first node based on the first instruction; obtaining the relation data of the third node includes: obtaining the relation data of the third node based on the second instruction; and the method further includes: performing, based on the third instruction, comprehensive processing on the relation data of the first node and the relation data of the third node.
- 13. The online query method according to claim 12, characterized in that the comprehensive processing includes any one of the following: merge (union) processing, an intersection operation, and a set-difference operation.
- 14. The online query method according to claim 9, characterized in that a third data table is also established on the physical cluster, and the third data table is used for online storage of relation information between a fifth node and the first node; the online query request indicates online querying of relation data of the fifth node; accessing, online, the first data table and the second data table in the physical cluster and obtaining the relation data of the first node includes: accessing, online, the first data table, the second data table, and the third data table in the physical cluster, and obtaining the relation data of the fifth node; and obtaining the relation data of the fifth node includes: querying the third data table for the first node corresponding to the fifth node, and obtaining the relation data of the first node.
- 15. The online query method according to claim 14, characterized in that the online query request includes a second composite operator; the method further includes: parsing a fourth instruction and a fifth instruction from the first composite operator, where the fourth instruction indicates online querying of the node corresponding to the fifth node, and the fifth instruction indicates online querying of the relation data of the node corresponding to the fifth node; querying the third data table for the first node corresponding to the fifth node includes: querying, based on the fourth instruction, the third data table for the first node corresponding to the fifth node; and obtaining the relation data of the first node includes: obtaining the relation data of the first node based on the fifth instruction.
- 16. An apparatus for establishing data tables, characterized by comprising: a first establishing unit, configured to establish a first data table on a physical cluster, where the first data table is used for online storage of relation information between a first node and a second node; and a second establishing unit, configured to establish a second data table on the physical cluster, where the second data table is used for online storage of attribute information of the second node; where the first node and the second node belong to different node types.
- 17. The establishing apparatus according to claim 16, characterized in that the first data table includes a first index entry and a second index entry, the first index entry is used for online storage of an identifier of the first node, and the second index entry is used for online storage of an identifier of the second node corresponding to the first node; and the second data table includes a third index entry and a first attribute item, the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
- 18. The establishing apparatus according to claim 17, characterized in that the first data table further includes a second attribute item, and the second attribute item is used for online storage of attribute information of the correspondence between the first node and the second node.
- 19. The establishing apparatus according to claim 18, characterized in that the first data table has a key-key-value structure, in which the first index entry is primary key information, the second index entry is secondary key information, and the second attribute item is value information; and the second data table has a key-value structure, in which the third index entry is key information and the first attribute item is value information.
- 20. The establishing apparatus according to any one of claims 17 to 19, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the establishing apparatus further includes: a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the third index entry; when establishing the first data table on a physical cluster, the first establishing unit is specifically configured to establish the first data table on the first partition of the physical cluster; and when establishing the second data table on the physical cluster, the second establishing unit is specifically configured to establish the second data table on the second partition of the physical cluster.
- 21. The establishing apparatus according to claim 20, characterized in that the first partition and the second partition each include M backup regions, M being greater than or equal to 2; when establishing the first data table on the first partition of the physical cluster, the first establishing unit is specifically configured to establish the first data table on each of the M backup regions of the first partition of the physical cluster; and when establishing the second data table on the second partition of the physical cluster, the second establishing unit is specifically configured to establish the second data table on each of the M backup regions of the second partition of the physical cluster.
- 22. The establishing apparatus according to claim 16, characterized in that the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; or the establishing apparatus further includes: a third establishing unit, configured to establish a third data table on the physical cluster, where the third data table is used for online storage of relation information between a fifth node and the first node.
- 23. An online query apparatus, characterized in that a first data table and a second data table are established on a physical cluster, the first data table is used for online storage of relation information between a first node and a second node, the second data table is used for online storage of attribute information of the second node, and the first node and the second node belong to different node types; the apparatus includes: a receiving unit, configured to receive an online query request, where the online query request indicates online querying of relation data of the first node; and a query unit, configured to access, online, the first data table and the second data table in the physical cluster and to obtain the relation data of the first node; where obtaining the relation data of the first node includes: querying the first data table for the second node corresponding to the first node, and querying the second data table for the attribute information of the second node.
- 24. The apparatus according to claim 23, characterized in that the first data table includes a first index entry and a second index entry, the first index entry is used for online storage of an identifier of the first node, and the second index entry is used for online storage of an identifier of the second node corresponding to the first node; and the second data table includes a third index entry and a first attribute item, the third index entry is used for online storage of the identifier of the second node, and the first attribute item is used for online storage of the attribute information of the second node.
- 25. The apparatus according to claim 24, characterized in that the physical cluster includes N storage partitions, N being greater than or equal to 2; the apparatus further includes: a determining unit, configured to determine a first partition from the N storage partitions according to the first index entry, and to determine a second partition from the N storage partitions according to the second index entry; and when accessing, online, the first data table and the second data table in the physical cluster, the query unit is specifically configured to access, online, the first data table on the first partition of the physical cluster and the second data table on the second partition.
- 26. The apparatus according to claim 24, characterized in that the first data table is further used for online storage of relation information between a third node and a fourth node, and the second data table is further used for online storage of attribute information of the fourth node, where the third node and the first node belong to the same node type, and the fourth node and the second node belong to the same node type; the online query request further indicates online querying of relation data of the third node; the query unit is further configured to access, online, the first data table and the second data table in the physical cluster and to obtain the relation data of the third node; where obtaining the relation data of the third node includes: querying the first data table for the fourth node corresponding to the third node, and querying the second data table for the attribute information of the fourth node.
- 27. The apparatus according to claim 26, characterized in that the online query request includes a first composite operator; the apparatus further includes: a parsing unit, configured to parse a first instruction, a second instruction, and a third instruction from the first composite operator, where the first instruction indicates online querying of the relation data of the first node, the second instruction indicates online querying of the relation data of the third node, and the third instruction indicates that comprehensive processing is to be performed on the relation data; when obtaining the relation data of the first node, the query unit is specifically configured to obtain the relation data of the first node based on the first instruction; when obtaining the relation data of the third node, the query unit is specifically configured to obtain the relation data of the third node based on the second instruction; and the apparatus further includes: a processing unit, configured to perform, based on the third instruction, comprehensive processing on the relation data of the first node and the relation data of the third node.
- 28. The apparatus according to claim 27, characterized in that the comprehensive processing includes any one of the following: merge (union) processing, an intersection operation, and a set-difference operation.
- 29. The apparatus according to claim 24, characterized in that a third data table is also established on the physical cluster, and the third data table is used for online storage of relation information between a fifth node and the first node; the online query request indicates online querying of relation data of the fifth node; when accessing, online, the first data table and the second data table in the physical cluster and obtaining the relation data of the first node, the query unit is specifically configured to access, online, the first data table, the second data table, and the third data table in the physical cluster and to obtain the relation data of the fifth node; and when obtaining the relation data of the fifth node, the query unit is specifically configured to query the third data table for the first node corresponding to the fifth node and to obtain the relation data of the first node.
- 30. The apparatus according to claim 29, characterized in that the online query request includes a second composite operator; the apparatus further includes: a parsing unit, configured to parse a fourth instruction and a fifth instruction from the first composite operator, where the fourth instruction indicates online querying of the node corresponding to the fifth node, and the fifth instruction indicates online querying of the relation data of the node corresponding to the fifth node; when querying the third data table for the first node corresponding to the fifth node, the query unit is specifically configured to query, based on the fourth instruction, the third data table for the first node corresponding to the fifth node; and when obtaining the relation data of the first node, the query unit is specifically configured to obtain the relation data of the first node based on the fifth instruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610826949.6A CN107818117B (en) | 2016-09-14 | 2016-09-14 | Data table establishing method, online query method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107818117A (en) | 2018-03-20 |
CN107818117B (en) | 2022-02-15 |
Family
ID=61601282
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610826949.6A Active CN107818117B (en) | 2016-09-14 | 2016-09-14 | Data table establishing method, online query method and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107818117B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108563697A (en) * | 2018-03-22 | 2018-09-21 | 腾讯科技(深圳)有限公司 | A kind of data processing method, device and storage medium |
CN110543585A (en) * | 2019-08-14 | 2019-12-06 | 天津大学 | RDF graph and attribute graph unified storage method based on relational model |
CN111125156A (en) * | 2019-12-17 | 2020-05-08 | 网银在线(北京)科技有限公司 | Data query method and device and electronic equipment |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6728713B1 (en) * | 1999-03-30 | 2004-04-27 | Tivo, Inc. | Distributed database management system |
CN101547092B (en) * | 2008-03-27 | 2011-06-08 | 天津德智科技有限公司 | Method and device for data synchronization of multi-application systems for unifying user authentication |
CN102395962A (en) * | 2009-03-11 | 2012-03-28 | 甲骨文国际公司 | Composite hash and list partitioning of database tables |
CN103218404A (en) * | 2013-03-20 | 2013-07-24 | 华中科技大学 | Multi-dimensional metadata management method and system based on association characteristics |
CN103631924A (en) * | 2013-12-03 | 2014-03-12 | Tcl集团股份有限公司 | Application method and system for distributive database platform |
US20150220617A1 (en) * | 2013-12-23 | 2015-08-06 | Teradata Us, Inc. | Techniques for query processing using high dimension histograms |
CN104809129A (en) * | 2014-01-26 | 2015-07-29 | 华为技术有限公司 | Method, device and system for storing distributed data |
CN103995879A (en) * | 2014-05-27 | 2014-08-20 | 华为技术有限公司 | Data query method, device and system based on OLAP system |
CN104063487A (en) * | 2014-07-03 | 2014-09-24 | 浙江大学 | File data management method based on relational database and K-D tree indexes |
CN105045871A (en) * | 2015-07-15 | 2015-11-11 | 国家超级计算深圳中心(深圳云计算中心) | Data aggregation query method and apparatus |
Non-Patent Citations (2)
Title |
---|
I Gusti Bagus Ady Sutrisna et al.: "Implementation of GRAC algorithm (Graph Algorithm Clustering) in graph database compression", 2015 3rd International Conference on Information and Communication Technology (ICoICT) * |
Li Guoqing (李国庆): 《ASP.NET程序设计项目教程》 (ASP.NET Programming Project Tutorial), 31 January 2010, 北京理工大学出版社 (Beijing Institute of Technology Press) * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108563697A (en) * | 2018-03-22 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Data processing method, device and storage medium |
CN110543585A (en) * | 2019-08-14 | 2019-12-06 | 天津大学 | RDF graph and attribute graph unified storage method based on relational model |
CN110543585B (en) * | 2019-08-14 | 2021-08-31 | 天津大学 | RDF graph and attribute graph unified storage method based on relational model |
CN111125156A (en) * | 2019-12-17 | 2020-05-08 | 网银在线(北京)科技有限公司 | Data query method and device and electronic equipment |
CN111125156B (en) * | 2019-12-17 | 2023-09-26 | 网银在线(北京)科技有限公司 | Data query method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107818117B (en) | 2022-02-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109033101B (en) | Label recommendation method and device | |
CN104394118A (en) | User identity identification method and system | |
CN106570008A (en) | Recommendation method and device | |
CN107291779B (en) | Cache data management method and device | |
US20140115010A1 (en) | Propagating information through networks | |
CN107967284A (en) | Method and apparatus for storing and querying sequence information | |
WO2019242343A1 (en) | Marketing information release platform construction method and apparatus | |
CN105119956B (en) | Network application system and deployment method | |
CN107818117A (en) | Data table establishing method, online query method and related device | |
WO2018036219A1 (en) | Multi-level rebating method and multi-level rebating platform | |
CN111639253A (en) | Data duplication judging method, device, equipment and storage medium | |
CN106933891A (en) | Method for accessing a distributed database and device for distributed database service | |
US20190362016A1 (en) | Frequent pattern analysis for distributed systems | |
CN105894310A (en) | Personalized recommendation method | |
CN104468751A (en) | Self-defining method for business process nodes in cloud sea operating system | |
US20200098030A1 (en) | Inventory-assisted artificial intelligence recommendation engine | |
CN113761350A (en) | Data recommendation method, related device and data recommendation system | |
CN106446943A (en) | Commodity correlation big data sparse network quick clustering method | |
US9830377B1 (en) | Methods and systems for hierarchical blocking | |
CN106780062A (en) | User group update method and system based on social network and big data analysis | |
Li et al. | An empirical study of alternating least squares collaborative filtering recommendation for Movielens on Apache Hadoop and Spark | |
US11294917B2 (en) | Data attribution using frequent pattern analysis | |
CN107679096B (en) | Method and device for sharing indexes among data marts | |
US20200097485A1 (en) | Selective synchronization of linked records | |
CN106202503B (en) | Data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||