WO2016191995A1 - 一种分布式数据库中关联表分区的方法和设备 - Google Patents

一种分布式数据库中关联表分区的方法和设备 Download PDF

Info

Publication number
WO2016191995A1
WO2016191995A1 PCT/CN2015/080444 CN2015080444W WO2016191995A1 WO 2016191995 A1 WO2016191995 A1 WO 2016191995A1 CN 2015080444 W CN2015080444 W CN 2015080444W WO 2016191995 A1 WO2016191995 A1 WO 2016191995A1
Authority
WO
WIPO (PCT)
Prior art keywords
data table
partition
data
key
tables
Prior art date
Application number
PCT/CN2015/080444
Other languages
English (en)
French (fr)
Inventor
王铁英
刘哲
沈柯
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2015/080444 priority Critical patent/WO2016191995A1/zh
Priority to CN201580001222.8A priority patent/CN106415534B/zh
Publication of WO2016191995A1 publication Critical patent/WO2016191995A1/zh
Priority to US15/814,141 priority patent/US10831737B2/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1009Address translation using page tables, e.g. page table structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1027Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65Details of virtual memory and virtual address translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/68Details of translation look-aside buffer [TLB]

Definitions

  • the embodiments of the present invention relate to the field of databases, and in particular, to a method and a device for associating table partitions in a distributed database.
  • NoSQL is a data management system mainly for the World Wide Web (Web) business, because there is no complexity.
  • the table relationship based on the key-value data model, can solve most of the data management problems.
  • NoSQL places the data of multiple relational tables in a large table by constructing a Schema model of the nested formal data model.
  • this type of data model will increase the redundancy of data, and when faced with the data of analytical services, this nested data model seriously affects the query efficiency of On-Line Analytical Processing (OLAP).
  • OLAP On-Line Analytical Processing
  • a distributed database management system also known as a distributed database, has multiple nodes, each node maintains and manages a part of the data, some of which provide services (called Active data), and some of which do not serve as backups. (called Backup data), these nodes can also be called data management nodes or data nodes.
  • Active data provides services
  • Backup data backup data
  • these nodes can also be called data management nodes or data nodes.
  • the union of the Active data of all nodes is the complete data of the database, and there is no intersection of the Active data of the tables between the nodes.
  • the database management system needs to partition the table (Partition). Then, each partition range (Range) after the Partition is assigned to each node for management.
  • each partition key needs to be established for each table, and each Range after the Partition is assigned to each node for management.
  • Partition and allocation methods will affect the query efficiency of OLAP, based on massive data Distributed database, OLAP operations involve computing nodes need to obtain computational data from other nodes, resulting in excessive system delay, resulting in a decline in overall performance of the entire system.
  • a distributed database for example, a Hadoop database (HBase) database, a distributed key-value database HyperDex, etc., for a Partition method of an associated table, may cause an OLAP operation to involve a large amount of data in a batch.
  • a Hadoop database (HBase) database a distributed key-value database HyperDex, etc.
  • HyperDex a distributed key-value database HyperDex
  • the embodiments of the present invention provide a Partition method and a device for associating a table in a distributed database, so that the OLAP operation reduces the large amount of data obtained from multiple nodes and the global matching of the dimension table, thereby improving the entire system. Performance.
  • an embodiment of the present invention provides a method for partitioning an associated table in a distributed database, including: determining a first data table in each data table that needs to be partitioned, where the partition key of the first data table is not any one a foreign key of the second data table, wherein the second data table is a data table other than the first data table in each data table that needs to be partitioned; and a co-located partition CP of the first data table is generated.
  • the CP table set of the first data table includes a first data table and at least one CP table of the first data table, where a CP table of the first data table is a partition key is the first a data table of a subset of partition keys of the data table; partitioning the first data table according to each partition key of the first data table; and exposing the first data table in the CP table set
  • Each CP table is partitioned, wherein a partition range of the partition key associated with the first data table for each CP table other than the first data table in the set of CP tables and the first data
  • the partitioning keys of the corresponding partitioning keys in the table have the same extent.
  • the generating a CP table set of the first data table is specifically: acquiring a dimension table of the first data table, and determining the first data table Whether the dimension table is the CP table of the first data table, and if yes, adding the dimension table of the first data table to the CP table set of the first data table; iteratively determining that the first data table has been added Whether the dimension table of the CP table of the first data table in the set of the CP table is the CP table of the first data table, and the data table that is confirmed to be the CP table of the first data table is added to the first a collection of CP tables of the data table, Until all the CP tables of the first data table are found.
  • the method further includes: receiving a load balancing request, where the load balancing request indicates that the first node needs to be Performing load balancing; determining, in the first node, a set of CP tables for load balancing;
  • the determining, by the first node, the set of the CP table that performs load balancing specifically: recording the number according to the data table or The space dimension is used to determine a set of CP tables on the first node that need to be load balanced.
  • an embodiment of the present invention provides a database manager, including: a determining unit, configured to determine a first data table in each data table that needs to be partitioned, where the partition key of the first data table is not any one a foreign key of the two data tables, the second data table is a data table other than the first data table in each data table that needs to be partitioned; a co-located partition CP table set generating unit is used to generate a a set of co-located partitioned CP tables of the first data table, where the set of CP tables of the first data table includes a first data table and at least one CP table of the first data table, wherein the first data table
  • the CP table is a data table in which the partition key is a subset of the partition key of the first data table, and the partition unit is configured to partition the first data table according to each partition key of the first data table.
  • each CP table of the first data table in the set of CP tables wherein each CP table of the first data table in the set of CP tables is associated with the first data table
  • the partitioning range of the partitioning key is relative to the first data table The same partition key range partitioning.
  • the co-located partitioned CP table set generating unit is configured to: obtain a dimension table of the first data table, and determine a dimension table of the first data table. Whether it is the CP table of the first data table, if yes, adding the dimension table of the first data table to the CP table set of the first data table; iteratively determining the CP table that has been added to the first data table Whether the dimension table of the CP table of the first data table in the set is the CP table of the first data table, and the data table that is confirmed to be the CP table of the first data table is added to the first data table Collection of CP tables, Until all the CP tables of the first data table are found.
  • the database manager further includes: a load balancing unit, configured to: receive a load balancing request, and the load balancing request Determining that load balancing is performed on the first node; determining a set of CP tables for load balancing in the first node; determining a data table for performing load adjustment in the set of CP tables determined to perform load balancing; from the determining Selecting a partition key in the load balancing data table to perform partition range adjustment; and all the data tables in the CP table set that are load-balanced to determine the partition key that is selected for partition range adjustment are partitioned according to the selection.
  • the range-adjusted partition key performs the same partition range adjustment as the data table of the partition key selected for the partition range adjustment.
  • the database manager further includes a load balancing unit, where the load balancing unit includes: a receiving module: configured to: Receiving a load balancing request, the load balancing request indicating that load balancing is performed on the first node, determining a set of CP tables for performing load balancing in the first node, and determining a module for performing load balancing on the determining Determining a data table for performing load adjustment in the set of CP tables; selecting a partition key from the data table for determining load balancing to perform partition range adjustment; and adjusting module: all in a set of CP tables used for load balancing the determination
  • the data table including the partition key selected for the partition range adjustment performs the same partition range adjustment as the data table of the partition key selected for the partition range adjustment according to the partition key for performing the partition range adjustment.
  • the method for determining a load balancing CP table set is: according to a data table record number or a space dimension, Determining a set of CP tables on the first node that need to be load balanced.
  • an embodiment of the present invention further provides a database management device, including: a processor, a memory, and a bus; the memory is configured to store a computer to execute an instruction, and the processor is connected to the memory through the bus.
  • the processor reads the computer-executed instructions stored by the memory to perform the method described in the first aspect or any of the possible implementations of the first aspect, when the database management device is running.
  • the Partition method of the association table in the distributed database provided by the embodiment of the present invention is the same as the Partition mode of the associated partition key in the association table by co-location (Colocation) of the associated data in each association table. And the distribution is also mapped one by one.
  • Colocation co-location
  • the OLAP operation is performed, only the own node or the associated Range data is obtained from other nodes, and the entire fact table is not needed, thereby improving the performance of the entire system.
  • FIG. 1 is a schematic diagram of a distributed database architecture in an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a distributed database library according to Embodiment 1 of the present invention.
  • FIG. 3 is a schematic diagram of a result of a distributed database Partition according to Embodiment 1 of the present invention.
  • FIG. 4 is a flowchart of a method for partitioning an association table in a distributed database according to Embodiment 2 of the present invention.
  • FIG. 5 is a flowchart of a method for determining a first data table in Embodiment 2 of the present invention.
  • FIG. 6 is a flowchart of a method for generating a CP table set of a first data table according to Embodiment 2 of the present invention
  • FIG. 7 is a flowchart of a method for load balancing according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic diagram of node load balancing according to Embodiment 3 of the present invention.
  • FIG. 9 is a structural diagram of a database manager according to Embodiment 4 of the present invention.
  • FIG. 10 is another structural diagram of a database manager according to Embodiment 4 of the present invention.
  • FIG. 11 is another structural diagram of a database manager according to Embodiment 4 of the present invention.
  • Figure 12 is a structural diagram of a database management device according to Embodiment 5 of the present invention.
  • a distributed database system includes a plurality of nodes 20, as shown in the figure, node_1--node_5, and also includes a management.
  • 10 (Manager) is used to maintain and manage the database system, the manager shown in Figure 1. 10 is independent of nodes 1-5. In a different architecture than that shown in FIG. 1, the manager may also be served by one of the nodes of the distributed database system, and is provided on the node to provide maintenance of the database system. Management function.
  • the node and the manager in the distributed database system may be implemented by any host, and any host includes an ordinary computer, a mobile terminal, a workstation or a server, a dedicated server, and the like, such as an X86 processor, and the present invention does not specifically limited.
  • Each node maintains and manages part of the data, some of which provide services (called Active data, also called primary data), and some of which are used as backups (called Backup data, also called backup data).
  • Active data also called primary data
  • Backup data also called backup data
  • the union of all node Active data is the complete data of the database.
  • the table needs to be partitioned (Partition), and then each range (Range) after the Partition is assigned to each node. As shown in FIG.
  • the main data is stored in the respective nodes according to the Partition, for example, where Range(0,10], (50,60) is stored in the node_1, and Range(10,20), (60, 70] Stored in node_2, the Active data stored in each node constitutes the complete data of the database (0, 100), and is not repeated. Further, each node stores a part of the backup data of the other node's main data, Figure 1 The primary data of each node in the system shown is backed up in the other two nodes. For example, the primary data of Range(0,10) and (50,60) stored by node_1 is at node_4. Node_5 has a backup, and node_1 backs up its main data to node_4 and node_5 as backup data.
  • the table saved by the database can be divided into a fact table and a dimension table according to the saved content.
  • a fact table can be associated with multiple dimension tables. As shown in Tables 1 and 2 below, Table 1 is a fact table, and Table 2 is one of a plurality of dimension tables associated with the fact table 1.
  • the fact table is a core table in the database schema that contains data describing specific events within a business (such as banking transactions or product sales) for storing at least one fact record, each fact record corresponding to a row in the fact table.
  • Each row of data in the fact table in Table 1 records a fact record of the sale.
  • the remaining two columns - sales volume and sales - are the metric columns.
  • the value in the key-value column corresponds to the dimension of the fact record, for example, sales territory, product code, sales month, product origin, product category, etc.
  • the value in the measure column corresponds to the subject matter of the fact table, for example, sales or Sales volume and so on.
  • the entries in the dimension table are used to describe the data in the fact table, and the dimensions of the fact records (generally the key-value columns in them) in the fact table are recorded.
  • the dimension table contains the data on which the dimension is created.
  • the dimension table in Table 2 is used to describe the product code dimension of the fact table in Table 1.
  • a dimension table can include a name column and an attribute column, such as in Table 2, the product code is a name column, and the product name and product category are attribute columns.
  • the bank records the deposits in the database system, the actual data stored in the A table, including the account number, the organization number, the deposit amount, etc., the correspondence between the organization number of the B table and the organization name, then A is the fact Table, B is a dimension table.
  • the data table contains primary and foreign keys.
  • the primary key is specified when the table is created. It can be a column in the table or a union value of multiple columns.
  • the primary key is a unique identifier of the row record. It is unique. Any two rows in a table have different values.
  • the primary key value, the primary key is generally the first column of the table.
  • the foreign key is used to connect two tables. It is the same type of field and field name of another field in this table. It is used to associate two or more tables. It can be understood that the foreign key is generally Refers to the primary key in the other table. Typically, the foreign key of the fact table corresponds to the primary key of the dimension table.
  • the column M in the table A corresponds to the column N in the table B, that is, the column M in the A table and the column N in the B table correspond to the same attribute, for example, both correspond to the order number
  • column M is In the primary key in Table A
  • column N is the foreign key in Table B.
  • the keys that join the two data tables are also often referred to as association keys.
  • the fact table and the dimension table are associated by the association key.
  • the association key is a foreign key for the fact table, a primary key for the dimension table, and a foreign key corresponding to the fact table.
  • the primary key of the dimension table For example, the product code column in Table 1 is associated with the product code column in Table 2 and can be referred to as an association key.
  • the product code in Table 1 is listed as a foreign key to the fact table, and the product code in Table 2 is listed as the primary key of the dimension table.
  • the manager needs to set a partition key partition key for each table.
  • the partition key is a set of data columns in the data table that determines which partition the data row belongs to, and is a basis for judging the data partition.
  • the manager performs a Partition on the data table through the PartitionKey, and obtains each Range of each table after the Partition.
  • Each Range of each table is assigned to each node for management, and the Partition Key of the table may be hashed and sorted (Hash).
  • the data of each table is distributed on each node according to each Range obtained by Partition.
  • the data is aggregated to one node for aggregate Join calculation.
  • the system delay caused by cross-node data reading is too large, which will cause the overall performance of the whole system to decline, especially in the case of data processing.
  • a method for partitioning an associated table in a distributed database is provided.
  • 2 is a schematic diagram of a library structure of a distributed database related to an embodiment of the present invention.
  • the distributed database includes the following eight associated data tables, a project table Lineitem, a component supplier table Part_Supp, a component table Part, and a vendor table Supp. , Order Table Order, Customer Table Customer, Country Table Nation, Area Table Region. The association relationship between the tables is as shown in FIG.
  • the LineItem table is associated with the Part_Supp table by the component key Part_Key, the supplier key Supp_Key
  • the Order table is associated with the Order table by the order key Order_Key, where the Part_Key, the Supp_Key, and the Order_Key are both LineItems.
  • the foreign key of the table is associated with the Part_Supp table by the component key Part_Key, and is associated with the Supp table by the vendor key Supp_Key.
  • the Order table is associated with the Custermer table by the customer key CustomerKey, which is the foreign key of the Order table.
  • the Customer table is associated with the National table by the country key Nat_Key, which is the foreign key of the Custermer table.
  • the National table is associated with the Region table by the region key Reg_Key, which is a foreign key of the National table.
  • the partition keys determined in each data table are as follows: the partition keys of the LineItem table are OrderKey, Part_Key, and Supp_Key, the partition keys of the Part_Supp table are Part_Key and Supp_Key, the partition key of the Part table is Part_Key, and the partition key of the Supp table is Supp_Key, the Order table.
  • the partition key is Order_Key and CustermerKey, the partition key of the Custermer table is CustermerKey, and the National table and the Region table do not specify a partition key.
  • a core data table is first selected from each data table requiring Partition, which we will refer to here as the first data table.
  • the specific step of determining the first data table is: arbitrarily select a data table that has not been Partition from all the data tables of the distributed database, and assume that the eight data tables shown in FIG. 2 have not yet been Partition, assuming here The part supplier table Part_Supp table is selected. Determine the partition key of the Part_Supp table as Part_Key and Supp_Key, and then determine whether the partition key of the table has a foreign key of another table, and if not, use the table as the first data table (ie, the core data table).
  • the other table (the other table found here, that is, the fact table of the table), and then determine whether the fact table can be used as the first data table until the first one is found. data sheet. Because Part_Key and Supp_Key in the partition key of the Part_Supp table are foreign keys of the LineItem table, the fact table LineItem table is found, and then it is determined whether the LineItem table can be used as the first data table, because the partition key of the LineItem table in the LineItem table is OrderKey, Part_Key, and Supp_Key are not foreign keys of other data tables, so the LineItem table can be selected as the first data table for Partition.
  • the partition key of the LineItem table in the LineItem table is OrderKey, Part_Key, and Supp_Key are not foreign keys of other data tables, so the LineItem table can be selected as the first data table for Partition.
  • the above method selects a data table for Partition, which can be understood as essentially finding a fact table.
  • This fact table is the uppermost data table in each associated data table of the Partition.
  • the partition key in the first data table is Not a foreign key to other tables. It can be seen that in the distributed database shown in FIG. 2, no matter which table the step of searching for the first data table is started, the fact table LineItem table is finally found as the first data table.
  • the Colocation Partition (CP) table of the first data table needs to be found, and the CP table set of the first data table is established.
  • the concept of the CP table means that if all the partition keys of the table B are included in the partition key of the table A, that is, the partition key of the table B is a subset of the partition key of the table A, the table B is called the CP table of the table A. , indicating that the B table can perform the same Range Partition according to the associated key corresponding to the A table.
  • the partition key of the LineItem table is OrderKey, Part_Key, and Supp_Key, and the associated table found by it includes the Part_Supp table and the Order table.
  • the specific method is: selecting an association table of the first data table that does not determine the relationship of the CP table, and obtaining all the partition keys of the association table, and determining whether all the partition keys of the association table are all included in the partition key of the first data table. If all are included, confirm that the association table is a CP table of the first data table, add it to the CP table set of the first data table, and further determine the association table by means of loop iteration (ie, the first confirmation is first Whether the association table of the CP table of the data table is the CP table of the first data table, and the data table associated with the partition key of the association table is first determined, and in the data table associated with the CP table, it is determined whether the association is related.
  • All the partition keys of the data table are all included in the partition key of the first data table, and it is confirmed that the data table is the CP table of the first data table, and the data table is also added to the CP table set of the first data table.
  • the loop iteratively continues to find the CP table of the first data table until the found CP table can no longer find the CP table of the first data table.
  • the association table of the first data table is all judged according to the above method, and the CP table set of all the first data tables is obtained. So far, the CP table set of the first data table includes the first data table and the CP table of the first data table, and the partition keys of each CP table in the set are all included in the partition key of the first data table.
  • the partition key of an associated table is not all included in the partition key of the first data table, it can be confirmed that the association table is not the CP table of the first data table. Therefore, for the partition table Part_Supp table and Order table of the LineItem table, the CP table relationship of each associated table is determined one by one, and the partition keys of the Part_Supp table are Part_Key and Supp_Key, which are all included in the partition key of the LineItem table, so the Part_Supp table The CP table for the LineItem table should be in the CP table collection of the LineItem table. Next, determine the association table of the Part_Supp table.
  • the partition key of the Part_Supp table is Part_Key and Supp_Key.
  • the association table of the partition key is found to include the Part table and the Supp table, and the partition key of the Part table is Part_Key, and the partition key of the Supp table is Supp_Key.
  • the Part table, the Supp table are the CP table of the LineItem table, and should also be in the CP table collection of the LineItem table.
  • look up the associated table of the Part table and the Supp table, and the Part table and the Supp table have no associated table of the partition key, so there is no CP table of the LineItem table.
  • the partition key of the Order table is Order_Key and CustermerKey.
  • the CustermerKey is not included in the partition key of the LineItem table, so the Order table is not the CP table of the LineItem table.
  • the CP table set CP1 of the first data table is obtained, and the CP1 includes a first data table and an association table that can be co-located with the first data table CP, that is, a CP table of the first data table. Therefore, the LineItem in the previous LineItem table includes the LineItem in the CP1 set CP1. Table, Part_Supp, Part, and Supp tables.
  • the co-location partition Colocation Partition can be performed on the table in the first data table and its CP table set CP1.
  • the specific step is: performing a Partition of the first data table according to the partition key of the first data table, and if the number of partition keys is one, it is a one-dimensional Partition, and if there are multiple, it is a multi-dimensional Partition.
  • each data table determines the dimension of the Range of the Partition according to the number of partition keys, and the same partition key. Corresponding to the same Range interval in each data table.
  • the LineItem table, the Part_Supp table, the Part table, and the Supp table in the CP table set CP1 are all Partitions in the same Range interval
  • the LineItem table is a three-dimensional partition, wherein the L columns are in accordance with (L_0, L-1), (L_1, L-2], (L_2, L-3) divides the Partition into three Ranges, and the P columns are in accordance with (P_0, P-1], (P_1, P-2], (P_2, P-3), (P_3, P-4], (P_4, P-5) divides the Partition into 5 Ranges, and the S column divides the Partition into 2 Ranges according to (S_0, S-1), (S_1, S-2), and the Part table is one.
  • the Part_Supp table is used as the CP table of the LineItem table, which is a two-dimensional partition, using the same Range interval as the LineItem table, and the P column is in accordance with (P_0, P-1], (P_1, P-2), (P_2, P-3], (P_3, P-4), (P_4, P-5] perform Partition, and S column performs Partition according to (S_0, S-1], (S_1, S-2).
  • Type Partition same manner as CP same partition key set table for each association table associated with that same Range.
  • the remaining order table Order, customer table Customer, country table Nation, and region table Region are not Partition, and the first data is first determined according to the method described above.
  • the table is the Order table, and then the CP table set CP2 of the Order table is created.
  • the CP table set CP2 of the Order table includes the Order table and the Customer table.
  • the table in the CP table set CP2 of the Order table is co-located with the Colocation Partition, and the partition result of the CP table set as shown in FIG. 3 is obtained.
  • the Order table and the Customer table are both Partition and Order tables with the same Range interval.
  • O column is Partition according to (O_0, O-1], (O_1, O-2), (O_2, O-3), and the C column is in accordance with (C_0, C-1), (C_1, C). -2], (C_2, C-3), (C_3, C-4), (C_4, C-5) for Partition
  • the Customer table as the Order table
  • the CP table is a one-dimensional partition
  • the C column is the same as the Order table.
  • Range section according to (C_0, C-1], (C_1, C-2), (C_2, C-3], (C_3, C-4), (C_4, C-5] Partition.
  • the area table Region has no partition key, and no need to consider. At this point, the entire partition of the distributed database is completed, and each data table is in the same way as the partition key in the same CP table set, and the partition scope is also Can be mapped one by one.
  • the associated CP table is found in each association table, and the CP table is the same in the Range mode of the Partition according to the associated partition key, so that the OLAP operation is performed.
  • the OLAP operation is performed.
  • only the local node or the data of the same Range associated with other nodes is obtained, and the entire fact table is not needed, thereby improving the performance of the entire system.
  • FIG. 4 is a flowchart of a method for partitioning an association table in a distributed database according to Embodiment 2 of the present invention. As shown, the method is:
  • Step 101 Determine a first data table in each data table that needs to be partitioned.
  • the first data table herein refers to a data table that satisfies the condition that the partition key partition key of the data table is not a foreign key of any one of the second data tables
  • the second data table herein refers to each data table that needs to be partitioned. Other data tables except the first data table.
  • the Partition is usually performed by the manager Manager in the system. deal with.
  • the manager selects the first data table in the system by adopting the following method: arbitrarily selecting a data table that has not been partitioned from all data tables of the distributed database, and then determining whether there is a partition in the partition key of the table. The key is the foreign key of the other table. If not, the table is used as the first data table. Otherwise, for other tables found, it is judged whether the other table can be used as the first data table, and the processing is repeated until the first A data sheet. It can be considered that the first data table is the top-level association table in the data table that needs Partition, and the partition key in the first data table is not a foreign key of other tables. Of course, if there is a foreign key in the data table of the Partition that has not been partitioned, but the other table has been Partition, the data table that has not been Partition can also be used as the first data table.
  • the method flow for determining the first data table in each data table that needs to be partitioned in step 101 can be as shown in FIG. 5.
  • Step 1011 Select, from all the data tables in the database, a data table that has not been Partition as a candidate table.
  • the manager manager selects a data table that has not yet been partitioned from all the data tables to determine whether it can be used as the first data table.
  • the selected data table is referred to as a candidate table, that is, an alternative table of the first data table.
  • Step 1012 Obtain all partition keys of the candidate list.
  • step 1013 it is determined whether the partition key of the candidate table has a foreign key of another data table, and if yes, the process proceeds to step 1014; otherwise, the process proceeds to step 1016.
  • step 1014 the other data table found in step 1013 is obtained. If the candidate table has a foreign key (or an associated key) for the other data table, then the other data table is obtained. From the introduction of the distributed database, we can know that this other data table is actually the fact table of the candidate table. That is, the other data table is found in step 1013, and its foreign key is the foreign key of the candidate table, then it is the fact table of the candidate table. The essence of step 1013 is to find the fact table of the candidate list.
  • step 1015 it is judged whether the other table (fact table) found has already been Partition. If it has already been Partition, it proceeds to step 1017, otherwise it enters 1016.
  • step 1016 if the other table does not have a Partition, the other table is taken as a new candidate list, and the process proceeds to step 1012. If the other table has not yet been Partition, then the other table is a data table that needs to be Partition higher than the current candidate table, and then the other table is used as a new candidate table, and the process returns to step 1012. Find out if the new candidate table has a higher level The association table of the Partition is not performed.
  • Step 1017 if the candidate table does not have a partition key as a foreign key of another data table, or even if it finds a foreign key having a partition key as another data table, but the other data table found, that is, its fact table has been Partition, then It can be determined that the candidate list is the first data table to be found.
  • this alternative table is the top-level association table in the data table that needs Partition, its partition key is not a foreign key of other tables, or even if it has a partition key is a foreign key of other tables , but its upper table has been Partition.
  • the determination of the first data table can be completed by the above method flow.
  • the manager in the database system receives the commands input by the system administrator through the background system to create each data table, and completes the establishment of each data table and the definition of the partition key of the table.
  • the field definition of the data table and the definition of the PatitionKey can be completed by a Data Definition Language (DDL).
  • DDL Data Definition Language
  • Step 102 Generate a co-located partition CP table set of the first data table.
  • the set of CP tables of the first data table includes a first data table and at least one CP table of the first data table.
  • the CP table of the first data table is a data table in which the partition key is a subset of the partition key of the first data table, that is, the partition key of each data table in the CP table set of the first data table is included in the first
  • the set includes a first data table and a data table that can perform CP with the first data table.
  • each association table associated with the partition key of the first data table ie, the partition key of the first data table is used as a dimension table associated with the foreign key
  • the method for determining whether an association table is a CP table of the first data table is specifically: obtaining all partition keys of the association table, and determining whether all partition keys of the association table are all included in a partition key of the first data table, If all are included, it is confirmed that the association table is a CP table of the first data table, and is added to the CP table set of the first data table; otherwise, it is confirmed that the association table is not the CP table of the first data table.
  • the first CP table refers to the method of confirming the association table of the first data table to determine whether the association table of the first CP table is the CP of the first data table. table.
  • the specific method for determining whether the association table associated with the partition key of the first CP table is the CP table of the first data table is similarly: obtaining all partition keys of the association table, and determining whether all partition keys of the association table are all Included in the If the partition key of a data table is all included, it is confirmed that the association table is a CP table of the first data table, and is added to the CP table set of the first data table; otherwise, it is confirmed that the association table is not the first data table. CP table. Further, according to this method, the association table of the CP table of the newly added first data table is judged, and according to the method of such loop iteration, until the found CP table can no longer find the CP table of the first data table.
  • the association table of the first data table is all judged according to the above method, and the CP table set of all the first data tables is obtained. So far, the CP table of the first data table includes the first data table and the CP table of the first data table, and according to the confirmation process, it can be determined that the partition key of each CP table in the set is included in the partition key of the first data table. in. After the previous steps, the CP table set of the first data table is obtained, and the set includes a first data table and a data table that can be co-located with the first data table.
  • the method flow for generating the co-located partitioned CP table set of the first data table in step 102 can be as shown in FIG. 6. include:
  • Step 1021 Acquire an association table associated with a partition key of the first data table. First, the partition key of the first data table is obtained, and then the association table associated with the partition key is obtained according to the partition key.
  • Step 1022 Determine whether there is an association table in the association table that does not determine the CP relationship with the first data table. It is necessary to judge one by one whether these association tables of the first data table are suitable for co-location partitioning CP with the first data table, and obtain a CP table set of the first data table.
  • the CP relationship with the first data table refers to whether the associated table is a CP table of the first data table. If yes, the operation of generating the CP table set of the first data table is not finished yet, and step 1023 is performed; otherwise, the establishment of the CP table set of the first data table is completed.
  • Step 1023 Select an association table that does not determine the CP relationship with the first data table, and obtain all partition keys of the association table.
  • an association table is arbitrarily selected for CP relationship determination, and the partition key of the association table is first acquired.
  • Step 1024 Determine whether all the partition keys of the association table of the undetermined CP relationship are included in the partition key of the first data table. If the partition key of the association table is not included in the partition key of the first data table, it indicates that the association table is not the CP table of the first data table, and the process proceeds to step 1022 to continue to select whether there are other undetermined. The association table of the CP relationship proceeds to the next step. Otherwise, step 1025 is performed.
  • Step 1025 determining that the association table is a CP table of the first data table, and adding the association table to the first
  • the data table is in the CP table collection. If the partition key of the association table is included in the partition key of the first data table, it is determined that the association table is a CP table of the first data table, and the CP may be performed with the first data table to add the first data.
  • Step 1026 Search for the CP table of the first data table in the association table of the determined CP table, and add the found CP table to the CP table set of the first data table.
  • the method of searching the CP table of the determined CP table is the same as the method of searching the CP table in the association table of the first data table, that is, the method in steps 1021 to 1026 is performed, and the specific implementation is iterative. The method finds all the data tables in the set of CP tables suitable for joining the first data table.
  • the iterative method is to repeatedly call the process recursive processing of steps 1021 - 1026 until the recursion ends, but the initial time is to find the CP table of the first data table in the association table of the first data table, that is, in step 1021
  • the input is the first data table
  • the input table in step 1021 in the subsequent iteration is the CP table of the newly added first data table until the recursion ends.
  • the layer iteratively looks up the CP table of the first data data table in the association table of the first CP table, where the first CP table may be in the CP table of the first data table.
  • association table associated with the partition key of the first CP table is obtained first, and then it is judged whether the association table is suitable for co-location partition CP with the first data table, that is, whether all partition keys of the association table are included in If the partition key of the first data table is all included, it is determined that the association table is a CP table of the first data table, and the CP may be performed with the first data table, and added to the CP table set of the first data table.
  • step 1027 all the CP tables of the first data table have been obtained, and thus the CP set of the first data table is established.
  • the CP table set of the first data table is obtained, and the table in the CP table group includes the CP table of the first data table.
  • the first data table is partitioned according to its partition key.
  • a multi-dimensional/one-dimensional Partition is performed, and the multi-dimensional or one-dimensional Partition is determined according to the number of partition keys of the first data table. If there is only one partition key, a one-dimensional Partition is performed, and if there are N partition keys, an N-dimensional Partition is performed.
  • Step 104 Partition each CP table of the first data table in the CP table set, where the partition range of the partition key associated with the first data table of each CP table and the partition key corresponding to the first data table The partition range is the same.
  • the other data table except the first data table in the CP table set of the first data table is subjected to the same Partition as the first data table according to the partition key associated with the first data table.
  • Performing a Partition on each CP table in the CP table set of the first data table, and performing the same Partition as the first data table according to the same partition key of the first data table that is, the CP table and its corresponding first data.
  • the table performs the same Partition (Range interval is the same) in the same partition key until all the data tables in the CP set are Partition, completing the associated table partition of the first data table.
  • the method for performing the same Partition as the first data table according to the partition key associated with the first data table for each CP table in the CP table set of the first data table includes: first, determining the first data table Whether there is a CP table without a Partition in the CP table set, and if so, select a CP table that is not a Partition in the CP table set of the first data table, and perform the same Partition as the first data table, that is, the CP table according to and
  • the partition key corresponding to the first data table performs the same Partition (Range interval is the same) until there is no CP table of the Partition in the CP table set of the first data table.
  • steps 101-104 may be repeatedly performed for each data table in the distributed database for which the Partition has not been performed, until the Partition of all the tables in the entire database is completed. That is, for each data table that needs to be Partition other than the CP table set of the first data table, perform colocation partition Colocation Partition according to the method described in steps 101-104 in FIG. 4 until the partition of the entire distributed database is completed. .
  • each data table performs the same partition according to the associated partition key in the same CP table set, and the same partition key in each association table is the same as the Range after the Partition.
  • the Partition method of the association table in the distributed database in the embodiment of the present invention first finds the associated CP table in each data table that needs the Partition, and uses the same Partition's Range according to the associated partition key, so that the CP table is made. In the OLAP operation, only the same node or the associated Range data of other nodes is obtained, and the entire fact table is not needed, thereby improving the performance of the entire system.
  • the database system After the Partition is performed in the distributed database in the distributed database as in the previous embodiment 2, the database system often has load adjustment or load imbalance in the process of joining, exiting, and running the system. In order to ensure the smooth running of the system, it is necessary to ensure that the difference in the running load of each node is not too large. In this case, the data table of the Partition has to be divided again.
  • a method of load balancing as shown in FIG. 7 is also provided in Embodiment 3 of the present invention.
  • Step 201 Receive a load balancing request.
  • the manager receives a request for load balancing a node in the database, and the load balancing request indicates that the first node needs to be load balanced, such as the first node.
  • the load balancing request here may be input by the system administrator through the background system, or may be generated by the background monitoring system of the database system, the network management system or the monitoring component of the manager according to the load monitoring status of each node. It is often necessary to perform load balancing in scenarios such as adding, deleting, and the like, or when the load of an node is too heavy.
  • Step 202 Determine a set of CP tables for load balancing. On the first node that is scheduled to perform load balancing, determine the number of CP tables that need to be load balanced on the node by using the number of records or the space occupied. For example, the CP table set with larger load is required to be load-adjusted. CP table collection.
  • Step 203 Determine a data table for performing load adjustment in the CP table set that performs load balancing.
  • a data table requiring load adjustment for example, a data table A having a large load, is further selected by a dimension such as a record number or a occupied space.
  • Step 204 Select a partition key from the determined data table A to perform Range adjustment.
  • a partition key having a relatively large/small load is randomly selected or selected for Range adjustment.
  • Step 205 Perform the same Range adjustment on the data table including the selected partition key in the determined CP table set according to the Partition processing of the data table A in step 204.
  • load balancing adjustment of the relevant node needs to be performed according to the load balancing adjustment of the first node. For example, if the demand is that the first node is overloaded and needs to adjust some data to other nodes, the data split from the first node after the Range adjustment is adjusted to other nodes, and the adjusted data can be adjusted to the load. A lighter node or a newly added node to achieve a balanced load.
  • FIG. 8 Shown in the table on the left side of FIG. 8 is the result of performing the Partition according to the associated table partitioning method in the foregoing embodiment 2.
  • Table 1 is a fact table
  • Table 2 and Table 3 For the dimension table.
  • Table 1 is a three-dimensional Partition
  • Table 2 and Table 3 are one-dimensional Partitions.
  • Table 3 load is large, then select the partition key of Table 3 for Range split, the existing partition key of Table 3 in node 2's Range is (100,200], split it For (100, 150], (150, 200), similarly, the data table (Table 1) including the partition key of Table 3 in the CP table set is divided according to the same Range as Table 3. Then node 2 contains only Table 1. Range(0,10](0,10](100,150] data, the data of Range(0,10) in Table 2, and the data of Range(100,150) in Table 3, the load is greatly reduced. Node 2 is split.
  • the data is allocated to node 9, which contains the data of Range(0,10)(0,10)(150,200) in Table 1, the data of Range(0,10) in Table 2, and the Range(150,200 in Table 3).
  • the data of the node 9 can be a new node or other existing nodes.
  • the existing partition key of Table 2 in Node 2 is Range (0, 10), which is split into (0, 5), (5, 10). ], the same, will contain Table 2 in the CP table collection
  • the data table of the partition key (Table 1) is divided according to the same Range as in Table 2.
  • node 2 contains only the data of Range(0,10)(0,5)(100,200) in Table 1, and Rangee in Table 2.
  • the data of (0,5), the data of Range(100,200) in Table 3 the load is greatly reduced.
  • the data split in node 2 is allocated to node 9, which contains Range(0,10) in Table 1.
  • the data table and the related partition key that need to be load-balanced are determined, and the associated CP table is still subjected to the same Partition according to the associated partition key, thereby ensuring load balancing of each node of each node.
  • the system can run smoothly, and the associated CP tables in each node after load balancing are still the same as the partition of the Partition according to the associated partition key, which ensures that only the local node or the other nodes are associated when performing OLAP operations.
  • the same Range data without the need to get the entire fact table, improve the entire department System performance.
  • FIG. 9 is a structural diagram of a database manager according to Embodiment 4 of the present invention.
  • the database manager 30 includes a determining unit 31 and a co-located partition table set generating unit 32, and a partitioning unit 33. among them,
  • a determining unit 31 configured to determine, in each data table that needs to be partitioned, a first data table, where the partition key of the first data table is not a foreign key of any one of the second data tables, where the second data table refers to a partition that needs to be partitioned Other data tables in the data tables except the first data table. .
  • the determining unit 31 determines that the specific method of the first data table is to arbitrarily select a data table that has not been partitioned from all the data tables of the distributed database, and then determine whether the partition key of the table has a foreign key of another table in the partition key of the table.
  • the table is used as the first data table, otherwise it is determined whether the other table can be used as the first data table for the other tables found, and the processing is repeated until the first data table is found. It can be considered that the first data table is the top-level association table in the data table that needs Partition, and the partition key in the first data table is not a foreign key of other tables. Of course, if there is a foreign key in the data table of the Partition that has not been partitioned, but the other table has been Partition, the data table that has not been Partition can also be used as the first data table.
  • the determining unit 31 reference may be made to the descriptions in the foregoing embodiments 1 and 2, and details are not described herein again.
  • a co-located partitioned CP table set generating unit 32 configured to generate a CP table set of the first data table, where the CP table set of the first data table includes a first data table and at least one CP table of the first data table, where The CP table of the first data table is a data table in which the partition key is a subset of the partition keys of the first data table. It can be understood that the partition key of each data table in the CP table set of the first data table is included in the partition key of the first data table, and the set includes the first data table and the data table that can perform CP with the first data table. .
  • the co-located partitioned CP table set generating unit 32 is configured to: obtain a dimension table of the first data table, determine whether the dimension table of the first data table is a CP table of the first data table, and if so, The dimension table of a data table is added to the CP table set of the first data table; iteratively determines whether the dimension table of the CP table of the first data table that has been added to the CP table set of the first data table is the CP table of the first data table, The data table confirmed as the CP table of the first data table is added to the CP table set of the first data table until all the CP tables of the first data table are found.
  • the co-located partition CP table set generation unit 32 generates the first For a specific method of the CP table set of a data table, refer to the descriptions in the foregoing Embodiments 1 and 2, and details are not described herein again.
  • the partitioning unit 33 is configured to partition the first data table according to each partition key of the first data table, and partition each CP table of the first data table in the CP table set, where the CP table set is in the
  • the partitioning range of the partitioning key associated with the first data table for each CP table of a data table is the same as the partitioning extent of the corresponding partitioning key in the first data table.
  • the partitioning unit 33 performs the same Partition with the same partitioning key as the first data table according to the same partitioning key as the first data table, that is, the same partitioning key of the corresponding first data table performs the same Partition (Range section). the same). Selecting another data table other than the first data table other than the first data table in the CP table set of the first data table, and performing the same Partition as the first data table according to the same partition key of the selected data table and the first data table, The same Partition mentioned here means that the same partition key performs the same section of the Partition, until all the data tables in the CP table set of the first data table have completed the Partition.
  • the partitioning unit 33 For specific implementation details of the partitioning process performed by the partitioning unit 33, reference may be made to the descriptions in the foregoing embodiments 1 and 2, and details are not described herein again.
  • FIG. 10 is another structural diagram of the database manager according to Embodiment 4 of the present invention.
  • the database manager 30 shown in the embodiment of the present invention further includes a load balancing unit 34.
  • the load balancing unit 34 is used to:
  • the load balancing request indicating that load balancing needs to be performed on the first node; determining, in the first node, a set of CP tables for performing load balancing;
  • a data table including the partition key selected for the Range adjustment in the CP table set that is load-balanced, and the partition key that performs Range adjustment according to the selection performs data of the partition key that is Range-adjusted by the selection. The same Range adjustment for the table.
  • the determining unit 31 of the embodiment of the present invention includes:
  • the candidate table selection module 311 is configured to select, in each data table that needs to be partitioned, a data table that has not been subjected to a Partition as a candidate table of the first data table;
  • the determining module 312 is configured to determine whether the partition key is a foreign key of another data table in the partition key of the selected candidate table, if not, the candidate table is used as the first data table, otherwise The other data table continues as a candidate list of the first data table to determine whether it can be the first data table until the first data table is determined.
  • the candidate table can also be used as the first data table.
  • the co-located partitioned CP table set generating unit 32 in the database manager 30 includes: an obtaining module 321: a partitioning key for acquiring a data table; and a dimension table associated with the partitioning key of the one data table;
  • a CP table judging module 322 configured to determine whether each of the associated dimension tables is a CP table of the first data table, if all partition keys of one of the associated dimension tables are included in the first data table In the partition key, it is confirmed that the associated dimension table is the CP table of the first data table, otherwise, it is confirmed that the associated dimension table is not the CP table of the one data table.
  • the database manager 30 further includes a load balancing unit 34, and the load balancing unit 34 includes:
  • the receiving module 341 is configured to receive a load balancing request, where the load balancing request indicates that load balancing is performed on the first node, and determine a set of CP tables for performing load balancing in the first node.
  • a determining module 342 configured to determine a data table for performing load adjustment in determining a set of CP tables for performing load balancing, and selecting a partition key from the data table for determining load balancing to perform Range adjustment;
  • the adjustment module 343 is configured to: in the CP table set for determining the load balancing, all the data tables including the partition key selected for the Range adjustment, and the partition key selected according to the selection of the range is the same as the data table of the partition key selected for the Range adjustment. Range adjustment.
  • the determining module 342 determines the set of the CP table for load balancing according to the data table record number or the occupied space dimension, and determines the CP table set that needs to be load balanced on the first node.
  • the database manager in the embodiment of the present invention performs Partition on the association table in the distributed database, first find the associated CP table in each data table that needs the Partition, and use the same CP according to the associated partition key.
  • the Partition's Range makes it possible to perform OLAP operations by only obtaining the associated Range data from the node or from other nodes, without the need to obtain an entire fact table, thereby improving the performance of the entire system.
  • the database manager determines the data table and the related partition key that need to be load-balanced, and the associated CP table still performs the same Partition according to the associated partition key, and each node can be guaranteed.
  • the load balancing of the nodes and the smooth running of the system, and the associated CP tables in each node after the load balancing are still the same as the partition of the Partition according to the associated partition key, which ensures that only the local node or other nodes are required for the OLAP operation. Get the same Range data of the association without having to get the entire fact table to improve the performance of the entire system.
  • Fig. 12 is a view schematically showing a database management device 40 of Embodiment 5 of the present invention.
  • the database management device 40 includes a processor 41, a memory 42, and a bus 43.
  • the processor 41 and the memory 42 realize a communication connection with each other via the bus 43.
  • the processor 41 can be a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits for executing related programs.
  • CPU central processing unit
  • ASIC application specific integrated circuit
  • the memory 42 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • Memory 42 can store operating systems and other applications.
  • the program code for implementing the technical solution provided by the embodiment of the present invention is stored in the memory 42 and executed by the processor 41 when the technical solution provided by the embodiment of the present invention is implemented by software or firmware.
  • the memory 42 can be used to store computer-executed instructions, as well as to store various information, such as query results.
  • the processor 41 can read the information stored by the memory 42 through the bus system 43 or store the result of the query to the memory 42. Further, when the database management device 40 is running, the processor 41 can read the computer execution instructions stored in the memory 42 to execute the front The association table Partition method of the distributed database described in Embodiments 1, 2, and 3.
  • the database management device may further include an input/output interface 44 and a communication interface 45.
  • the input/output interface 44 is for receiving input data and information, and outputting data such as operation results.
  • Communication interface 45 implements communication between database management device 40 and other devices or communication networks using transceivers such as, but not limited to, transceivers.
  • Bus 43 may include a path for communicating information between various components of database management device 40, such as processor 41, memory 42, input/output interface 43, and communication interface 44.
  • database management device 40 shown in FIG. 12 only shows the processor 41, the memory 42, the input/output interface 44, the communication interface 45, and the bus 43, in a specific implementation process, those skilled in the art should Understand that the data, database management device 40 also contains other devices necessary to achieve proper operation. At the same time, those skilled in the art will appreciate that database management device 40 may also include hardware devices that implement other additional functions, depending on the particular needs. Moreover, those skilled in the art will appreciate that the database management device 40 may also only include the components necessary to implement the embodiments of the present invention, and does not necessarily include all of the devices shown in FIG.
  • the disclosed systems, devices, and The method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the unit/module is only a logical function division.
  • there may be another division manner for example, multiple units or components may be used. Combinations can be integrated into another system, or some features can be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present invention.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in one computer computer readable storage medium or as one or more instructions or code embodied on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a computer.
  • computer readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, disk storage media or other magnetic storage device, or can be used for carrying or storing in the form of an instruction or data structure.
  • the desired program code and any other medium that can be accessed by the computer may be stored in one computer computer readable storage medium or as one or more instructions or code embodied on a computer readable medium.
  • transmission Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can
  • connection may suitably be a computer readable medium.
  • the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave
  • coaxial cable , fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, wireless, and microwave are included in the definition of the medium to which they belong.
  • a disk and a disc include a compact disc (CD), a laser disc, a compact disc, a digital versatile disc (DVD), a floppy disk, and a Blu-ray disc, wherein the disc is usually magnetically copied, and the disc is The laser is used to optically replicate the data.
  • CD compact disc
  • DVD digital versatile disc
  • a floppy disk a compact disc
  • Blu-ray disc wherein the disc is usually magnetically copied, and the disc is The laser is used to optically replicate the data.
  • the above combination is also It should be included within the scope of the computer readable medium.
  • the technical solution of the present invention is essential or part of the prior art, or all or part of the technical solution may be stored in a storage medium, including a plurality of instructions for causing a computer device (may be a personal computer, server, or network device, etc.) performing all or part of the steps of the methods described in various embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明实施例公开了一种分布式数据库中关联表分区的方法,包括:在需要分区的各数据表中确定第一数据表,生成第一数据表的共址分区CP表集合,第一数据表的CP表集合包括第一数据表和第一数据表的至少一个CP表,其中,第一数据表的CP表为分区键是第一数据表的分区键的子集的数据表,将第一数据表按照分区键进行Partition,并将CP表集合中第一数据表的每个CP表进行分区,每个CP表与第一数据表相关联的分区键的分区范围与第一数据表中相对应的分区键的分区范围相同。从而使得进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,从而提升了整个系统的性能。

Description

一种分布式数据库中关联表分区的方法和设备 技术领域
本发明实施例涉及数据库领域,尤其涉及分布式数据库中关联表分区的方法方法和设备。
背景技术
海量数据对于传统的分布式数据库管理系统(Distributed Database Management System,简称DDBMS)是一个挑战。而Google设计的分布式数据存储系统BigTable思想的公开,使得大量的非关系型数据库(Not Only SQL,简称NoSQL)系统相继出现,NoSQL作为主要面向万维网(Web)业务的数据管理系统,由于没有复杂的表关系,基于键值(Key-Value)的数据模型就可以解决绝大部分数据管理的问题。即使存在管理关系的表,NoSQL也通过构建嵌套形式数据模型的Schema模型,把多张关系表的数据放置在一张大表中。但这一类型的数据模型会增加数据的冗余,而且面对分析型业务的数据时,这种嵌套的数据模型严重影响联机分析处理(On-Line Analytical Processing,简称OLAP)的查询效率。
分布式数据库管理系统,也称为分布式数据库,其中有多个节点,每个节点维护和管理着一部分数据,其中有一部分数据提供服务(称为Active数据),另外一部分数据作为备份不提供服务(称为Backup数据),这些节点也可以称作数据管理节点或数据节点。所有节点的Active数据的并集是数据库的完整数据,节点之间表的Active数据不存在交集。为了保证上面这个条件,在数据库管理员(Database Administrator,简称DBA)创建表后,数据库管理系统需要对表进行分区(Partition)处理。然后把Partition后的各个分区范围(Range),分配到各个节点上进行管理。在多表管理的业务场景中,需要对各个表建立各自的分区键(PartitionKey),并把Partition后的各个Range分配到各个节点进行管理。Partition和分配方式会影响OLAP的查询效率,基于海量数据的 分布式数据库,OLAP操作会涉及计算节点需从其他节点获得计算数据,造成系统延迟过大,使得整个系统的整体性能下降。
现有技术中,分布式数据库,例如,Hadoop数据库(Hadoop ed database,简称HBase)、分布式key-value数据库HyperDex等对于关联表的Partition方法均会导致OLAP的操作涉及到批量的数据要从多个节点获得,且多表Join操作时,会涉及维度表的全局匹配,导致整个系统的性能下降。
发明内容
有鉴于此,本发明实施例提供了一种分布式数据库中关联表的Partition方法和设备,使得OLAP操作时减少从多个节点获取大量的数据以及维度表全局匹配的情况,从而提升了整个系统的性能。
第一方面,本发明实施例提供了一种分布式数据库中关联表分区的方法,包括:在需要分区的各数据表中确定第一数据表,所述第一数据表的分区键不是任一个第二数据表的外键,所述第二数据表为所述需要分区的各数据表中除所述第一数据表之外的其他数据表;生成所述第一数据表的共址分区CP表集合,所述第一数据表的CP表集合包括第一数据表和所述第一数据表的至少一个CP表,其中,所述第一数据表的CP表为分区键是所述第一数据表的分区键的子集的数据表;将所述第一数据表按照所述第一数据表的每个分区键分别进行分区;将所述CP表集合中所述第一数据表之外的每个CP表进行分区,其中,所述CP表集合中所述第一数据表之外的每个CP表与所述第一数据表相关联的分区键的分区范围与所述第一数据表中相对应的分区键的分区范围相同。
在第一方面的第一种可能的实现方式中,所述生成所述第一数据表的CP表集合,具体为:获取所述第一数据表的维度表,判断所述第一数据表的维度表是否为所述第一数据表的CP表,若是,则将所述第一数据表的维度表加入所述第一数据表的CP表集合;迭代判断已加入所述第一数据表的CP表集合中的所述第一数据表的CP表的维度表是否为所述第一数据表的CP表,将确认为是所述第一数据表的CP表的数据表加入所述第一数据表的CP表集合, 直到查找到全部的所述第一数据表的CP表为止。
结合第一方面或第一方面的第一种可能的实现方式,在第二种可能的实现方式中,所述方法还包括:接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;在所述第一节点中确定进行负载均衡的CP表集合;
在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
结合第一方面的第二种可能的实现方式,在第三种可能的实现方式中,所述在所述第一节点中确定进行负载均衡的CP表集合,具体为:根据数据表记录数或者占用空间维度,确定所述第一节点上需要进行负载均衡的CP表集合。
第二方面,本发明实施例提供了一种数据库管理器,包括:确定单元,用于在需要分区的各数据表中确定第一数据表,所述第一数据表的分区键不是任一个第二数据表的外键,所述第二数据表为所述需要分区的各数据表中除所述第一数据表之外的其他数据表;共址分区CP表集合生成单元,用于生成所述第一数据表的共址分区CP表集合,所述第一数据表的CP表集合包括第一数据表和所述第一数据表的至少一个CP表,其中,所述第一数据表的CP表为分区键是所述第一数据表的分区键的子集的数据表;分区单元,用于将所述第一数据表按照所述第一数据表的每个分区键分别进行分区,并将所述CP表集合中所述第一数据表的每个CP表进行分区,其中,所述CP表集合中所述第一数据表的每个CP表与所述第一数据表相关联的分区键的分区范围与所述第一数据表中相对应的分区键的分区范围相同。
在第二方面的第一种可能的实现方式中,所述共址分区CP表集合生成单元,具体用于:获取所述第一数据表的维度表,判断所述第一数据表的维度表是否为所述第一数据表的CP表,若是,则将所述第一数据表的维度表加入所述第一数据表的CP表集合;迭代判断已加入所述第一数据表的CP表集合中的所述第一数据表的CP表的维度表是否为所述第一数据表的CP表,将确认为是所述第一数据表的CP表的数据表加入所述第一数据表的CP表集合, 直到查找到全部的所述第一数据表的CP表为止。
结合第二方面或第二方面的第一种可能的实现方式,在第二种可能的实现方式中,数据库管理器还包括负载均衡单元,用于:接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;在所述第一节点中确定进行负载均衡的CP表集合;在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
结合第二方面或第二方面的第一种可能的实现方式,在第三种可能的实现方式中,所述数据库管理器还包括负载均衡单元,所述负载均衡单元包括:接收模块:用于接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;在所述第一节点中确定进行负载均衡的CP表集合;确定模块:用于在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;调整模块:用于对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
结合第一方面第二种或第三种可能的实现方式,在第四种可能的实现方式中,所述确定进行负载均衡的CP表集合的方法为:根据数据表记录数或者占用空间维度,确定所述第一节点上需要进行负载均衡的CP表集合。
第三方面,本发明实施例还提供了一种数据库管理设备,包括:处理器、存储器和总线;所述存储器用于存储计算机执行指令,所述处理器与所述存储器通过所述总线连接,当所述数据库管理设备运行时,所述处理器读取所述存储器存储的所述计算机执行指令,以执行上述第一方面或第一方面中任一可能的实现方式中所述的方法。
通过上述方案,本发明实施例提供的分布式数据库中关联表的Partition方法,通过把各个关联表中相关联的数据共址(Colocation),即关联表中相关联的分区键的Partition方式相同,并且分布也一一映射起来。 使得进行OLAP操作时,只需要本节点或者从其他节点获得关联的Range的数据,而不需要获得整张的事实表,从而提升了整个系统的性能。
附图说明
为了更清楚地说明本发明实施例的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例中分布式数据库架构图;
图2为本发明实施例1的分布式数据库库结构示意图;
图3为本发明实施例1的分布式数据库Partition结果示意图;
图4为本发明实施例2的分布式数据库中关联表分区方法流程图;
图5为本发明实施例2中确定第一数据表的方法流程图;
图6为本发明实施例2中生成第一数据表的CP表集合的方法流程图;
图7为本发明实施例3中负载均衡的方法流程图;
图8为本发明实施例3中节点负载均衡示意图;
图9为本发明实施例4的数据库管理器的结构图;
图10为本发明实施例4的数据库管理器的另一结构图;
图11为本发明实施例4的数据库管理器的另一结构图;
图12为本发明实施例5的数据库管理设备的结构图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
图1为本发明实施例中分布式数据库系统架构图,如图1所示,分布式数据库系统中包括多个节点20,如图中所示节点_1--节点_5,还包含一个管理器10(Manager)用于对数据库系统进行维护、管理,图1所示的管理器 10独立于节点1-5,在不同于图1所示的架构中,管理器也可以由分布式数据库系统的各节点中的一个节点来担任,设置在该节点上,提供数据库系统的维护、管理功能。本发明实施例中分布式数据库系统中的节点和管理器具体可以由任意主机来实现,任意主机包括普通的计算机、移动终端、工作站或服务器、专用服务器等,如X86处理器,本发明不作具体限定。每个节点维护和管理着部分数据,其中有一部分数据提供服务(称为Active数据,也可称为主用数据),另外一部分数据作为备份(称为Backup数据,也可称为备用数据),所有节点Active数据的并集是数据库的完整数据。在数据进行存储时,需要对表进行分区(Partition)处理,然后把Partition后的各个范围(Range),分配到各个节点上去。如图1中所示,将主用数据按照Partition分别存储到各个节点,例如,其中Range(0,10]、(50,60]在节点_1存储,Range(10,20]、(60,70]在节点_2存储,各个节点中存储的Active数据构成了数据库的完整数据(0,100],且不重复。进一步的,每个节点都存储一部分其他节点主用数据的Backup数据,图1所示的系统中每个节点的主用数据在其他的两个节点存有备份,例如,节点_1存储的Range(0,10]、(50,60]的主用数据,在节点_4、节点_5均有备份,节点_1会将自身的主用数据向节点_4、节点_5备份作为备用数据。
在分布式数据库多表管理的业务场景中,数据库保存的表根据保存的内容可以分为事实表和维度表。一个事实表可以关联多个维度表。如下表1和表2所示,表1为事实表,表2为与事实表1相关联的多个维度表中的一个。
表1事实表
销售地区 产品代码 销售月份 销售量 销售额
1 347 1 3 7.95
1 447 1 3 7.32
2 347 2 4 7.95
2 447 2 4 7.32
3 339 3 16 42.70
表2维度表
产品代码 产品名称 产品类别
347 山地100 自行车
339 远路650 自行车
447 车锁 配件
事实表是数据库架构中的核心表,包含描述业务(如银行事务或产品销售)内特定事件的数据,用于存储至少一个事实记录,每个事实记录对应事实表中的一行。表1中的事实表中的每一数据行记录一个销售的事实记录,在示例数据行中,前3个列——销售地区、产品代码和销售月份——为键值列。剩下的两个列——销售量和销售额——为度量值列。键值列中的值对应事实记录的维度,例如,销售地区、产品代码、销售月份、产品产地、产品种类等等,度量值列中的值对应于事实表的主题内容,例如,销售额或销售量等等。维度表中的条目用来描述事实表中的数据,记录事实表中的事实记录的维度(一般是其中的键值列)特性,维度表包含创建维度所基于的数据。例如表2中的维度表用于对表1中的事实表的产品代码维度进行描述。维度表可以包括名称列和属性列,例如表2中,产品代码为名称列,产品名称和产品类别为属性列。例如,在实际应用中,银行对存款记账的数据库系统,A表中存放实际数据,包括账号、所属机构号、存款金额等,B表存放机构号和机构名称的对应关系,则A是事实表,B是维度表。
数据表包含主键和外键,主键在创建表的时候指定,可以是表中的一列或者是多列的联合值,主键是行记录的唯一标识,具有唯一性,一个表中任意两行具有不同的主键值,主键一般为表的第一列。外键则用于连接两个表,是另一张表中与这张表的某个字段的类型、字段名相同的字段,用作关联两张或两张以上的表,可以理解为外键一般是指另外表中的主键。通常,事实表的外键对应于维度表的主键。例如,如果表A中的列M与表B中的列N相对应,即A表中的列M与B表中列N对应于相同的属性,例如,均对应于订单号,并且列M为表A中的主键,则列N为表B中的外键。联接两个数据表的键也常常称为关联键,事实表与维度表通过关联键进行关联,关联键对于事实表来说是外键,对于维度表来说是主键,事实表的外键对应于维度表的主键。例如,表1中的产品代码列与表2中的产品代码列相关联,可以称作关联键, 表1中的产品代码列为事实表的外键,表2中的产品代码列为维度表的主键。
在分布式数据库中,事实表中的数据量要远大于维度表的数据量,表之间的关联查询非常普遍,相对于维度表,事实表的数据插入和更新更加频繁。通常,管理器需要对各个表设置分区键分区键,分区键是数据表中决定数据行属于哪个分区的一组数据列,是对数据进行分区的判断依据。管理器通过PartitionKey对数据表进行Partition,得到Partition后每个表的各个Range,将每个表的各个Range分配到各个节点进行管理,可以采用对表的Partition Key进行哈希(Hash)、排序(Order)或其他的Partition方式,各个表的数据按照Partition得到的各个Range,分散在各个节点上。需要进行数据处理时,把数据聚合到一个节点,进行聚合Join计算,对于基于海量数据的分布式数据库,跨节点数据读取造成系统延迟过大,会使得整个系统的整体性能下降,尤其是在涉及到多个表的Join操作时。
本发明实施例中提供了一种分布式数据库中关联表分区的方法。图2示出了本发明实施例相关的分布式数据库的库结构示意图,在分布式数据库中包括如下8个关联数据表,项目表Lineitem、部件供应商表Part_Supp、部件表Part、供应商表Supp、订单表Order、客户表Customer、国家表Nation、区域表Region。其中,各个表之间的关联关系如图2所示:LineItem表通过部件键Part_Key、供应商键Supp_Key与Part_Supp表关联,通过订单键Order_Key与Order表关联,这里的Part_Key、Supp_Key、Order_Key均为LineItem表的外键。Part_Supp表通过部件键Part_Key与Part表关联,通过供应商键Supp_Key与Supp表关联。Order表通过客户键CustomerKey与Custermer表关联,CustomerKey为Order表的外键。Customer表通过国家键Nat_Key与Nation表关联,Nat_Key为Custermer表的外键。Nation表通过区域键Reg_Key与Region表关联,Reg_Key为Nation表的外键。各数据表中确定的分区键如下:LineItem表的分区键为OrderKey、Part_Key和Supp_Key,Part_Supp表的分区键为Part_Key和Supp_Key,Part表的分区键为Part_Key,Supp表的分区键为Supp_Key,Order表的分区键为Order_Key和CustermerKey,Custermer表的分区键为CustermerKey,Nation表、Region表没有指定分区键。
在本发明实施例的关联表分区的方法中,首先从需要Partition的各数据表中选定一个核心数据表,我们在这里把它称作第一数据表。确定第一数据表的具体步骤是:从分布式数据库的所有数据表中任意选择一个还未进行Partition的数据表,这里假设图2中所示的8个数据表都还未进行Partition,假设这里选择了部件供应商表Part_Supp表。确定Part_Supp表的分区键为Part_Key和Supp_Key,然后判断该表的分区键中是否存在分区键为其他表的外键,如果没有,则就将这个表作为第一数据表(即核心数据表),如果存在分区键为其他表的外键,那么获得这个其他表(这里所找到的其他表,也就是该表的事实表),再判断这个事实表是否可以作为第一数据表,直到找到第一数据表。因为Part_Supp表的分区键中Part_Key、Supp_Key均为LineItem表的外键,则找到事实表LineItem表,接下来继续判断LineItem表是否可以作为第一数据表,因为LineItem表中的LineItem表的分区键为OrderKey、Part_Key、Supp_Key,均不是其他数据表的外键,所以可以选定LineItem表作为进行Partition的第一数据表。上述方法选择了一个数据表进行Partition,可以理解为实质上最终是找到一个事实表,这个事实表是需要Partition的各关联数据表中最上层的数据表,这个第一数据表中的分区键均不是其他表格的外键。可以看出来,在图2所示的分布式数据库中,无论从哪个表开始这个查找第一数据表的步骤,最终均会找到事实表LineItem表作为第一数据表。
接下来,需要找第一数据表的共址分区(Colocation Partition,简称CP)表,建立第一数据表的CP表集合。CP表的概念是指,如果表B的所有分区键均包含在表A的分区键中,即表B的分区键是表A的分区键的子集,则称表B为表A的CP表,表明B表可以按照与A表对应的关联键进行相同的Range的Partition。首先,获得我们找到的第一数据表LineItem的所有分区键,以及与分区键关联的各数据表(即LineItem表的分区键作为外键所关联的维度表,也可称作分区键的关联表)。LineItem表的分区键为OrderKey、Part_Key、Supp_Key,找到它的关联表包括Part_Supp表、Order表。接下来逐个判断这些关联表,是否适合于与LineItem表一起进行CP,即是否为LineItem表的CP表,直到完成对所有关联表的判断,获得LineItem表的CP 表集合。具体方法为:选择一个未判断CP表关系的第一数据表的关联表,并获得该关联表的所有分区键,判断该关联表的所有分区键是否全部包含在第一数据表的分区键中,若全部包含,则确认该关联表是第一数据表的CP表,将其加入第一数据表的CP表集合,并进一步采用循环迭代的方式来判断该关联表(即前面确认为第一数据表的CP表)的关联表是否为第一数据表的CP表,同样是先确定与该关联表的分区键关联的数据表,在这些与该CP表关联的数据表中,判断若关联的数据表的所有分区键全部包含在第一数据表的分区键中,则确认这个数据表是第一数据表的CP表,也将这个数据表加入到第一数据表的CP表集合中。按照此方法循环迭代持续查找第一数据表的CP表,直到找到的CP表再找不到第一数据表的CP表为止。将第一数据表的关联表全部按照上述方法完成CP表关系的判断,获得所有第一数据表的CP表集合。至此,第一数据表的CP表集合中包括了第一数据表和第一数据表的CP表,集合中各个CP表的分区键均包含在第一数据表的分区键中。在查找第一数据表的CP表时,如果一个关联表的分区键不全部包含在第一数据表的分区键中,则可以确认该关联表不是第一数据表的CP表。所以,针对LineItem表的分区键的关联表Part_Supp表、Order表,逐一判断各关联表的CP表关系,Part_Supp表的分区键为Part_Key、Supp_Key,均包含在LineItem表的分区键中,所以Part_Supp表为LineItem表的CP表,应在LineItem表的CP表集合中。接下来判断Part_Supp表的关联表,Part_Supp表的分区键为Part_Key、Supp_Key,找到它的分区键的关联表包括Part表、Supp表,而Part表的分区键为Part_Key,Supp表的分区键为Supp_Key,均包含在LineItem表表的分区键中,所以Part表、Supp表均为LineItem表表的CP表,也应在LineItem表的CP表集合中。接下来,查找Part表、Supp表的关联表,Part表、Supp表均没有分区键的关联表,所以均没有LineItem表的CP表。再来看Order表,Order表的分区键为Order_Key、CustermerKey,其中,CustermerKey不包含在LineItem表的分区键中,所以Order表不是LineItem表的CP表。经过前面的步骤,获得了第一数据表的CP表集合CP1,CP1中包括第一数据表以及可以和第一数据表进行共址分区CP的关联表,即第一数据表的CP表。所以,前面LineItem表的CP表集合CP1中包括LineItem 表、Part_Supp表、Part表、Supp表。
下面就可以对第一数据表和其CP表集合CP1中的表进行共址分区Colocation Partition。具体的步骤是:根据第一数据表的分区键进行第一数据表的Partition,如果分区键的数量为一个,则为一维Partition,如果为多个,则为多维Partition。依次对第一数据表的CP表集合中未Partition的CP表进行Partition,根据该CP表与第一数据表相同的分区键进行与第一数据表相同的Partition,即CP表和其对应的第一数据表,在相同的分区键进行相同的Partition,这里所说的相同的Partition,是指两个数据表在进行Partition时,相同的分区键的Range区间相同,直到CP集合中所有数据表都被Partition,则完成了该分布式数据库的关联表分区。如本发明实施例的分布式数据库Partition结果示意图图3中所示的CP表集合的分区结果,每个数据表根据其分区键的数量,确定了进行Partition的Range的维数,相同的分区键在各个数据表中对应相同的Range区间。例如,CP表集合CP1中的LineItem表、Part_Supp表、Part表、Supp表,均以相同的Range区间进行Partition,LineItem表为三维分区,其中L列按照(L_0,L-1]、(L_1,L-2]、(L_2,L-3]进行Partition分为3个Range,P列按照(P_0,P-1]、(P_1,P-2]、(P_2,P-3]、(P_3,P-4]、(P_4,P-5]进行Partition分为5个Range,S列按照(S_0,S-1]、(S_1,S-2]进行Partition分为2个Range,Part表为一维分区,P列采用与LineItem表相同的Range区间,按照(P_0,P-1]、(P_1,P-2]、(P_2,P-3]、(P_3,P-4]、(P_4,P-5]进行Partition分为五个Range,Supp表也为一维分区,S列采用与LineItem表相同的Range区间,按照(S_0,S-1]、(S_1,S-2]进行Partition分为两个Range,Part_Supp表作为LineItem表的CP表,为二维分区,采用与LineItem表相同的Range区间,P列按照(P_0,P-1]、(P_1,P-2]、(P_2,P-3]、(P_3,P-4]、(P_4,P-5]进行Partition,S列按照(S_0,S-1]、(S_1,S-2]进行Partition。通过这种方式,同一个CP表集合中的各个关联表中相关联的分区键的Partition方式相同,即Range相同。
接下来,再针对分布式数据库中需要Partition的各数据表重复执行上面的步骤,直到完成整个数据库中所有表的Partition。以图2所示的数据库为例,在完成前面的Partition后,尚剩余订单表Order、客户表Customer、国家表Nation、区域表Region未进行Partition,按照前面所述的方法,首先确定第一数据表为Order表,再建立Order表的CP表集合CP2,Order表的CP表集合CP2中包括Order表、Customer表。再将Order表的CP表集合CP2中的表进行共址分区Colocation Partition,得到如图3中所示的CP表集合的分区结果,Order表、Customer表均以相同的Range区间进行Partition,Order表为二维分区,其中O列按照(O_0,O-1]、(O_1,O-2]、(O_2,O-3]进行Partition,C列按照(C_0,C-1]、(C_1,C-2]、(C_2,C-3]、(C_3,C-4]、(C_4,C-5]进行Partition,Customer表作为Order表的CP表为一维分区,C列采用与Order表相同的Range区间,按照(C_0,C-1]、(C_1,C-2]、(C_2,C-3]、(C_3,C-4]、(C_4,C-5]进行Partition。国家表Nation、区域表Region无分区键,无需考虑。至此,整个分布式数据库的Partition完成。且各个数据表均按照同一个CP表集合中的相关联的分区键,进行相同方式的Partition,分区范围也就能够一一映射起来。
通过本发明实施例中的分布式数据库中关联表的Partition方法,在各个关联表中找出相关联的CP表,将CP表按照相关联的分区键进行Partition的Range方式相同,使得进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,从而提升了整个系统的性能。
图4为本发明实施例2的一种分布式数据库中关联表分区方法的流程图。如图所示,本方法为:
步骤101,在需要分区的各数据表中确定第一数据表。这里的第一数据表是指满足如下的条件的一个数据表:该数据表的分区键分区键不是任一个第二数据表的外键,这里的第二数据表是指需要分区的各数据表中除第一数据表之外的其他数据表。
在分布式数据库系统中,通常由系统中的管理器Manager进行Partition 处理。优选的,管理器在系统中选定第一数据表可以采用如下方法:从分布式数据库的所有数据表中任意选择一个还未进行Partition的数据表,然后判断该表的分区键中是否存在分区键为其他表的外键,如果没有,则就将这个表作为第一数据表,否则再针对这个找到的其他表,判断这个其他表是否可以作为第一数据表,如此反复处理,直到找到第一数据表。可以认为,第一数据表是需要Partition的数据表中最上层的关联表,这个第一数据表中的分区键均不是其他表格的外键。当然,如果这个尚未进行Partition的数据表中存在分区键为其他表的外键,但这个其他表已进行了Partition,同样可以将这个尚未进行Partition的数据表作为第一数据表。
步骤101中在需要分区的各数据表中确定第一数据表的方法流程可以如图5所示。包括:
步骤1011,从数据库中的所有数据表中,选择一个尚未Partition的数据表作为备选表。管理器Manager从所有数据表中任意选择一个尚未Partition的数据表开始确定是否可以作为第一数据表,这里把这个选择的数据表称为备选表,即第一数据表的备选表。
步骤1012,获取该备选表的所有分区键。
步骤1013,判断该备选表的分区键中是否存在分区键为其他数据表的外键,若有则进入步骤1014,否则进入步骤1016。
步骤1014,获得步骤1013所找到的该其他数据表。如果该备选表存在分区键为其他数据表的外键(或叫做关联键),那么获得该其他数据表。从前面对分布式数据库的介绍可以得知,这个其他数据表其实就是该备选表的事实表。即步骤1013中找到的是该其他数据表,它的外键为该备选表的外键,那么它就是该备选表的事实表。步骤1013的实质就是寻找该备选表的事实表。
步骤1015,判断该找到的其他表(事实表)是否已经Partition,若已经Partition,则进入步骤1017,否则进入1016。
步骤1016,如果该其他表还没有Partition,则将该其他表作为新的备选表,重新进入步骤1012。如果该其他表还没有进行Partition,那么说明该其他表是比当前的备选表更上层的需要进行Partition的数据表,接下来将该其他表作为新的备选表,重新回到步骤1012,查找该新的备选表是否还有更上层 的未进行Partition的关联表。
步骤1017,如果该备选表没有分区键为其他数据表的外键,或即使找到其有分区键为其他数据表的外键,但找到的其他数据表,即它的事实表已经Partition,则可以确定该备选表就是要找的第一数据表。其实,符合上面两个条件说明,这个备选表就是需要Partition的数据表中最上层的关联表,它的分区键均不是其他表的外键,或即使它存在分区键是其他表的外键,但它的上层表均已Partition。
通过上面的方法流程可以完成第一数据表的确定。当然,在步骤101之前,数据库系统中的管理器接收系统管理员通过后台系统输入的命令创建各数据表,完成各个数据表的建立和表的分区键的定义。首先需要完整定义各个数据表中的字段,以及确定各个数据表的分区键分区键。优选的,可以通过数据定义语言(Data Definition Language,DDL)完成数据表的字段定义以及PatitionKey的定义。
步骤102,生成第一数据表的共址分区CP表集合。第一数据表的CP表集合包括第一数据表和第一数据表的至少一个CP表。其中,第一数据表的CP表为分区键是第一数据表的分区键的子集的数据表,即,第一数据表的CP表集合中每个数据表的分区键均包含在第一数据表的分区键中,集合中包括第一数据表以及可以和第一数据表进行CP的数据表。
优选的,首先需要获得第一数据表的所有分区键,以及与第一数据表的分区键关联的各关联表(即第一数据表的分区键作为外键所关联的维度表),接下来逐个判断这些关联表是否为第一数据表的CP表。其中,判断一个关联表是否为第一数据表的CP表的方法具体为:获得该关联表的所有分区键,判断该关联表的所有分区键是否全部包含在第一数据表的分区键中,若全部包含,则确认该关联表是第一数据表的CP表,将其加入第一数据表的CP表集合,否则,则确认该关联表不是第一数据表的CP表。对于是第一数据表的CP表的关联表,例如,第一CP表,要参照确认第一数据表的关联表的方法,来判断第一CP表的关联表是否为第一数据表的CP表。判断一个该第一CP表的分区键关联的关联表是否为该第一数据表的CP表的具体方法类似地为:获得该关联表的所有分区键,判断该关联表的所有分区键是否全部包含在第 一数据表的分区键中,若全部包含,则确认该关联表是第一数据表的CP表,将其加入第一数据表的CP表集合,否则,则确认该关联表不是第一数据表的CP表。并进一步地,按照此方法判断该新加入的第一数据表的CP表的关联表,按照如此循环迭代的方法,直到找到的CP表再找不到第一数据表的CP表为止。将第一数据表的关联表全部按照上述方法完成CP表关系的判断,获得所有第一数据表的CP表集合。至此,第一数据表的CP表集合中包括了第一数据表和第一数据表的CP表,按照确认过程可以确定,集合中各个CP表的分区键均包含在第一数据表的分区键中。经过前面的步骤,获得了第一数据表的CP表集合,集合中包括第一数据表以及可以和第一数据表进行共址分区CP的数据表。
步骤102中生成第一数据表的共址分区CP表集合的方法流程可以如图6所示。包括:
步骤1021,获取第一数据表的分区键关联的关联表。首先,获取第一数据表的分区键,然后根据分区键获取分区键关联的关联表。
步骤1022,判断关联表中是否存在未判断与第一数据表的CP关系的关联表。需要逐个判断第一数据表的这些关联表是否适合与第一数据表进行共址分区CP,获得第一数据表的CP表集合。与第一数据表的CP关系是指这个关联表是否为第一数据表的CP表。若有,则说明生成第一数据表的CP表集合的工作尚未结束,执行步骤1023;否则,则第一数据表的CP表集合建立已完成。
步骤1023,选择一个未判断与第一数据表的CP关系的关联表,获得该关联表的所有分区键。在未判断与第一数据表的CP关系的关联表中,任意选择一个关联表进行CP关系判断,首先获取该关联表的分区键。
步骤1024,判断该未判断CP关系的关联表的所有分区键是否均包含在第一数据表的分区键中。如果该关联表的分区键中有不被包含在第一数据表的分区键中的,则说明该关联表不是第一数据表的CP表,转到步骤1022,继续选择看是否有其他未判断CP关系的关联表,进行后续步骤。否则,执行步骤1025。
步骤1025,确定该关联表是第一数据表的CP表,将该关联表加入第一 数据表的CP表集合中。该关联表的分区键中均被包含在第一数据表的分区键中的,则确定该关联表是第一数据表的CP表,可以与第一数据表进行CP,将其加入第一数据表的CP表集合中。
步骤1026,查找该确定的CP表的关联表中的第一数据表的CP表,将找到的CP表加入到第一数据表的CP表集合中。此处,查找该确定的CP表的CP表的方法与查找第一数据表的关联表中的CP表的方法一样,即执行步骤1021——1026中的方法,在具体实现上是采用迭代的方法来查找到所有适合加入第一数据表的CP表集合中的数据表。迭代的方法就是重复调用步骤1021——1026的流程递归处理,直到递归结束,只是初始的是时候是查找为第一数据表的关联表中的第一数据表的CP表,即步骤1021中的输入是第一数据表,后续迭代的过程中步骤1021中的输入表为新加入的第一数据表的CP表,直到递归结束。对于每一层迭代来说,假设该层迭代查找的是第一CP表的关联表中的第一数据数据表的CP表,这里的第一CP表可以是第一数据表的CP表中的任一个,则先获取该第一CP表的分区键关联的关联表,然后逐个判断这些关联表是否适合与第一数据表进行共址分区CP,即判断关联表的所有分区键是否均包含在第一数据表的分区键中,如果全部包含,则确定该关联表是第一数据表的CP表,可以与第一数据表进行CP,将其加入第一数据表的CP表集合中。并针对找到的第一CP表的关联表中的CP表,例如第二CP表,进入下一层迭代,查找第二CP表的关联表中的第一数据表的CP表,直到在某层迭代中再找不到适合与第一数据表进行CP的数据表为止,即递归结束。
步骤1027,已获得第一数据表的所有CP表,至此,第一数据表的CP集合建立完成。
经过上面的流程,获得了第一数据表的CP表集合,CP表群中的表包括第一数据表的CP表。
步骤103,将第一数据表按照其分区键进行Partition。根据第一数据表的分区键的情况,进行多维/一维的Partition,具体进行多维或一维的Partition是根据第一数据表的分区键的数量来决定的。如果仅有1个分区键,则进行一维Partition,如果有N个分区键,则进行N维Partition。
步骤104,将CP表集合中第一数据表的每个CP表进行分区,其中,每个CP表与第一数据表相关联的分区键的分区范围与第一数据表中相对应的分区键的分区范围相同。将第一数据表的CP表集合中除第一数据表之外的其他数据表根据与第一数据表关联的分区键,进行与第一数据表相同的Partition。依次对第一数据表的CP表集合中各CP表进行Partition,根据该CP表与第一数据表相同的分区键进行与第一数据表相同的Partition,即CP表和其对应的第一数据表在相同的分区键进行相同的Partition(Range区间相同),直到CP集合中所有数据表都被Partition,完成第一数据表的关联表分区。
具体的,步骤104中对于第一数据表的CP表集合中各CP表按照与第一数据表关联的分区键进行与第一数据表相同的Partition的方法,包括:首先,判断第一数据表的CP表集合中是否有未Partition的CP表,如果有,则在第一数据表的CP表集合中选择一个未Partition的CP表,进行与第一数据表相同的Partition,即CP表按照和其与第一数据表对应的分区键,进行相同的Partition(Range区间相同),直到第一数据表的CP表集合中再没有未Partition的CP表。
进一步的,可以针对分布式数据库中尚未进行Partition的各数据表重复执行上面的步骤101-104,直到完成整个数据库中所有表的Partition。即针对第一数据表的CP表集合之外的其他需要进行Partition的各数据表,按照前面图4中步骤101-104所述的方法进行共址分区Colocation Partition,直到整个分布式数据库的Partition完成。这样,各个数据表均按照同一个CP表集合中的相关联的分区键,进行相同方式的Partition,各个关联表中同一分区键进行Partition后的Range相同。
通过本发明实施例中的分布式数据库中关联表的Partition方法,首先在需要Partition的各数据表找到相关联的CP表,将CP表按照相关联的分区键采用相同的Partition的Range,使得进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,从而提升了整个系统的性能。
在经过了如前面实施例2中的分布式数据库中关联表分区方法进行Partition后,数据库系统在节点的加入、退出、系统运行过程中,往往会存在节点的负载调整或者负载不均衡的问题,为了保证系统能够平稳的运行,需要保证各个节点的运行负载差异不能太大,这时还需对已经完成Partition的数据表进行再次Range切分。本发明的实施例3中还提供了如图7所示的一种负载均衡的方法。
步骤201,接收负载均衡请求。管理器接收对数据库中某个节点进行负载均衡的请求,负载均衡请求会指明需对第一节点进行负载均衡,例如第一节点。这里的负载均衡请求可以是系统管理员通过后台系统输入的,也可以是数据库系统的后台系统、网管系统或管理器的监控组件等根据各节点的负载监控情况产生的。往往是在节点的添加、删除等,或某个节点负载过重时的负载不均衡等场景需要进行负载均衡。
步骤202,确定进行负载均衡的CP表集合。在计划进行负载均衡的第一节点上,通过记录数或者占用空间等维度进行判断,确定该节点上需要进行负载均衡的CP表集合,例如,将负载较大的CP表集合作为需要进行负载调整的CP表集合。
步骤203,在进行负载均衡的CP表集合中确定进行负载调整的数据表。在步骤202中确定的进行负载均衡的CP表集合中通过记录数或者占用空间等维度进一步选择需要进行负载调整的数据表,例如,负载较大的数据表A。
步骤204,从确定的数据表A中选择一个分区键进行Range调整。在步骤203中确定的进行负载调整的数据表中随机选择或选择负载相对较大/较小的分区键进行Range调整。
步骤205,对确定的CP表集合中所有包含选择的分区键的数据表,按照步骤204中对数据表A的Partition处理进行相同的Range调整。
进一步的,需要根据第一节点的负载均衡调整,对相关的节点进行负载均衡调整。例如,如果需求是第一节点负载过重,需要将部分数据调整到其他节点,则将从第一节点中进行Range调整后分出来的数据调整到其他节点,可以将调整出来的数据调整到负载较轻的节点或者新加入的节点上,从而实现均衡负载。
为更清楚的了解本发明实施例的负载均衡方法,图8中示出了一个具体的节点负载均衡的实例。在图8左边的表格中示出的是按照如前面实施例2中关联表分区方法进行Partition后的结果,共列出了3个数据表,其中,表1为事实表,表2、表3为维度表。表1为三维Partition,表2、表3为一维Partition。现在,假设节点2过载,管理器收到了需要对节点2进行负载均衡的请求。管理器确定节点2上的表1、表2、表3为一个CP表集合,且负载较大。并进一步的发现,第一种情况:表3负载较大,则选择表3的分区键进行Range拆分,现有的表3的分区键在节点2的Range为(100,200],将其拆分为(100,150]、(150,200],同样的,将CP表集合中包含表3的分区键的数据表(表1)按照与表3同样的Range进行切分。则节点2上仅包含表1中Range(0,10](0,10](100,150]的数据,表2中Range(0,10]的数据,表3中Range(100,150]的数据,负载大大降低。将节点2中拆分出来的数据分配到节点9上,节点9上包含表1中Range(0,10](0,10](150,200]的数据,表2中Range(0,10]的数据,表3中Range(150,200]的数据。当然这里的节点9可以是一个新增的节点,也可以是已有的其他节点。第二种情况:节点2中表1负载较大,且表2中数据较多,则考虑选择表1、表2共有的分区键进行Range拆分,节点2中现有的表2的分区键的Range为(0,10],将其拆分为(0,5]、(5,10],同样的,将CP表集合中包含表2的分区键的数据表(表1)按照与表2同样的Range进行切分。则节点2上仅包含表1中Range(0,10](0,5](100,200]的数据,表2中Range(0,5]的数据,表3中Range(100,200]的数据,负载大大降低。将节点2中拆分出来的数据分配到节点9上,节点9上包含表1中Range(0,10](5,10](100,200]的数据,表2中Range(5,10]的数据,表3中Range(100,200]的数据。
通过以上负载调整的方法,确定需要进行负载均衡调整的数据表以及相关的分区键,并将各关联的CP表仍然按照关联的分区键进行Range相同的Partition,可以保证各个节点各个节点的负载均衡,系统能够平稳的运行,且经过负载均衡后的各节点中各关联的CP表仍然按照关联的分区键进行Partition的Range相同,保障了进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,提升整个系 统的性能。
图9为本发明实施例4提供的一种数据库管理器的结构图。如图9所示:数据库管理器30,包括确定单元31和共址分区表集合生成单元32,分区单元33。其中,
确定单元31,用于在需要分区的各数据表中确定第一数据表,第一数据表的分区键不是任一个第二数据表的外键,这里的第二数据表是指为需要分区的各数据表中除第一数据表之外的其他数据表。。确定单元31确定第一数据表的具体方法为从分布式数据库的所有数据表中任意选择一个还未进行Partition的数据表,然后判断该表的分区键中是否存在分区键为其他表的外键,如果没有,则就将这个表作为第一数据表,否则再针对这个找到的其他表,判断这个其他表是否可以作为第一数据表,如此反复处理,直到找到第一数据表。可以认为,第一数据表是需要Partition的数据表中最上层的关联表,这个第一数据表中的分区键均不是其他表格的外键。当然,如果这个尚未进行Partition的数据表中存在分区键为其他表的外键,但这个其他表已进行了Partition,同样可以将这个尚未进行Partition的数据表作为第一数据表。确定单元31确定第一数据表的具体方法可以参看前面实施例1、2中的描述,此处不再赘述。
共址分区CP表集合生成单元32,用于生成所述第一数据表的CP表集合,第一数据表的CP表集合包括第一数据表和第一数据表的至少一个CP表,其中,第一数据表的CP表为分区键是第一数据表的分区键的子集的数据表。可以理解,第一数据表的CP表集合中每个数据表的分区键均包含在第一数据表的分区键中,集合中包括第一数据表以及可以和第一数据表进行CP的数据表。
可选的,共址分区CP表集合生成单元32,具体用于:获取第一数据表的维度表,判断第一数据表的维度表是否为第一数据表的CP表,若是,则将第一数据表的维度表加入第一数据表的CP表集合;迭代判断已加入第一数据表的CP表集合中的第一数据表的CP表的维度表是否为第一数据表的CP表,将确认为是第一数据表的CP表的数据表加入第一数据表的CP表集合,直到查找到全部的第一数据表的CP表为止。共址分区CP表集合生成单元32生成第 一数据表的CP表集合的具体方法可以参看前面实施例1、2中的描述,此处不再赘述。
分区单元33,用于将第一数据表按照第一数据表的每个分区键分别进行分区,并将CP表集合中第一数据表的每个CP表进行分区,其中,CP表集合中第一数据表的每个CP表与第一数据表相关联的分区键的分区范围与第一数据表中相对应的分区键的分区范围相同。按照分区键进行Partition,需要根据分区键的数量进行多维或一维的Partition。如果仅有1个分区键,则进行一维Partition,如果有N个分区键,则进行N维Partition。
优选的,分区单元33将其他数据表根据与第一数据表相同的分区键进行与第一数据表相同的Partition,即其对应的第一数据表的相同的分区键进行相同的Partition(Range区间相同)。在第一数据表的CP表集合中选择一个第一数据表之外的未Partition的其他数据表,根据选择的数据表与第一数据表相同的分区键进行与第一数据表相同的Partition,这里所说的相同的Partition是指相同的分区键进行Partition的区间Range相同,直到第一数据表的CP表集合中所有数据表均已完成Partition。分区单元33进行分区的具体方法实施细节可以参看前面实施例1、2中的描述,此处不再赘述。
进一步的,图10为本发明实施例4的数据库管理器的另一结构图。在图9所示的实施例4的数据库管理器的基础上,本发明实施例所示的数据库管理器30中还包括负载均衡单元34。负载均衡单元34用于:
接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;在所述第一节点中确定进行负载均衡的CP表集合;
在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;
从所述确定进行负载均衡的数据表中选择一个分区键进行Range调整;
对所述确定进行负载均衡的CP表集合中所有包含所述选择进行Range调整的分区键的数据表,根据所述选择进行Range调整的分区键进行与所述选择进行Range调整的分区键的数据表相同的Range调整。
负载均衡单元34进行负载均衡调整的具体方法实施细节可以参看前面实施例3中的描述,此处不再赘述。
可选的,图11中示出了本发明实施例4的数据库管理器的另一结构图。在图10所示的数据库管理器的基础上,本发明实施例的确定单元31中包括:
备选表选择模块311:用于在需要分区的各数据表中选择一个尚未进行Partition的数据表作为第一数据表的备选表;
判断模块312:用于判断选择的备选表的分区键中是否存在分区键为其他数据表的外键,如果不存在,则将所述备选表作为所述第一数据表,否则将所述其他数据表作为所述第一数据表的备选表继续判断其是否可以作为所述第一数据表,直到确定所述第一数据表。当然,如果这个备选表中存在分区键为其他表的外键,但这个其他表已进行了Partition,同样可以将这个备选表作为第一数据表。
可选的,数据库管理器30中的共址分区CP表集合生成单元32,包括:获取模块321:用于获取一个数据表的分区键以及与所述一个数据表的分区键关联的维度表;
CP表判断模块322:用于判断每个所述关联的维度表是否为所述第一数据表的CP表,若一个所述关联的维度表的所有分区键均包含在所述第一数据表的分区键中,则确认所述关联的维度表是第一数据表的CP表,否则,则确认所述关联的维度表不是所述一个数据表的CP表。
可选的,数据库管理器30还包括负载均衡单元34,负载均衡单元34包括:
接收模块341:用于接收进行负载均衡请求,负载均衡请求指明需对第一节点进行负载均衡,并在第一节点中确定进行负载均衡的CP表集合;
确定模块342:用于在确定进行负载均衡的CP表集合中确定进行负载调整的数据表,并从确定进行负载均衡的数据表中选择一个分区键进行Range调整;
调整模块343:用于对确定进行负载均衡的CP表集合中所有包含选择进行Range调整的分区键的数据表,根据选择进行Range调整的分区键进行与选择进行Range调整的分区键的数据表相同的Range调整。
进一步的,本实施例中确定模块342确定进行负载均衡的CP表集合的方法为:根据数据表记录数或者占用空间维度,确定第一节点上需要进行负载均衡的CP表集合。
本发明实施例中数据库管理器的各部分的功能的实现细节与前面实施例1、2相同的部分,可以参看前面实施例的描述,本处也不再赘述。
通过本发明实施例中的数据库管理器,在对分布式数据库中关联表进行Partition时,首先在需要Partition的各数据表找到相关联的CP表,将CP表按照相关联的分区键采用相同的Partition的Range,使得进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,从而提升了整个系统的性能。并且,在需要进行负载调整时,数据库管理器确定需要进行负载均衡调整的数据表以及相关的分区键,将各关联的CP表仍然按照关联的分区键进行Range相同的Partition,可以保证各个节点各个节点的负载均衡,系统平稳运行,且经过负载均衡后的各节点中各关联的CP表仍然按照关联的分区键进行Partition的Range相同,保障了进行OLAP操作时,只需要本节点或者从其他节点获得关联的相同Range的数据,而不需要获得整张的事实表,提升整个系统的性能。
图12示意性地示出了本发明实施例5的数据库管理设备40。如图12所示,该数据库管理设备40包括:处理器41、存储器42和总线43。其中,处理器41和存储器42通过总线43实现彼此之间的通信连接。
处理器41可以采用通用的中央处理器(Central Processing Unit,CPU),微处理器,应用专用集成电路(Application Specific Integrated Circuit,ASIC),或者一个或多个集成电路,用于执行相关程序,以实现本发明实施例所提供的技术方案。
存储器42可以是只读存储器(Read Only Memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(Random Access Memory,RAM)。存储器42可以存储操作系统和其他应用程序。在通过软件或者固件来实现本发明实施例提供的技术方案时,用于实现本发明实施例提供的技术方案的程序代码保存在存储器42中,并由处理器41来执行。
具体地,存储器42可以用于存储计算机执行指令,也可以用于存储各种信息,例如,查询结果。处理器41可以通过总线系统43读取该存储器42存储的信息,或者将查询结果存储至存储器42。此外,当该数据库管理设备40运行时,处理器41可以读取存储器42存储的计算机执行指令,以执行前面 实施例1、2、3中所描述的分布式数据库的关联表Partition方法。
可选地,如图12所示,该数据库管理设备还可以包括输入/输出接口44和通信接口45。输入/输出接口44用于接收输入的数据和信息,输出操作结果等数据。
通信接口45使用例如但不限于收发器一类的收发装置,来实现数据库管理设备40与其他设备或通信网络之间的通信。
总线43可包括一个通路,在数据库管理设备40各个部件(例如处理器41、存储器42、输入/输出接口43和通信接口44)之间传送信息。
图12所示的硬件结构以及上述描述适用于执行本发明实施例所提供的各种分布式数据库的关联表Partition方法,例如,实施例1、2,以及图4至图7所示的各种方法流程,为了简洁,这里不再赘述。
应注意,尽管图12所示的数据库管理设备40仅仅示出了处理器41、存储器42、输入/输出接口44、通信接口45以及总线43,但是在具体实现过程中,本领域的技术人员应当明白,数据,数据库管理设备40还包含实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当明白,数据库管理设备40还可包含实现其他附加功能的硬件器件。此外,本领域的技术人员应当明白,数据库管理设备40也可仅仅包含实现本发明实施例所必须的器件,而不必包含图12中所示的全部器件。
本领域普通技术人员可以意识到,结合本文中所公开的实施例中描述的各方法步骤和单元,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各实施例的步骤及组成。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。本领域普通技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和 方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元/模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口、装置或单元的间接耦合或通信连接,也可以是电的,机械的或其它的形式连接。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本发明实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以是两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中或作为计算机可读介质上的一个或多个指令或代码进行传输。计算机可读介质包括计算机存储介质和通信介质,其中通信介质包括便于从一个地方向另一个地方传送计算机程序的任何介质。存储介质可以是计算机能够存取的任何可用介质。以此为例但不限于:计算机可读介质可以包括RAM、ROM、EEPROM、CD-ROM或其他光盘存储、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质。此外。任何连接可以适当的成为计算机可读介质。例如,如果软件是使用同轴电缆、光纤光缆、双绞线、数字用户线(DSL)或者诸如红外线、无线电和微波之类的无线技术从网站、服务器或者其他远程源传输的,那么同轴电缆、光纤光缆、双绞线、DSL或者诸如红外线、无线和微波之类的无线技术包括在所属介质的定义中。如本发明所使用的,盘(Disk)和碟(disc)包括压缩光碟(CD)、激光碟、光碟、数字通用光碟(DVD)、软盘和蓝光光碟,其中盘通常磁性的复制数据,而碟则用激光来光学的复制数据。上面的组合也 应当包括在计算机可读介质的保护范围之内。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分,或者该技术方案的全部或部分可以存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。
总之,以上所述仅为本发明技术方案的较佳实施例而已,并非用于限定本发明的保护范围。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (10)

  1. 一种分布式数据库中关联表分区的方法,包括:
    在需要分区的各数据表中确定第一数据表,所述第一数据表的分区键不是任一个第二数据表的外键,所述第二数据表为所述需要分区的各数据表中除所述第一数据表之外的其他数据表;
    生成所述第一数据表的共址分区CP表集合,所述第一数据表的CP表集合包括第一数据表和所述第一数据表的至少一个CP表,其中,所述第一数据表的CP表为分区键是所述第一数据表的分区键的子集的数据表;
    将所述第一数据表按照所述第一数据表的每个分区键分别进行分区;
    将所述CP表集合中所述第一数据表的每个CP表进行分区,其中,所述CP表集合中所述第一数据表的每个CP表与所述第一数据表相关联的分区键的分区范围与所述第一数据表中相对应的分区键的分区范围相同。
  2. 根据权利要求1所述的方法,其特征在于,所述生成所述第一数据表的CP表集合,具体为:
    获取所述第一数据表的维度表,判断所述第一数据表的维度表是否为所述第一数据表的CP表,若是,则将所述第一数据表的维度表加入所述第一数据表的CP表集合;
    迭代判断已加入所述第一数据表的CP表集合中的所述第一数据表的CP表的维度表是否为所述第一数据表的CP表,将确认为是所述第一数据表的CP表的数据表加入所述第一数据表的CP表集合,直到查找到全部的所述第一数据表的CP表为止。
  3. 根据权利要求1或2所述的方法,其特征在于,所述方法还包括:
    接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;
    在所述第一节点中确定进行负载均衡的CP表集合;
    在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;
    从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;
    对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所 述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
  4. 根据权利要求3所述的方法,其特征在于,所述在所述第一节点中确定进行负载均衡的CP表集合,具体为:
    根据数据表记录数或者占用空间维度,确定所述第一节点上需要进行负载均衡的CP表集合。
  5. 一种数据库管理器,其特征在于,包括:
    确定单元,用于在需要分区的各数据表中确定第一数据表,所述第一数据表的分区键不是任一个第二数据表的外键,所述第二数据表为所述需要分区的各数据表中除所述第一数据表之外的其他数据表;
    共址分区CP表集合生成单元,用于生成所述第一数据表的共址分区CP表集合,所述第一数据表的CP表集合包括第一数据表和所述第一数据表的至少一个CP表,其中,所述第一数据表的CP表为分区键是所述第一数据表的分区键的子集的数据表;
    分区单元,用于将所述第一数据表按照所述第一数据表的每个分区键分别进行分区,并将所述CP表集合中所述第一数据表的每个CP表进行分区,其中,所述CP表集合中所述第一数据表的每个CP表与所述第一数据表相关联的分区键的分区范围与所述第一数据表中相对应的分区键的分区范围相同。
  6. 根据权利要求5所述的数据库管理器,其特征在于,所述共址分区CP表集合生成单元,具体用于:
    获取所述第一数据表的维度表,判断所述第一数据表的维度表是否为所述第一数据表的CP表,若是,则将所述第一数据表的维度表加入所述第一数据表的CP表集合;
    迭代判断已加入所述第一数据表的CP表集合中的所述第一数据表的CP表的维度表是否为所述第一数据表的CP表,将确认为是所述第一数据表的CP表的数据表加入所述第一数据表的CP表集合,直到查找到全部的所述第一数据表的CP表为止。
  7. 根据权利要求1或2所述的数据库管理器,其特征在于,还包括负载均衡单元,用于:
    接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均 衡;
    在所述第一节点中确定进行负载均衡的CP表集合;
    在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;
    从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;
    对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
  8. 根据权利要求1或2所述的数据库管理器,其特征在于,所述数据库管理器还包括负载均衡单元,所述负载均衡单元包括:
    接收模块:用于接收进行负载均衡请求,所述负载均衡请求指明需对第一节点进行负载均衡;在所述第一节点中确定进行负载均衡的CP表集合;
    确定模块:用于在所述确定进行负载均衡的CP表集合中确定进行负载调整的数据表;从所述确定进行负载均衡的数据表中选择一个分区键进行分区范围调整;
    调整模块:用于对所述确定进行负载均衡的CP表集合中所有包含所述选择进行分区范围调整的分区键的数据表,根据所述选择进行分区范围调整的分区键进行与所述选择进行分区范围调整的分区键的数据表相同的分区范围调整。
  9. 根据权利要求7或8所述的数据库管理器,其特征在于,所述确定进行负载均衡的CP表集合的方法为:根据数据表记录数或者占用空间维度,确定所述第一节点上需要进行负载均衡的CP表集合。
  10. 一种数据库管理设备,其特征在于,包括:处理器、存储器和总线;所述存储器用于存储计算机执行指令,所述处理器与所述存储器通过所述总线连接,当所述数据库管理设备运行时,所述处理器读取所述存储器存储的所述计算机执行指令,以执行权利要求1至4中任一项所述的方法。
PCT/CN2015/080444 2015-05-31 2015-05-31 一种分布式数据库中关联表分区的方法和设备 WO2016191995A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2015/080444 WO2016191995A1 (zh) 2015-05-31 2015-05-31 一种分布式数据库中关联表分区的方法和设备
CN201580001222.8A CN106415534B (zh) 2015-05-31 2015-05-31 一种分布式数据库中关联表分区的方法和设备
US15/814,141 US10831737B2 (en) 2015-05-31 2017-11-15 Method and device for partitioning association table in distributed database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/080444 WO2016191995A1 (zh) 2015-05-31 2015-05-31 一种分布式数据库中关联表分区的方法和设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/814,141 Continuation US10831737B2 (en) 2015-05-31 2017-11-15 Method and device for partitioning association table in distributed database

Publications (1)

Publication Number Publication Date
WO2016191995A1 true WO2016191995A1 (zh) 2016-12-08

Family

ID=57439756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/080444 WO2016191995A1 (zh) 2015-05-31 2015-05-31 一种分布式数据库中关联表分区的方法和设备

Country Status (3)

Country Link
US (1) US10831737B2 (zh)
CN (1) CN106415534B (zh)
WO (1) WO2016191995A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10482076B2 (en) 2015-08-14 2019-11-19 Sap Se Single level, multi-dimension, hash-based table partitioning
CN108874950B (zh) * 2018-06-05 2022-04-12 亚信科技(中国)有限公司 一种基于er关系的数据分布存储方法及装置
CN115129782A (zh) * 2018-06-27 2022-09-30 北京奥星贝斯科技有限公司 一种分布式数据库的分区级连接方法和装置
CN115827627A (zh) * 2018-12-24 2023-03-21 深圳市奋源科技有限公司 一种基于自定义数据表的数据资料管理方法及装置
CN112015738A (zh) * 2020-08-28 2020-12-01 支付宝(杭州)信息技术有限公司 用于实现多个数据明细表的联表处理的方法及装置
CN112233727B (zh) * 2020-10-29 2024-01-26 北京诺禾致源科技股份有限公司 数据分区存储方法及装置
CN113641686B (zh) * 2021-10-19 2022-02-15 腾讯科技(深圳)有限公司 数据处理方法、装置、电子设备、存储介质及程序产品

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090171885A1 (en) * 2007-12-27 2009-07-02 Adam Silberstein Efficient bulk load
CN101916261A (zh) * 2010-07-28 2010-12-15 北京播思软件技术有限公司 一种分布式并行数据库系统的数据分区方法
WO2014067449A1 (en) * 2012-10-29 2014-05-08 Huawei Technologies Co., Ltd. System and method for flexible distributed massively parallel processing (mpp) database

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MXPA06009355A (es) * 2004-02-21 2007-03-01 Datallegro Inc Base de datos paralela ultra - nada compartida.
US9594580B2 (en) * 2014-04-09 2017-03-14 Bitspray Corporation Secure storage and accelerated transmission of information over communication networks
CN104462430B (zh) * 2014-12-12 2017-12-22 北京国双科技有限公司 关系型数据库的数据处理方法及装置

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090171885A1 (en) * 2007-12-27 2009-07-02 Adam Silberstein Efficient bulk load
CN101916261A (zh) * 2010-07-28 2010-12-15 北京播思软件技术有限公司 一种分布式并行数据库系统的数据分区方法
WO2014067449A1 (en) * 2012-10-29 2014-05-08 Huawei Technologies Co., Ltd. System and method for flexible distributed massively parallel processing (mpp) database

Also Published As

Publication number Publication date
CN106415534A (zh) 2017-02-15
US20180075077A1 (en) 2018-03-15
CN106415534B (zh) 2019-09-20
US10831737B2 (en) 2020-11-10

Similar Documents

Publication Publication Date Title
WO2016191995A1 (zh) 一种分布式数据库中关联表分区的方法和设备
US11580070B2 (en) Utilizing metadata to prune a data set
US9507875B2 (en) Symbolic hyper-graph database
US10585867B2 (en) Systems and methods for generating partial indexes in distributed databases
WO2018177060A1 (zh) 查询优化方法及相关装置
US20130110873A1 (en) Method and system for data storage and management
KR20190099087A (ko) 서비스 데이터를 블록체인에 기입하기 위한 방법 및 장치, 그리고 서비스 하위세트를 결정하기 위한 방법
US10866970B1 (en) Range query capacity allocation
US20130031229A1 (en) Traffic reduction method for distributed key-value store
US20110246550A1 (en) System and method for aggregation of data from a plurality of data sources
US8943057B2 (en) Method and system for distributed bulk matching and loading
US20180293257A1 (en) Method for accessing distributed database and distributed data service apparatus
US9330158B1 (en) Range query capacity allocation
CN110399368B (zh) 一种定制数据表的方法、数据操作方法及装置
CN106569896B (zh) 一种数据分发及并行处理方法和系统
JP2019504390A (ja) データ照会方法および装置ならびにデータベースシステム
CN104731969A (zh) 分布式环境下海量数据连接聚集查询方法、装置和系统
US20190354516A1 (en) Single-level, multi-dimension, hash-based table partitioning
US7984072B2 (en) Three-dimensional data structure for storing data of multiple domains and the management thereof
US11567969B2 (en) Unbalanced partitioning of database for application data
CN112699134A (zh) 基于图剖分的分布式图数据库的存储与查询方法
CN114969110B (zh) 查询方法和装置
US11940972B2 (en) Execution of operations on partitioned tables
CN117555906B (zh) 数据处理方法、装置、电子设备及存储介质
US20240078252A1 (en) Method and system for efficient data management in distributed database system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15893658

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15893658

Country of ref document: EP

Kind code of ref document: A1