CN108959510A - Partition-wise join method and device for a distributed database - Google Patents


Info

Publication number
CN108959510A
Authority
CN
China
Prior art keywords
data
physical machine
logical partition
cost
logical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810682121.7A
Other languages
Chinese (zh)
Other versions
CN108959510B (en)
Inventor
陈萌萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oceanbase Technology Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202210635770.8A (patent CN115129782A)
Priority to CN201810682121.7A (patent CN108959510B)
Publication of CN108959510A
Application granted
Publication of CN108959510B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24549 Run-time optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations
    • G06F16/2456 Join operations

Abstract

One or more embodiments of this specification provide a partition-wise join method and device for a distributed database. The distributed database includes multiple data tables partitioned on the same partition key; each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are joined on that shared partition key. The method includes: receiving a join rule planned for M logical partitions located on a first physical machine, where the M logical partitions belong to M data tables respectively; checking whether the physical machine hosting any of the M logical partitions has changed; if so, identifying the second physical machine that now hosts the relocated logical partition and performing a data migration cost evaluation, which calculates the cost value of migrating that logical partition from the second physical machine to the first physical machine; and determining, according to the result of the data migration cost evaluation, whether to execute the join rule.

Description

Partition-wise join method and device for a distributed database
Technical field
This specification relates to the field of computer technology, and in particular to a partition-wise join method and device for a distributed database.
Background
In database systems, data partitioning is an important means of improving operational efficiency and system performance. For queries that touch multiple partitions, a database often uses concurrent multi-thread or multi-process execution to improve efficiency, but parallel execution also brings the cost of data exchange. For this reason, the optimizer tries to consolidate operations that can be merged so that they complete within the same thread. One such consolidation is the partition-wise join: when the partition key of the two or more tables to be joined is the same as the join key, the join can be pushed down into the corresponding partitions, avoiding data exchange across threads or processes. In a distributed database, however, the applicability of this optimization is limited by the physical distribution of the partitions. Distributed database systems frequently migrate data, so data partitions that were on the same physical machine during the optimization phase may have been migrated to other machines by the execution phase, and a statically generated parallel partition-wise join plan cannot handle this situation automatically.
Summary of the invention
In view of the above problems, this specification provides a partition-wise join method for a distributed database.
The distributed database includes multiple data tables partitioned on the same partition key, where each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are joined on that same partition key. The join method includes:
receiving a join rule planned for M logical partitions located on a first physical machine, where the M logical partitions belong to M data tables respectively;
checking whether the physical machine hosting any of the M logical partitions has changed;
if so, identifying the second physical machine that now hosts the relocated logical partition and performing a data migration cost evaluation, where the data migration cost evaluation calculates the cost value of migrating the logical partition from the second physical machine to the first physical machine;
determining, according to the result of the data migration cost evaluation, whether to execute the join rule.
Preferably, the join rule includes obtaining, from the M logical partitions located on the first physical machine, the logical partitions that have the same partition key values, and performing an equi-join on those logical partitions.
Preferably, determining whether to execute the join rule according to the result of the data migration cost evaluation includes: comparing the cost value with a preset cost threshold;
if the cost value is less than the preset cost threshold, migrating the logical partition from the second physical machine back to the first physical machine, and executing the join rule on the logical partitions of the multiple data tables;
if the cost value is greater than the preset cost threshold, not executing the join rule.
Preferably, the data migration cost includes the time required to migrate the logical partition from the second physical machine back to the first physical machine.
Preferably, the data migration cost includes the number of data exchanges required to migrate the logical partition from the second physical machine back to the first physical machine.
This specification also provides a partition-wise join device for a distributed database. The distributed database includes multiple data tables partitioned on the same partition key, where each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are joined on that same partition key. The join device includes:
a receiving module, which receives a join rule planned for M logical partitions located on a first physical machine, where the M logical partitions belong to M data tables respectively;
a checking module, which checks whether the physical machine hosting any of the M logical partitions has changed;
a cost evaluation module, which identifies the second physical machine that now hosts the relocated logical partition and performs a data migration cost evaluation, where the data migration cost evaluation calculates the cost value of migrating the logical partition from the second physical machine to the first physical machine;
a judgment module, which determines, according to the result of the data migration cost evaluation, whether to execute the join rule.
Preferably, the join rule includes obtaining, from the M logical partitions located on the first physical machine, the logical partitions that have the same partition key values, and performing an equi-join on those logical partitions.
Preferably, the judgment module further compares the cost value with a preset cost threshold;
if the cost value is less than the preset cost threshold, the logical partition is migrated from the second physical machine back to the first physical machine, and the join rule is executed on the logical partitions of the multiple data tables;
if the cost value is greater than the preset cost threshold, the join rule is not executed.
Preferably, the data migration cost includes the time required to migrate the logical partition from the second physical machine back to the first physical machine.
Preferably, the data migration cost includes the number of data exchanges required to migrate the logical partition from the second physical machine back to the first physical machine.
Correspondingly, this specification also provides a computer device comprising a memory and a processor. The memory stores a computer program executable by the processor; when the processor runs the computer program, it performs the steps of the partition-wise join method for a distributed database described above.
Correspondingly, this specification also provides a computer-readable storage medium storing a computer program; when the computer program is run by a processor, it performs the steps of the partition-wise join method for a distributed database described above.
With the database partition-wise join method and device provided in this specification, a statically generated parallel partition-wise join plan for a distributed database gains the ability to handle data migration automatically, or to handle a small number of partitions that no longer satisfy the conditions of the partition-wise join plan. Based on a built-in cost model, changes in the physical distribution of partitions in a distributed system can be handled adaptively, which improves the execution efficiency of database operations such as user queries.
Brief description of the drawings
Fig. 1 is a flowchart of a database partition-wise join method provided by an exemplary embodiment of this specification;
Fig. 2 is a logical architecture diagram, provided by an exemplary embodiment of this specification, of multiple physical machines in a distributed database executing partition-wise join operations in parallel;
Fig. 3 is a schematic diagram of a database partition-wise join device provided by an exemplary embodiment of this specification;
Fig. 4 is a hardware structure diagram for running an embodiment of the database partition-wise join method or device provided by this specification.
Detailed description
Database partitioning is a common physical database design technique for reducing the total amount of data read and written in specific database operations and thereby shortening response time. Partitioning a database (or data table) means distributing the data of a large table across different system partitions, disks, or server devices according to a partitioning strategy, so that the data is spread evenly over different storage media and each partition carries an equal share. An operation can then be directed to a specific partition of the table. Partitioning also eases table management: for example, to delete the data of a certain period, the table can be partitioned by date and the relevant date partition deleted directly. Data partitioning is therefore an important means of improving operational efficiency and system performance.
For data operations that involve multiple partitions, a database often uses concurrent multi-thread or multi-process execution to improve efficiency. To reduce the data exchange cost that parallel execution brings, the database optimizer places operations that can be merged into the same thread or process wherever possible. One such merging technique is the "partition-wise join": when two or more tables are joined, if the join key of those tables is the same as their partition key (that is, the column referenced by the join is also the column referenced when the tables were partitioned), the join can run simultaneously inside multiple partitions, and no data needs to be exchanged between partitions. However, when the database is a distributed database formed by one or more physical machines interconnected over a network, data migration occurs frequently. Data partitions that were on the same physical machine during the optimization phase may have been migrated to other machines by the execution phase, and a statically generated parallel partition-wise join plan cannot handle this situation automatically.
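The partition-wise join described above can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the specification: rows are tuples, the first column is both the partition key and the join key, and a row's partition is `key % num_partitions` (a stand-in for whatever hash function a real system uses). The function names are hypothetical.

```python
from collections import defaultdict


def partition_rows(rows, num_partitions, key_index=0):
    """Assign each row to a partition by key modulo the partition count."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key_index] % num_partitions].append(row)
    return partitions


def partition_wise_join(left, right, key_index=0):
    """Equi-join two co-partitioned tables one partition pair at a time."""
    joined = []
    for pid, left_rows in left.items():
        # Build a lookup for this partition only; no other partition is read.
        lookup = defaultdict(list)
        for r in right.get(pid, []):
            lookup[r[key_index]].append(r)
        for l in left_rows:
            for r in lookup.get(l[key_index], []):
                joined.append(l + r)
    return joined


t1 = partition_rows([(1, "a"), (2, "b"), (5, "c")], 4)
t2 = partition_rows([(1, "x"), (5, "y"), (6, "z")], 4)
result = sorted(partition_wise_join(t1, t2))
```

Because equal keys always land in the same partition number in both tables, each partition pair can be joined in its own thread or process without looking at any other partition.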
To address the above problem, an exemplary embodiment of this specification proposes a partition-wise join method for a distributed database, as shown in Fig. 1. The distributed database includes multiple data tables partitioned on the same partition key, where each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are joined on that same partition key. The join method includes:
Step 102: receive a join rule planned for M logical partitions located on a first physical machine, where the M logical partitions belong to M data tables respectively;
Step 104: check whether the physical machine hosting any of the M logical partitions has changed;
If so,
Step 106: identify the second physical machine that now hosts the relocated logical partition and perform a data migration cost evaluation, where the data migration cost evaluation calculates the cost value of migrating the logical partition from the second physical machine to the first physical machine;
Step 108: determine, according to the result of the data migration cost evaluation, whether to execute the join rule.
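Steps 102 to 108 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `JoinRule`, `decide_join`, the `current_location` mapping, and the `estimate_cost` callback are hypothetical names, and summing the costs of several relocated partitions is a choice made here that the specification does not prescribe.

```python
from dataclasses import dataclass


@dataclass
class JoinRule:
    """Hypothetical join rule: partitions expected on one physical machine."""
    machine: str
    partitions: tuple  # e.g. ("t1.p3", "t2.p3"), one per data table


def decide_join(rule, current_location, estimate_cost, threshold):
    """Return (execute, partitions_to_migrate) for a received join rule (step 102)."""
    # Step 104: find partitions whose hosting machine has changed.
    moved = [p for p in rule.partitions if current_location[p] != rule.machine]
    if not moved:
        return True, []
    # Step 106: cost of migrating each moved partition back to the first machine.
    total = sum(estimate_cost(p, current_location[p], rule.machine) for p in moved)
    # Step 108: decide from the evaluation result (threshold comparison assumed).
    if total < threshold:
        return True, moved
    return False, []
```

For example, with `t2.p3` relocated to another machine and a flat estimated cost of 5.0, a threshold of 10 yields `(True, ["t2.p3"])`, while a threshold of 2 yields `(False, [])`.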
The distributed database described in the embodiments of this specification is a distributed database formed by one or more physical machines interconnected over a network. Each physical machine may hold a complete copy or a partial copy of the data. The physical machines, located at different physical addresses, are interconnected over the network and together form a single global database that is logically centralized and physically distributed. After the data tables in the database are logically partitioned on partition keys with the same attributes, the logical partitions of each data table may reside on different physical machines. When the distributed database formulates a join rule (or join plan) for the partition-wise join described above, it must formulate the plan for the corresponding logical partitions of the multiple data tables that reside on the same physical machine, so that those partitions are joined with each other and the joined data can then be operated on (inserted, deleted, updated, or queried) within the same thread or process of that physical machine.
Fig. 2 illustrates the logical architecture, provided by one embodiment of this specification, in which multiple physical machines of a distributed database execute partition-wise join operations in parallel. For simplicity, each physical machine in Fig. 2 shows only the join between corresponding logical partitions of two data tables t1 and t2 (for example, partition p0 of t1 and partition p0 of t2), both partitioned on partition keys with the same attributes. Those skilled in the art should understand that, in practice, one physical machine may hold several different logical partitions of a data table; this specification does not limit the number of logical partitions a data table has on one physical machine. A partition-wise join rule (the join logical operation shown in Fig. 2), however, must be executed on M logical partitions that belong to M data tables (for example, partition p1 of t1 with partition p1 of t2, or partition p2 of t1 with partition p2 of t2), where M is a natural number. The "corresponding logical partitions" above are logical partitions that can be joined on the same partition key and whose joined data can then be operated on within the same thread or process of the same physical machine. This specification does not limit how the tables are partitioned on the same partition key; hash partitioning, range partitioning, or other partitioning strategies may be used.
For example, in the embodiment shown in Fig. 2, t1 and t2 may be partitioned using hash partitioning. The first column c1 of t1 is its partition key, the first column c1 of t2 is also its partition key, and column c1 of t1 has the same attributes as column c1 of t2. The partitioning statements and the join query may be:
-- both tables are hash-partitioned on c1 into 4 partitions
create table t1 (c1 int, c2 int) partition by hash(c1) partitions 4;
create table t2 (c1 int, c2 int) partition by hash(c1) partitions 4;
-- the join key equals the partition key, so a partition-wise join applies
select * from t1, t2 where t1.c1 = t2.c1;
As a result, tables t1 and t2 are each divided into four logical partitions: p0, p1, p2, and p3.
In general, the specific partition-wise join rule is planned by the optimizer of the distributed database. Based on the physical machine locations of the logical partitions recorded in the current partition table of the database, the optimizer formulates a partition-wise join rule (join 0, join 1, join 2, or join 4 in Fig. 2) for the corresponding logical partitions (for example, partition p3 of t1 and partition p3 of t2) of multiple data tables (t1 and t2 in Fig. 2) on the same physical machine. The join rule may include the corresponding logical partitions to be joined on the same physical machine, the join type, the join procedure, and similar content, and is generally embodied as the execution plan tree produced by the optimizer. The join type includes, but is not limited to, inner join, outer join, and cross join over the logical partitions, and in a partition-wise join the join must always be based on the partition key used when the data tables were partitioned. To further improve data processing efficiency after the partition-wise join of multiple data tables, the join type in the join rule is preferably an equi-join: from the logical partitions on the same physical machine that belong to the multiple data tables, the logical partitions with the same partition key values are obtained and equi-joined.
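One way to picture the planning step is the sketch below. It is an assumption-laden illustration: the partition table is modelled as a plain dictionary from `(table, partition_id)` to a machine name, the rule is a dictionary rather than an execution plan tree, and `plan_partition_wise_joins` is a hypothetical name.

```python
from collections import defaultdict


def plan_partition_wise_joins(partition_table, tables):
    """Emit one join rule per partition id whose partitions share a machine.

    partition_table maps (table, partition_id) -> machine name.
    """
    by_pid = defaultdict(dict)
    for (table, pid), machine in partition_table.items():
        by_pid[pid][table] = machine
    rules = []
    for pid in sorted(by_pid):
        machines = {by_pid[pid].get(t) for t in tables}
        # A rule is planned only if every table's partition is on one machine.
        if len(machines) == 1 and None not in machines:
            rules.append({
                "machine": machines.pop(),
                "partitions": [(t, pid) for t in tables],
                "join_type": "equi",
            })
    return rules
```

Partition ids whose partitions are spread over several machines produce no rule here; a real optimizer would plan a different kind of join for them.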
To prevent physical migration of logical partition data within the distributed database from affecting accurate execution of the join plan, in one illustrative embodiment of this specification a logical operator RX may be added, at the logical level of the database, for each logical partition. After receiving the join rule (or join plan) sent by the optimizer, the logical operator RX checks whether the physical location of its logical partition has changed from the location that the join rule expects:
If the logical operator RX verifies that the physical machine hosting its logical partition is the same as the one the join rule expects, RX runs in "short-circuit mode": the join rule is usable for that logical partition (as with join 0, join 1, join 2, and join 3 in Fig. 2, each of which is usable for its logical partitions), and RX can return the result of its data scan directly to the database system (or optimizer).
Because a distributed database is prone to data migration triggered manually or by other system instructions, the logical operator RX may instead find that its logical partition is no longer on the physical machine the join rule expects. In Fig. 2, for example, partition p3 of t2 is no longer on physical machine 3, where it was located when the join rule was generated. In that case, RX communicates with the module of the distributed database that maintains data partition information to obtain the current location of the logical partition (physical machine 4), and sends that location to the database's cost evaluation module for data migration so that a data migration cost evaluation can be performed. The evaluation calculates the cost value (cost) to the database system of migrating the logical partition associated with RX from physical machine 4 back to physical machine 3. In Fig. 2, for example, at the request of the RX operator for partition p3 of t2, the database system (usually the optimizer) evaluates the cost of migrating partition p3 of t2 from physical machine 4, where it now resides, back to physical machine 3, which join plan join 3 expects.
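The two branches of the RX check might look like this in outline. The specification treats RX as an abstract logical operator, so everything here is hypothetical: the class shape, the `locate` and `scan` callbacks standing in for the partition-information module and the data scan, and the dictionary used as a return value.

```python
class RX:
    """Hypothetical stand-in for the logical operator RX on one partition."""

    def __init__(self, partition, expected_machine, locate, scan):
        self.partition = partition
        self.expected_machine = expected_machine
        self.locate = locate  # callable: partition -> current machine
        self.scan = scan      # callable: partition -> rows

    def run(self):
        current = self.locate(self.partition)
        if current == self.expected_machine:
            # Short-circuit mode: the join rule is usable, return the scan result.
            return {"status": "ok", "rows": self.scan(self.partition)}
        # Location changed: report the current machine for cost evaluation.
        return {"status": "moved", "current_machine": current}
```

A caller would pass the "moved" result on to whatever component estimates the migration cost.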
The cost evaluation module for data migration is usually a functional module of the distributed database's optimizer. The data migration cost may be expressed as the time the system needs to migrate the logical partition from the second physical machine (physical machine 4 in Fig. 2) back to the first physical machine (physical machine 3 in Fig. 2), that is, the system instruction latency; as the number of data exchanges required for that migration; or as other cost measures common in computer systems. This specification does not limit the mathematical model or algorithm on which the cost evaluation is based. Those skilled in the art may design a data migration cost evaluation model for a specific application scenario and set a scenario-specific data migration cost threshold. The threshold represents the migration cost of logical partition data between physical machines that the database system is willing to accept in order to keep the original partition-wise join plan (rule).
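Since the specification leaves the cost model open, the following is only one hypothetical model covering the two measures it names, migration time and number of data exchanges. The parameters (partition size, bandwidth, batch size, per-exchange latency) are assumptions introduced for illustration.

```python
import math


def migration_cost(partition_bytes, bandwidth_bytes_per_s, batch_bytes,
                   per_exchange_latency_s=0.0):
    """Estimate (duration_seconds, exchange_count) for moving one partition."""
    # Number of batches needed to ship the whole partition.
    exchanges = math.ceil(partition_bytes / batch_bytes)
    # Transfer time plus a fixed latency per exchange.
    duration = (partition_bytes / bandwidth_bytes_per_s
                + exchanges * per_exchange_latency_s)
    return duration, exchanges
```

Either component, or a weighted combination of both, could serve as the cost value compared against the threshold.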
After the migration cost value of the logical partition data is obtained, the database system decides, according to the result of the data migration cost evaluation, whether the join rule can be executed. The decision can be made in several ways. For example, the logic judgment module of the database system may use threshold comparison, comparing the calculated data migration cost value with the migration cost threshold that the system has preset for logical partition data migration:
If the cost value calculated by the cost evaluation model is less than the preset cost threshold, the database system may issue a migration instruction to migrate the logical partition from the second physical machine back to the first physical machine, and decide to execute the join rule on the logical partitions of the multiple data tables. The migration may be implemented by, for example, a remote procedure call (RPC).
If the cost value calculated by the cost evaluation model is greater than the preset cost threshold, the database system does not execute the join rule; for example, join rule join 4 shown in Fig. 2 is no longer executed. In this case, the optimizer may generate new join rules. It formulates a new partition-wise join plan for the logical partitions that currently reside on the same physical machine. For logical partitions that are not on the same physical machine, if the application scenario still requires joining logical partitions on different physical machines, it formulates a new join plan, such as an execution plan that hash-redistributes the left and right tables, or one that broadcasts the left table and sends the right table at random.
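The threshold comparison reduces to a small decision function. Note one gap: the text specifies only the less-than and greater-than cases, so the handling of a cost exactly equal to the threshold below is an assumption of this sketch, as are the function name and the action labels.

```python
def choose_action(cost_value, threshold):
    """Map a migration cost value to one of two outcomes by threshold."""
    if cost_value < threshold:
        return "migrate_back_and_execute"
    # The source only specifies the less-than and greater-than cases;
    # treating equality as "replan" is an assumption of this sketch.
    return "replan"
```

Here "replan" stands for handing the affected partitions back to the optimizer so that it can generate new join rules.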
Of course, the database system may also use a ranking scheme to decide, according to the result of the data migration cost evaluation, whether the join rule can be executed. For example, the logical partition data migration costs required to execute the partition-wise join rules on different physical machines are sorted by cost value. The database system selects the partition-wise join rules whose cost values fall within an acceptable range and, through the corresponding RX operators, migrates the data of the relevant logical partitions back to the physical machines indicated by the original partition-wise join rules so that those rules can be executed.
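A ranking-based selection could be sketched as below. The inclusive upper bound and the idea of keying costs by rule name are assumptions; the specification says only that rules are sorted by cost and those within an acceptable range are chosen.

```python
def select_by_ranking(rule_costs, acceptable_cost):
    """Sort join rules by migration cost and keep those within the acceptable range."""
    ranked = sorted(rule_costs.items(), key=lambda item: item[1])
    # Inclusive bound is an assumption; the source does not define the range.
    return [rule for rule, cost in ranked if cost <= acceptable_cost]
```

The returned list is ordered from cheapest to most expensive, so a system with a limited migration budget could also stop after the first few entries.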
By introducing the logical operator RX, the above embodiments of this specification enable processing of the partition-wise join plan of a distributed database. When the optimizer of the database system statically generates a parallel partition-wise join plan suited to a distributed database, the interaction between RX and the functional modules of the database system adds the ability to handle data migration automatically, or to handle a small number of partitions that no longer satisfy the conditions of the partition-wise join plan. Based on the built-in cost evaluation model, changes in the physical distribution of the logical partitions of the distributed database can be handled adaptively, which improves the execution efficiency of database operations such as user queries. Those skilled in the art should understand that the logical operator RX is only an abstract representation at the logical operation level of the database system; the actual implementation of this logic is not limited to any form of expression in any computer language.
Corresponding to the process above, embodiments of this specification also provide a partition-wise join device for a distributed database. The device may be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the device in the logical sense is formed when the CPU (Central Processing Unit) of the host equipment reads the corresponding computer program instructions into memory and runs them. At the hardware level, in addition to the CPU, memory, and storage shown in Fig. 4, the equipment hosting the data processing device typically also includes other hardware such as a chip for sending and receiving wireless signals and/or a board for network communication.
Fig. 3 shows a partition-level connection device 30 for a distributed database provided by this specification. The distributed database includes multiple data tables that are partitioned on the same partition key, where each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are connected based on the same partition key. The connection device 30 includes:
A receiving module 302, which receives a connection rule planned for M logical partitions located on a first physical machine, wherein the M logical partitions belong to M data tables respectively;
An inspection module 304, which checks whether the physical machine on which any of the M logical partitions is located has changed;
A cost evaluation module 306, which obtains the second physical machine on which a relocated logical partition is now located and performs a data migration cost evaluation, the evaluation calculating the cost value of migrating the logical partition from the second physical machine to the first physical machine;
A judgment module 308, which determines, according to the result of the data migration cost evaluation, whether to execute the connection rule.
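The check performed by inspection module 304, and the hand-off of relocated partitions to cost evaluation module 306, can be sketched as follows. This is only an illustration under stated assumptions: the `PartitionRef` type, the `find_relocated_partitions` function, and the location mapping are hypothetical names introduced here and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionRef:
    """Identifies one logical partition of one data table (hypothetical type)."""
    table: str
    partition_id: int


def find_relocated_partitions(planned_machine, partitions, current_location):
    """Return {partition: second_machine} for partitions that left planned_machine.

    planned_machine  -- the first physical machine named in the connection rule
    partitions       -- the M logical partitions the rule was planned for
    current_location -- mapping PartitionRef -> machine where it lives now
                        (assumed to be available from cluster metadata)
    """
    relocated = {}
    for part in partitions:
        machine = current_location[part]
        if machine != planned_machine:
            # This machine plays the role of the "second physical machine".
            relocated[part] = machine
    return relocated


# Usage: the rule was planned for partition 0 of tables t1 and t2 on machine "A",
# but partition 0 of t2 has since moved to machine "B".
plan = [PartitionRef("t1", 0), PartitionRef("t2", 0)]
location = {PartitionRef("t1", 0): "A", PartitionRef("t2", 0): "B"}
moved = find_relocated_partitions("A", plan, location)
print(moved)
```

An empty result means no partition has moved, so the planned connection rule can proceed without a migration cost evaluation.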
Preferably, to further improve the execution efficiency of parallel partition-level connection in the distributed database, the connection rule includes obtaining, from the M logical partitions located on the first physical machine, the logical partitions that have the same partition key value, and performing an equi-join on those logical partitions.
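A minimal sketch of an equi-join between two co-located logical partitions is shown below. It assumes rows are plain dictionaries and uses a hash join; the hash-join strategy, the function name `partition_equi_join`, and the sample tables are illustrative choices, not something the specification prescribes.

```python
from collections import defaultdict


def partition_equi_join(left_rows, right_rows, key):
    """Equi-join two logical partitions that live on the same machine.

    Both partitions are assumed to cover the same partition key values,
    so every matching pair of rows can be found locally.
    """
    # Build a hash index on the partition key for the left partition.
    index = defaultdict(list)
    for row in left_rows:
        index[row[key]].append(row)

    # Probe the index with each row of the right partition.
    joined = []
    for right in right_rows:
        for left in index.get(right[key], []):
            joined.append({**left, **right})
    return joined


# Usage: partition 0 of an "orders" table and of a "users" table,
# both partitioned on user_id (hypothetical data).
orders_p0 = [{"user_id": 1, "order": "o-1"}, {"user_id": 2, "order": "o-2"}]
users_p0 = [{"user_id": 1, "name": "ann"}, {"user_id": 3, "name": "bob"}]
result = partition_equi_join(orders_p0, users_p0, "user_id")
print(result)
```

Because both inputs sit on the same physical machine, no rows need to cross the network during the join itself, which is the efficiency benefit the paragraph above refers to.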
Preferably, the judgment module further compares the cost value with a preset cost threshold:
If the cost value is less than the preset cost threshold, the logical partition is migrated from the second physical machine back to the first physical machine, and the connection rule is executed on the logical partitions of the multiple data tables;
If the cost value is greater than the preset cost threshold, the connection rule is not executed.
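The threshold comparison can be sketched as a small decision function. The names `Decision` and `decide` are hypothetical. The text only covers the "less than" and "greater than" cases, so treating a cost exactly equal to the threshold as "do not execute" is an assumption made here for completeness.

```python
from enum import Enum


class Decision(Enum):
    MIGRATE_AND_EXECUTE = "migrate the partition back, then execute the connection rule"
    DO_NOT_EXECUTE = "do not execute the connection rule"


def decide(cost_value, preset_threshold):
    """Compare the migration cost value with the preset cost threshold."""
    if cost_value < preset_threshold:
        return Decision.MIGRATE_AND_EXECUTE
    # Covers cost_value > preset_threshold as described in the text.
    # Equality also lands here, which is an assumption of this sketch.
    return Decision.DO_NOT_EXECUTE


# Usage with hypothetical cost units.
print(decide(3.0, 10.0).value)
print(decide(12.0, 10.0).value)
```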
Preferably, the data migration cost includes the time required to migrate the logical partition from the second physical machine back to the first physical machine.
Preferably, the data migration cost includes the number of data exchanges required to migrate the logical partition from the second physical machine back to the first physical machine.
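The specification names the two cost measures but does not give formulas for them. One possible way to estimate them, assuming the partition size, the available bandwidth, and a fixed transfer batch size are known, is sketched below. All parameter names and both formulas are assumptions for illustration only.

```python
import math


def migration_duration_seconds(partition_bytes, bandwidth_bytes_per_s):
    """Estimated time to copy the partition, assuming constant bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return partition_bytes / bandwidth_bytes_per_s


def migration_exchange_count(partition_bytes, batch_bytes):
    """Estimated number of data exchanges, assuming fixed-size batches."""
    if batch_bytes <= 0:
        raise ValueError("batch size must be positive")
    return math.ceil(partition_bytes / batch_bytes)


# Usage with hypothetical numbers: a 100-byte partition, 50 bytes/s, 40-byte batches.
print(migration_duration_seconds(100, 50))
print(migration_exchange_count(100, 40))
```

Either value, or a combination of the two, could then serve as the cost value that is compared with the preset cost threshold.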
For how the functions and roles of each module in the above device are implemented, see the implementation of the corresponding steps in the above method. For related details, refer to the description of the method embodiments; they are not repeated here.
The device embodiments described above are merely illustrative. Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; they may be located in one place or distributed across multiple network modules. Some or all of the units or modules may be selected according to actual needs to achieve the purpose of the scheme of this specification. Those of ordinary skill in the art can understand and implement it without creative effort.
The devices and modules described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail sending and receiving device, game console, tablet computer, wearable device, or a combination of any of these devices.
Corresponding to the above method embodiments, the embodiments of this specification also provide a computer device that includes a memory and a processor. The memory stores a computer program that can be run by the processor; when running the stored computer program, the processor performs each step of the partition-level connection method for a distributed database in the embodiments of this specification. For a detailed description of each step of the method, refer to the preceding content; it is not repeated here.
Corresponding to the above method embodiments, the embodiments of this specification also provide a computer-readable storage medium on which computer programs are stored. When run by a processor, these computer programs perform each step of the partition-level connection method for a distributed database in the embodiments of this specification. For a detailed description of each step of the method, refer to the preceding content; it is not repeated here.
The foregoing describes only preferred embodiments of this specification and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of this specification shall fall within its scope of protection.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data.
Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

Claims (12)

1. A partition-level connection method for a distributed database, the distributed database including multiple data tables that are partitioned on the same partition key, wherein each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are connected based on the same partition key, the connection method comprising:
Receiving a connection rule planned for M logical partitions located on a first physical machine, wherein the M logical partitions belong to M data tables respectively;
Checking whether the physical machine on which any of the M logical partitions is located has changed;
If so, obtaining the second physical machine on which the relocated logical partition is now located, and performing a data migration cost evaluation, the data migration cost evaluation calculating the cost value of migrating the logical partition from the second physical machine to the first physical machine;
Determining, according to the result of the data migration cost evaluation, whether to execute the connection rule.
2. The method according to claim 1, wherein the connection rule includes obtaining, from the M logical partitions located on the first physical machine, the logical partitions that have the same partition key value, and performing an equi-join on the logical partitions that have the same partition key value.
3. The method according to claim 1 or 2, wherein determining, according to the result of the data migration cost evaluation, whether to execute the connection rule comprises: comparing the cost value with a preset cost threshold,
If the cost value is less than the preset cost threshold, migrating the logical partition from the second physical machine back to the first physical machine, and executing the connection rule on the logical partitions of the multiple data tables;
If the cost value is greater than the preset cost threshold, not executing the connection rule.
4. The method according to claim 1, wherein the data migration cost includes the time required to migrate the logical partition from the second physical machine back to the first physical machine.
5. The method according to claim 1, wherein the data migration cost includes the number of data exchanges required to migrate the logical partition from the second physical machine back to the first physical machine.
6. A partition-level connection device for a distributed database, the distributed database including multiple data tables that are partitioned on the same partition key, wherein each data table is divided into multiple logical partitions, and logical partitions belonging to different data tables are connected based on the same partition key, the connection device comprising:
A receiving module, which receives a connection rule planned for M logical partitions located on a first physical machine, wherein the M logical partitions belong to M data tables respectively;
An inspection module, which checks whether the physical machine on which any of the M logical partitions is located has changed;
A cost evaluation module, which obtains the second physical machine on which a relocated logical partition is now located and performs a data migration cost evaluation, the data migration cost evaluation calculating the cost value of migrating the logical partition from the second physical machine to the first physical machine;
A judgment module, which determines, according to the result of the data migration cost evaluation, whether to execute the connection rule.
7. The device according to claim 6, wherein the connection rule includes obtaining, from the M logical partitions located on the first physical machine, the logical partitions that have the same partition key value, and performing an equi-join on the logical partitions that have the same partition key value.
8. The device according to claim 6 or 7, wherein the judgment module further compares the cost value with a preset cost threshold,
If the cost value is less than the preset cost threshold, the logical partition is migrated from the second physical machine back to the first physical machine, and the connection rule is executed on the logical partitions of the multiple data tables;
If the cost value is greater than the preset cost threshold, the connection rule is not executed.
9. The device according to claim 6, wherein the data migration cost includes the time required to migrate the logical partition from the second physical machine back to the first physical machine.
10. The device according to claim 6, wherein the data migration cost includes the number of data exchanges required to migrate the logical partition from the second physical machine back to the first physical machine.
11. A computer device, comprising a memory and a processor, wherein the memory stores a computer program that can be run by the processor, and the processor, when running the computer program, performs the method according to any one of claims 1 to 5.
12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when run by a processor, performs the method according to any one of claims 1 to 5.
CN201810682121.7A 2018-06-27 2018-06-27 Partition level connection method and device for distributed database Active CN108959510B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210635770.8A CN115129782A (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database
CN201810682121.7A CN108959510B (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810682121.7A CN108959510B (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202210635770.8A Division CN115129782A (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database

Publications (2)

Publication Number Publication Date
CN108959510A true CN108959510A (en) 2018-12-07
CN108959510B CN108959510B (en) 2022-04-19

Family

ID=64487428

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810682121.7A Active CN108959510B (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database
CN202210635770.8A Pending CN115129782A (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210635770.8A Pending CN115129782A (en) 2018-06-27 2018-06-27 Partition level connection method and device for distributed database

Country Status (1)

Country Link
CN (2) CN108959510B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020034818A1 (en) * 2018-08-14 2020-02-20 华为技术有限公司 Partition merging method and database server
CN111831425A (en) * 2019-04-18 2020-10-27 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN112905591A (en) * 2021-02-04 2021-06-04 成都信息工程大学 Data table connection sequence selection method based on machine learning
CN112905596A (en) * 2021-03-05 2021-06-04 北京中经惠众科技有限公司 Data processing method and device, computer equipment and storage medium
CN114416884A (en) * 2022-03-28 2022-04-29 北京奥星贝斯科技有限公司 Method and device for connecting partition table
CN115114328A (en) * 2022-08-29 2022-09-27 北京奥星贝斯科技有限公司 Method and device for generating query plan for distributed database
US11762881B2 (en) 2018-08-14 2023-09-19 Huawei Cloud Computing Technologies Co., Ltd. Partition merging method and database server

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831211A (en) * 2012-08-14 2012-12-19 中山大学 Data sheet migration method based on sheet relation analysis
CN102968498A (en) * 2012-12-05 2013-03-13 华为技术有限公司 Method and device for processing data
CN103440301A (en) * 2013-08-21 2013-12-11 曙光信息产业股份有限公司 Data multi-duplicate hybrid storage method and system
CN104871153A (en) * 2012-10-29 2015-08-26 华为技术有限公司 System and method for flexible distributed massively parallel processing (mpp) database
CN105009110A (en) * 2012-11-30 2015-10-28 华为技术有限公司 Method for automated scaling of massive parallel processing (mpp) database
CN105512268A (en) * 2015-12-03 2016-04-20 曙光信息产业(北京)有限公司 Data query method and device
CN106415534A (en) * 2015-05-31 2017-02-15 华为技术有限公司 Method and device for partitioning association table in distributed database
CN107784044A (en) * 2016-08-31 2018-03-09 华为技术有限公司 Table data query method and device
CN107807938A (en) * 2016-09-09 2018-03-16 华为技术有限公司 A kind of processing method and processing device of tables of data

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020034818A1 (en) * 2018-08-14 2020-02-20 华为技术有限公司 Partition merging method and database server
US11762881B2 (en) 2018-08-14 2023-09-19 Huawei Cloud Computing Technologies Co., Ltd. Partition merging method and database server
CN111831425A (en) * 2019-04-18 2020-10-27 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN112905591A (en) * 2021-02-04 2021-06-04 成都信息工程大学 Data table connection sequence selection method based on machine learning
CN112905591B (en) * 2021-02-04 2022-08-26 成都信息工程大学 Data table connection sequence selection method based on machine learning
CN112905596A (en) * 2021-03-05 2021-06-04 北京中经惠众科技有限公司 Data processing method and device, computer equipment and storage medium
CN112905596B (en) * 2021-03-05 2024-02-02 北京中经惠众科技有限公司 Data processing method, device, computer equipment and storage medium
CN114416884A (en) * 2022-03-28 2022-04-29 北京奥星贝斯科技有限公司 Method and device for connecting partition table
CN114416884B (en) * 2022-03-28 2022-06-14 北京奥星贝斯科技有限公司 Method and device for connecting partition table
CN115114328A (en) * 2022-08-29 2022-09-27 北京奥星贝斯科技有限公司 Method and device for generating query plan for distributed database
CN115114328B (en) * 2022-08-29 2022-10-28 北京奥星贝斯科技有限公司 Method and device for generating query plan for distributed database

Also Published As

Publication number Publication date
CN108959510B (en) 2022-04-19
CN115129782A (en) 2022-09-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40001233

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20200922

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200922

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20210207

Address after: 801-10, Section B, 8th floor, 556 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province 310000

Applicant after: Ant financial (Hangzhou) Network Technology Co.,Ltd.

Address before: Ky1-9008 Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands, ky1-9008

Applicant before: Innovative advanced technology Co.,Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20210910

Address after: 100020 unit 02, 901, floor 9, unit 1, building 1, No.1, East Third Ring Middle Road, Chaoyang District, Beijing

Applicant after: Beijing Aoxing Beisi Technology Co.,Ltd.

Address before: 801-10, Section B, 8th floor, 556 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province 310000

Applicant before: Ant financial (Hangzhou) Network Technology Co.,Ltd.

GR01 Patent grant