WO2018000991A1 - Data equalization method and apparatus - Google Patents

Data equalization method and apparatus

Info

Publication number
WO2018000991A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual
nodes
physical
node
physical nodes
Prior art date
Application number
PCT/CN2017/085376
Other languages
English (en)
French (fr)
Inventor
陆敬石
陶维忠
吴刚
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP17818988.2A (patent EP3467652B1, en)
Priority to BR112018077132-5A (patent BR112018077132A2, pt)
Publication of WO2018000991A1 (patent WO2018000991A1, zh)
Priority to ZA201900538A (patent ZA201900538B, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F9/5088 Techniques for rebalancing the load in a distributed system involving task migration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5017 Task decomposition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources

Definitions

  • the present invention relates to the field of data processing, and in particular, to a data equalization method and apparatus for a distributed database system based on consistent hashing.
  • distributed database technology is commonly used in the IT field. It is mainly applied to web page caching, database caching, and similar scenarios to meet users' requirements for the response speed of network systems.
  • a physical node can virtualize multiple virtual nodes, and then map multiple virtual nodes to the ring through a hash algorithm, so that the physical nodes can be mapped on the ring.
  • when a physical node is added to or deleted from a distributed database system, the number of virtual nodes associated with each physical node is adjusted to be equal, so that the virtual nodes are balanced across all physical nodes of the distributed database system.
  • the physical node may be overloaded and the service may fail.
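  • as a minimal illustrative sketch of the background technique (a physical node virtualizing several virtual nodes that are mapped onto a hash ring), the following Python code may help. The class name `HashRing`, the use of MD5, the ring size of 2**32, the `#vn` naming scheme, and the number of virtual nodes per physical node are all assumptions made for illustration and are not taken from the patent.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on a 2**32 ring (MD5 is an illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Hypothetical ring where each physical node owns several virtual nodes."""

    def __init__(self, physical_nodes, vnodes_per_node=4):
        # Each ring entry is (position, virtual node name, physical node name).
        self.entries = sorted(
            (ring_hash(f"{p}#vn{i}"), f"{p}#vn{i}", p)
            for p in physical_nodes
            for i in range(vnodes_per_node)
        )
        self.positions = [pos for pos, _, _ in self.entries]

    def locate(self, key: str):
        """Return (virtual node, physical node) for the first entry clockwise of the key."""
        idx = bisect.bisect_left(self.positions, ring_hash(key)) % len(self.entries)
        _, vnode, pnode = self.entries[idx]
        return vnode, pnode


ring = HashRing(["physical-1", "physical-2", "physical-3"])
vnode, pnode = ring.locate("user:42")
```

  • in this sketch every physical node gets the same number of virtual nodes, which is exactly the count-based balance whose limits the following paragraphs discuss.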
  • a technical problem to be solved by embodiments of the present invention is to provide a data equalization method.
  • the problem of data imbalance of each physical node of the distributed database system in the prior art can be solved.
  • an embodiment of the present invention provides a data balancing method, including:
  • the data equalization device first acquires the load degrees of the n weight factors, and the n weight values, of each virtual node on the m physical nodes in the distributed database system; for each virtual node, each weight factor corresponds to one weight value.
  • a weighting factor is a parameter for evaluating a specified dimension of the virtual node. The specified dimension may be a service dimension and/or a resource dimension, or another dimension, and the load degree of a weighting factor represents the ratio between the actual parameter value of the specified dimension and the rated parameter value.
  • Each virtual node in a distributed database system has the same number and type of weighting factors and has the same weight value configuration.
  • the weight coefficient of a virtual node is obtained by weighted averaging of the load degrees of its n weight factors with the n weight values. For example, a virtual node has four weighting factors: memory occupancy rate, number of records, access frequency, and CPU usage.
  • the weights of the four weighting factors are: 0.2, 0.3, 0.3, and 0.2.
  • the data equalization device determines the virtual nodes associated with each physical node according to the mapping relationship between physical nodes and virtual nodes, and obtains the standard number of fragments of each physical node from its associated virtual nodes. For a given physical node, any of the following may be used:
  • the weight coefficients of all associated virtual nodes are added to obtain the standard number of fragments; or the sum of the weight coefficients of all associated virtual nodes is enlarged by a preset multiple to obtain the standard number of fragments;
  • or the weight coefficients of all associated virtual nodes are weighted and averaged to obtain the standard number of fragments; or the weight coefficients of all associated virtual nodes are weighted and averaged and then enlarged by a preset multiple to obtain the standard number of fragments.
  • it should be noted that every physical node calculates the standard number of fragments in the same way; for example, the magnification values are the same, and the weight value configuration used for weighting is the same.
  • after obtaining the standard number of fragments of each of the m physical nodes, the data equalization device determines, according to the standard numbers of fragments, whether the data distribution satisfies the data equalization condition. If the condition is not met, the data equalization device performs data equalization on the m physical nodes:
  • virtual nodes are migrated from physical nodes with a large standard number of fragments to physical nodes with a small standard number of fragments, so that the standard numbers of fragments of the physical nodes are balanced.
  • the meaning of the equalization is that one or more parameter values of each of the plurality of nodes float within a specified range.
  • the weight coefficient of each virtual node is obtained according to the load degrees and the weight values of the weighting factors,
  • the standard number of fragments of each physical node is obtained according to the weight coefficients of its virtual nodes,
  • and whether the data distribution is balanced is determined according to the standard numbers of fragments.
  • data balancing of the distributed database system includes:
  • simulated migration (pre-migration) is performed on the virtual nodes associated with the m physical nodes.
  • the simulated migration does not actually perform the migration action; instead, the virtual node migration process is simulated according to the migration rule.
  • the migration rule aims to balance, across the m physical nodes, the standard numbers of fragments, the numbers of virtual nodes, and the weight coefficients of the virtual nodes on the same physical node.
  • the data equalization device determines whether the virtual nodes associated with the m physical nodes after the simulated migration satisfy the data equalization condition. If yes, the associated virtual nodes are actually migrated according to the migration rule described above; after the actual migration, the m physical nodes meet the data equalization condition.
  • if the m physical nodes do not satisfy the data equalization condition after the simulated migration, at least one virtual node is selected from the m physical nodes for fission, and the mapping relationship between physical nodes and virtual nodes is updated. For example, the virtual node with the largest weight coefficient is selected for fission, or the virtual node with the largest weight coefficient on the physical node with the largest standard number of fragments is selected for fission.
  • the data equalization device then actually migrates the virtual nodes associated with the m physical nodes according to the migration rule, so that the m physical nodes reach data balance.
  • the preset migration rule may specifically be: one fragment is migrated from the physical node with the largest standard number of fragments to the physical node with the smallest standard number of fragments.
  • when the granularity of the virtual nodes is too large to achieve data balancing, some virtual nodes may be split and then migrated. Not all virtual nodes need to undergo fission, which effectively controls the range and frequency of fission and reduces processing overhead.
  • selecting at least one virtual node from the virtual nodes associated with the m physical nodes to be split into at least two virtual nodes includes: determining the physical node having the largest standard number of fragments, selecting the virtual node having the largest weight coefficient on that physical node, and splitting the virtual node into at least two virtual nodes of the same size.
  • splitting the virtual node with the largest weight coefficient on the physical node with the largest standard number of fragments into two virtual nodes of equal size minimizes the number and range of fissions and reduces processing overhead.
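  • the simulate-then-migrate-or-split flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the greedy "move one virtual node from the heaviest to the lightest physical node" rule, the strict inequalities in `balanced`, the `max_moves` and `max_fissions` limits, and the choice to re-run the simulation after each fission are all hypothetical and not specified by the source.

```python
def fragment_counts(mapping, coeff):
    """Standard number of fragments per physical node: sum of vnode weight coefficients."""
    return {p: sum(coeff[v] for v in vnodes) for p, vnodes in mapping.items()}


def balanced(counts, deviation):
    """Assumed form of the equalization condition (strict inequalities)."""
    avg = sum(counts.values()) / len(counts)
    return (max(counts.values()) < avg * (1 + deviation)
            and min(counts.values()) > avg * (1 - deviation))


def simulate_migration(mapping, coeff, deviation, max_moves=1000):
    """Work on a copy: repeatedly move one vnode from the heaviest to the lightest node."""
    plan = {p: list(vnodes) for p, vnodes in mapping.items()}
    for _ in range(max_moves):
        counts = fragment_counts(plan, coeff)
        if balanced(counts, deviation):
            break
        src = max(counts, key=counts.get)
        dst = min(counts, key=counts.get)
        gap = counts[src] - counts[dst]
        # Only a vnode lighter than the gap narrows the difference when moved.
        movable = [v for v in plan[src] if coeff[v] < gap]
        if not movable:
            break
        chosen = max(movable, key=coeff.get)
        plan[src].remove(chosen)
        plan[dst].append(chosen)
    return plan


def rebalance(mapping, coeff, deviation, max_fissions=16):
    """Return (migration plan, weight coefficients), splitting vnodes only when needed."""
    mapping = {p: list(vnodes) for p, vnodes in mapping.items()}
    coeff = dict(coeff)
    for _ in range(max_fissions + 1):
        plan = simulate_migration(mapping, coeff, deviation)
        if balanced(fragment_counts(plan, coeff), deviation):
            return plan, coeff  # the actual migration would follow this plan
        # Fission: split the heaviest vnode on the heaviest physical node into two halves.
        counts = fragment_counts(mapping, coeff)
        src = max(counts, key=counts.get)
        big = max(mapping[src], key=coeff.get)
        half = coeff.pop(big) / 2
        mapping[src].remove(big)
        for suffix in ("-1", "-2"):
            coeff[big + suffix] = half
            mapping[src].append(big + suffix)
    raise RuntimeError("not balanced within max_fissions")


mapping = {"p1": ["a", "b"], "p2": ["c"]}
coeff = {"a": 1.0, "b": 0.2, "c": 0.2}
plan, new_coeff = rebalance(mapping, coeff, deviation=0.2)
```

  • in this example no pure migration can balance the two nodes because virtual node "a" is too coarse, so the sketch splits "a" once and then finds a balanced plan, which mirrors the point that only some virtual nodes need fission.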
  • the migration rule includes: grouping the virtual nodes associated with the m physical nodes according to their weight coefficients. The grouping rule may be grouping by preset weight coefficient intervals, by rounding to the nearest value, by rounding up, by rounding down, etc.; the principle of grouping is to ensure that the weight coefficients of the virtual nodes in each group float within a specified range. After grouping, data is balanced within each group, so that each physical node maps the same number of virtual nodes of that group.
  • the virtual nodes left unallocated in each group are then collected, and all unallocated virtual nodes are redistributed to the m physical nodes.
  • in this way the number of virtual nodes on each physical node is balanced, and the standard numbers of fragments of the physical nodes are balanced.
  • by grouping the virtual nodes associated with the physical nodes and equalizing the number of virtual nodes per physical node within each group, both the number of virtual nodes and the amount of data on each physical node are balanced; the equalization process is simple and the amount of calculation is small.
  • the grouping the virtual nodes associated with the m physical nodes according to the weight coefficient includes:
  • the data equalization device sets a plurality of weight coefficient intervals that do not overlap each other; each weight coefficient interval corresponds to one group, and the step size of each weight coefficient interval may be equal. For each virtual node, the device determines which of the weight coefficient intervals the weight coefficient of the virtual node falls into, and classifies the virtual node into the group corresponding to that interval.
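  • the grouping-based migration rule can be sketched as below. This is a simplified illustration, assuming equal-width intervals of hypothetical size `step`, a target plan computed from scratch (current placement is ignored), and a hypothetical tie-breaking rule that hands each leftover virtual node to the physical node with the fewest virtual nodes and then the smallest total weight; none of these specifics come from the source.

```python
from collections import defaultdict


def plan_by_groups(coefficients, physical_nodes, step=0.25):
    """Return {physical node: [virtual nodes]} following the grouping rule."""
    # 1. Group virtual nodes by the weight-coefficient interval they fall into.
    #    Float floor division is used, so values exactly on an interval
    #    boundary may land in either neighbouring group.
    groups = defaultdict(list)
    for vnode, coeff in coefficients.items():
        groups[int(coeff // step)].append(vnode)

    plan = {p: [] for p in physical_nodes}
    leftovers = []
    m = len(physical_nodes)

    # 2. Within each group, give every physical node the same number of vnodes.
    for _, members in sorted(groups.items()):
        share = len(members) // m
        for i, p in enumerate(physical_nodes):
            plan[p].extend(members[i * share:(i + 1) * share])
        leftovers.extend(members[m * share:])

    # 3. Redistribute the unallocated vnodes, heaviest first.
    def total(p):
        return sum(coefficients[v] for v in plan[p])

    for vnode in sorted(leftovers, key=coefficients.get, reverse=True):
        target = min(physical_nodes, key=lambda p: (len(plan[p]), total(p)))
        plan[target].append(vnode)
    return plan


coefficients = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.35, "e": 0.4, "f": 0.1}
plan = plan_by_groups(coefficients, ["p1", "p2"])
```

  • because each group only contains virtual nodes of similar weight, handing out equal counts per group keeps both the number of virtual nodes and the summed weight per physical node close together with very little computation.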
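  • performing weighted averaging according to the load degrees of the n weight factors and the n weight values to obtain the weight coefficient of each virtual node includes: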
  • the dynamic load degree of a weighting factor indicates the load degree of the weighting factor dynamically monitored for the virtual node within a preset duration. For example, if the CPU usage of the virtual node is monitored to be 50% within one hour, then 50% is the dynamic load degree of the CPU usage.
  • the static load degrees of the n weighting factors that are statically set for the virtual node are obtained, and the static load degrees of the n weighting factors are weighted and summed with the n weight values to obtain a static weight coefficient. The static load degree of a weighting factor indicates the pre-configured load degree of the weighting factor, which is a fixed value regardless of the actual load degree of the weighting factor. For example, if the pre-configured CPU usage of a virtual node is 60%, then 60% is the static load degree of the weighting factor, while the actual CPU usage of the virtual node may be 55%.
  • each virtual node has the same weighting factor configuration and the same weights for the dynamic weight coefficient and the static weight coefficient, which ensures that each virtual node evaluates its data distribution state under the same criteria. For example, virtual node 1 and virtual node 4 have the same weighting factors: CPU usage, disk occupancy, and number of records. Virtual node 1 and virtual node 4 also have the same weight configuration: the weight of the dynamic weight coefficient is Ws, and the weight of the static weight coefficient is Wd.
  • the data equalization device weights and sums the dynamic weight coefficient and the static weight coefficient of the virtual node to obtain a weight coefficient.
  • the data equalization condition includes: $L_{\max} < L_{\text{average}} \times (1 + \alpha)$ and $L_{\min} > L_{\text{average}} \times (1 - \alpha)$, where "and" denotes a logical AND relationship, $L_{\max}$ is the maximum standard number of fragments among the m physical nodes, $L_{\text{average}}$ is the average standard number of fragments of the m physical nodes, and $L_{\min}$ is the minimum standard number of fragments among the m physical nodes.
  • $\alpha$ is a deviation coefficient, which can be set in advance, $0 \le \alpha \le 1$.
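  • a check of the data equalization condition could look like the following sketch. The comparison operators are garbled in the source text, so the strict inequalities used here, like the function name `is_balanced` and the sample values, are assumptions made for illustration.

```python
def is_balanced(fragment_counts, deviation):
    """Check L_max < L_avg * (1 + deviation) and L_min > L_avg * (1 - deviation).

    fragment_counts: standard number of fragments of each physical node.
    deviation: preset deviation coefficient, assumed to lie in [0, 1].
    """
    if not 0 <= deviation <= 1:
        raise ValueError("deviation coefficient must be between 0 and 1")
    average = sum(fragment_counts) / len(fragment_counts)
    return (max(fragment_counts) < average * (1 + deviation)
            and min(fragment_counts) > average * (1 - deviation))


# Hypothetical standard numbers of fragments for four physical nodes.
even = is_balanced([1.02, 0.95, 1.10, 0.98], deviation=0.1)
skewed = is_balanced([2.0, 0.5, 1.0, 0.5], deviation=0.1)
```

  • with strict inequalities a deviation coefficient of 0 can never be satisfied, so in practice a small positive value would be configured.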
  • the n weight factors include: one or more of a service level weight factor and a resource level weight factor;
  • the service level weighting factor includes: one or more of an access frequency and a number of records of the business object;
  • the resource level weighting factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
  • the weight coefficients of the virtual nodes are calculated from multiple dimensions, and the data distribution of each virtual node can be accurately evaluated.
  • a data equalization apparatus provided by the embodiment of the present invention includes: an acquisition module, a weight coefficient calculation module, a standard fragment number calculation module, an equalization judgment module, and an equalization module.
  • An obtaining module configured to obtain a load degree and n weight values of n weight factors of each virtual node on the m physical nodes in the distributed database system, where m and n are integers greater than 1;
  • a weight coefficient calculation module configured to obtain a weighting coefficient of each virtual node according to a weighted average of n weight factors and n weight values
  • a standard fragment number calculation module configured to obtain a standard number of fragments according to a weight coefficient of a virtual node associated with each physical node
  • the equalization judging module is configured to judge whether the data distribution satisfies the data equalization condition according to the standard number of fragments corresponding to each of the m physical nodes;
  • the equalization module is configured to perform data equalization processing on the m physical nodes if the judgment result of the equalization judgment module is no.
  • the equalization module includes:
  • a simulated migration unit, configured to perform simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule;
  • a determining unit, configured to determine whether the m physical nodes after the simulated migration meet the data equalization condition;
  • an actual migration unit, configured to actually migrate the virtual nodes associated with the m physical nodes according to the migration rule if the determination result of the determining unit is yes;
  • a fission unit, configured to, if the determination result of the determining unit is no, select at least one virtual node from the virtual nodes associated with the m physical nodes to perform fission, and then actually migrate the virtual nodes associated with the m physical nodes according to the migration rule.
  • the fission unit is used to:
  • the migration rule includes:
  • the virtual nodes associated with m physical nodes are grouped according to the weight coefficient.
  • the method of grouping may be grouping by weight coefficient interval, by rounding to the nearest value, by rounding up, or by rounding down.
  • the virtual nodes remaining after the average allocation of each group are reassigned to m physical nodes.
  • the data equalization condition includes: $L_{\max} < L_{\text{average}} \times (1 + \alpha)$ and $L_{\min} > L_{\text{average}} \times (1 - \alpha)$, where "and" denotes a logical AND relationship, $L_{\max}$ is the maximum standard number of fragments among the physical nodes, $L_{\text{average}}$ is the average standard number of fragments of the m physical nodes, $L_{\min}$ is the minimum standard number of fragments among the physical nodes, and $\alpha$ is the deviation coefficient, $0 \le \alpha \le 1$.
  • the n weight factors include: one or more of a service level weight factor and a resource level weight factor ;
  • the service level weighting factor includes: one or more of the access frequency and the number of records of the business object;
  • the resource-level weighting factors include one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
  • the application provides a data equalization apparatus, including:
  • one or more processors, a memory, a bus system, a transceiver, and one or more programs, where the processors, the memory, and the transceiver are coupled by the bus system;
  • the one or more programs are stored in the memory and include instructions that, when executed by the terminal, cause the terminal to perform the method of the first aspect or of any one of the first to seventh possible implementations of the first aspect.
  • the present application provides a computer readable storage medium storing one or more programs, the one or more programs including instructions that, when executed by the data equalization device, cause the data equalization device to perform the method of the first aspect or of any one of the first to seventh possible implementations of the first aspect.
  • FIG. 1 is a schematic structural diagram of a distributed database system according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a data equalization method according to an embodiment of the present invention.
  • FIG. 3 is another schematic flowchart of a data equalization method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a migration rule according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a data equalization apparatus according to an embodiment of the present invention.
  • FIG. 6 is another schematic structural diagram of a data equalization apparatus according to an embodiment of the present invention.
  • a distributed database system includes at least one client, a metadata server, a data node server, and a storage network. The at least one client communicates with the metadata server and the data node server through an IP network, where the communication interface between the client and the metadata server or the data node server is a TCP (Transmission Control Protocol) interface or a UDP (User Datagram Protocol) interface.
  • TCP Transmission Control Protocol
  • UDP User Datagram Protocol
  • the metadata server includes a data equalization device. The metadata server can be deployed independently, for example on a minicomputer, an X86 computer, or a PC server, or it can be deployed on a data node server.
  • the data node server includes a plurality of physical nodes, and a physical node may be a minicomputer, an X86 computer, or a PC server.
  • the data of the plurality of physical nodes may be stored in a storage medium of the storage network, and the plurality of physical nodes read and write the storage medium of the storage network through block IO.
  • the storage medium can be an HDD (Hard Disk Drive), an SSD (Solid State Drive), or memory.
  • the metadata server mainly stores the metadata (Metadata) of the database system, and the metadata is information about the organization of the data, the data domain and the relationship thereof.
  • the metadata is the data used to describe the data.
  • the data equalization device is responsible for the distributed management capabilities of the distributed database system as a whole. In addition to being deployed in a metadata server, the data equalization device can also be built into various physical nodes. The highly reliable deployment mode of the distributed database system may adopt a dual-machine or a cluster mode, and the present invention is not limited.
  • the data equalization device mainly involves the following functions:
  • Metadata definition: store and update the mapping relationship between virtual nodes and physical nodes, the copy definition of virtual nodes, the calculation method for the load degrees of weight factors and for weight coefficients, the configuration of deviation coefficients, and the fission strategy of virtual nodes.
  • Route management: manage the routing data of business objects according to the mapping relationship between virtual nodes and physical nodes, calling the metadata definition unit interface.
  • Replication management: store and update the primary copy relationship of virtual nodes and manage replication integrity, calling the metadata definition unit interface.
  • Node monitoring: monitor service information and node resource information by invoking the communication service unit, acquiring information from each data service node, and issuing monitoring instructions.
  • Online migration: provide online migration capability in units of data fragments, which has no impact on the business and ensures high availability during migration. The metadata definition interface is used to complete the routing and replication relationship changes.
  • Data equalization: achieve data balancing between physical nodes according to the data equalization conditions.
  • Metadata synchronization/persistence: synchronize metadata definition information to each physical node, the driver of the client, and the slave node of the data equalization device. When the metadata changes, the change information is notified to these nodes, and the metadata is persisted to ensure high reliability of the data.
  • Communication service: provide network communication capabilities with peripheral network elements (each physical node, the driver of the client, and the slave node of the data equalization device).
  • an agent may be deployed on each of the plurality of physical nodes. The agent is responsible for information interaction with the data equalization device (for example, reporting node health information and receiving instructions from the data equalization device) and for self-management of node high availability (for example, domain degradation when the network is abnormal).
  • the driver is also deployed on the client, and the routing information is cached in the driver.
  • the client can complete the route judgment through the cached routing information and access the corresponding physical node, thereby preventing the data balancing device from becoming a bottleneck of the routing query during service access.
  • the method includes:
  • the data equalization device acquires the load degree of the n weight factors of each virtual node on the m physical nodes, and the data equalization device can acquire the load degree of the different weight factors of the virtual node by using the agent deployed on each physical node.
  • Each virtual node has the same number and type of weighting factors, and the configuration of the weight values is also the same.
  • the weight values of the four weighting factors are configured as follows: the number of records corresponds to the weight value w1, the memory occupancy corresponds to the weight value w2,
  • the CPU usage corresponds to the weight value w3, and the access frequency corresponds to the weight value w4.
  • the distributed database system has four physical nodes: physical node 1, physical node 2, physical node 3, and physical node 4.
  • physical node 1 is associated with virtual node a and virtual node b,
  • physical node 2 is associated with virtual node c and virtual node d,
  • physical node 3 is associated with virtual node e and virtual node f,
  • and physical node 4 is associated with virtual node g and virtual node h.
  • virtual nodes a to h then all have the above four weighting factors, and the weight value configuration of each virtual node is the same.
  • the weighting factor is used to evaluate the parameters of a dimension of the virtual node. It can be a service dimension, a resource dimension, or other dimensions.
  • for example, a weight factor may be the number of records, CPU usage, access frequency, disk space usage, or memory usage.
  • the load factor of the weighting factor indicates the ratio of the actual parameter value of the weighting factor to the rated parameter value.
  • a method for calculating the weight coefficient of any one of the virtual nodes associated with the m physical nodes is: obtaining the load degrees of the n weight factors of the virtual node and the n weight values, and weighting the load degrees of the n weight factors by the n weight values to obtain the weight coefficient of the virtual node.
  • the load degrees of the four weighting factors of virtual node b are: number of records 0.8, memory occupancy 0.2, CPU usage 0.5, and access frequency 0.3. Virtual node b has the same weight value configuration as virtual node a, so the weight values corresponding to the four weight factors of virtual node b are also 0.1, 0.2, 0.3, and 0.4, which gives a weight coefficient of 0.8 × 0.1 + 0.2 × 0.2 + 0.5 × 0.3 + 0.3 × 0.4 = 0.39.
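  • the weighted calculation for virtual node b can be written as a short Python sketch. The function name `weight_coefficient` is hypothetical; the load degrees and weight values are the ones listed above for virtual node b.

```python
def weight_coefficient(load_degrees, weight_values):
    """Weighted sum of the load degrees of the n weight factors."""
    if len(load_degrees) != len(weight_values):
        raise ValueError("each weight factor needs exactly one weight value")
    return sum(load * weight for load, weight in zip(load_degrees, weight_values))


# Order: number of records, memory occupancy, CPU usage, access frequency.
loads_b = [0.8, 0.2, 0.5, 0.3]
weights = [0.1, 0.2, 0.3, 0.4]
coeff_b = weight_coefficient(loads_b, weights)  # approximately 0.39
```

  • because every virtual node uses the same `weights` list, the resulting coefficients are directly comparable across virtual nodes.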
  • performing weighted averaging according to the load degree of the n weight factors and the n weight values to obtain a weight coefficient of each virtual node includes:
  • the dynamic load degree of the weighting factor indicates the load degree of the weighting factor dynamically monitored by the virtual node within a preset duration, which is the actual load degree of the virtual node. For example, within one hour, the CPU usage of the virtual node is monitored to be 50%, and 50% is the dynamic load of the CPU usage.
  • the static load degree of the weighting factor represents the load degree of the pre-configured weighting factor, which is a fixed value, regardless of the actual load degree of the weighting factor. For example, if the pre-configured CPU usage of a virtual node is 60%, then 60% is the static load degree of the weighting factor.
  • the actual load of the CPU usage of the virtual node may be 55%.
  • each virtual node has the same weighting factor configuration and the same weights for the dynamic weight coefficient and the static weight coefficient, which ensures that each virtual node evaluates its data distribution state under the same criteria.
  • for example, virtual node 1 and virtual node 4 have the same weighting factors: CPU usage, disk occupancy, and number of records.
  • virtual node 1 and virtual node 4 also have the same weight configuration: the weight of the dynamic weight coefficient is Ws, and the weight of the static weight coefficient is Wd.
  • the data equalization device weights and sums the dynamic weight coefficient and the static weight coefficient of the virtual node to obtain the weight coefficient.
  • by calculating the weight coefficient of the virtual node from both the dynamic load degrees and the static load degrees of the weighting factors, the data equalization device reduces the error in evaluating the data distribution state of the virtual node.
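  • combining the dynamic and static weight coefficients can be sketched as follows. The function name, the parameter names `w_dynamic` and `w_static`, and all numeric values are hypothetical and chosen only to illustrate the weighted sum; the source does not give concrete values for these weights.

```python
def combined_weight_coefficient(dynamic_loads, static_loads, weight_values,
                                w_dynamic, w_static):
    """Weighted sum of the dynamic and static weight coefficients of one virtual node."""
    dynamic_coeff = sum(l * w for l, w in zip(dynamic_loads, weight_values))
    static_coeff = sum(l * w for l, w in zip(static_loads, weight_values))
    return w_dynamic * dynamic_coeff + w_static * static_coeff


# Order: CPU usage, disk occupancy, number of records (illustrative values).
dynamic_loads = [0.5, 0.4, 0.7]   # monitored within the preset duration
static_loads = [0.6, 0.5, 0.5]    # pre-configured fixed values
weight_values = [0.5, 0.3, 0.2]   # same configuration for every virtual node
coeff = combined_weight_coefficient(dynamic_loads, static_loads, weight_values,
                                    w_dynamic=0.7, w_static=0.3)
```

  • giving the static part a non-zero weight damps short-lived spikes in the monitored values, which is one way to read the stated goal of reducing evaluation error.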
  • the data equalization device determines, according to the mapping relationship between physical nodes and virtual nodes, the virtual nodes associated with each physical node. The method for calculating the standard number of fragments of any one of the m physical nodes may be: summing the weight coefficients of all the virtual nodes associated with the physical node to obtain the standard number of fragments of the physical node.
  • physical node 1 is associated with virtual node a and virtual node b.
  • the weight coefficient of virtual node a is 0.63
  • the weight coefficient of virtual node b is 0.39.
  • the standard number of fragments of physical node 1 is equal to the sum of the weight coefficients of virtual node a and virtual node b: 0.63 + 0.39 = 1.02.
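  • the per-node summation can be sketched in Python as below. The function name `standard_fragment_counts` and the optional `multiple` parameter (modelling the "preset multiple" variant) are illustrative assumptions; the mapping and coefficients are those of physical node 1 in the example.

```python
def standard_fragment_counts(mapping, coefficients, multiple=1.0):
    """Standard number of fragments per physical node.

    mapping: {physical node: [associated virtual nodes]}
    coefficients: {virtual node: weight coefficient}
    multiple: optional preset multiple, identical for all physical nodes.
    """
    return {
        pnode: multiple * sum(coefficients[v] for v in vnodes)
        for pnode, vnodes in mapping.items()
    }


mapping = {"physical node 1": ["a", "b"]}
coefficients = {"a": 0.63, "b": 0.39}
counts = standard_fragment_counts(mapping, coefficients)
# counts["physical node 1"] is approximately 1.02
```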
  • the data equalization condition indicates that the number of standard fragments between m physical nodes reaches a balanced level. For example, the number of standard fragments of each physical node in m physical nodes floats within a specified range.
  • the standard number of fragments of the physical node is calculated according to the load factor and the weight value of the weighting factor, and the data distribution is evaluated by using the standard number of fragments, so that the data of the physical nodes in the distributed database system can be accurately balanced.
  • FIG. 3 is another schematic flowchart of a data balancing method according to an embodiment of the present invention.
  • the method includes:
  • the distributed database system is provided with m physical nodes. Each physical node can periodically report the load degrees of the n weight factors of its associated virtual nodes to the data equalization device, and the weight values of the n weight factors can be configured in advance;
  • each virtual node has the same weight value configuration.
  • the monitoring of the load factor of the weighting factor by the data equalization device can be achieved by a proxy device deployed in each physical node.
  • the weighting factor of the virtual node is used to evaluate a parameter of a dimension of the virtual node, and the load degree of the weighting factor represents a ratio of the actual parameter value of the dimension to the rated parameter value, and one virtual node may have multiple different weighting factors. Multiple different weighting factors can belong to multiple dimensions.
  • the number and type of weighting factors of each virtual node in the distributed database system are the same, and the weight value configuration is the same, so that each virtual node compares the data distribution under the same standard. m and n are integers greater than one.
  • The n weight factors include one or more of a service level weight factor and a resource level weight factor.
  • The service level weight factor includes one or more of an access frequency and a number of records of the service object.
  • The resource level weight factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
  • When the data equalization device detects that a new physical node is added to, or an old physical node is deleted from, the distributed database system, it updates the mapping relationship between physical nodes and virtual nodes, triggers acquisition of the n weight factors and n weight values of the virtual nodes on the m physical nodes, and then performs the subsequent data equalization steps. This is because the distributed database system is very likely to stop satisfying the load balance condition after physical nodes are added or deleted.
  • The weight coefficient of any virtual node associated with the m physical nodes is calculated from the load degrees of the n weight factors obtained in S301 and the weight value configuration of the n weight factors. Every virtual node has the same number and types of weight factors and the same weight value configuration.
  • The standard number of fragments of any one of the m physical nodes may be calculated by obtaining the weight coefficients of the virtual nodes associated with the physical node and summing them to obtain the standard number of fragments of that physical node.
  • For example, physical node 1 is associated with virtual node a and virtual node b. The weight coefficient of virtual node a is 0.9 and the weight coefficient of virtual node b is 1.2, so the standard number of fragments of physical node 1 equals the sum of the weight coefficients of the associated virtual nodes a and b: 0.9 + 1.2 = 2.1.
  • The data equalization device determines, according to the standard number of fragments of each of the m physical nodes, whether the distributed database system satisfies the load balance condition, which requires the standard numbers of fragments of the m physical nodes to float within a specified range.
  • In one possible implementation, the data equalization condition is: L_max < L_average × (1 + α) and L_min > L_average × (1 - α), where "and" denotes a logical AND, so both conditions must hold at the same time. L_max is the largest standard number of fragments among the physical nodes, L_average is the average standard number of fragments of the m physical nodes, L_min is the smallest standard number of fragments among the physical nodes, and α is the deviation coefficient, 0 ≤ α ≤ 1. The average standard number of fragments equals the sum of the standard numbers of fragments of the m physical nodes divided by m.
  • For example, m = 4 and the distributed database system includes physical node 1, physical node 2, physical node 3, and physical node 4, whose standard numbers of fragments are 2.5, 3.5, 4.5, and 6.5 respectively. The average standard number of fragments of the four physical nodes is (2.5 + 3.5 + 4.5 + 6.5)/4 = 4.25, the smallest standard number of fragments is 2.5, and the largest is 6.5.
  • the deviation coefficient can be preset, and the value of the deviation coefficient is between 0 and 1.
  • The data equalization device determines, according to the standard numbers of fragments of the m physical nodes, whether the data distribution satisfies the data equalization condition. If the determination result is negative, the process proceeds to S305. If the determination result is yes, S301 may be performed again.
  • Simulated migration means that the virtual nodes are not actually migrated; instead, the virtual node migration process is simulated according to the migration rule.
  • the migration rule includes: grouping virtual nodes associated with each of the m physical nodes according to the weight coefficient.
  • The grouping method may be: according to the distribution range of weight coefficients in the distributed database system, preset multiple weight coefficient intervals that are adjacent and do not overlap. Each weight coefficient interval may have the same step size, where the step size of an interval is the absolute value of the difference between its two endpoints.
  • Each weight coefficient interval corresponds to one group. For any virtual node associated with the m physical nodes, the weight coefficient interval to which its weight coefficient belongs is determined, and the group of the virtual node is determined according to that weight coefficient interval.
  • the weight coefficient interval 1 is [1.5, 2.5), corresponding to group 1; the weight coefficient interval 2 is [2.5, 3.5], corresponding to group 2.
  • Square brackets indicate that the endpoint is included, and parentheses indicate that the endpoint is excluded. If a virtual node has a weight coefficient of 2, the weight coefficient belongs to weight coefficient interval 1, so the virtual node is placed in group 1.
  • In another implementation, the grouping method may be: round the weight coefficient of each virtual node to the nearest integer, and place virtual nodes with the same rounded weight coefficient in the same group.
  • After all the virtual nodes on the m physical nodes are grouped, the following processing is performed for each group: determine the number of virtual nodes of the group on the m physical nodes, and distribute the virtual nodes of the group equally among the m physical nodes, so that each physical node is assigned the same integer number of virtual nodes.
  • If the number of virtual nodes in the group is not an integer multiple of m, virtual nodes remain in the group after the equal distribution. If the number of virtual nodes in the group is less than m, some physical nodes are assigned no virtual node of the group. The remaining virtual nodes of each group are collected and reassigned to the m physical nodes, ensuring that the standard number of fragments of each physical node satisfies the data equalization condition.
  • For example, the distributed database system initially satisfies the data equalization condition. Physical node 3 is then added, so the system has three physical nodes, and the three physical nodes no longer satisfy the data equalization condition.
  • The weight coefficients of the virtual nodes associated with the three physical nodes are shown in Table 1:
  • All the virtual nodes associated with the three physical nodes are grouped by weight coefficient using two weight coefficient intervals: interval 1 is [0.5, 1.5), corresponding to group 1, and interval 2 is [1.5, 2.5), corresponding to group 2. The virtual nodes belonging to group 1 are a, b, c and g, h, i; the virtual nodes belonging to group 2 are d, e, f and j, k, l.
  • The mapping relationship between the three physical nodes and the virtual nodes in group 1 is shown in Table 2:
  • Table 2 contains 6 virtual nodes, which need to be distributed equally among the 3 physical nodes, so each physical node is assigned 2 virtual nodes.
  • One virtual node on physical node 1 (any one of a, b, c) and one virtual node on physical node 2 (any one of g, h, i) need to be migrated to physical node 3, so that each physical node is assigned two virtual nodes. The number of virtual nodes is exactly an integer multiple of the number of physical nodes, so no virtual nodes remain in group 1.
  • the mapping relationship after migration of virtual nodes in group 1 is as follows:
  • Table 3 contains 6 virtual nodes, which need to be distributed among the three physical nodes, so each physical node is assigned two virtual nodes. For group 2, one virtual node on physical node 1 (one of d, e, f) is migrated to physical node 3, and one virtual node on physical node 2 is migrated to physical node 3, so that each physical node is evenly allocated two virtual nodes of group 2.
  • the number of virtual nodes in group 2 is exactly an integer multiple of the number of physical nodes, and there are no remaining virtual nodes in group 2. For example, the mapping relationship after the migration of virtual nodes in group 2 is as shown in Table 5:
  • the data equalization device does not need to process the remaining virtual nodes of each group.
  • the mapping relationship between the three physical nodes and the associated virtual nodes is as shown in Table 6:
  • The migration rule includes: the data equalization device obtains the average standard number of fragments of the m physical nodes, which is the sum of the standard numbers of fragments of the m physical nodes divided by m. All possible virtual node migration schemes are planned according to this average, where each scheme requires the standard number of fragments of every physical node to approach the average. The data equalization device then determines the target virtual node migration scheme from these schemes. The determination may proceed as follows: for each virtual node migration scheme, assuming the scheme has been executed, the data equalization device calculates, for each of the m physical nodes, the sum of the squares of the weight coefficients of its virtual nodes, and then calculates the average of these sums over the m physical nodes.
  • For example, physical node 1 is associated with virtual node a and virtual node b; the load degree of virtual node a is 1 and the load degree of virtual node b is 4. Physical node 2 is associated with virtual node c and virtual node d; the load degree of virtual node c is 2 and the load degree of virtual node d is 3.
  • The data distribution of the three physical nodes in the distributed database system clearly does not satisfy the data equalization condition.
  • The standard number of fragments of physical node 1 is 9, and the standard number of fragments of physical node 2 is 9.
  • Taking 6 as the standard number of fragments of each physical node, assume that all possible virtual node migration schemes are planned as follows:
  • Solution 1: virtual node a (0.9), virtual node b (1), and virtual node c (1.1) on physical node 1 are migrated to physical node 3, and virtual node g (0.9), virtual node h (1), and one further virtual node (1) on physical node 2 are migrated to physical node 3.
  • The mapping relationship between the physical nodes and the virtual nodes after migration is shown in Table 7:
  • Solution 2: virtual node b (1) and virtual node e (2) on physical node 1 are migrated to physical node 3, and virtual node h (1) and virtual node k (2) on physical node 2 are migrated to physical node 3.
  • the mapping relationship between the physical node and the virtual node after migration is as shown in Table 8:
  • Solution 2 not only ensures the balance of the standard number of fragments on each physical node but also ensures the balance of the number of virtual nodes.
  • The data equalization device therefore selects Solution 2 as the target virtual node migration scheme.
  • S306. Determine, according to the standard number of fragments corresponding to each of the m physical nodes, whether the data distribution satisfies the data equalization condition.
  • the data equalization device determines whether the number of standard fragments of the m physical nodes after the simulation migration is performed satisfies the data equalization condition. If yes, execute S307, and if no, execute S308.
  • the virtual node associated with the m physical nodes is actually migrated according to the migration rule.
  • For the migration method, refer to S305; details are not repeated here.
  • Migrating virtual nodes cannot always make the physical nodes satisfy the data equalization condition, so a virtual node with a large weight coefficient needs to be selected from the m physical nodes for fission and split into two or more virtual nodes.
  • The selection may be: select the virtual node with the largest weight coefficient among all virtual nodes associated with the m physical nodes and split it into two virtual nodes of the same size, each with a weight coefficient equal to 1/2 of that of the original virtual node; or determine the physical node with the largest standard number of fragments, then determine the virtual node with the largest weight coefficient on that physical node, and split that virtual node into two virtual nodes of the same size.
  • the data equalization device updates the mapping relationship between the physical node and the virtual node.
  • Based on the updated mapping relationship between physical nodes and virtual nodes, the data equalization device actually migrates the virtual nodes according to the migration rule, so that each physical node achieves data equalization.
  • For example, the data equalization device determines that Node3 is the physical node with the largest standard number of fragments, selects the virtual node with the largest weight coefficient on Node3, and splits it into two virtual nodes of the same size. Assume the split virtual node is V008 and one of the resulting virtual nodes is V008a.
  • mapping relationship between the physical node and the virtual node after the migration is as follows:
  • the virtual node may be in the form of logical fragmentation or physical fragmentation.
  • the physical fragment corresponds to the page, block, Schema, and Tablespace of the database.
  • the method of migrating physical fragments is highly efficient, has little impact on databases and applications, and is easier to implement lossless migration.
  • the method of migrating the logical slice is highly efficient.
  • FIG. 5 is a schematic structural diagram of a data equalization apparatus according to an embodiment of the present invention.
  • The data equalization apparatus of this embodiment is used to perform the data equalization method of FIG. 2; for the terms and processes involved, refer to the description of the embodiment of FIG. 2.
  • The data equalization device 5 includes: an obtaining module 501, a weight coefficient calculation module 502, a standard fragment number calculation module 503, an equalization determining module 504, and an equalization module 505.
  • the obtaining module 501 is configured to obtain a load degree and n weight values of n weight factors of each virtual node on the m physical nodes in the distributed database system, where m and n are integers greater than 1.
  • The weight coefficient calculation module 502 is configured to obtain a weight coefficient of each virtual node by computing a weighted average of the load degrees of the n weight factors using the n weight values.
  • the standard fragment number calculation module 503 is configured to obtain a standard number of fragments according to a weight coefficient of the virtual node associated with each physical node.
  • the equalization determining module 504 is configured to determine, according to the standard number of fragments corresponding to the m physical nodes, whether the data distribution satisfies the data equalization condition.
  • the equalization module 505 is configured to perform data equalization processing on the m physical nodes if the determination result of the equalization determination module is negative.
  • The equalization module 505 includes: a simulated migration unit, a determining unit, an actual migration unit, and a fission unit.
  • The simulated migration unit is configured to perform simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule.
  • The determining unit is configured to determine whether the m physical nodes after the simulated migration satisfy the data equalization condition.
  • The actual migration unit is configured to actually migrate the virtual nodes associated with the m physical nodes according to the migration rule if the determination result of the determining unit is yes.
  • The fission unit is configured to, if the determination result of the determining unit is negative, select at least one virtual node from the virtual nodes associated with the m physical nodes for fission and, after the fission, actually migrate the virtual nodes associated with the m physical nodes according to the migration rule.
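  • The fission unit is used to:
  • determine the physical node with the largest standard number of fragments among the m physical nodes;
  • determine the virtual node with the largest weight coefficient on that physical node;
  • split the virtual node with the largest weight coefficient into at least two virtual nodes of the same size.
  • The migration rule includes:
  • grouping the virtual nodes associated with the m physical nodes according to weight coefficient;
  • distributing the virtual nodes in each group equally among the m physical nodes, where each physical node is assigned an integer number of virtual nodes within a group;
  • reassigning the virtual nodes remaining in each group after the equal distribution to the m physical nodes.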
  • The data equalization condition includes: L_max < L_average × (1 + α) and L_min > L_average × (1 - α), where "and" denotes a logical AND, L_max is the largest standard number of fragments among the physical nodes, L_average is the average standard number of fragments of the m physical nodes, L_min is the smallest standard number of fragments among the physical nodes, and α is the deviation coefficient, 0 ≤ α ≤ 1.
  • the n weight factors include: one or more of a service level weight factor and a resource level weight factor;
  • the service level weighting factor includes: one or more of an access frequency and a number of records of the business object;
  • the resource level weighting factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
  • FIG. 6 is a schematic structural diagram of a data equalization apparatus according to an embodiment of the present invention.
  • the data equalization apparatus 6 includes a processor 601, a memory 602, and a transceiver 603.
  • the transceiver 603 is configured to transmit and receive data with and from an external device.
  • the number of processors 601 in the data equalization device may be one or more.
  • processor 601, memory 602, and transceiver 603 may be connected by a bus system or other means.
  • The data equalization device 6 can be used to perform the method shown in FIG. 2. For the meaning of the terms involved and for examples, refer to the embodiment corresponding to FIG. 2; details are not repeated here.
  • the program code is stored in the memory 602.
  • the processor 601 is configured to call the program code stored in the memory 602 for performing the following operations:
  • performing, by the processor 601, the data equalization processing on the m physical nodes includes:
  • At least one virtual node is selected from the virtual nodes associated with the m physical nodes for fission, and after the fission, the virtual nodes associated with the m physical nodes are actually migrated according to the migration rule.
  • The selecting, by the processor 601, of at least one virtual node from the virtual nodes associated with the m physical nodes for fission includes:
  • the virtual node with the largest weight coefficient is split into at least two virtual nodes of the same size.
  • the migration rule includes:
  • the virtual nodes remaining after the average allocation of the respective groups are reassigned to the m physical nodes.
  • The grouping, by the processor 601, of the virtual nodes associated with the m physical nodes according to the weight coefficient includes:
  • the weight coefficient interval to which the weight coefficient of each virtual node belongs is determined, and the virtual node is classified into a group corresponding to the weight coefficient interval to which the virtual node belongs.
  • The data equalization condition includes: L_max < L_average × (1 + α) and L_min > L_average × (1 - α), where "and" denotes a logical AND, L_max is the largest standard number of fragments among the physical nodes, L_average is the average standard number of fragments of the m physical nodes, L_min is the smallest standard number of fragments among the physical nodes, and α is the deviation coefficient, 0 ≤ α ≤ 1.
  • the n weight factors include: one or more of a service level weight factor and a resource level weight factor;
  • The service level weight factor includes one or more of an access frequency and a number of records of the service object; the resource level weight factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

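An embodiment of the present invention discloses a data equalization method, including: obtaining the load degrees and n weight values of n weight factors of each virtual node on m physical nodes in a distributed database system, where m and n are integers greater than 1; computing a weighted average of the load degrees of the n weight factors using the n weight values to obtain a weight coefficient of each virtual node; obtaining a standard number of fragments according to the weight coefficients of the virtual nodes associated with each physical node; determining, according to the standard numbers of fragments corresponding to the m physical nodes, whether the data distribution satisfies a data equalization condition; and if not, performing data equalization processing on the m physical nodes. An embodiment of the present invention further discloses a data equalization device. With the present invention, data can be balanced across the physical nodes of a distributed database system and resource allocation can be optimized.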

Description

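Data equalization method and device
This application claims priority to Chinese Patent Application No. 201610511324.0, filed with the Chinese Patent Office on June 30, 2016 and entitled "Data equalization method and device", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of data processing, and in particular to a data equalization method and device for a distributed database system based on consistent hashing.
Background
Distributed database technology is currently a widely used distributed technology in the IT field. It is mainly applied to web page caching, database caching, and similar scenarios to meet users' requirements on the response speed of network systems.
In a distributed database system based on a consistent hashing algorithm, a physical node can be virtualized into multiple virtual nodes, and the virtual nodes are then mapped onto a ring by a hash algorithm, which enlarges the range of hash values on the ring that maps to the physical node. When a physical node is added to or deleted from the distributed database system, the number of virtual nodes associated with each physical node is adjusted so that these numbers tend to be equal, which balances virtual nodes across all physical nodes of the system. However, when the data distribution of the virtual nodes fluctuates considerably, there are two unsatisfactory options. To guarantee normal service, resource headroom on the physical nodes must be considered and physical nodes must be deployed for the largest data volume, which increases cost. If resource headroom is not considered, physical nodes may be overloaded and services may fail.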
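The virtual-node ring described in this background can be sketched as follows. This is only an illustration under stated assumptions: the source does not name a hash function, so MD5 is used here as a stand-in, and the class name `HashRing` and the node names are hypothetical.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring whose points are virtual nodes."""

    def __init__(self, vnode_to_pnode):
        # vnode_to_pnode: virtual node name -> physical node name
        self._owner = dict(vnode_to_pnode)
        self._points = sorted((ring_hash(v), v) for v in self._owner)
        self._hashes = [h for h, _ in self._points]

    def lookup(self, key: str):
        """Return (virtual node, physical node) responsible for a key."""
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, ring_hash(key)) % len(self._points)
        vnode = self._points[idx][1]
        return vnode, self._owner[vnode]


ring = HashRing({"a": "node1", "b": "node1", "c": "node2", "d": "node2"})
vnode, pnode = ring.lookup("record-42")
```

Because a key is routed to a virtual node first, moving a virtual node to another physical node only requires changing the `vnode_to_pnode` mapping, which is what the migration steps later in the description rely on.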
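For the case where the data carried by some virtual nodes in a distributed database system fluctuates, there is currently no effective method for scheduling the physical nodes of the system to achieve data equalization.
Summary
The technical problem to be solved by the embodiments of the present invention is to provide a data equalization method that can solve the prior-art problem of unbalanced data across the physical nodes of a distributed database system.
To solve the foregoing technical problem, an embodiment of the present invention provides a data equalization method, including: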
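The data equalization device first obtains the load degrees and n weight values of n weight factors of each virtual node on m physical nodes in the distributed database system. For each virtual node, each weight factor corresponds to one weight value. A weight factor is a parameter that evaluates a specified dimension of the virtual node; the specified dimension may be a service dimension and/or a resource dimension, or another dimension. The load degree of a weight factor is the ratio of the actual parameter value of the specified dimension to its rated parameter value. For example, if the weight factor is the number of records, the rated number of records is 1,000,000, and the monitored actual number of records is 500,000, the load degree of the number of records is 500,000/1,000,000 = 50%. Every virtual node in the distributed database system has the same number and types of weight factors and the same weight value configuration. For each virtual node in the distributed database system, a weighted average of the load degrees of its n weight factors using the n weight values yields its weight coefficient. For example, a virtual node has four weight factors: memory usage, number of records, access frequency, and CPU usage, with weight values 0.2, 0.3, 0.3, and 0.2 respectively. If the load degrees of the four weight factors obtained by the data equalization device are 0.1, 0.5, 0.6, and 0.8, the weight coefficient of the virtual node is 0.1*0.2 + 0.5*0.3 + 0.6*0.3 + 0.8*0.2 = 0.51. The data equalization device determines the virtual nodes associated with each physical node according to the mapping relationship between physical nodes and virtual nodes, and obtains the standard number of fragments from the associated virtual nodes. For example, for a given physical node, the weight coefficients of all associated virtual nodes are added to obtain the standard number of fragments; or the sum of the weight coefficients of all associated virtual nodes is scaled up by a preset multiple; or a weighted average of the weight coefficients of all associated virtual nodes is taken; or that weighted average is scaled up by a preset multiple. Note that every physical node uses the same method for calculating the standard number of fragments, for example the same multiple and the same weight value configuration for the weighting. After obtaining the standard number of fragments of each of the m physical nodes, the data equalization device determines from the standard numbers of fragments whether the data distribution satisfies the data equalization condition. If not, it migrates virtual nodes on the m physical nodes, moving virtual nodes from physical nodes with a large standard number of fragments to physical nodes with a small one, so that the standard numbers of fragments of the physical nodes are balanced. Here, balanced means that one or more parameter values of each of multiple nodes float within a specified range.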
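In the foregoing embodiment, the weight coefficient of each virtual node is obtained from the load degrees and weight values of its weight factors, the standard number of fragments of each physical node is obtained from the weight coefficients of its virtual nodes, and the standard numbers of fragments are used to determine whether the data distribution on the physical nodes satisfies the data equalization condition. This evaluates the data distribution on the physical nodes more accurately, so that the physical nodes achieve data equalization.
With reference to the first aspect, in a first possible implementation, performing data equalization on the distributed database system includes:
Simulated migration, also called pre-migration, is performed on the virtual nodes associated with the m physical nodes according to a migration rule. Simulated migration means that no migration is actually executed; the virtual node migration process is only simulated according to the migration rule. The migration rule aims to balance the standard numbers of fragments of the m physical nodes, the numbers of virtual nodes, and the weight coefficients of the virtual nodes within the same physical node. The data equalization device determines whether the virtual nodes associated with the m physical nodes after simulated migration satisfy the data equalization condition. If so, the associated virtual nodes are actually migrated according to the migration rule, and after the actual migration the m physical nodes satisfy the data equalization condition. If the m physical nodes do not satisfy the data equalization condition after simulated migration, at least one virtual node is selected from the m physical nodes for fission, and the mapping relationship between physical nodes and virtual nodes is updated. For example, the virtual node with the largest weight coefficient is selected for fission, or the virtual node with the largest weight coefficient on the physical node with the largest standard number of fragments is selected. After fission, the data equalization device actually migrates the virtual nodes associated with the m physical nodes according to the migration rule, so that the m physical nodes achieve data equalization. The preset migration rule may specifically be: migrate one fragment from the physical node with the largest standard number of fragments to the physical node with the smallest standard number of fragments.
In the foregoing embodiment, when the physical nodes cannot satisfy the data equalization condition, some virtual nodes are split and then migrated. This avoids the problem that data equalization cannot be achieved because the granularity of the virtual nodes is too coarse. Because not all virtual nodes need to be split, the scope and number of fissions are effectively controlled and processing overhead is reduced.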
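With reference to the first aspect, in a second possible implementation, selecting at least one virtual node from the virtual nodes associated with the m physical nodes and splitting it into at least two virtual nodes includes: determining the physical node with the largest standard number of fragments, selecting the virtual node with the largest weight coefficient from that physical node, and splitting that virtual node into at least two virtual nodes of the same size.
In the foregoing embodiment, the virtual node with the largest weight coefficient on the physical node with the largest standard number of fragments is split into two virtual nodes of equal size, which keeps the number and scope of fissions to a minimum and reduces processing overhead.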
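The fission step can be sketched as below. This is a minimal illustration, not the patent's implementation: the function name `split_heaviest`, the `_1`/`_2` suffixes, and the sample coefficients are hypothetical, and giving each half exactly 1/2 of the original weight coefficient follows the two-way equal split described elsewhere in this document.

```python
def split_heaviest(mapping, coefficients):
    """Split the heaviest virtual node on the most loaded physical node.

    mapping: physical node -> list of virtual node names (modified in place)
    coefficients: virtual node -> weight coefficient (modified in place)
    Returns (physical node, split virtual node, names of the two halves).
    """
    # Standard number of fragments = sum of weight coefficients per physical node.
    counts = {p: sum(coefficients[v] for v in vs) for p, vs in mapping.items()}
    busiest = max(counts, key=counts.get)
    target = max(mapping[busiest], key=lambda v: coefficients[v])

    half = coefficients.pop(target) / 2
    halves = [f"{target}_1", f"{target}_2"]  # hypothetical naming scheme
    for name in halves:
        coefficients[name] = half
    mapping[busiest] = [v for v in mapping[busiest] if v != target] + halves
    return busiest, target, halves


mapping = {"Node1": ["V001", "V002"], "Node3": ["V007", "V008"]}
coefficients = {"V001": 1.0, "V002": 1.0, "V007": 1.5, "V008": 3.0}
busiest, target, halves = split_heaviest(mapping, coefficients)
```

The split leaves the standard number of fragments of the physical node unchanged; its only effect is to create smaller units that a later migration can move.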
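With reference to the first aspect, in a third possible implementation, the migration rule includes: grouping the virtual nodes associated with the m physical nodes according to weight coefficient. Grouping may be done by setting weight coefficient intervals, by rounding to the nearest integer, by rounding up, by rounding down, and so on; the principle is to ensure that the weight coefficients of the virtual nodes within each group float within a specified range. After grouping, data equalization is performed for each group so that the same number of virtual nodes is mapped to each physical node. If the number of virtual nodes in a group is not an integer multiple of the number of physical nodes m, unallocated virtual nodes remain in that group. The unallocated virtual nodes of every group are collected and reallocated to the m physical nodes, so that both the number of virtual nodes on each physical node and the standard numbers of fragments of the physical nodes are balanced.
In the foregoing embodiment, the virtual nodes associated with the physical nodes are grouped, and within each group the number of virtual nodes per physical node is balanced. This balances both the number of virtual nodes and the data volume across the physical nodes; the balancing process is simple and requires little computation.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation, grouping the virtual nodes associated with the m physical nodes according to weight coefficient includes:
The data equalization device sets multiple non-overlapping weight coefficient intervals, each corresponding to one group; the step sizes of the intervals may be equal. For each virtual node, the device determines which of the weight coefficient intervals its weight coefficient falls into and assigns the virtual node to the group corresponding to that interval.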
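With reference to the first aspect or any of its first to second possible implementations, in a fifth possible implementation, computing a weighted average of the load degrees of the n weight factors using the n weight values to obtain the weight coefficient of each virtual node includes:
The dynamic load degrees of the n weight factors of the virtual node are monitored, and a weighted sum of the monitored dynamic load degrees using the n weight values yields a dynamic weight coefficient. The dynamic load degree of a weight factor is the load degree dynamically monitored on the virtual node within a preset period. For example, if the load degree of CPU usage monitored on the virtual node within one hour is 50%, then 50% is the dynamic load degree of CPU usage.
The statically configured static load degrees of the n weight factors of the virtual node are obtained, and a weighted sum of these static load degrees using the n weight values yields a static weight coefficient. The static load degree of a weight factor is a preconfigured load degree; it is a fixed value that is independent of the actual load degree of the weight factor. For example, if the preconfigured load degree of CPU usage of a virtual node is 60%, then 60% is the static load degree of the weight factor, while the actual load degree of CPU usage of the virtual node may be 55%. Every virtual node has the same weight factor configuration and the same weight configuration for the dynamic and static weight coefficients, which ensures that the data distribution state of every virtual node is evaluated under the same standard. For example, virtual node 1 and virtual node 4 have the same weight factors (CPU usage, disk usage, and number of records) and the same weight configuration: the weight of the dynamic weight coefficient is Ws and the weight of the static weight coefficient is Wd. The data equalization device computes a weighted sum of the dynamic weight coefficient and the static weight coefficient of the virtual node to obtain its weight coefficient.
In the foregoing embodiment, the dynamic and static load degrees of the weight factors of a virtual node are obtained and the two kinds of load degree are weighted to calculate the weight coefficient of the virtual node, which reduces the error in evaluating the data distribution state of the virtual node.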
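With reference to the first aspect, in a sixth possible implementation, the data equalization condition includes: L_max < L_average × (1 + α) and L_min > L_average × (1 - α), where "and" denotes a logical AND, L_max is the largest standard number of fragments among the m physical nodes, L_average is the average standard number of fragments of the m physical nodes, and L_min is the smallest standard number of fragments among the m physical nodes. α is the deviation coefficient, which may be preset, 0 ≤ α ≤ 1.
With reference to the first aspect or any of its first to sixth possible implementations, in a seventh possible implementation, the n weight factors include one or more of a service level weight factor and a resource level weight factor;
the service level weight factor includes one or more of an access frequency and a number of records of the service object;
the resource level weight factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
In the foregoing embodiment, the weight coefficient of a virtual node is calculated from multiple dimensions, so the data distribution of each virtual node can be evaluated accurately.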
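According to a second aspect, an embodiment of the present invention provides a data equalization device, including an obtaining module, a weight coefficient calculation module, a standard fragment number calculation module, an equalization determining module, and an equalization module.
The obtaining module is configured to obtain the load degrees and n weight values of n weight factors of each virtual node on m physical nodes in a distributed database system, where m and n are integers greater than 1;
the weight coefficient calculation module is configured to compute a weighted average of the load degrees of the n weight factors using the n weight values to obtain a weight coefficient of each virtual node;
the standard fragment number calculation module is configured to obtain a standard number of fragments according to the weight coefficients of the virtual nodes associated with each physical node;
the equalization determining module is configured to determine, according to the standard numbers of fragments corresponding to the m physical nodes, whether the data distribution satisfies the data equalization condition;
the equalization module is configured to perform data equalization processing on the m physical nodes if the determination result of the equalization determining module is negative.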
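With reference to the second aspect, in a first possible implementation, the equalization module includes:
a simulated migration unit, configured to perform simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule;
a determining unit, configured to determine whether the m physical nodes after the simulated migration satisfy the data equalization condition;
an actual migration unit, configured to actually migrate the virtual nodes associated with the m physical nodes according to the migration rule if the determination result of the determining unit is yes;
a fission unit, configured to, if the determination result of the determining unit is negative, select at least one virtual node from the virtual nodes associated with the m physical nodes for fission and, after the fission, actually migrate the virtual nodes associated with the m physical nodes according to the migration rule.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the fission unit is configured to:
determine the physical node with the largest standard number of fragments among the m physical nodes;
determine the virtual node with the largest weight coefficient on the physical node with the largest standard number of fragments;
split the virtual node with the largest weight coefficient into at least two virtual nodes of the same size.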
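With reference to the first possible implementation of the second aspect, in a third possible implementation, the migration rule includes:
grouping the virtual nodes associated with the m physical nodes according to weight coefficient; for example, grouping by weight coefficient interval, by rounding to the nearest integer, or by rounding up or down;
distributing the virtual nodes in each group equally among the m physical nodes, where each physical node is assigned an integer number of virtual nodes within a group;
reassigning the virtual nodes remaining in each group after the equal distribution to the m physical nodes.
With reference to the second aspect, in a fourth possible implementation, the data equalization condition includes: L_max < L_average × (1 + α) and L_min > L_average × (1 - α), where "and" denotes a logical AND, L_max is the largest standard number of fragments among the physical nodes, L_average is the average standard number of fragments of the m physical nodes, L_min is the smallest standard number of fragments among the physical nodes, and α is the deviation coefficient, 0 ≤ α ≤ 1.
With reference to the second aspect or any of its first to fourth possible implementations, in a fifth possible implementation, the n weight factors include one or more of a service level weight factor and a resource level weight factor;
the service level weight factor includes one or more of an access frequency and a number of records of the service object;
the resource level weight factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.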
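According to a third aspect, this application provides a data equalization device, including:
one or more processors, a memory, a bus system, a transceiver, and one or more programs, where the processors, the memory, and the transceiver are connected through the bus system;
the one or more programs are stored in the memory and include instructions that, when executed by a terminal, cause the terminal to perform the method of the first aspect or any of its first to seventh possible implementations.
According to a fourth aspect, this application provides a computer-readable storage medium storing one or more programs, where the one or more programs include instructions that, when executed by a data equalization device, cause the data equalization device to perform the method of the first aspect or any of its first to seventh possible implementations.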
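Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a distributed database system according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a data equalization method according to an embodiment of the present invention;
FIG. 3 is another schematic flowchart of a data equalization method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a migration rule according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data equalization device according to an embodiment of the present invention;
FIG. 6 is another schematic structural diagram of a data equalization device according to an embodiment of the present invention.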
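Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. The described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.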
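FIG. 1 is a schematic structural diagram of a distributed database system based on consistent hashing according to an embodiment of the present invention. In this embodiment, the distributed database system includes at least one client, a metadata server, a data node server, and a storage network. The at least one client communicates with the metadata server and the data node server over an IP network. The communication interface between a client and the metadata server or the data node server may be a TCP (Transmission Control Protocol) interface or a UDP (User Datagram Protocol) interface. The metadata server includes the data equalization device. The metadata server may be deployed independently, for example on a minicomputer, an X86 computer, or a PC server, or may be co-located with the data node server. The data node server includes multiple physical nodes, each of which may be a minicomputer, an X86 computer, a PC server, or the like. The data of the physical nodes may be stored on storage media in the storage network. The physical nodes read from and write to the storage network through block IO (Block IO), that is, the storage media are read and written in blocks. A storage medium may be an HDD (Hard Disk Drive), an SSD (Solid State Drive), memory, or the like.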
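The metadata server mainly stores the metadata of the database system. Metadata is information about the organization of data, data domains, and their relationships; in short, metadata is data that describes data. The data equalization device is responsible for the overall distributed management capability of the distributed database system. Besides being deployed in the metadata server, the data equalization device may also be built into each physical node. The distributed database system may be deployed for high reliability in dual-host or cluster mode, which is not limited by the present invention. The data equalization device mainly provides the following functions:
Metadata definition: stores and updates the mapping relationship between virtual nodes and physical nodes, the replica definitions of virtual nodes, the load degrees of weight factors and the method for calculating weight coefficients, the configuration and storage of the deviation coefficient, the fission policy for virtual nodes, and so on.
Route management: manages the routing data of service objects according to the mapping relationship between virtual nodes and physical nodes, and calls the interface of the metadata definition unit.
Replication management: stores and updates the primary/replica replication relationships of virtual nodes, is responsible for controlling replication integrity, and calls the interface of the metadata definition unit.
Node monitoring: monitors service information and node resource information, calls the communication service unit to obtain information from each data service node, and delivers monitoring instructions.
Online migration: provides online migration in units of data fragments, with no impact on services and with high availability during migration. It calls the metadata definition interface to change routing and replication relationships.
Data equalization: balances data among physical nodes according to the data equalization condition.
Metadata synchronization/persistence: synchronizes metadata definition information to each physical node, to the client drivers (Driver), and to the Slaver node of the data equalization device. When metadata changes, the change is notified to these nodes and the metadata is persisted to ensure high data reliability.
Communication service: provides network communication with surrounding network elements (the physical nodes, the client Drivers, and the Slaver node of the data equalization device).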
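An agent (Agent) may be deployed on each of the physical nodes. The Agent exchanges information with the data equalization device (for example, it reports node health information and receives instructions from the data equalization device) and provides self-management for node high availability, such as role degradation when the network is abnormal.
In addition, a Driver is deployed on the client, and the Driver caches routing information. The client can therefore use the cached routing information to make routing decisions and access the corresponding physical node, which prevents the data equalization device from becoming a bottleneck for route queries during service access.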
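FIG. 2 shows a data equalization method for a distributed database system based on consistent hashing according to an embodiment of the present invention. In this embodiment, the method includes:
S201. Obtain the load degrees and n weight values of n weight factors of each virtual node on m physical nodes in the distributed database system, where m and n are integers greater than 1.
Specifically, the data equalization device obtains the load degrees of the n weight factors of each virtual node on the m physical nodes; it can obtain the load degrees of the different weight factors of the virtual nodes through agents deployed on the physical nodes. Every virtual node has the same number and types of weight factors and the same weight value configuration. For example, n = 4 and the four weight factors are the number of records, memory usage, CPU usage, and access frequency, with weight values configured as w1 for the number of records, w2 for memory usage, w3 for CPU usage, and w4 for access frequency. The distributed database system has four physical nodes: physical node 1, physical node 2, physical node 3, and physical node 4. Physical node 1 is associated with virtual node a and virtual node b, physical node 2 with virtual node c and virtual node d, physical node 3 with virtual node e and virtual node f, and physical node 4 with virtual node g and virtual node h. Virtual nodes a through h then all have the above four weight factors, and every virtual node has the same weight value configuration.
A weight factor is a parameter used to evaluate one dimension of a virtual node, which may be a service dimension, a resource dimension, or another dimension; examples are the number of records, CPU usage, access frequency, disk space usage, and memory usage. The load degree of a weight factor is the ratio of its actual parameter value to its rated parameter value. For example, if the weight factor is memory, the rated value is 8 GB, and the actual value is 1 GB, the memory load degree is 1/8 = 0.125.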
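The load degree defined above is a plain ratio, which a short sketch makes concrete. The function name `load_degree` is an assumption for illustration; the numbers are the memory and record-count examples given in this document.

```python
def load_degree(actual: float, rated: float) -> float:
    """Ratio of a weight factor's actual value to its rated value."""
    if rated <= 0:
        raise ValueError("rated value must be positive")
    return actual / rated


# Memory example: 1 GB in use out of a rated 8 GB.
memory_load = load_degree(1, 8)

# Record-count example: 500,000 records out of a rated 1,000,000.
records_load = load_degree(500_000, 1_000_000)
```

Expressing every factor as a ratio puts quantities with different units (records, gigabytes, percentages) on a common scale before they are weighted together.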
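S202. Compute a weighted average of the load degrees of the n weight factors using the n weight values to obtain the weight coefficient of each virtual node.
Specifically, the weight coefficient of any virtual node associated with the m physical nodes is calculated as follows: obtain the load degrees of the n weight factors of the virtual node and the n weight values, and compute the weighted average of the load degrees using the weight values to obtain the weight coefficient of the virtual node.
Continuing the example: for virtual node a, assume the load degrees of its four weight factors are 0.5 for the number of records, 0.8 for memory usage, 0.6 for CPU usage, and 0.6 for access frequency, and the weight values of the four weight factors are 0.1, 0.2, 0.3, and 0.4. The weight coefficient of virtual node a is then 0.5*0.1 + 0.8*0.2 + 0.6*0.3 + 0.6*0.4 = 0.63. For virtual node b, assume the load degrees are 0.8 for the number of records, 0.2 for memory usage, 0.5 for CPU usage, and 0.3 for access frequency. Virtual node b has the same weight value configuration as virtual node a, that is, 0.1, 0.2, 0.3, and 0.4, so the weight coefficient of virtual node b is 0.8*0.1 + 0.2*0.2 + 0.5*0.3 + 0.3*0.4 = 0.39. The same method is used for the other virtual nodes in the distributed database system and is not repeated here.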
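The calculation for virtual nodes a and b can be reproduced with a few lines of code. The function name `weight_coefficient` is hypothetical; the load degrees and weight values are those of the example above.

```python
def weight_coefficient(load_degrees, weight_values):
    """Weighted sum of load degrees; weight_values[i] weights load_degrees[i]."""
    if len(load_degrees) != len(weight_values):
        raise ValueError("each weight factor needs exactly one weight value")
    return sum(x * w for x, w in zip(load_degrees, weight_values))


# Shared weight value configuration: records, memory, CPU, access frequency.
weights = [0.1, 0.2, 0.3, 0.4]

coeff_a = weight_coefficient([0.5, 0.8, 0.6, 0.6], weights)  # virtual node a
coeff_b = weight_coefficient([0.8, 0.2, 0.5, 0.3], weights)  # virtual node b
```

Because the weight values in this example sum to 1, the weighted sum is also the weighted average the text refers to. Floating-point results should be compared with a tolerance rather than with `==`.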
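Optionally, computing a weighted average of the load degrees of the n weight factors using the n weight values to obtain the weight coefficient of each virtual node includes:
monitoring the dynamic load degrees of the n weight factors of the virtual node, and computing a weighted average of the monitored dynamic load degrees using the n weight values to obtain a dynamic weight coefficient, where n ≥ 1 and n is an integer;
obtaining the statically configured static load degrees of the n weight factors of the virtual node, and computing a weighted average of these static load degrees using the n weight values to obtain a static weight coefficient;
computing a weighted average of the dynamic weight coefficient and the static weight coefficient to obtain the weight coefficient of the virtual node.
Specifically, the dynamic load degree of a weight factor is the load degree dynamically monitored on the virtual node within a preset period; it is the actual load degree of the virtual node. For example, if the load degree of CPU usage monitored on the virtual node within one hour is 50%, then 50% is the dynamic load degree of CPU usage. The static load degree of a weight factor is a preconfigured load degree; it is a fixed value that is independent of the actual load degree of the weight factor. For example, if the preconfigured load degree of CPU usage of a virtual node is 60%, then 60% is the static load degree of the weight factor, while the actual load degree of CPU usage of the virtual node may be 55%. Every virtual node has the same weight factor configuration and the same weight configuration for the dynamic and static weight coefficients, which ensures that the data distribution state of every virtual node is evaluated under the same standard. For example, virtual node 1 and virtual node 4 have the same weight factors (CPU usage, disk usage, and number of records) and the same weight configuration: the weight of the dynamic weight coefficient is Ws and the weight of the static weight coefficient is Wd. The data equalization device computes a weighted sum of the dynamic weight coefficient and the static weight coefficient of the virtual node to obtain its weight coefficient.
By obtaining the dynamic and static load degrees of the weight factors of a virtual node and weighting the two kinds of load degree to calculate the weight coefficient of the virtual node, the data equalization device reduces the error in evaluating the data distribution state of the virtual node.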
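A sketch of the dynamic/static combination follows. All numeric values and the names `combined_weight_coefficient`, `w_dynamic`, and `w_static` are hypothetical; the source only states that the two coefficients are weighted together with a configuration shared by all virtual nodes.

```python
def combined_weight_coefficient(dynamic_loads, static_loads, factor_weights,
                                w_dynamic, w_static):
    """Blend the monitored (dynamic) and preconfigured (static) coefficients."""
    dynamic_coeff = sum(x * w for x, w in zip(dynamic_loads, factor_weights))
    static_coeff = sum(x * w for x, w in zip(static_loads, factor_weights))
    return w_dynamic * dynamic_coeff + w_static * static_coeff


# Two weight factors, e.g. CPU usage and number of records (assumed values).
coeff = combined_weight_coefficient(
    dynamic_loads=[0.5, 0.4],   # measured over the monitoring period
    static_loads=[0.6, 0.4],    # preconfigured, independent of actual load
    factor_weights=[0.5, 0.5],
    w_dynamic=0.7,              # assumed weight of the dynamic coefficient
    w_static=0.3,               # assumed weight of the static coefficient
)
```

Choosing `w_dynamic + w_static = 1` keeps the result on the same scale as the individual coefficients; the static part damps short-lived spikes in the monitored values.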
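S203. Obtain the standard number of fragments according to the weight coefficients of the virtual nodes on each physical node.
Specifically, the data equalization device determines the virtual nodes associated with each physical node according to the mapping relationship between physical nodes and virtual nodes. The standard number of fragments of any one of the m physical nodes may be calculated as the sum of the weight coefficients of all virtual nodes associated with that physical node.
Continuing the example, physical node 1 is associated with virtual node a and virtual node b. The weight coefficient of virtual node a is 0.63 and that of virtual node b is 0.39, so the standard number of fragments of physical node 1 equals the sum of the two weight coefficients: 0.63 + 0.39 = 1.02.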
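The per-node summation can be sketched as a dictionary comprehension over the physical-to-virtual mapping. The function name `standard_fragment_counts` and the dictionary layout are assumptions; the values are those of the example above.

```python
def standard_fragment_counts(mapping, coefficients):
    """Sum the weight coefficients of the virtual nodes on each physical node."""
    return {
        pnode: sum(coefficients[v] for v in vnodes)
        for pnode, vnodes in mapping.items()
    }


mapping = {"physical node 1": ["a", "b"]}   # physical node -> virtual nodes
coefficients = {"a": 0.63, "b": 0.39}       # virtual node -> weight coefficient
counts = standard_fragment_counts(mapping, coefficients)
```

The other variants mentioned in the summary (scaling the sum by a preset multiple, or taking a weighted average) would only change the expression inside the comprehension, provided every physical node uses the same one.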
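S204. Determine, according to the standard numbers of fragments corresponding to the m physical nodes, whether the data distribution satisfies the data equalization condition.
Specifically, the data equalization condition is the condition under which the standard numbers of fragments of the m physical nodes are balanced, for example, the standard number of fragments of each of the m physical nodes floats within a specified range.
S205. If not, perform data equalization processing on the m physical nodes.
In the foregoing embodiment, the standard number of fragments of each physical node is calculated from the load degrees and weight values of the weight factors, and the data distribution is evaluated using the standard number of fragments, so that data can be balanced accurately across the physical nodes of the distributed database system.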
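FIG. 3 is another schematic flowchart of a data equalization method according to an embodiment of the present invention. In this embodiment, the method includes:
S301. Obtain the load degrees and n weight values of n weight factors of each virtual node on m physical nodes in the distributed database system.
Specifically, the distributed database system is provided with m physical nodes. Each physical node can periodically report the load degrees of the n weight factors of its associated virtual nodes to the data equalization device. The weight values of the n weight factors can be preset, and each virtual node has the same weight value configuration. The data equalization device can monitor the load degrees of the weight factors through an agent deployed on each physical node. A weight factor of a virtual node is a parameter used to evaluate one dimension of the virtual node, and the load degree of a weight factor is the ratio of the actual parameter value of that dimension to its rated parameter value. One virtual node may have multiple different weight factors, which may belong to multiple dimensions. The number and types of weight factors are the same for every virtual node in the distributed database system, as is the weight value configuration, so that the data distribution of all virtual nodes is compared under the same standard. m and n are integers greater than 1.
Optionally, the n weight factors include one or more of a service level weight factor and a resource level weight factor. The service level weight factor includes one or more of an access frequency and a number of records of the service object, and the resource level weight factor includes one or more of CPU usage, memory usage, disk space usage, and IO interface throughput.
Note that when the data equalization device detects that a new physical node is added to, or an old physical node is deleted from, the distributed database system, it updates the mapping relationship between physical nodes and virtual nodes, triggers acquisition of the n weight factors and n weight values of the virtual nodes on the m physical nodes, and then performs the subsequent data equalization steps. This is because the distributed database system is very likely to stop satisfying the load balance condition after physical nodes are added or deleted.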
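S302. Perform weighted averaging on the loads of the n weight factors and the n weight values to obtain the weight coefficient of each virtual node.
Specifically, the weight coefficient of any virtual node associated with the m physical nodes is calculated from the loads of the n weight factors and the weight value configuration of the n weight factors obtained in S301; all virtual nodes have the same number and types of weight factors and the same weight value configuration. Let the loads of the n weight factors of a virtual node be $x_1, x_2, x_3, \dots, x_n$ and the weight value configuration be $w_1, w_2, w_3, \dots, w_n$, where $w_n$ is the weight value of $x_n$. The weight coefficient of the virtual node is then $x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n$.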
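S303. Obtain a standard shard count according to the weight coefficients of the virtual nodes on each physical node.
Specifically, the standard shard count of any one of the m physical nodes may be calculated as follows: obtain the weight coefficients of the virtual nodes associated with the physical node, and sum these weight coefficients to obtain the standard shard count of the physical node.
For example, physical node 1 is associated with virtual node a and virtual node b. The obtained weight coefficient of virtual node a is 0.9 and that of virtual node b is 1.2, so the standard shard count of physical node 1 equals the sum of the weight coefficients of the associated virtual nodes a and b: 0.9 + 1.2 = 2.1.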
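S304. Determine, according to the standard shard counts of the m physical nodes, whether the data balancing condition is satisfied.
Specifically, the data balancing apparatus determines, according to the standard shard counts of the m physical nodes, whether the distributed database system satisfies the data balancing condition. The data balancing condition requires the standard shard counts of the m physical nodes to fluctuate within a specified range.
In a possible implementation, the data balancing condition is $L_{max} < L_{average} \times (1 + \alpha)$ and $L_{min} > L_{average} \times (1 - \alpha)$, where "and" denotes a logical AND, meaning that both inequalities must hold at the same time; $L_{max}$ is the maximum standard shard count among the physical nodes, $L_{average}$ is the average standard shard count of the m physical nodes, $L_{min}$ is the minimum standard shard count among the physical nodes, and $\alpha$ is a deviation coefficient with $0 \le \alpha \le 1$. The average standard shard count of the m physical nodes equals the sum of their standard shard counts divided by m. For example, m = 4 and the distributed database system includes physical nodes 1, 2, 3 and 4 with standard shard counts 2.5, 3.5, 4.5 and 6.5, respectively. The average standard shard count of the four physical nodes is (2.5 + 3.5 + 4.5 + 6.5)/4 = 4.25, the minimum standard shard count is 2.5 and the maximum is 6.5. The deviation coefficient may be preset and takes a value between 0 and 1.
The data balancing apparatus determines, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies the data balancing condition. If not, S305 is performed; if so, S301 may be performed again.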
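The balancing condition translates directly into a predicate over the list of standard shard counts. The following is a sketch only; the function name `is_balanced` and the deviation coefficients passed in the usage lines are assumptions, since the four-node example above does not fix a value for α.

```python
def is_balanced(shard_counts, alpha):
    """Check Lmax < Laverage * (1 + alpha) and Lmin > Laverage * (1 - alpha)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie in [0, 1]")
    average = sum(shard_counts) / len(shard_counts)
    return (max(shard_counts) < average * (1 + alpha)
            and min(shard_counts) > average * (1 - alpha))

# Four-node example from the text: the average is 4.25.
example = [2.5, 3.5, 4.5, 6.5]
average = sum(example) / len(example)
balanced = is_balanced(example, alpha=0.2)  # alpha = 0.2 is an assumed value
print(average, balanced)  # prints 4.25 False
```

With α = 0.2 the upper bound is 4.25 × 1.2 = 5.1, so the node with a standard shard count of 6.5 violates the condition and S305 would be triggered. Note that both inequalities are strict, so a system in which every standard shard count is 0 is reported as unbalanced by this sketch.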
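S305. Perform simulated migration on the virtual nodes associated with the m physical nodes according to a migration rule.
Specifically, simulated migration means that the virtual nodes are not actually migrated; instead, the process of migrating virtual nodes according to the migration rule is simulated.
Specifically, the migration rule includes grouping the virtual nodes associated with the m physical nodes according to their weight coefficients.
In a possible implementation of the present invention, the grouping method may be as follows: multiple non-overlapping, adjacent weight coefficient intervals are preset according to the distribution range of the weight coefficients in the distributed database system. The weight coefficient intervals may all have the same step, where the step of a weight coefficient interval is the absolute value of the difference between its two endpoints. Each weight coefficient interval corresponds to one group. For any virtual node associated with the m physical nodes, the weight coefficient interval to which the weight coefficient of the virtual node belongs is determined, and the group of the virtual node is determined according to that weight coefficient interval. For example, among the configured weight coefficient intervals, weight coefficient interval 1 is [1.5, 2.5) and corresponds to group 1, and weight coefficient interval 2 is [2.5, 3.5] and corresponds to group 2, where a square bracket indicates that the endpoint is included and a round bracket indicates that it is excluded. If the weight coefficient of a virtual node is 2, the weight coefficient belongs to weight coefficient interval 1 and the virtual node is assigned to group 1.
In another possible implementation, the grouping method may be as follows: the weight coefficient of each virtual node is rounded to the nearest integer (half up), and virtual nodes whose rounded weight coefficients are equal are assigned to the same group.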
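Both grouping variants can be sketched as follows. The virtual node names `v1` to `v3` and their coefficients are hypothetical; the intervals are those of the example above, except that every interval is treated as half-open, which is a simplification made for this sketch. Rounding uses `math.floor(x + 0.5)` because Python's built-in `round` rounds halves to the nearest even integer rather than up.

```python
import math

def group_by_interval(coefficients, intervals):
    """Assign each virtual node to the group whose [low, high) interval contains it."""
    groups = {group_id: [] for group_id in intervals}
    for vnode, value in coefficients.items():
        for group_id, (low, high) in intervals.items():
            if low <= value < high:
                groups[group_id].append(vnode)
                break
    return groups

def group_by_rounding(coefficients):
    """Group virtual nodes whose coefficients round (half up) to the same integer."""
    groups = {}
    for vnode, value in coefficients.items():
        groups.setdefault(math.floor(value + 0.5), []).append(vnode)
    return groups

coefficients = {"v1": 2.0, "v2": 2.6, "v3": 3.4}   # hypothetical values
intervals = {1: (1.5, 2.5), 2: (2.5, 3.5)}
by_interval = group_by_interval(coefficients, intervals)
by_rounding = group_by_rounding(coefficients)
print(by_interval)  # prints {1: ['v1'], 2: ['v2', 'v3']}
print(by_rounding)  # prints {2: ['v1'], 3: ['v2', 'v3']}
```

In this sketch a virtual node whose coefficient falls outside every configured interval is left ungrouped; a real deployment would need intervals that cover the whole distribution range.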
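After all virtual nodes on the m physical nodes have been grouped, each group is processed as follows: the number of virtual nodes that the m physical nodes hold in the group is determined, and the virtual nodes in the group are distributed evenly among the m physical nodes so that each physical node is allocated the same integer number of virtual nodes. If the number of virtual nodes in the group is not an integer multiple of m, some virtual nodes in the group remain after the even distribution; if the number of virtual nodes in the group is less than m, some physical nodes are allocated no virtual node from the group. The remaining virtual nodes of each group are then collected and redistributed among the m physical nodes so that the standard shard count of each physical node satisfies the data balancing condition.
For example, the distributed database system currently has two physical nodes, physical node 1 and physical node 2. Physical node 1 is associated with six virtual nodes and physical node 2 is associated with six virtual nodes, and the distributed database system satisfies the data balancing condition. When the distributed database system needs to be scaled out, for example by adding physical node 3, the system has three physical nodes, which do not satisfy the data balancing condition. The weight coefficients of the virtual nodes associated with the three physical nodes are shown in Table 1:
[Table 1: provided as image PCTCN2017085376-appb-000001 in the original publication]
In Table 1, the average standard shard count of the three physical nodes is 18/3 = 6. Assuming a deviation coefficient of 10%, the maximum standard shard count, 9, is greater than 6 × (1 + 10%) = 6.6, and the minimum standard shard count, 0, is less than 6 × (1 - 10%) = 5.4. Clearly the data balancing condition is not satisfied, and data balancing needs to be performed on the three physical nodes.
All virtual nodes associated with the three physical nodes are grouped according to their weight coefficients. Two weight coefficient intervals are set: weight coefficient interval 1 is [0.5, 1.5) and corresponds to group 1, and weight coefficient interval 2 is [1.5, 2.5) and corresponds to group 2. The virtual nodes in group 1 are a, b, c and g, h, i; the virtual nodes in group 2 are d, e, f and j, k, l. The association between the three physical nodes and the virtual nodes in group 1 is shown in Table 2:
[Table 2: provided as image PCTCN2017085376-appb-000002 in the original publication]
The association between the three physical nodes and the virtual nodes in group 2 is shown in Table 3:
[Table 3: provided as images PCTCN2017085376-appb-000003 and PCTCN2017085376-appb-000004 in the original publication]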
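Table 2 contains six virtual nodes, which need to be distributed evenly among the three physical nodes, two virtual nodes per physical node. For group 1, one virtual node on physical node 1 (any one of a, b and c) and one virtual node on physical node 2 (any one of g, h and i) need to be migrated to physical node 3 so that each physical node is allocated two virtual nodes. The number of virtual nodes is exactly an integer multiple of the number of physical nodes, so no virtual node remains in group 1. For example, the mapping after the virtual nodes in group 1 are migrated is shown in Table 4:
[Table 4: provided as image PCTCN2017085376-appb-000005 in the original publication]
Table 3 contains six virtual nodes, which need to be distributed among the three physical nodes, two virtual nodes per physical node. For group 2, one virtual node on physical node 1 (one of d, e and f) needs to be migrated to physical node 3 and one virtual node on physical node 2 needs to be migrated to physical node 3, so that each physical node is allocated two virtual nodes of group 2. The number of virtual nodes in group 2 is exactly an integer multiple of the number of physical nodes, so no virtual node remains in group 2. For example, the mapping after the virtual nodes in group 2 are migrated is shown in Table 5:
[Table 5: provided as image PCTCN2017085376-appb-000006 in the original publication]
Since no group has remaining virtual nodes, the data balancing apparatus does not need to process any. The mapping between the three physical nodes and their associated virtual nodes is now as shown in Table 6:
[Table 6: provided as images PCTCN2017085376-appb-000007 and PCTCN2017085376-appb-000008 in the original publication]
In Table 6, the standard shard count of each of the three physical nodes is 6, which clearly satisfies the data balancing condition.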
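The group-based migration rule can be simulated with the sketch below. Several points are assumptions made for the illustration: the function name, the policy of keeping a physical node's own virtual nodes in place where possible, the policy of placing leftover virtual nodes on the currently lightest physical node, and the coefficients of virtual nodes i, j and l (chosen to mirror c, d and f, since Table 1 is only available as an image).

```python
def simulate_group_migration(mapping, coefficients, groups):
    """Return a new physical-node -> virtual-nodes mapping; inputs are not modified."""
    nodes = list(mapping)
    m = len(nodes)
    owner = {v: p for p, vnodes in mapping.items() for v in vnodes}
    new_mapping = {p: [] for p in nodes}
    leftovers = []
    for members in groups.values():
        quota = len(members) // m          # whole virtual nodes per physical node
        kept, pool = {}, []
        for p in nodes:
            own = [v for v in members if owner[v] == p]
            kept[p] = own[:quota]          # stay in place, no migration needed
            pool.extend(own[quota:])       # surplus that has to move
        for p in nodes:
            while len(kept[p]) < quota:    # top up nodes below the quota
                kept[p].append(pool.pop())
            new_mapping[p].extend(kept[p])
        leftovers.extend(pool)             # len(members) % m virtual nodes remain
    # Assumed policy: put each leftover on the currently lightest physical node.
    for v in sorted(leftovers, key=coefficients.get, reverse=True):
        lightest = min(nodes, key=lambda p: sum(coefficients[x] for x in new_mapping[p]))
        new_mapping[lightest].append(v)
    return new_mapping

coefficients = {"a": 0.9, "b": 1.0, "c": 1.1, "d": 1.9, "e": 2.0, "f": 2.1,
                "g": 0.9, "h": 1.0, "i": 1.1, "j": 1.9, "k": 2.0, "l": 2.1}
mapping = {1: list("abcdef"), 2: list("ghijkl"), 3: []}
groups = {1: list("abcghi"), 2: list("defjkl")}
result = simulate_group_migration(mapping, coefficients, groups)
counts = {p: sum(coefficients[v] for v in vnodes) for p, vnodes in result.items()}
print(result, counts)
```

With these inputs every physical node ends up with four virtual nodes and all standard shard counts fall between 5.4 and 6.6, the range allowed by a 10% deviation coefficient around the average of 6. Which virtual node of a group is moved is a free choice; the selection in Tables 4 to 6 yields exactly 6 on every node, whereas this sketch simply moves the last ones in list order.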
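In another possible implementation, the migration rule includes the following. The data balancing apparatus obtains the average standard shard count of the m physical nodes, which is the sum of the standard shard counts of the m physical nodes divided by m. According to the average standard shard count, it plans all possible virtual node migration schemes, where each virtual node migration scheme requires the standard shard count of every physical node to approach the average standard shard count. The data balancing apparatus then determines a target virtual node migration scheme among these schemes. The determination may be performed as follows: for each virtual node migration scheme, assuming the scheme has been executed, the data balancing apparatus calculates, for each of the m physical nodes, the sum of the squares of the weight coefficients of its associated virtual nodes. For example, the distributed database system has two physical nodes, physical node 1 and physical node 2. After the current virtual node migration scheme is executed, physical node 1 is associated with virtual node a and virtual node b, whose weight coefficients are 1 and 4; physical node 2 is associated with virtual node c and virtual node d, whose weight coefficients are 2 and 3. The sum of squares for physical node 1 is 1×1 + 4×4 = 17, and the sum of squares for physical node 2 is 2×2 + 3×3 = 13. Next, the average of the sums of squares of the m physical nodes is calculated as the total of the m sums of squares divided by m; in the example, the average is (13 + 17)/2 = 15. Then the sum of the squared differences between each physical node's sum of squares and the average is calculated; in the example, this is $(17 - 15)^2 + (13 - 15)^2 = 8$. The virtual node migration scheme with the smallest sum of squared differences is taken as the target virtual node migration scheme.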
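For example, as shown in Table 1, the data distribution of the three physical nodes in the distributed database system clearly does not satisfy the data balancing condition: the standard shard count of physical node 1 is 9, that of physical node 2 is 9, and that of physical node 3 is 0, so the average standard shard count of the three physical nodes is (9 + 9 + 0)/3 = 6. According to the data balancing requirement, the standard shard count of each physical node should be 6. Assume that the planned virtual node migration schemes are as follows.
Scheme 1: migrate virtual node a (0.9), virtual node b (1) and virtual node c (1.1) from physical node 1 to physical node 3, and migrate virtual node g (0.9), virtual node h (1) and virtual node i (1.1) from physical node 2 to physical node 3. The mapping between physical nodes and virtual nodes after migration is shown in Table 7:
[Table 7: provided as image PCTCN2017085376-appb-000009 in the original publication]
The average of the per-node sums of squares of the virtual node weight coefficients in the distributed system is $(0.9^2 + 1^2 + 1.1^2 + 1.9^2 + 2^2 + 2.1^2) \times 2 / 3 = 10.027$. The sum of the squared deviations of the per-node sums of squares from this average is $(12.02 - 10.027)^2 \times 2 + (6.04 - 10.027)^2 = 23.84$. This value is large, so the dispersion of the virtual nodes is high.
Scheme 2: migrate virtual node b (1) and virtual node e (2) from physical node 1 to physical node 3, and migrate virtual node h (1) and virtual node k (2) from physical node 2 to physical node 3. The mapping between physical nodes and virtual nodes after migration is shown in Table 8:
[Table 8: provided as image PCTCN2017085376-appb-000010 in the original publication]
The average of the per-node sums of squares of the virtual node weight coefficients in the distributed system is $(0.9^2 + 1^2 + 1.1^2 + 1.9^2 + 2^2 + 2.1^2) \times 2 / 3 = 10.027$. The sum of the squared deviations of the per-node sums of squares from this average is $(10.04 - 10.027)^2 \times 2 + (10 - 10.027)^2 = 0.001$. This value is small, so the dispersion of the virtual nodes is low. Scheme 2 balances not only the standard shard counts of the physical nodes but also the number of virtual nodes on each physical node.
The data balancing apparatus takes scheme 2 as the target migration scheme.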
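The scheme selection can be checked numerically with the following sketch. The function names are illustrative, and the coefficients of virtual nodes i, j and l are assumptions chosen to be consistent with the sums of squares quoted above (12.02, 6.04, 10.04 and 10), because Table 1 is only available as an image.

```python
def dispersion(mapping, coefficients):
    """Sum of squared deviations of the per-node sums of squared coefficients."""
    squares = [sum(coefficients[v] ** 2 for v in vnodes) for vnodes in mapping.values()]
    mean = sum(squares) / len(squares)
    return sum((s - mean) ** 2 for s in squares)

def choose_scheme(schemes, coefficients):
    """Pick the candidate mapping with the smallest dispersion."""
    return min(schemes, key=lambda name: dispersion(schemes[name], coefficients))

# Two-node example from the text: sums of squares 17 and 13, average 15.
two_nodes = dispersion({1: ["a", "b"], 2: ["c", "d"]},
                       {"a": 1, "b": 4, "c": 2, "d": 3})
print(two_nodes)  # prints 8.0

coefficients = {"a": 0.9, "b": 1.0, "c": 1.1, "d": 1.9, "e": 2.0, "f": 2.1,
                "g": 0.9, "h": 1.0, "i": 1.1, "j": 1.9, "k": 2.0, "l": 2.1}
schemes = {
    "scheme 1": {1: list("def"), 2: list("jkl"), 3: list("abcghi")},
    "scheme 2": {1: list("acdf"), 2: list("gijl"), 3: list("behk")},
}
for name, candidate in schemes.items():
    print(name, round(dispersion(candidate, coefficients), 3))
best = choose_scheme(schemes, coefficients)
print(best)  # prints scheme 2
```

Both schemes give every physical node a standard shard count of 6, so the plain balancing condition cannot tell them apart; the dispersion measure prefers the scheme that also spreads large and small virtual nodes evenly.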
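S306. Determine, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies the data balancing condition.
Specifically, the data balancing apparatus determines whether the standard shard counts of the m physical nodes after the simulated migration satisfy the data balancing condition. If so, S307 is performed; if not, S308 is performed.
S307. Perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule. For the migration method, refer to S305; details are not repeated here.
S308. Select at least one virtual node from the virtual nodes associated with the m physical nodes and split it.
Specifically, when some virtual nodes on a physical node have a coarse granularity, that is, a large weight coefficient, migrating virtual nodes can never make the physical nodes satisfy the data balancing condition. In this case, a virtual node with a large weight coefficient needs to be selected from the m physical nodes and split into two or more virtual nodes. The selection may be performed as follows: the virtual node with the largest weight coefficient among all virtual nodes associated with the m physical nodes is selected and split into two virtual nodes of equal size, each with a weight coefficient of 1/2 of that of the original virtual node; alternatively, the physical node with the largest standard shard count is determined, the virtual node with the largest weight coefficient on that physical node is determined, and that virtual node is split into two virtual nodes of equal size. After the split, the data balancing apparatus updates the mapping between physical nodes and virtual nodes.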
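The second selection method (most loaded physical node first, then its largest virtual node) can be sketched as below. All node names and coefficients are hypothetical, and the suffixes `a` and `b` for the two halves are a naming assumption. The mapping and coefficient dictionaries are updated in place, which corresponds to updating the mapping after the split.

```python
def split_heaviest(mapping, coefficients):
    """Split, in place, the largest virtual node on the most loaded physical node."""
    counts = {p: sum(coefficients[v] for v in vnodes) for p, vnodes in mapping.items()}
    heaviest = max(counts, key=counts.get)
    target = max(mapping[heaviest], key=coefficients.get)
    half = coefficients.pop(target) / 2          # two halves of equal size
    first, second = target + "a", target + "b"
    position = mapping[heaviest].index(target)
    mapping[heaviest][position:position + 1] = [first, second]
    coefficients[first] = coefficients[second] = half
    return heaviest, first, second

# Hypothetical layout: standard shard counts 2, 2, 3, 2 (average 2.25).
mapping = {"Node1": ["V001", "V002"], "Node2": ["V003", "V004"],
           "Node3": ["V006", "V007", "V008"], "Node4": ["V009", "V010"]}
coefficients = {"V001": 1.0, "V002": 1.0, "V003": 1.0, "V004": 1.0,
                "V006": 1.0, "V007": 0.9, "V008": 1.1, "V009": 1.0, "V010": 1.0}
node, first, second = split_heaviest(mapping, coefficients)

# Migrate one half to the last physical node, as a migration rule might decide.
mapping[node].remove(second)
mapping["Node4"].append(second)
counts = {p: sum(coefficients[v] for v in vnodes) for p, vnodes in mapping.items()}
print(node, first, second, counts)
```

Before the split, moving any whole virtual node off Node3 pushes the receiving node to at least 2.9, above the bound 2.25 × 1.2 = 2.7; after the split and the migration of one half, all four standard shard counts lie between 1.8 and 2.7.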
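S309. After the split, perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
Specifically, the data balancing apparatus performs the actual migration of virtual nodes according to the migration rule, based on the updated mapping between physical nodes and virtual nodes, so that the physical nodes reach data balance.
For example, the distributed database system has four physical nodes. The mapping between each physical node and the virtual nodes, and the weight coefficients of the virtual nodes, are shown in Table 9:
[Table 9: provided as image PCTCN2017085376-appb-000011 in the original publication]
In Table 9, the standard shard count of physical node Node1 is 2, that of physical node Node2 is 2, that of physical node Node3 is 3, and that of physical node Node4 is 2.
Assume a deviation coefficient of 20%. The average standard shard count of the four physical nodes is 2.25, so the maximum standard shard count, 3, is greater than 2.25 × (1 + 20%) = 2.7. The data balancing condition is not satisfied and cannot be satisfied by any migration of virtual nodes, so splitting of a virtual node is triggered. The data balancing apparatus determines the physical node with the largest standard shard count, Node3, and selects from Node3 the virtual node with the largest weight coefficient to split into virtual nodes of equal size. Assume that virtual node V008 is selected and split into V008a and V008b. Virtual node V008b is then migrated to physical node Node4, after which the physical nodes satisfy the data balancing condition. The mapping between physical nodes and virtual nodes after migration is shown in Table 10:
[Table 10: provided as image PCTCN2017085376-appb-000012 in the original publication]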
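It should be noted that a virtual node may take the form of a logical shard or a physical shard; for example, a physical shard corresponds to a Page, Block, Schema or Tablespace of the database. When the virtual nodes are physical shards, migrating physical shards is efficient, has little impact on the database and applications, and makes lossless migration easier to achieve. When the virtual nodes are logical shards, migrating logical shards offers high management efficiency.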
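Referring to FIG. 5, which is a schematic structural diagram of a data balancing apparatus according to an embodiment of the present invention, the data balancing apparatus in this embodiment is configured to perform the data balancing method in FIG. 2. For the terms and processes involved, refer to the description of the embodiment in FIG. 2. The data balancing apparatus 5 includes an obtaining module 501, a weight coefficient calculation module 502, a standard shard count calculation module 503, a balance determining module 504 and a balancing module 505.
The obtaining module 501 is configured to obtain the loads of n weight factors and n weight values of each virtual node on m physical nodes in a distributed database system, where m and n are integers greater than 1.
The weight coefficient calculation module 502 is configured to perform weighted averaging on the loads of the n weight factors and the n weight values to obtain the weight coefficient of each virtual node.
The standard shard count calculation module 503 is configured to obtain a standard shard count according to the weight coefficients of the virtual nodes associated with each physical node.
The balance determining module 504 is configured to determine, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies a data balancing condition.
The balancing module 505 is configured to perform data balancing on the m physical nodes if the determination result of the balance determining module is no.
Optionally, the balancing module 505 includes a simulated migration unit, a determining unit, an actual migration unit and a splitting unit.
The simulated migration unit is configured to perform simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule.
The determining unit is configured to determine whether the m physical nodes after the simulated migration satisfy the data balancing condition.
The actual migration unit is configured to, if the determination result of the determining unit is yes, perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
The splitting unit is configured to, if the determination result of the determining unit is no, select at least one virtual node from the virtual nodes associated with the m physical nodes and split it, and after the split, perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
Optionally, the splitting unit is configured to:
determine the physical node with the largest standard shard count among the m physical nodes;
determine the virtual node with the largest weight coefficient on the physical node with the largest standard shard count; and
split the virtual node with the largest weight coefficient into at least two virtual nodes of equal size.
Optionally, the migration rule includes:
grouping the virtual nodes associated with the m physical nodes according to their weight coefficients;
distributing the virtual nodes in each group evenly among the m physical nodes, where each physical node is allocated an integer number of virtual nodes from the same group; and
redistributing among the m physical nodes the virtual nodes remaining after the even distribution of each group.
Optionally, the data balancing condition includes: $L_{max} < L_{average} \times (1 + \alpha)$ and $L_{min} > L_{average} \times (1 - \alpha)$, where "and" denotes a logical AND, $L_{max}$ is the maximum standard shard count among the physical nodes, $L_{average}$ is the average standard shard count of the m physical nodes, $L_{min}$ is the minimum standard shard count among the physical nodes, and $\alpha$ is a deviation coefficient with $0 \le \alpha \le 1$.
Optionally, the n weight factors include one or more of service-level weight factors and resource-level weight factors;
the service-level weight factors include one or more of access frequency and the record count of service objects; and
the resource-level weight factors include one or more of CPU usage, memory usage, disk space usage and I/O interface throughput.
This embodiment and the embodiments in FIG. 2 to FIG. 4 are based on the same concept and bring the same technical effects. For the specific process, refer to the description of the embodiments in FIG. 2 to FIG. 4; details are not repeated here.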
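Referring to FIG. 6, which is a schematic structural diagram of a data balancing apparatus according to an embodiment of the present invention, the data balancing apparatus 6 in this embodiment includes a processor 601, a memory 602 and a transceiver 603. The transceiver 603 is configured to send data to and receive data from external devices. The data balancing apparatus may include one or more processors 601. In some embodiments of the present invention, the processor 601, the memory 602 and the transceiver 603 may be connected through a bus system or in another manner. The data balancing apparatus 6 may be configured to perform the method shown in FIG. 2. For the meanings of the terms used in this embodiment and for examples, refer to the embodiment corresponding to FIG. 2; details are not repeated here.
The memory 602 stores program code. The processor 601 is configured to invoke the program code stored in the memory 602 to perform the following operations:
obtaining the loads of n weight factors and n weight values of each virtual node on m physical nodes in a distributed database system, where m and n are integers greater than 1;
performing weighted averaging on the loads of the n weight factors and the n weight values to obtain the weight coefficient of each virtual node;
obtaining a standard shard count according to the weight coefficients of the virtual nodes associated with each physical node;
determining, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies a data balancing condition; and
if not, performing data balancing on the m physical nodes.
Optionally, the performing, by the processor 601, of data balancing on the m physical nodes includes:
performing simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule, and determining whether the m physical nodes after the simulated migration satisfy the data balancing condition;
if so, performing actual migration on the virtual nodes associated with the m physical nodes according to the migration rule; and
if not, selecting at least one virtual node from the virtual nodes associated with the m physical nodes and splitting it, and after the split, performing actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
Optionally, the selecting, by the processor 601, of at least one virtual node from the virtual nodes associated with the m physical nodes and splitting it includes:
determining the physical node with the largest standard shard count among the m physical nodes;
determining the virtual node with the largest weight coefficient on the physical node with the largest standard shard count; and
splitting the virtual node with the largest weight coefficient into at least two virtual nodes of equal size.
Optionally, the migration rule includes:
grouping the virtual nodes associated with the m physical nodes according to their weight coefficients;
distributing the virtual nodes in each group evenly among the m physical nodes, where each physical node is allocated an integer number of virtual nodes from the same group; and
redistributing among the m physical nodes the virtual nodes remaining after the even distribution of each group.
Optionally, the grouping, by the processor 601, of the virtual nodes associated with the m physical nodes according to their weight coefficients includes:
obtaining the weight coefficients of the virtual nodes associated with the m physical nodes; and
determining the weight coefficient interval to which the weight coefficient of each virtual node belongs, and assigning the virtual node to the group corresponding to that weight coefficient interval.
Optionally, the data balancing condition includes: $L_{max} < L_{average} \times (1 + \alpha)$ and $L_{min} > L_{average} \times (1 - \alpha)$, where "and" denotes a logical AND, $L_{max}$ is the maximum standard shard count among the physical nodes, $L_{average}$ is the average standard shard count of the m physical nodes, $L_{min}$ is the minimum standard shard count among the physical nodes, and $\alpha$ is a deviation coefficient with $0 \le \alpha \le 1$.
Optionally, the n weight factors include one or more of service-level weight factors and resource-level weight factors;
the service-level weight factors include one or more of access frequency and the record count of service objects; and the resource-level weight factors include one or more of CPU usage, memory usage, disk space usage and I/O interface throughput.
A person of ordinary skill in the art will understand that all or some of the processes of the methods in the foregoing embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present invention and certainly is not intended to limit the scope of the claims of the present invention. A person of ordinary skill in the art will understand that implementations of all or some of the processes of the foregoing embodiments, and equivalent variations made in accordance with the claims of the present invention, still fall within the scope of the invention.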

Claims (13)

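  1. A data balancing method, characterized by comprising:
    obtaining the loads of n weight factors and n weight values of each virtual node on m physical nodes in a distributed database system, wherein m and n are integers greater than 1;
    performing weighted averaging on the loads of the n weight factors and the n weight values to obtain a weight coefficient of each virtual node;
    obtaining a standard shard count according to the weight coefficients of the virtual nodes associated with each physical node;
    determining, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies a data balancing condition; and
    if not, performing data balancing on the m physical nodes.
  2. The method according to claim 1, characterized in that the performing data balancing on the m physical nodes comprises:
    performing simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule, and determining whether the m physical nodes after the simulated migration satisfy the data balancing condition;
    if so, performing actual migration on the virtual nodes associated with the m physical nodes according to the migration rule; and
    if not, selecting at least one virtual node from the virtual nodes associated with the m physical nodes and splitting it, and after the split, performing actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
  3. The method according to claim 2, characterized in that the selecting at least one virtual node from the virtual nodes associated with the m physical nodes and splitting it comprises:
    determining the physical node with the largest standard shard count among the m physical nodes;
    determining the virtual node with the largest weight coefficient on the physical node with the largest standard shard count; and
    splitting the virtual node with the largest weight coefficient into at least two virtual nodes of equal size.
  4. The method according to claim 3, characterized in that the migration rule comprises:
    grouping the virtual nodes associated with the m physical nodes according to their weight coefficients;
    distributing the virtual nodes in each group evenly among the m physical nodes, wherein each physical node is allocated an integer number of virtual nodes from the same group; and
    redistributing among the m physical nodes the virtual nodes remaining after the even distribution of each group.
  5. The method according to claim 3, characterized in that the grouping the virtual nodes associated with the m physical nodes according to their weight coefficients comprises:
    obtaining the weight coefficients of the virtual nodes associated with the m physical nodes; and
    determining the weight coefficient interval to which the weight coefficient of each virtual node belongs, and assigning the virtual node to the group corresponding to that weight coefficient interval.
  6. The method according to claim 1, characterized in that the data balancing condition comprises: $L_{max} < L_{average} \times (1 + \alpha)$ and $L_{min} > L_{average} \times (1 - \alpha)$, wherein "and" denotes a logical AND, $L_{max}$ is the maximum standard shard count among the physical nodes, $L_{average}$ is the average standard shard count of the m physical nodes, $L_{min}$ is the minimum standard shard count among the physical nodes, and $\alpha$ is a deviation coefficient with $0 \le \alpha \le 1$.
  7. The method according to any one of claims 1 to 6, characterized in that the n weight factors comprise one or more of service-level weight factors and resource-level weight factors;
    the service-level weight factors comprise one or more of access frequency and the record count of service objects; and
    the resource-level weight factors comprise one or more of CPU usage, memory usage, disk space usage and I/O interface throughput.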
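  8. A data balancing apparatus, characterized by comprising:
    an obtaining module, configured to obtain the loads of n weight factors and n weight values of each virtual node on m physical nodes in a distributed database system, wherein m and n are integers greater than 1;
    a weight coefficient calculation module, configured to perform weighted averaging on the loads of the n weight factors and the n weight values to obtain a weight coefficient of each virtual node;
    a standard shard count calculation module, configured to obtain a standard shard count according to the weight coefficients of the virtual nodes associated with each physical node;
    a balance determining module, configured to determine, according to the standard shard counts of the m physical nodes, whether the data distribution satisfies a data balancing condition; and
    a balancing module, configured to perform data balancing on the m physical nodes if the determination result of the balance determining module is no.
  9. The apparatus according to claim 8, characterized in that the balancing module comprises:
    a simulated migration unit, configured to perform simulated migration on the virtual nodes associated with the m physical nodes according to a preset migration rule;
    a determining unit, configured to determine whether the m physical nodes after the simulated migration satisfy the data balancing condition;
    an actual migration unit, configured to, if the determination result of the determining unit is yes, perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule; and
    a splitting unit, configured to, if the determination result of the determining unit is no, select at least one virtual node from the virtual nodes associated with the m physical nodes and split it, and after the split, perform actual migration on the virtual nodes associated with the m physical nodes according to the migration rule.
  10. The apparatus according to claim 9, characterized in that the splitting unit is configured to:
    determine the physical node with the largest standard shard count among the m physical nodes;
    determine the virtual node with the largest weight coefficient on the physical node with the largest standard shard count; and
    split the virtual node with the largest weight coefficient into at least two virtual nodes of equal size.
  11. The apparatus according to claim 9, characterized in that the migration rule comprises:
    grouping the virtual nodes associated with the m physical nodes according to their weight coefficients;
    distributing the virtual nodes in each group evenly among the m physical nodes, wherein each physical node is allocated an integer number of virtual nodes from the same group; and
    redistributing among the m physical nodes the virtual nodes remaining after the even distribution of each group.
  12. The apparatus according to claim 8, characterized in that the data balancing condition comprises: $L_{max} < L_{average} \times (1 + \alpha)$ and $L_{min} > L_{average} \times (1 - \alpha)$, wherein "and" denotes a logical AND, $L_{max}$ is the maximum standard shard count among the physical nodes, $L_{average}$ is the average standard shard count of the m physical nodes, $L_{min}$ is the minimum standard shard count among the physical nodes, and $\alpha$ is a deviation coefficient with $0 \le \alpha \le 1$.
  13. The apparatus according to any one of claims 8 to 12, characterized in that the n weight factors comprise one or more of service-level weight factors and resource-level weight factors;
    the service-level weight factors comprise one or more of access frequency and the record count of service objects; and
    the resource-level weight factors comprise one or more of CPU usage, memory usage, disk space usage and I/O interface throughput.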
PCT/CN2017/085376 2016-06-30 2017-05-22 Data balancing method and apparatus WO2018000991A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP17818988.2A EP3467652B1 (en) 2016-06-30 2017-05-22 Data balancing method and device
BR112018077132-5A BR112018077132A2 (pt) 2016-06-30 2017-05-22 método e dispositivo de balanceamento de dados
ZA201900538A ZA201900538B (en) 2016-06-30 2019-01-25 Data balancing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610511324.0 2016-06-30
CN201610511324.0A CN107562531B (zh) 2016-06-30 2016-06-30 一种数据均衡方法和装置

Publications (1)

Publication Number Publication Date
WO2018000991A1 true WO2018000991A1 (zh) 2018-01-04

Family

ID=60785839

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085376 WO2018000991A1 (zh) 2016-06-30 2017-05-22 Data balancing method and apparatus

Country Status (5)

Country Link
EP (1) EP3467652B1 (zh)
CN (1) CN107562531B (zh)
BR (1) BR112018077132A2 (zh)
WO (1) WO2018000991A1 (zh)
ZA (1) ZA201900538B (zh)


Also Published As

Publication number Publication date
BR112018077132A2 (pt) 2019-04-30
ZA201900538B (en) 2019-10-30
EP3467652A1 (en) 2019-04-10
EP3467652A4 (en) 2019-05-15
CN107562531A (zh) 2018-01-09
CN107562531B (zh) 2020-10-09
EP3467652B1 (en) 2024-06-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17818988

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017818988

Country of ref document: EP

Effective date: 20190103

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112018077132

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112018077132

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20181226