CN116991580A - Distributed database system load balancing method and device - Google Patents
- Publication number
- CN116991580A CN202310934592.3A
- Authority
- CN
- China
- Prior art keywords
- load
- data
- range
- node
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
- G06F9/5055—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and a device for load balancing in a distributed database system, belonging to the technical field of distributed databases. The method optimizes the ordering rules for hotspot data, where the index information required by the ordering rules includes: load pressure index set data; storage node processing capability index set data; historical load data; and newly added statistical information on the load balancing scheduling service. Based on this information, the candidate queue of range replicas to migrate out and the candidate queue of migration target storage nodes are each sorted by priority. By optimizing the rules for selecting the range replicas to migrate out and the nodes to migrate into, the invention achieves efficient load balancing and thereby improves the read-write performance of the distributed database system under high load pressure.
Description
Technical Field
The invention relates to the technical field of distributed databases, in particular to a method and a device for balancing loads of a distributed database system.
Background
Load balancing is one of the factors that must be considered when designing a distributed system architecture. It generally refers to distributing requests or data evenly across multiple operating units for execution. A common internet distributed architecture is divided into a client layer, a reverse proxy (nginx) layer, a site layer, a service layer and a data layer, and each layer implements load balancing with a different strategy:
load balancing from the client layer to the reverse proxy layer is achieved through DNS round-robin;
load balancing from the reverse proxy layer to the site layer is achieved through nginx;
load balancing from the site layer to the service layer is achieved through a service connection pool;
load balancing at the data layer must consider both data balancing and request balancing; the common approaches are horizontal partitioning by range and horizontal partitioning by hash.
The existing load balancing implementation for the distributed database KaiwuDB is based on horizontal partitioning by range. KaiwuDB first constructs keys from the user data table and logically divides the key space horizontally into multiple shards according to key value range; each shard is called a Range. Each Range has multiple replicas (the number of replicas is configurable), which are distributed across different cluster nodes and kept strongly consistent by the Raft protocol. This provides high availability and fault tolerance at the partition level, while load balancing is achieved by scheduling the placement of Ranges. To achieve load balancing, the KaiwuDB background service may perform range splitting and range migration repeatedly.
Range splitting: the leader node of the range computes an appropriate key as the split point, initiates the split through the Raft protocol, and updates the range metadata to complete the split. The replicas of the newly split range inherit the node distribution of the parent range's replicas. Splitting alone does not balance load, but it produces finer-grained ranges, which KaiwuDB's balancing scheduler can then migrate to spread read-write traffic across other nodes, achieving horizontal scaling and load balancing.
Range migration: the migration process first adds a new replica B to the range's Raft group; replica B then catches up with the primary replica's data by replaying the log; after synchronization completes, the range metadata is updated and the source replica A is deleted. Through range migration, load pressure moves from node A to node B, thereby achieving load balancing.
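The three migration steps described above can be sketched as follows. This is a minimal in-memory model and not KaiwuDB code: `RangeGroup`, `migrate_replica` and the list-based log are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RangeGroup:
    """Hypothetical stand-in for one range and its Raft group."""
    leader_log: list                              # entries held by the primary replica
    replicas: dict = field(default_factory=dict)  # node id -> replayed log entries


def migrate_replica(group: RangeGroup, source: str, target: str) -> None:
    """Move one replica from `source` to `target` in the order described above."""
    if source not in group.replicas:
        raise ValueError(f"{source} holds no replica of this range")
    # Step 1: add the new replica to the group (it starts empty).
    group.replicas[target] = []
    # Step 2: replay the log until the new replica matches the primary.
    for entry in group.leader_log:
        group.replicas[target].append(entry)
    assert group.replicas[target] == group.leader_log
    # Step 3: update membership metadata by dropping the source replica.
    del group.replicas[source]


group = RangeGroup(
    leader_log=["put k1", "put k2"],
    replicas={"A": ["put k1", "put k2"], "C": ["put k1", "put k2"]},
)
migrate_replica(group, source="A", target="B")
print(sorted(group.replicas))  # ['B', 'C']
```

The source replica is removed only after the new one has caught up, so the number of up-to-date replicas never drops during the move.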
In summary, KaiwuDB's existing load balancing strategy splits and migrates ranges based on range load pressure, and the existing implementation has the following disadvantages:
1. the only load pressure index used for range splitting and migration is QPS, which reflects pressure only in query scenarios;
2. replica placement balances only data volume and does not consider the processing capability of the storage nodes;
3. load balancing considers only historical pressure, so migration may shift pressure all at once onto an idle node, causing that node's pressure to surge and the balancing process to oscillate;
4. when a storage node is decommissioned or goes down, load balance must be considered while replicas are replenished; otherwise pressure may concentrate on one storage node and trigger a chain reaction of log backlog, excessive memory use and node OOM, a domino effect in which nodes are killed by the system one after another.
Disclosure of Invention
The technical task of the invention is to address the above defects by providing a method and a device for load balancing in a distributed database system. By optimizing the rules for selecting the range replicas to migrate out and the nodes to migrate into, efficient load balancing can be achieved, thereby improving the read-write performance of the distributed database system under high load pressure.
The technical scheme adopted for solving the technical problems is as follows:
a method for load balancing in a distributed database system, which optimizes the ordering rules for hotspot data, where the index information required by the ordering rules includes: load pressure index set data; storage node processing capability index set data; historical load data; and newly added statistical information on the load balancing scheduling service;
based on this information, the candidate queue of range replicas to migrate out and the candidate queue of migration target storage nodes are each sorted by priority;
the migrate-out priority of a hotspot range replica is calculated according to the ordering rule for hotspot range replicas, where the calculation rule is: compute a composite index value for each attribute, then compare attribute by attribute in priority order:
load attribute: value × weight = composite index value; the higher the value, the hotter the range data;
resource attribute: value × weight = composite index value; the higher the value, the scarcer the resources of the node holding the range data, and the more urgently the range replica should be migrated out;
service attribute: a net outflow of ranges and a high success rate indicate that the range is more likely to be migrated out;
the priority of a migration target storage node is calculated according to the ordering rule for migration target nodes, where the calculation rule is: compute a composite index value for each attribute, then compare attribute by attribute in priority order:
resource attribute: value × weight = composite index value; the higher the value, the more abundant the node's resources and the better suited it is to receive a new replica as a target node;
load attribute: value × weight = composite index value; the higher the value, the hotter the range data;
service attribute: a net inflow of range replicas and a high success rate indicate that the node is more feasible as a migration target.
The method improves the rules for selecting storage nodes during load balancing scheduling, expands the evaluation indexes to cover storage node resources, read-write pressure and performance, and optimizes load balancing scheduling for the distributed system. For storage read-write services, the current state of each storage node and its expected future pressure are considered together when choosing where to place replicas of newly written data and which replica serves reads, so that contention for resources caused by concentrated service pressure is relieved as far as possible.
Preferably, the load pressure index set data includes: QPS; WPS; the QPS and WPS of service load that is pending migration, as recorded in the statistical information; and the pressure peak of the future load IO flow curve fitted from the statistical information.
The load pressure indexes include the real-time load indexes QPS (queries per second) and WPS (writes per second), together with processing capability indexes such as the CPU core count and performance of the machine hosting the node, the hard disk capacity of the storage node and the disk speed. Using multiple indexes gives a more complete basis for judging whether the load of the current storage node needs to be offloaded.
Preferably, the elements of the storage node processing capability index set include: storage node cache capacity, stored data volume, number of schedulable threads, CPU core count and busyness of the machine hosting the node, memory usage percentage of the machine hosting the node, and the remaining capacity and busyness of the hard disk on the node's storage path.
Preferably, the historical load data includes the read-write data access volume, QPS/WPS peaks, number of active ranges and number of active replicas over the most recent 1 h / 1 min / 1 s. The load handled by each storage node in past periods is recorded, and a machine learning algorithm predicts the load IO flow curve for the coming hour, providing more reliable advice for electing the migration target node.
Further, the newly added statistical information records historical load data and is used to fit a future load IO flow curve. The curve fitting algorithm fits the IO flow curve with a polynomial function, trains on the data of the past 24 hours based on a Bayesian method and a minimum likelihood function, and predicts the IO flow curve for the next 12 hours. To improve the accuracy of the IO flow curve estimate, training and curve fitting are repeated periodically at 12-hour intervals.
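The source does not give the fitting procedure in detail, so the following is only one plausible reading, sketched under stated assumptions: a polynomial is fitted to 24 hourly samples, and a zero-mean Gaussian prior on the coefficients is used as the Bayesian element, which makes the estimate equal to ridge-regularized least squares. The function names, the polynomial degree, the prior strength and the synthetic data are all hypothetical.

```python
def fit_polynomial(xs, ys, degree=3, prior_precision=1e-6):
    """Fit polynomial coefficients under a zero-mean Gaussian prior.

    Solves (A^T A + prior_precision * I) w = A^T y, where A is the
    Vandermonde matrix of xs. Returns coefficients, lowest order first.
    """
    n = degree + 1
    rows = [[x ** j for j in range(n)] for x in xs]
    # Normal equations with the prior added to the diagonal.
    m = [[sum(r[i] * r[j] for r in rows) + (prior_precision if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** j for j, c in enumerate(w))


# Synthetic history: one sample per hour for 24 hours, time scaled to days.
xs = [h / 24 for h in range(24)]
ys = [100 + 240 * x for x in xs]          # made-up QPS + WPS readings
w = fit_polynomial(xs, ys)
# Forecast the next 12 hours and take the peak as the "future IO flow peak".
peak = max(predict(w, h / 24) for h in range(24, 36))
print(round(peak))
```

Polynomial extrapolation can diverge quickly outside the training window, which may be one reason the source refits every 12 hours; a production fit would also need to choose the degree and prior from real data.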
Preferably, the newly added statistical information on the load balancing scheduling service includes the number and data volume of ranges migrated, the migration success rate and the average migration response time over the most recent 1 h / 1 min.
Adding statistical information on the load balancing scheduling service prevents load pressure that is still in an intermediate (in-migration) state from concentrating on one or a few nodes, makes the load balancing process smoother, and reduces jitter in storage node load pressure.
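One way to read this: when choosing a target, the scheduler counts load that is already on its way to a node, not only the load the node serves now. A minimal sketch follows, assuming hypothetical field names (`qps`, `wps`, `pending_qps`, `pending_wps`) and a plain unweighted sum; the source does not specify the data layout.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    """Hypothetical per-node counters; the field names are assumptions."""
    name: str
    qps: float                  # measured queries per second
    wps: float                  # measured writes per second
    pending_qps: float = 0.0    # QPS of ranges currently migrating in
    pending_wps: float = 0.0    # WPS of ranges currently migrating in

    def effective_load(self) -> float:
        # Count in-flight migrations so an idle node is not picked repeatedly.
        return self.qps + self.wps + self.pending_qps + self.pending_wps


def pick_target(nodes, range_qps, range_wps):
    """Choose the least-loaded node and record the load that will arrive."""
    target = min(nodes, key=lambda n: n.effective_load())
    target.pending_qps += range_qps
    target.pending_wps += range_wps
    return target


nodes = [
    NodeLoad("n1", qps=900, wps=300),
    NodeLoad("n2", qps=100, wps=50),
    NodeLoad("n3", qps=400, wps=100),
]
first = pick_target(nodes, range_qps=300, range_wps=100)
second = pick_target(nodes, range_qps=300, range_wps=100)
print(first.name, second.name)  # n2 n3
```

Without the pending counters both decisions would land on `n2`, which is the surge-and-oscillate pattern the statistics are meant to avoid.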
Preferably, the ordering rule for hotspot data ranges is shown in Table 1-1 below:
TABLE 1-1
Attribute | Index name | Priority | Value | Weight |
Load | QPS (queries processed per second) | High | Measured QPS | 0.5 |
Load | WPS (write requests per second) | High | Measured WPS | 0.3 |
Load | QPS pending migration | High | Predicted QPS | 0.05 |
Load | WPS pending migration | High | Predicted WPS | 0.10 |
Load | Future IO flow peak | High | QPS + WPS value of the fitted curve | 0.05 |
Resource | Storage node space utilization | Medium | Used capacity / total capacity | 0.4 |
Resource | CPU busyness of the machine hosting the node | Medium | CPU busyness | 0.2 |
Resource | Hard disk busyness | Medium | Disk statistics util | 0.2 |
Resource | Memory utilization of the machine hosting the node | Medium | Used memory / total memory | 0.2 |
Service | Net inflow of ranges | Low | Ranges migrated in minus ranges migrated out | - |
Service | Migration success rate | Low | Successful migrations / requested migrations | 1 |
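As a worked illustration of Table 1-1, the sketch below computes a composite index per attribute and sorts ranges by comparing attributes in priority order. Several points are assumptions rather than statements from the source: the metric key names are hypothetical, the composite is taken as the sum of value × weight within one attribute, inputs are assumed to be normalized to [0, 1], and the service attribute uses only the success rate because the table gives no weight for net inflow.

```python
# Weights copied from Table 1-1. The metric keys are hypothetical names.
LOAD_WEIGHTS = {"qps": 0.5, "wps": 0.3, "qps_pending": 0.05,
                "wps_pending": 0.10, "future_io_peak": 0.05}
RESOURCE_WEIGHTS = {"space_used": 0.4, "cpu_busy": 0.2,
                    "disk_busy": 0.2, "mem_used": 0.2}


def composite(values, weights):
    """Composite index of one attribute: sum of value x weight (assumed)."""
    return sum(values[name] * w for name, w in weights.items())


def hotspot_key(r):
    """Sort key compared attribute by attribute in priority order:
    load (high), resource (medium), then service (low)."""
    return (composite(r["load"], LOAD_WEIGHTS),
            composite(r["resource"], RESOURCE_WEIGHTS),
            r["success_rate"])


# Made-up, normalized sample values for two ranges.
ranges = [
    {"name": "r2",
     "load": {"qps": 0.5, "wps": 0.9, "qps_pending": 0.0,
              "wps_pending": 0.0, "future_io_peak": 0.5},
     "resource": {"space_used": 0.9, "cpu_busy": 0.8,
                  "disk_busy": 0.8, "mem_used": 0.8},
     "success_rate": 0.95},
    {"name": "r1",
     "load": {"qps": 0.8, "wps": 0.4, "qps_pending": 0.0,
              "wps_pending": 0.0, "future_io_peak": 0.6},
     "resource": {"space_used": 0.7, "cpu_busy": 0.5,
                  "disk_busy": 0.5, "mem_used": 0.5},
     "success_rate": 0.90},
]
ranked = sorted(ranges, key=hotspot_key, reverse=True)
print([r["name"] for r in ranked])  # ['r1', 'r2']
```

Here `r1` has a load composite of about 0.55 against 0.545 for `r2`, so it ranks first even though `r2` sits on a busier node. With exact tuple comparison the lower-priority attributes only break exact ties; a real scheduler might compare within a tolerance instead.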
The ordering rule for migration target nodes is shown in Table 1-2 below:
TABLE 1-2
Preferably, to select the range or node to migrate, 5, 3 and 1 candidate targets are retained in successive rounds based on the calculation rule, which determines the final range or node.
Preferably, the statistical information is updated as follows:
the statistical information on the historical load data of the storage nodes needs no separate update; it can be read from the system metrics when the relevant index information is collected, and this data can be derived by computation from the statistics stored by the KaiwuDB time-series engine;
the load service statistical information can be recorded by adding a statistics structure to the existing structure, building on the data structure provided by the original metric interface; it is updated when a storage node selects a hotspot range replica to migrate out, or when it receives a migrate-in request and processes the incoming range replica.
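A statistics structure of the kind described could look like the following sketch. It is not KaiwuDB's actual structure: `MigrationStats`, its methods and the tuple layout are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque


class MigrationStats:
    """Hypothetical per-store record of recent migrations."""

    def __init__(self, window_seconds=3600.0):
        self.window = window_seconds
        # Each event: (timestamp, bytes_moved, succeeded, duration_seconds)
        self.events = deque()

    def record(self, now, bytes_moved, succeeded, duration):
        """Update on selecting a replica to migrate out or on handling a migrate-in."""
        self.events.append((now, bytes_moved, succeeded, duration))
        self._expire(now)

    def _expire(self, now):
        # Drop events that fell out of the sliding window.
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()

    def summary(self, now):
        """Count, data volume, success rate and mean duration within the window."""
        self._expire(now)
        n = len(self.events)
        if n == 0:
            return {"count": 0, "bytes": 0, "success_rate": None, "avg_seconds": None}
        return {
            "count": n,
            "bytes": sum(e[1] for e in self.events),
            "success_rate": sum(1 for e in self.events if e[2]) / n,
            "avg_seconds": sum(e[3] for e in self.events) / n,
        }


stats = MigrationStats(window_seconds=60.0)   # the 1 min window
stats.record(now=0.0, bytes_moved=64, succeeded=True, duration=2.0)
stats.record(now=30.0, bytes_moved=128, succeeded=False, duration=4.0)
stats.record(now=70.0, bytes_moved=32, succeeded=True, duration=3.0)
result = stats.summary(now=70.0)
print(result)
```

At `now=70.0` the first event has aged out of the 60-second window, so the summary covers two migrations, 160 bytes, a success rate of 0.5 and an average of 3.5 seconds. A second instance with `window_seconds=3600.0` would cover the 1 h window.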
The invention also provides a device for load balancing in a distributed database system, which is used to implement the above load balancing method.
Compared with the prior art, the method and the device for load balancing in a distributed database system have the following beneficial effects:
the method expands the set of load pressure evaluation indexes and supports load balancing in more pressure scenarios; it is not limited to query scenarios and also covers load scenarios such as data writing, node addition and node removal;
the method adds statistical information that records the load handled by each storage node in past periods, trains a machine learning algorithm on this history to fit the load IO flow curve for the coming hour, and uses it to weight the scheduling of migration target storage nodes, improving the cost-effectiveness of load balancing scheduling;
the method adds statistical information on the load balancing scheduling service, so that service pressure still in migration is counted in a storage node's load pressure and used to evaluate the node's future load; this removes the blind spot that other load balancing schedulers would otherwise have in subsequent migrations, and avoids jitter in the pressure curves of one or more nodes that unexpectedly serve as the optimal migrate-in or migrate-out nodes;
the method effectively reduces the read-write pressure on heavily loaded nodes; at the same time, a target node that is idle and performs better completes range replica migration faster, so range migration is quick and effective and extra writes of replica data are reduced.
The method builds on the existing service flow, requires no changes to interfaces such as range splitting and migration, and is fully compatible with existing applications.
Drawings
Fig. 1 is a schematic diagram of a method for implementing load balancing of a distributed database system according to an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples.
The embodiment of the invention provides a method for load balancing in a distributed database system, which optimizes the ordering rules for hotspot data, where the index information required by the ordering rules includes: load pressure index set data; storage node processing capability index set data; historical load data; and newly added statistical information on the load balancing scheduling service.
Load pressure indexes include the real-time load indexes QPS (queries per second) and WPS (writes per second), together with processing capability indexes such as the CPU core count and performance of the machine hosting the node, the hard disk capacity of the storage node and the disk speed; using multiple indexes gives a more complete basis for judging whether the load of the current storage node needs to be offloaded;
statistical information is added to record the load handled by each storage node in past periods, and a machine learning algorithm predicts the load IO flow curve for the coming hour, providing more reliable advice for electing the migration target node;
statistical information on the load balancing scheduling service is added to prevent load pressure that is still in an intermediate (in-migration) state from concentrating on one or a few nodes, making the load balancing process smoother and reducing jitter in storage node load pressure.
Ultimately, fast and effective load balancing improves the read-write performance of the distributed database system under high load pressure.
The specific implementation is as follows:
Whether balancing leases or balancing replicas, the hotspot data ranges must first be obtained and sorted. The method optimizes the ordering rules for hotspot data.
The index information required by the optimized ordering rules includes: load pressure index set data and storage node processing capability index set data;
building on the existing QPS index, the load pressure index set is expanded to: QPS; WPS; the QPS and WPS of service load that is pending migration, as recorded in the statistical information; and the pressure peak of the future load IO flow curve fitted from the statistical information;
the elements of the storage node processing capability index set include: storage node cache capacity, stored data volume, number of schedulable threads, CPU core count and busyness of the machine hosting the node, memory usage percentage of the machine hosting the node, remaining capacity and busyness of the hard disk on the node's storage path, and so on;
the historical load data includes the read-write data access volume, QPS/WPS peaks, number of active ranges, number of active replicas and so on over the most recent 1 h / 1 min / 1 s;
the future load IO flow curve fitting algorithm fits the IO flow curve with a polynomial function, trains on the data of the past 24 hours based on a Bayesian method and a minimum likelihood function, and predicts the IO flow curve for the next 12 hours. To improve the accuracy of the IO flow curve estimate, training and curve fitting are repeated periodically at 12-hour intervals.
The load service statistical information includes the number and data volume of replicas migrated, the migration success rate, the average migration response time and so on over the most recent 1 h / 1 min;
(1) The ordering rule for hotspot data ranges is as follows:
TABLE 1-1
Calculation rule: compute a composite index value for each attribute, then compare attribute by attribute in priority order:
load attribute: value × weight = composite index value; the higher the value, the hotter the range data;
resource attribute: value × weight = composite index value; the higher the value, the scarcer the resources of the node holding the range data, and the more urgently the range replica should be migrated out;
service attribute: a net outflow of ranges and a high success rate indicate that the range replica is more likely to be migrated out.
(2) The ordering rule for migration target nodes is as follows:
TABLE 1-2
Attribute | Index name | Priority | Value | Weight |
Load | QPS (queries processed per second) | High | Measured QPS | 0.3 |
Load | WPS (write requests per second) | High | Measured WPS | 0.5 |
Load | QPS pending migration | High | Predicted QPS | 0.05 |
Load | WPS pending migration | High | Predicted WPS | 0.10 |
Load | Future IO flow peak | High | QPS + WPS value of the fitted curve | 0.05 |
Resource | Storage node free space rate | High | Remaining capacity / total capacity | 0.4 |
Resource | CPU idleness of the machine hosting the node | High | CPU count 100 - CPU busyness | 0.2 |
Resource | Hard disk idleness | High | 1 - disk statistics util | 0.2 |
Resource | Free memory rate of the machine hosting the node | High | Remaining memory / total memory | 0.2 |
Service | Net outflow of ranges | Low | Ranges migrated out minus ranges migrated in | - |
Service | Migration success rate | Low | Successful migrations / requested migrations | 1 |
Calculation rule: compute a composite index value for each attribute, then compare attribute by attribute in priority order:
resource attribute: value × weight = composite index value; the higher the value, the more abundant the node's resources and the better suited it is to receive a new replica as a target node;
load attribute: value × weight = composite index value; the higher the value, the hotter the range data;
service attribute: a net inflow of ranges and a high success rate indicate that the node is more feasible as a migration target.
(3) Selecting the range or node to migrate: based on the calculation rules in (1) and (2), candidates are filtered successively by the resource attribute, the load attribute and the service attribute indexes, retaining 5, 3 and 1 candidate targets respectively, which determines the final range or node to migrate.
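The successive 5, 3, 1 filtering can be sketched as a small funnel. The composite index values below are made up, the names are hypothetical, and preferring cooler nodes in the load stage is an assumption for target selection; the source states only that a higher load index means hotter data.

```python
def funnel(candidates, stages):
    """Keep the best `keep` candidates at each stage, in stage order.

    `stages` is a list of (score_function, higher_is_better, keep).
    """
    remaining = list(candidates)
    for score, higher_is_better, keep in stages:
        remaining.sort(key=score, reverse=higher_is_better)
        remaining = remaining[:keep]
    return remaining


# Hypothetical precomputed composite index values per node.
nodes = [
    {"name": "n1", "resource": 0.90, "load": 0.70, "service": 0.99},
    {"name": "n2", "resource": 0.85, "load": 0.20, "service": 0.95},
    {"name": "n3", "resource": 0.80, "load": 0.30, "service": 0.99},
    {"name": "n4", "resource": 0.75, "load": 0.10, "service": 0.90},
    {"name": "n5", "resource": 0.60, "load": 0.40, "service": 0.97},
    {"name": "n6", "resource": 0.30, "load": 0.05, "service": 1.00},
]
stages = [
    (lambda n: n["resource"], True, 5),   # most free resources first
    (lambda n: n["load"], False, 3),      # assumption: cooler nodes preferred
    (lambda n: n["service"], True, 1),    # highest service index wins
]
winner = funnel(nodes, stages)[0]
print(winner["name"])  # n3
```

The resource stage drops `n6`, the load stage keeps `n4`, `n2` and `n3`, and the service stage picks `n3`. For selecting a hotspot range the same funnel would presumably run with the stage order of rule (1), load first.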
Updating statistical information: the statistical information on the historical load data of the storage nodes needs no separate update; it can be read from the system metrics when the relevant index information is collected, and this data can be derived by computation from the statistics stored by the KaiwuDB time-series engine. The load service statistical information can be recorded by adding a statistics structure to the store instance structure, building on the data structure provided by the original metric interface; it is updated when a storage node selects a hotspot range replica to migrate out, or when it receives a migrate-in request and processes the incoming range replica.
By evaluating and analyzing more load pressure indexes and combining them with the processing capability of the storage nodes, the method selects suitable hotspot data for migration more accurately and ensures that the migration target node can actually absorb that load. In addition, a machine learning algorithm is introduced to estimate and feed back the effectiveness of load balancing and the node load, so that the load pressure of each node adjusts flexibly and load is balanced across the nodes of the whole cluster, improving the performance stability and reliability of the distributed database under high load.
The embodiment of the invention also provides a device for load balancing in a distributed database system, which is used to implement the load balancing method described in the above embodiment.
Those skilled in the art can readily implement the invention from the above specific embodiments. It should be understood that the invention is not limited to these embodiments. Based on the disclosed embodiments, those skilled in the art may combine different technical features freely to obtain different technical solutions.
Apart from the technical features described in the specification, everything else is known to those skilled in the art.
Claims (10)
1. The method for balancing the load of the distributed database system is characterized by optimizing the ordering rule of the hotspot data range copies, wherein the ordering rule needs index information comprising: load pressure index set data, storage node processing capability index set data; historical load data; adding statistical information of load balancing scheduling service;
performing migration range copy and migration target storage node candidate queue priority ordering on the information data,
according to the ordering rule of the hotspot data range copy, the migrating priority of the hotspot data range copy is calculated, wherein the calculating rule is as follows: calculating comprehensive index values of the same attribute, and sequentially comparing according to priority order:
the index of the load attribute is ordered, the value is multiplied by the weight = comprehensive index value, and the higher the index value is, the hotter the range data is;
the index ordering of the resource attribute, namely the higher the value multiplied by the weight=the comprehensive index value, the more intense the node resource of the range data is, and the more urgent the range copy is moved out;
the index of the service attribute is ordered, the range quantity flows out cleanly and the success rate is high, which means that the higher the feasibility of the range copy is migrated;
the priority of the migration target storage node is calculated according to the sorting rule of the migration target node, wherein the calculation rule is as follows: calculating comprehensive index values of the same attribute, and sequentially comparing according to priority order:
the index of the resource attribute is ordered, the value is multiplied by the weight = comprehensive index value, and the higher the resource is, the more abundant the resource is, the new copy is suitable for migration, and the resource can be used as a target node;
the index of the load attribute is ordered, the value is multiplied by the weight = comprehensive index value, and the higher the index value is, the hotter the range data is;
the index of the service attribute is ordered, the range quantity flows in cleanly and the success rate is high, and the higher the feasibility of the node serving as the migration target node is.
2. The method of claim 1, wherein the load pressure index set data comprises: QPS, WPS, traffic loads QPS and WPS in preparation for migration recorded in the statistics, and fitting the pressure peak of the future load IO flow curve according to the statistics.
3. The method of distributed database system load balancing according to claim 1 or 2, wherein the elements of the storage node processing capability index set comprise: the cache capacity of the storage node; the amount of stored data; the number of schedulable threads; the number of CPU cores and the CPU busyness of the machine where the node is located; the memory usage percentage of the machine where the node is located; and the remaining capacity and busyness of the hard disk holding the node's storage path.
4. The method of distributed database system load balancing according to claim 3, wherein the historical load data comprises the read and write data access volume, the QPS/WPS peaks, the number of active ranges, and the number of active copies within the most recent 1 h / 1 min / 1 s.
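The windowed statistics in claim 4 can be illustrated with a small sliding-window counter. This is a sketch under stated assumptions: the class name `LoadHistory`, the event representation, and the reading of QPS as read queries per second are hypothetical, and peak tracking and active range or copy counts are left out for brevity.

```python
from collections import deque

# Window lengths in seconds for the most recent 1 h / 1 min / 1 s.
WINDOWS = {"1h": 3600.0, "1min": 60.0, "1s": 1.0}


class LoadHistory:
    """Keeps recent access events and reports average rates per window."""

    def __init__(self):
        # Each event is (timestamp_seconds, is_write); appended in time order.
        self._events = deque()

    def record(self, ts: float, is_write: bool) -> None:
        self._events.append((ts, is_write))

    def _evict(self, now: float) -> None:
        # Drop events that fall outside the longest window.
        horizon = now - max(WINDOWS.values())
        while self._events and self._events[0][0] <= horizon:
            self._events.popleft()

    def rates(self, now: float) -> dict:
        """Average reads/s (qps) and writes/s (wps) over each window."""
        self._evict(now)
        out = {}
        for label, span in WINDOWS.items():
            recent = [w for ts, w in self._events if now - span < ts <= now]
            writes = sum(1 for w in recent if w)
            reads = len(recent) - writes
            out[label] = {"qps": reads / span, "wps": writes / span}
        return out


history = LoadHistory()
history.record(0.5, False)
history.record(99.2, True)
history.record(99.5, False)
history.record(99.9, False)
window_rates = history.rates(now=100.0)
```

Passing `now` explicitly keeps the example deterministic; a real service would more likely read a clock and use bucketed counters rather than storing every event.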
5. The method for load balancing of a distributed database system according to claim 4, wherein the statistical information records the historical load data and is used to fit a future load IO flow curve; the fitting algorithm fits the IO flow curve with a polynomial function, trains on the data of the past 24 hours based on a Bayesian method and a minimum likelihood function, and predicts the IO flow curve for the next 12 hours;
data training and curve fitting are updated periodically at 12-hour intervals.
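One way to picture the curve fitting in claim 5 is Bayesian polynomial regression: a zero-mean Gaussian prior on the coefficients plus Gaussian noise gives a closed-form posterior mean. The sketch below is only one possible reading of the claim, not the patented algorithm. The polynomial degree, the prior precision `alpha`, the noise precision `beta`, the hourly sampling, and the synthetic history are all assumptions.

```python
HISTORY_HOURS = 24   # training window stated in the claim
FORECAST_HOURS = 12  # prediction horizon stated in the claim


def design_row(x: float, degree: int) -> list:
    """Polynomial features [1, x, x^2, ...]."""
    return [x ** k for k in range(degree + 1)]


def solve(a: list, b: list) -> list:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_posterior_mean(ts, ys, degree=3, alpha=1e-6, beta=1.0):
    """Posterior mean of the coefficients: solve (alpha*I + beta*P^T P) w = beta*P^T y."""
    # Time is rescaled to roughly [0, 1) to keep the system well conditioned.
    rows = [design_row(t / HISTORY_HOURS, degree) for t in ts]
    d = degree + 1
    a = [[beta * sum(r[i] * r[j] for r in rows) + (alpha if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [beta * sum(r[i] * y for r, y in zip(rows, ys)) for i in range(d)]
    return solve(a, b)


def predict(coeffs, t: float) -> float:
    """Evaluate the fitted polynomial at hour t."""
    x = t / HISTORY_HOURS
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Synthetic hourly IO flow for the past 24 hours (illustrative data only).
history_t = list(range(HISTORY_HOURS))
history_y = [50.0 + 2.0 * t for t in history_t]
coeffs = fit_posterior_mean(history_t, history_y)
forecast = [predict(coeffs, t)
            for t in range(HISTORY_HOURS, HISTORY_HOURS + FORECAST_HOURS)]
pressure_peak = max(forecast)
```

Refitting every 12 hours, as the claim describes, would amount to calling `fit_posterior_mean` again on the latest 24 hours of samples. The maximum of `forecast` gives a candidate for the pressure peak mentioned in claim 2.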
6. The method for load balancing of a distributed database system according to claim 4, wherein the statistical information of the newly added load balancing scheduling service includes the number and data volume of migrated range copies, the migration success rate, and the average migration response time within the most recent 1 h / 1 min.
7. The method for load balancing of a distributed database system according to claim 1, wherein the ordering rule of the hotspot data range is shown in Table 1-1 below:
TABLE 1-1
and the ordering rule of the migration target node is shown in Table 1-2 below:
TABLE 1-2
8. The method according to claim 1 or 7, wherein, when selecting the range to be migrated or the target node, 5 candidates, then 3 candidates, and then 1 candidate are retained in sequence based on the calculation rule, thereby determining the final range to be migrated or the final target node.
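The 5, 3, 1 narrowing in claim 8 can be read as a staged filter in which each stage sorts the surviving candidates by one attribute and keeps the top few. The sketch below assumes that reading. The per-attribute scores are hypothetical precomputed comprehensive index values, and the stage order (load, resource, service) is an assumption for the range-selection case.

```python
from operator import itemgetter

# Number of candidates retained after each stage, as stated in the claim.
STAGE_SIZES = (5, 3, 1)


def narrow(candidates, scorers):
    """Apply one scorer per stage, keeping the top 5, then 3, then 1."""
    pool = list(candidates)
    for keep, score in zip(STAGE_SIZES, scorers):
        pool = sorted(pool, key=score, reverse=True)[:keep]
    return pool[0] if pool else None


# Hypothetical comprehensive index values per range copy (higher ranks first).
ranges = [
    {"name": "r1", "load": 0.95, "resource": 0.20, "service": 0.90},
    {"name": "r2", "load": 0.90, "resource": 0.80, "service": 0.40},
    {"name": "r3", "load": 0.85, "resource": 0.70, "service": 0.95},
    {"name": "r4", "load": 0.80, "resource": 0.90, "service": 0.60},
    {"name": "r5", "load": 0.75, "resource": 0.10, "service": 0.99},
    {"name": "r6", "load": 0.30, "resource": 0.99, "service": 0.99},
    {"name": "r7", "load": 0.20, "resource": 0.95, "service": 0.99},
]
stages = (itemgetter("load"), itemgetter("resource"), itemgetter("service"))
chosen = narrow(ranges, stages)
```

In this example the load stage keeps `r1` to `r5`, the resource stage keeps `r4`, `r2`, and `r3`, and the service stage selects `r3`. Staging in this way prevents a candidate that is weak on the highest-priority attribute from winning on a later one.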
9. The method of distributed database system load balancing of claim 8, wherein updating the statistical information comprises:
the statistical information of the historical load data of the storage node does not need to be updated separately; the relevant index information can be obtained by accessing the system metrics when it is needed, and this data can also be derived by computation over the statistical information stored by the KaiwuDB time-series engine;
the statistical information of the load balancing service is recorded by adding a statistics structure to the store instance structure, based on the data structure provided by the original metric interface, and is updated when a hotspot range copy to be migrated is selected on a storage node or when a request to migrate a range is received for processing.
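The statistics structure attached to the store instance can be sketched as a small record that is updated on each migration event. The names `MigrationStats`, `Store`, and `record` are hypothetical, and the time windows from claim 6 (1 h / 1 min) are omitted to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationStats:
    """Hypothetical per-store load balancing statistics (assumed layout)."""
    copies_out: int = 0
    copies_in: int = 0
    bytes_moved: int = 0
    attempts: int = 0
    successes: int = 0
    total_response_s: float = 0.0

    def record(self, direction: str, size_bytes: int, ok: bool, response_s: float) -> None:
        """Update on a migration event; direction is "out" or "in"."""
        self.attempts += 1
        self.total_response_s += response_s
        if not ok:
            return
        self.successes += 1
        self.bytes_moved += size_bytes
        if direction == "out":
            self.copies_out += 1
        else:
            self.copies_in += 1

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0

    @property
    def avg_response_s(self) -> float:
        return self.total_response_s / self.attempts if self.attempts else 0.0

    @property
    def net_outflow(self) -> int:
        return self.copies_out - self.copies_in


@dataclass
class Store:
    """Stand-in for the store instance structure that carries the statistics."""
    store_id: int
    migration_stats: MigrationStats = field(default_factory=MigrationStats)


store = Store(store_id=1)
store.migration_stats.record("out", 1024, True, 0.5)   # hotspot copy selected and moved out
store.migration_stats.record("out", 2048, True, 1.0)
store.migration_stats.record("out", 4096, False, 1.5)  # failed attempt
store.migration_stats.record("in", 512, True, 1.0)     # incoming migration request handled
```

The `net_outflow` and `success_rate` properties correspond to the service attribute indexes used when ranking range copies and target nodes.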
10. An apparatus for load balancing a distributed database system, wherein the apparatus is configured to implement the method for load balancing a distributed database system according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310934592.3A CN116991580B (en) | 2023-07-27 | 2023-07-27 | Distributed database system load balancing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116991580A (en) | 2023-11-03 |
CN116991580B (en) | 2024-09-13 |
Family
ID=88520850
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310934592.3A Active CN116991580B (en) | 2023-07-27 | 2023-07-27 | Distributed database system load balancing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116991580B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593347A (en) * | 2012-08-14 | 2014-02-19 | 中兴通讯股份有限公司 | Load balancing method and distributed database system |
WO2017100987A1 (en) * | 2015-12-15 | 2017-06-22 | 上海交通大学 | Embedding implementation method for non-uniform bandwidth virtual data centre based on congestion avoidance |
CN110389813A (en) * | 2019-06-17 | 2019-10-29 | 东南大学 | A kind of dynamic migration of virtual machine method in network-oriented target range |
CN111694636A (en) * | 2020-05-11 | 2020-09-22 | 国网江苏省电力有限公司南京供电分公司 | Electric power Internet of things container migration method oriented to edge network load balancing |
WO2021073083A1 (en) * | 2019-10-15 | 2021-04-22 | 南京莱斯网信技术研究院有限公司 | Node load-based dynamic data partitioning system |
CN113553179A (en) * | 2021-07-16 | 2021-10-26 | 北京东方国信科技股份有限公司 | Distributed key value storage load balancing method and system |
CN114900525A (en) * | 2022-05-20 | 2022-08-12 | 中国地质大学(北京) | Method and system for deflecting data stream |
CN115718644A (en) * | 2022-11-25 | 2023-02-28 | 国网江苏省电力有限公司南京供电分公司 | Computing task cross-region migration method and system for cloud data center |
WO2023103349A1 (en) * | 2021-12-08 | 2023-06-15 | 深圳前海微众银行股份有限公司 | Load adjustment method, management node, and storage medium |
Non-Patent Citations (3)
Title |
---|
刘健;张军伟;张浩;邵冰清;杨洪章;刘振军;: "蓝鲸元数据服务器集群的细粒度负载迁移", 计算机研究与发展, no. 1, 15 December 2014 (2014-12-15) * |
韦立;陈珊珊;: "基于Redis单位最大效益自适应迁移策略研究", 计算机技术与发展, no. 10, 28 May 2018 (2018-05-28) * |
高子妍;王勇;: "面向云服务的分布式消息系统负载均衡策略", 计算机科学, no. 1, 15 June 2020 (2020-06-15) * |
Also Published As
Publication number | Publication date |
---|---|
CN116991580B (en) | 2024-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11323514B2 (en) | Data tiering for edge computers, hubs and central systems | |
US10466899B2 (en) | Selecting controllers based on affinity between access devices and storage segments | |
JP5765416B2 (en) | Distributed storage system and method | |
CN102984280B (en) | Data backup system and method for social cloud storage network application | |
US20110161294A1 (en) | Method for determining whether to dynamically replicate data | |
Shalita et al. | Social hash: an assignment framework for optimizing distributed systems operations on social networks | |
US20020065833A1 (en) | System and method for evaluating changes in performance arising from reallocation of files among disk storage units | |
US10810054B1 (en) | Capacity balancing for data storage system | |
CN111443867B (en) | Data storage method, device, equipment and storage medium | |
CN112947860B (en) | Hierarchical storage and scheduling method for distributed data copies | |
CN107133228A (en) | A kind of method and device of fast resampling | |
CN112559459B (en) | Cloud computing-based self-adaptive storage layering system and method | |
CN109154933B (en) | Distributed database system and method for distributing and accessing data | |
CN108519856A (en) | Based on the data block copy laying method under isomery Hadoop cluster environment | |
CN111666179A (en) | Intelligent replication system and server for multi-point data disaster tolerance | |
JP7192645B2 (en) | Information processing device, distributed processing system and distributed processing program | |
CN114048186A (en) | Data migration method and system based on mass data | |
CN116991580B (en) | Distributed database system load balancing method and device | |
US20020073290A1 (en) | System and method for identifying busy disk storage units | |
US11010410B1 (en) | Processing data groupings belonging to data grouping containers | |
CN114063888A (en) | Data storage system, data processing method, terminal and storage medium | |
CN110968762B (en) | Adjustment method and device for retrieval | |
KR102054068B1 (en) | Partitioning method and partitioning device for real-time distributed storage of graph stream | |
KR20180035023A (en) | Storage Orchestration Learning Optimization Target Volume Selection Method | |
JPH09231144A (en) | Method and device for managing data file |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||