CN109379298A - A kind of load-balancing method of big data system - Google Patents
A kind of load-balancing method of big data system
- Publication number
- CN109379298A
- Authority
- CN
- China
- Prior art keywords
- data
- data node
- load
- node
- data unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/29—Flow control; Congestion control using a combination of thresholds
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a load-balancing method for a big data system. The method comprises: each data node records, for every data operation on each data unit it stores, the amount of data read or the amount of data updated; the data node totals the data read and the data updated for each data unit over one day; a load index is calculated for each data unit from its total read volume and total update volume; the data node sums the load indexes of all data units it stores to obtain its node load index; and a management server controls the data nodes to perform load balancing based on the node load index of each data node. The method balances the load across the data nodes in the system and improves resource utilization.
Description
[technical field]
The invention belongs to the field of computers and the Internet, in particular to the field of big data, and specifically relates to a load-balancing method for a big data system.
[background art]
With the rapid development of computer and Internet technology, we have entered an era of information explosion, and the concept of big data emerged in order to handle this large amount of information. Big data refers to data sets that cannot be captured, managed and processed with conventional software tools within a tolerable time. It is a massive, fast-growing and diverse information asset that requires new processing modes in order to yield stronger decision-making power, insight discovery and process optimization capability.
Because of the sheer volume of the data, people can hardly analyze it by their own effort alone. Against the backdrop of the technological innovation represented by cloud computing, data that used to be hard to collect and use has become easy to exploit, and through continuous innovation across industries, big data is gradually creating more value for humanity.
Because a big data system holds massive amounts of data, the data is usually stored on multiple data nodes. During actual operation each node stores different data, so the load of each node also differs. If the load of a node is too high, its data-processing speed inevitably suffers, causing problems such as long response times, while other nodes are lightly loaded at the same time and their resources sit idle. Under such an imbalance the system has enough resources overall but its actual operating state is poor, so load balancing is required.
[summary of the invention]
To solve the above problem, the invention proposes a load-balancing method for a big data system.
The technical solution adopted by the invention is as follows:
A load-balancing method for a big data system, comprising the following steps:
(1) a data node records, for each data operation on each data unit it stores, the amount of data read or the amount of data updated;
(2) the data node totals the data read and the data updated for each data unit over one day;
(3) a load index is calculated for each data unit based on its total read volume and total update volume;
(4) the data node calculates the sum of the load indexes of all data units it stores, as the node load index of the data node;
(5) the data node sends the calculated node load index to a management server;
(6) the management server controls the data nodes to perform load balancing based on the node load index of each data node;
wherein, in step 3, the load index $F$ of a data unit is calculated as follows:
(3.1) when the data unit is first stored on the data node, its initial load index is set to $F = 0$;
(3.2) after the data unit's total read volume $R_1$ and total update volume $R_2$ for the new day are obtained, a new load index $F_{new}$ is calculated, namely:
$F_{new} = F \cdot S + W_1 R_1 + W_2 R_2$
where $W_1$ and $W_2$ are predefined weights and $S$ is a predefined decay factor with $0 < S < 1$;
(3.3) the load index $F$ of the data unit is updated to $F_{new}$.
Further, step 6 specifically comprises:
(6.1) assuming there are n data nodes in total with node load indexes $F_1, F_2, \ldots, F_n$, the management server calculates the average $F_{ave}$ of the n node load indexes;
(6.2) the management server calculates the difference between each node load index and the average, i.e. $D_i = F_i - F_{ave}$ ($1 \le i \le n$);
(6.3) for the n calculated values $D_i$, if a given $D_i$ is greater than a predefined threshold, the corresponding data node is added to the set of nodes to be balanced;
(6.4) the management server sends a prepare-balancing command message to each data node in the set of nodes to be balanced, the command message containing the average $F_{ave}$;
(6.5) a data node that receives the command message calculates the difference between its node load index and the average, selects from the data units it stores the one whose load index is closest to that difference, and reports the load index of the selected data unit to the management server;
(6.6) the management server sorts the data nodes in the set of nodes to be balanced in descending order of the reported load indexes, the sorted data nodes being $A_1, A_2, \ldots, A_m$;
(6.7) the management server sorts the data nodes in ascending order of node load index and takes the first m data nodes, denoted $B_1, B_2, \ldots, B_m$;
(6.8) the management server sends a load-balancing message to data node $A_j$, the message containing the address of data node $B_j$ ($1 \le j \le m$);
(6.9) data node $A_j$ migrates its selected data unit to data node $B_j$, after which $A_j$ deletes its own stored copy of the data unit.
Further, in step 2, each data node performs the totaling at a given time.
Further, the given time is midnight each day.
Further, throughout the whole life cycle of a data unit, the data unit remains associated with its load index; once the data unit is deleted, its load index is deleted as well.
Further, in step 6.3, if the set of nodes to be balanced is empty, the method ends.
Further, after $A_j$ obtains the address of $B_j$, it establishes a connection with $B_j$, sends the selected data unit to $B_j$ for storage, and then deletes its own stored copy of the data unit.
The beneficial effect of the invention is that it balances the load of the data nodes in the system and improves resource utilization.
[description of the drawings]
The drawings described here provide a further understanding of the invention and form part of this application, but do not unduly limit the invention. In the drawings:
Fig. 1 is a schematic diagram of the big data system to which the method of the invention is applied.
[specific embodiment]
The invention is described in detail below with reference to the drawing and specific embodiments. The illustrative examples and explanations are only used to explain the invention and do not limit it.
Referring to Fig. 1, which shows the basic architecture of the big data system to which the method of the invention is applied, the system comprises one management server and multiple data nodes, with the management server connected to each data node over a network. The management server manages the entire big data system, and the data nodes store data and perform the corresponding data operations according to commands from the management server.
Data on a data node is stored in units called data units. In one embodiment a data unit is a data file; in another embodiment a data unit may be a data record in a database. The invention places no specific limitation on this.
Based on the above system architecture, the invention provides a load-balancing method between data nodes that keeps the load of the nodes roughly equal. It is described as follows:
(1) A data node records, for each data operation on each data unit it stores, the amount of data read or the amount of data updated.
There are three kinds of data operation on a data unit: read, update and delete. For a delete operation the entire data unit is removed and can no longer place any load on the data node, so the load balancing of the invention does not consider delete operations. A read operation has a read volume: for example, if one operation reads 1 MB of a data unit, the read volume is 1 MB. Similarly, for an update operation, if 1 MB of the data unit is updated, the update volume is 1 MB.
Based on each data operation instruction it receives, every data node records the amount of data read from or updated in each data unit. Both the reads and the updates place load on the data node.
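The bookkeeping in step 1 could be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `OperationLog`, the string operation kinds and the choice to drop a unit's entries on delete are all assumptions made for the example.

```python
from collections import defaultdict


class OperationLog:
    """Hypothetical per-node record of bytes read and updated per data unit."""

    def __init__(self):
        # unit_id -> list of (kind, num_bytes), where kind is "read" or "update"
        self.entries = defaultdict(list)

    def record(self, unit_id, kind, num_bytes):
        if kind == "delete":
            # Deletes are not counted as load; here the unit's records are
            # simply dropped (an assumption of this sketch).
            self.entries.pop(unit_id, None)
            return
        if kind not in ("read", "update"):
            raise ValueError(f"unknown operation kind: {kind}")
        self.entries[unit_id].append((kind, num_bytes))


log = OperationLog()
log.record("file-a", "read", 1_000_000)    # 1 MB read
log.record("file-a", "update", 1_000_000)  # 1 MB updated
log.record("file-b", "read", 500)
log.record("file-b", "delete", 0)          # file-b no longer contributes load
```

Keeping reads and updates as separate entries preserves the distinction that the later weighting step relies on.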
(2) The data node totals the data read and the data updated for each data unit over one day.
Because step 1 records the amount of data read and the amount of data updated on every operation, the data node can compute each data unit's total read volume and total update volume for a day. The total read volume is the sum of the amounts read from the data unit on each operation during that day, and the total update volume is the sum of the amounts updated on each operation.
Each data node can perform this totaling at a given time. For example, after midnight each day it can total the read volume and update volume of each data unit for the previous day.
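As a rough sketch of the daily totaling in step 2, assuming the day's operations are available as `(unit_id, kind, num_bytes)` tuples (a hypothetical record layout, not one given in the source):

```python
def daily_totals(operations):
    """Sum one day's read and update volumes for each data unit.

    operations: iterable of (unit_id, kind, num_bytes) with kind "read" or "update".
    Returns {unit_id: (read_total, update_total)}.
    """
    totals = {}
    for unit_id, kind, num_bytes in operations:
        read_total, update_total = totals.get(unit_id, (0, 0))
        if kind == "read":
            read_total += num_bytes
        elif kind == "update":
            update_total += num_bytes
        else:
            raise ValueError(f"unknown operation kind: {kind}")
        totals[unit_id] = (read_total, update_total)
    return totals


# Example records for the previous day (illustrative values only).
yesterday = [
    ("file-a", "read", 1_000_000),
    ("file-a", "read", 250_000),
    ("file-a", "update", 100_000),
    ("file-b", "update", 4_096),
]
totals = daily_totals(yesterday)
```

The pair returned for each unit corresponds to the total read volume and total update volume used in step 3.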
(3) A load index is calculated for each data unit based on its total read volume and total update volume.
Specifically, the load index $F$ of a data unit is calculated as follows:
(3.1) When the data unit is first stored on the data node, its initial load index is set to $F = 0$.
Whenever a data node creates a new data unit, it creates a load index for it. This load index stays associated with the data unit throughout the unit's life cycle; once the data unit is deleted, the load index is deleted as well.
(3.2) After the data unit's total read volume $R_1$ and total update volume $R_2$ for the new day are obtained, a new load index $F_{new}$ is calculated, namely:
$F_{new} = F \cdot S + W_1 R_1 + W_2 R_2$
where $W_1$ and $W_2$ are predefined weights that express how much load reads and updates respectively place on the data node, $F$ is the current load index of the data unit, and $S$ is a predefined decay factor satisfying $0 < S < 1$ that expresses how quickly the effect of earlier load fades over time.
(3.3) The load index $F$ of the data unit is updated to $F_{new}$.
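A small sketch of steps 3.1 to 3.3 follows. The function name and the concrete values chosen for $W_1$, $W_2$ and $S$ are illustrative assumptions; the source only requires predefined weights and $0 < S < 1$.

```python
def update_load_index(f, r1, r2, w1, w2, s):
    """Return F_new = F*S + W1*R1 + W2*R2 for one data unit."""
    if not 0 < s < 1:
        raise ValueError("S must satisfy 0 < S < 1")
    return f * s + w1 * r1 + w2 * r2


# Illustrative parameters (assumptions, not values from the source).
W1, W2, S = 1.0, 2.0, 0.5

f = 0.0  # step 3.1: a newly stored data unit starts at F = 0
day1 = update_load_index(f, r1=100, r2=50, w1=W1, w2=W2, s=S)   # 0*0.5 + 100 + 100
day2 = update_load_index(day1, r1=0, r2=0, w1=W1, w2=W2, s=S)   # an idle day halves F
```

With these values an idle day halves the index, which shows how $S$ lets old activity fade while recent reads and updates dominate.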
After updating the load index of every data unit, the data node can calculate its own load index.
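The recurrence in step 3.2 can also be unrolled to show what the per-unit load index accumulates. The day superscript $(t)$ and the shorthand $L$ are notation introduced here for illustration and do not appear in the source.

```latex
% Starting from F_0 = 0 and applying F_k = S F_{k-1} + W_1 R_1^{(k)} + W_2 R_2^{(k)}:
F_k = \sum_{t=1}^{k} S^{\,k-t} \left( W_1 R_1^{(t)} + W_2 R_2^{(t)} \right)

% For a constant daily weighted load L = W_1 R_1 + W_2 R_2:
F_k = L \, \frac{1 - S^{k}}{1 - S} \;\longrightarrow\; \frac{L}{1 - S} \quad (k \to \infty)
```

In other words, the index is an exponentially weighted sum of daily loads, and under steady traffic it stays bounded because $0 < S < 1$.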
(4) The data node calculates the sum of the load indexes of all data units it stores, as the node load index of the data node.
Through this step each data node obtains its own node load index. Because every data node performs the above steps at the predetermined time (for example, after midnight each day), all data nodes obtain their node load indexes at essentially the same time.
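Step 4 amounts to a single summation. A minimal sketch, assuming the per-unit load indexes are held in a dictionary keyed by a hypothetical unit identifier:

```python
def node_load_index(unit_load_indexes):
    """Node load index = sum of the load indexes of all stored data units."""
    return sum(unit_load_indexes.values())


# Illustrative per-unit load indexes on one data node.
units = {"file-a": 200.0, "file-b": 50.0, "file-c": 12.5}
node_index = node_load_index(units)
```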
(5) The data node sends the calculated node load index to the management server.
Each data node sends its node load index to the management server immediately after calculating it, so the management server also obtains the node load indexes of all data nodes at essentially the same time.
(6) The management server controls the data nodes to perform load balancing based on the node load index of each data node.
The node load index of a data node represents the load on that node's data-processing capacity and is essentially the sum of the loads of its data units. Those skilled in the art can use the node load index to balance the amount of data held on each data node and thereby balance its load.
According to a specific embodiment of the invention, step 6 specifically comprises:
(6.1) Assuming there are n data nodes in total with node load indexes $F_1, F_2, \ldots, F_n$, the management server calculates the average $F_{ave}$ of the n node load indexes.
(6.2) The management server calculates the difference between each node load index and the average, i.e. $D_i = F_i - F_{ave}$ ($1 \le i \le n$).
(6.3) For the n calculated values $D_i$, if a given $D_i$ is greater than a predefined threshold, the corresponding data node is added to the set of nodes to be balanced.
The set of nodes to be balanced contains the data nodes whose load is too high and therefore needs balancing. The set is initially empty. Suppose it contains m nodes after step 6.3; if m = 0, no load balancing is needed and the method ends directly.
The predefined threshold sets a balancing line: a node whose load only slightly exceeds the average need not be handled, and only a node whose load significantly exceeds the average requires load balancing.
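Steps 6.1 to 6.3 can be sketched as below. The node names, the threshold value and the function name are assumptions for illustration; the source does not specify how the threshold is chosen.

```python
def nodes_to_balance(node_indexes, threshold):
    """Return (F_ave, nodes whose D_i = F_i - F_ave exceeds the threshold)."""
    if not node_indexes:
        return 0.0, []
    f_ave = sum(node_indexes.values()) / len(node_indexes)
    overloaded = [
        node for node, f_i in node_indexes.items() if f_i - f_ave > threshold
    ]
    return f_ave, overloaded


# Illustrative node load indexes reported to the management server.
node_indexes = {"n1": 900.0, "n2": 100.0, "n3": 250.0, "n4": 350.0}
f_ave, overloaded = nodes_to_balance(node_indexes, threshold=200.0)
# D values are 500, -300, -150 and -50, so only n1 exceeds the threshold.
```

If `overloaded` comes back empty, this corresponds to the m = 0 case in which the method ends without any migration.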
(6.4) The management server sends a prepare-balancing command message to each data node in the set of nodes to be balanced. The command message contains the average $F_{ave}$.
(6.5) A data node that receives the command message calculates the difference between its node load index and the average, selects from the data units it stores the one whose load index is closest to that difference, and reports the load index of the selected data unit to the management server.
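The selection in step 6.5 could look like the following sketch. The function name is hypothetical, and ties are broken by dictionary order, which the source does not specify.

```python
def pick_unit_to_move(unit_load_indexes, f_ave):
    """Pick the data unit whose load index is closest to (node index - F_ave)."""
    node_index = sum(unit_load_indexes.values())
    excess = node_index - f_ave
    # min() keeps the first unit on ties; the source does not define tie-breaking.
    unit_id = min(
        unit_load_indexes, key=lambda u: abs(unit_load_indexes[u] - excess)
    )
    return unit_id, unit_load_indexes[unit_id]


# Illustrative overloaded node: node index 900 against an average of 400.
units = {"u1": 600.0, "u2": 250.0, "u3": 50.0}
unit_id, unit_index = pick_unit_to_move(units, f_ave=400.0)
# The excess is 500; u1 (600) is closer to it than u2 (250) or u3 (50).
```

Choosing the unit nearest the excess means that moving a single unit brings the node as close to the average as one migration allows.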
(6.6) The management server sorts the data nodes in the set of nodes to be balanced in descending order of the reported load indexes. Let the sorted data nodes be $A_1, A_2, \ldots, A_m$.
(6.7) The management server sorts the data nodes in ascending order of node load index and takes the first m data nodes, denoted $B_1, B_2, \ldots, B_m$.
What this step obtains is the m most lightly loaded data nodes in the system, which serve as the target nodes for load balancing.
(6.8) The management server sends a load-balancing message to data node $A_j$. The message contains the address of data node $B_j$ ($1 \le j \le m$).
In this step the management server tells $A_1$ that $B_1$ is its load-balancing target, tells $A_2$ that $B_2$ is its target, and so on up to $A_m$ and $B_m$, forming m pairs of source and target nodes for load balancing.
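The pairing in steps 6.6 to 6.8 can be sketched as two sorts and a zip. The node names and values are illustrative assumptions, and the sketch returns node names where the real message would carry the address of $B_j$.

```python
def pair_sources_with_targets(selected_unit_index, node_indexes):
    """Pair each overloaded node A_j with a lightly loaded node B_j.

    selected_unit_index: {overloaded node: load index of the unit it selected}
    node_indexes: {node: node load index} for every data node in the system
    """
    # Step 6.6: sources in descending order of the reported unit load index.
    sources = sorted(selected_unit_index, key=selected_unit_index.get, reverse=True)
    # Step 6.7: the m least-loaded nodes in ascending order of node load index.
    targets = sorted(node_indexes, key=node_indexes.get)[: len(sources)]
    # Step 6.8: A_1 is paired with B_1, A_2 with B_2, and so on.
    return list(zip(sources, targets))


node_indexes = {"n1": 900.0, "n2": 100.0, "n3": 250.0, "n4": 950.0, "n5": 300.0}
selected = {"n1": 380.0, "n4": 460.0}  # units picked by the overloaded nodes
pairs = pair_sources_with_targets(selected, node_indexes)
```

Under this ordering the heaviest selected unit goes to the least-loaded node, so the largest transfers land where there is the most headroom.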
(6.9) Data node $A_j$ migrates its selected data unit to data node $B_j$, after which $A_j$ deletes its own stored copy of the data unit.
After $A_j$ obtains the address of $B_j$, it can establish a connection with $B_j$, send the selected data unit to $B_j$ for storage, and then delete its own copy. In this way the load of $A_j$ falls and the load of $B_j$ rises, which achieves the goal of load balancing.
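The effect of step 6.9 on the two nodes can be simulated with plain dictionaries. This sketch assumes the unit's load index travels with it to $B_j$; the source only states that the index is deleted when the unit is deleted at $A_j$, so that carry-over is an assumption, as are the names and values.

```python
def migrate_unit(source_store, target_store, unit_id):
    """Copy the unit to the target first, then delete it from the source."""
    target_store[unit_id] = source_store[unit_id]
    del source_store[unit_id]


# Stores map unit id -> load index (standing in for the unit itself).
a_j = {"u1": 480.0, "u2": 250.0, "u3": 170.0}  # node load index 900
b_j = {"v1": 100.0}                            # node load index 100

migrate_unit(a_j, b_j, "u1")
a_load, b_load = sum(a_j.values()), sum(b_j.values())
```

Writing to the target before deleting from the source mirrors the order in the text: the source copy is removed only after the unit has been stored on $B_j$.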
The above is only a preferred embodiment of the invention. All equivalent changes or modifications made according to the configurations, features and principles described in the scope of this patent application fall within the scope of this patent application.
Claims (7)
1. A load-balancing method for a big data system, characterized by comprising the following steps:
(1) a data node records, for each data operation on each data unit it stores, the amount of data read or the amount of data updated;
(2) the data node totals the data read and the data updated for each data unit over one day;
(3) a load index is calculated for each data unit based on its total read volume and total update volume;
(4) the data node calculates the sum of the load indexes of all data units it stores, as the node load index of the data node;
(5) the data node sends the calculated node load index to a management server;
(6) the management server controls the data nodes to perform load balancing based on the node load index of each data node;
wherein, in step 3, the load index $F$ of a data unit is calculated as follows:
(3.1) when the data unit is first stored on the data node, its initial load index is set to $F = 0$;
(3.2) after the data unit's total read volume $R_1$ and total update volume $R_2$ for the new day are obtained, a new load index $F_{new}$ is calculated, namely:
$F_{new} = F \cdot S + W_1 R_1 + W_2 R_2$
where $W_1$ and $W_2$ are predefined weights and $S$ is a predefined decay factor with $0 < S < 1$;
(3.3) the load index $F$ of the data unit is updated to $F_{new}$.
2. The method according to claim 1, characterized in that step 6 specifically comprises:
(6.1) assuming there are n data nodes in total with node load indexes $F_1, F_2, \ldots, F_n$, the management server calculates the average $F_{ave}$ of the n node load indexes;
(6.2) the management server calculates the difference between each node load index and the average, i.e. $D_i = F_i - F_{ave}$ ($1 \le i \le n$);
(6.3) for the n calculated values $D_i$, if a given $D_i$ is greater than a predefined threshold, the corresponding data node is added to the set of nodes to be balanced;
(6.4) the management server sends a prepare-balancing command message to each data node in the set of nodes to be balanced, the command message containing the average $F_{ave}$;
(6.5) a data node that receives the command message calculates the difference between its node load index and the average, selects from the data units it stores the one whose load index is closest to that difference, and reports the load index of the selected data unit to the management server;
(6.6) the management server sorts the data nodes in the set of nodes to be balanced in descending order of the reported load indexes, the sorted data nodes being $A_1, A_2, \ldots, A_m$;
(6.7) the management server sorts the corresponding data nodes in ascending order of node load index and takes the first m data nodes, denoted $B_1, B_2, \ldots, B_m$;
(6.8) the management server sends a load-balancing message to data node $A_j$, the message containing the address of data node $B_j$ ($1 \le j \le m$);
(6.9) data node $A_j$ migrates its selected data unit to data node $B_j$, after which $A_j$ deletes its own stored copy of the data unit.
3. The method according to any one of claims 1-2, characterized in that, in step 2, each data node performs the totaling at a given time.
4. The method according to claim 3, characterized in that the given time is midnight each day.
5. The method according to any one of claims 1-4, characterized in that, throughout the whole life cycle of a data unit, the data unit remains associated with its load index, and once the data unit is deleted its load index is deleted as well.
6. The method according to claim 2, characterized in that, in step 6.3, if the set of nodes to be balanced is empty, the method ends.
7. The method according to claim 2, characterized in that, after $A_j$ obtains the address of $B_j$, it establishes a connection with $B_j$, sends the selected data unit to $B_j$ for storage, and then deletes its own stored copy of the data unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811489449.3A CN109379298A (en) | 2018-12-06 | 2018-12-06 | A kind of load-balancing method of big data system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811489449.3A CN109379298A (en) | 2018-12-06 | 2018-12-06 | A kind of load-balancing method of big data system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109379298A true CN109379298A (en) | 2019-02-22 |
Family
ID=65376037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811489449.3A Pending CN109379298A (en) | 2018-12-06 | 2018-12-06 | A kind of load-balancing method of big data system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109379298A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080046895A1 (en) * | 2006-08-15 | 2008-02-21 | International Business Machines Corporation | Affinity dispatching load balancer with precise CPU consumption data |
CN104836819A (en) * | 2014-02-10 | 2015-08-12 | 阿里巴巴集团控股有限公司 | Dynamic load balancing method and system, and monitoring and dispatching device |
CN105187512A (en) * | 2015-08-13 | 2015-12-23 | 航天恒星科技有限公司 | Method and system for load balancing of virtual machine clusters |
CN105306525A (en) * | 2015-09-11 | 2016-02-03 | 浪潮集团有限公司 | Data layout method, device and system |
CN108255427A (en) * | 2017-12-29 | 2018-07-06 | 广东南华工商职业学院 | A kind of data storage and dynamic migration method and device |
Non-Patent Citations (2)
Title |
---|
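Liu Xuri et al.: "Dynamic weighted load-balancing algorithm for heterogeneous cluster servers" (异构集群服务器的动态加权负载均衡算法), Microcomputer Information (《微计算机信息》) * |
Hao Yuwen et al.: "Research on storage load-balancing algorithms in distributed environments" (基于分布式环境的存储负载均衡算法研究), Information Technology (《信息技术》) * |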
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2015229200B2 (en) | Coordinated admission control for network-accessible block storage | |
US10250673B1 (en) | Storage workload management using redirected messages | |
CN108509275B (en) | A kind of catalogue moving method and metadata load-balancing method | |
CN109144414A (en) | The multistage storage method and device of block chain data | |
CN106648456B (en) | Dynamic copies file access method based on user's amount of access and forecasting mechanism | |
CN109408590B (en) | Method, device and equipment for expanding distributed database and storage medium | |
CN108183947A (en) | Distributed caching method and system | |
CN108255427B (en) | Data storage and dynamic migration method and device | |
CN109510852B (en) | Method and device for gray scale publishing | |
US11553047B2 (en) | Dynamic connection capacity management | |
CN107766159A (en) | A kind of metadata management method, device and computer-readable recording medium | |
CN103647656A (en) | Billing node load control method, data access control method and node | |
CN110321225A (en) | Load-balancing method, meta data server and computer readable storage medium | |
CN108900626A (en) | Date storage method, apparatus and system under a kind of cloud environment | |
CN107291544A (en) | Method and device, the distributed task scheduling execution system of task scheduling | |
CN106201561B (en) | The upgrade method and equipment of distributed caching cluster | |
EP4170491A1 (en) | Resource scheduling method and apparatus, electronic device, and computer-readable storage medium | |
CN109271106A (en) | Message storage, read method and device, server, storage medium | |
CN102480502B (en) | I/O load equilibrium method and I/O server | |
CN107453948A (en) | The storage method and system of a kind of network measurement data | |
CN107395708A (en) | A kind of method and apparatus for handling download request | |
EP3985493A1 (en) | Group member management method and apparatus, group message processing method and apparatus, device, and storage medium | |
CN109379298A (en) | A kind of load-balancing method of big data system | |
CN108259583B (en) | Data dynamic migration method and device | |
US20170031972A1 (en) | Providing consistent tenant experiences for multi-tenant databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20190222 |