CN109525662A - Method for setting replicas of hot content - Google Patents
Method for setting replicas of hot content
- Publication number: CN109525662A
- Application number: CN201811355782.5A
- Authority: CN (China)
- Prior art keywords: server, replica, load, name, hash
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1029—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer
Abstract
A method for setting replicas of hot content, comprising: Step 1: a big data system has n servers; using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where $K = 32$. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long.
Description
Technical field
The invention belongs to the field of load balancing for big data.
Background art
Consistent hashing (Consistent Hashing Algorithm) is a distributed algorithm commonly used for load balancing. Memcached clients also use it to distribute key-value pairs evenly across many Memcached servers. It can replace the traditional modulo operation, which cannot cope with adding or removing Memcached servers: when a server is added or removed, a get operation for the same key is routed to a server that does not actually hold the data, and the hit rate drops sharply.
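The remapping problem of modulo placement can be illustrated with a minimal Python sketch. The MD5-based `stable_hash` helper, the key names and the pool sizes 3 and 4 are assumptions chosen for illustration; they are not part of the described method.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash (MD5 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


def modulo_server(key: str, n_servers: int) -> int:
    """Traditional placement: server index = hash(key) mod n."""
    return stable_hash(key) % n_servers


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server index changes when the pool is resized."""
    moved = sum(
        1 for k in keys if modulo_server(k, old_n) != modulo_server(k, new_n)
    )
    return moved / len(keys)


# Hypothetical workload: 10,000 keys, pool grows from 3 to 4 servers.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=3, new_n=4)
print(f"{fraction:.0%} of keys map to a different server after adding one server")
```

For uniformly distributed hashes, about three quarters of the keys are expected to change server when going from 3 to 4 servers, because the index is unchanged only when the hash modulo 12 is 0, 1 or 2.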
In the prior art, however, consistent hashing combined with virtual nodes distributes data roughly evenly across the nodes but does not take into account how hot or cold the data is. In real applications the difference between hot and cold data is very large, so the number of accesses to each node is uneven. When the number of connections becomes excessive, the system can no longer provide service, which results in load imbalance.
Summary of the invention
A method for setting replicas of hot content, comprising:
Step 1: A big data system has n servers. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where $K = 32$. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ is less than a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files. Preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1. If a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2.
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server. All replica information is stored on a load server. When a user needs to access some file content, the user first sends a request to the load server. If the load server finds by lookup that no replica has been set for the content, the request follows the existing process and accesses the original content directly; if replicas exist, one item is selected at random from the original content and its replicas and is accessed. Multiple replicas of hot content are thereby set up, and the access pressure on a single server is reduced.
Step 10: Return to step 2.
A method for setting replicas of hot content, comprising:
Step 1: There are n servers in the big data system. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where K = 31 + […]. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ is less than a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files. Preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1. If a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2.
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server. All replica information is stored on a load server. When a user needs to access some file content, the user first sends a request to the load server. If the load server finds by lookup that no replica has been set for the content, the request follows the existing process and accesses the original content directly; if replicas exist, one item is selected at random from the original content and its replicas and is accessed. Multiple replicas of hot content are thereby set up, and the access pressure on a single server is reduced.
Step 10: Return to step 2.
The advantage of the invention is that the mean square deviation of the server loads is used to decide whether to set replicas of the hot content held on the most heavily loaded server, thereby sharing the load and relieving the load pressure on a single server.
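The decision in steps 3 to 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the variance is taken as the population variance (dividing by n), and the function names, server names, load values and threshold are hypothetical.

```python
from typing import Dict, List, Optional


def mean_load(loads: List[float]) -> float:
    """Step 3: average load of the server group."""
    return sum(loads) / len(loads)


def load_variance(loads: List[float]) -> float:
    """Step 4: population variance of the loads (1/n normalisation assumed)."""
    mu = mean_load(loads)
    return sum((x - mu) ** 2 for x in loads) / len(loads)


def server_to_relieve(loads: Dict[str, float], threshold: float) -> Optional[str]:
    """Steps 5 and 6: return the most loaded server, or None if loads are balanced."""
    if load_variance(list(loads.values())) < threshold:
        return None  # variance below the predetermined value: keep monitoring
    return max(loads, key=loads.get)


# Hypothetical load samples for three servers.
skewed = {"server-1": 10.0, "server-2": 12.0, "server-3": 50.0}
balanced = {"server-1": 20.0, "server-2": 22.0, "server-3": 21.0}
print(server_to_relieve(skewed, threshold=100.0))    # server-3
print(server_to_relieve(balanced, threshold=100.0))  # None
```

In the skewed sample the mean is 24 and the variance is 1016/3 (about 338.7), which exceeds the assumed threshold of 100, so the most loaded server is selected for replica creation.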
Brief description of the drawings
Fig. 1 is schematic diagram 1 according to an embodiment of the present invention;
Fig. 2 is schematic diagram 2 according to an embodiment of the present invention;
Fig. 3 is schematic diagram 3 according to an embodiment of the present invention;
Fig. 4 is schematic diagram 4 according to an embodiment of the present invention.
Detailed description of embodiments
Principle of consistent hashing
In short, consistent hashing organizes the entire hash value space into a virtual ring. Suppose the value space of a hash function H is $0$ to $2^K - 1$ (i.e. the hash value is a K-bit unsigned integer). The following description uses K = 32.
The entire hash ring is shown in Fig. 1. The space is organized clockwise, and $0$ and $2^{32} - 1$ coincide at the zero point.
Next, each server is hashed with H; the server's IP address or host name can be chosen as the key. This determines the position of each machine on the hash ring. Assume here that three servers, hashed by IP address, take the positions on the ring shown in Fig. 2.
Data is then located on the appropriate server with the following algorithm: the data key is hashed with the same function H to obtain a hash value h, which determines the position of the data on the ring. "Walking" clockwise along the ring from this position, the first server encountered is the server to which the data is assigned.
For example, suppose there are four data objects A, B, C and D whose positions on the ring after hashing are shown in Fig. 3. According to the consistent hashing algorithm, data A is assigned to Server 1, D is assigned to Server 3, and B and C are both assigned to Server 2.
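The clockwise lookup described above can be sketched in Python with a sorted list and binary search. The MD5-based `ring_hash` helper, the `HashRing` class and the IP addresses are assumptions for illustration; the text only requires some hash function H with a 32-bit value space.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position in 0 .. 2**32 - 1 (K = 32)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, servers):
        # Sorted (position, server) pairs; ascending order is the clockwise order.
        self._points = sorted((ring_hash(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def locate(self, key: str) -> str:
        """Return the first server at or after the key's position on the ring."""
        idx = bisect.bisect_left(self._positions, ring_hash(key))
        # Walking past the largest position wraps around to the first server.
        return self._points[idx % len(self._points)][1]


# Hypothetical servers hashed by IP address, and four data keys.
servers = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]
ring = HashRing(servers)
for key in ["A", "B", "C", "D"]:
    print(key, "->", ring.locate(key))
```

A useful property of this layout is that adding a server only moves the keys that fall on the arc taken over by the new server; all other keys keep their previous server.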
Principle of virtual nodes
To solve the data skew problem that arises when service nodes are unevenly distributed on the ring, consistent hashing introduces a virtual node mechanism: several hash values are computed for each service node, and the service node is placed at each resulting position. These positions are called virtual nodes. In practice this can be done by appending a number to the server's IP address or host name. For example, with two servers and three virtual nodes per server, the hash values of "Memcached Server 1#1", "Memcached Server 1#2", "Memcached Server 1#3", "Memcached Server 2#1", "Memcached Server 2#2" and "Memcached Server 2#3" are computed, forming six virtual nodes, as shown in Fig. 4.
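The virtual node mechanism can be sketched in Python as follows. The MD5-based hash, the `VirtualNodeRing` class and the sample keys are assumptions for illustration; the `#<number>` suffix follows the naming used in the example above.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(key: str) -> int:
    """Map a string to a position in 0 .. 2**32 - 1 (K = 32)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class VirtualNodeRing:
    def __init__(self, servers, virtual_per_server: int = 3):
        points = []
        for server in servers:
            for i in range(1, virtual_per_server + 1):
                # Each virtual node is the hash of "<server name>#<number>".
                points.append((ring_hash(f"{server}#{i}"), server))
        points.sort()
        self._positions = [p for p, _ in points]
        self._owners = [s for _, s in points]

    def locate(self, key: str) -> str:
        """Return the physical server owning the first virtual node clockwise."""
        idx = bisect.bisect_left(self._positions, ring_hash(key))
        return self._owners[idx % len(self._owners)]


servers = ["Memcached Server 1", "Memcached Server 2"]
ring = VirtualNodeRing(servers, virtual_per_server=3)  # six virtual nodes

# Count how many of 10,000 hypothetical keys land on each physical server.
counts = Counter(ring.locate(f"key-{i}") for i in range(10_000))
print(counts)
```

Raising `virtual_per_server` generally evens out the share of keys per physical server, at the cost of a larger ring table.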
Embodiment 1
A method for setting replicas of hot content, comprising:
Step 1: A big data system has n servers. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where $K = 32$. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ is less than a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files. Preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1. If a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2.
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server. All replica information is stored on a load server. When a user needs to access some file content, the user first sends a request to the load server. If the load server finds by lookup that no replica has been set for the content, the request follows the existing process and accesses the original content directly; if replicas exist, one item is selected at random from the original content and its replicas and is accessed. Multiple replicas of hot content are thereby set up, and the access pressure on a single server is reduced.
Step 10: Return to step 2.
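Steps 7 and 8 can be sketched in Python as follows. This is a minimal sketch: the access log is modelled as a plain list of file names, the helper names are hypothetical, and replica numbers are assumed to be assigned consecutively starting from 1.

```python
from collections import Counter
from typing import List


def hottest_files(access_log: List[str], m: int = 10) -> List[str]:
    """Step 7: the m file names with the most accesses in the recent window."""
    return [name for name, _ in Counter(access_log).most_common(m)]


def next_replica_name(file_name: str, existing_replicas: List[str]) -> str:
    """Step 8: append '#<number>' using the next unused replica number."""
    return f"{file_name}#{len(existing_replicas) + 1}"


# Hypothetical one-day access log taken from the most loaded server.
log = ["a.mp4", "b.mp4", "a.mp4", "c.mp4", "a.mp4", "b.mp4"]
for name in hottest_files(log, m=2):
    print(name, "->", next_replica_name(name, existing_replicas=[]))
```

When `a.mp4#1` already exists, calling `next_replica_name("a.mp4", ["a.mp4#1"])` yields `a.mp4#2`, matching the numbering rule in step 8.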
A method for setting replicas of hot content, comprising:
Step 1: There are n servers in the big data system. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where K = 31 + […]. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ is less than a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files. Preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1. If a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2.
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server. All replica information is stored on a load server. When a user needs to access some file content, the user first sends a request to the load server. If the load server finds by lookup that no replica has been set for the content, the request follows the existing process and accesses the original content directly; if replicas exist, one item is selected at random from the original content and its replicas and is accessed. Multiple replicas of hot content are thereby set up, and the access pressure on a single server is reduced.
Step 10: Return to step 2.
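Step 9 (replica placement and the read path through the load server) can be sketched in Python as follows. The `ReplicaDirectory` class is a hypothetical stand-in for the replica table kept on the load server, the MD5-based hash and the server addresses are assumptions, and virtual nodes are omitted for brevity.

```python
import bisect
import hashlib
import random


def ring_hash(name: str) -> int:
    """Map a string to a position in 0 .. 2**32 - 1 (K = 32)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:4], "big")


def locate(name: str, servers) -> str:
    """Consistent-hash lookup: first server clockwise from the name's position."""
    points = sorted((ring_hash(s), s) for s in servers)
    positions = [p for p, _ in points]
    idx = bisect.bisect_left(positions, ring_hash(name)) % len(points)
    return points[idx][1]


class ReplicaDirectory:
    """Hypothetical replica table held by the load server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.replicas = {}  # original file name -> list of replica names

    def add_replica(self, file_name: str):
        """Name the next replica and place it by hashing the replica name."""
        names = self.replicas.setdefault(file_name, [])
        replica = f"{file_name}#{len(names) + 1}"
        names.append(replica)
        return replica, locate(replica, self.servers)

    def resolve(self, file_name: str, rng=random):
        """Pick the original or one of its replicas uniformly at random."""
        candidates = [file_name] + self.replicas.get(file_name, [])
        chosen = rng.choice(candidates)
        return chosen, locate(chosen, self.servers)


directory = ReplicaDirectory(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(directory.add_replica("hot.mp4"))
print(directory.resolve("hot.mp4"))
print(directory.resolve("cold.txt"))  # no replica: the original is accessed
```

Because placement depends only on the hash of the replica name, a replica may land on the same server as the original; a practical implementation might skip such a name and try the next number, which is a design choice beyond what the text specifies.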
Claims (4)
1. A method for setting replicas of hot content, comprising:
Step 1: A big data system has n servers. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where $K = 32$. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ does not exceed a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files; preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1, and if a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2;
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server;
Step 10: Return to step 2.
2. A method for setting replicas of hot content, comprising:
Step 1: There are n servers in the big data system. Using a consistent hashing algorithm, the n servers are placed on a hash ring whose hash function has the value space $0$ to $2^K - 1$, where K = 31 + […]. Specifically, a hash algorithm is applied to the service name of each server, and the output hash value is K bits long;
Step 2: After the big data system starts running, a monitor periodically monitors the load parameter of each server. Specifically, the load of server $i$ at time $t$ is denoted $L_i(t)$, and $\Delta t$ is the sampling interval. Every $\Delta t$, the monitor sends a request to each server, each server returns its current load parameter value to the monitor, and the monitor keeps the value returned at the previous load calculation;
Step 3: Compute the average load of the server group at time $t$: $\bar{L}(t) = \frac{1}{n} \sum_{i=1}^{n} L_i(t)$;
Step 4: Compute the load variance of the server group at time $t$: $\sigma^2(t) = \frac{1}{n} \sum_{i=1}^{n} \left( L_i(t) - \bar{L}(t) \right)^2$;
Step 5: If $\sigma^2(t)$ does not exceed a predetermined value, return to step 2; otherwise, execute step 6;
Step 6: Find the server with the maximum load among all servers;
Step 7: On the server with the maximum load, find the m content files that were accessed most often in a recent period (e.g. 1 day); m can be set in advance, for example to 10;
Step 8: Set replica names for these m content files; preferably, a replica number is appended to the original file name: for example, if the original file name is filename, the replica name can be set to filename#1, and if a replica named filename#1 already exists and a further replica is needed later, the replica name can be set to filename#2;
Step 9: Compute the hash value of the replica name and, according to the consistent hashing algorithm, store the replica on the corresponding server;
Step 10: Return to step 2.
3. A computer program for executing the method of any one of claims 1-2.
4. A system for setting replicas of hot content, comprising: a central processing unit and a memory, the memory containing a computer program for executing the method of any one of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811355782.5A CN109525662A (en) | 2018-11-14 | 2018-11-14 | Method for setting replicas of hot content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811355782.5A CN109525662A (en) | 2018-11-14 | 2018-11-14 | Method for setting replicas of hot content |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109525662A (en) | 2019-03-26 |
Family
ID=65777683
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811355782.5A (Withdrawn), CN109525662A (en) | Method for setting replicas of hot content | 2018-11-14 | 2018-11-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109525662A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102244685A (en) * | 2011-08-11 | 2011-11-16 | 中国科学院软件研究所 | Distributed type dynamic cache expanding method and system supporting load balancing |
CN102624922A (en) * | 2012-04-11 | 2012-08-01 | 武汉大学 | Method for balancing load of network GIS heterogeneous cluster server |
CN106572181A (en) * | 2016-11-08 | 2017-04-19 | 深圳市中博睿存科技有限公司 | Object storage interface load balancing method and system based on cluster file system |
CN107302561A (en) * | 2017-05-23 | 2017-10-27 | 南京邮电大学 | A kind of hot spot data Replica placement method in cloud storage system |
CN107463342A (en) * | 2017-08-28 | 2017-12-12 | 北京奇艺世纪科技有限公司 | A kind of storage method and device of CDN fringe nodes file |
CN107483519A (en) * | 2016-06-08 | 2017-12-15 | Tcl集团股份有限公司 | A kind of Memcache load-balancing methods and its system |
CN107508758A (en) * | 2017-08-16 | 2017-12-22 | 北京云端智度科技有限公司 | A kind of method that focus file spreads automatically |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102244685A (en) * | 2011-08-11 | 2011-11-16 | 中国科学院软件研究所 | Distributed type dynamic cache expanding method and system supporting load balancing |
CN102624922A (en) * | 2012-04-11 | 2012-08-01 | 武汉大学 | Method for balancing load of network GIS heterogeneous cluster server |
CN107483519A (en) * | 2016-06-08 | 2017-12-15 | Tcl集团股份有限公司 | A kind of Memcache load-balancing methods and its system |
CN106572181A (en) * | 2016-11-08 | 2017-04-19 | 深圳市中博睿存科技有限公司 | Object storage interface load balancing method and system based on cluster file system |
CN107302561A (en) * | 2017-05-23 | 2017-10-27 | 南京邮电大学 | A kind of hot spot data Replica placement method in cloud storage system |
CN107508758A (en) * | 2017-08-16 | 2017-12-22 | 北京云端智度科技有限公司 | A kind of method that focus file spreads automatically |
CN107463342A (en) * | 2017-08-28 | 2017-12-12 | 北京奇艺世纪科技有限公司 | A kind of storage method and device of CDN fringe nodes file |
Non-Patent Citations (1)
Title |
---|
王诚等: "基于贪心算法的一致性哈希负载均衡优化", 《南京邮电大学学报(自然科学版)》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10185721B2 (en) | Distributed data set storage and retrieval | |
CN106464674B (en) | Managing NIC encryption streams for migrating customers or tasks | |
KR101503202B1 (en) | Data synchronization | |
Loesing et al. | Stormy: an elastic and highly available streaming service in the cloud | |
JP2019504412A (en) | Short link processing method, device, and server | |
WO2021003935A1 (en) | Data cluster storage method and apparatus, and computer device | |
Zheng et al. | BatchFS: Scaling the file system control plane with client-funded metadata servers | |
JP2012079242A (en) | Composite event distribution device, composite event distribution method and composite event distribution program | |
EP3639463B1 (en) | Distributed data set encryption and decryption | |
Liu et al. | Popularity-aware multi-failure resilient and cost-effective replication for high data durability in cloud storage | |
Henze et al. | Practical data compliance for cloud storage | |
CN109451069B (en) | Network data file library storage and query method based on distributed storage | |
US20080270483A1 (en) | Storage Management System | |
CN110019205A (en) | A kind of data storage, restoring method, device and computer equipment | |
TWI652621B (en) | Method and system for generating queue based applications dependencies in virtual machines | |
JP5957965B2 (en) | Virtualization system, load balancing apparatus, load balancing method, and load balancing program | |
CN109525662A (en) | Method for setting replicas of hot content | |
CN104219163A (en) | Load balancing method for node dynamic forward based on dynamic replication method and virtual node method | |
Huang et al. | Optimizing data partition for scaling out NoSQL cluster | |
US9940342B2 (en) | Stability measurement for federation engine | |
CN109246250A (en) | The method for adjusting dummy node quantity according to the change of number of servers | |
JP6259408B2 (en) | Distributed processing system | |
Ying et al. | Consistent hashing algorithm based on slice in improving Scrapy-Redis distributed crawler efficiency | |
Sun et al. | RPCC: A Replica Placement Method to Alleviate the Replica Consistency under Dynamic Cloud | |
Tech et al. | A view on load balancing of NoSQL databases (Couchbase, Cassandra, Neo4j and Voldemort) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20190326 |