CN103905503A - Data storage method, data scheduling method, device and system - Google Patents

Data storage method, data scheduling method, device and system

Info

Publication number
CN103905503A
Authority
CN
China
Prior art keywords
cache node
data
node
hash bucket
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210581492.9A
Other languages
Chinese (zh)
Other versions
CN103905503B (en)
Inventor
梁智超
钱岭
周大
孙少陵
Current Assignee
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201210581492.9A priority Critical patent/CN103905503B/en
Publication of CN103905503A publication Critical patent/CN103905503A/en
Application granted granted Critical
Publication of CN103905503B publication Critical patent/CN103905503B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a data storage method, a data scheduling method, a device, and a system. In the data scheduling method, the server of a distributed cache system obtains memory usage information from each cache node. Based on this information, the server determines a source cache node that needs data migration, then determines which hash bucket on the source cache node should be migrated and a target cache node able to accommodate the data in that hash bucket. The server sends a data scheduling instruction to the source cache node, instructing it to migrate the data in the selected hash bucket to the target cache node, and sends a mapping-update instruction to the client that owns the migrated hash bucket, instructing the client to update its mapping between hash buckets and cache nodes according to this migration.

Description

Data access method, scheduling method, device, and system
Technical field
The present invention relates to the field of communication network technology, and in particular to a data access method, a data scheduling method, a device, and a system.
Background technology
In the Web 2.0 era, most Internet applications store their data in relational databases, from which clients read data. However, as data volume and access volume grow, a series of problems appear: increased database load, performance degradation, slow responses, and delayed page rendering. Against this background, memory-based cache servers emerged.
Conventionally, existing data access schemes that support multiple clients are implemented with consistent hashing. Consistent hashing projects the hash values of the cache nodes and of the data keys onto a ring, realizing distributed storage of data. To decide which cache node a piece of data belongs to, the algorithm starts from the data's position on the ring and traverses the ring clockwise; the first cache node encountered is the node that owns the data. To avoid uneven data distribution across cache nodes, consistent hashing replicates each physical cache node into several virtual cache nodes, diluting the data over the ring. Virtual cache nodes are only responsible for partitioning the data; the data assigned to a virtual node is still stored on its physical node.
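The prior-art lookup described above can be sketched as follows. This is a minimal background illustration, not the patent's own method; the hash function (truncated MD5), the replica count, and all names are assumptions:

```python
import bisect
import hashlib

def ring_pos(s: str) -> int:
    """Position on the ring (illustrative: first 8 hex digits of MD5)."""
    return int(hashlib.md5(s.encode("utf-8")).hexdigest()[:8], 16)

class ConsistentHashRing:
    """Each physical node is copied into `replicas` virtual nodes on the
    ring; a key belongs to the first virtual node clockwise from it."""

    def __init__(self, nodes, replicas=3):
        # (position, physical node) pairs, sorted by ring position
        self.ring = sorted((ring_pos(f"{n}#{i}"), n)
                           for n in nodes for i in range(replicas))
        self.positions = [p for p, _ in self.ring]

    def node_for(self, key: str) -> str:
        # Position the key on the ring, walk clockwise to the owning
        # virtual node, and return its physical node.
        i = bisect.bisect_left(self.positions, ring_pos(key)) % len(self.ring)
        return self.ring[i][1]
```

Note that answering a lookup requires all three steps criticized below: positioning the key, finding the owning virtual node, and resolving it to a physical node.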
However, the virtual-cache-node scheme of consistent hashing has the following shortcomings:
1. After virtual cache nodes are introduced, accessing the cache node that owns a piece of data takes at least three steps: a. compute the data's position on the ring; b. find the virtual cache node that owns the data from that position; c. find the physical cache node corresponding to the virtual node. The introduction of virtual cache nodes thus makes the data access process more complex and reduces data access performance.
2. In a multi-client environment, the size of a single data item may differ between clients. Even if each cache node caches the same number of items, the total data sizes differ: some cache nodes occupy more memory and others less. The problem of uneven data distribution is thus not only left unsolved but may even be amplified, which seriously degrades the memory utilization of the cache server nodes.
Therefore, a new data access scheme and data scheduling scheme are urgently needed to solve the above technical problems.
Summary of the invention
The embodiments of the present invention provide a data access method and device, in order to reduce the complexity of data access and improve data access performance.
To achieve these goals, the embodiments of the present invention adopt the following technical means:
An embodiment of the present invention provides a data access method applied to a distributed cache system, the method comprising:
the client device of the distributed cache system determining, according to the data key of the requested access, the hash bucket corresponding to the data key, and determining, according to the mapping between hash buckets and cache nodes, the cache node corresponding to the hash bucket, wherein each hash bucket corresponds to exactly one cache node;
the client device initiating a data access request to the cache node corresponding to the hash bucket, so as to request the cache node to perform access processing according to the data access request.
An embodiment of the present invention also provides a client device applied to a distributed cache system, comprising:
a determination module, configured to determine, according to the data key of the requested access, the hash bucket corresponding to the data key, and to determine, according to the mapping between hash buckets and cache nodes, the cache node corresponding to the hash bucket, wherein each hash bucket corresponds to exactly one cache node;
an access request module, configured to initiate a data access request to the cache node corresponding to the hash bucket, so as to request the cache node to perform access processing according to the data access request.
Compared with the prior art, the above embodiments of the present invention have the following beneficial technical effects:
In the data access scheme provided by the embodiments of the present invention, the client device determines the hash bucket corresponding to the data key of the requested access, and then determines the cache node corresponding to that hash bucket from the mapping between hash buckets and cache nodes. The data storage location can be determined through these two layers of mapping alone, which simplifies the data access process, increases data processing speed, and improves data access performance.
The embodiments of the present invention also provide a data scheduling method, device, and system, in order to achieve an even distribution of data and improve the memory utilization of cache nodes.
To achieve these goals, the embodiments of the present invention adopt the following technical means:
An embodiment of the present invention provides a data scheduling method based on the foregoing data access method, the method comprising:
the server of the distributed cache system obtaining memory usage information from each cache node;
the server determining, according to the memory usage information of each cache node, a source cache node that needs data migration, and, after the source cache node is determined, determining the hash bucket on the source cache node that needs to be migrated and a target cache node able to accommodate the data in that hash bucket;
the server sending a data scheduling instruction to the source cache node, to instruct the source cache node to migrate the data in the hash bucket to the target cache node;
the server sending a mapping-update instruction to the client device that owns the migrated hash bucket, to instruct the client to update the mapping between hash buckets and cache nodes according to this data migration.
An embodiment of the present invention also provides a server applied to a distributed cache system, comprising:
an acquisition module, configured to obtain memory usage information from each cache node;
a decision module, configured to determine, according to the memory usage information obtained by the acquisition module, a source cache node that needs data migration, and, after the source cache node is determined, to determine the hash bucket on the source cache node that needs to be migrated and a target cache node able to accommodate the data in that hash bucket;
a scheduling module, configured to send a data scheduling instruction to the source cache node, to instruct the source cache node to migrate the data in the hash bucket to the target cache node;
an update indication module, configured to send a mapping-update instruction to the client device that owns the migrated hash bucket, to instruct the client to update the mapping between hash buckets and cache nodes according to this data migration.
An embodiment of the present invention also provides a distributed cache system, comprising: the foregoing client device, the foregoing server, and at least two cache nodes, wherein
each cache node is configured to receive data access requests initiated by client devices and perform access processing on them; and to receive data scheduling instructions sent by the server and, according to such an instruction, migrate the data in the hash bucket to be migrated to the target cache node.
Compared with the prior art, the above embodiments of the present invention have the following beneficial technical effects:
In the data scheduling scheme provided by the embodiments of the present invention, the server determines the source cache node that needs data scheduling and controls the source cache node to migrate data, in units of hash buckets, to other cache nodes, so that data is evenly distributed across the cache nodes and memory utilization is improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system architecture of an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the data access process provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the data scheduling process provided by an embodiment of the present invention;
Figs. 4a-4d are schematic diagrams of the binary-tree data scheduling algorithm of an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the client device provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the server provided by an embodiment of the present invention.
Detailed description of the embodiments
To address the above problems of the prior art, the embodiments of the present invention provide a data access scheme and a data scheduling scheme, described in detail below in conjunction with the accompanying drawings.
Fig. 1 shows the distributed cache system architecture provided by an embodiment of the present invention; this cache system supports data access requests from multiple clients. The architecture comprises: client devices 11 (there may be several; only one is shown in the figure), a server 12, and at least two cache nodes 13. The client device 11 performs data access; the server 12 monitors and schedules the data distribution across the cache nodes 13; each cache node 13 caches data in memory and responds to the data access requests of the client devices 11.
The distributed cache system of the embodiment divides the data space into multiple hash buckets; data is stored on the cache nodes in units of hash buckets, and the number of hash buckets can be set according to the performance and traffic requirements of the distributed cache system. A hash bucket is a logical storage unit; the data is actually stored on the cache nodes. The data in one hash bucket cannot be stored across nodes; that is, a hash bucket corresponds to exactly one cache node, while one cache node may correspond to multiple hash buckets.
When a client device joins the distributed cache system, it requests hash buckets from the system; that is, the client device initiates a memory allocation request to the server. According to the number of hash buckets the client needs, carried in the request, the server allocates the corresponding number of hash buckets to the client, establishes the correspondence between hash buckets and cache nodes to generate a mapping table, and sends the mapping table to the client device. Preferably, the server also stores the mapping table locally. The structure of the mapping table is shown in Table 1:
Table 1
In Table 1, the number for a cache node is the node's identifier (ID), which uniquely identifies the cache node within the distributed cache system; the number for a hash bucket is the bucket's identifier (ID), which uniquely identifies the hash bucket within the distributed cache system. Preferably, hash bucket IDs are assigned in increasing order starting from 1.
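The contents of Table 1 are not reproduced here, but the two-layer structure it describes can be illustrated with a hypothetical mapping. All IDs below are invented for illustration, except that bucket 3 → node 4 matches the worked "foo" example later in the text:

```python
# Hypothetical mapping table: each hash bucket ID maps to exactly one
# cache node ID; a cache node may own several buckets.
bucket_to_node = {1: 1, 2: 2, 3: 4, 4: 3, 5: 1, 6: 2, 7: 3, 8: 4}

def node_for_bucket(bucket_id: int) -> int:
    """Look up the cache node that stores the given hash bucket."""
    return bucket_to_node[bucket_id]

def buckets_on_node(node_id: int) -> list:
    """Reverse lookup: all buckets currently stored on one cache node."""
    return sorted(b for b, n in bucket_to_node.items() if n == node_id)
```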
The data access flow is described in detail below in conjunction with the above system architecture and Fig. 2. As shown in the figure, the flow comprises the following steps:
Steps 201-202: the client device receives a user's request to access data and, according to the data key of the requested access, determines the hash bucket corresponding to the data key.
In the embodiments of the present invention, "user" typically refers to an application on the client device.
In a distributed cache system, cached data is usually stored as key-value pairs (&lt;key, value&gt;), i.e., one key corresponds to one value; therefore, the user provides the data key when accessing data.
Specifically, the client device computes the hash value of the data key and takes it modulo the number of hash buckets allocated to the client device, thereby determining the hash bucket corresponding to the data key. Preferably, the client device uses the CRC32 (32-bit cyclic redundancy check) algorithm to compute the checksum of the data key, and takes the checksum modulo the number of hash buckets allocated to the client device; the resulting value is the hash bucket identifier corresponding to the data key. The way the client device computes the hash bucket identifier is not limited to the CRC32 algorithm.
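A minimal sketch of this step, assuming the CRC32 provided by Python's zlib (the patent does not fix a particular CRC32 variant, so the exact bucket numbers produced here are implementation-dependent):

```python
import zlib

def bucket_id(data_key: str, num_buckets: int) -> int:
    """Map a data key to a hash bucket: CRC32 checksum of the key,
    taken modulo the number of buckets allocated to this client."""
    return zlib.crc32(data_key.encode("utf-8")) % num_buckets
```

The same key always lands in the same bucket, and the result always falls within the client's allocated bucket range.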
Step 203: the client device queries the mapping table with the hash bucket identifier to determine the cache node that owns the data.
Step 204: the client device sends a data access request to that cache node, to request the cache node to perform access processing according to the data access request.
Specifically, the client device connects to the cache software running on the cache node (for example, Memcached) and sends it the data access request; after the cache node completes access processing, the cache software returns the result to the client device. If the client device sends a write request, the cache node writes the data into the corresponding hash bucket on the node according to the hash bucket identifier carried in the request; if the client device sends a read request, the cache node reads the data from the corresponding hash bucket on the node according to the hash bucket identifier carried in the request.
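The read/write dispatch on the cache node can be sketched as follows. The request fields and method names are illustrative assumptions, not Memcached's actual protocol:

```python
class CacheNode:
    """Toy cache node: stores key-value pairs grouped by hash bucket."""

    def __init__(self, node_id: int):
        self.node_id = node_id
        self.buckets = {}  # bucket_id -> {key: value}

    def handle(self, request: dict):
        # The hash bucket ID carried in the request selects the bucket.
        bucket = self.buckets.setdefault(request["bucket_id"], {})
        if request["op"] == "write":
            bucket[request["key"]] = request["value"]
            return "STORED"
        if request["op"] == "read":
            return bucket.get(request["key"])
        raise ValueError("unknown operation")
```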
Step 205: the client device receives the result returned by the cache node and returns it to the user.
The process by which the client device uses the hash-bucket mapping algorithm to determine the cache node that owns a piece of data is described below in conjunction with Table 1. Suppose the distributed cache system has 4 cache nodes and stores data for 3 clients, where client A and client B each occupy 4 hash buckets and client C occupies 2 hash buckets. If client A issues a read request, for example querying the value corresponding to the key "foo", client A's device computes the CRC32 checksum of "foo" and takes it modulo the number of hash buckets occupied by client A (which equals 4); the resulting value (say, 3) is the hash bucket identifier for the data with key "foo". Client A's device then looks up identifier 3 in Table 1 and determines that the cache node owning the data with key "foo" is cache node 4.
As the above access flow shows, the client device determines the hash bucket corresponding to the data key of the requested access, and then determines the cache node corresponding to that hash bucket from the mapping between hash buckets and cache nodes. The data storage location can be determined through these two layers of mapping alone, which simplifies the data access process, increases data processing speed, and improves data access performance.
Because the data in the distributed cache system is distributed across the cache nodes, and in order to keep the data stored on each cache node evenly distributed, the embodiments of the present invention also provide a data scheduling flow, described in detail below in conjunction with Fig. 3. As shown in the figure, the flow comprises the following steps:
Step 301: the server obtains memory usage information from each cache node.
Specifically, the server maintains the address information of each cache node. When a collection period arrives, or when a control command is received, the server traverses all cache nodes and collects their current memory usage: for example, how much memory is used for cached data and how much is free. In the embodiments of the present invention, the server collects the size of each cache node's free memory, quantized as the percentage of free memory relative to physical memory. For example, if a cache node has 16 GB of physical memory and 8 GB of free memory, its free-space size is 50 (i.e., 50%).
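The quantization in this step amounts to a one-line calculation, sketched here with the 16 GB / 8 GB → 50 example from the text:

```python
def free_space_pct(free_bytes: int, total_bytes: int) -> int:
    """Free memory quantized as an integer percentage of physical memory."""
    return round(100 * free_bytes / total_bytes)
```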
Step 302: the server builds a binary tree model from the memory usage information of the cache nodes.
Specifically, by building a binary tree model, the server effectively manages the memory space of the cache nodes in the distributed cache system. The binary tree model is built according to the following rules:
Each cache node is a leaf of the binary tree, and every two child nodes share one parent node. The weight of a leaf is the free memory size of its cache node (quantized, as above, as the percentage of the node's free space relative to physical memory); the weight of a parent is the larger of the weights of its two children; and the weight of the root of the binary tree is the largest free memory size among all cache nodes.
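Under these rules the model is a max-tournament tree over the cache nodes' free-space percentages. A sketch follows; the pairing order of leaves and all names are assumptions, since the patent does not specify how leaves are paired:

```python
class TreeNode:
    """Binary tree node; node_id is set only on leaves (cache nodes)."""

    def __init__(self, weight, node_id=None, left=None, right=None):
        self.weight = weight
        self.node_id = node_id
        self.left = left
        self.right = right

def build_tree(free_space):
    """Leaves are cache nodes weighted by free-space percentage; each
    parent takes the larger weight of its children, so the root carries
    the maximum free space among all cache nodes."""
    level = [TreeNode(w, node_id=n) for n, w in sorted(free_space.items())]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                l, r = level[i], level[i + 1]
                nxt.append(TreeNode(max(l.weight, r.weight), left=l, right=r))
            else:
                nxt.append(level[i])  # odd leftover node carried up unchanged
        level = nxt
    return level[0]
```

With the four nodes of the worked example later in the text (46, 23, 57, 18), the root weight is 57, matching Fig. 4's construction.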
Step 303: the server judges, from the memory usage information of each cache node, whether there is a source cache node that needs scheduling; if so, step 304 is performed.
Specifically, the server compares each cache node's memory usage information with a predetermined threshold. If some cache node's free-memory percentage is below the threshold, the server determines that this node needs data migration and performs step 304. If every cache node's free-memory percentage is at or above the threshold, the current data distribution across the cache nodes is sufficiently even and reasonable, and no data scheduling is needed.
If the server decides that no data scheduling is needed, it can end this scheduling flow. Preferably, the server then enters a sleep state for a period of time. The sleep time may be configurable on the server; preferably, it is greater than or equal to the period with which the memory usage information of the cache nodes is collected in step 301.
Step 304: the server selects the hash bucket(s) to migrate from the determined source cache node.
Specifically, the server may select at least one hash bucket on the source cache node at random; or it may select at least one hash bucket on the source cache node in descending order of the buckets' memory occupancy; or it may select at least one hash bucket on the source cache node in ascending order of the buckets' memory occupancy.
Step 305: the server determines a target cache node able to accommodate the data in the hash bucket to be migrated; if a target cache node is found, step 306 is performed. Preferably, if no target cache node is found, an alarm is raised.
Specifically, the server determines the target cache node able to accommodate the data in the hash bucket to be migrated from the binary tree model as follows:
Step a: take the root of the binary tree model as the starting node;
Step b: judge whether, among the children of the starting node, there is a node whose free-memory percentage is greater than the data size of the hash bucket to be migrated, with the difference between the two greater than the predetermined threshold; if so, perform step c; otherwise, end the flow;
Step c: if the free-memory percentages of both children of the starting node are greater than the data size of the hash bucket to be migrated, with both differences greater than the threshold, select the child with the larger weight; if only one of the two children satisfies the condition, select that child;
Step d: judge whether the selected child is a leaf of the binary tree; if so, determine the selected child to be the target cache node able to accommodate the data in the hash bucket; otherwise, perform step e;
Step e: take the selected child as the new starting node and perform step b.
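Steps a-e above can be sketched as a descent from the root. The block is self-contained (its own minimal tree shape); names and the tuple layout are assumptions:

```python
from collections import namedtuple

# node_id is None for internal nodes; leaves carry a cache node ID.
T = namedtuple("T", "weight node_id left right")

def leaf(weight, node_id):
    return T(weight, node_id, None, None)

def parent(l, r):
    return T(max(l.weight, r.weight), None, l, r)

def find_target(root, bucket_size, threshold):
    """Descend from the root, at each step choosing a child whose weight
    exceeds the bucket's data size by more than the threshold (preferring
    the larger weight); return the leaf's cache node ID, or None."""
    node = root
    while node.node_id is None:
        ok = [c for c in (node.left, node.right)
              if c is not None and c.weight - bucket_size > threshold]
        if not ok:
            return None  # no node can hold the bucket: raise an alarm
        node = max(ok, key=lambda c: c.weight)
    return node.node_id
```

With the tree of the worked example (weights 46, 23, 57, 18), a bucket of size 10 and threshold 20 yield cache node 3 as the target, matching Figs. 4a-4d.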
If the server fails in this way to find a qualifying leaf in the binary tree model, the distributed cache system currently has no cache node that can serve as the target cache node (no node has enough free memory to accommodate the data in the hash bucket to be migrated). The data stored on the cache nodes has then approached the maximum amount the cache servers can carry, and the server raises an alarm. The alarm may take many forms, for example an email alert, an SMS alert, or a printed alarm log, to prompt the operations staff to expand the cache servers in time.
Step 306: the server sends a data scheduling instruction to the source cache node, to instruct the source cache node to migrate the data in the hash bucket to be migrated to the target cache node.
Specifically, after receiving the data scheduling instruction from the server, the source cache node migrates the data in the hash bucket to the target cache node according to the hash bucket identifier and the target cache node identifier carried in the instruction, and returns a data scheduling success response to the server. Preferably, the source cache node traverses the data of the hash bucket to be migrated and pushes it to the target cache node using multiple-set (batched set) operations, so as to make full use of the network bandwidth and reduce the number of transmissions.
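The batched "multiple set" transfer can be sketched as follows. The batch size and the callback shape are assumptions; the patent only says the bucket's data is traversed and pushed with multiple-set operations:

```python
def migrate_bucket(source_buckets, bucket_id, send_multi_set, batch_size=100):
    """Traverse the bucket's key-value pairs, push them to the target in
    batched multi-set calls, then drop the bucket from the source."""
    items = list(source_buckets[bucket_id].items())
    for i in range(0, len(items), batch_size):
        send_multi_set(bucket_id, items[i:i + batch_size])
    del source_buckets[bucket_id]
```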
Further, after the target cache node has stored the data migrated from the source cache node's hash bucket locally, it returns a data scheduling success response to the server. Upon receiving the data scheduling success response returned by the source cache node, the server updates its local mapping table (if the table is stored locally) and sends a mapping-update instruction to the client device that owns the migrated hash bucket, so that the client device updates, in its mapping table, the cache node identifier corresponding to the migrated hash bucket to the target cache node identifier, thereby updating the mapping table on the client device.
Because the correspondence between hash buckets and cache nodes is maintained in the form of a mapping table, migrating the data of a hash bucket changes this correspondence; the mapping table on the client must therefore be updated promptly, to ensure that subsequent data access requests are routed to the correct cache node.
Preferably, upon receiving the data scheduling success response returned by the source cache node, the server also updates its local binary tree model.
Further, when only one hash bucket is selected on the source cache node in step 304, migrating a single bucket's data does not necessarily raise the node's free-space percentage to the predetermined threshold. Therefore, to keep the data on the cache nodes evenly distributed, after the migration of one hash bucket completes, the server may also perform the following steps:
The server judges whether the source cache node's free-memory percentage after the data migration is still below the predetermined threshold. If it is not, no further data migration is needed and the flow ends; otherwise, the source cache node still needs further data migration, so step 304 and its subsequent steps are performed again.
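The repeat-until-balanced loop of steps 304-306 can be sketched end to end. Two simplifying assumptions: the target is chosen as a simple argmax over free space rather than by the tree descent, and a bucket's data size is treated directly as percentage points of memory:

```python
def rebalance(free_pct, node_buckets, bucket_size, threshold=20):
    """Migrate buckets off any node whose free-space percentage is below
    the threshold, until the node recovers or no target qualifies."""
    moves = []
    for src in list(free_pct):
        while free_pct[src] < threshold and node_buckets.get(src):
            bucket = node_buckets[src][0]
            size = bucket_size[bucket]
            # Target: the other node with the most free space.
            tgt = max((n for n in free_pct if n != src),
                      key=lambda n: free_pct[n])
            if free_pct[tgt] - size <= threshold:
                break  # no node can absorb the bucket: raise an alarm
            node_buckets[src].pop(0)
            node_buckets.setdefault(tgt, []).append(bucket)
            free_pct[src] += size
            free_pct[tgt] -= size
            moves.append((bucket, src, tgt))
    return moves
```

On the worked example below (free space 46/23/57/18, one bucket of size 10 on node 4), this reproduces the migration to node 3 and the updated weights 28 and 47.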
As the above flow shows, the server determines the source cache node that needs data scheduling and controls it to migrate data, in units of hash buckets, to other cache nodes, so that data is evenly distributed across the cache nodes and memory utilization is improved.
To clearly illustrate how the server performs cache-node data scheduling with the binary-tree resource scheduling algorithm, a concrete example is given below in conjunction with Figs. 4a-4d. Suppose the distributed cache system comprises 4 cache nodes, with free-memory percentages of 46 for cache node 1, 23 for cache node 2, 57 for cache node 3, and 18 for cache node 4.
The server builds the binary tree model: the 4 cache nodes are the leaves; cache node 1 (weight 46) and cache node 2 (weight 23) share a parent node (weight 46); cache node 3 (weight 57) and cache node 4 (weight 18) share a parent node (weight 57); the nodes with weights 46 and 57 share a parent node (weight 57), which is the root of the binary tree.
The server searches the binary tree model for nodes whose free-memory percentage is below the predetermined threshold 20, determines that cache node 4 is a source cache node that needs data scheduling, and selects a hash bucket on cache node 4 (with data size 10, say).
Starting from the root of the binary tree (weight 57), the server judges that the weights of the root's two children (46 and 57) both exceed 10, and that both differences from the bucket's data size 10 exceed the predetermined threshold 20, so it selects the child with the larger weight (the node with weight 57). Among that node's children (the nodes with weights 57 and 18), only the node with weight 57 has a weight exceeding the bucket's data size 10 by more than the threshold 20. Since this node is already a leaf (cache node 3), the target cache node is determined to be cache node 3 (weight 57).
After source cache node 4 migrates the data in the selected hash bucket to target cache node 3, the binary tree model is updated: the weight of source cache node 4 becomes 28, and the weight of target cache node 3 becomes 47.
It should be noted that the target cache node need not be determined with the above binary-tree resource scheduling algorithm; for example, an existing memory-space scheduling algorithm may also be used.
Based on identical technical conceive, the embodiment of the present invention also provides a kind of client device, and this client device is applied to distributed cache system, and as shown in Figure 5, this client comprises:
Determination module 51, for according to the data major key of request access, determines the Hash bucket that described data major key is corresponding, and according to the mapping relations of Hash bucket and cache node, determines the cache node of described Hash bucket correspondence; Wherein, a unique corresponding cache node of Hash bucket.
Access request module 52, initiates data access request for the cache node to described Hash bucket correspondence, to ask described cache node to carry out access processing according to described data access request.
Concrete, determination module 51 specifically for, calculate the cryptographic Hash of described data major key, and by by the cryptographic Hash calculating, the Hash barrelage amount of distributing for described client device carried out to modulo operation, determine the Hash bucket that described data major key is corresponding.
The client device further comprises an access module 53 and a storage module 54. Access module 53 is configured to request the distributed cache system to allocate hash buckets when accessing the distributed cache system, and to receive the information of the hash buckets allocated to the client device by the distributed cache system together with the mapping relationship information between those hash buckets and cache nodes.
Storage module 54 is configured to store the mapping relationship information between the hash buckets and cache nodes.
The client device further comprises an update module 55, configured to, after receiving a mapping relationship update instruction sent by the server, update the mapping relationship between hash buckets and cache nodes according to the instruction; the mapping relationship update instruction is sent by the server after it instructs the source cache node to migrate the data in the hash bucket to be migrated to the target cache node.
Based on the same technical concept, an embodiment of the present invention further provides a server applied to a distributed cache system. As shown in Figure 6, the server comprises:
Acquisition module 61, configured to obtain the memory usage information of each cache node.
Decision module 62, configured to determine, according to the memory usage information of each cache node obtained by acquisition module 61, the source cache node that needs to perform data migration, and, after the source cache node is determined, to determine the hash bucket to be migrated on the source cache node and a target cache node able to accommodate the data in that hash bucket.
Scheduling module 63, configured to send a data scheduling instruction to the source cache node, so as to instruct the source cache node to migrate the data in the hash bucket to be migrated to the target cache node.
Update indication module 64, configured to send a mapping relationship update instruction to the client device to which the migrated hash bucket belongs, so as to instruct the client device to update the mapping relationship between hash buckets and cache nodes according to this data migration operation.
Decision module 62 is specifically configured to compare the memory usage information of each cache node with a predetermined threshold; if the free memory space ratio of a cache node is lower than the predetermined threshold, it determines that the cache node needs to perform data migration.
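A minimal sketch of this threshold comparison, with hypothetical node names and the free memory space ratio expressed as a plain number:

```python
def select_source_nodes(free_ratio_by_node, threshold):
    """Return the cache nodes whose free memory space ratio is below the
    predetermined threshold; these become source nodes for data migration."""
    return [node for node, ratio in free_ratio_by_node.items() if ratio < threshold]

# With the figures of the worked example (threshold 20), only cache node 4 qualifies.
sources = select_source_nodes({"node1": 46, "node2": 30, "node3": 57, "node4": 18}, 20)
print(sources)  # ['node4']
```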
The server further comprises a binary tree model maintenance module 65, configured to create and maintain a binary tree model according to the memory usage information of each cache node; wherein, in the binary tree model, the leaf nodes are the cache nodes, every two child nodes share one parent node, the weight of a leaf node is the free memory space ratio of the corresponding cache node, and the weight of a parent node equals the larger weight of its two child nodes.
Decision module 62 is specifically configured to determine the target cache node able to accommodate the data in the hash bucket in the following manner:
Step a: take the root node of the binary tree model as the start node;
Step b: judge whether there exists, among the child nodes of the start node, a node whose free memory space ratio is greater than the data capacity of the hash bucket to be migrated, with the difference between the two greater than a predetermined threshold; if so, perform step c; otherwise, end the process;
Step c: if the free memory space ratios of both child nodes of the start node are greater than the data capacity of the hash bucket to be migrated, with the differences greater than the predetermined threshold, select the child node with the larger weight; if only one of the two child nodes of the start node has a free memory space ratio greater than the data capacity of the hash bucket to be migrated, with the difference greater than the predetermined threshold, select that child node;
Step d: judge whether the selected child node is a leaf node of the binary tree; if so, determine the selected child node as the target cache node able to accommodate the data in the hash bucket; otherwise, perform step e;
Step e: take the selected child node as the start node, and return to step b.
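Steps a through e can be sketched as follows. The class and function names are assumptions for illustration only; each parent's weight is maintained as the larger of its children's weights, per the binary tree model, and the tree builder assumes a power-of-two number of cache nodes for brevity.

```python
class Node:
    def __init__(self, weight, name=None, left=None, right=None):
        self.weight = weight   # free memory space ratio (or its propagated max)
        self.name = name       # set for leaf nodes (cache nodes) only
        self.left = left
        self.right = right

    @property
    def is_leaf(self):
        return self.left is None and self.right is None

def build_tree(leaves):
    """Pair nodes level by level; a parent's weight is the max of its children's."""
    level = [Node(w, name=n) for n, w in leaves]
    while len(level) > 1:
        level = [Node(max(a.weight, b.weight), left=a, right=b)
                 for a, b in zip(level[::2], level[1::2])]
    return level[0]

def find_target(root, capacity, threshold):
    """Steps a-e: descend from the root toward a leaf with enough free space."""
    node = root                                              # step a
    while True:
        ok = [c for c in (node.left, node.right)             # step b
              if c is not None
              and c.weight > capacity
              and c.weight - capacity > threshold]
        if not ok:
            return None                                      # no node can accommodate
        chosen = max(ok, key=lambda c: c.weight)             # step c
        if chosen.is_leaf:                                   # step d
            return chosen
        node = chosen                                        # step e

# Worked example from the description: bucket capacity 10, threshold 20;
# the descent ends at cache node 3 (weight 57).
root = build_tree([("node1", 46), ("node2", 30), ("node3", 57), ("node4", 18)])
target = find_target(root, capacity=10, threshold=20)
print(target.name, target.weight)  # node3 57
```

Because each internal weight is the maximum over its subtree, a single root-to-leaf descent of O(log n) comparisons suffices to locate a node with enough free space, rather than scanning all cache nodes.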
Decision module 62 is specifically configured to randomly select at least one hash bucket in the source cache node; or to select at least one hash bucket in the source cache node in descending order of free memory space ratio; or to select at least one hash bucket in the source cache node in ascending order of free memory space ratio.
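The three selection strategies can be sketched as follows; the per-bucket figure used for the ordering is an assumption here (the bucket's data size), since the description does not spell out how the ordering figure is measured per bucket.

```python
import random

def pick_buckets(bucket_sizes, strategy, k=1):
    """Select k hash buckets to migrate from the source cache node.
    bucket_sizes maps bucket id -> a per-bucket ordering figure
    (assumed here to be the bucket's data size)."""
    if strategy == "random":
        return random.sample(list(bucket_sizes), k)
    descending = (strategy == "desc")      # largest-first, else smallest-first
    return sorted(bucket_sizes, key=bucket_sizes.get, reverse=descending)[:k]

sizes = {"b1": 10, "b2": 5, "b3": 20}
print(pick_buckets(sizes, "desc"))   # ['b3']
print(pick_buckets(sizes, "asc"))    # ['b2']
```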
Based on the same technical concept, an embodiment of the present invention further provides a distributed cache system. As shown in Figure 1, the distributed cache system comprises the aforementioned client device 11, the aforementioned server 12, and at least two cache nodes 13, wherein:
Cache node 13 is configured to receive the data access request initiated by client device 11 and perform access processing on the data access request; and to receive the data scheduling instruction sent by server 12 and, according to the data scheduling instruction, migrate the data in the hash bucket to be migrated to the target cache node.
Cache node 13 is further configured to return a data scheduling success response to server 12 after the data in a hash bucket has been migrated to it.
Server 12 is further configured to, after receiving the data scheduling success response returned by the target cache node, send the mapping relationship update instruction to the client device 11 to which the migrated hash bucket belongs.
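Putting the exchange between server, source node, target node, and client together, the flow can be sketched as follows. All message shapes and names here are assumptions; the patent only names the data scheduling instruction, the success response, and the mapping relationship update instruction.

```python
from collections import defaultdict

class Bus:
    """Minimal in-memory stand-in for the network between the parties (assumed)."""
    def __init__(self):
        self.inbox = defaultdict(list)
    def send(self, dest, msg):
        self.inbox[dest].append(msg)

def schedule_migration(bus, source, target, bucket_id, bucket_owners):
    # server -> source: data scheduling instruction for the bucket to migrate
    bus.send(source, {"type": "data_schedule", "bucket": bucket_id, "target": target})
    # target -> server: data scheduling success response once the data is local
    # (simulated here by an immediate acknowledgement)
    bus.send("server", {"type": "schedule_success", "bucket": bucket_id, "from": target})
    # server -> owning clients: mapping relationship update instruction
    for client in bucket_owners.get(bucket_id, []):
        bus.send(client, {"type": "mapping_update", "bucket": bucket_id, "node": target})

bus = Bus()
schedule_migration(bus, "node4", "node3", 7, {7: ["client-A"]})
print(bus.inbox["client-A"][0]["node"])  # node3
```

Deferring the mapping update until after the success response keeps clients from routing requests to a bucket whose data has not yet arrived at the target node.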
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the essence of the technical solution of the present invention, or the part contributing to the prior art, may be embodied in the form of a software product, which is stored in a storage medium and comprises several instructions for enabling a terminal device (which may be a mobile phone, a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The above are merely preferred embodiments of the present invention. It should be pointed out that those skilled in the art may make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (17)

1. A data access method, applied to a distributed cache system, characterized by comprising:
a client device of the distributed cache system determining, according to the data primary key of a requested access, the hash bucket corresponding to the data primary key, and determining, according to the mapping relationship between hash buckets and cache nodes, the cache node corresponding to the hash bucket; wherein each hash bucket uniquely corresponds to one cache node;
the client device initiating a data access request to the cache node corresponding to the hash bucket, so as to request the cache node to perform access processing according to the data access request.
2. The method according to claim 1, characterized in that the client device determining, according to the data primary key of the requested access, the hash bucket corresponding to the data primary key specifically comprises:
the client device calculating the hash value of the data primary key, and determining the hash bucket corresponding to the data primary key by performing a modulo operation of the calculated hash value against the number of hash buckets allocated to the client device.
3. The method according to claim 1, characterized in that the method further comprises:
the client device requesting the distributed cache system to allocate hash buckets when accessing the distributed cache system;
the client device receiving and storing the information of the hash buckets allocated to it by the distributed cache system, and the mapping relationship information between those hash buckets and cache nodes.
4. A data scheduling method implemented on the basis of the data access method according to any one of claims 1-3, characterized in that the method comprises:
a server of the distributed cache system obtaining the memory usage information of each cache node;
the server determining, according to the memory usage information of each cache node, the source cache node that needs to perform data migration, and, after the source cache node is determined, determining the hash bucket to be migrated on the source cache node and a target cache node able to accommodate the data in that hash bucket;
the server sending a data scheduling instruction to the source cache node, so as to instruct the source cache node to migrate the data in the hash bucket to be migrated to the target cache node;
the server sending a mapping relationship update instruction to the client device to which the migrated hash bucket belongs, so as to instruct the client device to update the mapping relationship between hash buckets and cache nodes according to this data migration operation.
5. The method according to claim 4, characterized in that the server determining, according to the memory usage information of each cache node, the source cache node that needs to perform data migration specifically comprises:
the server comparing the memory usage information of each cache node with a predetermined threshold, and, if the free memory space ratio of a cache node is lower than the predetermined threshold, determining that the cache node needs to perform data migration.
6. The method according to claim 4, characterized in that, after obtaining the memory usage information of each cache node, the server further creates a binary tree model according to the memory usage information of each cache node; wherein, in the binary tree model, the leaf nodes are the cache nodes, every two child nodes share one parent node, the weight of a leaf node is the free memory space ratio of the corresponding cache node, and the weight of a parent node equals the larger weight of its two child nodes;
the server determining the target cache node able to accommodate the data in the hash bucket specifically comprises:
step a: taking the root node of the binary tree model as the start node;
step b: judging whether there exists, among the child nodes of the start node, a node whose free memory space ratio is greater than the data capacity of the hash bucket to be migrated, with the difference between the two greater than a predetermined threshold; if so, performing step c; otherwise, ending the process;
step c: if the free memory space ratios of both child nodes of the start node are greater than the data capacity of the hash bucket to be migrated, with the differences greater than the predetermined threshold, selecting the child node with the larger weight; if only one of the two child nodes of the start node has a free memory space ratio greater than the data capacity of the hash bucket to be migrated, with the difference greater than the predetermined threshold, selecting that child node;
step d: judging whether the selected child node is a leaf node of the binary tree; if so, determining the selected child node as the target cache node able to accommodate the data in the hash bucket; otherwise, performing step e;
step e: taking the selected child node as the start node, and returning to step b.
7. The method according to claim 4, characterized in that the process of determining the hash bucket to be migrated on the source cache node specifically comprises:
randomly selecting at least one hash bucket in the source cache node; or
selecting at least one hash bucket in the source cache node in descending order of free memory space ratio; or
selecting at least one hash bucket in the source cache node in ascending order of free memory space ratio.
8. A client device, applied to a distributed cache system, characterized by comprising:
a determination module, configured to determine, according to the data primary key of a requested access, the hash bucket corresponding to the data primary key, and to determine, according to the mapping relationship between hash buckets and cache nodes, the cache node corresponding to the hash bucket; wherein each hash bucket uniquely corresponds to one cache node;
an access request module, configured to initiate a data access request to the cache node corresponding to the hash bucket, so as to request the cache node to perform access processing according to the data access request.
9. The client device according to claim 8, characterized in that the determination module is specifically configured to calculate the hash value of the data primary key, and to determine the hash bucket corresponding to the data primary key by performing a modulo operation of the calculated hash value against the number of hash buckets allocated to the client device.
10. The client device according to claim 8, characterized by further comprising:
an access module, configured to request the distributed cache system to allocate hash buckets when accessing the distributed cache system, and to receive the information of the hash buckets allocated to the client device by the distributed cache system together with the mapping relationship information between those hash buckets and cache nodes;
a storage module, configured to store the mapping relationship information between the hash buckets and cache nodes.
11. The client device according to claim 8, characterized by further comprising:
an update module, configured to, after receiving a mapping relationship update instruction sent by the server, update the mapping relationship between hash buckets and cache nodes according to the mapping relationship update instruction; the mapping relationship update instruction being sent by the server after it instructs the source cache node to migrate the data in the hash bucket to be migrated to the target cache node.
12. A server, applied to a distributed cache system, characterized in that the server comprises:
an acquisition module, configured to obtain the memory usage information of each cache node;
a decision module, configured to determine, according to the memory usage information of each cache node obtained by the acquisition module, the source cache node that needs to perform data migration, and, after the source cache node is determined, to determine the hash bucket to be migrated on the source cache node and a target cache node able to accommodate the data in that hash bucket;
a scheduling module, configured to send a data scheduling instruction to the source cache node, so as to instruct the source cache node to migrate the data in the hash bucket to be migrated to the target cache node;
an update indication module, configured to send a mapping relationship update instruction to the client device to which the migrated hash bucket belongs, so as to instruct the client device to update the mapping relationship between hash buckets and cache nodes according to this data migration operation.
13. The server according to claim 12, characterized in that the decision module is specifically configured to compare the memory usage information of each cache node with a predetermined threshold, and, if the free memory space ratio of a cache node is lower than the predetermined threshold, to determine that the cache node needs to perform data migration.
14. The server according to claim 12, characterized by further comprising a binary tree model maintenance module;
the binary tree model maintenance module is configured to create and maintain a binary tree model according to the memory usage information of each cache node; wherein, in the binary tree model, the leaf nodes are the cache nodes, every two child nodes share one parent node, the weight of a leaf node is the free memory space ratio of the corresponding cache node, and the weight of a parent node equals the larger weight of its two child nodes;
the decision module is specifically configured to determine the target cache node able to accommodate the data in the hash bucket in the following manner:
step a: taking the root node of the binary tree model as the start node;
step b: judging whether there exists, among the child nodes of the start node, a node whose free memory space ratio is greater than the data capacity of the hash bucket to be migrated, with the difference between the two greater than a predetermined threshold; if so, performing step c; otherwise, ending the process;
step c: if the free memory space ratios of both child nodes of the start node are greater than the data capacity of the hash bucket to be migrated, with the differences greater than the predetermined threshold, selecting the child node with the larger weight; if only one of the two child nodes of the start node has a free memory space ratio greater than the data capacity of the hash bucket to be migrated, with the difference greater than the predetermined threshold, selecting that child node;
step d: judging whether the selected child node is a leaf node of the binary tree; if so, determining the selected child node as the target cache node able to accommodate the data in the hash bucket; otherwise, performing step e;
step e: taking the selected child node as the start node, and returning to step b.
15. The server according to claim 12, characterized in that the decision module is specifically configured to randomly select at least one hash bucket in the source cache node; or to select at least one hash bucket in the source cache node in descending order of free memory space ratio; or to select at least one hash bucket in the source cache node in ascending order of free memory space ratio.
16. A distributed cache system, comprising the client device according to any one of claims 9-11, the server according to any one of claims 12-15, and at least two cache nodes, characterized in that:
the cache node is configured to receive the data access request initiated by the client device and perform access processing according to the access request; and to receive the data scheduling instruction sent by the server and, according to the data scheduling instruction, migrate the data in the hash bucket to be migrated to the target cache node.
17. The system according to claim 16, characterized in that the cache node is further configured to return a data scheduling success response to the server after the data in a hash bucket has been migrated to it;
the server is further configured to, after receiving the data scheduling success response returned by the target cache node, send the mapping relationship update instruction to the client device to which the migrated hash bucket belongs.
CN201210581492.9A 2012-12-27 2012-12-27 Data access method, dispatching method, equipment and system Active CN103905503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210581492.9A CN103905503B (en) 2012-12-27 2012-12-27 Data access method, dispatching method, equipment and system


Publications (2)

Publication Number Publication Date
CN103905503A true CN103905503A (en) 2014-07-02
CN103905503B CN103905503B (en) 2017-09-26

Family

ID=50996658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210581492.9A Active CN103905503B (en) 2012-12-27 2012-12-27 Data access method, dispatching method, equipment and system

Country Status (1)

Country Link
CN (1) CN103905503B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138489A (en) * 2015-08-13 2015-12-09 东南大学 ID management unit for cache space of network data packages
CN105744001A (en) * 2016-04-11 2016-07-06 青岛海信传媒网络技术有限公司 Distributed Caching System Expanding Method, Data Access Method, and Device and System of the Same
CN106375425A (en) * 2016-08-30 2017-02-01 中国民生银行股份有限公司 Processing method and device for distributed caching
CN106411654A (en) * 2016-10-27 2017-02-15 任子行网络技术股份有限公司 Method and device for processing network traffic analysis
CN106484379A (en) * 2015-08-28 2017-03-08 华为技术有限公司 A kind of processing method and processing device of application
CN106776001A (en) * 2016-11-14 2017-05-31 天津南大通用数据技术股份有限公司 The location mode and device of a kind of distributed experiment & measurement system data
CN106973091A (en) * 2017-03-23 2017-07-21 中国工商银行股份有限公司 Distributed memory fast resampling method and system, main control server
CN107766258A (en) * 2017-09-27 2018-03-06 精硕科技(北京)股份有限公司 Memory storage method and apparatus, memory lookup method and apparatus
WO2018086155A1 (en) * 2016-11-10 2018-05-17 Huawei Technologies Co., Ltd. Separation of computation from storage in database for better elasticity
CN108255952A (en) * 2017-12-19 2018-07-06 东软集团股份有限公司 Data load method, device, storage medium and electronic equipment
CN108446376A (en) * 2018-03-16 2018-08-24 众安信息技术服务有限公司 Date storage method and device
CN109739929A (en) * 2018-12-18 2019-05-10 中国人民财产保险股份有限公司 Method of data synchronization, apparatus and system
CN110083313A (en) * 2019-05-06 2019-08-02 北京奇艺世纪科技有限公司 A kind of data cache method and device
CN110321347A (en) * 2019-05-30 2019-10-11 上海数据交易中心有限公司 Data matching method and device, storage medium, terminal
CN110401657A (en) * 2019-07-24 2019-11-01 网宿科技股份有限公司 A kind of processing method and processing device of access log
CN110516121A (en) * 2019-08-28 2019-11-29 中国银行股份有限公司 Method for reading data and device
CN111192165A (en) * 2020-01-03 2020-05-22 南京天溯自动化控制系统有限公司 Intelligent ammeter management platform based on preprocessing method
CN111400739A (en) * 2020-03-20 2020-07-10 符安文 System data transmission distribution method
CN112748879A (en) * 2020-12-30 2021-05-04 中科曙光国际信息产业有限公司 Data acquisition method, system, device, computer equipment and storage medium
WO2021088531A1 (en) * 2019-11-05 2021-05-14 中兴通讯股份有限公司 Data redistribution method, electronic device, and storage medium
CN113779028A (en) * 2021-08-31 2021-12-10 珠海市新德汇信息技术有限公司 Storage method, query method, electronic device and storage medium fusing primary key index genes

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5692177A (en) * 1994-10-26 1997-11-25 Microsoft Corporation Method and system for data set storage by iteratively searching for perfect hashing functions
CN101122885A (en) * 2007-09-11 2008-02-13 腾讯科技(深圳)有限公司 Data cache processing method, system and data cache device
CN101483605A (en) * 2009-02-25 2009-07-15 北京星网锐捷网络技术有限公司 Storing, searching method and apparatus for data packet
CN101692651A (en) * 2009-09-27 2010-04-07 中兴通讯股份有限公司 Method and device for Hash lookup table
CN102073733A (en) * 2011-01-19 2011-05-25 中兴通讯股份有限公司 Method and device for managing Hash table
WO2011079467A1 (en) * 2009-12-31 2011-07-07 华为技术有限公司 Method, device and system for scheduling distributed buffer resources
CN102541968A (en) * 2010-12-31 2012-07-04 百度在线网络技术(北京)有限公司 Indexing method


Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138489B (en) * 2015-08-13 2018-04-10 东南大学 Network packet spatial cache ID administrative units
CN105138489A (en) * 2015-08-13 2015-12-09 东南大学 ID management unit for cache space of network data packages
CN106484379A (en) * 2015-08-28 2017-03-08 华为技术有限公司 A kind of processing method and processing device of application
CN106484379B (en) * 2015-08-28 2019-11-29 华为技术有限公司 A kind of processing method and processing device of application
CN105744001A (en) * 2016-04-11 2016-07-06 青岛海信传媒网络技术有限公司 Distributed Caching System Expanding Method, Data Access Method, and Device and System of the Same
CN105744001B (en) * 2016-04-11 2019-03-12 聚好看科技股份有限公司 Distributed cache system expansion method, data access method and device and system
CN106375425A (en) * 2016-08-30 2017-02-01 中国民生银行股份有限公司 Processing method and device for distributed caching
CN106411654A (en) * 2016-10-27 2017-02-15 任子行网络技术股份有限公司 Method and device for processing network traffic analysis
US11138178B2 (en) 2016-11-10 2021-10-05 Futurewei Technologies, Inc. Separation of computation from storage in database for better elasticity
WO2018086155A1 (en) * 2016-11-10 2018-05-17 Huawei Technologies Co., Ltd. Separation of computation from storage in database for better elasticity
CN106776001A (en) * 2016-11-14 2017-05-31 天津南大通用数据技术股份有限公司 The location mode and device of a kind of distributed experiment & measurement system data
CN106973091B (en) * 2017-03-23 2020-06-05 中国工商银行股份有限公司 Distributed memory data redistribution method and system, and master control server
CN106973091A (en) * 2017-03-23 2017-07-21 中国工商银行股份有限公司 Distributed memory fast resampling method and system, main control server
CN107766258A (en) * 2017-09-27 2018-03-06 精硕科技(北京)股份有限公司 Memory storage method and apparatus, memory lookup method and apparatus
CN107766258B (en) * 2017-09-27 2021-11-16 恩亿科(北京)数据科技有限公司 Memory storage method and device and memory query method and device
CN108255952B (en) * 2017-12-19 2020-11-03 东软集团股份有限公司 Data loading method and device, storage medium and electronic equipment
CN108255952A (en) * 2017-12-19 2018-07-06 东软集团股份有限公司 Data load method, device, storage medium and electronic equipment
CN108446376B (en) * 2018-03-16 2022-04-08 众安信息技术服务有限公司 Data storage method and device
CN108446376A (en) * 2018-03-16 2018-08-24 众安信息技术服务有限公司 Date storage method and device
CN109739929B (en) * 2018-12-18 2021-03-16 中国人民财产保险股份有限公司 Data synchronization method, device and system
CN109739929A (en) * 2018-12-18 2019-05-10 中国人民财产保险股份有限公司 Method of data synchronization, apparatus and system
CN110083313A (en) * 2019-05-06 2019-08-02 北京奇艺世纪科技有限公司 A kind of data cache method and device
CN110321347A (en) * 2019-05-30 2019-10-11 上海数据交易中心有限公司 Data matching method and device, storage medium, terminal
CN110401657B (en) * 2019-07-24 2020-09-25 网宿科技股份有限公司 Processing method and device for access log
CN110401657A (en) * 2019-07-24 2019-11-01 网宿科技股份有限公司 A kind of processing method and processing device of access log
US11272029B2 (en) 2019-07-24 2022-03-08 Wangsu Science & Technology Co., Ltd. Access log processing method and device
CN110516121A (en) * 2019-08-28 2019-11-29 中国银行股份有限公司 Method for reading data and device
WO2021088531A1 (en) * 2019-11-05 2021-05-14 中兴通讯股份有限公司 Data redistribution method, electronic device, and storage medium
CN111192165A (en) * 2020-01-03 2020-05-22 南京天溯自动化控制系统有限公司 Intelligent ammeter management platform based on preprocessing method
CN111400739A (en) * 2020-03-20 2020-07-10 符安文 System data transmission distribution method
CN112748879A (en) * 2020-12-30 2021-05-04 中科曙光国际信息产业有限公司 Data acquisition method, system, device, computer equipment and storage medium
CN113779028A (en) * 2021-08-31 2021-12-10 珠海市新德汇信息技术有限公司 Storage method, query method, electronic device and storage medium fusing primary key index genes

Also Published As

Publication number Publication date
CN103905503B (en) 2017-09-26

Similar Documents

Publication Publication Date Title
CN103905503A (en) Data storage method, data scheduling method, device and system
US11729073B2 (en) Dynamic scaling of storage volumes for storage client file systems
US10715460B2 (en) Opportunistic resource migration to optimize resource placement
JP6353924B2 (en) Reduced data volume durability status for block-based storage
US8442955B2 (en) Virtual machine image co-migration
CN102142006B (en) File processing method and device of distributed file system
CN103067297B (en) A kind of dynamic load balancing method based on resource consumption prediction and device
CN104202332B (en) Mobile device virtualization system and instant installation method based on linux kernel
CN104123235A (en) Device and method for visiting data recording stored in cache on server
CN104424013A (en) Method and device for deploying virtual machine in computing environment
CN103491155A (en) Cloud computing method and system for achieving mobile computing and obtaining mobile data
CN101370025A (en) Storing method, scheduling method and management system for geographic information data
CN106775446A (en) Based on the distributed file system small documents access method that solid state hard disc accelerates
CN103544153A (en) Data updating method and system based on database
CN111737168A (en) Cache system, cache processing method, device, equipment and medium
CN106155575A (en) Method and apparatus for the cache of extension storage system
CN110740155A (en) Request processing method and device in distributed system
CN103019956A (en) Method and device for operating cache data
US10812408B1 (en) Preventing concentrated selection of resource hosts for placing resources
US10594620B1 (en) Bit vector analysis for resource placement in a distributed system
CN114746850A (en) Providing a dynamic random access memory cache as a second type of memory
CN112688980B (en) Resource distribution method and device, and computer equipment
CN115686811A (en) Process management method, device, computer equipment and storage medium
CN113420050A (en) Data query management method and device, computer equipment and readable storage medium
CN108052536A (en) A kind of file system of IoT equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant