CN105516367B - Distributed data-storage system, method and apparatus - Google Patents

Distributed data-storage system, method and apparatus

Info

Publication number
CN105516367B
Authority
CN
China
Prior art keywords
data memory
memory node
data
node
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610074240.5A
Other languages
Chinese (zh)
Other versions
CN105516367A (en
Inventor
宋惠卿
孙晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610074240.5A
Publication of CN105516367A
Application granted
Publication of CN105516367B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)

Abstract

This application discloses a distributed data-storage system, method and apparatus. One embodiment of the system includes: data storage nodes, configured to store the data to be stored sent by an agent node, where each data storage node includes multiple hash slots for mapping different key information, and the data to be stored includes key information and data information; a central administration node, configured to store record information, where the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes, and the load of each data storage node; and the agent node, configured to receive the data to be stored sent by a client and to determine, according to the record information of the central administration node, the hash slot and the data storage node corresponding to the data to be stored. This embodiment of the system is structurally simple and highly scalable.

Description

Distributed data-storage system, method and apparatus
Technical field
The present application relates to the field of computer technology, and in particular to a distributed data-storage system, method and apparatus.
Background
At present, the industry commonly uses distributed data-storage systems to store data. A distributed data-storage system has the scale-out/scale-in characteristics of a cluster system and can also perform distributed operations. Therefore, when the volume of stored data changes, the system can be expanded or shrunk by adding data storage nodes to the cluster or removing them from it.
In existing distributed data-storage systems, the cluster is usually sharded with a consistent hashing algorithm, and key-value data is then stored. However, when the volume of stored data changes and data storage nodes must be added or removed, the limitations of consistent hashing cause the key mappings of adjacent storage nodes to change, which easily leads to data loss and gives the cluster poor scalability. Moreover, some distributed data-storage systems have no central management mechanism, which makes the system structure complex.
Summary of the invention
The purpose of the present application is to propose an improved distributed data-storage system, method and apparatus that solve the technical problems mentioned in the Background section above.
In a first aspect, the present application provides a distributed data-storage system. The system includes: data storage nodes, configured to store the data to be stored sent by an agent node, where each data storage node includes multiple hash slots for mapping different key information, and the data to be stored includes key information and data information; a central administration node, configured to store record information, where the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes, and the load of each data storage node; and the agent node, configured to receive the data to be stored sent by a client and to determine, according to the record information of the central administration node, the hash slot and the data storage node corresponding to the data to be stored.
In some embodiments, the agent node is further configured to: judge, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range; if so, keep the number of data storage nodes unchanged; if not, add or remove data storage nodes, redistribute the hash slots of the data storage nodes, and update the record information.
In some embodiments, the agent node is further configured to: judge whether there is a data storage node whose load exceeds the maximum of the threshold range and, if so, add at least one data storage node; and judge whether there is a data storage node whose load is below the minimum of the threshold range and, if so, remove at least one data storage node.
In some embodiments, when at least one data storage node is added, some of the hash slots of each original data storage node are transferred to the newly added data storage node so that the hash slots are evenly distributed across the data storage nodes; when at least one data storage node is removed, the hash slots of the removed data storage node are transferred to the remaining data storage nodes so that the hash slots are evenly distributed across the data storage nodes.
In some embodiments, the system is used in a cloud computing environment, and the system further includes a resource management node configured to allocate cloud computing resources to the system; the agent node is further configured to obtain cloud computing resources from the resource management node.
In a second aspect, the present application provides a distributed data storage method. The method includes: receiving data to be stored sent by a client, where the data to be stored includes key information and data information; determining, according to record information stored by a central administration node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, where the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes, and the load of each data storage node, and each data storage node includes multiple hash slots for mapping key information; and storing, according to the key information, the data to be stored in the first data storage node.
In some embodiments, the method further includes: judging, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range; if so, keeping the number of data storage nodes unchanged; if not, adding or removing data storage nodes, redistributing the hash slots of the data storage nodes, and updating the record information.
In some embodiments, adding or removing data storage nodes includes: when there is a data storage node whose load exceeds the maximum of the threshold range, adding at least one data storage node; and when there is a data storage node whose load is below the minimum of the threshold range, removing at least one data storage node.
In some embodiments, redistributing the hash slots of the data storage nodes and updating the record information includes: when at least one data storage node is added, transferring some of the hash slots of the original data storage nodes to the newly added data storage node so that the hash slots are evenly distributed across the data storage nodes, and updating the record information; and when at least one data storage node is removed, transferring the hash slots of the removed data storage node to the remaining data storage nodes so that the hash slots are evenly distributed across the data storage nodes, and updating the record information.
In some embodiments, the method is used in a cloud computing environment, and the method further includes obtaining cloud computing resources from a resource management node.
In a third aspect, the present application provides a distributed data storage apparatus. The apparatus includes: a receiving module, configured to receive data to be stored sent by a client, where the data to be stored includes key information and data information; a determining module, configured to determine, according to record information stored by a central administration node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, where the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes, and the load of each data storage node, and each data storage node includes multiple hash slots for mapping key information; and a storage module, configured to store, according to the key information, the data to be stored in the first data storage node.
In some embodiments, the apparatus further includes a judging module, configured to: judge, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range; if so, keep the number of data storage nodes unchanged; if not, add or remove data storage nodes, redistribute the hash slots of the data storage nodes, and update the record information.
In some embodiments, the judging module is further configured to: when there is a data storage node whose load exceeds the maximum of the threshold range, add at least one data storage node; and when there is a data storage node whose load is below the minimum of the threshold range, remove at least one data storage node.
In some embodiments, the judging module is further configured to: when at least one data storage node is added, transfer some of the hash slots of the original data storage nodes to the newly added data storage node so that the hash slots are evenly distributed across the data storage nodes, and update the record information; and when at least one data storage node is removed, transfer the hash slots of the removed data storage node to the remaining data storage nodes so that the hash slots are evenly distributed across the data storage nodes, and update the record information.
In some embodiments, the apparatus is used in a cloud computing environment, and the apparatus further includes an acquisition module configured to obtain cloud computing resources from a resource management node.
In the distributed data-storage system, method and apparatus provided by the present application, the agent node determines the hash slot and the data storage node corresponding to the data to be stored from three inputs: the distribution of hash slots across the data storage nodes as stored in the central administration node, the mapping between the hash slots of the data storage nodes and the key information, and the key information of the data to be stored. It then stores the data to be stored. The distributed data-storage system of the present application is structurally simple and highly scalable.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the drawings:
Fig. 1 is an architecture diagram of one embodiment of the distributed data-storage system according to the present application;
Fig. 2 is a flow chart of one embodiment of the distributed data storage method according to the present application;
Fig. 3 is a flow chart of another embodiment of the distributed data storage method according to the present application;
Fig. 4 is a structural schematic diagram of one embodiment of the distributed data storage apparatus according to the present application;
Fig. 5 is a structural schematic diagram of a computer system suitable for implementing a terminal device or server of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which an embodiment of the distributed data-storage system of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include a client 101, an agent node 102, data storage nodes 103 and a central administration node 104. The client 101, the agent node 102, the data storage nodes 103 and the central administration node 104 communicate with one another over wired or wireless connections.
In this embodiment, the data storage nodes 103 store the data to be stored sent by the agent node 102. Each data storage node 103 includes multiple hash slots for mapping different key information, and the data to be stored includes key information and data information.
In some optional implementations of this embodiment, each data storage node 103 may consist of one master node and multiple slave nodes, with the master node and the slave nodes of a given data storage node 103 storing the same content. When the master node fails or goes down, one of the slave nodes can be started to replace it, so a data storage node 103 of this kind improves the disaster tolerance of the distributed data-storage system.
In some optional implementations of this embodiment, the data to be stored may be key-value data, where the key information is the key and the data information is the value. A hash slot maps specific key information to specific data information through a hash function, which ensures a one-to-one correspondence between key information and data information. For example, the key space can be divided into 16384 hash slots, with each data storage node 103 holding a portion of them so that the number of hash slots on each data storage node 103 is equal. Typically, the hash slot to which a given key belongs is computed as the CRC16 checksum of the key modulo 16384. "Equal" here does not mean that every data storage node 103 holds exactly the same number of hash slots, only that the numbers are equal to a certain extent. For example, if the cluster consists of 3 data storage nodes 103, the nodes may hold 5500, 5500 and 5384 hash slots respectively; if the cluster consists of 1000 data storage nodes 103, each node may hold 16 or 17 hash slots.
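The slot computation described above can be sketched in a few lines of Python. The text only says "CRC16", so the choice of the CCITT/XMODEM variant (polynomial 0x1021, initial value 0) is an assumption, and the function name `key_slot` is hypothetical.

```python
import binascii

NUM_SLOTS = 16384  # number of hash slots used in the example above


def key_slot(key: str) -> int:
    """Return the hash slot for a key as CRC16(key) mod 16384."""
    # Assumption: "CRC16" means the CCITT/XMODEM variant, which is what
    # binascii.crc_hqx computes when the initial value is 0.
    crc = binascii.crc_hqx(key.encode("utf-8"), 0)
    return crc % NUM_SLOTS


# The same key always lands in the same slot, whichever node runs the code.
slot = key_slot("user:1001")
```

Because the slot depends only on the key, moving a slot between nodes never changes which slot a key belongs to; only the slot-to-node table changes.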
In some optional implementations of this embodiment, the number of data storage nodes 103 shown in Fig. 1 is only exemplary. The distributed data-storage system 100 may include n data storage nodes 103 as needed, where n is a natural number less than 16385. The number of data storage nodes 103 in the system can also be adjusted dynamically (for example, increased or decreased) according to actual needs.
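As an illustration of why n is bounded by the slot count, the following sketch splits 16384 slots across n nodes as evenly as integer arithmetic allows. The function name `slot_counts` is hypothetical, and the tightest possible split shown here is only one way to satisfy "equal to a certain extent"; the 5500/5500/5384 split mentioned in the description is another.

```python
NUM_SLOTS = 16384


def slot_counts(n: int) -> list[int]:
    """Return how many hash slots each of n nodes would hold."""
    # With more nodes than slots, some node would own no slot at all.
    if not 1 <= n <= NUM_SLOTS:
        raise ValueError("n must be between 1 and 16384")
    base, extra = divmod(NUM_SLOTS, n)
    # The first `extra` nodes take one additional slot each.
    return [base + 1 if i < extra else base for i in range(n)]


three = slot_counts(3)
thousand = slot_counts(1000)
```

For 1000 nodes every entry is 16 or 17, matching the example in the description.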
In this embodiment, the central administration node 104 stores the record information, which includes the network topology of the cluster formed by the data storage nodes 103, the distribution of hash slots across the data storage nodes 103, and the load of each data storage node 103.
In some optional implementations of this embodiment, the central administration node 104 is connected to each data storage node 103. It obtains the network topology of the cluster formed by the data storage nodes 103, the distribution of hash slots within each data storage node, and the maximum load and current load of each data storage node, and stores this information as the record information. For example, the network topology may include the relationship between the master node and the slave nodes within each data storage node 103, the relationships among the data storage nodes 103, and so on. Compared with having the data storage nodes in the cluster communicate with one another to obtain this information, obtaining it through the central administration node 104 has a lower communication cost and better disaster tolerance.
In some optional implementations of this embodiment, when the number of data storage nodes 103 changes, the record information in the central administration node 104 is updated accordingly, so that the agent node 102 can obtain accurate, real-time information about each data storage node 103.
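One possible in-memory shape for the record information is sketched below. The description does not specify a data layout, so every class and field name here (`NodeRecord`, `RecordInfo`, `node_for_slot`, and so on) is a hypothetical illustration, and slots are numbered from 0 rather than from 1.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    node_id: str
    master: str            # topology: address of the master node
    replicas: list[str]    # topology: addresses of the slave nodes
    slots: range           # hash slots held by this data storage node
    max_load: float        # maximum load the node can carry
    current_load: float    # most recently reported load


@dataclass
class RecordInfo:
    nodes: dict[str, NodeRecord] = field(default_factory=dict)

    def node_for_slot(self, slot: int) -> str:
        """Return the id of the data storage node that holds a slot."""
        for node_id, record in self.nodes.items():
            if slot in record.slots:
                return node_id
        raise KeyError(slot)


info = RecordInfo()
info.nodes["A"] = NodeRecord("A", "10.0.0.1", ["10.0.0.2"], range(0, 5500), 100.0, 40.0)
info.nodes["B"] = NodeRecord("B", "10.0.0.3", ["10.0.0.4"], range(5500, 11000), 100.0, 55.0)
info.nodes["C"] = NodeRecord("C", "10.0.0.5", ["10.0.0.6"], range(11000, 16384), 100.0, 35.0)
```

Keeping topology, slot ownership and load in one place is what lets the agent node answer both "where does this key go" and "is any node overloaded" from a single source.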
In this embodiment, the agent node 102 receives the data to be stored sent by the client 101 and determines, according to the record information of the central administration node 104, the hash slot and the data storage node 103 corresponding to the data to be stored.
In some optional implementations of this embodiment, the number of agent nodes 102 is only exemplary, and the agent node 102 may be connected to the client 101, the data storage nodes 103 and the central administration node 104 respectively. The client 101 receives the instruction to store data and sends the data to be stored to the agent node 102. The agent node 102 extracts the key information from the data to be stored, computes the hash slot to which the key belongs (for example, as the CRC16 checksum of the key modulo 16384), looks up in the central administration node 104 the data storage node that holds that hash slot, and finally stores the data to be stored.
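The routing path through the agent node might look like the sketch below. It is a simplification under stated assumptions: the slot table is hard-coded instead of being fetched from the central administration node, the data storage nodes are plain dictionaries, the CRC16 variant is CCITT/XMODEM, and all names (`SLOT_STARTS`, `locate`, `put`) are hypothetical.

```python
import binascii
from bisect import bisect_right

NUM_SLOTS = 16384
# Hypothetical slot table: node i holds slots from SLOT_STARTS[i] up to the next start.
SLOT_STARTS = [0, 5500, 11000]
SLOT_OWNERS = ["A", "B", "C"]
# In-memory stand-ins for the data storage nodes.
STORES: dict[str, dict[str, str]] = {"A": {}, "B": {}, "C": {}}


def key_slot(key: str) -> int:
    # Assumption: CRC16 is the CCITT/XMODEM variant computed by binascii.crc_hqx.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def locate(key: str) -> tuple[int, str]:
    """Return the hash slot of a key and the node that holds that slot."""
    slot = key_slot(key)
    owner = SLOT_OWNERS[bisect_right(SLOT_STARTS, slot) - 1]
    return slot, owner


def put(key: str, value: str) -> str:
    """Store a key-value pair on the owning node and return the node id."""
    _, owner = locate(key)
    STORES[owner][key] = value
    return owner


owner = put("123456789", "example value")
```

The client never needs to know the slot table; it only talks to the agent node.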
In some optional implementations of this embodiment, the agent node 102 may also judge, according to the load of each data storage node 103 in the record information of the central administration node 104, whether the load of each data storage node 103 is within a preset threshold range. If so, the number of data storage nodes is kept unchanged; if not, data storage nodes are added or removed, the hash slots of the data storage nodes are redistributed, and the record information is updated. Here, the agent node 102 can set the threshold range for the load of the data storage nodes 103 in advance. It can then obtain the load of each data storage node 103 from the central administration node 104 and compare it with the threshold range. If the load of every data storage node 103 is within the threshold range, the number of data storage nodes 103 can be kept unchanged; otherwise data storage nodes 103 can be added or removed. When data storage nodes 103 are added or removed, the network topology of the cluster formed by the data storage nodes 103, the distribution of hash slots across the data storage nodes 103, and the load of each data storage node 103 all change, and the central administration node 104 updates the record information it stores accordingly.
In some optional implementations of this embodiment, the agent node 102 may further judge whether there is a data storage node 103 whose load exceeds the maximum of the threshold range and, if so, add at least one data storage node 103; and judge whether there is a data storage node 103 whose load is below the minimum of the threshold range and, if so, remove at least one data storage node 103.
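The threshold check can be written as a small decision function. This is a sketch: the function name `scaling_action` and the string results are hypothetical, and checking the overload condition before the underload condition is an assumption, since the description does not say what happens when both occur at once.

```python
def scaling_action(loads: dict[str, float], low: float, high: float) -> str:
    """Decide whether to add a node, remove a node, or keep the cluster as is."""
    # Assumption: an overloaded node takes priority over an underloaded one.
    if any(load > high for load in loads.values()):
        return "add"
    if any(load < low for load in loads.values()):
        return "remove"
    return "keep"


steady = scaling_action({"A": 50.0, "B": 60.0, "C": 55.0}, low=20.0, high=80.0)
overloaded = scaling_action({"A": 90.0, "B": 60.0, "C": 55.0}, low=20.0, high=80.0)
underused = scaling_action({"A": 50.0, "B": 10.0, "C": 55.0}, low=20.0, high=80.0)
```

A load exactly equal to either bound counts as inside the range in this sketch.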
In some optional implementations of this embodiment, when at least one data storage node 103 is added, some of the hash slots of each original data storage node 103 can be transferred to the newly added data storage node 103 so that the hash slots are evenly distributed across the data storage nodes 103. The transferred hash slots carry the data already stored in them. For example, suppose the distributed data-storage system includes three data storage nodes 103, A, B and C, where data storage node A holds hash slots 1 to 5500, data storage node B holds hash slots 5501 to 11000, and data storage node C holds hash slots 11001 to 16384. When the agent node 102 judges that there is a data storage node whose load exceeds the maximum of the threshold range, a data storage node D can be added, and a portion of the hash slots of each of the original data storage nodes A, B and C can be moved to the new data storage node D so that data storage nodes A, B, C and D hold roughly equal numbers of hash slots.
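The A, B, C plus D example can be traced with the following sketch, which moves surplus slots from each original node to the new one. The function name `add_node` and the choice to move the highest-numbered slots are assumptions, the sketch assumes every original node already holds at least its target share, and slots are numbered from 0 here.

```python
def add_node(assignment: dict[str, list[int]], new_node: str) -> dict[str, list[int]]:
    """Return a new slot assignment after moving surplus slots to new_node."""
    result = {node: list(slots) for node, slots in assignment.items()}
    result[new_node] = []
    total = sum(len(slots) for slots in result.values())
    base, extra = divmod(total, len(result))
    # Target share per node; the first `extra` nodes keep one slot more.
    targets = {node: base + (1 if i < extra else 0) for i, node in enumerate(result)}
    for node in assignment:
        surplus = len(result[node]) - targets[node]
        if surplus > 0:
            # In a real system the data stored in each slot would move with it.
            result[new_node].extend(result[node][-surplus:])
            del result[node][-surplus:]
    return result


before = {
    "A": list(range(0, 5500)),
    "B": list(range(5500, 11000)),
    "C": list(range(11000, 16384)),
}
after = add_node(before, "D")
```

Only the surplus slots move, so most keys stay on the node they were already on.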
In some optional implementations of this embodiment, when at least one data storage node 103 is removed, the hash slots of the removed data storage node 103 can be transferred to the remaining data storage nodes 103 so that the hash slots are evenly distributed across the data storage nodes. The transferred hash slots carry the data already stored in them. For example, suppose the distributed data-storage system includes three data storage nodes 103, A, B and C, where data storage node A holds hash slots 1 to 5500, data storage node B holds hash slots 5501 to 11000, and data storage node C holds hash slots 11001 to 16384. When the agent node 102 judges that there is a data storage node whose load is below the minimum of the threshold range, one data storage node can be removed. Assuming data storage node C is removed, its hash slots can be transferred to data storage nodes A and B so that data storage nodes A and B hold roughly equal numbers of hash slots.
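Removing node C can be sketched the same way: each slot of the removed node is handed to whichever remaining node currently holds the fewest slots. The function name `remove_node` and the "fewest slots first" rule are assumptions chosen to keep the result even, and slots are numbered from 0 here.

```python
def remove_node(assignment: dict[str, list[int]], removed: str) -> dict[str, list[int]]:
    """Return a new slot assignment after handing off the slots of `removed`."""
    result = {node: list(slots) for node, slots in assignment.items() if node != removed}
    if not result:
        raise ValueError("cannot remove the last data storage node")
    for slot in assignment[removed]:
        # Give the slot to the remaining node that holds the fewest slots so far.
        target = min(result, key=lambda node: len(result[node]))
        # In a real system the data stored in the slot would move with it.
        result[target].append(slot)
    return result


before = {
    "A": list(range(0, 5500)),
    "B": list(range(5500, 11000)),
    "C": list(range(11000, 16384)),
}
after = remove_node(before, "C")
```

With A and B starting at 5500 slots each, the 5384 slots of C are split into two halves of 2692.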
In some optional implementations of this embodiment, the distributed data-storage system 100 of the present application can be used in a cloud computing environment. In that case a resource management node can be added. The resource management node can allocate cloud computing resources to the system, and the agent node 102 can further obtain cloud computing resources from the resource management node.
It should be pointed out that the client 101, the agent node 102, the data storage nodes 103 and the central administration node 104 can be connected by wired or wireless connections. The wireless connections can include, but are not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, UWB (ultra wideband), and other wireless connection methods that are currently known or developed in the future.
In the distributed data-storage system provided by the above embodiment of the present application, the agent node 102 can determine the hash slot and the data storage node 103 corresponding to the data to be stored from the distribution of hash slots across the data storage nodes 103 as stored in the central administration node 104, the mapping between the hash slots of the data storage nodes 103 and the key information, and the key information of the data to be stored. It then stores the data to be stored. The distributed data-storage system of this embodiment is structurally simple and highly scalable.
With continued reference to Fig. 2, a flow 200 of one embodiment of the distributed data storage method according to the present application is shown. The distributed data storage method provided by this embodiment is implemented on the distributed data-storage system of Fig. 1 and can be performed by the agent node. The method includes the following steps:
Step 201: receive the data to be stored sent by the client.
In this embodiment, the electronic device on which the distributed data storage method runs (such as the agent node shown in Fig. 1) can receive the data to be stored sent by the client, where the data to be stored includes key information and data information. In this embodiment, the data to be stored can be key-value data, with the key information and the data information corresponding to the key and the value respectively.
Step 202: according to the record information stored by the central administration node, determine the hash slot corresponding to the key information and the first data storage node where that hash slot is located.
In this embodiment, the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes, and the load of each data storage node. Each data storage node includes multiple hash slots for mapping key information. The electronic device computes the hash slot to which the key information of the data to be stored (obtained in step 201) belongs, and then determines from the record information stored by the central administration node the first data storage node where that hash slot is located.
In some optional implementations of this embodiment, the electronic device can establish a connection with the central administration node in advance. The central administration node manages the cluster formed by the data storage nodes and obtains the network topology of the cluster, the distribution of hash slots across the data storage nodes, and the load of each data storage node.
In some optional implementations of this embodiment, the key space can, for example, be divided into 16384 hash slots, with each data storage node holding a portion of them. A hash slot maps specific key information to specific data information through a hash function, which ensures a one-to-one correspondence between key information and data information, so that the data to be stored is written to a specific location. The electronic device can compute the hash slot to which a given key belongs as the CRC16 checksum of the key modulo 16384.
Step 203: according to the key information, store the data to be stored in the first data storage node.
In this embodiment, based on the hash slot corresponding to the key information and the first data storage node where that hash slot is located, both obtained in step 202, the electronic device stores the data to be stored in the first data storage node.
The distributed data storage method provided by the above embodiment of the present application computes the hash slot to which the key information of the data to be stored belongs, obtains from the record information of the central administration node the first data storage node where that hash slot is located, and then completes the storage. This makes data storage simpler and more accurate.
With further reference to Fig. 3, it illustrates another embodiment of the distributed data storage method according to the application Flow 300.The flow 300 of the distributed data storage method, comprises the following steps:
Step 301, the data to be stored that client is sent are received.
In the present embodiment, electronic equipment (such as the agency shown in Fig. 1 of distributed data storage method operation thereon Node) data to be stored that client is sent can be received, wherein, data to be stored include key value information and data message. In embodiment, above-mentioned data to be stored can be key-value data, and its included key value information and data message can divide Key and value therein are not corresponded to.
Step 302, judge the load capacity of each data memory node whether in default threshold range.
In the present embodiment, above-mentioned electronic equipment can set a threshold range to data memory node in advance.It is above-mentioned Electronic equipment obtains the load capacity of each data memory node from central administration node, afterwards by the load capacity of each data memory node Compared with above-mentioned threshold range, if data memory node of the load capacity not in above-mentioned threshold range be present, step is gone to 303, if the load capacity of each data memory node goes to step 304 in above-mentioned threshold range.
Step 303, add or remove data storage nodes, redistribute the hash slots of each data storage node, and update the record information.
In the present embodiment, if step 302 finds a data storage node whose load is not within the threshold range, the cluster has either too few or too many data storage nodes, and at least one data storage node needs to be added or removed. Further, when the number of data storage nodes changes, the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node all change, and the central management node can then update the record information it stores according to these changes.
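As a rough sketch of what the record information might look like and how it could be refreshed after a membership change, consider the following. The field names, the full-mesh topology and the `apply_change` method are illustrative assumptions; the text only states that topology, slot distribution and loads are recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecordInfo:
    """Hypothetical container for the record information."""
    topology: Dict[str, List[str]] = field(default_factory=dict)   # node -> peers
    slot_to_node: Dict[int, str] = field(default_factory=dict)     # slot distribution
    loads: Dict[str, float] = field(default_factory=dict)          # node -> load

    def apply_change(self, owner_slots: Dict[str, List[int]],
                     loads: Dict[str, float]) -> None:
        """Rebuild all three parts after nodes are added or removed."""
        self.slot_to_node = {slot: node
                             for node, slots in owner_slots.items()
                             for slot in slots}
        self.loads = dict(loads)
        # Assumption: every node is connected to every other node.
        self.topology = {node: sorted(n for n in owner_slots if n != node)
                         for node in owner_slots}
```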
In the present embodiment, this step may determine whether to remove or to add data storage nodes as follows:
First, if it is determined that there is a data storage node whose load exceeds the maximum of the threshold range, the cluster has too few data storage nodes, and data storage nodes can be added. Further, the number of data storage nodes that ultimately needs to be added can be determined from the loads of the current data storage nodes, the maximum and minimum of the threshold range, and so on. After the data storage nodes have been added, the electronic device migrates part of the hash slots of the original data storage nodes to the newly added data storage nodes so that the hash slots are evenly distributed across the data storage nodes, and then updates the record information of the central management node to reflect the added data storage nodes. It should be noted that "evenly" here does not mean an exactly equal share but an approximately equal one. For example, if the cluster consists of 1000 data storage nodes, each data storage node may hold 16 or 17 hash slots.
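The slot migration on scale-out can be sketched as below. The donor-selection rule (always take from the node currently holding the most slots) is an assumption chosen to keep the result balanced, and the total of 16384 slots used in `balanced_counts` is likewise assumed; it is merely consistent with the "16 or 17 slots over 1000 nodes" example.

```python
from typing import Dict, List


def balanced_counts(total_slots: int, num_nodes: int) -> List[int]:
    """Per-node slot counts when slots are spread as evenly as possible."""
    base, extra = divmod(total_slots, num_nodes)
    return [base + 1] * extra + [base] * (num_nodes - extra)


def add_node(owner_slots: Dict[str, List[int]], new_node: str) -> List[int]:
    """Add new_node and migrate slots to it from the fullest existing nodes.

    owner_slots maps node id -> list of owned slots and is modified in place.
    Returns the list of migrated slots.
    """
    owner_slots[new_node] = []
    total = sum(len(slots) for slots in owner_slots.values())
    target = total // len(owner_slots)   # floor share for the new node
    moved = []
    while len(owner_slots[new_node]) < target:
        donor = max((n for n in owner_slots if n != new_node),
                    key=lambda n: len(owner_slots[n]))
        slot = owner_slots[donor].pop()
        owner_slots[new_node].append(slot)
        moved.append(slot)
    return moved
```

Only the migrated slots change owner, so keys in all other slots keep resolving to the same node. With 16384 slots and 1000 nodes, `balanced_counts` gives 384 nodes holding 17 slots and 616 nodes holding 16.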
Secondly, if it is determined that there is a data storage node whose load is below the minimum of the threshold range, the cluster has too many data storage nodes, and some data storage nodes can be removed. Further, the number of data storage nodes that ultimately needs to be removed can be determined from the loads of the current data storage nodes, the maximum and minimum of the threshold range, and so on. Before a data storage node is removed, the electronic device migrates all of its hash slots to the data storage nodes remaining in the cluster, ensuring that the hash slots are evenly distributed across the data storage nodes, and then updates the record information of the central management node to reflect the removed data storage node. It should be noted that "evenly" here does not mean an exactly equal share but an approximately equal one.
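The scale-in case can be sketched in the same style. Handing each orphaned slot to the node that currently holds the fewest slots is an assumed strategy; the text only requires that the result be approximately even.

```python
from typing import Dict, List


def remove_node(owner_slots: Dict[str, List[int]], node: str) -> Dict[str, List[int]]:
    """Remove node and hand all of its slots to the remaining nodes.

    owner_slots maps node id -> list of owned slots and is modified in place.
    """
    if len(owner_slots) < 2:
        raise ValueError("cannot remove the last data storage node")
    orphaned = owner_slots.pop(node)
    for slot in orphaned:
        # Give each slot to the node that currently holds the fewest slots.
        receiver = min(owner_slots, key=lambda n: len(owner_slots[n]))
        owner_slots[receiver].append(slot)
    return owner_slots
```

Every slot of the removed node ends up on a surviving node, which is what keeps the stored data reachable after the node is gone.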
Step 304, keep the number of data storage nodes unchanged.
In the present embodiment, if step 302 finds that the load of every data storage node is within the threshold range, the electronic device can keep the number of data storage nodes unchanged, and the record information stored by the central management node also remains unchanged.
Step 305, according to the record information stored by the central management node, determine the hash slot corresponding to the key information and the first data storage node where that hash slot is located.
In the present embodiment, the electronic device can use the record information stored by the central management node, such as the distribution of hash slots across the data storage nodes, to determine the hash slot corresponding to the key information of the data to be stored and the first data storage node where that hash slot is located, so that the data to be stored can be stored in the first data storage node.
Step 306, according to the key information, store the data to be stored in the first data storage node.
In the present embodiment, the electronic device can store the data to be stored, according to its key information, in the first data storage node determined in step 305.
As can be seen from Fig. 3, compared with the embodiment corresponding to Fig. 2, the flow 300 of the distributed data storage method in the present embodiment highlights the steps of adding or removing data storage nodes and updating the record information. The scheme described in the present embodiment can therefore ensure, by migrating hash slots, that the data stored in the data storage nodes is not lost when data storage nodes are added or removed.
With further reference to Fig. 4, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a distributed data storage apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 4, the distributed data storage apparatus 400 of the present embodiment includes a receiving module 401, a determining module 402 and a storage module 403. The receiving module 401 is configured to receive the data to be stored sent by a client, wherein the data to be stored includes key information and data information. The determining module 402 is configured to determine, according to the record information stored by the central management node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, wherein the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node, and each data storage node includes multiple hash slots for mapping key information. The storage module 403 is configured to store the data to be stored in the first data storage node according to the key information.
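The three-module decomposition can be illustrated with the following sketch. The class names mirror modules 401 to 403, but the request format, the CRC32 hash and the dictionaries standing in for the record information and the storage nodes are assumptions for illustration only.

```python
import zlib
from typing import Dict, Tuple


class ReceivingModule:
    """Counterpart of module 401: accepts data to be stored from a client."""
    def receive(self, request: Dict[str, str]) -> Tuple[str, str]:
        return request["key"], request["value"]


class DeterminingModule:
    """Counterpart of module 402: resolves key -> hash slot -> node."""
    def __init__(self, slot_to_node: Dict[int, str]):
        self.slot_to_node = slot_to_node   # from the record information

    def determine(self, key: str) -> Tuple[int, str]:
        slot = zlib.crc32(key.encode("utf-8")) % len(self.slot_to_node)
        return slot, self.slot_to_node[slot]


class StorageModule:
    """Counterpart of module 403: writes the data to the chosen node."""
    def __init__(self, nodes: Dict[str, Dict[str, str]]):
        self.nodes = nodes

    def store(self, node_id: str, key: str, value: str) -> None:
        self.nodes[node_id][key] = value


class StorageApparatus:
    """Wires the three modules together in the order described."""
    def __init__(self, slot_to_node: Dict[int, str],
                 nodes: Dict[str, Dict[str, str]]):
        self.receiving = ReceivingModule()
        self.determining = DeterminingModule(slot_to_node)
        self.storing = StorageModule(nodes)

    def handle(self, request: Dict[str, str]) -> str:
        key, value = self.receiving.receive(request)
        _, node_id = self.determining.determine(key)
        self.storing.store(node_id, key, value)
        return node_id
```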
In some optional implementations of the present embodiment, the distributed data storage apparatus 400 further includes a judging module (not shown), configured to judge, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range; if so, to keep the number of data storage nodes unchanged; if not, to add or remove data storage nodes, redistribute the hash slots of each data storage node, and update the record information.
In some optional implementations of the present embodiment, the judging module (not shown) is further configured to: add at least one data storage node when there is a data storage node whose load exceeds the maximum of the threshold range; and remove at least one data storage node when there is a data storage node whose load is below the minimum of the threshold range.
In some optional implementations of the present embodiment, the judging module (not shown) is specifically configured to: when at least one data storage node is added, transfer part of the hash slots in the original data storage nodes to the newly added data storage node so that the hash slots are evenly distributed across the data storage nodes, and update the record information; and when at least one data storage node is removed, transfer the hash slots in the removed data storage node to the remaining data storage nodes so that the hash slots are evenly distributed across the data storage nodes, and update the record information.
In some optional implementations of the present embodiment, when the distributed data storage apparatus 400 is based on a cloud computing environment, the apparatus further includes an acquisition module (not shown), configured to obtain cloud computing resources from a resource management node.
Those skilled in the art will understand that the distributed data storage apparatus 400 also includes some other well-known structures, such as a processor and a memory. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in Fig. 4.
Referring now to Fig. 5, a schematic structural diagram of a computer system 500 suitable for implementing a terminal device or server of the embodiments of the present application is shown.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage part 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, the ROM 502 and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input part 506 including a keyboard, a mouse, etc.; an output part 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc.; a storage part 508 including a hard disk, etc.; and a communication part 509 including a network interface card such as a LAN card or a modem. The communication part 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage part 508 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 509, and/or installed from the removable medium 511.
The flow charts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flow chart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flow charts, and combinations of blocks in the block diagrams and/or flow charts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, it may be described as: a processor including a receiving module, a determining module and a storage module. The names of these modules do not, under certain circumstances, limit the modules themselves; for example, the receiving module may also be described as "a module for receiving the data to be stored sent by a client".
In another aspect, the present application also provides a non-volatile computer storage medium. The non-volatile computer storage medium may be the one included in the apparatus described in the above embodiments, or it may exist separately without being assembled into a terminal. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: receive the data to be stored sent by a client, wherein the data to be stored includes key information and data information; determine, according to the record information stored by the central management node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, wherein the record information includes the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node, and each data storage node includes multiple hash slots for mapping key information; and store the data to be stored in the first data storage node according to the key information.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (15)

1. A distributed data storage system, characterized in that the system comprises:
data storage nodes, for storing the data to be stored sent by a proxy node, each data storage node comprising multiple hash slots for mapping different key information, the data to be stored comprising key information and data information;
a central management node, for storing record information, the record information comprising the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node, wherein the central management node is connected to each data storage node to obtain the record information;
the proxy node, for receiving the data to be stored sent by a client, and determining, according to the record information of the central management node, the hash slot and the data storage node corresponding to the data to be stored.
2. The distributed data storage system according to claim 1, characterized in that the proxy node is further configured to:
judge, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range;
if so, keep the number of data storage nodes unchanged;
if not, add or remove data storage nodes, redistribute the hash slots of each data storage node, and update the record information.
3. The distributed data storage system according to claim 2, characterized in that the proxy node is further configured to:
judge whether there is a data storage node whose load exceeds the maximum of the threshold range;
if so, add at least one data storage node;
judge whether there is a data storage node whose load is below the minimum of the threshold range;
if so, remove at least one data storage node.
4. The distributed data storage system according to claim 2 or 3, characterized in that, when at least one data storage node is added, part of the hash slots in each original data storage node are transferred to the newly added data storage node, so that the hash slots are evenly distributed across the data storage nodes;
when at least one data storage node is removed, the hash slots in the removed data storage node are transferred to the remaining data storage nodes, so that the hash slots are evenly distributed across the data storage nodes.
5. The distributed data storage system according to claim 4, characterized in that the system is used in a cloud computing environment; and
the system further comprises: a resource management node, for allocating cloud computing resources to the system;
the proxy node is further configured to obtain cloud computing resources from the resource management node.
6. A distributed data storage method implemented on the distributed data storage system according to any one of claims 1-5, characterized in that the method comprises:
receiving the data to be stored sent by a client, wherein the data to be stored comprises key information and data information;
determining, according to the record information stored by the central management node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, wherein the record information comprises the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node, and each data storage node comprises multiple hash slots for mapping key information;
storing the data to be stored in the first data storage node according to the key information.
7. The distributed data storage method according to claim 6, characterized in that the method further comprises:
judging, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range;
if so, keeping the number of data storage nodes unchanged;
if not, adding or removing data storage nodes, redistributing the hash slots of each data storage node, and updating the record information.
8. The distributed data storage method according to claim 6 or 7, characterized in that the adding or removing of data storage nodes comprises:
when there is a data storage node whose load exceeds the maximum of the threshold range, adding at least one data storage node;
when there is a data storage node whose load is below the minimum of the threshold range, removing at least one data storage node.
9. The distributed data storage method according to claim 7, characterized in that the redistributing of the hash slots of each data storage node and the updating of the record information comprise:
when at least one data storage node is added, transferring part of the hash slots in the original data storage nodes to the newly added data storage node, so that the hash slots are evenly distributed across the data storage nodes, and updating the record information;
when at least one data storage node is removed, transferring the hash slots in the removed data storage node to the remaining data storage nodes, so that the hash slots are evenly distributed across the data storage nodes, and updating the record information.
10. The distributed data storage method according to claim 9, characterized in that the method is used in a cloud computing environment; and
the method further comprises: obtaining cloud computing resources from a resource management node.
11. A distributed data storage apparatus, characterized in that the apparatus comprises:
a receiving module, configured to receive the data to be stored sent by a client, wherein the data to be stored comprises key information and data information;
a determining module, configured to determine, according to the record information stored by the central management node, the hash slot corresponding to the key information and the first data storage node where that hash slot is located, wherein the record information comprises the network topology of the cluster formed by the data storage nodes, the distribution of hash slots across the data storage nodes and the load of each data storage node, and each data storage node comprises multiple hash slots for mapping key information;
a storage module, configured to store the data to be stored in the first data storage node according to the key information.
12. The distributed data storage apparatus according to claim 11, characterized in that the apparatus further comprises:
a judging module, configured to judge, according to the load of each data storage node in the record information, whether the load of each data storage node is within a preset threshold range;
if so, to keep the number of data storage nodes unchanged;
if not, to add or remove data storage nodes, redistribute the hash slots of each data storage node, and update the record information.
13. The distributed data storage apparatus according to claim 11 or 12, characterized in that the judging module is further configured to:
add at least one data storage node when there is a data storage node whose load exceeds the maximum of the threshold range;
remove at least one data storage node when there is a data storage node whose load is below the minimum of the threshold range.
14. The distributed data storage apparatus according to claim 12, characterized in that the judging module is further configured to:
when at least one data storage node is added, transfer part of the hash slots in the original data storage nodes to the newly added data storage node, so that the hash slots are evenly distributed across the data storage nodes, and update the record information;
when at least one data storage node is removed, transfer the hash slots in the removed data storage node to the remaining data storage nodes, so that the hash slots are evenly distributed across the data storage nodes, and update the record information.
15. The distributed data storage apparatus according to claim 14, characterized in that the apparatus is used in a cloud computing environment; and
the apparatus further comprises: an acquisition module, configured to obtain cloud computing resources from a resource management node.
CN201610074240.5A 2016-02-02 2016-02-02 Distributed data-storage system, method and apparatus Active CN105516367B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610074240.5A CN105516367B (en) 2016-02-02 2016-02-02 Distributed data-storage system, method and apparatus

Publications (2)

Publication Number Publication Date
CN105516367A CN105516367A (en) 2016-04-20
CN105516367B true CN105516367B (en) 2018-02-13

Family

ID=55723997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610074240.5A Active CN105516367B (en) 2016-02-02 2016-02-02 Distributed data-storage system, method and apparatus

Country Status (1)

Country Link
CN (1) CN105516367B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228087B (en) * 2016-12-21 2021-08-06 伊姆西Ip控股有限责任公司 Apparatus for hyper-converged infrastructure
CN107612972A (en) * 2017-08-17 2018-01-19 深圳市优品壹电子有限公司 Date storage method and device
CN110019503A (en) * 2017-09-01 2019-07-16 北京京东尚科信息技术有限公司 The dilatation of Redis cluster and/or the method and device of capacity reducing
CN107707628B (en) 2017-09-06 2020-06-02 华为技术有限公司 Method and apparatus for transmitting data processing requests
CN107729421B (en) * 2017-09-27 2019-11-15 华为技术有限公司 The execution method, apparatus and storage medium of storing process
CN110351313B (en) * 2018-04-02 2022-02-22 武汉斗鱼网络科技有限公司 Data caching method, device, equipment and storage medium
CN109032502A (en) * 2018-06-13 2018-12-18 广州市信景技术有限公司 A kind of distributed data cluster storage system
CN109032794A (en) * 2018-07-12 2018-12-18 广州市闲愉凡生信息科技有限公司 Cache object caching method of electronic commerce system
CN109274665A (en) * 2018-09-13 2019-01-25 北京奇安信科技有限公司 DNS threatens information processing method and device
CN110901691B (en) * 2018-09-17 2021-10-29 株洲中车时代电气股份有限公司 Ferroelectric data synchronization system and method and train network control system
CN110971627B (en) * 2018-09-28 2022-08-05 杭州海康威视数字技术股份有限公司 Node control method and device and task processing system
CN109639777B (en) * 2018-11-28 2021-12-10 优刻得科技股份有限公司 Data synchronization method, device and system and non-volatile storage medium
CN109388351A (en) * 2018-12-18 2019-02-26 平安科技(深圳)有限公司 A kind of method and relevant apparatus of Distributed Storage
CN109683826B (en) * 2018-12-26 2023-08-29 北京百度网讯科技有限公司 Capacity expansion method and device for distributed storage system
CN109977077B (en) * 2019-03-25 2021-09-24 腾讯科技(深圳)有限公司 Model file storage method and device, readable storage medium and computer equipment
CN110245028B (en) * 2019-05-13 2023-08-25 平安科技(深圳)有限公司 Message storage method, device, computer equipment and storage medium of IoT-MQ
CN110442773B (en) * 2019-08-13 2023-07-18 深圳市网心科技有限公司 Node caching method, system and device in distributed system and computer medium
CN112565325B (en) * 2019-09-26 2022-09-23 华为云计算技术有限公司 Mirror image file management method, device and system, computer equipment and storage medium
CN111046009B (en) * 2019-11-11 2023-11-21 杭州迪普科技股份有限公司 Log storage method and device
CN111090687B (en) * 2019-12-24 2023-03-10 腾讯科技(深圳)有限公司 Data processing method, device and system and computer readable storage medium
CN112434015B (en) * 2020-12-08 2022-08-19 新华三大数据技术有限公司 Data storage method and device, electronic equipment and medium
CN112650737B (en) * 2020-12-31 2024-03-19 北京大米科技有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN114745281B (en) * 2022-04-11 2023-12-05 京东科技信息技术有限公司 Data processing method and device
CN115981848B (en) * 2022-12-17 2024-05-28 上海律保科技有限公司 Memory database fragment adjustment method and equipment
CN116361299B (en) * 2023-05-31 2023-10-10 天翼云科技有限公司 Hash distribution method and system without data migration during system capacity expansion

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102194002A (en) * 2011-05-25 2011-09-21 中兴通讯股份有限公司 Table entry adding, deleting and searching method of hash table and hash table storage device
CN102682116A (en) * 2012-05-14 2012-09-19 中兴通讯股份有限公司 Method and device for processing table items based on Hash table
CN103186668A (en) * 2013-03-11 2013-07-03 北京京东世纪贸易有限公司 Method and device for processing data as well as data storage system based on key value data base
CN105159985A (en) * 2015-08-31 2015-12-16 努比亚技术有限公司 Data query device and method based on redis cluster

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7412449B2 (en) * 2003-05-23 2008-08-12 Sap Aktiengesellschaft File object storage and retrieval using hashing techniques
CN103379156B (en) * 2012-04-24 2016-03-09 深圳市腾讯计算机系统有限公司 Realize the mthods, systems and devices of memory space dynamic equalization

Also Published As

Publication number Publication date
CN105516367A (en) 2016-04-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant