CN103036717B - Consistency maintenance system and method for distributed data


Publication number: CN103036717B
Authority: CN (China)
Prior art keywords: data, node, maintenance, message, lock
Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion)
Application number: CN201210535376.3A
Other languages: Chinese (zh)
Other versions: CN103036717A (en)
Inventors: 赵耀, 宋颖莹, 杨放春, 邹华, 牛琨, 张文涛, 万能, 彭书凯, 邹志勇
Current Assignee: Beijing University of Posts and Telecommunications (the listed assignee may be inaccurate)
Original Assignee: Beijing University of Posts and Telecommunications
Application filed by: Beijing University of Posts and Telecommunications
Priority to: CN201210535376.3A
Publication of application: CN103036717A
Publication of grant: CN103036717B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A consistency maintenance system and method for distributed data in a distributed cluster system. The system consists of a cluster management node, a global lock management node and multiple data maintenance nodes dispersed throughout the system. The data maintenance nodes are all homogeneous nodes, each storing one or more data copies and each provided with three modules: event listening, data update processing and event sending. The global lock management node stores and manages the update locks of all data in the system and stores the information of all data maintenance nodes; it comprises three parts: a message receiving and sending module, update lock request queues, and a data maintenance node information management module. The cluster management node manages the latest information of all nodes in the system and periodically checks the state of each node. The method is reliable and flexible, and a client can initiate data updates to multiple target data items at the same time. The data update process is simple to operate, incurs little communication overhead and has a short update delay, so the invention has good application prospects.

Description

Consistency maintenance system and method for distributed data
Technical field
The present invention relates to a consistency maintenance system and method for distributed data, and belongs to the field of computer technology.
Background art
A distributed system is a multiprocessor architecture, interconnected by a communication network, that performs distributed processing tasks. Every data item in a distributed cluster system is stored on multiple nodes of the cluster. When the data on one cluster node is updated, the copies on the other nodes must undergo the same update operation so that all copies of each data item in the whole cluster remain consistent. How to achieve and guarantee this is a focus of attention for practitioners in the field.
The prior art offers five main methods for maintaining distributed data consistency, each with its own shortcomings. They are described in turn below:
Primary update method: when the system holds multiple copies of a data item, one of them is designated the primary data. On each update, the primary data is updated first; once that succeeds, an update message is sent from the primary data to the other copies so that they perform the same update operation. Its shortcomings are that the update message must reach all data nodes in as short a time as possible, otherwise the system produces stale data, and that once the node holding the primary data fails, the copy nodes cannot obtain the update message.
Mirror update method: this method does not consider the copies of each data item; it considers only each primary data item and the mirrors defined on it. When the primary data is updated, the mirrors defined on it are refreshed at the same time. Its shortcoming is that mirror data is read-only and cannot be written, which is restrictive.
Lazy update method: the update operation is not executed immediately but only when the data is accessed. Its drawback is that it increases the delay of client data access, especially when multiple concurrent accesses take place at the same time, and it easily causes deadlock.
Message queue method: all update operation messages are stored in a single message queue shared by all nodes in the system, and the messages in the queue are processed on a first-in, first-out basis. Because the length of the queue is limited, the queue easily overflows when there are many concurrent update messages with large payloads, and update messages are lost.
Transaction control method: a transaction is a series of atomic operations by which data moves from one consistent state to another. The shortcomings of this method are that a failed transaction update increases the frequency of transaction restarts, and that data updates easily become inconsistent when local and global transactions execute at the same time.
In short, the prior-art consistency update methods above all handle imperfectly the node failures and network communication failures that may occur during a data update. Practitioners in the field are therefore still actively seeking a solution.
Summary of the invention
In view of this, the object of the present invention is to provide a consistency maintenance system and method for distributed data that maintains the consistency of distributed data in a distributed network environment, effectively reduces network communication overhead, and improves the reliability, efficiency and availability of distributed data updates.
To achieve this object, the invention provides a consistency maintenance system for distributed data in a distributed cluster system, characterized in that the system consists of a cluster management node, a global lock management node and multiple data maintenance nodes, wherein:
Multiple data maintenance nodes are dispersed throughout the distributed cluster system. Each data maintenance node is a homogeneous node that can receive data update requests from clients, has the same processing functions as the others and has its own unique identifier. Each data maintenance node stores copies of one or more data items. Each data item carries a unique identifier, different from that of every other data item, and a version number; all copies of the same data item share the same unique identifier, and the version number changes accordingly after every successful update. Each node is provided with three modules: event listening, data update processing and event sending;
The global lock management node stores and manages the update locks of all data in the system and stores the information of all data maintenance nodes in the system, which it uses to broadcast data update messages. It also assigns the nodes holding copies of the same data to the same group; each data maintenance node may belong to multiple groups. The global lock management node receives data update lock request messages; once the update lock of a data item has been obtained, data transfer between the data maintenance nodes in the cluster does not pass through the global lock management node, which reduces its communication load and the probability of failure and improves system reliability. When a data maintenance node needs to update data, it first applies to the global lock management node for the update lock of the target data it is about to update. If that update lock is temporarily unavailable, the lock request is placed in the corresponding update lock request queue until the lock becomes available. The node comprises three parts connected in sequence: a message receiving and sending module, update lock request queues and a data maintenance node information management module;
The cluster management node manages the information of all nodes in the system, including each node's IP address, listening port number, running status and stored data, and supplies the latest information on each node to the data maintenance node information management module in the global lock management node. It periodically checks the running status of each data maintenance node, stores the information of data maintenance nodes that are running normally and reachable, and deletes the information of nodes that are running abnormally or unreachable. This ensures that the cluster management node stores only reachable data maintenance nodes and that unreachable nodes do not take part in consistency maintenance updates.
The update lock of a data item restricts access to that data: to update a data item, a data maintenance node must first apply to the global lock management node for the update lock of the target data. Once the update lock has been obtained, the target data accepts no further read or write requests from clients until the lock is released. If the update lock of the target data is temporarily unavailable, the update request is placed in the corresponding update lock request queue until the lock becomes available.
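The update lock behaviour described above can be illustrated with a minimal Python sketch. The class name `GlobalLockManager` and its methods are hypothetical and are not taken from the patent; the sketch assumes one lock and one first-in, first-out request queue per data item and ignores networking entirely.

```python
from collections import deque


class GlobalLockManager:
    """Hypothetical sketch: one update lock and one FIFO request queue per data item."""

    def __init__(self):
        self.holder = {}    # data_id -> node_id currently holding the update lock
        self.waiting = {}   # data_id -> deque of node_ids waiting for the lock

    def request(self, data_id, node_id):
        """Grant the lock if it is free; otherwise queue the request. True if granted."""
        if data_id not in self.holder:
            self.holder[data_id] = node_id
            return True
        self.waiting.setdefault(data_id, deque()).append(node_id)
        return False

    def release(self, data_id):
        """Release the lock and hand it to the oldest waiting request, if any."""
        self.holder.pop(data_id, None)
        queue = self.waiting.get(data_id)
        if queue:
            next_node = queue.popleft()
            self.holder[data_id] = next_node
            return next_node
        return None

    def is_locked(self, data_id):
        """While locked, the data item accepts no client reads or writes."""
        return data_id in self.holder


locks = GlobalLockManager()
granted_a = locks.request("d1", "node-A")   # lock is free, so it is granted
granted_b = locks.request("d1", "node-B")   # lock is busy, so the request is queued
next_holder = locks.release("d1")           # the queued request of node-B is served
```

Keeping a separate queue per data item is what allows lock requests for different data items to proceed independently of one another.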
To achieve the above object, the invention also provides a working method for the consistency maintenance system for distributed data in a distributed cluster system described above, characterized in that the method comprises the following steps:
(1) The data maintenance node that initiates a data update request, that is, the temporary master node, which manages only the update requests it initiates itself, first sends the global lock management node a data update lock request message containing its own unique node identifier and the unique identifier of the target data it wishes to update. At this point the temporary master node does not update the target data and remains in a waiting-to-update state;
(2) The message receiving and sending module in the global lock management node receives the data update lock request message and determines whether it contains update requests for multiple target data items. If so, it splits the message into multiple independent data update request messages and places each in the update lock request queue of the corresponding data; if not, it sends the message directly to the update lock request queue of the target data. Messages in an update lock request queue are sent to the data maintenance node information management module in first-in, first-out order;
(3) On receiving the message, the data maintenance node information management module first finds the data maintenance node group that holds the target data and broadcasts the data update request message to the data maintenance nodes in that group. At the same time it sets the initial value of the message-receiving time window and prepares to receive, within that window, the confirmation messages in which the data maintenance nodes holding the target data report whether the update succeeded;
(4) When the other data maintenance nodes holding the target data receive the data update request message, they parse it and perform the update operation it specifies. For a delete operation, the node directly deletes the corresponding data in the target data and increases the version number of the data. For a rollback operation, it directly rolls back the most recent update performed on the target data and decreases the version number. For an add or modify operation, the node first obtains the new data content from the temporary master node, then updates the target data and increases the version number;
(5) After the other data maintenance nodes holding the target data have updated it successfully, each sends the global lock management node an update-success confirmation message containing the version number of the target data after the update;
(6) The data maintenance node information management module in the global lock management node resends the data update request to any data maintenance node whose update failed, up to the maximum number of retransmissions, until it receives an update-success confirmation message. It then determines whether it has received update-success confirmation messages from all data maintenance nodes holding the target data other than the temporary master node, and takes the corresponding follow-up action;
(7) After the data maintenance node information management module has completed all operations of one data update in the distributed cluster system, it adjusts the value of its message-receiving time window to adapt to changes in the network environment.
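The text does not say how the time window of step (7) is adjusted. As one possible illustration only, the sketch below assumes an exponentially weighted moving average of the observed confirmation delay, in the spirit of TCP round-trip estimation. The class `ReceiveWindow`, its parameters and all default values are assumptions, not part of the patent.

```python
class ReceiveWindow:
    """Hypothetical adaptive time window for collecting confirmation messages."""

    def __init__(self, initial=1.0, alpha=0.125, margin=2.0, floor=0.05, ceiling=30.0):
        self.smoothed = initial / margin  # smoothed confirmation delay, in seconds
        self.alpha = alpha                # weight given to the newest observation
        self.margin = margin              # safety factor applied to the estimate
        self.floor = floor
        self.ceiling = ceiling

    @property
    def value(self):
        """Current window length in seconds, clamped to [floor, ceiling]."""
        return min(self.ceiling, max(self.floor, self.smoothed * self.margin))

    def observe(self, slowest_delay):
        """Fold the slowest confirmation delay of the finished update into the estimate."""
        self.smoothed = (1 - self.alpha) * self.smoothed + self.alpha * slowest_delay
        return self.value


window = ReceiveWindow(initial=1.0)
after_slow = window.observe(2.0)    # confirmations arrived slowly: the window grows
after_fast = window.observe(0.1)    # confirmations arrived quickly: the window shrinks
```

A smoothed estimate reacts gradually, so a single delayed confirmation does not inflate the window for every later update.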
The consistency maintenance system and method for distributed data of the invention have the following advantages:
The data maintenance nodes in the distributed cluster system of the invention are all homogeneous peers, and there is no permanent master node. When the data maintenance node that initiated an update operation fails, another data maintenance node in the cluster holding the same data can be chosen to initiate the update request again, which improves the reliability and flexibility of the system. In addition, the data consistency provided by the invention is strong consistency, and a client can initiate update operations on multiple target data items at the same time. Data version numbers are introduced into the update process to reduce the unnecessary overhead caused by network failures.
Furthermore, the data maintenance node that initiates an update operation performs its own update only in the final stage. When the update fails on some data maintenance node, this strategy spares the initiating node the cost of a rollback and shortens the execution time of the consistency update process.
In short, the invention combines the reliability of distributed computing (data is distributed over multiple data maintenance nodes in the cluster, and the data maintenance nodes are all homogeneous) with the efficiency and availability of centralized management (the global lock management node manages the update locks of all data in the cluster). The invention therefore has good application prospects.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall structure of the distributed data consistency maintenance system of the invention.
Fig. 2 is a schematic diagram of the internal structure of a data maintenance node in the system of the invention.
Fig. 3 is a schematic diagram of the internal structure of the global lock management node in the system of the invention.
Fig. 4 is a sequence diagram of distributed data consistency maintenance by the working method of the system of the invention under normal conditions.
Fig. 5 is a sequence diagram of distributed data consistency maintenance by the working method of the system of the invention when a data maintenance node fails to update.
Fig. 6 is a sequence diagram of distributed data consistency maintenance by the method of the invention in the HP M situation.
Detailed description of the embodiments
To make the object, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
The application scenario of the consistency maintenance system and method for distributed data of the invention is as follows:
When a client initiates a data update request (an add, delete or modify) at some data maintenance node of the system, that data maintenance node first sends the request to the global lock management node, which forwards it to every other data maintenance node holding the target data, that is, all of them except the node that initiated the request. Only after the data on the other data maintenance nodes has been updated successfully does the initiating node perform the update on the target data; it then returns a confirmation message to the client that initiated the request, reporting that the update of the target data is complete. If the update of the target data fails, all data maintenance nodes abandon the update, which ensures that all data in the system remains consistent under all circumstances.
Referring to Fig. 1, the structure of the distributed data consistency maintenance system of the invention is as follows: it consists of a cluster management node, a global lock management node and multiple data maintenance nodes, wherein:
Multiple data maintenance nodes: dispersed throughout the distributed cluster system. Each data maintenance node is a homogeneous node that can receive data update requests from clients, has the same processing functions as the others and has its own unique identifier. Each data maintenance node stores copies of one or more data items. Each data item carries a unique identifier, different from that of every other data item, and a version number; all copies of the same data item share the same unique identifier, and the version number changes accordingly after every successful update. Each node is provided with three modules: event listening, data update processing and event sending (see Fig. 2). The functions of these modules are:
Event listening module: listens for data update request messages sent by clients to this data maintenance node, and for data update request messages and/or data rollback operation messages sent to this data maintenance node by other nodes in the system, and passes the received messages to the data update processing module for processing. A data rollback cancels the most recently performed data update operation.
Data update processing module: stores data update request messages from clients in the client data update request message queue inside this module, so that the messages in the queue are processed in first-in, first-out order, and at the same time sends a data update lock request message to the event sending module. For data update request messages and/or data rollback operation messages received from other nodes, it directly performs the data update and/or data rollback operation on the data maintenance node. Data update operations comprise adding, deleting or modifying data content. After each successful update or rollback it changes the version number of the data, and it sends the confirmation message reporting the success or failure of the update or rollback to the event sending module.
Event sending module: sends the received confirmation messages reporting the success or failure of a data update or rollback, and data update lock request messages, to the global lock management node.
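The version handling performed when a data maintenance node applies an update or a rollback can be sketched as follows. This is a minimal illustration under stated assumptions: the class `Replica`, the dictionary-based content and the single-level undo snapshot are hypothetical choices, not details given in the patent.

```python
class Replica:
    """Hypothetical copy of one data item held by a data maintenance node."""

    def __init__(self, data_id, content=None):
        self.data_id = data_id
        self.content = dict(content or {})
        self.version = 0
        self._previous = None   # snapshot taken before the most recent update

    def apply(self, op, key, value=None):
        """Apply an add, modify or delete operation and increase the version number."""
        if op not in ("add", "modify", "delete"):
            raise ValueError(f"unknown operation: {op}")
        self._previous = dict(self.content)
        if op == "delete":
            self.content.pop(key, None)
        else:
            self.content[key] = value
        self.version += 1
        return self.version

    def rollback(self):
        """Undo the most recent update and decrease the version number."""
        if self._previous is None:
            return False    # nothing to undo
        self.content = self._previous
        self._previous = None
        self.version -= 1
        return True


replica = Replica("d1", {"name": "old"})
replica.apply("modify", "name", "new")   # version 0 -> 1
replica.apply("delete", "name")          # version 1 -> 2
undone = replica.rollback()              # back to version 1 with the content restored
```

Only the most recent update can be undone here, which matches the description of a rollback as cancelling the last update operation performed.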
Global lock management node: stores and manages the update locks of all data in the system and stores the information of all data maintenance nodes in the system, which it uses to broadcast data update messages. It also assigns the nodes holding copies of the same data to the same group; each data maintenance node may belong to multiple groups. The global lock management node receives data update lock request messages; once the update lock of a data item has been obtained, data transfer between the data maintenance nodes in the cluster does not pass through the global lock management node, which reduces its communication load and the probability of failure and improves system reliability. When a data maintenance node performs a data update, it first applies to the global lock management node for the update lock of the target data it is about to update. If that update lock is temporarily unavailable, the lock request is placed in the corresponding update lock request queue until the lock becomes available. The node comprises three parts connected in sequence: a message receiving and sending module, update lock request queues and a data maintenance node information management module (see Fig. 3). The functions of these three modules are as follows:
Message receiving and sending module: receives the data update request messages, for one or more data items, sent by data maintenance nodes. If an update request message contains update requests for multiple data items, the module splits them into mutually independent data update requests and sends each to the update lock request queue of the corresponding data, so that the update requests for the individual target data items are processed independently and in parallel and multiple target data items can be updated at the same time. The module also receives from the update lock request queues the confirmation messages reporting whether a data update has completed, and forwards each to the data maintenance node that initiated the update request; that node in turn passes the confirmation message back to the client, which uses it to determine whether the data update succeeded.
Update lock request queues: manage the update lock request messages for all data in the system and process the messages in each queue sequentially in first-in, first-out order. There are multiple data update lock request queues, one for each data item in the system. Messages in a queue are sent to the data maintenance node information management module, and the queue receives the update lock release request messages that module returns, which allow it to process the next update lock request message; after processing, it sends the message receiving and sending module a confirmation message reporting whether the data update has completed.
Data maintenance node information management module: divides the dispersed data maintenance nodes into node groups, where the nodes of each group share at least one identical data item, and manages the information of each node group, including each node's IP address, listening port number and whether it is running normally. The module has a message-receiving time window whose size it adjusts in real time as the network environment changes. It broadcasts data update messages to all data maintenance nodes in the system that hold a copy of the data and receives the confirmation messages in which those nodes report whether the update or rollback succeeded; if a confirmation message does not arrive within the duration of the time window, the module stops waiting for it. At set intervals it also obtains from the cluster management node the latest status information of all data maintenance nodes in the system, including each node's IP address, listening port number, running status and stored data copies, and synchronizes its own stored information with that latest information. In this way, when the distributed system performs a consistency maintenance update, the overhead of rolling back nodes that updated successfully because an unreachable node failed to update is reduced, which improves the reliability and efficiency of data update processing. The module also sets the maximum number of retransmissions of data update requests and the maximum number of retransmissions of data rollback requests.
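Two of the responsibilities described above, splitting a multi-target request into per-data queues and finding the broadcast targets for one data item, can be sketched in Python. The names `NodeGroups`, `split_request` and the dictionary message layout are illustrative assumptions rather than structures defined by the patent.

```python
from collections import defaultdict, deque


class NodeGroups:
    """Hypothetical index from a data identifier to the nodes holding a copy of it."""

    def __init__(self):
        self.groups = defaultdict(set)

    def register(self, node_id, data_ids):
        # A node storing several data items belongs to several groups.
        for data_id in data_ids:
            self.groups[data_id].add(node_id)

    def targets(self, data_id, initiator):
        """Nodes that must receive the broadcast: the group minus the initiating node."""
        return sorted(self.groups[data_id] - {initiator})


def split_request(request, queues):
    """Split a multi-target lock request into one independent request per data item."""
    for data_id in request["data_ids"]:
        single = {"node_id": request["node_id"], "data_id": data_id}
        queues[data_id].append(single)   # FIFO queue dedicated to this data item


groups = NodeGroups()
groups.register("node-A", ["d1", "d2"])
groups.register("node-B", ["d1"])
groups.register("node-C", ["d2"])

queues = defaultdict(deque)
split_request({"node_id": "node-A", "data_ids": ["d1", "d2"]}, queues)
broadcast_d1 = groups.targets("d1", "node-A")
```

Because each data item has its own queue, the two requests produced by the split can be served in parallel, as the module description requires.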
Cluster management node: manages the information of all nodes in the system, including each node's IP address, listening port number, running status and stored data, and supplies the latest information on each node to the data maintenance node information management module in the global lock management node. It periodically checks the running status of each data maintenance node, stores the information of data maintenance nodes that are running normally and reachable, and deletes the information of nodes that are running abnormally or unreachable. This ensures that the cluster management node stores only reachable data maintenance nodes and that unreachable nodes do not take part in consistency maintenance updates.
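One detection cycle of the cluster management node might look like the sketch below. The class `ClusterManager` and the injected `probe` callable are hypothetical; how reachability is actually tested (heartbeat, TCP connect or otherwise) is not stated in the text and is left to the caller here.

```python
class ClusterManager:
    """Hypothetical registry that keeps only reachable data maintenance nodes."""

    def __init__(self, probe):
        self.probe = probe   # callable(node_info) -> bool, supplied by the caller
        self.nodes = {}      # node_id -> {"ip": ..., "port": ..., "data_ids": [...]}

    def register(self, node_id, ip, port, data_ids):
        self.nodes[node_id] = {"ip": ip, "port": port, "data_ids": list(data_ids)}

    def check_once(self):
        """Run one detection cycle and delete every node the probe reports unreachable."""
        removed = [nid for nid, info in self.nodes.items() if not self.probe(info)]
        for nid in removed:
            del self.nodes[nid]
        return removed

    def snapshot(self):
        """Latest node information, as the global lock management node would fetch it."""
        return {nid: dict(info) for nid, info in self.nodes.items()}


down = {"10.0.0.2"}   # pretend this address stopped answering
manager = ClusterManager(probe=lambda info: info["ip"] not in down)
manager.register("node-A", "10.0.0.1", 9000, ["d1"])
manager.register("node-B", "10.0.0.2", 9000, ["d1"])
removed = manager.check_once()
```

In a running system `check_once` would be called on a timer, and the global lock management node would pull `snapshot()` at its own set interval.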
Referring to Figs. 4 to 6, the working method of the consistency maintenance system for distributed data in a distributed cluster system of the invention is described below. For convenience, the data maintenance node that initiates a data update request is called the temporary master node; a temporary master node manages only the update requests it initiates itself. The method comprises the following steps:
Step 1, the data maintenance node that initiates a data update request (the temporary master node) sends the global lock management node a data update lock request message containing its own unique node identifier and the unique identifier of the target data it wishes to update. At this point the temporary master node does not update the target data and remains in a waiting-to-update state.
Step 2, the message receiving and sending module in the global lock management node receives the data update lock request message and determines whether it contains update requests for multiple target data items. If so, it splits the message into multiple independent data update request messages and places each in the update lock request queue of the corresponding data; if not, it sends the message directly to the update lock request queue of the target data. Messages in an update lock request queue are sent to the data maintenance node information management module in first-in, first-out order.
Step 3, on receiving the message, the data maintenance node information management module first finds the data maintenance node group that holds the target data and broadcasts the data update request message to the data maintenance nodes in that group. At the same time it sets the initial value of the message-receiving time window and prepares to receive, within that window, the confirmation messages in which the data maintenance nodes holding the target data report whether the update succeeded.
Step 4, when the other data maintenance nodes holding the target data receive the data update request message, they parse it and perform the update operation it specifies. For a delete operation, the node directly deletes the corresponding data in the target data and increases the version number of the data. For a rollback operation, it directly rolls back the most recent update performed on the target data and decreases the version number. For an add or modify operation, the node first obtains the new data content from the temporary master node, then updates the target data and increases the version number.
Step 5, after the other data maintenance nodes holding the target data have updated it successfully, each sends the global lock management node an update-success confirmation message containing the version number of the target data after the update.
Step 6, the data maintenance node information management module in the global lock management node determines whether it has received update-success confirmation messages from all data maintenance nodes holding the target data other than the temporary master node, and takes the corresponding follow-up action as follows:
First, referring to Fig. 4, the sequence of the data consistency maintenance operation under normal conditions is described. Normal conditions means that network communication between the data maintenance nodes in the distributed system has not failed and that every data maintenance node is itself working properly without any fault. The steps are:
(61) The data maintenance node information management module sends the temporary master node a confirmation message reporting that the target data was updated successfully. On receiving it, the temporary master node takes the head message from its client data update request message queue, performs the corresponding update on the target data on its own node and increases its version number, and at the same time sends an update-success confirmation message for the target data to both the data maintenance node information management module and the client.
(62) On receiving the update-success confirmation message from the temporary master node, the data maintenance node information management module sends an update lock release request message for the target data to the update lock request queue of the target data, so that the queue releases the update lock of the target data for use by the next update lock request.
If in step (6), when data maintenance nodal information administration module does not receive Data Update that all other data maintenance nodes containing target data send successful acknowledge message, namely occur that following three kinds of abnormal conditions (specifically: the network connection between the failure of data maintenance node updates, data maintenance node is broken down, and network connectivity fai_lure exists between the failure of data maintenance node updates and data maintenance node simultaneously) time, the treatment step of execution is as described below:
(6A) The data maintenance node information management module resets the following two parameters, which govern the update-success acknowledgement messages sent by the data maintenance nodes that hold the target data: the size of the receive time window, and the maximum number of times a data update request is re-sent to a data maintenance node whose update failed. It then re-sends the update request message for the target data to every data maintenance node that has not sent an update-success acknowledgement.
(6B) When a data maintenance node receives this re-sent update request message, it first compares the version number of its target data with the data version number in the data update request message. If the two are identical, it sends an update-success acknowledgement directly to the data maintenance node information management module. Otherwise, it updates the target data again and then sends an update-success acknowledgement to the data maintenance node information management module.
(6C) After the data maintenance node information management module receives a data update failure message from a data maintenance node, it re-sends the data update request, up to the configured maximum number of retransmissions. If, within the configured time window, it receives update-success acknowledgements from all data maintenance nodes whose update had previously failed, it proceeds according to steps (61) and (62). Otherwise, it sends a target data rollback message to every data maintenance node that has already sent an update-success acknowledgement, so that each such node, on receiving the rollback message, undoes the update operation most recently performed on the target data and reports whether the rollback completed. The data maintenance node information management module then re-sends the rollback message, up to the configured maximum number of retransmissions, to any data maintenance node that has not sent a rollback-complete acknowledgement. Finally, it sends an acknowledgement message indicating that the target data update failed to the temporary master node.
(6D) After the temporary master node receives the acknowledgement that the update of the target data failed, it discards the head-of-queue message in the client data update request message queue and sends an update-failure acknowledgement for the target data to both the data maintenance node information management module and the client.
(6E) After the data maintenance node information management module receives the update-failure acknowledgement sent by the temporary master node, it sends an update lock release request message to the update lock request queue of the target data, so that the queue releases the update lock of the target data and hands it to the next update lock request message. After the client receives the update-failure acknowledgement, it either chooses another data maintenance node in the cluster that holds the target data and initiates the update request again, or abandons this data update request.
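The retry-then-rollback logic of step (6C) can be sketched as below. This is an illustrative Python sketch only: `retry_then_rollback`, `send_update` and `send_rollback` are hypothetical names, and each callback is assumed to return `True` when the corresponding acknowledgement arrives inside the receive time window and `False` otherwise.

```python
from typing import Callable


def retry_then_rollback(
    succeeded: set,
    failed: set,
    send_update: Callable[[str], bool],
    send_rollback: Callable[[str], bool],
    max_retries: int = 3,
) -> bool:
    """Return True if every replica ends up updated, False if the update is abandoned."""
    succeeded, pending = set(succeeded), set(failed)

    # Re-send the update to nodes that have not acknowledged, a bounded number of times.
    for _ in range(max_retries):
        if not pending:
            break
        newly_ok = {node for node in pending if send_update(node)}
        succeeded |= newly_ok
        pending -= newly_ok
    if not pending:
        return True  # continue with steps (61) and (62)

    # Some node never acknowledged: undo the update on the nodes that applied it,
    # again with a bounded number of retransmissions.
    to_roll_back = set(succeeded)
    for _ in range(max_retries):
        if not to_roll_back:
            break
        to_roll_back -= {node for node in to_roll_back if send_rollback(node)}
    return False  # the initiating node is then told that the update failed
```

Bounding both loops keeps the update lock from being held indefinitely when a node stays unreachable.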
Next, Fig. 5 shows the operation sequence diagram of data update processing when a data maintenance node fails to update. A data maintenance node update failure means that the data update operation failed because of a fault in the data maintenance node itself. The data consistency update process in this case differs from the normal-case process as follows: when the data maintenance node information management module has not received update-success messages from all data maintenance nodes, it re-broadcasts the data update request message to the data maintenance nodes that reported a data update failure.
Fig. 6 shows the operation sequence diagram of the data consistency maintenance method when a network failure occurs:
A network failure means either that the network connection between nodes is interrupted, or that network congestion increases message transmission delay so that a message does not arrive within the set time and is therefore discarded. Data update processing under a network failure differs from processing under the data maintenance node update failure described above as follows: when a data maintenance node receives the data update request message again, it does not immediately re-execute the data update operation. Instead, it first checks whether the version number of the target data matches the data version number that a successful update would produce. If so, it directly sends an update-success acknowledgement; if not, it re-executes the data update operation.
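The version check that makes a re-sent update request safe to receive twice can be illustrated with a small Python sketch. The `Replica` class, its `applied` counter and the `"update-success"` string are hypothetical and exist only for illustration; the request is assumed to carry the version number that the data should have after the update.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical data maintenance node holding one copy of the target data."""
    value: str
    version: int
    applied: int = 0  # counts how many times an update was actually executed

    def handle_update(self, new_value: str, target_version: int) -> str:
        if self.version == target_version:
            # The update already succeeded here; only the acknowledgement was lost.
            return "update-success"
        self.value = new_value
        self.version = target_version
        self.applied += 1
        return "update-success"
```

A second delivery of the same request returns the acknowledgement without executing the update again, which is the behaviour described for the network failure case.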
When a data maintenance node update failure and a network failure are present at the same time, the data update processing of the present invention combines the handling of the two abnormal conditions above. This combined handling is also what the method of the invention normally uses when performing data consistency maintenance updates, because the system itself cannot determine which condition caused a data update to fail.
Step 7: In the distributed cluster system, after the data maintenance node information management module completes all operations of one data update, it adjusts the value of its message receive time window to adapt to changes in the network environment.
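The text does not state how the time window is adjusted. Purely as an assumption, the Python sketch below borrows an exponentially weighted moving average in the spirit of TCP retransmission timeout estimation; the class name `AckWindow` and the parameters `alpha`, `margin`, `floor` and `ceiling` are hypothetical and are not taken from the patent.

```python
class AckWindow:
    """Receive time window that tracks a smoothed acknowledgement latency (assumed scheme)."""

    def __init__(self, initial=1.0, alpha=0.125, margin=2.0, floor=0.05, ceiling=30.0):
        self.alpha = alpha      # weight given to the newest latency sample
        self.margin = margin    # safety factor applied to the smoothed latency
        self.floor = floor      # smallest allowed window, in seconds
        self.ceiling = ceiling  # largest allowed window, in seconds
        self.smoothed = initial / margin  # chosen so that the first window equals `initial`
        self.window = initial

    def observe(self, slowest_ack_seconds: float) -> float:
        """Fold in the slowest acknowledgement latency of the update just completed."""
        self.smoothed = (1 - self.alpha) * self.smoothed + self.alpha * slowest_ack_seconds
        self.window = min(max(self.margin * self.smoothed, self.floor), self.ceiling)
        return self.window
```

Calling `observe` once per completed update lets the window widen under congestion and shrink again when acknowledgements arrive quickly, while the floor and ceiling keep it within fixed bounds.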

Claims (8)

1. A consistency maintenance system for distributed data in a distributed cluster system, characterized in that the system consists of a cluster management node, a global lock management node and multiple data maintenance nodes, wherein:
The multiple data maintenance nodes are distributed throughout the distributed cluster system. Each data maintenance node is an isomorphic node that can receive data update requests from clients, has the same processing functions as the others, and has its own unique identifier. Each data maintenance node stores copies of one or more data items. Each data item has its own distinct unique identifier and version number, while all copies of the same data item share the same unique identifier; after each successful data update, the version number changes accordingly. Each data maintenance node is provided with three modules: event listening, data update processing and event sending;
The global lock management node is responsible for storing and managing the update locks of all data in the system, and for storing the information of all data maintenance nodes in the system for use in broadcasting data update messages. It also assigns nodes that hold copies of the same data to the same group; each data maintenance node can belong to multiple groups. The global lock management node receives data update lock request messages; once the update lock of the data has been obtained successfully, data transfer between the data maintenance nodes in the distributed cluster system does not pass through the global lock management node. When a data maintenance node needs to update data, it first applies to the global lock management node for the update lock of the target data to be updated. If that update lock is temporarily unavailable, the update lock request for the target data is placed in the corresponding update lock request queue until the update lock of the target data becomes available. The global lock management node is provided with three sequentially connected parts: a message receiving and sending module, update lock request queues, and a data maintenance node information management module;
The cluster management node is responsible for managing the information of all nodes in the system, including their IP addresses, listening port numbers, running states and stored data, and for providing up-to-date information on each node to the data maintenance node information management module in the global lock management node. It periodically checks the running state of each data maintenance node in the system, stores the information of data maintenance nodes that are running normally and reachable, and deletes the information of data maintenance nodes that are running abnormally or unreachable. This ensures that the cluster management node stores only information about reachable data maintenance nodes, and that unreachable nodes do not take part in data consistency maintenance updates.
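The per-data update lock with a first-in-first-out request queue described for the global lock management node can be sketched as follows. This is a hedged Python illustration: `UpdateLockManager`, `request` and `release` are hypothetical names, and the sketch runs in a single process with no networking.

```python
from collections import defaultdict, deque


class UpdateLockManager:
    """One update lock and one FIFO request queue per data identifier (illustrative)."""

    def __init__(self):
        self._holder = {}                   # data_id -> node currently holding the lock
        self._waiting = defaultdict(deque)  # data_id -> FIFO queue of waiting nodes

    def request(self, data_id: str, node_id: str) -> bool:
        """Grant the lock if it is free; otherwise queue the request and return False."""
        if data_id not in self._holder:
            self._holder[data_id] = node_id
            return True
        self._waiting[data_id].append(node_id)
        return False

    def release(self, data_id: str, node_id: str):
        """Release the lock and hand it to the next waiting node, if any."""
        if self._holder.get(data_id) != node_id:
            raise ValueError("lock is not held by this node")
        queue = self._waiting[data_id]
        if queue:
            next_node = queue.popleft()
            self._holder[data_id] = next_node
            return next_node
        del self._holder[data_id]
        return None
```

Because each data identifier has its own queue, requests for different data items never wait on one another.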
2. The maintenance system according to claim 1, characterized in that the application scenario of the system is as follows: when a client initiates a data update request (an addition, deletion or modification) at a data maintenance node in the system, that data maintenance node first sends the data update request to the global lock management node, which forwards it to every other data maintenance node that holds the target data, excluding the node that initiated the update request. Only after the data on the other data maintenance nodes has all been updated successfully does the initiating data maintenance node perform the update operation on the target data; it then returns an acknowledgement that the target data update is complete to the client that initiated the data update request. If the target data update fails, all data maintenance nodes abandon this data update, ensuring that all data in the system remains consistent under all circumstances.
3. The maintenance system according to claim 1, characterized in that the update lock of a data item is used to restrict access to that data: to update a data item, a data maintenance node must first apply to the global lock management node for the update lock of the target data to be updated. Once the update lock of the target data has been obtained, the target data no longer accepts read or write requests from clients until its update lock is released. If the update lock of the target data is temporarily unavailable, the update request for the target data is placed in the corresponding update lock request queue until the update lock becomes available.
4. The maintenance system according to claim 1, characterized in that the functions of the modules in the data maintenance node are as follows:
The event listening module listens for data update request messages sent by clients to this data maintenance node, and for data update request messages and/or data rollback messages sent by other nodes in the system to this data maintenance node, and passes the received messages to the data update processing module for processing. A data rollback undoes the most recently performed data update operation;
The data update processing module first stores data update request messages from clients in the client data update request message queue inside this module, so that messages in the queue are processed in first-in-first-out order, and at the same time sends a data update lock request message to the event sending module. For data update request messages and/or data rollback messages received from other nodes, it directly performs the data update operation and/or the corresponding data rollback operation on the data maintenance node. Data update operations comprise adding, deleting or modifying data content. After each successful data update or successful rollback, it changes the version number of the data and sends the acknowledgement of update success or failure, or of rollback success or failure, to the event sending module;
The event sending module sends the received acknowledgements of data update success or failure, acknowledgements of rollback success or failure, and data update lock request messages to the global lock management node.
5. The maintenance system according to claim 1, characterized in that the functions of the modules in the global lock management node are as follows:
The message receiving and sending module receives data update request messages, sent by data maintenance nodes, that update one or more data items. If an update request message contains update requests for multiple data items, the module splits the update requests for the multiple target data in the message into mutually independent data update requests and sends each to the update lock request queue of the corresponding data, so that the update request for each target data is processed separately and in parallel, which supports updating multiple target data at the same time. The module also receives the acknowledgement of whether the data update completed, sent by the update lock request queue, and forwards it to the data maintenance node that initiated the data update request; that data maintenance node in turn returns the acknowledgement to the client, so that the client can determine from the acknowledgement whether the data update succeeded;
The update lock request queues manage the update lock request messages of all data in the system and process the messages in each queue in first-in-first-out order. There are multiple data update lock request queues, each corresponding to a different data item in the system. A queue sends its messages to the data maintenance node information management module and receives the update lock release request message that the module returns, which allows the queue to process the next update lock request message; after processing completes, it sends the acknowledgement of whether the data update completed to the message receiving and sending module;
The data maintenance node information management module divides the scattered data maintenance nodes into different node groups, where the nodes in each group hold at least one identical data item, and manages the IP address, listening port number and running-state information of each node. It has a message receive time window and adjusts the window size in real time according to changes in the network environment. It broadcasts data update messages to all data maintenance nodes in the system that hold a copy of the data, and receives the returned acknowledgements of whether the data update or rollback succeeded; if an acknowledgement does not arrive within the duration set by the time window, the module stops waiting for that message. At set intervals it also obtains from the cluster management node the latest information on all data maintenance nodes in the system, including each node's IP address, listening port number, running state and stored data copies. The module also sets the maximum number of retransmissions for data update requests and for data rollback requests.
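Grouping nodes by the data they hold, and splitting a multi-data request into independent per-data requests, might look like the following Python sketch. The function names `build_groups` and `split_request` and the dictionary message layout are assumptions made for illustration only.

```python
from collections import defaultdict


def build_groups(stored: dict) -> dict:
    """Map each data identifier to the set of nodes holding a copy of it."""
    groups = defaultdict(set)
    for node_id, data_ids in stored.items():
        for data_id in data_ids:
            groups[data_id].add(node_id)
    return dict(groups)


def split_request(origin: str, updates: dict, groups: dict) -> list:
    """Turn one multi-data request into independent per-data requests.

    Each resulting request lists the nodes to broadcast to, which excludes the
    node that initiated the update.
    """
    return [
        {
            "data_id": data_id,
            "op": op,
            "targets": sorted(groups.get(data_id, set()) - {origin}),
        }
        for data_id, op in updates.items()
    ]
```

A node that stores several data items appears in several groups, matching the statement that each data maintenance node can belong to multiple groups.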
6. A working method applied to the consistency maintenance system for distributed data in a distributed cluster system according to claim 1, characterized in that the method comprises the following operating steps:
(1) The data maintenance node that initiates a data update request, referred to as the temporary master node, which manages only the data update requests that it initiated itself, first sends to the global lock management node a data update lock request message containing its own node identifier and the unique identifier of the target data it wants to update. At this point the temporary master node does not update the target data and remains in a waiting-to-update state;
(2) The message receiving and sending module in the global lock management node receives the data update lock request message and determines whether it contains update requests for multiple target data. If so, it splits the message into multiple independent data update request messages and places each in the update lock request queue of the corresponding data; if not, it sends the message directly to the update lock request queue of the target data. Messages in an update lock request queue are sent to the data maintenance node information management module in first-in-first-out order;
(3) After the data maintenance node information management module receives the message, it first looks up the group of data maintenance nodes that hold the target data and broadcasts the data update request message to the data maintenance nodes in that group. At the same time, it sets the initial value of the message receive time window and prepares to receive, within that window, the acknowledgements from the data maintenance nodes holding the target data indicating whether their data was updated successfully;
(4) After the other data maintenance nodes that hold the target data receive the data update request message, they parse it and perform the target data update operation it specifies: for a delete operation, the node directly deletes the corresponding data in the target data and increases the version number of the data; for a rollback operation, it directly rolls back the most recently performed update of the target data and decreases the version number of the data; for an add or modify operation, the data maintenance node first obtains the newly added data content from the temporary master node, then updates the target data and increases the version number of the data;
(5) After the other data maintenance nodes that hold the target data have updated the target data successfully, each sends to the global lock management node an update-success acknowledgement containing the version number after the successful update;
(6) The data maintenance node information management module in the global lock management node determines whether it has received update-success acknowledgements from all data maintenance nodes that hold the target data, other than the temporary master node, and performs the corresponding follow-up processing accordingly;
(7) In the distributed cluster system, after the data maintenance node information management module completes all operations of one data update, it adjusts the value of its message receive time window to adapt to changes in the network environment.
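The node-side handling in step (4), where delete, add and modify raise the version number and rollback lowers it, can be sketched in Python as below. The `DataCopy` class and its `history` list are hypothetical; the patent does not say how a node remembers the previous content for rollback, so keeping a history stack is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataCopy:
    """Hypothetical local copy of one data item on a data maintenance node."""
    content: Optional[str]
    version: int = 0
    history: list = field(default_factory=list)  # earlier contents, kept for rollback

    def apply(self, op: str, new_content: Optional[str] = None) -> int:
        """Perform one operation and return the resulting version number."""
        if op == "rollback":
            if not self.history:
                raise ValueError("nothing to roll back")
            self.content = self.history.pop()  # undo the most recent update
            self.version -= 1
        elif op == "delete":
            self.history.append(self.content)
            self.content = None
            self.version += 1
        elif op in ("add", "modify"):
            # In the described method the new content is fetched from the initiating node.
            self.history.append(self.content)
            self.content = new_content
            self.version += 1
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.version
```

The returned version number is what a node would include in its update-success acknowledgement in step (5).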
7. The working method according to claim 6, characterized in that, in step (6), when the data maintenance node information management module receives update-success acknowledgements from all other data maintenance nodes that hold the target data, the subsequent processing steps are as follows:
(61) The data maintenance node information management module sends an acknowledgement that the target data was updated successfully to the temporary master node. On receiving this acknowledgement, the temporary master node takes the head-of-queue message from the client data update request message queue, updates the target data on its own node according to that message and increases its version number, and sends an acknowledgement that its target data was updated successfully to both the data maintenance node information management module and the client;
(62) After the data maintenance node information management module receives the update-success acknowledgement sent by the temporary master node, it sends an update lock release request message for the target data to the update lock request queue of the target data, so that the queue releases the update lock of the target data for use by the next update lock request for that data.
8. The working method according to claim 7, characterized in that, in step (6), when the data maintenance node information management module does not receive update-success acknowledgements from all other data maintenance nodes that hold the target data, the subsequent processing steps are as follows:
(6A) The data maintenance node information management module resets the following two parameters, which govern the update-success acknowledgement messages sent by the data maintenance nodes that hold the target data: the size of the receive time window, and the maximum number of times a data update request is re-sent to a data maintenance node whose update failed. It then re-sends the update request message for the target data to every data maintenance node that has not sent an update-success acknowledgement;
(6B) When a data maintenance node receives this re-sent update request message, it first compares the version number of its target data with the data version number in the data update request message. If the two are identical, it sends an update-success acknowledgement directly to the data maintenance node information management module; otherwise, it updates the target data again and then sends an update-success acknowledgement to the data maintenance node information management module;
(6C) After the data maintenance node information management module receives an update failure message from a data maintenance node, it re-sends the data update request, up to the configured maximum number of retransmissions. If, within the configured time window, it receives update-success acknowledgements from all data maintenance nodes whose update had previously failed, it proceeds according to steps (61) and (62). Otherwise, it sends a target data rollback message to every data maintenance node that has already sent an update-success acknowledgement, so that each such node, on receiving the rollback message, undoes the update operation most recently performed on the target data and reports whether the rollback completed. The data maintenance node information management module then re-sends the rollback message, up to the configured maximum number of retransmissions, to any data maintenance node that has not sent a rollback-complete acknowledgement. Finally, it sends an acknowledgement that the target data update failed to the temporary master node;
(6D) After the temporary master node receives the acknowledgement that the update of the target data failed, it discards the head-of-queue message in the client data update request message queue and sends an update-failure acknowledgement for the target data to both the data maintenance node information management module and the client;
(6E) After the data maintenance node information management module receives the update-failure acknowledgement sent by the temporary master node, it sends an update lock release request message to the update lock request queue of the target data, so that the queue releases the update lock of the target data and hands it to the next update lock request message. After the client receives the update-failure acknowledgement, it either chooses another data maintenance node in the cluster that holds the target data and initiates the update request again, or abandons this data update request.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210535376.3A CN103036717B (en) 2012-12-12 2012-12-12 The consistency maintenance system and method for distributed data

Publications (2)

Publication Number Publication Date
CN103036717A CN103036717A (en) 2013-04-10
CN103036717B true CN103036717B (en) 2015-11-04

Family

ID=48023230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210535376.3A Expired - Fee Related CN103036717B (en) 2012-12-12 2012-12-12 The consistency maintenance system and method for distributed data

Country Status (1)

Country Link
CN (1) CN103036717B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1719831A (en) * 2005-07-15 2006-01-11 清华大学 High-available distributed boundary gateway protocol system based on cluster router structure
CN102117287A (en) * 2009-12-30 2011-07-06 成都市华为赛门铁克科技有限公司 Distributed file system access method, a metadata server and client side

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7870226B2 (en) * 2006-03-24 2011-01-11 International Business Machines Corporation Method and system for an update synchronization of a domain information file

Also Published As

Publication number Publication date
CN103036717A (en) 2013-04-10

Similar Documents

Publication Publication Date Title
CN103036717B (en) The consistency maintenance system and method for distributed data
CN103297268B (en) Distributed data consistency maintenance system and method based on P2P technology
US9715522B2 (en) Information processing apparatus and control method
WO2018103318A1 (en) Distributed transaction handling method and system
US8185493B2 (en) Solution method of in-doubt state in two-phase commit protocol of distributed transaction
US7330860B2 (en) Fault tolerant mechanism to handle initial load of replicated object in live system
CN108710638B (en) Distributed concurrency control method and system based on mixed RDMA operation
US8639890B2 (en) Data segment version numbers in distributed shared memory
CN110535680B (en) Byzantine fault-tolerant method
CN101770513B (en) Method and system for validation and correction in a distributed namespace
KR102038527B1 (en) Distributed cluster management system and method thereof
WO2016070375A1 (en) Distributed storage replication system and method
US20120011100A1 (en) Snapshot acquisition processing technique
US11841781B2 (en) Methods and systems for a non-disruptive planned failover from a primary copy of data at a primary storage system to a mirror copy of the data at a cross-site secondary storage system
KR101296778B1 (en) Method of eventual transaction processing on nosql database
JP7236498B2 (en) File resource processing method, device, equipment, medium, and program
US9703634B2 (en) Data recovery for a compute node in a heterogeneous database system
CN104850416A (en) Upgrading system, method and device and cloud computing node
CN105677380A (en) Method and device for plate-to-plate upgrading with dual master controls isolated
JP4801196B2 (en) Method and apparatus for two-phase commit in data distribution to a web farm
CN103428288B (en) Replica synchronization method based on partition state table and coordinator node
JP4870190B2 (en) Data processing method, computer, and data processing program
CN113326272A (en) Distributed transaction processing method, device and system
EP3039568A1 (en) Distributed disaster recovery file sync server system
CN107436904B (en) Data acquisition method, data acquisition device, and computer-readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
CB03 Change of inventor or designer information

Inventor after: Zhao Yao, Song Yingying, Yang Fangchun, Zou Hua, Niu Kun, Zhang Wentao, Wan Neng, Peng Shukai, Zou Zhiyong

Inventor before: Zhao Yao, Zou Zhiyong, Song Yingying, Peng Shukai, Yang Fangchun, Zou Hua

COR Change of bibliographic data
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151104

Termination date: 20211212