CN112732493B - Method and device for newly adding node, node of distributed system and storage medium - Google Patents

Method and device for newly adding node, node of distributed system and storage medium

Info

Publication number
CN112732493B
CN112732493B CN202110337111.1A CN202110337111A CN112732493B CN 112732493 B CN112732493 B CN 112732493B CN 202110337111 A CN202110337111 A CN 202110337111A CN 112732493 B CN112732493 B CN 112732493B
Authority
CN
China
Prior art keywords
node
state
data
data synchronization
distributed system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110337111.1A
Other languages
Chinese (zh)
Other versions
CN112732493A (en)
Inventor
胡细笔
柳正龙
谢磊
朱金奇
陈静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hundsun Technologies Inc
Original Assignee
Hundsun Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hundsun Technologies Inc
Priority to CN202110337111.1A
Publication of CN112732493A
Application granted
Publication of CN112732493B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2336 Pessimistic concurrency control approaches, e.g. locking or multiple versions without time stamps
    • G06F16/2343 Locking methods, e.g. distributed locking or locking implementation details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application provides a method and a device for adding a node, a node of a distributed system and a storage medium. The method is applied to a first node and mainly comprises the following steps: receiving a data synchronization request sent by a second node; continuously synchronizing the first node's own data to the second node, and monitoring the data synchronization progress in real time; when the data synchronization progress is monitored to reach the current tolerance threshold of the first node, switching the working state from the stand-alone state to the host preparation state, wherein the first node in the host preparation state stops transaction updating until the second node confirms that the data synchronization is completed; sending a join notification to the second node to trigger the second node to join the distributed system; and switching the working state from the host preparation state to the host state after the second node successfully joins the distributed system. Thus, with the added host preparation state, once the data synchronization progress reaches the tolerance threshold the first node waits for the newly added node to finish synchronizing, so that the standby node can be added to the cluster quickly.

Description

Method and device for newly adding node, node of distributed system and storage medium
Technical Field
The present application relates to the field of data synchronization technologies, and in particular, to a method and an apparatus for adding a node, a node of a distributed system, and a storage medium.
Background
In a distributed system, the basic principle of high reliability of the system is "clustering", that is, the high reliability of the system is ensured by setting a redundant node. When the host node is unavailable, the service can be continuously provided through the redundant standby node.
Therefore, to achieve high availability of the system, a standby node needs to be added to the system promptly after the host node is determined, and the data of the host node and the standby node must be kept consistent; that is, the standby node needs to synchronize data with the host node. In the prior art, when a standby node needs to be added to a system, the standby node continuously synchronizes data from the host node until its data is completely consistent with the data of the host node, whereupon the host node notifies the standby node to join the system cluster.
However, since the host node may be processing update transactions for a long time and data synchronization involves a delay, a standby node added in the existing manner may be unable to fully synchronize its data with the host node for a long time, so it cannot join the system quickly and the system cannot quickly become highly available. If the host node is updating transactions all the time, the standby node may never be able to join the system.
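As a rough illustration of this problem, the following Python sketch models the standby node's unsynchronized backlog over time. The per-second tick model and the names `backlog_after`, `sync_rate` and `update_rate` are assumptions made for illustration only; the application does not specify such a model.

```python
def backlog_after(initial_backlog, sync_rate, update_rate, seconds):
    """Return the unsynchronized backlog after each one-second tick.

    Hypothetical model: every second the host adds `update_rate` units of
    new data and the standby copies `sync_rate` units. Not from the patent.
    """
    backlog = initial_backlog
    history = []
    for _ in range(seconds):
        # The backlog cannot drop below zero (zero means fully synchronized).
        backlog = max(0, backlog + update_rate - sync_rate)
        history.append(backlog)
    return history


# Host updates as fast as the standby copies: the backlog never shrinks.
stalled = backlog_after(100, sync_rate=50, update_rate=50, seconds=3)

# Standby copies slightly faster: the backlog shrinks, but only slowly.
slow = backlog_after(100, sync_rate=60, update_rate=50, seconds=3)
```

Under this simplified model, a scheme that waits for an exact match while the host keeps updating can take a long time, or never finish, which is the situation the application sets out to avoid.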
Disclosure of Invention
Based on the defects of the prior art, the application provides a method and a device for adding a node, a node of a distributed system and a storage medium, so as to solve the problem that the added node cannot be added into the system in time in the prior art.
In order to achieve the above object, the present application provides the following technical solutions:
a first aspect of the present application provides a method for adding a node, which is applied to a first node, and the method includes:
receiving a data synchronization request sent by a second node; wherein the first node is a first started node; the state of the first node is a single-machine state; the single machine state is used for representing that only one node exists in the distributed system;
continuously synchronizing the first node's own data to the second node, and monitoring the data synchronization progress in real time;
when the data synchronization progress is monitored to reach the current tolerance threshold of the first node, switching the working state from a single machine state to a host preparation state; the first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed;
when it is monitored that the data is completely synchronized, sending a join notification to the second node to trigger the second node to join the distributed system; wherein, after joining the distributed system, the second node switches from a standby preparation state to a standby state;
after the second node joins the distributed system, switching the working state from the host preparation state to a host state; wherein the host state is used to characterize the first node as a host node in the distributed system.
Optionally, in the above method for adding a node, after switching the working state from the stand-alone state to the host preparation state, the method further includes:
judging whether the data is completely synchronized within a first preset time length; if the data is not monitored to be completely synchronized within the first preset time length, reverting to the stand-alone state; and if the data is monitored to be completely synchronized within the first preset time length, performing the sending of the join notification to the second node.
Optionally, in the above method for adding a node, before switching the working state from the stand-alone state to the host preparation state when it is monitored that the data synchronization progress reaches the current tolerance threshold of the first node, the method further includes:
periodically acquiring the data synchronization speed of the second node and the transaction update speed of the first node;
and taking, as the current tolerance threshold of the first node, the product of the first preset time length and the difference between the data synchronization speed and the transaction update speed.
Optionally, in the above method for adding a node, after switching the working state from the host preparation state to the host state, the method further includes:
when a data synchronization request sent by a third node is received, continuously synchronizing the first node's own data to the third node; wherein the third node refers to the most recently started node;
when it is monitored that the data is completely synchronized with the third node, sending a join notification to the third node to trigger the third node to join the distributed system; wherein, after joining the distributed system, the third node switches from a standby preparation state to a standby state.
A second aspect of the present application provides an apparatus for adding a node, which is applied to a first node, and the apparatus includes:
a receiving unit, configured to receive a data synchronization request sent by a second node; wherein the first node is a first started node; the state of the first node is a single-machine state; the single machine state is used for representing that only one node exists in the distributed system;
the first data synchronization unit is used for continuously synchronizing the first node's own data to the second node;
the monitoring unit is used for monitoring the data synchronization progress in real time;
the first switching unit is used for switching the working state from a single machine state to a host preparation state when the monitoring unit monitors that the data synchronization progress reaches the current tolerance threshold of the first node; the first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed;
the first notification unit is used for sending a join notification to the second node when the monitoring unit monitors that the data are completely synchronized, so as to trigger the second node to join the distributed system; after the second node joins the distributed system, the second node enters a standby state from a standby preparation state;
the second switching unit is used for switching the working state from the host preparation state to the host state after the second node is added into the distributed system; wherein the host state is used to characterize the first node as a host node in the distributed system.
Optionally, the above apparatus for adding a node further includes:
the judging unit is used for judging whether the data is completely synchronized within a first preset time length; if the judging unit judges that the data is completely synchronized within the first preset time length, the first notification unit performs the sending of the join notification to the second node;
and the recovery unit is used for reverting to the stand-alone state when the judging unit judges that the data is not completely synchronized within the first preset time length.
Optionally, the above apparatus for adding a node further includes:
the acquisition unit is used for periodically acquiring the data synchronization speed of the second node and the transaction update speed of the first node;
and the calculating unit is used for taking, as the current tolerance threshold of the first node, the product of the first preset time length and the difference between the data synchronization speed and the transaction update speed.
Optionally, the above apparatus for adding a node further includes:
the second data synchronization unit is used for continuously synchronizing the first node's own data to a third node when receiving a data synchronization request sent by the third node; wherein the third node refers to the most recently started node;
a second notification unit, configured to send a join notification to the third node when it is monitored that the data is completely synchronized with the third node, so as to trigger the third node to join the distributed system; wherein, after joining the distributed system, the third node switches from a standby preparation state to a standby state.
A third aspect of the application provides a node of a distributed system, comprising a processor and a memory; wherein:
the memory is to store computer instructions;
the processor is configured to execute the computer instructions stored in the memory, and in particular, to perform any one of the above methods for adding a node.
A fourth aspect of the present application provides a storage medium storing a program for implementing the method for adding a node as described in any one of the above when the program is executed.
The method for adding a node is applied to the first node, where the first node is the first node to be started and is in a stand-alone state, and the stand-alone state indicates that only one node exists in the distributed system. Specifically, after receiving a data synchronization request sent by a second node, the first node continuously synchronizes its own data to the second node and monitors the data synchronization progress in real time. When the data synchronization progress is monitored to reach the current tolerance threshold of the first node, the first node switches its working state from the stand-alone state to a host preparation state and stops transaction updating until the second node confirms that data synchronization is completed. This effectively ensures that the second node can quickly achieve complete data synchronization with the first node without greatly affecting system services. Finally, when the data is monitored to be completely synchronized, the first node sends a join notification to the second node to trigger the second node to join the distributed system, and switches its working state from the host preparation state to the host state after the second node joins the distributed system. Thus, when the data synchronization progress of the newly added node reaches the tolerance threshold, the first node switches to the newly introduced host preparation state, waits for the newly added node to finish synchronizing, and no longer keeps processing external requests while it waits, so that the newly added node can quickly achieve complete data synchronization with the host node and join the system cluster, and the high availability of the system is quickly realized.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flowchart of a method for adding a node according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a tolerance threshold calculation method according to another embodiment of the present disclosure;
FIG. 3 is a flowchart of another method for adding a node according to another embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a node of a distributed system according to another embodiment of the present application;
FIG. 5 is a schematic structural diagram of a node of a distributed system according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In this application, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiment of the application provides a method for adding a node, which is applied to a first node. As shown in fig. 1, a method for adding a node provided in the embodiment of the present application specifically includes the following steps:
S101, receiving a data synchronization request sent by a second node.
The first node is the first node to be started, and the state of the first node is a stand-alone state. The stand-alone state indicates that only one node exists in the distributed system.
Specifically, the first node started in the distributed system is taken as the first node. At this time there is only one node in the distributed system, so the state of the first started node is the stand-alone state. Since the first node, as the first started node, will generally act as the host, each subsequently started node will act as a standby, and thus the state of a subsequently started node is the standby preparation state. The second node is the next node to be started after the first node is started.
A subsequently started node in the standby preparation state applies to the node in the stand-alone state for data synchronization; that is, the second node sends a data synchronization request to the first node.
S102, continuously synchronizing the first node's own data to the second node, and monitoring the data synchronization progress in real time.
It should be noted that, while the second node synchronizes data from the first node, the first node still updates transactions normally so that the service continues to be provided, and the data synchronization progress is monitored in real time.
S103, judging in real time whether the monitored data synchronization progress reaches the current tolerance threshold of the first node.
It should be noted that the data synchronized by the second node lags behind the data updated by the first node by a certain delay. To prevent the second node from being unable to achieve complete data synchronization with the first node for a long time, the embodiment of the present application uses the tolerance threshold as the boundary at which the first node stops updating data, so as to ensure that the second node can achieve complete data synchronization with the first node in time.
Optionally, the tolerance threshold may be expressed as the synchronized data amount or the proportion of data synchronized, or as the unsynchronized data amount or the proportion of data not yet synchronized. The tolerance threshold should be neither too large nor too small. Take the case where the tolerance threshold is an unsynchronized data amount: if it is too large, the data synchronization progress reaches it quickly, which may cause the first node to wait too long and affect service provision. If it is too small, the data synchronization progress may take a long time to reach it, because the data of the first node may be continuously updated. Indeed, if the tolerance threshold is smaller than the amount of data the first node currently updates per unit time, the data synchronization progress may never reach it, so that the second node cannot be completely synchronized with the first node in time. Similarly, if the tolerance threshold is a synchronized data amount or the proportion of data synchronized, it should likewise be set neither too large nor too small.
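The four ways of expressing the threshold can be compared with a minimal Python sketch. The function name `reached_threshold`, the mode strings and the direction of each comparison are assumptions for illustration; the application only names the four quantities.

```python
def reached_threshold(synced, total, threshold, mode):
    """Return True once the synchronization progress has reached the threshold.

    `synced` and `total` are data amounts on the first node; `mode` selects
    which of the four quantities the threshold is expressed in. All names
    and comparison directions here are illustrative assumptions.
    """
    unsynced = total - synced
    if mode == "unsynced_amount":
        return unsynced <= threshold
    if mode == "synced_amount":
        return synced >= threshold
    if mode == "unsynced_ratio":
        # An empty data set is treated as already synchronized.
        return total == 0 or unsynced / total <= threshold
    if mode == "synced_ratio":
        return total == 0 or synced / total >= threshold
    raise ValueError(f"unknown mode: {mode}")


# 100 units still unsynchronized, threshold of 100 units: reached.
print(reached_threshold(900, 1000, 100, "unsynced_amount"))
# Half the data synchronized, threshold of one half: reached.
print(reached_threshold(500, 1000, 0.5, "synced_ratio"))
```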
Optionally, in another embodiment of the present application, a method for calculating a current tolerance threshold of the first node is provided, and is specifically executed before step S103. As shown in fig. 2, the method specifically includes the following steps:
S201, periodically acquiring the data synchronization speed of the second node and the transaction update speed of the first node.
Specifically, after the data synchronization starts, the data synchronization speed of the second node and the transaction update speed of the first node are collected periodically. It should be noted that, in order to ensure that the first node and the second node can achieve complete synchronization of data, the data synchronization speed is usually set to be greater than the transaction update speed.
S202, taking, as the current tolerance threshold of the first node, the product of the first preset time length and the difference between the data synchronization speed and the transaction update speed.
Because the two speeds are collected periodically, the tolerance threshold of the first node is recalculated after each collection and updated to the latest calculated value; that is, the tolerance threshold in step S103 refers to the most recently calculated tolerance threshold.
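The calculation in steps S201 and S202 can be sketched in Python as follows. The function names, the sample list and the units (data units per second, seconds) are assumptions for illustration.

```python
def tolerance_threshold(sync_speed, update_speed, first_preset_duration):
    """S202: (data synchronization speed - transaction update speed)
    multiplied by the first preset time length."""
    return (sync_speed - update_speed) * first_preset_duration


def latest_threshold(samples, first_preset_duration):
    """S201 collects (sync_speed, update_speed) pairs periodically; the
    threshold used in S103 is the one computed from the most recent pair.
    `samples` is a hypothetical list of such pairs, oldest first."""
    sync_speed, update_speed = samples[-1]
    return tolerance_threshold(sync_speed, update_speed, first_preset_duration)


# Example: 120 units/s synchronized, 100 units/s updated, 5 s preset length.
threshold = tolerance_threshold(120, 100, 5)  # (120 - 100) * 5 = 100 units
```

Note that the result is only positive when the data synchronization speed exceeds the transaction update speed, which matches the remark that the synchronization speed is usually set to be the greater of the two.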
It should be noted that, when it is monitored that the data synchronization progress reaches the current tolerance threshold of the first node, step S104 is executed.
S104, switching the working state from the stand-alone state to the host preparation state, wherein the first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed.
It should be noted that, after the first node enters the host preparation state from the stand-alone state, the first node and the second node continue to perform data synchronization. Since the first node stops transaction updating at this time, that is, the data of the first node is no longer updated, the second node can quickly and completely synchronize its data with the first node. When the second node confirms that the data synchronization is completed, the first node resumes normal transaction processing.
S105, when it is monitored that the data is completely synchronized, sending a join notification to the second node to trigger the second node to join the distributed system.
Specifically, when it is monitored that the data is completely synchronized with the second node, the second node may be added to the distributed system, so the join notification is sent to the second node. After receiving the join notification, the second node joins the distributed system and switches its working state from the standby preparation state to the standby state, so that the second node can be determined as the new host when the host node is unavailable.
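From the second node's side, the behaviour described so far can be sketched in Python as below. The class name `SecondNode`, the dictionary message format and the `in_cluster` flag are hypothetical; the application does not define a message format.

```python
class SecondNode:
    """Illustrative standby-side sketch; names and messages are assumptions."""

    def __init__(self, node_id):
        self.node_id = node_id
        # A node started after the first node begins in the
        # standby preparation state.
        self.state = "standby preparation"
        self.in_cluster = False

    def build_sync_request(self):
        """Build the data synchronization request sent to the first node."""
        return {"type": "sync_request", "from": self.node_id}

    def on_join_notification(self):
        """Join the distributed system and become a standby."""
        if self.state == "standby preparation":
            self.in_cluster = True
            self.state = "standby"


node = SecondNode("node-2")
request = node.build_sync_request()
node.on_join_notification()
```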
S106, after the second node joins the distributed system, switching the working state from the host preparation state to the host state.
It should be noted that, after the second node joins the distributed system, the distributed system includes two nodes, namely the first node and the second node, that is, there is a standby node, so that the first node can enter the host state. Wherein the host state is used to characterize the first node as a host node in the distributed system.
The method for adding a node is applied to the first node, where the first node is the first node to be started and is in a stand-alone state, and the stand-alone state indicates that only one node exists in the distributed system. Specifically, after receiving a data synchronization request sent by a second node, the first node continuously synchronizes its own data to the second node and monitors the data synchronization progress in real time. When the data synchronization progress is monitored to reach the current tolerance threshold of the first node, the first node switches its working state from the stand-alone state to a host preparation state and stops transaction updating until the second node confirms that data synchronization is completed. This effectively ensures that the second node can quickly achieve complete data synchronization with the first node without greatly affecting system services. Finally, when the data is monitored to be completely synchronized, the first node sends a join notification to the second node to trigger the second node to join the distributed system, and switches its own working state from the host preparation state to the host state after the second node has successfully joined. Thus, when the data synchronization progress of the newly added node reaches the tolerance threshold, the first node switches to the newly introduced host preparation state, waits for the newly added node to finish synchronizing, and no longer keeps processing external requests while it waits, so that the newly added node can quickly achieve complete data synchronization with the host node and join the system cluster, and the high availability of the system is quickly realized.
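The state transitions of the first node in steps S101 to S106 can be sketched as a small Python state machine. This is a minimal sketch under assumptions: the class and method names are hypothetical, progress is modelled as an unsynchronized data amount, and "reaching" the threshold is taken to mean that amount falling to or below it.

```python
from enum import Enum


class State(Enum):
    STAND_ALONE = "stand-alone"
    HOST_PREPARATION = "host preparation"
    HOST = "host"


class FirstNode:
    """Illustrative sketch of the first node; not the patented implementation."""

    def __init__(self, tolerance_threshold):
        self.state = State.STAND_ALONE
        self.tolerance_threshold = tolerance_threshold
        self.accepting_updates = True

    def on_progress(self, unsynced):
        """S103/S104: once the progress reaches the threshold, enter the
        host preparation state and stop transaction updating."""
        if self.state is State.STAND_ALONE and unsynced <= self.tolerance_threshold:
            self.state = State.HOST_PREPARATION
            self.accepting_updates = False

    def on_sync_confirmed(self):
        """S105: the second node confirmed complete synchronization.
        Resume transaction processing; return True if a join notification
        should now be sent to the second node."""
        if self.state is not State.HOST_PREPARATION:
            return False
        self.accepting_updates = True
        return True

    def on_second_node_joined(self):
        """S106: with a standby present, the first node becomes the host."""
        if self.state is State.HOST_PREPARATION:
            self.state = State.HOST


node = FirstNode(tolerance_threshold=100)
node.on_progress(unsynced=500)   # still above the threshold: no change
node.on_progress(unsynced=80)    # threshold reached: updates stop
send_join = node.on_sync_confirmed()
node.on_second_node_joined()
```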
Another embodiment of the present application provides another method for adding a node, where the method is applied to a first node. As shown in fig. 3, the method provided in the embodiment of the present application specifically includes the following steps:
S301, receiving a data synchronization request sent by the second node.
The first node is a first started node, the state of the first node is a single-machine state, and the single-machine state is used for representing that only one node exists in the distributed system.
It should be noted that, in the specific implementation of step S301, reference may be made to step S101 in the foregoing method embodiment, and details are not described here again.
S302, continuously synchronizing the first node's own data to the second node, and monitoring the data synchronization progress in real time.
It should be noted that, in the specific implementation of step S302, reference may be made to step S102 in the foregoing method embodiment, and details are not described here again.
S303, judging in real time whether the monitored data synchronization progress reaches the current tolerance threshold of the first node.
It should be noted that, in the specific implementation of step S303, reference may be made to step S103 in the foregoing method embodiment, which is not described herein again.
When it is monitored that the data synchronization progress reaches the current tolerance threshold of the first node, step S304 is executed.
S304, the working state is switched from the single machine state to the host preparation state, and the first node in the host preparation state stops transaction updating until the second node confirms that the data synchronization is completed.
It should be noted that, the specific implementation of step S304 may refer to step S104 in the foregoing method embodiment accordingly, and details are not repeated here.
S305, judging whether the data are completely synchronized within a first preset time span.
To prevent the first node from waiting for a long time, or indefinitely, because data synchronization is so slow that the second node takes too long to become completely synchronized, or because the second node fails to complete synchronization at all, this embodiment of the present application requires the second node to achieve complete data synchronization within a first preset time length. If the data is completely synchronized within the first preset time length, step S306 is executed. If the data is not completely synchronized within the first preset time length, step S307 is executed.
Optionally, the first preset time length may be set to the device disaster recovery switching time, so that the disaster recovery switching of the device can be effectively ensured.
S306, sending a joining notification to the second node to trigger the second node to join the distributed system.
After joining the distributed system, the second node switches from the standby preparation state to the standby state.
Specifically, the specific implementation of step S306 may refer to step S105 in the above method embodiment accordingly, and is not described herein again.
S307, reverting to the stand-alone state.
Specifically, when the data is not completely synchronized within the first preset time length, the first node reverts to the stand-alone state, so that it can process data normally and is no longer in the waiting state. In addition, the first node may again receive a data synchronization request from a new node and perform data synchronization; that is, execution may return to step S301.
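The timeout decision in steps S305 to S307 can be sketched in Python as follows. The function name, the polling approach, the `is_fully_synced` callback and the returned strings are assumptions for illustration; the application does not say how the first preset time length is enforced.

```python
import time


def wait_for_full_sync(is_fully_synced, first_preset_duration,
                       poll_interval=0.01):
    """S305: poll until the data is completely synchronized or the first
    preset time length (in seconds) elapses.

    `is_fully_synced` is a hypothetical zero-argument callable supplied by
    the caller. Returns the name of the next action to take.
    """
    deadline = time.monotonic() + first_preset_duration
    while time.monotonic() < deadline:
        if is_fully_synced():
            return "send_join_notification"   # proceed to S306
        time.sleep(poll_interval)
    return "revert_to_stand_alone"            # proceed to S307


# A second node that is already synchronized triggers the join path.
action = wait_for_full_sync(lambda: True, first_preset_duration=0.05)
```

Using a monotonic clock keeps the deadline unaffected by adjustments to the system clock, which is a design choice of this sketch rather than something stated in the application.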
S308, after the second node joins the distributed system, switching the working state from the host preparation state to the host state.
Wherein the host state is used to characterize the first node as a host node in the distributed system.
It can be seen that after step S306 is executed and the second node successfully joins the distributed system, step S308 is executed. It should be noted that, the specific implementation of step S308 may refer to step S106 in the foregoing method embodiment accordingly, and details are not described here again.
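The working-state changes of the first node described in steps S304, S307 and S308 can be summarized as a small transition table. The sketch below is illustrative only; the event names and the helper `accepts_transaction_updates` are assumptions introduced here, not terms from the original text.

```python
from enum import Enum, auto


class FirstNodeState(Enum):
    STAND_ALONE = auto()
    HOST_PREPARATION = auto()
    HOST = auto()


class Event(Enum):
    THRESHOLD_REACHED = auto()    # sync progress reached the tolerance threshold (S304)
    SYNC_TIMED_OUT = auto()       # not fully synced within the first preset time (S307)
    SECOND_NODE_JOINED = auto()   # second node joined the distributed system (S308)


# Only the transitions named in the description are allowed.
TRANSITIONS = {
    (FirstNodeState.STAND_ALONE, Event.THRESHOLD_REACHED): FirstNodeState.HOST_PREPARATION,
    (FirstNodeState.HOST_PREPARATION, Event.SYNC_TIMED_OUT): FirstNodeState.STAND_ALONE,
    (FirstNodeState.HOST_PREPARATION, Event.SECOND_NODE_JOINED): FirstNodeState.HOST,
}


def next_state(state, event):
    """Return the state that follows `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event.name} is not valid in state {state.name}") from None


def accepts_transaction_updates(state):
    """Transaction updating is stopped only in the host preparation state."""
    return state is not FirstNodeState.HOST_PREPARATION
```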
S309, when a data synchronization request sent by a third node is received, continuously synchronizing the first node's own data to the third node, wherein the third node refers to the node that has most recently been started.
Since there are normally a plurality of standby nodes, further nodes may be added after the second node has been added.
Similarly, the initial state of the newly started third node is the standby preparation state, and the third node sends a data synchronization request to the first node, which is in the host state, for data synchronization.
S310, when it is monitored that the data is completely synchronized with the third node, sending a join notification to the third node to trigger the third node to join the distributed system.
After the third node joins the distributed system, the third node switches from the standby preparation state to the standby state.
It should be noted that, because the second node already exists as a standby node, the distributed system already has a certain degree of high availability. In order not to affect the normal provision of service to the application, in this embodiment of the present application, when further nodes are added subsequently, the first node does not stop transaction updating and wait once the data synchronization progress enters the tolerance threshold. Since the first node is not updating data at every moment, a subsequently added node can still catch up and join the distributed system; this may take relatively long, but the normal processing of transactions is not affected. Of course, this is only one optional approach: for a subsequently added node, the first node may also stop and wait for the data to be completely synchronized after detecting that the data synchronization progress of the third node has entered the tolerance threshold.
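The two options for later nodes can be contrasted in a short sketch. It assumes, purely for illustration, that the tolerance threshold is expressed as an amount of unsynchronized data; the class `HostNode`, the flag `pause_for_later_nodes` and the method name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostNode:
    # Hypothetical switch: False is the default behaviour described above
    # (never pause for later nodes); True selects the optional variant.
    pause_for_later_nodes: bool = False
    accepting_updates: bool = True

    def on_sync_progress(self, unsynced_bytes, tolerance_threshold_bytes):
        """Handle a progress report for a later node (S309/S310).

        Returns True when a join notification should be sent.
        """
        if unsynced_bytes == 0:
            self.accepting_updates = True   # fully synchronized: resume if paused
            return True
        if self.pause_for_later_nodes and unsynced_bytes <= tolerance_threshold_bytes:
            self.accepting_updates = False  # optional variant: stop updates and wait
        return False
```

With the default setting the host keeps serving transactions the whole time, so joining may take longer; with the optional setting the join completes sooner at the cost of a short pause.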
Another embodiment of the present application provides an apparatus for adding a node, which is applied to a first node. As shown in fig. 4, an apparatus for adding a node provided in the embodiment of the present application includes the following units:
A receiving unit 401, configured to receive a data synchronization request sent by a second node.
The first node is the first node to be started. The state of the first node is the stand-alone state. The stand-alone state is used to characterize that only one node exists in the distributed system.
A first data synchronization unit 402, configured to continuously synchronize the first node's own data to the second node.
A monitoring unit 403, configured to monitor the data synchronization progress in real time.
A first switching unit 404, configured to switch the working state from the stand-alone state to the host preparation state when the monitoring unit monitors that the data synchronization progress reaches the current tolerance threshold of the first node.
The first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed.
A first notification unit 405, configured to send a join notification to the second node when the monitoring unit monitors that the data is completely synchronized, so as to trigger the second node to join the distributed system.
After the second node joins the distributed system, the second node switches from the standby preparation state to the standby state.
A second switching unit 406, configured to switch the working state from the host preparation state to the host state after the second node joins the distributed system.
The host state is used to characterize the first node as a host node in the distributed system.
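Claim 1 states that the tolerance threshold may be a set amount of synchronized data, a proportion of synchronized data, an amount of unsynchronized data, or a proportion of unsynchronized data. The check performed by the monitoring unit and the first switching unit could look roughly like the sketch below. The function name, the enum, and the treatment of an empty data set are assumptions made for illustration.

```python
from enum import Enum


class ThresholdKind(Enum):
    SYNCED_AMOUNT = "amount of synchronized data"
    SYNCED_RATIO = "proportion of synchronized data"
    UNSYNCED_AMOUNT = "amount of unsynchronized data"
    UNSYNCED_RATIO = "proportion of unsynchronized data"


def threshold_reached(kind, threshold, synced, total):
    """Return True when the synchronization progress has reached the threshold.

    `synced` and `total` are data volumes in the same unit; ratio
    thresholds are fractions between 0 and 1.
    """
    if total == 0:
        return True  # assumption: nothing to synchronize counts as reached
    unsynced = total - synced
    if kind is ThresholdKind.SYNCED_AMOUNT:
        return synced >= threshold
    if kind is ThresholdKind.SYNCED_RATIO:
        return synced / total >= threshold
    if kind is ThresholdKind.UNSYNCED_AMOUNT:
        return unsynced <= threshold
    if kind is ThresholdKind.UNSYNCED_RATIO:
        return unsynced / total <= threshold
    raise ValueError(f"unknown threshold kind: {kind!r}")
```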
Optionally, in an apparatus for adding a node provided in another embodiment of the present application, the apparatus may further include:
An acquisition unit, configured to periodically acquire the data synchronization speed of the second node and the transaction update speed of the first node.
A calculating unit, configured to take, as the current tolerance threshold of the first node, the product of the first preset time length and the difference between the data synchronization speed and the transaction update speed.
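The calculation performed by the calculating unit is a single product: threshold = (data synchronization speed − transaction update speed) × first preset time length. A minimal sketch follows; the units (MB/s and seconds) and the clamping of a negative result to zero are assumptions of this example, not statements from the original text.

```python
def current_tolerance_threshold(sync_speed, update_speed, first_preset_seconds):
    """Tolerance threshold = (sync speed - update speed) * first preset time.

    Speeds are data volumes per second (for example MB/s). When the
    transaction update speed is not lower than the synchronization speed the
    raw product is zero or negative; this sketch clamps it to 0.0.
    """
    if first_preset_seconds < 0:
        raise ValueError("first_preset_seconds must be non-negative")
    return max(0.0, (sync_speed - update_speed) * first_preset_seconds)


# Worked example: 120 MB/s synchronization, 20 MB/s updates, 3 s limit.
example = current_tolerance_threshold(120.0, 20.0, 3.0)  # (120 - 20) * 3 = 300.0
```

Because both speeds are sampled periodically, the threshold is recomputed over time, which is why the text calls it the current tolerance threshold.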
Optionally, in an apparatus for adding a new node provided in another embodiment of the present application, the apparatus further includes:
A second data synchronization unit, configured to continuously synchronize the first node's own data to a third node when a data synchronization request sent by the third node is received.
The third node refers to the node that has most recently been started.
A second notification unit, configured to send a join notification to the third node when it is monitored that the data is completely synchronized with the third node, so as to trigger the third node to join the distributed system.
After the third node joins the distributed system, the third node switches from the standby preparation state to the standby state.
Optionally, in an apparatus for adding a node provided in another embodiment of the present application, the apparatus may further include:
A judging unit, configured to judge whether the data is completely synchronized within a first preset time length.
If the judging unit judges that the data is completely synchronized within the first preset time length, the first notification unit sends the join notification to the second node.
A recovery unit, configured to restore the first node to the stand-alone state when the judging unit judges that the data is not completely synchronized within the first preset time length.
It should be noted that, for the specific working processes of each unit provided in the foregoing embodiments of the present application, reference may be made to the implementation of the corresponding steps in the foregoing method embodiments, and details are not described here again. The units described in the above embodiments may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
Another embodiment of the present application provides a node of a distributed system, as shown in fig. 5, including: a processor 501 and a memory 502.
Wherein: the memory 502 is used for storing computer instructions, and the processor 501 is used for executing the computer instructions stored in the memory 502, and specifically executing the method for adding a node provided by any of the above embodiments.
Another embodiment of the present application provides a computer storage medium for storing a program, which when executed, is used to implement the method for adding a node as provided in any one of the above embodiments.
Computer storage media, including permanent and non-permanent, removable and non-removable media, may implement the information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for adding a node, applied to a first node, the method comprising:
receiving a data synchronization request sent by a second node; wherein the first node is a first started node; the state of the first node is a single-machine state; the single machine state is used for representing that only one node exists in the distributed system;
continuously synchronizing the first node's own data to the second node, and monitoring the data synchronization progress in real time;
when it is monitored that the data synchronization progress reaches the current tolerance threshold of the first node, switching the working state of the first node from the stand-alone state to a host preparation state; the first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed; the tolerance threshold is a set amount of synchronized data, or a proportion of synchronized data, or an amount of unsynchronized data, or a proportion of unsynchronized data;
when it is monitored that the data is completely synchronized, sending a join notification to the second node to trigger the second node to join the distributed system; after the second node joins the distributed system, the second node enters a standby state from a standby preparation state;
after the second node joins the distributed system, switching the working state of the first node from the host preparation state to a host state; wherein the host state is used to characterize the first node as a host node in the distributed system.
2. The method of claim 1, wherein after the switching the working state of the first node from the stand-alone state to the host preparation state, the method further comprises:
judging whether the data is completely synchronized within a first preset time length; if it is not monitored that the data is completely synchronized within the first preset time length, restoring to the stand-alone state; and if it is monitored that the data is completely synchronized within the first preset time length, executing the sending of the join notification to the second node.
3. The method of claim 2, wherein before the switching the working state of the first node from the stand-alone state to the host preparation state when it is monitored that the data synchronization progress reaches the current tolerance threshold of the first node, the method further comprises:
periodically acquiring the data synchronization speed of the second node and the transaction update speed of the first node;
and taking, as the current tolerance threshold of the first node, the product of the first preset time length and the difference between the data synchronization speed and the transaction update speed.
4. The method of claim 1, wherein after the switching the working state of the first node from the host preparation state to the host state, the method further comprises:
when a data synchronization request sent by a third node is received, continuously synchronizing the first node's own data to the third node; wherein the third node refers to the node that has most recently been started;
when the situation that the data are completely synchronized with the third node is monitored, sending a joining notification to the third node to trigger the third node to join the distributed system; and after the third node joins the distributed system, the third node enters a standby state from a standby preparation state.
5. An apparatus for adding a node, the apparatus being applied to a first node, the apparatus comprising:
a receiving unit, configured to receive a data synchronization request sent by a second node; wherein the first node is a first started node; the state of the first node is a single-machine state; the single machine state is used for representing that only one node exists in the distributed system;
a first data synchronization unit, configured to continuously synchronize the first node's own data to the second node;
the monitoring unit is used for monitoring the data synchronization progress in real time;
a first switching unit, configured to switch the working state of the first node from the stand-alone state to a host preparation state when the monitoring unit monitors that the data synchronization progress reaches the current tolerance threshold of the first node; the first node in the host preparation state stops transaction updating until the second node confirms that data synchronization is completed; the tolerance threshold is a set amount of synchronized data, or a proportion of synchronized data, or an amount of unsynchronized data, or a proportion of unsynchronized data;
the first notification unit is used for sending a join notification to the second node when the monitoring unit monitors that the data are completely synchronized, so as to trigger the second node to join the distributed system; after the second node joins the distributed system, the second node enters a standby state from a standby preparation state;
the second switching unit is used for switching the working state of the first node from the host preparation state to the host state after the second node joins the distributed system; wherein the host state is used to characterize the first node as a host node in the distributed system.
6. The apparatus of claim 5, further comprising:
the judging unit is used for judging whether the data are completely synchronized within a first preset time length; if the judging unit judges that the data are completely synchronized within a first preset time span, the first notification unit executes the sending of the join notification to the second node;
and the recovery unit is used for recovering to the single machine state when the judgment unit judges that the data is not completely synchronized within the first preset time span.
7. The apparatus of claim 6, further comprising:
the acquisition unit is used for acquiring the data synchronization speed of the second node and the transaction update speed of the first node at regular time;
and the calculating unit is used for taking the product of the difference value of the data synchronization speed and the transaction updating speed and the first preset time length as the current tolerance threshold of the first node.
8. The apparatus of claim 5, further comprising:
a second data synchronization unit, configured to continuously synchronize the first node's own data to a third node when a data synchronization request sent by the third node is received; wherein the third node refers to the node that has most recently been started;
a second notification unit, configured to send a join notification to the third node when it is monitored that data is completely synchronized with the third node, so as to trigger the third node to join the distributed system; and after the third node joins the distributed system, the third node enters a standby state from a standby preparation state.
9. A node of a distributed system, comprising a processor and a memory; wherein:
the memory is to store computer instructions;
the processor is configured to execute the computer instructions stored in the memory, and in particular, to perform the method of adding a node as claimed in any one of claims 1 to 4.
10. A computer storage medium storing a program which, when executed, implements a method of adding a node as claimed in any one of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110337111.1A CN112732493B (en) 2021-03-30 2021-03-30 Method and device for newly adding node, node of distributed system and storage medium

Publications (2)

Publication Number Publication Date
CN112732493A CN112732493A (en) 2021-04-30
CN112732493B true CN112732493B (en) 2021-06-18

Family

ID=75595978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110337111.1A Active CN112732493B (en) 2021-03-30 2021-03-30 Method and device for newly adding node, node of distributed system and storage medium

Country Status (1)

Country Link
CN (1) CN112732493B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281631A (en) * 2013-07-12 2015-01-14 中兴通讯股份有限公司 Distributed database system and data synchronization method and nodes thereof
WO2018210419A1 (en) * 2017-05-18 2018-11-22 Huawei Technologies Co., Ltd. System and method of synchronizing distributed multi-node code execution
CN110213317A (en) * 2018-07-18 2019-09-06 腾讯科技(深圳)有限公司 The method, apparatus and storage medium of message storage
CN110225133A (en) * 2019-06-20 2019-09-10 恒生电子股份有限公司 Message method, node, device, system and relevant device
CN110895545A (en) * 2018-08-22 2020-03-20 阿里巴巴集团控股有限公司 Shared data synchronization method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108270831B (en) * 2016-12-30 2021-05-07 杭州宏杉科技股份有限公司 Arbiter cluster implementation method and device
CN109783577B (en) * 2019-01-05 2021-10-08 咪付(广西)网络技术有限公司 Strategy-based cloud database elastic expansion method
CN109951331B (en) * 2019-03-15 2021-08-20 北京百度网讯科技有限公司 Method, device and computing cluster for sending information
CN110636128A (en) * 2019-09-20 2019-12-31 苏州浪潮智能科技有限公司 Data synchronization method, system, electronic equipment and storage medium
CN110909076B (en) * 2019-10-31 2023-05-23 北京浪潮数据技术有限公司 Storage cluster data synchronization method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant