WO2018192534A1 - Node device operation method, working state switching device, node device, and medium - Google Patents

Info

Publication number
WO2018192534A1
Authority
WO
WIPO (PCT)
Prior art keywords
node device
state
information
running
heartbeat information
Prior art date
Application number
PCT/CN2018/083595
Other languages
English (en)
French (fr)
Inventor
郭锐
李茂材
梁军
屠海涛
赵琦
王宗友
张建俊
朱大卫
刘斌华
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018192534A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/30 Decision processes by autonomous network management units using voting and bidding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route

Definitions

  • the present application relates to the field of network technologies, and in particular, to a node device operation method, a working state switching device, a node device, and a medium.
  • the BFT-Raft algorithm (a Byzantine fault tolerance algorithm combined with the Raft consensus algorithm) can be applied to node devices.
  • the working states of the node devices can be classified into three types: a following state (follower), a candidate state (candidate), and a leader state (leader).
  • When node device a does not receive heartbeat information from node device b for a period of time, it may determine that node device b has failed, switch to the candidate state, and broadcast a voting request to each node device in the cluster. Once it receives votes from more than half of the node devices in the cluster, node device a can switch to the leader state, broadcast heartbeat information to each node device in the cluster, store logs based on its interaction with the client, and instruct each node device to copy the logs.
  • When node device a, running in the candidate state, receives heartbeat information whose running period information is smaller than the running period information of node device a, node device a ignores the heartbeat information.
  • However, suppose a network split divides the cluster into sub-cluster A and sub-cluster B, where sub-cluster B includes node device a running in the leader state and sub-cluster A has fewer node devices than sub-cluster B. The node devices in sub-cluster A cannot elect a new node device running in the leader state, so their running period information keeps increasing over time. After the network connection between sub-cluster A and sub-cluster B is restored, any node device b in sub-cluster A can receive the heartbeat information of node device a. Because the running period information of node device a is smaller than that of node device b, node device b ignores the heartbeat information and therefore cannot rejoin the original cluster.
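  • The rejoin failure described above can be reproduced in a few lines. This is a minimal sketch only: the class names, field names, and the starting term of 3 are illustrative assumptions, not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    leader_id: str
    term: int  # running period information carried by the heartbeat


@dataclass
class Candidate:
    node_id: str
    term: int  # this node's own running period information


def baseline_on_heartbeat(node: Candidate, hb: Heartbeat) -> str:
    """Prior-art rule: a candidate ignores a heartbeat with a smaller term."""
    if hb.term < node.term:
        return "ignored"
    node.term = hb.term
    return "follow"


# Node b sits in the minority sub-cluster A and times out five times,
# so its term grows while the leader's term stays at 3 (assumed value).
node_b = Candidate("b", term=3)
for _ in range(5):
    node_b.term += 1

# After the network is restored, the leader's heartbeat is rejected.
result = baseline_on_heartbeat(node_b, Heartbeat("a", term=3))
```

  • Under these assumptions node b never follows the leader again, which is the behaviour the application sets out to fix.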
  • the embodiment of the present application provides a node device operation method and a node device, which can be used to solve the problem that a node device cannot join a cluster caused by a BFT-Raft algorithm when a network is split.
  • the technical solutions are as follows:
  • the embodiment of the present application provides a method for operating a node device, which is applied to a first node device, where the method includes:
  • if the running period information in the heartbeat information is smaller than the running period information of the first node device, synchronizing the running period information of the first node device to the running period information in the heartbeat information, and switching the working state of the first node device from the candidate state to the following state, or maintaining the candidate state.
  • the embodiment of the present application provides a working state switching device, where the device is applied to a first node device, and the device includes:
  • a receiving module configured to receive heartbeat information of a second node device running in the leader state;
  • an obtaining module configured to obtain running period information from the heartbeat information if the first node device is running in the candidate state;
  • a running module configured to, if the running period information in the heartbeat information is smaller than the running period information of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the following state, or maintain the candidate state.
  • the embodiment of the present application provides a node device, where the node device includes:
  • one or more processors;
  • one or more memories for storing instructions executed by the one or more processors;
  • the one or more processors are configured to execute the instructions to implement the node device operation method described above.
  • an embodiment of the present application provides a computer readable storage medium, where a computer program is stored thereon, wherein when the computer program is executed by a processor, the node device operation method is implemented.
  • In the embodiments of the present application, when heartbeat information is received, the running period information in the heartbeat information is obtained. If the obtained running period information is smaller than the node device's own running period information, the node device synchronizes its own running period information to the running period information in the heartbeat information and either switches to the following state or maintains the candidate state. A node device that switches to the following state can work with the second sub-cluster directly on the basis of the current heartbeat information, while a node device that maintains the candidate state can switch to the following state when it receives heartbeat information again. Either way, it is merged with the second sub-cluster into one working system, which improves the operational reliability of the system.
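  • The rule summarized above might be sketched as follows. The names `Node`, `State`, `on_leader_heartbeat`, and the `switch_now` flag are hypothetical, and the sketch deliberately covers only the case of a candidate receiving a heartbeat with a smaller running period; every other case is left to the normal election logic.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    FOLLOWING = "following"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Node:
    term: int  # the node's own running period information
    state: State = State.CANDIDATE


def on_leader_heartbeat(node: Node, hb_term: int, switch_now: bool = True) -> bool:
    """Return True if the heartbeat was accepted under the proposed rule."""
    if node.state is not State.CANDIDATE:
        return False  # not covered by this sketch
    if hb_term < node.term:
        node.term = hb_term  # synchronize down to the leader's running period
        if switch_now:
            node.state = State.FOLLOWING  # variant 1: follow immediately
        # variant 2 (switch_now=False): stay a candidate with the synced term
        return True
    return False  # not smaller: handled by the normal election logic


rejoining = Node(term=8)
accepted = on_leader_heartbeat(rejoining, hb_term=3)
```

  • With `switch_now=False` the node keeps the candidate state but already carries the leader's running period, so a later heartbeat is no longer stale relative to it.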
  • FIG. 1 is a schematic diagram of an implementation environment of a node device operation according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of switching of an operating state of a node device according to an embodiment of the present application
  • FIG. 3 is a flowchart of a method for operating a node device according to an embodiment of the present application
  • FIG. 4 is a schematic block diagram of a working state switching device according to an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a working state switching apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a node device according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an implementation environment of a node device operation according to an embodiment of the present application.
  • the implementation environment is a system composed of a plurality of node devices, and the system is also equivalent to a cluster.
  • the node device 2 is a node device running in a leadership state in the system.
  • Node device 2 broadcasts heartbeat information to each node device running in the following state, for example node device 3 and node device 4. On receiving the heartbeat information, each node device running in the following state can determine that node device 2 is operating normally and reset its timer (usually to a random value between 0.5 and 1 second, so that the timers of different node devices are not identical, which could cause repeated elections) while it waits for the next heartbeat information.
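  • A randomized timer of this kind could look like the sketch below. The function name is an assumption, and the fixed seed is used only to make the example reproducible.

```python
import random


def new_election_timeout(rng: random.Random, low: float = 0.5, high: float = 1.0) -> float:
    """Pick a timeout in seconds from the 0.5 to 1 second range given in the text."""
    return rng.uniform(low, high)


rng = random.Random(42)  # seeded only so the example is repeatable
timeouts = {f"node{i}": new_election_timeout(rng) for i in (3, 4, 5)}
```

  • Because each node draws its own value, two followers are unlikely to time out at the same instant and split the vote.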
  • the working state of each node device in the system can be dynamically switched.
  • the embodiment of the present application provides a schematic diagram of switching the working state of the node device.
  • If a node device running in the following state does not receive heartbeat information before its timer expires, it may determine that the node device running in the leader state has failed and switch to the candidate state. The node device can then reset its timer and broadcast a voting request. It stays in the candidate state until one of the following happens: it receives voting confirmation messages from more than half of the node devices in the system and switches to the leader state; it receives heartbeat information from a node device running in the leader state and switches to the following state; or its timer expires, in which case it maintains the candidate state and starts a new round of election. A node device running in the leader state can switch to the following state when it finds a node device with a higher running period (term) than its own.
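  • The transitions just described can be written as a small lookup table. The enum and event names below are illustrative assumptions; only the five transitions named in the text are encoded, and any other combination leaves the state unchanged.

```python
from enum import Enum, auto


class State(Enum):
    FOLLOWING = auto()
    CANDIDATE = auto()
    LEADER = auto()


class Event(Enum):
    TIMER_EXPIRED = auto()
    MAJORITY_VOTES = auto()
    LEADER_HEARTBEAT = auto()
    HIGHER_TERM_SEEN = auto()


# (current state, event) -> next state, as described for FIG. 2
TRANSITIONS = {
    (State.FOLLOWING, Event.TIMER_EXPIRED): State.CANDIDATE,
    (State.CANDIDATE, Event.MAJORITY_VOTES): State.LEADER,
    (State.CANDIDATE, Event.LEADER_HEARTBEAT): State.FOLLOWING,
    (State.CANDIDATE, Event.TIMER_EXPIRED): State.CANDIDATE,  # new election round
    (State.LEADER, Event.HIGHER_TERM_SEEN): State.FOLLOWING,
}


def next_state(state: State, event: Event) -> State:
    """Look up the next working state; unknown pairs keep the current state."""
    return TRANSITIONS.get((state, event), state)
```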
  • A service command from a client may be redirected to node device 2, which broadcasts a log addition request to each node device. The log addition request asks each node device to add the service command to its log. If node device 2 receives confirmation messages for the log addition request from the node devices, the service command can be added to the log, and a log copy instruction is broadcast to each node device so that each node device copies the service command into its log.
  • the system may be a transaction system based on blockchain technology.
  • the service command may be transaction information of the client, and the log stored by each node device may correspond to a blockchain. When transaction information is added to the log, it is actually stored in the block that follows the current block. Since data stored in the blockchain cannot be changed, the transaction information is effectively protected against tampering and its security is improved.
  • When a network split occurs, the node devices in the system may be divided into two sub-clusters that are isolated from each other on the network, namely the first sub-cluster and the second sub-cluster shown in FIG. 1, where the number of node devices in the first sub-cluster is smaller than the number of node devices in the second sub-cluster.
  • the second sub-cluster includes node device 2, which runs in the leader state in the system. The node devices running in the following state in the second sub-cluster can therefore continue to work normally according to the heartbeat information broadcast by node device 2. The first sub-cluster has lost its network connection to node device 2, so a node device running in the following state there cannot receive the heartbeat information of node device 2 before its timer expires.
  • That node device then switches to the candidate state, resets its timer, increments its own running period information by one, and broadcasts a voting request.
  • However, no node device in the first sub-cluster can receive votes from more than half of the node devices in the system.
  • The first sub-cluster therefore cannot elect a node device in the leader state. Each of its node devices maintains the candidate state and, whenever its timer expires, resets the timer again, increments its own running period information by one, and broadcasts a voting request, repeating this cycle.
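  • The loop described above can be simulated with a short function. This is a sketch under stated assumptions: the function name is hypothetical, and it models the best case in which every reachable node (including the candidate itself) votes for the candidate in every round.

```python
from typing import Tuple


def run_isolated_candidate(term: int, reachable_nodes: int,
                           cluster_size: int, timeouts: int) -> Tuple[int, bool]:
    """Simulate repeated elections; return (final term, whether a leader was elected)."""
    for _ in range(timeouts):
        term += 1                 # running period is incremented for each election
        votes = reachable_nodes   # best case: all reachable nodes vote for us
        if 2 * votes > cluster_size:  # strictly more than half of the whole system
            return term, True
    return term, False


# Two nodes cut off from a five-node system: four timeouts, no leader, term 3 -> 7.
minority = run_isolated_candidate(term=3, reachable_nodes=2, cluster_size=5, timeouts=4)
```

  • The majority is counted against the size of the whole system, not the sub-cluster, which is why the minority side can only keep raising its running period.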
  • When the node devices in the first sub-cluster restore the network connection, the prior art behaves as follows: even if a node device in the first sub-cluster receives the heartbeat information of node device 2, the running period information in the heartbeat information is smaller than that node device's own running period information, so the node device ignores the heartbeat information of node device 2. It continues to wait for votes from other node devices, or for heartbeat information from a node device that is qualified to be in the leader state, until its timer expires again, and the cycle repeats.
  • FIG. 3 is a flowchart of a method for operating a node device according to an embodiment of the present application.
  • the method may be applied to a first node device, where the first node device may be the node device 1 of the embodiment shown in FIG. Specifically, the following steps are included:
  • the first node device receives heartbeat information of a node device running in a leadership state.
  • the first node device runs in a candidate state, and may generate a voting request based on its own running cycle information, a last log index, and a node device identifier at the start of the election, and broadcast the voting request.
  • the second node device running in the leadership state may be the node device 2 in the second sub-cluster in the embodiment shown in FIG. 1.
  • Since the number of node devices in the first sub-cluster is less than half of the number of node devices in the system, no node device in the first sub-cluster can receive votes from more than half of the node devices in the system. The first sub-cluster therefore can never elect a node device in the leader state, and the first node device keeps running in the candidate state, incrementing its own running period information after each timer expiry to start a new round of election.
  • the heartbeat information is periodically broadcast by the second node device in the second sub-cluster, and may carry the node device identifier and the operation period information of the second node device. Optionally, the heartbeat information further carries the latest log index of the second node device. When the first sub-cluster and the second sub-cluster restore the network connection, the heartbeat information broadcast by the second node device may be received by the first node device running in the candidate state.
  • the first node device obtains the running cycle information from the heartbeat information.
  • the running period information refers to the running cycle number of the node device that sends the heartbeat information.
  • In each election, a node device that switches from the following state to the candidate state increments its running period information, and the node device that eventually enters the leader state can carry its running period information in the heartbeat information it broadcasts to the other node devices. A node device in the candidate state that receives the heartbeat information may switch to the following state and synchronize its own running period information to the running period information in the heartbeat information. The running period information can therefore indicate whether a node device has stayed synchronized with the node device running in the leader state and has been operating normally.
  • when the heartbeat information further includes the latest log index of the second node device running in the leader state, the first node device also obtains the latest log index from the heartbeat information.
  • the latest log index refers to the index of the most recently stored log of the node device that sends the heartbeat information.
  • The node device running in the leader state may broadcast a log copy instruction to the other node devices, so that a node device receiving the instruction can synchronize with the log and the latest log index of the node device in the leader state. The latest log index can therefore represent the log integrity of a node device.
  • In general, the node device running in the leader state is the node device with the best log integrity in its system.
  • the first node device may extract the running period information and the latest log index from the heartbeat information according to their respective protocol positions in the heartbeat information.
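  • Extraction by protocol position might look like the sketch below. The byte layout (a 4-byte node identifier, then an 8-byte running period, then an 8-byte latest log index, all big-endian) is purely an assumption for illustration; the application does not specify a wire format.

```python
import struct
from typing import Tuple

# Hypothetical layout: node id (uint32), term (uint64), latest log index (uint64)
HEARTBEAT_LAYOUT = struct.Struct(">IQQ")


def encode_heartbeat(node_id: int, term: int, latest_log_index: int) -> bytes:
    """Pack the three fields at their fixed positions."""
    return HEARTBEAT_LAYOUT.pack(node_id, term, latest_log_index)


def extract_term_and_index(payload: bytes) -> Tuple[int, int]:
    """Read the running period and latest log index from their fixed positions."""
    _node_id, term, latest_log_index = HEARTBEAT_LAYOUT.unpack_from(payload, 0)
    return term, latest_log_index


payload = encode_heartbeat(node_id=2, term=3, latest_log_index=17)
fields = extract_term_and_index(payload)
```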
  • the heartbeat information needs to carry the signatures that the node devices in the system produced in response to the voting request of the node device running in the leader state. When receiving the heartbeat information, if the first node device is running in the candidate state, it may obtain multiple signatures from the heartbeat information. If the number of signatures is greater than half of the number of node devices in the system and the signatures pass verification, it obtains the running period information from the heartbeat information and, optionally, the latest log index as well.
  • Each node device in the system can be configured with its own private key and the public key of each node device.
  • the first node device may extract the signature of each node device from the heartbeat information as the multiple signatures, and verify each node device's signature using the configured public key of that node device. If the signatures all pass verification and the number of verified signatures is greater than half of the number of node devices in the system, the heartbeat information does come from the node device running in the leader state. The running period information and the latest log index may then be obtained from the heartbeat information, and the process continues with the following step 303.
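  • The quorum check on signatures can be sketched as follows. Important assumption: the application describes public-key signatures, but to stay within the Python standard library this sketch substitutes a keyed HMAC as a stand-in, so `toy_sign` and the per-node keys are not the real scheme. The signed message content is also hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def toy_sign(key: bytes, message: bytes) -> bytes:
    """Stand-in for a real signature: HMAC-SHA256 (NOT a public-key scheme)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def heartbeat_is_authentic(message: bytes, signatures: Dict[str, bytes],
                           keys: Dict[str, bytes], cluster_size: int) -> bool:
    """Accept only if more than half of the system signed and every signature verifies."""
    if 2 * len(signatures) <= cluster_size:
        return False
    for node_id, signature in signatures.items():
        key = keys.get(node_id)
        if key is None or not hmac.compare_digest(signature, toy_sign(key, message)):
            return False
    return True


keys = {f"n{i}": bytes([i]) * 16 for i in range(1, 6)}  # five-node system
vote_request = b"vote-request:term=3:candidate=n2"      # hypothetical content
signatures = {n: toy_sign(keys[n], vote_request) for n in ("n2", "n3", "n4")}
ok = heartbeat_is_authentic(vote_request, signatures, keys, cluster_size=5)
```

  • A real deployment would replace `toy_sign` with verification against each node's configured public key.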
  • the first node device determines whether the running period information in the heartbeat information is smaller than the running period information of the first node device. If yes, step 304 is performed, and if not, the heartbeat information is ignored.
  • Consider the implementation scenario in which the first sub-cluster and the second sub-cluster have restored the network connection while the node devices in the first sub-cluster are still holding elections. Before the network split, the running period information of every node device in the system was the same. After the split, the running period information of the first node device keeps increasing, whereas in the second sub-cluster the second node device runs normally, so the running period information of its node devices remains unchanged. The running period information can therefore serve as one basis for verifying this implementation scenario.
  • If the running period information in the heartbeat information is smaller than the running period information of the first node device, the implementation scenario is verified and the process continues to step 304. If it is not smaller, the implementation scenario does not apply and the heartbeat information may be ignored. However, because this condition is also one of the conditions for electing a new node device in the leader state, the first node device may still proceed to step 304 (this case belongs to a normal election in the system, and the embodiments of the present application do not limit how the first node device handles it).
  • the first node device determines whether the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device. If yes, step 305 is performed, and if not, the heartbeat information is ignored.
  • the first node device synchronizes its running period information into running period information in the heartbeat information, and switches its working state from the candidate state to the following state.
  • Before the network split, the logs of each node device in the system should be synchronized with the logs of the second node device in the leader state. Because the second sub-cluster keeps serving clients for a period of time before the two sub-clusters restore the network connection, the logs stored by the second node device should be no fewer than the logs stored by the first node device. The latest log index can therefore serve as another basis for verifying the above implementation scenario.
  • If the latest log index in the heartbeat information is not less than the latest log index of the first node device, the amount of logs stored by the second node device is equal to or greater than that of the first node device, and the implementation scenario is fully verified. The first node device may then synchronize its running period information to the running period information in the heartbeat information and switch to the following state. If the latest log index in the heartbeat information is smaller than the latest log index of the first node device, the implementation scenario is not confirmed and the heartbeat information can be ignored.
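  • Taken together, the running period check, the log index check, and the state switch can be sketched as one handler. The class and field names are illustrative assumptions, and the sketch takes the "ignore" branch whenever either check fails.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    term: int              # running period information of the leader
    latest_log_index: int  # latest log index of the leader


@dataclass
class FirstNode:
    term: int
    latest_log_index: int
    state: str = "candidate"


def handle_heartbeat(node: FirstNode, hb: Heartbeat) -> bool:
    """Return True if the node rejoined as a follower, False if the heartbeat was ignored."""
    if node.state != "candidate":
        return False
    if not hb.term < node.term:
        return False  # running period check failed: ignore
    if not hb.latest_log_index >= node.latest_log_index:
        return False  # log index check failed: ignore
    node.term = hb.term       # synchronize the running period
    node.state = "following"  # switch from candidate to following
    return True


node = FirstNode(term=8, latest_log_index=12)
rejoined = handle_heartbeat(node, Heartbeat(term=3, latest_log_index=15))
```

  • The two checks are independent, so they could equally be evaluated in the opposite order or together.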
  • After switching to the following state, the first node device needs to reset its timer, determine the log index it needs to add based on its own latest log index and the latest log index in the heartbeat information, and send a log addition request to the second node device running in the leader state (i.e., node device 2 in FIG. 1).
  • the log addition request may carry the log index to be added, so that when the second node device receives the log addition request, it can return the log corresponding to that log index to the first node device.
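  • One way to determine which log indexes to request is shown below. It assumes that log indexes are contiguous integers, which the application does not state explicitly; the function name is hypothetical.

```python
from typing import List


def indexes_to_add(own_latest: int, leader_latest: int) -> List[int]:
    """Indexes the follower is missing, assuming contiguous integer log indexes."""
    return list(range(own_latest + 1, leader_latest + 1))


# The rejoining node stopped at index 12 while the leader advanced to 15.
missing = indexes_to_add(own_latest=12, leader_latest=15)
```

  • An empty result means the node is already up to date and no log addition request is needed.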
  • Alternatively, the first node device can synchronize its running period information to the running period information in the heartbeat information and keep running in the candidate state. In this implementation, the first node device waits until it receives the heartbeat information of the second node device again, then switches to the following state and synchronizes the log of the second node device.
  • the embodiment of the present application does not specifically limit the sequence of the steps 303 and 304 performed by the first node device.
  • the first node device may also judge the latest log index first and the running period information second. Alternatively, to make the judgment more efficient and let the first sub-cluster and the second sub-cluster work together as one system as soon as possible, the first node device can judge the latest log index and the running period information at the same time. As long as both satisfy their respective judgment conditions above, the first node device can switch its current working state to the following state (or maintain the candidate state).
  • the first node device may directly perform the above step 305 when determining that the running period information in the heartbeat information is smaller than the running period information thereof, and does not perform the judgment about the latest log index.
  • any node device that is in the first sub-cluster running in the candidate state can work in the same system as the second sub-cluster.
  • In the embodiments of the present application, when heartbeat information is received, the running period information in the heartbeat information is obtained. If the obtained running period information is smaller than the node device's own running period information, the node device synchronizes its own running period information to the running period information in the heartbeat information and either switches to the following state or maintains the candidate state. A node device that switches to the following state can work with the second sub-cluster directly on the basis of the current heartbeat information, while a node device that maintains the candidate state can switch to the following state when it receives heartbeat information again. Either way, it is merged with the second sub-cluster into one working system, which improves the operational reliability of the system.
  • step 306 may also be performed:
  • the first node device receives a log copy instruction broadcast by the second node device running in the leader state, and copies the log based on the log copy instruction.
  • the second node device running in the leadership state can broadcast a log copy instruction after each node device in the system determines to add a new log.
  • the first node device can be configured to receive the log copy instruction, thereby adding the service command newly received by the system to the log.
  • the BFT-Raft algorithm not only solves the node device consistency, but also solves the problem of node device fraud, data tampering, loss or disorder.
  • Accordingly, the log copy instruction needs to carry the signatures that the node devices in the system produced in response to the voting request of the second node device running in the leader state, so that the first node device can verify the log copy instruction and perform log copying after the verification passes.
  • If the first node device does not receive heartbeat information before its timer expires, it may determine that the node device running in the leader state has failed. It then switches to the candidate state, resets its timer, and broadcasts a voting request until it receives votes from more than half of the node devices in the system and becomes the new node device in the leader state, or until it receives heartbeat information from a new node device in the leader state and switches to the following state.
  • FIG. 4 is a schematic block diagram of a working state switching apparatus according to an embodiment of the present application.
  • the apparatus has the function of implementing the above-described method examples, and the function may be implemented by hardware or by hardware executing corresponding software.
  • the device can be applied to the first node device described above.
  • the apparatus may include: a receiving module 401, an obtaining module 402, and a running module 403.
  • the receiving module 401 is configured to receive heartbeat information of the second node device that is in the leadership state.
  • the obtaining module 402 is configured to obtain the running period information from the heartbeat information if the first node device is in the candidate state.
  • the running module 403 is configured to: if the running period information in the heartbeat information is smaller than the running period information of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the following state, or maintain the candidate state.
  • In the embodiments of the present application, when heartbeat information is received, the running period information in the heartbeat information is obtained. If the obtained running period information is smaller than the node device's own running period information, the node device synchronizes its own running period information to the running period information in the heartbeat information and either switches to the following state or maintains the candidate state. A node device that switches to the following state can work with the second sub-cluster directly on the basis of the current heartbeat information, while a node device that maintains the candidate state can switch to the following state when it receives heartbeat information again. Either way, it is merged with the second sub-cluster into one working system, which improves the operational reliability of the system.
  • the obtaining module 402 is further configured to: obtain the latest log index from the heartbeat information if the first node device is running in the candidate state.
  • the running module 403 is further configured to: if the running period information in the heartbeat information is smaller than the running period information of the first node device, and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the following state, or maintain the candidate state.
  • the apparatus further includes: a determining module 404 and a sending module 405.
  • the determining module 404 is configured to determine, according to the latest log index of the first node device and the latest log index in the heartbeat information, a log index that the first node device needs to add.
  • the sending module 405 is configured to send a log addition request to the second node device running in the leader state, where the log addition request carries the log index that needs to be added.
  • the obtaining module 402 is further configured to obtain multiple signatures from the heartbeat information if the first node device is running in the candidate state.
  • the obtaining module 402 is further configured to: obtain the running period information from the heartbeat information if the number of the multiple signatures is greater than a half of the number of node devices in the system, and multiple signatures are verified to pass.
  • the receiving module 401 is further configured to receive a log copy instruction broadcast by the second node device running in the leader state, and copy the log based on the log copy instruction.
  • the heartbeat information or log copy instruction carries a signature of each node device in the system in response to a voting request from a second node device running in a leader state.
  • when the node device provided in the above embodiment executes the node device operation method, the division into the functional modules described above is only an example. In actual applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the node device may be divided into different functional modules to complete all or part of the functions described above.
  • the node device provided in the foregoing embodiments and the node device operation method embodiments belong to the same concept. The specific implementation process is described in the method embodiments, and details are not described herein again.
  • FIG. 6 is a schematic structural diagram of a node device according to an embodiment of the present application.
  • Referring to FIG. 6, the node device can be provided as a server. The node device 600 includes a processing component 622, which further includes one or more processors, and memory resources, represented by the memory 632, for storing instructions executable by the processing component 622, such as an application.
  • An application stored in memory 632 can include one or more modules each corresponding to a set of instructions.
  • In addition, the processing component 622 is configured to execute the instructions to perform the following node device operation method:
  • receiving heartbeat information from a second node device running in the leader state;
  • if the first node device is running in the candidate state, obtaining running period information from the heartbeat information;
  • if the running period information in the heartbeat information is less than the running period information of the first node device, synchronizing the running period information of the first node device to the running period information in the heartbeat information, and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
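  • Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
  • if the first node device is running in the candidate state, obtaining the latest log index from the heartbeat information;
  • if the running period information in the heartbeat information is less than the running period information of the first node device, and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, performing the step of synchronizing the running period information of the first node device to the running period information in the heartbeat information and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
  • Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
  • determining, based on the latest log index of the first node device and the latest log index in the heartbeat information, the log index that the first node device needs to add;
  • sending a log adding request to the second node device running in the leader state, where the log adding request carries the log index that needs to be added.
  • Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
  • if the first node device is running in the candidate state, obtaining multiple signatures from the heartbeat information;
  • if the number of signatures is greater than half of the number of node devices in the system and all of the signatures pass verification, performing the step of obtaining the running period information from the heartbeat information.
  • Optionally, the one or more processors are further configured to execute the instructions to perform the following step:
  • receiving a log copy instruction broadcast by the second node device running in the leader state, and copying the log based on the log copy instruction.
  • Optionally, the log copy instruction carries the signature that each node device in the system produced in response to the voting request of the second node device running in the leader state.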
  • Node device 600 may also include a power component 626 configured to perform power management of node device 600, a wired or wireless network interface 650 configured to connect node device 600 to the network, and an input/output (I/O) interface 658.
  • Node device 600 may operate based on an operating system stored in the memory 632, for example, Windows Server TM, Mac OS X TM, Unix TM, Linux TM, FreeBSD TM, or the like.
  • In an exemplary embodiment, a computer-readable storage medium is also provided, which stores a computer program that, when executed by a processor, implements the above-described node device operation method.
  • For example, the computer-readable storage medium can be a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

Abstract

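The present application discloses a node device operation method, a working state switching apparatus, a node device, and a medium, and belongs to the field of network technologies. The method includes: a first node device receives heartbeat information from a second node device running in the leader state; if the first node device is running in the candidate state, it obtains running period information from the heartbeat information; if the running period information in the heartbeat information is less than the running period information of the first node device, the first node device synchronizes its running period information to the running period information in the heartbeat information, and either switches its working state from the candidate state to the follower state or keeps the candidate state. By synchronizing the running period information of the first node device to that in the heartbeat information, the present application solves the prior-art problem that the sub-clusters of a system cannot merge back into one working system when their network connection is restored, and improves the operational reliability of the system.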

Description

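Node device operation method, working state switching apparatus, node device, and medium
This application claims priority to Chinese Patent Application No. 201710263587.9, entitled "Node device operation method and node device" and filed with the State Intellectual Property Office of the People's Republic of China on April 20, 2017, which is incorporated herein by reference in its entirety.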
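Technical Field
The present application relates to the field of network technologies, and in particular to a node device operation method, a working state switching apparatus, a node device, and a medium.
Background
With the development of network technologies, it is increasingly common to serve clients from a cluster. To keep the node devices in a cluster consistent, the node devices can generally run BFT-Raft (Byzantine Fault Tolerance algorithm-Raft, that is, the Byzantine fault tolerance algorithm combined with the Raft consensus algorithm).
According to the BFT-Raft algorithm, a node device can be in one of three working states: the follower state (follower), the candidate state (candidate), and the leader state (leader). When any node device a runs in the follower state, it can determine from the heartbeat information broadcast by node device b, which runs in the leader state in the cluster, that node device b is running normally, and it copies logs as instructed by node device b. If node device a receives no heartbeat information from node device b for a period of time, it can determine that node device b has failed, switch to the candidate state, and broadcast a voting request to the node devices in the cluster. Once it receives votes from more than half of the node devices in the cluster, node device a can switch to the leader state, broadcast heartbeat information to the node devices in the cluster, store logs based on its interaction with clients, and instruct the node devices to copy the logs. It should be noted that when node device a, running in the candidate state, receives heartbeat information whose running period information is less than its own, node device a ignores that heartbeat information.
A cluster may split into two sub-clusters that are isolated from each other on the network, for example sub-cluster A and sub-cluster B, where sub-cluster B contains node device a, the node device running in the leader state in the cluster, and sub-cluster A has fewer node devices than sub-cluster B. The node devices in sub-cluster A then cannot elect a new leader by voting, so their running period information keeps increasing over time. After sub-cluster A and sub-cluster B restore their network connection, any node device b in sub-cluster A can receive the heartbeat information of node device a, but because the running period information of node device a is less than that of node device b, node device b ignores the heartbeat information and therefore cannot rejoin the original cluster.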
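Summary
The embodiments of the present application provide a node device operation method and a node device, which can be used to solve the problem that, under the BFT-Raft algorithm, a node device cannot rejoin the cluster after a network split. The technical solutions are as follows:
In one aspect, an embodiment of the present application provides a node device operation method applied to a first node device, the method including:
receiving heartbeat information from a second node device running in the leader state;
if the first node device is running in the candidate state, obtaining running period information from the heartbeat information;
if the running period information in the heartbeat information is less than the running period information of the first node device, synchronizing the running period information of the first node device to the running period information in the heartbeat information, and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.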
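The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `State`, `Heartbeat`, `Node`, `on_heartbeat`, and the `keep_candidate` flag are hypothetical, and the running period information is modelled as a plain integer `term`.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Heartbeat:
    leader_id: str  # identifier of the second node device (assumed field name)
    term: int       # running period information carried in the heartbeat


@dataclass
class Node:
    node_id: str
    state: State
    term: int

    def on_heartbeat(self, hb: Heartbeat, keep_candidate: bool = False) -> bool:
        """Apply the rejoin rule; return True if the heartbeat was accepted."""
        if self.state is not State.CANDIDATE:
            return False  # this sketch only covers the candidate case
        if hb.term < self.term:
            # Synchronize the local term down to the leader's term.
            self.term = hb.term
            if not keep_candidate:
                self.state = State.FOLLOWER
            return True
        return False  # other cases are outside the scope of this sketch


node = Node("node1", State.CANDIDATE, term=7)
accepted = node.on_heartbeat(Heartbeat("node2", term=3))
print(accepted, node.state.value, node.term)
```

With `keep_candidate=True` the node only synchronizes its term and stays a candidate, which corresponds to the "or keeping the candidate state" alternative.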
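In another aspect, an embodiment of the present application provides a working state switching apparatus applied to a first node device, the apparatus including:
a receiving module, configured to receive heartbeat information from a second node device running in the leader state;
an obtaining module, configured to obtain running period information from the heartbeat information if the first node device is running in the candidate state;
a running module, configured to, if the running period information in the heartbeat information is less than the running period information of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the follower state or keep the candidate state.
In yet another aspect, an embodiment of the present application provides a node device, the node device including:
one or more processors;
one or more memories configured to store instructions to be executed by the one or more processors;
the one or more processors being configured to execute the instructions to implement the above node device operation method.
In still another aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above node device operation method.
In the embodiments of the present application, upon receiving heartbeat information, a node device obtains the running period information in the heartbeat information; if the obtained running period information is less than its own, it synchronizes its own running period information to that in the heartbeat information and either switches its working state to the follower state or keeps the candidate state. A node device that switches to the follower state can merge into one working system with the second sub-cluster directly on the basis of this heartbeat information, and a node device that keeps the candidate state can switch to the follower state when it next receives heartbeat information and thereby also merge with the second sub-cluster, which improves the operational reliability of the system.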
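Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment in which node devices operate according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the switching of node device working states according to an embodiment of the present application;
FIG. 3 is a flowchart of a node device operation method according to an embodiment of the present application;
FIG. 4 is a schematic block diagram of a working state switching apparatus according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of a working state switching apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a node device according to an embodiment of the present application.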
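Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the drawings.
FIG. 1 is a schematic diagram of an implementation environment in which node devices operate according to an embodiment of the present application. Referring to FIG. 1, the implementation environment is a system made up of multiple node devices, which is equivalent to a cluster. Node device 2 is the node device running in the leader state in the system. While node device 2 is running normally, it can periodically broadcast heartbeat information to the node devices running in the follower state, for example node device 3 and node device 4. Each node device running in the follower state, on receiving heartbeat information, can determine that node device 2 is running normally, reset its timer (generally to a random value between 0.5 and 1 second, so that the timers of different node devices do not have the same duration, which could cause repeated elections), and wait for the next heartbeat information.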
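The randomized timer reset described above can be illustrated with a short Python sketch. The function name `next_election_timeout` and its parameters are assumptions made for illustration; only the 0.5 to 1 second range comes from the text.

```python
import random


def next_election_timeout(low: float = 0.5, high: float = 1.0) -> float:
    """Draw a fresh timer duration in seconds from the range [low, high]."""
    return random.uniform(low, high)


# Each follower draws its own value whenever it resets its timer, so the
# timers of different node devices are unlikely to expire at the same moment.
timeouts = {node: next_election_timeout() for node in ("node3", "node4")}
print(timeouts)
```

Drawing an independent value on every reset is what makes simultaneous candidacies, and therefore repeated split elections, less likely.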
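In fact, the working state of each node device in the system can be switched dynamically. Referring to FIG. 2, an embodiment of the present application provides a schematic diagram of the switching of node device working states. Once a node device running in the follower state (follower) has received no heartbeat information by the time its timer expires, it can determine that the node device running in the leader state has failed, and switch to the candidate state (candidate). The node device can then reset its timer and broadcast a voting request, until it receives vote confirmation messages from more than half of the node devices in the system and switches to the leader state (leader), or receives heartbeat information from a node device running in the leader state and switches to the follower state, or its timer expires, in which case it keeps the candidate state and starts a new round of election. A node device running in the leader state can switch to the follower state when it discovers a node device with higher running period information (term) than its own.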
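The transitions of FIG. 2 can be written down as a small lookup table. The following Python sketch is only an illustration; the `Event` names are hypothetical labels for the conditions described in the paragraph above.

```python
from enum import Enum


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


class Event(Enum):
    TIMER_EXPIRED = "timer expired without heartbeat"
    MAJORITY_VOTES = "votes from more than half of the nodes"
    LEADER_HEARTBEAT = "heartbeat from a leader"
    HIGHER_TERM_SEEN = "node with higher term discovered"


# (current state, event) -> next state, following the description of FIG. 2.
TRANSITIONS = {
    (State.FOLLOWER, Event.TIMER_EXPIRED): State.CANDIDATE,
    (State.CANDIDATE, Event.MAJORITY_VOTES): State.LEADER,
    (State.CANDIDATE, Event.LEADER_HEARTBEAT): State.FOLLOWER,
    (State.CANDIDATE, Event.TIMER_EXPIRED): State.CANDIDATE,  # new election round
    (State.LEADER, Event.HIGHER_TERM_SEEN): State.FOLLOWER,
}


def next_state(state: State, event: Event) -> State:
    """Return the next working state; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


print(next_state(State.FOLLOWER, Event.TIMER_EXPIRED).value)
```

Treating unlisted pairs as "no change" is a simplification of this sketch, not something the text specifies.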
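When the system serves clients and any node device in the system receives a service command from a client, it can redirect the service command to node device 2. Node device 2 broadcasts a log adding request to the node devices, requesting that the service command be added to the log. If node device 2 receives confirmation messages for the log adding request from the node devices, it can respond to the client's service command, add the service command to its log, and broadcast a log copy instruction to the node devices so that each of them copies the service command into its log. In a practical application scenario, the system can be a transaction system built on blockchain technology, and the service command can be a client's transaction information. The log stored by each node device can correspond to a blockchain. Adding transaction information to the log actually means storing that transaction information in the block following the current block. Because data already stored in the blockchain cannot be changed, this effectively prevents the transaction information from being tampered with and improves its security.
Because of a network interruption or for other reasons, the node devices in the system may split into two sub-clusters that are isolated from each other on the network, namely the first sub-cluster and the second sub-cluster shown in FIG. 1, where the first sub-cluster has fewer node devices than the second sub-cluster. The second sub-cluster contains node device 2, which runs in the leader state in the system. The node devices running in the follower state in the second sub-cluster can therefore continue to work normally on the basis of the heartbeat information that node device 2 broadcasts periodically. The first sub-cluster, however, is cut off from node device 2, so its node devices running in the follower state receive no heartbeat information from node device 2 before their timers expire. Under the timeout election mechanism of the BFT-Raft algorithm, such a node device switches to the candidate state, resets its timer, increments its own running period information by one, and broadcasts a voting request. Because the first sub-cluster contains fewer than half of the node devices in the system, no node device in the first sub-cluster can receive votes from more than half of the node devices in the system, so no leader is elected in the first sub-cluster before the timers expire. The node devices in the first sub-cluster therefore keep the candidate state and, each time the timer expires, reset the timer again, increment their own running period information by one, and broadcast a voting request, over and over. If the first sub-cluster and the second sub-cluster then restore their network connection, under the prior art a node device in the first sub-cluster ignores the heartbeat information of node device 2 even when it receives it, because the running period information in the heartbeat information is less than its own. It keeps waiting for votes from other node devices, or for heartbeat information from a node device that is qualified to become the leader, until its timer expires again, and the cycle repeats.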
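The term inflation in the minority sub-cluster can be made concrete with a small worked example. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and it optimistically assumes a candidate could collect at most one vote from every node in its own sub-cluster.

```python
def has_majority(votes: int, total_nodes: int) -> bool:
    """True when the votes exceed half of the node devices in the whole system."""
    return 2 * votes > total_nodes


def term_after_timeouts(start_term: int, minority_size: int,
                        total_nodes: int, timeouts: int) -> int:
    """Term of a candidate in an isolated sub-cluster after repeated elections."""
    term = start_term
    for _ in range(timeouts):
        term += 1  # each election round increments the running period by one
        if has_majority(minority_size, total_nodes):
            break  # a leader could be elected, so the increments stop
    return term


# A 5-node system splits 2/3. The 2-node side can never reach 3 votes, so
# four timer expirations take its term from 3 to 7, while the leader's side
# stays at term 3.
print(term_after_timeouts(start_term=3, minority_size=2, total_nodes=5, timeouts=4))
```

After reconnection, the leader's heartbeat carries term 3, which is lower than the isolated candidate's term 7; this is exactly the situation in which the prior-art rule ignores the heartbeat.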
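FIG. 3 is a flowchart of a node device operation method according to an embodiment of the present application. Referring to FIG. 3, the method can be applied to a first node device, which can be node device 1 in the embodiment shown in FIG. 1, and specifically includes the following steps:
301. The first node device receives heartbeat information from a second node device running in the leader state.
The first node device runs in the candidate state. At the start of an election it can generate a voting request based on its own running period information, latest log index (last log index), and node device identifier, and broadcast that voting request. The second node device running in the leader state can be node device 2 in the second sub-cluster in the embodiment shown in FIG. 1. Because the first sub-cluster contains fewer than half of the node devices in the system, no node device in the first sub-cluster can receive votes from more than half of the node devices in the system, so no leader can ever be elected in the first sub-cluster. As a result, the first node device stays in the candidate state and, after each timer expiration, increments its own running period information and starts a new round of election. The heartbeat information is broadcast periodically by the second node device in the second sub-cluster and can carry the node device identifier and running period information of the second node device. Optionally, the heartbeat information also carries the latest log index of the second node device. When the first sub-cluster and the second sub-cluster restore their network connection, the heartbeat information broadcast by the second node device can be received by the first node device running in the candidate state.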
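302. If the first node device is running in the candidate state, the first node device obtains running period information from the heartbeat information.
The running period information is the number of the running period that the node device sending the heartbeat information is currently in. In each election, a node device that switches from the follower state to the candidate state increments its running period information by one. The node device that finally becomes the leader can carry the running period information in its heartbeat information and broadcast it to the other node devices. A node device in the candidate state that receives the heartbeat information can switch to the follower state and synchronize its own running period information to that in the heartbeat information. The running period information can therefore indicate whether a node device has always stayed synchronized with the node device running in the leader state and is running normally.
Optionally, when the heartbeat information also includes the latest log index of the second node device running in the leader state, the first node device also obtains the latest log index from the heartbeat information. The latest log index is the index of the log most recently stored by the node device sending the heartbeat information. Each time the node device running in the leader state adds a new log, the latest log index is incremented by one, and that node device can broadcast a log copy instruction to the other node devices, so that a node device receiving the log copy instruction can synchronize the leader's log and latest log index. The latest log index can therefore indicate the log completeness of a node device. Clearly, the node device running in the leader state is the one with the most complete log in its system.
In this step, the first node device can extract the running period information and the latest log index from the heartbeat information according to their respective protocol positions in the heartbeat information.
In a practical application scenario, to prevent a network device from posing as the leader and sending heartbeat information, and thus to improve the security of the system, the heartbeat information needs to carry the signatures that the node devices in the system produced when responding to the voting request of the node device running in the leader state. On receiving the heartbeat information, if the first node device is running in the candidate state, it can obtain multiple signatures from the heartbeat information; if the number of signatures is greater than half of the number of node devices in the system and all of the signatures pass verification, it obtains the running period information, and optionally also the latest log index, from the heartbeat information. Each node device in the system can be configured with its own private key and the public keys of all the node devices. The first node device can extract the signatures of the node devices from the heartbeat information as the multiple signatures, and verify each node device's signature with the configured public key of that node device. If all of the signatures pass verification and the number of verified signatures is greater than half of the number of node devices in the system, the heartbeat information really does come from the node device running in the leader state, so the first node device can obtain the running period information and the latest log index from the heartbeat information and continue with step 303 below.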
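The quorum and verification check described above can be sketched as follows. This is a hedged illustration, not the application's implementation: the text does not say which signature scheme is used or exactly which bytes are signed, so the sketch takes the signed `payload` and a `verify` callback as parameters. All names here are hypothetical.

```python
from typing import Callable, Mapping

# Assumed verifier shape: (public_key, payload, signature) -> bool.
Verifier = Callable[[bytes, bytes, bytes], bool]


def heartbeat_is_authentic(signatures: Mapping[str, bytes],
                           public_keys: Mapping[str, bytes],
                           payload: bytes,
                           total_nodes: int,
                           verify: Verifier) -> bool:
    """Accept only if more than half of the nodes signed and every signature verifies."""
    if 2 * len(signatures) <= total_nodes:
        return False  # not more than half of the node devices in the system
    for node_id, signature in signatures.items():
        public_key = public_keys.get(node_id)
        if public_key is None or not verify(public_key, payload, signature):
            return False  # unknown signer or invalid signature
    return True


# Toy stand-in for a real signature check, for demonstration only.
def toy_verify(key: bytes, data: bytes, sig: bytes) -> bool:
    return sig == key + data


keys = {f"n{i}": f"k{i}".encode() for i in range(1, 6)}
sigs = {n: keys[n] + b"vote" for n in ("n1", "n2", "n3")}
print(heartbeat_is_authentic(sigs, keys, b"vote", 5, toy_verify))
```

In a real deployment the callback would wrap an actual public-key signature verification; the toy verifier exists only so the sketch runs with the standard library.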
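303. The first node device determines whether the running period information in the heartbeat information is less than the running period information of the first node device. If so, it performs step 304; if not, it ignores the heartbeat information.
The purpose of this step is to confirm the implementation scenario in which the sub-clusters of the split system have restored their network connection and the node devices in the first sub-cluster are still holding elections. When the first and second sub-clusters first split, all node devices in the system have the same running period information. After the split, the running period information of the first node device keeps increasing as the first sub-cluster holds election after election, whereas the running period information of the node devices in the second sub-cluster stays unchanged because the second node device is running normally. The running period information can therefore serve as one piece of evidence for the above scenario. If the running period information in the heartbeat information is less than that of the first node device, the scenario is confirmed and step 304 is performed. If the running period information in the heartbeat information is not less than that of the first node device, the scenario does not apply and the heartbeat information can be ignored; however, because this condition matches one of the cases in which a new leader has been elected, the first node device can also continue with step 304 (in fact, this case belongs to normal election in the system, and the embodiments of the present application do not limit how the first node device handles it).
304. The first node device determines whether the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device. If so, it performs step 305; if not, it ignores the heartbeat information.
305. The first node device synchronizes its running period information to the running period information in the heartbeat information, and switches its working state from the candidate state to the follower state.
Before the system split, the logs of all node devices in the system should have been synchronized with the log of the second node device in the leader state. After the second sub-cluster has served clients for a period of time and the two sub-clusters have restored their network connection, the second node device should therefore have stored no fewer logs than the first node device, so the latest log index can also serve as one piece of evidence for the above scenario. If the latest log index in the heartbeat information is not less than the latest log index of the first node device, the second node device has stored as many logs as the first node device or more, and the scenario is fully confirmed. The first node device can then synchronize its running period information to the running period information in the heartbeat information and switch to the follower state. If the latest log index in the heartbeat information is less than the latest log index of the first node device, the scenario is not confirmed and the heartbeat information can be ignored.
Of course, if the first node device switches to the follower state, it also needs to reset its timer, determine the log index that it needs to add based on its own latest log index and the latest log index in the heartbeat information, and send a log adding request to the second node device running in the leader state (that is, node device 2 in FIG. 1). The log adding request can carry the log index that needs to be added, so that the second node device, on receiving the log adding request, can return the logs corresponding to that log index to the first node device.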
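Determining which log indexes to request can be illustrated with a one-line computation. The Python sketch below assumes that log indexes are consecutive integers and that the request covers every index after the local latest one up to the leader's latest one; the text does not spell out the exact form of the request, and the function name is hypothetical.

```python
def indexes_to_add(local_last_index: int, leader_last_index: int) -> range:
    """Log indexes the first node device is missing relative to the leader."""
    return range(local_last_index + 1, leader_last_index + 1)


# The first node device stopped at index 4 while the leader advanced to 7.
missing = list(indexes_to_add(local_last_index=4, leader_last_index=7))
print(missing)
```

When both indexes are equal the range is empty, which means the node has nothing to request.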
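In fact, the first node device can also synchronize its running period information to the running period information in the heartbeat information and keep running in the candidate state. In this implementation, the first node device can wait until it next receives heartbeat information from the second node device, and then switch to the follower state and synchronize the log of the second node device.
It should be noted that the embodiments of the present application do not specifically limit the order in which the first node device performs steps 303 and 304. The first node device can check the latest log index first and then the running period information. Alternatively, to make the check more efficient and let the first and second sub-clusters merge into one working system as soon as possible, the first node device can check the latest log index and the running period information at the same time. As long as both satisfy their respective conditions above, the first node device can switch its current working state to the follower state (or keep the candidate state).
It should also be noted that this embodiment only takes as an example the case in which the scenario (the sub-clusters of the split system have restored their network connection and the node devices in the first sub-cluster are holding elections) is considered confirmed only when both the running period information and the latest log index satisfy their respective conditions. In other possible embodiments, the first node device can perform step 305 directly once it determines that the running period information in the heartbeat information is less than its own, without checking the latest log index.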
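Steps 303 to 305 and the variants just described can be combined into a single decision function. The Python sketch below is an illustration only: the names `Decision` and `decide` and the flags `check_log` and `keep_candidate` are hypothetical. It follows the "ignore" branch of step 303 for heartbeats whose term is not lower, although the text leaves that case open.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    IGNORE = "ignore"
    SYNC_AND_FOLLOW = "sync term, switch to follower"
    SYNC_AND_STAY_CANDIDATE = "sync term, keep candidate"


def decide(local_term: int, local_last_index: int, hb_term: int,
           hb_last_index: Optional[int] = None,
           check_log: bool = True, keep_candidate: bool = False) -> Decision:
    """Evaluate both conditions together; their order does not matter."""
    term_ok = hb_term < local_term  # step 303
    log_ok = True
    if check_log:  # step 304, skipped in the variant without the log check
        log_ok = hb_last_index is not None and hb_last_index >= local_last_index
    if not (term_ok and log_ok):
        return Decision.IGNORE
    # Step 305, or the variant that keeps the candidate state.
    if keep_candidate:
        return Decision.SYNC_AND_STAY_CANDIDATE
    return Decision.SYNC_AND_FOLLOW


print(decide(local_term=9, local_last_index=4, hb_term=3, hb_last_index=7).name)
```

Because the two checks are independent boolean conditions, evaluating them in either order, or at the same time, gives the same decision.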
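With the above node device operation method, any node device that was running in the candidate state in the first sub-cluster can merge into one working system with the second sub-cluster.
In the embodiments of the present application, upon receiving heartbeat information, a node device obtains the running period information in the heartbeat information; if the obtained running period information is less than its own, it synchronizes its own running period information to that in the heartbeat information and either switches its working state to the follower state or keeps the candidate state. A node device that switches to the follower state can merge into one working system with the second sub-cluster directly on the basis of this heartbeat information, and a node device that keeps the candidate state can switch to the follower state when it next receives heartbeat information and thereby also merge with the second sub-cluster, which improves the operational reliability of the system.
Optionally, after entering the follower state, the first node device can also perform the following step 306:
306. The first node device receives a log copy instruction broadcast by the second node device running in the leader state, and copies the log based on the log copy instruction.
To ensure that every node device in the system stores a complete log, and thus to keep the system consistent, the second node device running in the leader state can broadcast a log copy instruction after the node devices in the system have confirmed the addition of a new log, so that the first node device receives the log copy instruction and adds the service command most recently received by the system to its log. Because the BFT-Raft algorithm addresses not only the consistency of node devices but also node device fraud and data that is tampered with, lost, or out of order, the log copy instruction needs to carry the signatures that the node devices in the system produced when responding to the voting request of the second node device running in the leader state, so that the first node device can verify the log copy instruction and copy the log after verification succeeds.
It should be noted that after the first and second sub-clusters restore their network connection, if any node device in the system has received no heartbeat information by the time its timer expires, it can determine that the node device running in the leader state has failed. It then switches to the candidate state, resets its timer, and broadcasts a voting request, until it receives votes from more than half of the node devices in the system and becomes the new leader, or until it receives heartbeat information from a new leader and switches to the follower state.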
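The following are apparatus embodiments of the present application. For details not disclosed in the apparatus embodiments, see the method embodiments of the present application.
FIG. 4 is a schematic block diagram of a working state switching apparatus according to an embodiment of the present application. The apparatus has the function of implementing the above method examples, and the function can be implemented by hardware or by hardware executing corresponding software. The apparatus can be applied to the first node device described above. Referring to FIG. 4, the apparatus can include a receiving module 401, an obtaining module 402, and a running module 403.
The receiving module 401 is configured to receive heartbeat information from a second node device running in the leader state.
The obtaining module 402 is configured to obtain running period information from the heartbeat information if the first node device is running in the candidate state.
The running module 403 is configured to, if the running period information in the heartbeat information is less than the running period information of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the follower state or keep the candidate state.
In the embodiments of the present application, upon receiving heartbeat information, a node device obtains the running period information in the heartbeat information; if the obtained running period information is less than its own, it synchronizes its own running period information to that in the heartbeat information and either switches its working state to the follower state or keeps the candidate state. A node device that switches to the follower state can merge into one working system with the second sub-cluster directly on the basis of this heartbeat information, and a node device that keeps the candidate state can switch to the follower state when it next receives heartbeat information and thereby also merge with the second sub-cluster, which improves the operational reliability of the system.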
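In a possible implementation, the obtaining module 402 is further configured to obtain the latest log index from the heartbeat information if the first node device is running in the candidate state.
The running module 403 is further configured to, if the running period information in the heartbeat information is less than the running period information of the first node device and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the follower state or keep the candidate state.
Optionally, as shown in FIG. 5, the apparatus further includes a determining module and a sending module.
The determining module 404 is configured to determine, based on the latest log index of the first node device and the latest log index in the heartbeat information, the log index that the first node device needs to add.
The sending module 405 is configured to send a log adding request to the second node device running in the leader state, where the log adding request carries the log index that needs to be added.
In a possible implementation, the obtaining module 402 is further configured to obtain multiple signatures from the heartbeat information if the first node device is running in the candidate state.
The obtaining module 402 is further configured to obtain the running period information from the heartbeat information if the number of signatures is greater than half of the number of node devices in the system and all of the signatures pass verification.
In a possible implementation, the receiving module 401 is further configured to receive a log copy instruction broadcast by the second node device running in the leader state, and copy the log based on the log copy instruction.
In a possible implementation, the heartbeat information or the log copy instruction carries the signatures that the node devices in the system produced when responding to the voting request of the second node device running in the leader state.
All of the above optional technical solutions can be combined in any manner to form optional embodiments of the present application, which are not described one by one here.
It should be noted that the division into the functional modules above is only an example of how the node device provided in the foregoing embodiment performs the node device operation method. In actual applications, the functions can be assigned to different functional modules as needed; that is, the internal structure of the node device can be divided into different functional modules to complete all or part of the functions described above. In addition, the node device provided in the foregoing embodiment and the node device operation method embodiment belong to the same concept. For the specific implementation process, see the method embodiment; details are not repeated here.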
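FIG. 6 is a schematic structural diagram of a node device according to an embodiment of the present application. Referring to FIG. 6, the node device can be provided as a server. The node device 600 includes a processing component 622, which further includes one or more processors, and memory resources, represented by the memory 632, for storing instructions executable by the processing component 622, such as an application. The application stored in the memory 632 can include one or more modules, each corresponding to a set of instructions. In addition, the processing component 622 is configured to execute the instructions to perform the following node device operation method:
receiving heartbeat information from a second node device running in the leader state;
if the first node device is running in the candidate state, obtaining running period information from the heartbeat information;
if the running period information in the heartbeat information is less than the running period information of the first node device, synchronizing the running period information of the first node device to the running period information in the heartbeat information, and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
if the first node device is running in the candidate state, obtaining the latest log index from the heartbeat information;
if the running period information in the heartbeat information is less than the running period information of the first node device, and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, performing the step of synchronizing the running period information of the first node device to the running period information in the heartbeat information and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
determining, based on the latest log index of the first node device and the latest log index in the heartbeat information, the log index that the first node device needs to add;
sending a log adding request to the second node device running in the leader state, where the log adding request carries the log index that needs to be added.
Optionally, the one or more processors are further configured to execute the instructions to perform the following steps:
if the first node device is running in the candidate state, obtaining multiple signatures from the heartbeat information;
if the number of signatures is greater than half of the number of node devices in the system and all of the signatures pass verification, performing the step of obtaining the running period information from the heartbeat information.
Optionally, the one or more processors are further configured to execute the instructions to perform the following step:
receiving a log copy instruction broadcast by the second node device running in the leader state, and copying the log based on the log copy instruction.
Optionally, the log copy instruction carries the signatures that the node devices in the system produced when responding to the voting request of the second node device running in the leader state.
The node device 600 can also include a power component 626 configured to perform power management of the node device 600, a wired or wireless network interface 650 configured to connect the node device 600 to a network, and an input/output (I/O) interface 658. The node device 600 can operate based on an operating system stored in the memory 632, for example Windows Server TM, Mac OS X TM, Unix TM, Linux TM, FreeBSD TM, or the like.
In an exemplary embodiment, a computer-readable storage medium is also provided, which stores a computer program that, when executed by a processor, implements the above node device operation method. For example, the computer-readable storage medium can be a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program product is also provided which, when executed, implements the functions of the steps in the above method embodiments.
The above are merely exemplary embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.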

Claims (14)

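  1. A node device operation method, applied to a first node device, the method comprising:
    receiving heartbeat information from a second node device running in the leader state;
    if the first node device is running in the candidate state, obtaining running period information from the heartbeat information;
    if the running period information in the heartbeat information is less than the running period information of the first node device, synchronizing the running period information of the first node device to the running period information in the heartbeat information, and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
  2. The method according to claim 1, wherein after the receiving heartbeat information from a second node device running in the leader state, the method further comprises:
    if the first node device is running in the candidate state, obtaining the latest log index from the heartbeat information;
    if the running period information in the heartbeat information is less than the running period information of the first node device, and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, performing the step of synchronizing the running period information of the first node device to the running period information in the heartbeat information and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state.
  3. The method according to claim 2, wherein after the synchronizing the running period information of the first node device to the running period information in the heartbeat information and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state, the method further comprises:
    determining, based on the latest log index of the first node device and the latest log index in the heartbeat information, the log index that the first node device needs to add;
    sending a log adding request to the second node device running in the leader state, the log adding request carrying the log index that needs to be added.
  4. The method according to any one of claims 1 to 3, wherein after the receiving heartbeat information from a second node device running in the leader state, the method further comprises:
    if the first node device is running in the candidate state, obtaining multiple signatures from the heartbeat information;
    if the number of the multiple signatures is greater than half of the number of node devices in the system, and all of the multiple signatures pass verification, performing the step of obtaining running period information from the heartbeat information.
  5. The method according to any one of claims 1 to 3, wherein after the synchronizing the running period information of the first node device to the running period information in the heartbeat information and switching the working state of the first node device from the candidate state to the follower state or keeping the candidate state, the method further comprises:
    receiving a log copy instruction broadcast by the second node device running in the leader state, and copying the log based on the log copy instruction.
  6. The method according to claim 5, wherein the log copy instruction carries the signatures that the node devices in the system produced when responding to the voting request of the second node device running in the leader state.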
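  7. A working state switching apparatus, applied to a first node device, the apparatus comprising:
    a receiving module, configured to receive heartbeat information from a second node device running in the leader state;
    an obtaining module, configured to obtain running period information from the heartbeat information if the first node device is running in the candidate state;
    a running module, configured to, if the running period information in the heartbeat information is less than the running period information of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the follower state or keep the candidate state.
  8. The apparatus according to claim 7, wherein
    the obtaining module is further configured to obtain the latest log index from the heartbeat information if the first node device is running in the candidate state;
    the running module is further configured to, if the running period information in the heartbeat information is less than the running period information of the first node device and the latest log index in the heartbeat information is greater than or equal to the latest log index of the first node device, synchronize the running period information of the first node device to the running period information in the heartbeat information, and switch the working state of the first node device from the candidate state to the follower state or keep the candidate state.
  9. The apparatus according to claim 8, wherein the apparatus further comprises:
    a determining module, configured to determine, based on the latest log index of the first node device and the latest log index in the heartbeat information, the log index that the first node device needs to add;
    a sending module, configured to send a log adding request to the second node device running in the leader state, the log adding request carrying the log index that needs to be added.
  10. The apparatus according to any one of claims 7 to 9, wherein
    the obtaining module is further configured to obtain multiple signatures from the heartbeat information if the first node device is running in the candidate state;
    the obtaining module is further configured to obtain running period information from the heartbeat information if the number of the multiple signatures is greater than half of the number of node devices in the system and all of the multiple signatures pass verification.
  11. The apparatus according to any one of claims 7 to 9, wherein
    the receiving module is further configured to receive a log copy instruction broadcast by the second node device running in the leader state, and copy the log based on the log copy instruction.
  12. The apparatus according to claim 11, wherein the log copy instruction carries the signatures that the node devices in the system produced when responding to the voting request of the second node device running in the leader state.
  13. A node device, the node device comprising:
    one or more processors;
    one or more memories configured to store instructions to be executed by the one or more processors;
    the one or more processors being configured to execute the instructions to implement the node device operation method according to any one of claims 1 to 6.
  14. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the node device operation method according to any one of claims 1 to 6.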
PCT/CN2018/083595 2017-04-20 2018-04-18 Node device operation method, working state switching apparatus, node device, and medium WO2018192534A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710263587.9A CN107124305B (zh) 2017-04-20 2017-04-20 Node device operation method and node device
CN201710263587.9 2017-04-20

Publications (1)

Publication Number Publication Date
WO2018192534A1 true WO2018192534A1 (zh) 2018-10-25

Family

ID=59725923

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/083595 WO2018192534A1 (zh) 2017-04-20 2018-04-18 Node device operation method, working state switching apparatus, node device, and medium

Country Status (2)

Country Link
CN (1) CN107124305B (zh)
WO (1) WO2018192534A1 (zh)

Also Published As

Publication number Publication date
CN107124305A (zh) 2017-09-01
CN107124305B (zh) 2019-08-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18788473

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18788473

Country of ref document: EP

Kind code of ref document: A1