WO2021004256A1 - Method for node switching upon node failure and related device


Info

Publication number
WO2021004256A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
storage device
master node
container
standby
Application number
PCT/CN2020/097262
Other languages
English (en)
Chinese (zh)
Inventor
郑营飞
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2021004256A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/2002 Error detection or correction of the data by redundancy in hardware using active fault-masking, where interconnections or communication control functionality are redundant

Definitions

  • The present invention relates to the technical field of cloud computing storage systems, and in particular to a method and related device for node switching when a node fails.
  • In a cloud storage system, each service application corresponds to a main container and at least one backup container, and the main container and the backup container share a common storage device.
  • Only the main container can read and write data in the storage device and provide services externally; the standby container cannot. The standby container only monitors the status of the main container and, when the main container fails, takes over its work: it is upgraded to the main container and reads and writes the storage device to provide services.
  • The main container and the backup container, deployed on different physical machines, communicate with the storage device through the small computer system interface (SCSI) protocol.
  • The main container can lock the storage device using the lock command provided by the SCSI protocol.
  • The backup container can monitor the status of the main container through a network connection. After the main container fails, the backup container can detect the failure in time, be upgraded to the main container immediately, and continue to provide external services.
  • However, the standby container may be created temporarily, possibly on the same physical machine as the main container, so the standby container may be unable to establish a network connection with the main container and thus unable to monitor its status.
  • The embodiments of the present invention disclose a node switching method and related device for use when a node fails, which can ensure that, in a scenario where multiple nodes sharing a storage device do not perceive each other, a standby node can accurately perceive the state of the primary node and take over the primary node in the event of a failure, thereby improving application reliability.
  • In a first aspect, the present application provides an inter-node switching method, including: a standby node detects a status flag of a master node stored in a storage device and determines, according to the status flag, whether the master node is faulty, where the master node is the node that accesses data in the storage device and provides services for users; and when the standby node determines according to the status flag that the master node is faulty, the standby node takes over the master node.
  • In this solution, the standby node does not need to establish a heartbeat connection with the master node to perceive its status directly; it indirectly determines whether the master node is faulty by detecting the status flag of the master node stored in the storage device, and takes over the master node and provides external services in the event of a failure, thereby improving application reliability.
  • In one implementation, the status flag is the heartbeat value of the master node; the standby node periodically detects whether the heartbeat value of the master node stored in the storage device is updated, and if the heartbeat value is not updated, determines that the master node is faulty.
  • The master node periodically updates the heartbeat value stored in the storage device, for example by periodically incrementing it by one, so that the standby node can determine whether the master node is faulty by periodically detecting the heartbeat value. In this way, the standby node can still accurately sense the status of the master node without establishing a heartbeat connection with it.
  • In one implementation, the storage device further stores the mark of the master node, and when taking over, the standby node updates the mark of the master node in the storage device to the mark of the standby node.
  • The storage device only stores the mark of one node, that is, the mark of the primary node.
  • When taking over, the standby node needs to update the mark of the primary node in the storage device to its own mark, so that other standby nodes can determine that a new primary node currently exists and avoid accessing the storage device, ensuring data consistency and application reliability.
  • In one implementation, the standby node writes its own mark to the storage device every first preset duration and, after each first preset duration, reads the mark stored in the storage device at the same interval; within a second preset duration, when the mark the standby node reads N consecutive times is the same as its own mark, it stops writing its mark to the storage device, where N is a positive integer greater than or equal to 1.
  • When the standby node competes for the primary role, it does so by writing its own mark to the storage device; since a mark written later overwrites a mark written earlier, a node that writes later has a greater probability of winning. If the mark a standby node reads is the same as its own mark N consecutive times, for example 3 times, the standby node can be considered to have competed successfully and become the new master node. This form of competition improves the accuracy of master node selection, prevents multiple nodes from accessing the storage device at the same time, and ensures application reliability.
  • In one implementation, after the standby node takes over the master node, the heartbeat value stored in the storage device is cleared to zero and then updated periodically.
  • At that point the storage device stores the mark of the standby node, but the stored heartbeat value is still that of the original primary node; by clearing it to zero the new master indirectly informs the other standby nodes that a new master node currently exists, and by updating the heartbeat value periodically it lets the other standby nodes perceive its status.
  • In one implementation, the standby node periodically reads the mark and the heartbeat value stored in the storage device and determines whether the mark is the same as its own mark and whether the heartbeat value is the same as the heartbeat value it wrote in the previous cycle; only when it determines that the mark stored in the storage device is the same as its own mark and the heartbeat value is the same as the heartbeat value it wrote in the previous cycle does the standby node update the heartbeat value.
  • After the standby node takes over as the master node, it needs to update the heartbeat value periodically so that other standby nodes can perceive its state.
  • Before each heartbeat update, the node needs to determine whether the mark stored in the storage device is the same as its own mark and whether the heartbeat value is the same as the value it wrote in the previous cycle; the heartbeat value is updated only when both conditions hold. This ensures that the node can detect abnormal conditions in time, preserving application reliability.
  • In one implementation, before detecting the status flag of the master node, the standby node detects whether the storage device holds a master node mark; if the storage device does not store a master node mark, the standby node takes over as the master node.
  • When started, a standby node detects whether a primary node currently exists by checking the primary node mark stored in the storage device. If no primary node currently exists, the standby node takes over directly, without needing to periodically detect the status flag of the master node stored in the storage device, which improves the efficiency of the competition.
  • In a second aspect, the present application provides a node, including: a detection module for detecting the status flag of the master node stored in the storage device and determining, according to the status flag, whether the master node is faulty, where the master node is the node that accesses data in the storage device and provides services for users; and a takeover module for taking over the master node when the detection module determines, according to the status flag, that the master node is faulty.
  • In one implementation, the status flag is the heartbeat value of the master node; when detecting the status flag of the master node stored in the storage device and determining according to it whether the master node is faulty, the detection module is specifically used to: periodically detect whether the heartbeat value of the master node stored in the storage device is updated; and if the heartbeat value of the master node is not updated, determine that the master node is faulty.
  • In one implementation, the storage device further stores the mark of the master node, and when taking over the master node, the takeover module is specifically configured to update the mark of the master node in the storage device to the mark of the node.
  • In one implementation, when updating the mark of the primary node in the storage device to the mark of the standby node, the takeover module is specifically configured to: write the mark of the node into the storage device every first preset duration and, after each first preset duration, read the mark stored in the storage device at the same interval; and within a second preset duration, when the mark read N consecutive times is the same as the written mark, stop writing the mark of the node to the storage device, where N is a positive integer greater than or equal to 1.
  • In one implementation, after taking over the master node, the takeover module is also used to clear the heartbeat value stored in the storage device to zero and update the heartbeat value periodically.
  • In one implementation, when periodically updating the heartbeat value, the takeover module is specifically configured to: periodically read the mark and the heartbeat value stored in the storage device and determine whether the mark is the same as the mark of the standby node and whether the heartbeat value is the same as the heartbeat value written by the standby node in the previous cycle; and update the heartbeat value when both are the same.
  • In one implementation, before the detection module detects the status flag of the master node stored in the storage device, the detection module is further configured to detect whether a master node mark is stored in the storage device; the takeover module is further configured to compete for the master node when the detection module detects that the storage device does not store a master node mark.
  • In a third aspect, the present application provides a computing device including a processor and a memory connected by an internal bus; the memory stores instructions, and the processor calls the instructions in the memory to execute the inter-node switching method provided in the first aspect or any implementation of the first aspect.
  • In a fourth aspect, the present application provides a computer storage medium that stores a computer program; when the computer program is executed by a processor, it can implement the flow of the inter-node switching method provided in the first aspect or any implementation of the first aspect.
  • In a fifth aspect, the present application provides a computer program product containing instructions; when executed, the computer program can carry out the flow of the inter-node switching method provided in the first aspect or any implementation of the first aspect.
  • FIG. 1 is a system architecture diagram of a communication system using the SCSI protocol according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an inter-node switching method according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the state changes of a container during the switching process according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the sequence relationships in the competition process according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a node according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • The small computer system interface (SCSI) is an independent processor standard for system-level interfaces between computers and hardware devices (such as hard disks, optical drives, printers, and scanners).
  • SCSI is a universal interface: host adapters and SCSI peripheral controllers can both be connected to the SCSI bus, and multiple peripherals on one SCSI bus can work at the same time.
  • The SCSI interface can transmit data synchronously or asynchronously.
  • A container is a virtualization technology in a computer operating system. It lets processes run in a relatively independent and isolated environment (with an independent file system, namespace, resource view, and so on), thereby simplifying software deployment, enhancing software portability and security, and improving system resource utilization.
  • Applications that provide services to users are deployed in virtual machines in the form of containers, and the virtual machines are generally deployed on physical machines.
  • To improve application reliability, a main container and a standby container can be set for an application.
  • The main container and the standby container access the same storage device.
  • Only the main container can read and write data in the storage device and provide services externally.
  • The standby container monitors the status of the main container; when the main container fails, the standby container is upgraded to the main container to provide external services.
  • FIG. 1 is a system architecture diagram in which multiple physical machines are connected to a storage device and communicate with it through the SCSI protocol.
  • The physical machine 121, the physical machine 122, ..., and the physical machine 12n are connected to the storage device 110 at the same time.
  • One or more containers are deployed on each physical machine, and applications that provide services for users are deployed in each container.
  • To improve application reliability, a main container and a standby container can be set, with the main container and the standby container located on different physical machines.
  • To keep access exclusive, the main container needs to lock the storage device through the physical machine where it is located.
  • The SCSI protocol provides a lock command for locking the storage device, and a physical machine can use the lock command to lock, on behalf of a container deployed on it, the storage device that the container accesses.
  • Each container is actually allocated a storage interval on the storage device, so the physical machine in fact locks the storage space allocated to the container.
  • After locking, only the main container can access the storage device.
  • If the main container fails, the lock it added to the storage device cannot be released; that is, the lock remains.
  • The physical machine where the standby container is located is connected to the physical machine where the main container is located through the network, so the standby container and the main container can also establish a connection.
  • Through this connection, the standby container can periodically send heartbeat information to the main container.
  • If the standby container does not receive a response from the main container within a period of time, it determines that the main container is faulty, and the standby container can be upgraded to the main container.
  • To do so, the standby container can use the forced-override lock command provided in the SCSI protocol to override the exclusive lock added by the main container and add its own exclusive lock to the storage device, so as to access the storage device and provide services externally.
  • However, the standby container is created temporarily, when it needs to be used, and the physical machine where the temporarily created standby container is located may or may not be the same as that of the main container; the newly created standby container may not have established a network connection with the physical machine of the main container, so it cannot determine the state of the main container by heartbeat.
  • As a result, the standby container cannot be upgraded to the primary container in time, which affects the reliability of the application.
  • To solve this problem, this application provides a method by which, even when no network connection is established between the main and standby containers, the standby container can detect the failure of the main container in time, be upgraded to the main container, and continue to provide services.
  • FIG. 2 shows a possible application scenario of an embodiment of the present application.
  • The physical machine 2100 and the physical machine 2200 are both connected to the storage device 2300.
  • A virtual machine 2110 and a virtual machine 2120 are deployed on the physical machine 2100, with a container 2111 running in the virtual machine 2110 and a container 2121 running in the virtual machine 2120; a virtual machine 2210 is deployed on the physical machine 2200, with a container 2211 running in the virtual machine 2210.
  • Container 2111, container 2121, and container 2211 form a container cluster, where container 2111 is the main container and container 2121 and container 2211 are standby containers.
  • The same application is deployed in the main container and the backup containers.
  • The main container accesses the storage device 2300 to provide external services.
  • Alternatively, the container 2111, the container 2121, and the container 2211 may be deployed directly on the physical machines.
  • The storage device 2300 may be a physical storage device, such as a storage array or a hard disk, or a section of storage space on a physical storage device; it is allocated to container 2111, container 2121, and container 2211 to store data generated by the application deployed in the containers.
  • The storage device 2300 includes a mark storage area 2310, a heartbeat information storage area 2320, and a data storage area 2330.
  • The mark storage area 2310 is used to store the mark of the main container.
  • The heartbeat information storage area 2320 is used to store the heartbeat value of the main container.
  • The data storage area 2330 is used to store data generated during the operation of the main container.
  • The main container (i.e., container 2111) can access the data storage area 2330, while the standby containers (i.e., container 2121 and container 2211) can access the mark storage area 2310 and the heartbeat information storage area 2320 to monitor the status of the main container.
  • While the main container works normally, the heartbeat value in the heartbeat information storage area 2320 is periodically updated, for example periodically incremented by one.
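  • As a concrete illustration, the following Python sketch models the three areas as byte ranges on one shared block device or file. The patent does not specify a byte layout: apart from the 512-byte mark area mentioned in the embodiment below, the offsets, the 8-byte little-endian heartbeat counter, and the SharedStorage name are assumptions made for this example.

```python
# Illustrative sketch only: offsets and sizes are assumed, not taken from the patent
# (except the 512-byte mark area, which the embodiment gives as an example size).
import struct

MARK_AREA_OFFSET = 0      # mark storage area 2310: mark of the main container
MARK_AREA_SIZE = 512      # example size from the embodiment
HEARTBEAT_OFFSET = 512    # heartbeat information storage area 2320
HEARTBEAT_SIZE = 8        # assumed: one 64-bit unsigned counter
DATA_AREA_OFFSET = 520    # data storage area 2330: application data starts here


class SharedStorage:
    """Minimal model of the shared device as seen by every container."""

    def __init__(self, path: str):
        # Unbuffered binary access, so every read observes the shared state.
        self._f = open(path, "r+b", buffering=0)

    def read_mark(self) -> bytes:
        self._f.seek(MARK_AREA_OFFSET)
        return self._f.read(MARK_AREA_SIZE).rstrip(b"\x00")

    def write_mark(self, mark: bytes) -> None:
        self._f.seek(MARK_AREA_OFFSET)
        self._f.write(mark.ljust(MARK_AREA_SIZE, b"\x00"))

    def read_heartbeat(self) -> int:
        self._f.seek(HEARTBEAT_OFFSET)
        return struct.unpack("<Q", self._f.read(HEARTBEAT_SIZE))[0]

    def write_heartbeat(self, value: int) -> None:
        self._f.seek(HEARTBEAT_OFFSET)
        self._f.write(struct.pack("<Q", value))
```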
  • Container 2121 and container 2211 each periodically monitor the mark storage area 2310 and the heartbeat information storage area 2320 to track the status of container 2111. If the heartbeat value in the heartbeat information storage area 2320 is not updated for longer than a preset time (for example, two monitoring cycles), they can determine that container 2111 has failed and can no longer provide services normally. At that point, container 2121 and container 2211 each write their own mark to the mark storage area and compete to select a new main container; the competition process is described in detail below.
  • If container 2121 wins the competition, the mark storage area 2310 will store the mark of container 2121 and no longer store the mark of container 2111.
  • Container 2121 then clears the heartbeat value in the heartbeat information storage area, updates it periodically, and accesses the data in the data storage area 2330 to provide external services. If container 2211 loses the competition, it continues to monitor the mark storage area 2310 and the heartbeat information storage area 2320 to track the status of container 2121.
  • In this way, even if no network connection exists between the standby containers and the main container, a standby container can determine the status of the main container by detecting the heartbeat information the main container writes to the heartbeat information storage area 2320, and can quickly select a new main container to provide services when the main container fails, ensuring the reliability of the service the containers provide.
  • FIG. 3 is a flowchart of the container switching method, and FIG. 4 shows the state changes of a container during the switching process. This application takes any single container as an example for detailed description. As shown in FIG. 3, the method includes, but is not limited to, the following steps.
  • Step S301: when the container starts, it detects whether the mark of a main container has been written to the mark storage area 2310 of the storage device 2300. If the mark storage area holds the mark of a main container, step S302 is executed; if not, step S303 is executed.
  • After the container starts, it first needs to detect whether the mark storage area 2310 of the storage device 2300 holds the mark of a main container.
  • The mark storage area 2310 in the storage device 2300 is used to store the mark of the main container.
  • The size of the mark storage area 2310 can be set according to actual needs, for example to 512 bytes, which is not limited in this application.
  • The mark of a container uniquely identifies that container.
  • If the mark storage area 2310 stores the mark of a main container, a main container already exists in the container cluster, and the container acts as a backup container in the standby state shown in FIG. 4.
  • If the mark storage area does not store the mark of a main container, no main container exists in the container cluster, and the container can compete to become the main container; at this time the container is in the election state shown in FIG. 4.
  • Step S302: the container periodically detects whether the heartbeat value in the heartbeat information storage area 2320 changes within a preset period. If it changes, step S302 continues to be executed; if it does not change, the container enters the election state shown in FIG. 4 and step S303 is executed.
  • The heartbeat information storage area 2320 is used to store the heartbeat value of the main container.
  • While the main container works normally, the heartbeat value in the heartbeat information storage area 2320 is updated periodically. If a backup container observes that the heartbeat value keeps being updated, the main container is still working; if a backup container observes that the heartbeat value is not updated within the preset period, for example the heartbeat value read in two consecutive detection cycles is the same, the main container has failed.
  • When a backup container checks whether the heartbeat value of the main container has been updated, it records the heartbeat value read in the previous cycle and compares the heartbeat value read in the current cycle with it. If the value currently read is the same as the recorded one, the heartbeat value of the main container has not been updated; if they differ, it has been updated.
  • For example, if the heartbeat value of the main container recorded by the backup container is 8 (that is, the main container updated the heartbeat value to 8 in the previous cycle) and the heartbeat value currently read by the backup container is 9, the backup container compares the two, determines that the main container has updated the heartbeat value, updates its recorded heartbeat value to 9, and continues to detect the heartbeat value periodically.
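  • A minimal sketch of this detection loop follows, reusing the hypothetical SharedStorage model above; the cycle length and the two-cycle miss threshold are parameters whose concrete values are assumed here.

```python
import time

DETECT_PERIOD_S = 1.0   # assumed length of one monitoring cycle
MISS_THRESHOLD = 2      # per the text: unchanged for two consecutive cycles


def master_has_failed(storage: "SharedStorage") -> bool:
    """Step S302 as a sketch: block until the master's heartbeat stalls."""
    last_seen = storage.read_heartbeat()
    misses = 0
    while True:
        time.sleep(DETECT_PERIOD_S)
        current = storage.read_heartbeat()
        if current == last_seen:
            misses += 1
            if misses >= MISS_THRESHOLD:
                return True   # heartbeat not updated: the master is faulty
        else:
            misses = 0        # the master is alive; record and keep watching
            last_seen = current
```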
  • When the main container updates the heartbeat value, it can first read the main container mark stored in the mark storage area 2310 and determine whether it is the same as its own mark, then read the heartbeat value in the heartbeat information storage area 2320 and determine whether it is the same as the heartbeat value it wrote in the previous cycle. If the read mark matches its own mark and the read heartbeat value matches the value written in the previous cycle, the main container updates the heartbeat value. The heartbeat value can be a number that the main container updates by incrementing: for example, if the heartbeat value stored at the end of the previous cycle is 15, it is updated to 16 in the current cycle.
  • If both checks pass, the main container remains in the main container state shown in FIG. 4. If the read mark differs from its own mark, or the read heartbeat value differs from the value written in the previous cycle, the current system has suffered an unpredictable failure, and the main container needs to switch state as shown in FIG. 4: it must exit and restart, and the whole container cluster must re-elect a new main container.
  • For example, an unstable network may cause the main container to fail when writing its own mark to the mark storage area 2310 or when updating the heartbeat value, without the main container being able to perceive this itself, so the mark it later reads differs from its own. Or the main container may have been disconnected for a long time and subsequently recovered (without perceiving it itself), while another standby container has written a new mark to the mark storage area 2310, so the mark the main container reads differs from its own.
  • The main container therefore periodically reads the mark stored in the mark storage area 2310 and compares it with its own mark to decide whether to update the heartbeat value or to exit and restart.
  • In extreme cases (for example, an unstable network makes the main container's connection intermittent, or an unpredictable failure of the storage device 2300 changes the content of the mark storage area), this lets the main container restart in time and avoid continuing to read and write the data in the storage device 2300, ensuring data consistency and application reliability.
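  • One cycle of this guarded update can be sketched as follows, again against the hypothetical SharedStorage model; signalling the exit-and-restart path with an exception is a choice of the sketch, not of the patent.

```python
def master_tick(storage: "SharedStorage", my_mark: bytes, last_written: int) -> int:
    """One heartbeat cycle of the main container.

    Returns the new heartbeat value, or raises to signal that the container
    must exit and restart so that the cluster can re-elect a main container.
    """
    if storage.read_mark() != my_mark:
        # Another container has written its mark: mastership was silently lost.
        raise RuntimeError("mark mismatch: exit and restart")
    if storage.read_heartbeat() != last_written:
        # The stored value is not what we wrote last cycle: unpredictable fault.
        raise RuntimeError("heartbeat mismatch: exit and restart")
    new_value = last_written + 1        # e.g. 15 -> 16, as in the text
    storage.write_heartbeat(new_value)
    return new_value
```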
  • Step S303: the container periodically writes its own mark to the mark storage area 2310 and, in the same cycle, reads the mark stored in the mark storage area 2310.
  • When the container determines at startup in step S301 that the mark storage area 2310 does not store the mark of a main container (that is, no main container exists), or determines in step S302 that the heartbeat value in the heartbeat information storage area 2320 has not changed (that is, the main container has failed), the container is in the election state shown in FIG. 4 and can compete to become the main container.
  • If there are multiple standby containers, they compete for the main container simultaneously.
  • As shown in FIG. 5, container 2121 first writes its own mark, mark 1, to the mark storage area 2310 at time t1.
  • Container 2211 writes its own mark, mark 2, to the mark storage area 2310 at time t2, where t1 is earlier than t2; therefore mark 2 written by container 2211 overwrites mark 1 written by container 2121, and the mark storage area 2310 stores mark 2.
  • Container 2121 reads the mark stored in the mark storage area 2310 at time t3, and the mark it reads is mark 2.
  • After a sleep period, container 2121 writes its own mark, mark 1, to the mark storage area 2310 again at time t5.
  • Container 2211 likewise reads the stored mark, which is mark 2, and after another sleep period writes its own mark, mark 2, to the mark storage area 2310 again at time t6.
  • Step S304: the container detects whether, within a preset period, the mark it reads from the mark storage area 2310 is the same as the mark it wrote N consecutive times. If they are the same, step S305 is executed; if not, step S302 is executed.
  • During the competition, each backup container first writes its own mark to the mark storage area 2310, then reads the stored mark, then writes again, repeating the cycle periodically.
  • Because a later write overwrites an earlier one, the mark a container reads each time is the mark most recently written by some container. For all but the last container to write, the mark read differs from the mark written; for the last container to write its own mark, such as container 2211 in FIG. 5, the mark read each time is the same as the mark written. If, within the preset period, the mark a container reads is the same as the mark it wrote N consecutive times, that container has been successfully upgraded to the new main container; the other containers abandon this round of competition, stop writing their own marks to the mark storage area 2310, return to detecting the heartbeat value in the heartbeat information storage area 2320, and wait for the next competition.
  • N is a positive integer greater than or equal to 1, for example 3 or 4, which is not limited in this application.
  • Each container periodically writes its mark, reads the stored mark, and checks whether the two are consistent, so that in the end exactly one container reads a mark identical to the mark it wrote every time, and that container is upgraded to the main container. This improves the accuracy of main container selection and guarantees that the elected main container is unique, avoiding the situation where multiple containers access the storage device at the same time and ensuring application reliability.
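  • A sketch of the whole election follows, with assumed values for the first and second preset durations; the last-write-wins overwrite and the N consecutive confirming reads follow the description above.

```python
import time

WRITE_PERIOD_S = 0.5        # assumed "first preset duration" between write and read
ELECTION_TIMEOUT_S = 10.0   # assumed "second preset duration" bounding the election
N_CONFIRMATIONS = 3         # e.g. N = 3 consecutive matching reads


def compete_for_master(storage: "SharedStorage", my_mark: bytes) -> bool:
    """Steps S303/S304 as a sketch: last write wins, confirmed by N re-reads."""
    deadline = time.monotonic() + ELECTION_TIMEOUT_S
    matches = 0
    while time.monotonic() < deadline:
        storage.write_mark(my_mark)         # may overwrite a rival's mark
        time.sleep(WRITE_PERIOD_S)          # sleep one period, then read back
        if storage.read_mark() == my_mark:
            matches += 1
            if matches >= N_CONFIRMATIONS:
                storage.write_heartbeat(0)  # S305: clear the heartbeat to zero
                return True
        else:
            matches = 0                     # a rival wrote later; keep competing
    return False  # election window over: go back to step S302 and keep monitoring
```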
  • Step S305: the container is upgraded to the main container and accesses the data in the storage device to provide external services.
  • After the container determines that it has been upgraded to the main container, it accesses the data in the storage device to provide services externally, clears the heartbeat value in the heartbeat information storage area 2320 to zero, and then updates the heartbeat value periodically; at this time the container is in the main container state shown in FIG. 4.
  • Steps S301 to S305 in the foregoing method embodiment are only schematic descriptions and summaries and should not constitute specific limitations; the steps involved can be added, reduced, or combined as needed.
  • FIGs. 3 and 4 describe scenarios in which a container cluster has multiple standby containers that compete to become the main container when the main container fails or no main container exists. If the cluster has only one standby container, then when step S302 detects that the heartbeat of the main container is not updated (that is, the main container has failed), the standby container can be upgraded to the main container directly: it simply writes its own mark to the mark storage area, without performing step S304, the step of determining the main container through repeated writing and reading.
  • The method provided by the present invention is also suitable for switching between physical machines and for switching between virtual machines.
  • Apart from the object being switched, the switching process for physical machines and virtual machines is the same as that for containers and is not repeated here.
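  • Pulling the sketches together, the per-node life cycle of FIGs. 3 and 4 might look as follows. This is illustrative only: the uuid-based mark and the return-to-restart policy are choices of the sketch, and, as noted above, a cluster with a single standby container could skip the election and take over directly.

```python
import time
import uuid

HEARTBEAT_PERIOD_S = 1.0   # assumed heartbeat update period of the main container


def run_node(storage: "SharedStorage") -> None:
    """State machine: standby (S301/S302) -> election (S303/S304) -> master (S305)."""
    my_mark = uuid.uuid4().bytes   # a mark that uniquely identifies this container
    while True:
        # S301: with no master mark present, compete at once; otherwise monitor (S302).
        if storage.read_mark() == b"" or master_has_failed(storage):
            if compete_for_master(storage, my_mark):       # S303/S304
                last_written = 0                           # heartbeat was cleared
                try:
                    while True:                            # S305: run as master
                        time.sleep(HEARTBEAT_PERIOD_S)
                        last_written = master_tick(storage, my_mark, last_written)
                except RuntimeError:
                    return   # exit and restart; the cluster re-elects a master
```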
  • In summary, a status flag of the master node, such as its heartbeat information, is kept in the storage device; the master node updates the heartbeat information periodically, and the standby node detects it periodically. If the heartbeat information is not updated, the standby node can compete for the master node. In this way, even if the master node and the standby node have not established a network connection through their respective physical machines, the standby node can detect the failure of the master node in time, compete to become the master node, and continue to provide services.
  • Likewise, by detecting the status flag of the master node, such as its heartbeat information, a backup physical machine can determine whether the main physical machine is faulty without establishing a network connection to it.
  • The standby node can therefore be deployed on any physical machine, for example on the same physical machine as the primary node, or on a physical machine that has no network connection with the primary node, which reduces the constraints on virtual machine and container deployment.
  • As shown in FIG. 6, the node 600 includes a detection module 610 and a takeover module 620.
  • The detection module 610 is used to detect the status flag of the master node stored in the storage device and determine, according to the status flag, whether the master node is faulty, where the master node is the node that accesses data in the storage device and provides services for users.
  • Specifically, the detection module 610 is configured to execute the aforementioned steps S301, S302, and S304, and optionally the optional methods in those steps.
  • The takeover module 620 is configured to take over the master node when the detection module 610 determines, according to the status flag, that the master node is faulty.
  • Specifically, the takeover module 620 is configured to execute the aforementioned steps S303 and S305, and optionally the optional methods in those steps.
  • In one implementation, the status flag is the heartbeat value of the master node; when detecting the status flag of the master node stored in the storage device and determining according to it whether the master node is faulty, the detection module 610 is specifically used to: periodically detect whether the heartbeat value of the master node stored in the storage device is updated; and if the heartbeat value of the master node is not updated, determine that the master node is faulty.
  • In one implementation, the storage device also stores the mark of the master node, and when taking over the master node, the takeover module 620 is specifically configured to update the mark of the master node in the storage device to the mark of the node 600.
  • In one implementation, when updating the mark of the primary node in the storage device to the mark of the standby node, the takeover module 620 is specifically configured to: write the mark of the node into the storage device every first preset duration and, after each first preset duration, read the mark stored in the storage device at the same interval; and within a second preset duration, when the mark read N consecutive times is the same as the written mark, stop writing the mark of the node to the storage device, where N is a positive integer greater than or equal to 1.
  • In one implementation, after taking over the master node, the takeover module 620 is further configured to clear the heartbeat value stored in the storage device to zero and update the heartbeat value periodically.
  • In one implementation, when periodically updating the heartbeat value, the takeover module 620 is specifically configured to: periodically read the mark and the heartbeat value stored in the storage device and determine whether the mark is the same as the mark of the standby node and whether the heartbeat value is the same as the heartbeat value written by the standby node in the previous cycle; and update the heartbeat value when the mark stored in the storage device is the same as the mark of the standby node and the heartbeat value is the same as the heartbeat value written by the standby node in the previous cycle.
  • In one implementation, before the detection module 610 detects the status flag of the master node stored in the storage device, the detection module 610 is further used to detect whether the storage device stores the mark of a master node; the takeover module 620 is further used to take over as the master node when the detection module detects that no master node mark is stored in the storage device.
  • FIG. 7 is a schematic structural diagram of a computing device provided by an embodiment of this application.
  • As shown in FIG. 7, the computing device 700 includes a processor 710, a communication interface 720, and a memory 730.
  • The processor 710, the communication interface 720, and the memory 730 are connected to each other through an internal bus 740.
  • The computing device may be a database server.
  • The computing device 700 may be the physical machine 2100 or 2200 in FIG. 2, in which a container or a virtual machine is built; the functions performed by the containers in FIG. 2 are actually performed by the processor 710 of the physical machine.
  • The processor 710 may be composed of one or more general-purpose processors, such as a central processing unit (CPU), or a combination of a CPU and a hardware chip.
  • The aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • The aforementioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • The bus 740 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others.
  • The bus 740 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
  • The memory 730 may include volatile memory, such as random access memory (RAM); the memory 730 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 730 may also include a combination of the above types.
  • The memory 730 stores program code, which may be used to implement the functional modules shown in the node 600, or to implement the method steps of the method embodiment shown in FIG. 3 with the standby node as the execution subject.
  • The embodiments of this application also provide a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it can implement some or all of the steps of any one of the above method embodiments, and realize the function of any one of the functional modules described in FIG. 6 above.
  • The embodiments of this application also provide a computer program product which, when run on a computer or a processor, enables the computer or the processor to execute one or more steps of any of the foregoing methods. If the component modules of the aforementioned device are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • The sequence numbers of the foregoing processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
  • In the embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways.
  • The device embodiments described above are only illustrative.
  • The division into units is only a logical functional division; in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of this application.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • The modules in the devices in the embodiments of this application may be combined, divided, or deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention concerns a method for node switching upon node failure and a related device. A master node and a standby node are simultaneously connected to a storage device, but only the master node can access user data in the storage device and provide a service for a user. The standby node can access a status flag of the master node in the storage device. During operation, the standby node monitors the status flag of the master node stored in the storage device and determines, according to the status flag, whether the master node has failed. If the standby node determines, according to the status flag, that the master node has failed, the standby node takes over from the master node. By means of this method, when multiple nodes sharing a storage device are not in contact with one another, the standby node is guaranteed to accurately detect the state of the master node and to take over when the master node fails, which improves the reliability of an application.
PCT/CN2020/097262 2019-07-08 2020-06-19 Method for node switching upon node failure and related device WO2021004256A1

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910612025.X 2019-07-08
CN201910612025 2019-07-08
CN201911057449.0A CN112199240B (zh) 2019-10-29 Method for node switching upon node failure and related device
CN201911057449.0 2019-10-29

Publications (1)

Publication Number Publication Date
WO2021004256A1

Family

ID=74004723

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/097262 WO2021004256A1 2019-07-08 2020-06-19 Method for node switching upon node failure and related device

Country Status (2)

Country Link
CN (1) CN112199240B (fr)
WO (1) WO2021004256A1 (fr)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113568707B * 2021-07-29 2024-06-25 中国船舶重工集团公司第七一九研究所 Computer control method and system for an offshore platform based on container technology
CN113986925A (zh) * 2021-10-28 2022-01-28 傲拓科技股份有限公司 Distributed time-series database and storage method therefor
CN116266152A (zh) * 2021-12-16 2023-06-20 中移(苏州)软件技术有限公司 Information processing method and apparatus, and storage medium


Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102231681B * 2011-06-27 2014-07-30 中国建设银行股份有限公司 High-availability cluster computer system and fault handling method therefor
US9411772B2 (en) * 2014-06-30 2016-08-09 Echelon Corporation Multi-protocol serial nonvolatile memory interface
CN104679907A * 2015-03-24 2015-06-03 新余兴邦信息产业有限公司 Method and system for implementing a high-availability, high-performance database cluster
US9836368B2 (en) * 2015-10-22 2017-12-05 Netapp, Inc. Implementing automatic switchover
US10855515B2 (en) * 2015-10-30 2020-12-01 Netapp Inc. Implementing switchover operations between computing nodes
US10243780B2 (en) * 2016-06-22 2019-03-26 Vmware, Inc. Dynamic heartbeating mechanism
CN108011737B * 2016-10-28 2021-06-01 华为技术有限公司 Failover method, apparatus, and system
CN107122271B * 2017-04-13 2020-07-07 华为技术有限公司 Method, apparatus, and system for recovering node events
CN109005045B * 2017-06-06 2022-01-25 北京金山云网络技术有限公司 Primary/standby service system and master node failure recovery method
CN109815049B * 2017-11-21 2021-03-26 北京金山云网络技术有限公司 Node crash recovery method and apparatus, electronic device, and storage medium
CN108880898B * 2018-06-29 2020-09-08 新华三技术有限公司 Method and apparatus for switching between primary and standby container systems

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5167028A (en) * 1989-11-13 1992-11-24 Lucid Corporation System for controlling task operation of slave processor by switching access to shared memory banks by master processor
CN106789246A * 2016-12-22 2017-05-31 广西防城港核电有限公司 Method and apparatus for switching between primary and standby servers
CN109302445A * 2018-08-14 2019-02-01 新华三云计算技术有限公司 Host node state determination method and apparatus, host node, and storage medium
CN109446169A * 2018-10-22 2019-03-08 北京计算机技术及应用研究所 Dual-controller disk array shared file system
CN109783280A * 2019-01-15 2019-05-21 上海海得控制系统股份有限公司 Shared storage system and shared storage method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116107814A * 2023-04-04 2023-05-12 阿里云计算有限公司 Database disaster recovery method, device, system, and storage medium
CN116107814B * 2023-04-04 2023-09-22 阿里云计算有限公司 Database disaster recovery method, device, system, and storage medium
CN116743550A * 2023-08-11 2023-09-12 之江实验室 Method for handling failed storage nodes in a distributed storage cluster
CN116743550B * 2023-08-11 2023-12-29 之江实验室 Method for handling failed storage nodes in a distributed storage cluster

Also Published As

Publication number Publication date
CN112199240A (zh) 2021-01-08
CN112199240B (zh) 2024-01-30

Similar Documents

Publication Publication Date Title
WO2021004256A1 Method for node switching upon node failure and related device
RU2596585C2 Data sending method, data receiving method, and data storage device
CN108616382B Method and apparatus for upgrading network interface card firmware, network interface card, and device
US7840662B1 (en) Dynamically managing a network cluster
US8601314B2 (en) Failover method through disk take over and computer system having failover function
US8041791B2 (en) Computer system, management server, and mismatched connection configuration detection method
US7007192B2 (en) Information processing system, and method and program for controlling the same
US9367412B2 (en) Non-disruptive controller replacement in network storage systems
US6968382B2 (en) Activating a volume group without a quorum of disks in the volume group being active
WO2021082465A1 Method for ensuring data consistency and related device
CN111147274B System and method for creating a highly available quorum set for cluster solutions
US11182252B2 (en) High availability state machine and recovery
US11734133B2 (en) Cluster system and fail-over control method of cluster system
CN115454329A Management method, apparatus, and device for storage cluster devices, and storage medium
US20210019221A1 (en) Recovering local storage in computing systems
CN114115703A Online migration method and system for bare-metal servers
JP6788188B2 Control device and control program
WO2024000535A1 Method and apparatus for updating a partition table, electronic device, and storage medium
CN118132350B CXL memory fault tolerance method, server system, storage medium, and electronic device
US7305497B2 (en) Performing resource analysis on one or more cards of a computer system wherein a plurality of severity levels are assigned based on a predetermined criteria
CN117453242A Application update method for a virtual machine, computing device, and computing system
CN115664943A Method and apparatus for determining a master-slave relationship, storage medium, and electronic device
CN115757266A SoC chip, electronic device, and method for preventing configuration data loss

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20837405

Country of ref document: EP

Kind code of ref document: A1

122 EP: PCT application non-entry in European phase

Ref document number: 20837405

Country of ref document: EP

Kind code of ref document: A1