WO2004114570A2 - Method of rebooting a multi-device cluster while maintaining cluster operation - Google Patents

Method of rebooting a multi-device cluster while maintaining cluster operation

Info

Publication number
WO2004114570A2
Authority
WO
WIPO (PCT)
Prior art keywords
cluster
members
rebooting
reboot
rebooted
Prior art date
Application number
PCT/IB2004/001929
Other languages
French (fr)
Other versions
WO2004114570A3 (en)
Inventor
Ajay Mittal
Laura Xu
Srikanth Koneru
Original Assignee
Nokia Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Inc. filed Critical Nokia Inc.
Priority to EP04736549A priority Critical patent/EP1644828A4/en
Publication of WO2004114570A2 publication Critical patent/WO2004114570A2/en
Publication of WO2004114570A3 publication Critical patent/WO2004114570A3/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/177 Initialisation or configuration control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1441 Resetting or repowering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/54 Link editing before load time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4405 Initialisation of multiprocessor systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]

Definitions

  • the term for starting software on a device is 'booting' (short for 'bootstrapping'); when this is performed on a device that is active, the term is 'rebooting'.
  • a reboot is normally performed for a variety of reasons, including: to activate new versions of the software; and to restore functionality of the device after a fatal error in the software that prevents the device's operation.
  • the reboot of devices requires special consideration, since maintenance of the cluster functionality is of utmost importance. Rebooting the cluster, however, may interfere with its operation. What is needed is a way to reboot members of a cluster such that the cluster operation is maintained.
  • the present invention is directed at rebooting a cluster while maintaining cluster operation.
  • cluster operation is automatically maintained during the reboot.
  • during the cluster reboot process, at least one member of the cluster remains active while the other members are rebooted.
  • a user such as an administrator triggers the cluster reboot process.
  • the administrator does not have to manually reboot each member of the cluster. Instead, the cluster reboot process handles the reboots of the members.
  • an algorithm is executed which reboots members of the cluster at different times. Rebooting all cluster members at the same time would cause the operation of the cluster to be lost until at least one member is restored to operation.
  • FIGURE 1 illustrates an exemplary cluster rebooting environment
  • FIGURE 2 illustrates an exemplary computing device that may be used
  • FIGURE 3 shows an exemplary architecture of a cluster
  • FIGURE 4 illustrates components of the RMB; and FIGURE 5 shows a process for rebooting a cluster; in accordance with aspects of the invention.
  • IP means any type of Internet Protocol.
  • node means a device that implements IP.
  • Router means a node that forwards IP packets not explicitly addressed to itself.
  • routable address means an identifier for an interface such that a packet is sent to the interface identified by that address.
  • link means a communication facility or medium over which nodes can communicate.
  • cluster refers to a group of nodes configured to act as a single node.
  • RMB Remote Management Broker
  • CS Configuration Subsystem
  • CLI Command Line Interface
  • CM Cluster Management
  • GUI Graphical User Interface
  • MAC Message Authentication Code
  • NM Network Management.
  • FIGURE 1 illustrates an exemplary cluster rebooting environment, in accordance with aspects of the invention.
  • rebooting environment 100 includes management computers 105 and 108, cluster 130, outside network 110, management network 120, routers 125, and inside network 145.
  • Cluster 130 includes nodes 135 that are arranged to act as a single node.
  • the networks may be wired or wireless networks that are coupled to wired or wireless devices.
  • the present invention is directed at rebooting a cluster while maintaining cluster operation. At least one member of the cluster stays active during the reboot process. An administrator triggers the reboot process and then does not have to perform any other steps during the reboot process. An algorithm is executed which reboots members of the cluster at different times while always maintaining operation of at least one member of the cluster.
  • routers are intermediary devices on a communications network that expedite message delivery.
  • a router receives transmitted messages and forwards them to their correct destinations over available routes.
  • a router acts as a link between LANs, enabling messages to be sent from one to another.
  • Communication links within LANs typically include twisted wire pair, fiber optics, or coaxial cable, while communication links between networks may utilize analog telephone lines, full or fractional dedicated digital lines including T1, T2, T3, and T4, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links, or other communications links.
  • ISDNs Integrated Services Digital Networks
  • DSLs Digital Subscriber Lines
  • Management computer 105 is coupled to management network 120 through communication mediums.
  • Management computer 108 is coupled to inside network 145 through communication mediums.
  • Management computers 105 and 108 may be used to manage a cluster, such as cluster 130, as well as to trigger a cluster reboot.
  • IP network 100 may include many more components than those shown in FIGURE 1. However, the components shown are sufficient to disclose an illustrative embodiment for practicing the present invention.
  • the media used to transmit information in the communication links as described above illustrates one type of computer-readable media, namely communication media.
  • computer-readable media includes any media that can be accessed by a computing device.
  • Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
  • modulated data signal means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media includes wired media such as twisted pair, coaxial cable, fiber optics, wave guides, and other wired media and wireless media such as acoustic, RF, infrared, and other wireless media.
  • FIGURE 2 illustrates an exemplary computing device that may be used in accordance with aspects of the invention.
  • node 200 is only shown with a subset of the components that are commonly found in a computing device.
  • a computing device that is capable of working in this invention may have more, fewer, or different components than those shown in FIGURE 2.
  • Node 200 may include various hardware components. In a very basic configuration, Node 200 typically includes central processing unit 202, system memory 204, and network component 216.
  • system memory 204 may include volatile memory, non-volatile memory, data storage devices, or the like. These examples of system memory 204 are all considered computer storage media.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by node 200. Any such computer storage media may be part of node 200.
  • Node 200 may include input component 212 for receiving input.
  • Input component 212 may include a keyboard, a touch screen, a mouse, or other input devices.
  • Output component 214 may include a display, speakers, printer, and the like.
  • Node 200 may also include network component 216 for communicating with other devices in an IP network.
  • network component 216 enables node 200 to communicate with mobile nodes and corresponding nodes.
  • Node 200 may be configured to use network component 216 to receive and send packets to and from the corresponding nodes and the mobile nodes. The communication may be wired or wireless. Signals sent and received by network component 216 are one example of communication media.
  • the term computer readable media as used herein includes both storage media and communication media.
  • System memory 204 typically includes an operating system 205, one or more applications 206, and data 207. As shown in the figure, system memory 204 may also include cluster rebooting program 208. Program 208 is a component for performing operations relating to rebooting a cluster as described herein. Program 208 includes computer-executable instructions for performing processes relating to cluster rebooting.
  • FIGURE 3 shows an exemplary architecture of a cluster, in accordance with aspects of the invention. As shown in the figure, cluster 300 includes nodes 305, 310, and 315; GUI 320; CLI 325; Configuration Subsystems 335, 340, and 345; and RMB 350.
  • the GUI and CLI may be configured to present a view of a node(s) within the cluster.
  • RMB 350 distributes information between the nodes within the cluster.
  • GUI 320 is configured to execute on a workstation (not shown) and interact with Configuration Subsystem 335 of device 305.
  • GUI 320 provides a graphical interface to perform operations relating to device 305. One of these operations is performing a reboot of a cluster.
  • CLI 325 provides a command line interface that allows the user to perform operations on device 305 by an application executing on device 305. The GUI and CLI associated with device 305 may also be used to trigger a cluster reboot.
  • RMB 350 is configured to communicate with device 305 and other devices (device 310 and device 315) within the cluster. RMB 350 may be included within device 305 or it may be separate from device 305. Generally, RMB 350 is used to communicate information between the members of the cluster.
  • GUI 320 is implemented as a set of Web pages in a browser and a Web Server.
  • the server may operate on a device within the cluster or a device separate from the cluster.
  • the server may operate on all or some of the cluster members.
  • CLI 325 is a management CLI that presents the cluster information relating to the device and the cluster textually to a user.
  • RMB 350 interacts with the configuration subsystems of the devices being rebooted.
  • the reboot process is stopped.
  • RMB 350 may be configured to restore the devices to the configurations they had before the reboot process began. This helps to ensure that all the members of the cluster maintain the same attributes.
  • RMB 350 may report the failure to the GUI and CCLI, or send the error to some other location.
  • the administrator may perform other operations. The reboot action is triggered by a control in an application using the
  • GUI Graphical User Interface
  • CLI Command Line Interface
  • the control or command causes a script to be run that performs the cluster rebooting process.
  • the script initiates a reboot by contacting each cluster member, providing an attribute that causes each member to temporarily be removed from the cluster, and then providing an attribute that causes the reboot operation to begin.
  • the script detects the loss of contact with the device and attempts to reestablish contact.
  • when the script has established contact, it internally records that the device has rebooted and informs the administrator which device has been rebooted.
  • the device from which the rebooting process is initiated is not rebooted until all of the other devices have been rebooted.
  • the reboot for all of the devices, except for the one on which the reboot is initiated, can either be performed sequentially (one device at a time) or in parallel.
  • the parallel method reduces the overall time needed to restore the cluster to full operation.
  • FIGURE 4 illustrates components of the RMB, in accordance with aspects of the invention.
  • RMB 400 includes RMB Client 420, configuration subsystem 410, RMB Server 440 and secure transport 435.
  • RMB Client 420 includes Cluster API (application programming interface) 425 and Remote API 430.
  • Cluster API 425 maintains information about the cluster's members.
  • Remote API 430 maintains information about each cluster member and tracks NM operations.
  • Secure Transport 435 delivers and receives messages to perform NM operations, such as the cluster reboot operation, and performs integrity checks on the messages.
  • RMB Server 440 is arranged to communicate with configuration subsystem 410 and communicates with RMB client 420 through secure transport 435.
  • Remote Management System 400 acts as the backbone for the nodes within the cluster.
  • RMB 400 provides base mechanisms including: discovering the members within the cluster; delivering queries and operations relating to NM attributes to the devices in the cluster; ensuring message integrity; an interface for management applications; and an interface to each device's local configuration subsystem.
  • RMB 400 also includes a secure mechanism for transporting the information in the messages sent between the nodes within the cluster.
  • RMB 400 is also configured to automatically query the nodes it is coupled with in order to determine the cluster members. These queries are performed periodically to help ensure that all cluster members are available at any given time. According to one embodiment, RMB 400 ensures consistency of the configuration by using database transactions: a transaction begins whenever an attribute is to be changed, a 'commit' database operation is applied if the change succeeds on all devices, and a 'rollback' operation is applied when the change fails on any device. The RMB may implement these transactions either internally or by using the transaction capabilities of the Configuration Subsystem. According to one embodiment, the Configuration Subsystem's transactions are used since these may be complicated operations.
  • RMB Client 420 uses Cluster API 425 to discover the cluster's member devices.
  • RMB 400 uses messages to perform system and NM operations. The system operations include acquiring and releasing the configuration lock.
  • the RMB fills in the header and delivers the message.
  • the RMB checks the header and accepts the message only if values in the fields of the header are valid. The RMB discards any message whose header has invalid values in the fields.
  • RMB Client 420 composes the body of an RMB message and uses Cluster API 425 to deliver the message to the cluster members, receive the responses from the members, and extract the result of the operation from the message.
  • Remote API 430 delivers the message to a particular cluster member and checks that a response message is received for every request message sent.
  • Secure Transport 435 is the transport mechanism that actually sends and receives the messages.
  • the RMB Client can be implemented as a collection of shared-object libraries with well-defined Application Programming Interfaces (APIs). CGUI and CCLI can use these APIs to interact with the RMB to perform NM operations.
  • APIs Application Programming Interfaces
  • the RMB Server can be implemented as a daemon that is launched during system start-up.
  • RMB's Secure Transport can be implemented as a Secure Sockets Layer (SSL) socket. This provides an extra layer of security by providing the ability to encrypt the RMB messages.
  • SSL Secure Sockets Layer
  • FIGURE 5 shows a process for rebooting a cluster, in accordance with aspects of the invention.
  • process 500 flows to block 505 where a list of cluster members is obtained.
  • the list of cluster members is used to help ensure that all of the cluster members are rebooted.
  • the identity of the member on which the reboot is initiated is obtained.
  • a reboot is performed on each member of the cluster other than the member who initiated the reboot.
  • the cluster members minus the initiating member are rebooted in parallel. For example, if there are five members of the cluster then four of the five members are rebooted at the same time.
  • the members may be rebooted in any order, so long as at least one member remains active during the rebooting of the other members.
  • at decision block 520, a determination is made as to whether an error occurred during the cluster reboot on the members other than the initiating member. When an error occurs, the process flows to block 530, where the reboot process is halted. Otherwise, the process transitions to block 525, where a reboot is performed on the member initiating the cluster reboot.
  • at decision block 530, a determination is made as to whether an error occurred during any step of the cluster reboot. When an error occurs, the process flows to block 530, where the reboot process is halted. When there are no errors, the process flows to an end block and returns to processing other actions.
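
The ordering described for process 500 (obtain the member list, reboot every member except the initiating one, halt on error, reboot the initiating member last) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: `reboot_cluster`, `RebootError`, and the `reboot_member` callback are hypothetical names, and the callback is assumed to hide all device contact (removing the member from the cluster, triggering the reboot, and waiting until contact is re-established).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


class RebootError(Exception):
    """Raised when one or more members fail to come back after a reboot."""


def reboot_cluster(
    members: Iterable[str],
    initiator: str,
    reboot_member: Callable[[str], bool],
    parallel: bool = True,
) -> List[str]:
    """Reboot every member, leaving the initiating member for last.

    reboot_member is a hypothetical callback: it reboots one member and
    returns True once contact with that member has been re-established.
    """
    members = list(members)  # the list of cluster members (block 505)
    if initiator not in members:
        raise ValueError("initiator must be a cluster member")
    others = [m for m in members if m != initiator]

    # Reboot all members except the initiator, which stays active meanwhile.
    if parallel and others:
        with ThreadPoolExecutor(max_workers=len(others)) as pool:
            results = list(pool.map(reboot_member, others))
    else:
        results = []
        for member in others:
            ok = reboot_member(member)
            results.append(ok)
            if not ok:
                break  # sequential mode: stop at the first failure

    failed = [m for m, ok in zip(others, results) if not ok]
    if failed:
        # Halt here: the initiator is not rebooted, so it remains active.
        raise RebootError(f"reboot failed on: {', '.join(failed)}")

    # Only after the other members are back is the initiator rebooted.
    if not reboot_member(initiator):
        raise RebootError(f"reboot failed on: {initiator}")
    return others + [initiator]
```

Deferring the initiating member keeps at least one member active in both the sequential and the parallel variant. The sketch does not handle a single-member cluster, nor the fact that a script running on the initiating member would itself stop when that member reboots.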

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Multimedia (AREA)
  • Stored Programmes (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The present invention is directed at rebooting a cluster (130) while maintaining cluster operation. Cluster operation is automatically maintained during the reboot since at least one member of the cluster stays active during the process. An administrator (105, 108) triggers the reboot process and then does not have to perform any other steps during the reboot process. An algorithm executes which reboots members of the cluster at different times, while always maintaining operation of at least one member of the cluster.

Description

METHOD OF REBOOTING A MULTI-DEVICE CLUSTER WHILE MAINTAINING CLUSTER OPERATION
Background of the Invention
Equipment that provides a high degree of reliability is a prime consideration of organizations that supply Internet and Intranet services. To help meet this need, technology has become available to combine several devices into a cluster that is configured to act as a single device. Using the cluster arrangement, it is intended that the failure of one device does not significantly affect the remaining components within the cluster.
The term for starting software on a device is 'booting' (short for 'bootstrapping'); when this is performed on a device that is active, the term is 'rebooting'. A reboot is normally performed for a variety of reasons, including: to activate new versions of the software; and to restore functionality of the device after a fatal error in the software that prevents the device's operation.
In a cluster environment, the reboot of devices requires special consideration, since maintenance of the cluster functionality is of utmost importance. Rebooting the cluster, however, may interfere with its operation. What is needed is a way to reboot members of a cluster such that the cluster operation is maintained.
Summary of the Invention
The present invention is directed at rebooting a cluster while maintaining cluster operation.
According to one aspect of the invention, cluster operation is automatically maintained during the reboot. During the cluster reboot process at least one member of the cluster remains active during the rebooting of the other members.
According to another aspect of the invention, a user, such as an administrator, triggers the cluster reboot process. The administrator does not have to manually reboot each member of the cluster. Instead, the cluster reboot process handles the reboots of the members.
According to another aspect, an algorithm is executed which reboots members of the cluster at different times. Rebooting all cluster members at the same time would cause the operation of the cluster to be lost until at least one member is restored to operation.
Brief Description of the Drawings
FIGURE 1 illustrates an exemplary cluster rebooting environment; FIGURE 2 illustrates an exemplary computing device that may be used; FIGURE 3 shows an exemplary architecture of a cluster;
FIGURE 4 illustrates components of the RMB; and FIGURE 5 shows a process for rebooting a cluster; in accordance with aspects of the invention.
Detailed Description of the Preferred Embodiment
In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings, which form a part hereof, and in which are shown, by way of illustration, specific exemplary embodiments in which the invention may be practiced. Each embodiment is described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The term "IP" means any type of Internet Protocol. The term "node" means a device that implements IP. The term "router" means a node that forwards IP packets not explicitly addressed to itself. The term "routable address" means an identifier for an interface such that a packet is sent to the interface identified by that address. The term "link" means a communication facility or medium over which nodes can communicate. The term "cluster" refers to a group of nodes configured to act as a single node. The following abbreviations are used throughout the specification and claims: RMB = Remote Management Broker; CS = Configuration Subsystem; CLI = Command Line Interface; CM = Cluster Management; GUI = Graphical User Interface; MAC = Message Authentication Code; and NM = Network Management.
Referring to the drawings, like numbers indicate like parts throughout the views. Additionally, a reference to the singular includes a reference to the plural unless otherwise stated or is inconsistent with the disclosure herein.
FIGURE 1 illustrates an exemplary cluster rebooting environment, in accordance with aspects of the invention. As shown in the figure, rebooting environment 100 includes management computers 105 and 108, cluster 130, outside network 110, management network 120, routers 125, and inside network 145. Cluster 130 includes nodes 135 that are arranged to act as a single node. The networks may be wired or wireless networks that are coupled to wired or wireless devices.
The present invention is directed at rebooting a cluster while maintaining cluster operation. At least one member of the cluster stays active during the reboot process. An administrator triggers the reboot process and then does not have to perform any other steps during the reboot process. An algorithm is executed which reboots members of the cluster at different times while always maintaining operation of at least one member of the cluster.
As illustrated, inside network 145 is an IP packet-based backbone network that includes routers, such as routers 125, to connect the support nodes in the network. Routers are intermediary devices on a communications network that expedite message delivery. On a single network linking many computers through a mesh of possible connections, a router receives transmitted messages and forwards them to their correct destinations over available routes. On an interconnected set of LANs, including those based on differing architectures and protocols, a router acts as a link between LANs, enabling messages to be sent from one to another. Communication links within LANs typically include twisted wire pair, fiber optics, or coaxial cable, while communication links between networks may utilize analog telephone lines, full or fractional dedicated digital lines including T1, T2, T3, and T4, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links, or other communications links.
Management computer 105 is coupled to management network 120 through communication mediums. Management computer 108 is coupled to inside network 145 through communication mediums. Management computers 105 and 108 may be used to manage a cluster, such as cluster 130, as well as to trigger a cluster reboot.
Furthermore, computers, and other related electronic devices may be connected to network 110, network 120, and network 145. The public Internet itself may be formed from a vast number of such interconnected networks, computers, and routers. IP network 100 may include many more components than those shown in FIGURE 1. However, the components shown are sufficient to disclose an illustrative embodiment for practicing the present invention.
The media used to transmit information in the communication links as described above illustrates one type of computer-readable media, namely communication media. Generally, computer-readable media includes any media that can be accessed by a computing device. Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, communication media includes wired media such as twisted pair, coaxial cable, fiber optics, wave guides, and other wired media and wireless media such as acoustic, RF, infrared, and other wireless media.
FIGURE 2 illustrates an exemplary computing device that may be used in accordance with aspects of the invention. For illustrative purposes, node 200 is only shown with a subset of the components that are commonly found in a computing device. A computing device that is capable of working in this invention may have more, fewer, or different components than those shown in FIGURE 2. Node 200 may include various hardware components. In a very basic configuration, Node 200 typically includes central processing unit 202, system memory 204, and network component 216.
Depending on the exact configuration and type of computing device, system memory 204 may include volatile memory, non-volatile memory, data storage devices, or the like. These examples of system memory 204 are all considered computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by node 200. Any such computer storage media may be part of node 200.
Node 200 may include input component 212 for receiving input. Input component 212 may include a keyboard, a touch screen, a mouse, or other input devices. Output component 214 may include a display, speakers, printer, and the like. Node 200 may also include network component 216 for communicating with other devices in an IP network. In particular, network component 216 enables node 200 to communicate with mobile nodes and corresponding nodes. Node 200 may be configured to use network component 216 to receive and send packets to and from the corresponding nodes and the mobile nodes. The communication may be wired or wireless. Signals sent and received by network component 216 are one example of communication media. The term computer readable media as used herein includes both storage media and communication media.
Software components of node 200 are typically stored in system memory 204. System memory 204 typically includes an operating system 205, one or more applications 206, and data 207. As shown in the figure, system memory 204 may also include cluster rebooting program 208. Program 208 is a component for performing operations relating to rebooting a cluster as described herein. Program 208 includes computer-executable instructions for performing processes relating to cluster rebooting.
FIGURE 3 shows an exemplary architecture of a cluster, in accordance with aspects of the invention. As shown in the figure, cluster 300 includes nodes 305, 310, and 315; GUI 320; CLI 325; Configuration Subsystems 335, 340, and 345; and RMB 350.
The GUI and CLI may be configured to present a view of a node(s) within the cluster. RMB 350 distributes information between the nodes within the cluster.
According to one embodiment, GUI 320 is configured to execute on a workstation (not shown) and interact with Configuration Subsystem 335 of device 305. GUI 320 provides a graphical interface to perform operations relating to device 305. One of these operations is performing a reboot of a cluster. CLI 325 provides a command line interface that allows the user to perform operations on device 305 through an application executing on device 305. The GUI and CLI associated with device 305 may also be used to trigger a cluster reboot.
RMB 350 is configured to communicate with device 305 and other devices (device 310 and device 315) within the cluster. RMB 350 may be included within device 305 or it may be separate from device 305. Generally, RMB 350 is used to communicate information between the members of the cluster.
According to one embodiment, the system acquires exclusive authority of the cluster during the reboot process. This helps to prevent more than one user or system from affecting the devices during the reboot. According to one embodiment, GUI 320 is implemented as a set of Web pages in a browser and a Web Server. The server may operate on a device within the cluster or a device separate from the cluster. The server may operate on all or some of the cluster members.
CLI 325 is a management CLI that presents information relating to the device and the cluster textually to a user. When the reboot process is initiated, RMB 350 interacts with the configuration subsystems of the devices being rebooted. According to one embodiment, when an error occurs during a reboot of one of the cluster members, the reboot process is stopped. According to one embodiment, RMB 350 may be configured to restore the devices to the configurations they had before the reboot process began. This helps to ensure that all the members of the cluster maintain the same attributes. When a problem occurs, RMB 350 may indicate to the GUI and CCLI that there was a failure, or send the error to some other location. When the rebooting is complete, the administrator may perform other operations.
The reboot action is triggered by a control in an application using the Graphical User Interface (GUI) or a command in a Command Line Interface (CLI) shell.
The control or command causes a script to be run that performs the cluster rebooting process. The script initiates a reboot by contacting each cluster member, providing an attribute that causes each member to temporarily be removed from the cluster, and then providing an attribute that causes the reboot operation to begin. The script then detects the loss of contact with the device and attempts to reestablish contact. When the script has established contact, it internally indicates that that device is now rebooted and informs the administrator which device has been rebooted. According to one embodiment, the device from which the rebooting process is initiated is not rebooted until all of the other devices have been rebooted.
The reboot for all of the devices, except for the one on which the reboot is initiated, can either be performed sequentially (one device at a time) or in parallel. The parallel method reduces the overall time needed to restore the cluster to full operation.
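The script described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the invention: the attribute names `in_cluster` and `reboot`, the callbacks `send_attribute` and `is_reachable`, and the timeout values are all hypothetical placeholders for whatever the RMB actually provides. The sketch removes each member from the cluster, triggers its reboot, waits for contact to be lost and then re-established, stops if contact cannot be re-established, and reboots the initiating member only after all others have returned.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class RebootError(Exception):
    """Raised when contact with a rebooted member is not re-established."""


def trigger_reboot(member, send_attribute):
    # Attribute names are placeholders, not taken from the specification.
    send_attribute(member, "in_cluster", False)  # temporarily leave the cluster
    send_attribute(member, "reboot", True)       # start the reboot operation


def wait_for_return(member, is_reachable, timeout=300.0, poll=1.0,
                    sleep=time.sleep):
    """Detect loss of contact, then wait until the member answers again."""
    deadline = time.monotonic() + timeout
    seen_down = False
    while time.monotonic() < deadline:
        if not is_reachable(member):
            seen_down = True          # contact lost: the reboot is under way
        elif seen_down:
            return member             # contact re-established: rebooted
        sleep(poll)
    raise RebootError(member)


def reboot_member(member, send_attribute, is_reachable, **wait_opts):
    trigger_reboot(member, send_attribute)
    return wait_for_return(member, is_reachable, **wait_opts)


def reboot_cluster(members, initiator, send_attribute, is_reachable,
                   parallel=True, **wait_opts):
    """Reboot every member except the initiator, then the initiator last."""
    others = [m for m in members if m != initiator]
    if parallel and others:
        with ThreadPoolExecutor(max_workers=len(others)) as pool:
            futures = [pool.submit(reboot_member, m, send_attribute,
                                   is_reachable, **wait_opts) for m in others]
            rebooted = [f.result() for f in futures]  # re-raises RebootError
    else:
        rebooted = [reboot_member(m, send_attribute, is_reachable, **wait_opts)
                    for m in others]
    # Reached only if every other member came back; a failure halts earlier.
    trigger_reboot(initiator, send_attribute)
    return rebooted + [initiator]
```

In the sequential branch a failure stops the loop before any later member is touched. In the parallel branch all reboots have already been triggered by the time a failure is noticed, so only the initiating member is guaranteed to be left untouched.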
If the reboot fails on any of the devices, as indicated by failure to reestablish contact with the device, the reboot process halts, thereby preserving the state of the devices not rebooted. The administrator is informed that the cluster reboot has been stopped prematurely along with the identity of the device or devices that have failed.
FIGURE 4 illustrates components of the RMB, in accordance with aspects of the invention. As illustrated in the figure, RMB 400 includes RMB Client 420, configuration subsystem 410, RMB Server 440 and secure transport 435. RMB Client 420 includes Cluster API (application programming interface) 425 and Remote API 430. Cluster API 425 maintains information about the cluster's members. Remote API 430 maintains information about each cluster member and tracks NM operations. Secure Transport 435 delivers and receives messages to perform NM operations, such as the cluster reboot operation, and performs integrity checks on the messages. RMB Server 440 is arranged to communicate with configuration subsystem 410 and communicates with RMB client 420 through secure transport 435.
Remote Management Broker (RMB) 400 acts as the backbone for the nodes within the cluster. RMB 400 provides base mechanisms including: discovering the members within the cluster; delivering queries and operations relating to NM attributes to the devices in the cluster; ensuring message integrity; an interface for management applications; and an interface to each device's local configuration subsystem. RMB 400 also includes a secure mechanism for transporting the information in the messages sent between the nodes within the cluster.
RMB 400 is also configured to automatically query the nodes it is coupled with in order to determine the cluster members. These queries are performed periodically to help ensure that all cluster members are available at any given time. According to one embodiment, RMB 400 ensures consistency of the configuration by using database transactions. For example, a transaction is begun whenever an attribute is to be changed; a 'commit' database operation is applied if the change is successful on all devices, and a 'rollback' operation is applied when the change fails on any device. The RMB may implement these transactions either internally or by using the transaction capabilities of the Configuration Subsystem. According to one embodiment, the Configuration Subsystem's transactions are used since these may be complicated operations.
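The commit and rollback behavior described above can be illustrated with a short sketch. The `Device` class and its `begin`, `set_attr`, `commit`, and `rollback` methods are hypothetical in-memory stand-ins for a device's Configuration Subsystem; the specification does not define such an interface, and a real system would also have to handle failures during the commit phase itself.

```python
class Device:
    """Hypothetical in-memory stand-in for one device's configuration store."""

    def __init__(self, name, attrs=None, reject=()):
        self.name = name
        self.attrs = dict(attrs or {})
        self.reject = set(reject)   # attribute names this device refuses
        self._snapshot = None

    def begin(self):
        # Remember the configuration as it was before the change.
        self._snapshot = dict(self.attrs)

    def set_attr(self, key, value):
        if key in self.reject:
            raise ValueError(f"{self.name} rejected {key}")
        self.attrs[key] = value

    def commit(self):
        self._snapshot = None

    def rollback(self):
        self.attrs = self._snapshot
        self._snapshot = None


def change_attribute(devices, key, value):
    """Apply one attribute change to every device, all or nothing."""
    started = []
    try:
        for dev in devices:
            dev.begin()
            started.append(dev)
            dev.set_attr(key, value)
    except Exception:
        # The change failed on some device: undo it everywhere it was begun.
        for dev in started:
            dev.rollback()
        return False
    # The change succeeded on all devices: make it permanent.
    for dev in devices:
        dev.commit()
    return True
```

With two devices that accept the change, `change_attribute` returns `True` and both hold the new value; if any device rejects it, every device keeps its earlier configuration, which is the consistency property the paragraph describes.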
RMB Client 420 uses Cluster API 425 to discover the cluster's member devices. RMB 400 uses messages to perform system and NM operations. The system operations include acquiring and releasing the configuration lock. When a message is to be sent, the RMB fills in the header and delivers the message. When a message is received, the RMB checks the header and accepts the message only if the values in the header fields are valid. The RMB discards any message whose header fields contain invalid values.
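The fill-in-and-check behavior can be sketched as below. The specification does not state which fields the RMB header contains, so the layout used here (a magic tag, a version number, the body length, and a CRC-32 of the body) is purely an assumption chosen for illustration, as are the names `pack_message` and `unpack_message`.

```python
import struct
import zlib

# Assumed header layout, network byte order: magic, version, body length, CRC-32.
MAGIC = b"RMB0"      # hypothetical tag, not from the specification
VERSION = 1          # hypothetical protocol version
HEADER = struct.Struct("!4sBII")


def pack_message(body: bytes) -> bytes:
    """Fill in the header for a body and return the complete message."""
    return HEADER.pack(MAGIC, VERSION, len(body), zlib.crc32(body)) + body


def unpack_message(data: bytes):
    """Return the body if every header field is valid, otherwise None."""
    if len(data) < HEADER.size:
        return None
    magic, version, length, crc = HEADER.unpack_from(data)
    body = data[HEADER.size:]
    if magic != MAGIC or version != VERSION:
        return None
    if length != len(body) or crc != zlib.crc32(body):
        return None
    return body
```

Returning `None` models discarding the message; a receiver would simply drop such input rather than act on it.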
RMB Client 420 composes the body of an RMB message and uses Cluster API 425 to deliver the message to the cluster members, receive the responses from the members, and extract the result of the operation from each response. Remote API 430 delivers the message to a particular cluster member and checks that a response message is received for every request message sent. Secure Transport 435 is the transport mechanism that actually sends and receives the messages.
The RMB Client can be implemented as a collection of shared-object libraries with well-defined Application Programming Interfaces (APIs). CGUI and CCLI can use these APIs to interact with the RMB to perform NM operations.
The RMB Server can be implemented as a daemon that is launched during system start-up.
RMB's Secure Transport can be implemented as a Secure Sockets Layer (SSL) socket. This provides an extra layer of security by providing the ability to encrypt the RMB messages.
FIGURE 5 shows a process for rebooting a cluster, in accordance with aspects of the invention. After a start block, process 500 flows to block 505 where a list of cluster members is obtained. The list of cluster members is used to help ensure that all of the cluster members are rebooted. Moving to block 510, the identity of the member on which the reboot is initiated is obtained. Flowing to block 515, a reboot is performed on each member of the cluster other than the member that initiated the reboot. According to one embodiment, the cluster members minus the initiating member are rebooted in parallel. For example, if there are five members of the cluster then four of the five members are rebooted at the same time. As discussed above, the members may be rebooted in any order, so long as at least one member remains active during the rebooting of the other members. Moving to decision block 520, a determination is made as to whether an error occurred during the cluster reboot on the members other than the initiating member. When an error occurs, the process flows to block 530, where the reboot process is halted. When no error occurs, the process transitions to block 525, where a reboot is performed on the member initiating the cluster reboot. Moving to decision block 530, a determination is made as to whether an error occurred during any step of the cluster reboot. When an error occurs, the process flows to block 530, where the reboot process is halted. When there are no errors, the process flows to an end block and returns to processing other actions.
The above specification, examples and data provide a complete description of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.

Claims

WHAT IS CLAIMED IS:
1. A method for rebooting a cluster, comprising: initiating a reboot of the cluster; determining cluster members; and rebooting each of the cluster members while at least one of the cluster members remains active while the other cluster members are being rebooted.
2. The method of Claim 1, further comprising determining an initiating cluster member that initiated the reboot and controlling the rebooting from the initiating cluster member.
3. The method of Claim 1, wherein rebooting each of the cluster members while the at least one of the cluster members remains active while the other cluster members are being rebooted further comprises rebooting the cluster members other than the at least one of the cluster members that remains active in parallel.
4. The method of Claim 2, wherein the at least one of the cluster members that is maintaining normal operation is the initiating cluster member.
5. The method of Claim 1, wherein initiating the reboot of the cluster is performed by a user.
6. The method of Claim 1, wherein rebooting each of the cluster members, further comprises removing the cluster member being rebooted and determining when the removed cluster member has been rebooted.
7. The method of Claim 6, wherein determining when the removed cluster member has been rebooted further comprises attempting to re-establish contact with the removed cluster member.
8. The method of Claim 1, further comprising halting the reboot process when it is determined that an error occurs during the reboot process.
9. A system for rebooting a cluster while maintaining operation of the cluster, comprising: a network interface configured to communicate with cluster members; a memory configured to store information relating to the cluster; a remote management broker (RMB) configured to distribute information to the cluster members; and a processor configured to perform actions, including: initiating a reboot of the cluster; determining the cluster members; and rebooting each of the cluster members while at least one of the cluster members remains active while the other cluster members are being rebooted.
10. The system of Claim 9, further comprising determining an initiating cluster member that initiated the reboot and controlling the rebooting from the initiating cluster member.
11. The system of Claim 9, wherein rebooting each of the cluster members further comprises rebooting each cluster member other than at least one cluster member that remains active in parallel.
12. The system of Claim 11, wherein the at least one of the cluster member is the initiating cluster member.
13. The system of Claim 9, further comprising a user interface used to initiate the reboot of the cluster.
14. The system of Claim 10, wherein rebooting each of the cluster members, further comprises removing the cluster member being rebooted and determining when the removed cluster member has been rebooted.
15. The system of Claim 14, wherein determining when the removed cluster member has been rebooted further comprises attempting to re-establish contact with the removed cluster member.
16. The system of Claim 9, further comprising halting the reboot process when it is determined that an error occurs during the reboot process.
17. An apparatus for rebooting a cluster while maintaining operation of the cluster, comprising: means for initiating a reboot of the cluster; means for determining cluster members; and means for rebooting each of the cluster members while at least one of the cluster members remains active while the other cluster members are being rebooted.
18. The apparatus of Claim 17, wherein the means for rebooting each of the cluster members, further comprises means for removing the cluster member being rebooted and means for determining when the removed cluster member has been rebooted.
19. The apparatus of Claim 18, wherein determining when the removed cluster member has been rebooted further comprises means for attempting to reestablish contact with the removed cluster member.
PCT/IB2004/001929 2003-06-25 2004-06-10 Method of rebooting a multi-device cluster while maintaining cluster operation WO2004114570A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP04736549A EP1644828A4 (en) 2003-06-25 2004-06-10 Method of rebooting a multi-device cluster while maintaining cluster operation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/606,645 2003-06-25
US10/606,645 US7076645B2 (en) 2003-06-25 2003-06-25 Method of rebooting a multi-device cluster while maintaining cluster operation

Publications (2)

Publication Number Publication Date
WO2004114570A2 (en) 2004-12-29
WO2004114570A3 (en) 2005-04-14

Family

ID=33540118

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2004/001929 WO2004114570A2 (en) 2003-06-25 2004-06-10 Method of rebooting a multi-device cluster while maintaining cluster operation

Country Status (5)

Country Link
US (1) US7076645B2 (en)
EP (1) EP1644828A4 (en)
KR (1) KR100792280B1 (en)
CN (1) CN100481004C (en)
WO (1) WO2004114570A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003021465A1 (en) 2001-09-05 2003-03-13 Pluris, Inc. Method and apparatus for performing a software upgrade of a router while the router is online

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5469542A (en) * 1991-07-22 1995-11-21 International Business Machines Corporation Serial diagnostic interface bus for multiprocessor systems
US6044461A (en) * 1997-09-16 2000-03-28 International Business Machines Corporation Computer system and method of selectively rebooting the same in response to a system program code update
US6324692B1 (en) * 1999-07-28 2001-11-27 Data General Corporation Upgrade of a program
US6779176B1 (en) * 1999-12-13 2004-08-17 General Electric Company Methods and apparatus for updating electronic system programs and program blocks during substantially continued system execution
US6757836B1 (en) * 2000-01-10 2004-06-29 Sun Microsystems, Inc. Method and apparatus for resolving partial connectivity in a clustered computing system
GB2359385B (en) * 2000-02-16 2004-04-07 Data Connection Ltd Method for upgrading running software processes without compromising fault-tolerance
US6691244B1 (en) * 2000-03-14 2004-02-10 Sun Microsystems, Inc. System and method for comprehensive availability management in a high-availability computer system
US6854069B2 (en) * 2000-05-02 2005-02-08 Sun Microsystems Inc. Method and system for achieving high availability in a networked computer system
EP1231537A1 (en) * 2001-02-09 2002-08-14 Siemens Aktiengesellschaft Automatic turn-on of a computer cluster after a curable failure
US20030149735A1 (en) * 2001-06-22 2003-08-07 Sun Microsystems, Inc. Network and method for coordinating high availability system services

Also Published As

Publication number Publication date
CN100481004C (en) 2009-04-22
CN1864134A (en) 2006-11-15
KR20060026877A (en) 2006-03-24
US20040268112A1 (en) 2004-12-30
EP1644828A2 (en) 2006-04-12
US7076645B2 (en) 2006-07-11
KR100792280B1 (en) 2008-01-08
WO2004114570A3 (en) 2005-04-14
EP1644828A4 (en) 2008-01-23

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1020057024106

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 20048176357

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2004736549

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020057024106

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 2004736549

Country of ref document: EP

DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)