MXPA98005490A - Dynamic changes in configurac - Google Patents

Dynamic changes in configurac

Info

Publication number
MXPA98005490A
Authority
MX
Mexico
Prior art keywords
configuration
computer program
configuration change
transaction
change transaction
Prior art date
Application number
MXPA/A/1998/005490A
Other languages
Spanish (es)
Inventor
W Arendt James
Chao Chingyun
David Kistler Michael
Daniel Lawlor Frank
Augusto Mancisidor Rodolfo
Ramanathan Jayashree
Raymond Strong Hovey
Original Assignee
International Business Machines Corporation
Priority date
Filing date
Publication date
Application filed by International Business Machines Corporation
Publication of MXPA98005490A

Abstract

Configuration changes are applied dynamically to a group multiprocessing system by placing a configuration change event on the event waiting list. When the configuration change event is processed, the previous configuration is backed up and each computer program component applies its relevant portion of a configuration change transaction in an ordered, synchronized manner. Each computer program component applies its portion of the transaction either by reinitialization or by means of a logged transition operation. If the configuration change transaction fails, the computer program components roll back the portions of the configuration change already applied, in an ordered, synchronized manner, to restore the previous configuration. Multiple events for different configuration changes can be placed on the waiting list.

Description

redundant, regardless of whether the component is a processor, memory card, hard disk drive, adapter, power supply, etc. While providing seamless switchover and continuous operation, fault-tolerant systems are expensive because they require redundant computer equipment.

access to shared resources. A node can "own" a set of resources (disks, volume groups, file systems, networks, network addresses, and/or applications) as long as that node is available. When that node is deactivated, access to the resources is provided through a different node.

An active configuration comprises a set of computer equipment and computer program entities together with a set of relationships among these entities; the combination of entities and relationships provides services to users. Computer equipment entities specify nodes, adapters, shared disks, etc., while computer program entities specify redirection and reintegration policies. For example, a particular computer program entity may specify that an application server must be redirected to node B when node A fails. It may also specify whether the application server should return to node A when node A is reinstated.

Within group multiprocessing systems, it would be advantageous to be able to reconfigure mission-critical systems that cannot be deactivated for long periods of time (and preferably without deactivating them at all). An example of a situation that requires uninterrupted support for dynamic configuration changes is a computer equipment upgrade within a group of four nodes (nodes A, B, C, and D). A user might need to deactivate a node, such as node D, upgrade its computer equipment, reconnect node D to the group, and possibly make configuration changes. If node D were equipped with a faster processor and/or additional memory, for example, the user might want node D to become the primary system for an application server that previously ran on a different node. The user will want to make these changes and will want them to be preserved across power outages and group restarts.

Another example of a situation that requires dynamic configuration changes involves transient changes. If the workload of a node increases temporarily, the user may wish to move an application server that previously ran on that system to another node. Since the increase in workload is not normal, the change need not be preserved across group restarts.
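The redirection and reintegration policy described above (redirect an application server to node B when node A fails, and optionally return it to node A when node A is reinstated) can be illustrated with a minimal Python sketch. The names `FailoverPolicy`, `place`, and the `fall_back` field are hypothetical and chosen only for illustration; the source does not specify how such a policy is represented.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    """Hypothetical policy entity for one resource (names are assumptions)."""
    resource: str
    primary: str      # node that normally hosts the resource
    backup: str       # node the resource is redirected to on failure
    fall_back: bool   # return to the primary once it is reinstated?


def place(policy, active_nodes, current_owner):
    """Return the node that should host the resource, or None if none can."""
    if current_owner in active_nodes:
        # The owner is healthy: move only if the policy asks for
        # reintegration and the primary node is available again.
        if policy.fall_back and policy.primary in active_nodes:
            return policy.primary
        return current_owner
    # The owner failed: redirect to the first available candidate.
    for node in (policy.primary, policy.backup):
        if node in active_nodes:
            return node
    return None
```

With `fall_back=True`, a resource that was redirected to node B moves back to node A as soon as A is active again; with `fall_back=False` it stays on B.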
At least one group computing program (HACMP for AIX®, available from International Business Machines Corporation of Armonk, New York) provides some dynamic reconfiguration capabilities. Each node includes a default configuration that is copied into the active configuration for the respective node at group startup. The default configuration can be modified while the group is active and copied to the default configurations of other nodes. This modified default configuration is subsequently copied into a staging configuration on each active node. The new configuration is verified and, when the daemons are refreshed on each group node, copied into the active configuration for the active nodes. The group services for inactive nodes added by the reconfiguration can then be started.

The existing system for dynamic reconfiguration has several limitations. First, multiple reconfigurations cannot be synchronized. When a second reconfiguration is initiated while a dynamic reconfiguration is in progress, the presence of a staging configuration on any group node acts as a lock that prevents a new dynamic reconfiguration event from being initiated.

Second, the existing system cannot be used to effect dynamic changes when multiple computer program components are involved in applying different parts of the changes to the configuration. When a dynamic configuration change involving multiple computer program components fails, the changes already made up to the time of the failure must be rolled back. This is much more complex than dynamically changing a single component and reverting to a previous configuration if the attempted configuration change fails. Therefore, the changes that can be made dynamically are limited.

It would be desirable, therefore, to provide a group multiprocessing system with support for dynamic changes involving multiple computer program components, and for synchronizing multiple dynamic reconfigurations. It would also be desirable to coordinate dynamic configuration changes with other events in a system and to make dynamic changes in a manner that is safe in case of failure.

It is therefore an object of the present invention to provide an improved group multiprocessing system.
Ethernet, Token-Ring, FDDI, or an optical serial channel connector (SOCC) network. A serial network may also provide point-to-point communication between nodes 104-110, used for control messages and heartbeat traffic in the event that an alternative subsystem fails.

As described in the exemplary embodiment, the system 102 may include some level of redundancy to eliminate single points of failure. For example, each node 104-110 can be connected to each public network 112-114 by means of two network adapters: a service adapter that provides the primary active connection between a node and the network, and a standby adapter that replaces the service adapter in the event that the service adapter fails. In this way, when a resource within the system 102 becomes unavailable, alternative resources can quickly be substituted for the resource that has failed.

Those of ordinary skill in the art will appreciate that the computer equipment described in the exemplary embodiment of Figure 1 may vary. For example, a system may include more or fewer nodes, additional clients, and/or other connections not shown.
Furthermore, the present invention can be implemented within any computer program that uses configuration data and needs to support dynamic changes in such data. Systems that provide high availability are used solely for the purpose of illustrating and explaining the invention.

With reference to Figure 2, a waiting list structure is illustrated which can be employed by a process for dynamically reconfiguring a highly available multiprocessing system that involves multiple computer program components, in accordance with a preferred embodiment of the present invention. Coordination is required in the processing of events, typically failure (or "redirection") events and recovery (or "reintegration") events, related to highly available resources. Such coordination is provided by a replicated event waiting list. The event waiting list in the exemplary embodiment is a replicated waiting list maintained by a coordination component of the high-availability group computing program. The coordination component is a distributed entity with a daemon running on each node within the group. The coordination component subscribes to other components of the high-availability group computing program, such as a component that handles adapter and node failures, a component that handles redirections forced by a system administrator, and/or a component for [...] being used in various phases of the processing of a given event, resulting in inconsistent or incorrect behavior.

The extensible waiting list structure 202 described includes a plurality of waiting list entries 204 together with a pointer 206 to a first waiting list entry and a pointer 208 to a final waiting list entry.
The waiting list structure 202 also includes flags 210, which can be used to dynamically reconfigure a highly available data processing system that involves multiple computer program components. Each waiting list entry 204 may include an event name (such as "node_up") and a priority. Priority classes can be used, so that all events relating to nodes are assigned a primary priority, all events relating to adapters are assigned a secondary priority, and all events relating to application servers are assigned a tertiary priority. Each waiting list entry 204 may also include a node identification, a time stamp, pointers to the next and previous waiting list entries, a type of [...] configuration in the replicated event waiting list.

In the exemplary embodiment, the replicated event waiting list is the same waiting list used for failure and recovery events. A separate event waiting list could be used for configuration change events, but would still require coordination with the existing event waiting list. The waiting list may contain other events that were scheduled earlier and/or other events that have a higher priority than the configuration change event. For example, a previous configuration change event may be in progress, or a failure or recovery event may be assigned a higher priority and processed before configuration change events.

The process then proceeds to step 308, which illustrates a determination of whether the configuration change event can be processed next. At a minimum, this step requires determining whether the event whose processing had already begun when the configuration change event was initiated is complete.
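The waiting list entry fields and the step 308 check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source describes a linked structure with pointers to the first and last entries, whereas this sketch swaps in a binary heap (`heapq`) to obtain priority ordering; the numeric priority values, the rank given to configuration change events, and all class and method names are hypothetical.

```python
import heapq
import itertools
import time
from dataclasses import dataclass, field

# Assumed numeric priority classes: a lower number is served first.
# The rank of CONFIG_CHANGE relative to the others is an assumption.
NODE, ADAPTER, APP_SERVER, CONFIG_CHANGE = 1, 2, 3, 4


@dataclass(order=True)
class QueueEntry:
    priority: int
    seq: int                                   # arrival order breaks ties
    name: str = field(compare=False)           # e.g. "node_up"
    node_id: str = field(compare=False)
    timestamp: float = field(compare=False, default_factory=time.time)


class EventQueue:
    """Hypothetical single-node stand-in for the replicated waiting list."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()
        self.in_progress = None                # event currently being processed

    def enqueue(self, name, priority, node_id):
        entry = QueueEntry(priority, next(self._seq), name, node_id)
        heapq.heappush(self._heap, entry)
        return entry

    def can_process(self, entry):
        # Analogue of step 308: nothing is in progress and no
        # higher-priority or earlier event is ahead of this entry.
        return (self.in_progress is None
                and bool(self._heap)
                and self._heap[0] is entry)

    def start_next(self):
        self.in_progress = heapq.heappop(self._heap)
        return self.in_progress

    def complete(self):
        self.in_progress = None
```

In this sketch a configuration change event enqueued before a `node_up` event still waits until the node event has been started and completed, which mirrors the coordination between configuration changes and failure or recovery events.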
Depending on the particular implementation, this step may also require a determination of whether the waiting list contains other events that have a higher priority than the configuration change event [...] parts of the configuration change, then they can apply the [...] portion of the configuration change [...] undoing the transaction can be achieved in a similar way. Depending on the method used to apply a configuration change, a computer program component can be reinitialized under the old configuration or can reverse its change through a logged transition operation. Once the configuration is restored, the process proceeds to step 320, which describes notifying the user of the failed configuration change transaction. The indication used to provide such notice may include information regarding the reason why the configuration change transaction failed, including an identification of the computer program component in which the transaction failed. From this information, a system administrator can correct the problem and restart the configuration change. From step 320, the process proceeds to step 322, which illustrates the resumption of event processing. Within this path of the process described, event processing resumes under the old configuration.

Referring again to step 314, once the transaction is successfully completed, the process continues to step 322, described above. In this path of the process described, however,
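The overall flow described in this section (back up the previous configuration, let each computer program component apply its portion in order, and on failure undo the applied portions in reverse order and report the failing component) can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the `Component` class, its `apply` and `undo` methods, and `change_configuration` are hypothetical names, and the flag-based synchronization between nodes is omitted.

```python
import copy


class Component:
    """Hypothetical computer program component taking part in the transaction."""

    def __init__(self, name, log, fail=False):
        self.name, self.log, self.fail = name, log, fail
        self.config = None

    def apply(self, new_config):
        if self.fail:
            raise RuntimeError(f"{self.name} rejected the new configuration")
        self.config = new_config
        self.log.append(f"apply {self.name}")

    def undo(self, old_config):
        self.config = old_config
        self.log.append(f"undo {self.name}")


def change_configuration(active, new_config, components):
    """Return (resulting configuration, error message or None)."""
    backup = copy.deepcopy(active)          # copy of the previous configuration
    applied = []
    for component in components:            # ordered portions
        try:
            component.apply(new_config)
        except Exception as exc:
            # Undo the portions already applied, in reverse order.
            for done in reversed(applied):
                done.undo(backup)
            # Identify the failing component for the administrator.
            return backup, f"failed in {component.name}: {exc}"
        applied.append(component)
    return new_config, None
```

After a failed call the caller resumes event processing under the returned previous configuration; after a successful call it resumes under the new one.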

Claims (1)

  1. synchronizing each portion within the sequence of ordered portions using flags.

The method according to claim 1, further characterized in that the step of effecting the configuration change transaction in a sequence of ordered portions further comprises: reinitializing at least one computer program component within the plurality of computer program components with a new configuration.

The method according to claim 1, further characterized in that the step of effecting the configuration change transaction in a sequence of ordered portions further comprises: executing a transition operation from the previous configuration to a new configuration in at least one computer program component within the plurality of computer program components; and logging the transition operation.

6. The method according to claim 1, further characterized in that the step of initiating a configuration change transaction involving a plurality of computer program components further comprises: creating a copy of the previous configuration;

An apparatus for supporting dynamic configuration changes in a group multiprocessing system, characterized in that it comprises: transaction initiation means for initiating a configuration change transaction involving a plurality of computer program components while the group multiprocessing system is running; transaction execution means for effecting the configuration change transaction in a sequence of ordered portions, each portion being applied by a computer program component within the plurality of computer program components; and restoration means, responsive to detecting a failure in the configuration change transaction, for restoring a previous configuration.

10. The apparatus according to claim 9, further characterized in that the restoration means further comprises: means for performing the sequence of ordered portions in reverse order.

The apparatus according to claim 9, further characterized in that the transaction execution means further comprises:

17. A computer program product for use with a data processing system, characterized in that it comprises: a computer-usable medium; first instructions on the computer-usable medium for initiating a configuration change transaction involving a plurality of computer program components; second instructions on the computer-usable medium for effecting the configuration change transaction in a sequence of ordered portions, each portion being executed by a computer program component within the plurality of computer program components; and third instructions on the computer-usable medium, responsive to detecting that the configuration change transaction failed, for restoring a previous configuration.

18. The computer program product according to claim 17, further characterized in that the third instructions further comprise: instructions for performing the sequence of ordered portions in reverse order.

19. A group multiprocessing system, characterized in that it comprises: a plurality of nodes connected by at least one network, each node within the plurality of nodes including a memory containing configuration information for the group multiprocessing system; a group multiprocessing system computing program executing on each node, wherein the computing program: initiates a configuration change transaction involving a plurality of components of the computing program while the group multiprocessing system is running; and, in response to detecting that the configuration change transaction failed, restores a previous configuration.

20. The group multiprocessing system according to claim 19, further characterized in that the group multiprocessing computing program performs the configuration change transaction in a sequence of ordered portions, each portion being applied by a computer program component within the plurality of computer program components.
MXPA/A/1998/005490A 1997-07-07 1998-07-07 Dynamic changes in configurac MXPA98005490A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08888550 1997-07-07

Publications (1)

Publication Number Publication Date
MXPA98005490A (en) 1999-09-01
