WO2000070461A1

WO2000070461A1 - Method for controlling inheritance information in combined computer system

Info

Publication number: WO2000070461A1
Application number: PCT/JP1999/002466
Authority: WO
Inventors: Takashi Ando
Original assignee: Fujitsu Limited
Priority date: 1999-05-13
Filing date: 1999-05-13
Publication date: 2000-11-23

Abstract

A method for controlling inheritance information in a highly-reliable combined computer system. An acting cluster (112) inheriting the control from a representing cluster (111), before starting a new processing as a representing cluster, compares a set of inheritance information (51) with another set of inheritance information (52) for which generation management is performed by a shared storage (12) to extract the difference. The acting cluster (112) then sets a new set of inheritance information (53) based on the difference information. The acting cluster (112) reflects the inheritance information (53) on other clusters (11n).

Description

Specification

Control method of handover information in multi-computer system

The present invention relates to a method for controlling takeover information in a multi-computer system, and more particularly to a control method by which, when a failure occurs in the representative cluster that controls the multi-computer system, a proxy cluster that controls the system in place of the failed representative cluster takes over the information of the representative cluster.

Background art

Conventionally, in a multi-computer system including a plurality of computers (clusters), each cluster is connected to a shared storage device and exchanges information via the shared storage device. One of the clusters functions as a representative cluster that controls the entire system based on command information input from a console. The representative cluster passes console information, which indicates the operation state of the console, to the other clusters via the shared storage device as takeover information. As a result, every cluster holds console information in the same state. This allows the system to continue processing when the representative cluster can no longer exercise control because of a failure. In other words, if a failure occurs in the representative cluster, one of the other clusters controls the system in its place. A cluster that replaces the representative cluster in this way is called a proxy cluster.

The proxy cluster takes over the console information of the representative cluster via the shared storage device and controls the entire system based on that information. By handing the console information of the representative cluster over to the proxy cluster in this way, processing can be continued.

However, if a failure occurs in the representative cluster, the takeover information on the shared storage device may lack credibility. In that case, the proxy cluster cannot correctly take over the console information as it was while the representative cluster was running, so continuation of processing was not guaranteed, which lowered system reliability.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a method for controlling takeover information in a highly reliable multi-computer system.

Disclosure of the invention

According to a first aspect of the present invention, there is provided a method for controlling takeover information in a multi-computer system. The multi-computer system is configured by connecting a plurality of clusters through a shared storage device, and a proxy cluster takes over control based on takeover information reflected from a representative cluster that controls the multi-computer system. The shared storage device manages generations of the takeover information from the representative cluster. Before taking over control from the representative cluster and starting new processing, the proxy cluster checks the credibility of the takeover information using the multiple generations of takeover information, resets the information that lacks credibility based on the result, and passes the new takeover information on to the other clusters.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a schematic configuration diagram of a multi-computer system according to an embodiment of the present invention.

FIG. 2 is a schematic configuration diagram of a cluster.

FIG. 3 is a schematic configuration diagram of the shared storage device.

FIG. 4 is an explanatory diagram of a method for controlling takeover information during normal operation.

FIG. 5 is an explanatory diagram of a method for controlling takeover information when a failure occurs.

BEST MODE FOR CARRYING OUT THE INVENTION

An embodiment in which the present invention is embodied in a compound computer system will be described below with reference to the drawings.

The multi-computer system of FIG. 1 includes a plurality of clusters 11₁, 11₂, ..., 11ₙ, a shared storage device 12, a switching device 13, and a console 14.

Each of the clusters 11₁ to 11ₙ is connected to the shared storage device 12, and the shared storage device 12 has a shared area that is accessed from each of the clusters 11₁ to 11ₙ. The clusters 11₁ to 11ₙ exchange information with each other through the shared area of the shared storage device 12.

Each of the clusters 11₁ to 11ₙ is also connected to the switching device 13, and the console 14 is connected to any one of them via the switching device 13. In the initial state, the console 14 is connected to the first cluster 11₁ via the switching device 13. As a result, the first cluster 11₁ functions as the representative cluster that controls the entire multi-computer system.

Further, each of the other clusters 11₂ to 11ₙ functions as a proxy cluster that controls the entire system in place of the first cluster 11₁ when a failure occurs in the first cluster 11₁ functioning as the representative cluster. In detail, when the representative cluster 11₁ fails, one of the clusters 11₂ to 11ₙ outputs a control signal, indicating that it can act as the representative, to the other clusters via the shared storage device 12 and, at the same time, to the switching device 13. The switching device 13 connects the console 14 to the cluster that output the control signal. Thus, the cluster that output the control signal functions as the proxy cluster that controls the entire system.

FIG. 2 is a schematic configuration diagram of the first cluster 11₁. The clusters 11₂ to 11ₙ have the same configuration as the first cluster 11₁, and therefore their drawings and explanation are omitted. The first cluster 11₁ has at least one CPU 21, a memory 23, a disk device 24, and an input/output (I/O) channel 22 connected to them.

The I/O channel 22 is connected to the shared storage device 12 and the switching device 13, and the CPU 21 exchanges information with the shared storage device 12 and the switching device 13 via the I/O channel 22.

The memory 23 includes an area 23a for storing command information and an area 23b for storing console information. The CPU 21 stores command information input from the console 14 in the area 23a. If this command information is addressed to its own cluster, the CPU 21 executes processing based on it and displays the processing result on the console 14.

When the command information is intended for one of the other clusters 11₂ to 11ₙ, the CPU 21 passes that information to the target cluster via the shared storage device 12. The cluster that has received the information executes processing based on it and passes the processing result to the representative cluster 11₁. The representative cluster 11₁, having received the processing result, displays it on the console 14.

Further, the CPU 21 stores console information indicating the operation state of the console 14 in the area 23b. This console information is passed from the representative cluster to the other clusters via the shared storage device 12. Specifically, the CPU 21 of the representative cluster stores in the area 23b the identification information and status information of the console 14, together with attribute information indicating the range of available functions (for example, for a system operator or for a general user). The CPU 21 of the representative cluster outputs the console information to the other clusters via the shared storage device 12, and each of the other clusters stores the received console information in its own area 23b.
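As a rough illustration of the console information described above, the following Python sketch models one possible record layout and the hand-off through the shared storage device. The names `ConsoleInfo`, `ConsoleAttribute`, `Cluster`, and `pass_console_info`, and the dictionary standing in for the shared storage device 12, are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from enum import Enum


class ConsoleAttribute(Enum):
    """Range of functions the console may use (examples given in the text)."""
    SYSTEM_OPERATOR = "system operator"
    GENERAL_USER = "general user"


@dataclass(frozen=True)
class ConsoleInfo:
    """Hypothetical record for the console information kept in area 23b."""
    console_id: str              # identification information of console 14
    status: str                  # status information (illustrative values)
    attribute: ConsoleAttribute  # attribute information (range of functions)


class Cluster:
    """Minimal stand-in for a cluster; only area 23b is modelled."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.area_23b = None


def pass_console_info(info, shared_storage: dict, other_clusters) -> None:
    # The representative cluster writes to the shared storage device 12 ...
    shared_storage["console_info"] = info
    # ... and each other cluster stores what it reads in its own area 23b.
    for cluster in other_clusters:
        cluster.area_23b = shared_storage["console_info"]


info = ConsoleInfo("console-14", "connected", ConsoleAttribute.SYSTEM_OPERATOR)
shared = {}
others = [Cluster("11_2"), Cluster("11_3")]
pass_console_info(info, shared, others)
```

After the call, every cluster in `others` holds the same record as the representative cluster, which is the state the takeover mechanism relies on.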

The CPU 21 of the representative cluster outputs the console information to the other clusters when the multi-computer system starts up and whenever any change is made to the console information. In this way, the console information is passed from the representative cluster to the other clusters.

The disk device 24 includes a plurality of areas 24a and 24b. Program data for the cluster 11₁ is stored in the first area 24a. This program data includes a general operating system and various business programs. The CPU 21 executes the business program that performs the processing corresponding to the command information stored in the memory 23.

The second area 24b stores initialization information. This initialization information is information for initializing the hardware resources and the software resources. That is, the CPU 21 of the representative cluster that controls the entire system initializes the console information based on this initialization information. Likewise, the proxy cluster that replaces the representative cluster initializes the console information based on the initialization information when the information (console information) taken over from the representative cluster has low reliability (the information is damaged or missing).

Incidentally, a storage device such as a magnetic disk device, an optical disk device, or a magneto-optical disk device is usually used as the disk device 24, and the storage device is selected as appropriate according to the type and state of the data stored in the disk device 24. FIG. 2 illustrates the disk device 24 functionally; the disk device 24 may consist of one storage device or of a plurality of storage devices corresponding to the respective areas.

Various information stored in the disk device 24, including the program data and the initialization information, is provided on a recording medium (not shown). Any computer-readable recording medium can be used, such as an MT (magnetic tape), a memory card, a floppy disk, an optical disk (CD-ROM, DVD-ROM, ...), or a magneto-optical disk (MO, MD, ...). The various kinds of information may also be recorded on the recording medium or the disk device 24 via a communication medium. The CPU 21 drives a drive device (not shown), loads the program data recorded on the recording medium into the disk device 24, and executes it. Alternatively, the program data recorded on the recording medium may be executed directly.

FIG. 3 is a schematic configuration diagram of the shared storage device 12.

The shared storage device 12 includes a control unit 31, a first area 32 for storing command information, and a second area 33 for storing console information. The control unit 31 stores command information input from each cluster in the first area 32, and stores console information in the second area 33.

The second area 33 includes first and second generation storage units 34 and 35, and a generation management unit 36 that manages the generations of the information stored in the storage units 34 and 35. The generation management unit 36 performs generation management of the console information input via the control unit 31. When console information is input from the representative cluster, the generation management unit 36 moves or copies the console information in the second generation storage unit 35 to the first generation storage unit 34, and then stores the input console information in the second generation storage unit 35. As a result, the second generation storage unit 35 always holds the latest console information, and the first generation storage unit 34 holds the console information of the previous generation.
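The two-slot generation management described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions: the class name `GenerationStore`, its attribute names, and the use of plain dictionaries for console information are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationStore:
    """Hypothetical model of the second area 33 with its two generation slots."""
    first_generation: Optional[dict] = None   # storage unit 34: previous generation
    second_generation: Optional[dict] = None  # storage unit 35: latest generation

    def store(self, console_info: dict) -> None:
        # Role of the generation management unit 36: shift the current latest
        # generation into the older slot, then store the newly input data.
        self.first_generation = self.second_generation
        self.second_generation = dict(console_info)


# Illustrative values only; they are not taken from the specification.
store = GenerationStore()
store.store({"console_id": "console-14", "status": "connected"})
store.store({"console_id": "console-14", "status": "busy"})
```

After the second call, `second_generation` holds the newest record and `first_generation` holds the one written just before it. Extending the model to three or more generations would amount to replacing the two fields with a bounded queue.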

These two generations of console information are used by a proxy cluster that controls the entire system instead of the representative cluster. When the proxy cluster takes over control from the representative cluster, it first checks the credibility of the takeover information.

That is, the proxy cluster reads the console information stored in both storage units 34 and 35 of the shared storage device 12 and compares them. If the console information matches based on the comparison result, the proxy cluster determines that the console information has been correctly inherited as the inherited information from the representative cluster. Then, the proxy cluster stores the takeover information in its own memory 23, and controls the entire system based on the information.

On the other hand, if the comparison shows a difference between the two sets of console information, the proxy cluster determines that the takeover information taken over from the representative cluster is suspicious. The proxy cluster then resets the suspicious information, writing console information rebuilt from the initialization information stored in the area 24b of the disk device 24 into the area 23b of the memory 23.

Next, the operation of the multi-computer system configured as described above will be described with reference to FIGS. 4 and 5.

First, control during normal operation will be described.

FIG. 4 shows the takeover control of console information during normal operation. In the initial state, the first cluster 11₁ controls the entire system as the representative cluster.

At this time, the first cluster 11₁ outputs console information 41 indicating the operation state of the console 14 to the shared storage device 12. The shared storage device 12 moves or copies the console information 42 stored in the second generation storage unit 35 of FIG. 3 to the first generation storage unit 34, and stores the input console information 41 in the second generation storage unit 35. The other clusters 11₂ to 11ₙ read this console information 41 from the shared storage device 12 and store it in their own memories 23. In this way, the multi-computer system passes the console information of the representative cluster to the other clusters.

Next, control when a failure occurs will be described.

FIG. 5 shows the takeover control of console information when a failure occurs. Here, the first cluster 11₁, which is the representative cluster, has failed, and the second cluster 11₂ controls the entire system in its place as the proxy cluster.

At this time, the console information 51 of the first cluster 11₁ has been reflected in the shared storage device 12. That is, the shared storage device 12 stores the console information 51 and the console information 52 of one generation earlier.

To take over control from the first cluster 11₁, the second cluster 11₂ performs the following before starting new processing as the representative cluster.

(1) The two generations of console information 51 and 52 (takeover information) in the shared storage device 12 are read and compared.

If there is no difference between the console information 51 and 52 in this step, the second cluster 11₂ judges that the credibility of the console information 52 is high, so its validity is assured. The second cluster 11₂ then holds the console information 52 and operates as the representative cluster that controls the entire system. That is, the second cluster 11₂ reflects the console information 52 on the other clusters 11₃ to 11ₙ and starts new processing as the representative cluster.

If there is a difference between the console information 51 and 52 in the above step, the second cluster 11₂ carries out the following steps.

(2) Extract the difference between the console information 51 and 52. The difference is shown in Fig. 5 with a dashed circle.

(3) Generate console information 53 in which the extracted difference has been reset (initialized, re-read from the definition body, file re-opened, etc.) using the initialization information.

(4) Reflect the reset console information 53 on the other clusters.

Next, features of the multi-computer system according to the embodiment of the present invention will be described.

(1) Before starting new processing as the representative cluster, the second cluster 11₂, which operates as the proxy cluster taking over control from the first cluster 11₁ that functioned as the representative cluster, checks the credibility of the console information 51 and 52 by comparing the console information 51 and 52 that are generation-managed by the shared storage device 12. As a result, the validity of the console information 51 and 52 can be easily guaranteed, and the reliability of the multi-computer system can be improved.

(2) The proxy cluster 11₂ extracts the difference and sets new takeover information 53 by resetting only the items covered by the difference information. Then, the proxy cluster 11₂ reflects the reset takeover information 53 on the other clusters 11ₙ. As a result, the processing time is shorter than when all the console information is reset, and new processing can be started in a short time.
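The takeover steps (1) to (4) described earlier, and the partial reset that feature (2) relies on, might look roughly like the following Python sketch. It assumes that console information can be treated as a flat key/value mapping and that the initialization information covers every item. The function names and sample values are hypothetical and are not taken from the specification.

```python
_MISSING = object()  # sentinel so an absent key differs from a stored None


def extract_difference(info_51: dict, info_52: dict) -> set:
    """Steps (1) and (2): compare both generations, return the differing keys."""
    keys = info_51.keys() | info_52.keys()
    return {k for k in keys
            if info_51.get(k, _MISSING) != info_52.get(k, _MISSING)}


def reset_difference(info_51: dict, difference: set, initialization: dict) -> dict:
    """Step (3): rebuild only the differing items from initialization data."""
    info_53 = dict(info_51)
    for key in difference:
        # Assumption: the initialization information covers every item.
        info_53[key] = initialization[key]
    return info_53


def reflect(info_53: dict, other_clusters: list) -> None:
    """Step (4): copy the new takeover information to each other cluster."""
    for memory in other_clusters:
        memory.clear()
        memory.update(info_53)


# Illustrative values only; none of them come from the source.
info_51 = {"console_id": "console-14", "status": "???", "attribute": "operator"}
info_52 = {"console_id": "console-14", "status": "connected", "attribute": "operator"}
initialization = {"console_id": "console-14", "status": "idle", "attribute": "general"}

difference = extract_difference(info_51, info_52)
if difference:
    info_53 = reset_difference(info_51, difference, initialization)
else:
    info_53 = dict(info_52)  # generations agree: adopt the information as is
others = [{}, {}]
reflect(info_53, others)
```

Only the `status` item differs here, so it alone is replaced from the initialization data while the other items are carried over unchanged. This is the source of the time saving over resetting all console information.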

The above embodiment may be modified as follows.

In the above embodiment, the shared storage device 12 includes the first and second generation storage units 34 and 35 as areas for managing two generations of console information. However, the configuration may be changed so that three or more generations are managed.

Claims

The scope of the claims

1. A method for controlling takeover information in a multi-computer system configured by connecting a plurality of clusters (11₁ to 11ₙ) through a shared storage device (12), in which a proxy cluster takes over control of the multi-computer system from a representative cluster, wherein

the shared storage device manages generations of the takeover information from the representative cluster, and the proxy cluster, before taking over control from the representative cluster and starting new processing, checks the credibility of the takeover information using the multiple generations of takeover information and, based on the result, resets the information that lacks credibility and passes the new takeover information on to the other clusters.

2. In the control method of takeover information in the compound computer system according to claim 1,

The proxy cluster compares the multiple generations of takeover information managed by the shared storage device, and performs the reset if the generations of takeover information differ.

3. In the control method of takeover information in the complex computer system according to claim 1,

The proxy cluster includes a storage device (24) including an area (24b) in which initialization information is stored in advance, and performs the reset using the initialization information.

4. In the control method of takeover information in the compound computer system according to claim 1,

The proxy cluster performs:

A step of comparing the plurality of generations of takeover information under generation management;

A step of extracting a difference between the respective handover information;

A step of resetting the difference of the extracted handover information;

A step of reflecting the reconfigured takeover information on other clusters;

A method for controlling handover information in a multi-function computer system.

5. In the control method of takeover information in the complex computer system according to claim 1,

The shared storage device,

An area (34, 35) for storing the plurality of pieces of transfer information;

A generation management unit (36) for managing the generation of the handover information;